Replication variance estimation after sample-based calibration
Section 4. Application

We return to the calibration problem encountered while bridging the two 2016 FHWAR surveys. For both surveys, the population is defined as individuals of ages 16 and older, living in U.S. households. The main data sources for this application are record-level data files, containing weights and replicate weights for both surveys. Using these datasets, we conducted an initial analysis and identified discrepancies in the demographics, which we adjusted by sample-based raking. Estimated population totals constructed using the record-level data from the National survey were considered as random controls, for the crosstabs of census divisions (nine categories) and each of the following demographic variables:

For the application in this article, we use the 50-State survey public-use file, which does not contain information on income. Therefore, we illustrate the proposed method in a slightly simplified setting here, using the crosstabs of census divisions and residency, age, sex, and race-ethnicity as the raking dimensions. We implemented both the Fuller (1998) method and the proposed calibration method described in Section 2 using the public-use data files available for both surveys. For comparison, we also show the results of calibrating without adjusting the variance estimates for the random controls, referred to below as the “naive” method because it ignores the variability of the controls in the variance estimates. To compare the variance estimation methods for survey variables that are not control variables, we also show estimates for domains defined by crosstabs of residency and sex.

While the replication methods of the two FHWAR surveys are different, they both use $R = R_C = 160$ replicates. Referring to expression (2.1), the replication constant for the DAGJK method of the 50-State survey is $A = 159/160$ and the corresponding constant for the SDR method of the National survey is $A_C = 4/160$, both available from their respective survey documentation. Hence, the replication adjustment constants $a_r$ in (2.10) are equal to $2/\sqrt{159}$ for $r = 1, \ldots, R$.
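
As a quick check of this value, the arithmetic can be written out as follows. Expression (2.10) is not reproduced in this section, so the form $a_r = \sqrt{A_C / A}$ used here is an assumption; it is adopted only because it reproduces the constant stated above.

```latex
a_r \;=\; \sqrt{\frac{A_C}{A}}
    \;=\; \sqrt{\frac{4/160}{159/160}}
    \;=\; \sqrt{\frac{4}{159}}
    \;=\; \frac{2}{\sqrt{159}}
    \;\approx\; 0.159,
\qquad r = 1, \ldots, 160 .
```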

The estimates we will consider are all estimated domain counts, so we define the target variable $y_i = I_{\{i \in U_d\}}$ for a domain of interest $U_d$. For the 144 domains defined by the raking dimensions, we write the estimated domain counts as $\hat{t}_k = \sum_{s} w_i I_{\{i \in U_k\}}$, $k = 1, \ldots, 144$. Likewise, the control totals are estimated domain counts from the National survey, so the auxiliary variable vector is $\mathbf{x}_i = \mathbf{I}_i$, a vector of length 144 containing the indicators for inclusion of respondent $i$ in the control domains $U_k$, $k = 1, \ldots, 144$, and we let $\hat{t}_{C,k} = \sum_{s_C} w_{Ci} I_{\{i \in U_k\}}$. We denote the vector of control totals as $\hat{\mathbf{t}}_C = (\hat{t}_{C,1}, \ldots, \hat{t}_{C,144})^T$, and the adjusted replicate control totals are $\hat{\mathbf{t}}_C^{*(r)} = \hat{\mathbf{t}}_C + (2/\sqrt{159})\,(\hat{\mathbf{t}}_C^{(r)} - \hat{\mathbf{t}}_C)$.
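
The construction of the adjusted replicate control totals can be sketched in Python as follows. This is a minimal illustration on synthetic data, not the authors' code. The array names (`w_c`, `w_c_rep`, `domain`), the toy sizes, and the way the replicate weights are simulated are all hypothetical. Two further assumptions are made: that (2.10) has the form $a_r = \sqrt{A_C/A}$, and that the replication variance estimator (2.1) has the usual form $A \sum_r (\hat{\theta}^{(r)} - \hat{\theta})^2$. For simplicity each unit belongs to exactly one toy domain, whereas the 144 raking domains of the application overlap across dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

R = 160                 # number of replicates in both surveys (from the text)
A = 159 / 160           # DAGJK constant, 50-State survey (from the text)
A_C = 4 / 160           # SDR constant, National survey (from the text)
a_r = np.sqrt(A_C / A)  # ASSUMED form of (2.10); equals 2 / sqrt(159)

# Synthetic stand-in for the National (control) survey: all values hypothetical.
n_c, K = 500, 6
domain = rng.integers(0, K, size=n_c)                           # domain codes
w_c = rng.uniform(50, 150, size=n_c)                            # full-sample weights
w_c_rep = w_c[:, None] * rng.uniform(0.5, 1.5, size=(n_c, R))   # replicate weights


def domain_totals(weights, domain, K):
    """Weighted domain counts: sum of w_i * I{i in U_k} for each k."""
    return np.bincount(domain, weights=weights, minlength=K)


t_c = domain_totals(w_c, domain, K)                             # shape (K,)
t_c_rep = np.stack(
    [domain_totals(w_c_rep[:, r], domain, K) for r in range(R)], axis=1
)                                                               # shape (K, R)

# Adjusted replicate controls: t*_C^(r) = t_C + a_r (t_C^(r) - t_C)
t_c_adj = t_c[:, None] + a_r * (t_c_rep - t_c[:, None])

# Variance of the controls under the control survey's constant A_C ...
var_control = A_C * ((t_c_rep - t_c[:, None]) ** 2).sum(axis=1)
# ... is reproduced by the adjusted replicates under the constant A.
var_adjusted = A * ((t_c_adj - t_c[:, None]) ** 2).sum(axis=1)
```

The last two quantities agree because rescaling every replicate deviation by $a_r$ multiplies the sum of squares by $A_C/A$, which converts the control survey's replication constant into that of the survey being calibrated.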

In order to implement the Fuller (1998) method, we estimated the variance-covariance matrix of the control totals $\hat{V}_C(\hat{\mathbf{t}}_C)$ using the National survey replicate weights. The spectral decomposition of this matrix resulted in a set of 144 eigenvectors $\mathbf{q}_i$ and associated eigenvalues $\lambda_i$, for $i = 1, \ldots, 144$. Following Fuller (1998), we obtain a set of 144 vectors $\mathbf{v}_i$ satisfying

$$\hat{V}_C(\hat{\mathbf{t}}_C) = \sum_{i=1}^{144} \mathbf{v}_i' \, \mathbf{v}_i ,$$

where $\mathbf{v}_i = \sqrt{\lambda_i}\,\mathbf{q}_i$, for $i = 1, \ldots, 144$. Finally, the adjusted replicate controls are $\hat{\mathbf{t}}_C^{*(r)} = \hat{\mathbf{t}}_C + \frac{\sqrt{160}}{2}\,\mathbf{v}_r$ for $r = 1, \ldots, 144$ and $\hat{\mathbf{t}}_C^{*(r)} = \hat{\mathbf{t}}_C$ for $r = 145, \ldots, 160$. This points to a drawback of the Fuller (1998) method: while our approach perturbs the control totals of all 160 replicates, this method only perturbs a fraction of them in this case. In addition, 30 of the 144 eigenvalues were nearly zero, 18 of which were less than zero due to floating-point error. We truncated the 18 negative eigenvalues to zero and left the rest unchanged. Hence, to the extent that not all replicates contribute to variance estimates for some survey estimates (e.g., domain totals), there is a risk that the sample-based calibration will be imperfectly reflected in the variance estimates. In general, we expect that a larger number of replicates will be perturbed using our approach, since the variance-covariance matrix of the control totals can only be reliably estimated if its dimension is suitably smaller than $R_C$.
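
The eigenvector-based construction described above can be sketched as follows. This is a toy illustration under stated assumptions rather than the implementation used for the application: the sizes (`R_C = 8`, `K = 5`), the control totals, and the covariance matrix are all hypothetical, and the covariance matrix is deliberately rank-deficient so that some eigenvalues are zero up to floating-point error. Eigenvectors are stored as columns here, so the decomposition appears as a sum of outer products of column vectors.

```python
import numpy as np

rng = np.random.default_rng(1)

R_C = 8          # toy number of replicates (160 in the application)
K = 5            # toy number of control totals (144 in the application)
A_C = 4 / R_C    # SDR-type constant, mirroring A_C = 4/160 in the text

# Hypothetical control totals and a rank-3 covariance estimate.
t_c = rng.uniform(1e3, 5e3, size=K)
B = rng.normal(size=(K, 3))
V_c = B @ B.T                                 # symmetric, positive semi-definite

# Spectral decomposition: V_c = sum_i lambda_i q_i q_i^T
lam, Q = np.linalg.eigh(V_c)                  # columns of Q are eigenvectors
lam = np.clip(lam, 0.0, None)                 # truncate negative eigenvalues to zero
V = Q * np.sqrt(lam)                          # column i is v_i = sqrt(lambda_i) q_i

# Adjusted replicate controls: perturb the first K replicates, leave the rest.
scale = 1.0 / np.sqrt(A_C)                    # sqrt(R_C)/2; sqrt(160)/2 when R_C = 160
t_adj = np.tile(t_c[:, None], (1, R_C))       # shape (K, R_C), every column is t_c
t_adj[:, :K] += scale * V

# The replication variance of the adjusted controls recovers V_c.
dev = t_adj - t_c[:, None]
V_rep = A_C * dev @ dev.T
```

Only `K` of the `R_C` columns of `t_adj` differ from `t_c`, and the columns paired with zero eigenvalues are not perturbed either, which mirrors the drawback noted in the text.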

Tables 4.1 and 4.2 show the estimates and standard errors, respectively, for domains defined by residency and sex, before and after calibration. The first four rows contain the results for marginal totals for raking variables, which are exactly calibrated, while the last four are totals that correspond to the intersection of raking dimensions and are therefore not exactly calibrated.

Both surveys are representative of the same target population, but the estimates and associated standard errors differ, reflecting both sampling variability and the different calibration approaches applied by the two survey organizations. As Table 4.1 confirms, after the 50-State survey is raked to the National survey, the estimated totals for domains defined as exact calibration domains match exactly between the two surveys. For the domains defined by the crosstabulation of residency and sex, the raked estimates for the 50-State survey are close but not identical to those of the National survey.


Table 4.1
Population estimates before and after calibration, rounded to the nearest integer, after scaling by $10^{3}$
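| Domain | Before calibration: 50-State | Before calibration: National | After calibration |
| --- | --- | --- | --- |
| Residency: Urban | 203,445 | 208,695 | 208,695 |
| Residency: Rural | 51,511 | 45,991 | 45,991 |
| Sex: Male | 128,276 | 121,775 | 121,775 |
| Sex: Female | 126,680 | 132,911 | 132,911 |
| Urban: Male | 99,547 | 98,511 | 98,089 |
| Urban: Female | 103,898 | 110,184 | 110,607 |
| Rural: Male | 28,729 | 23,264 | 23,686 |
| Rural: Female | 22,782 | 22,727 | 22,305 |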

Table 4.2 shows the standard errors obtained by the two replication methods with adjusted control totals and by the naive method, which does not account for the randomness of the control totals. By construction, the proposed replication-based adjustment method and the Fuller (1998) method lead to identical variance estimates for domains that are used in the calibration. These variance estimates are also equal to those from the control survey in this case. This reflects the fact that the variance component corresponding to the first term in (2.5) is set to 0 for the control totals, while the variance component for the second term is exactly equal to the control survey variance estimate in the case of raking. Because the naive method ignores that second component, its variance estimates for these domains are equal to zero. For the estimated totals for domains defined as the crosstabulation of residency and sex, the variance estimates of the two methods are not identical but close (within 8% of each other), reflecting the fact that both are consistent for the asymptotic variance (2.5). The variance estimates under the naive method are smaller than those under the other two calibration methods, an obviously incorrect result that arises because the variance of the random control totals is not accounted for. For other variables, the variance is still expected to be underestimated under the naive method, because the second term in the asymptotic variance (2.5) is ignored.
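
The contrast between the naive and the adjusted-control variance estimates for calibrated domains can be reproduced on synthetic data. The sketch below is illustrative only and rests on several assumptions: post-stratification to a single set of non-overlapping domains stands in for raking, the replicate weights are random perturbations rather than genuine DAGJK or SDR replicates, $a_r = \sqrt{A_C/A}$ is assumed as the form of (2.10), and all function and variable names (`fake_survey`, `poststratify`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
R, K = 160, 4
A, A_C = 159 / 160, 4 / 160           # replication constants from the text
a_r = np.sqrt(A_C / A)                # ASSUMED form of (2.10)


def totals(w, d):
    """Weighted domain counts for weights w of shape (n,) or (n, R)."""
    onehot = np.eye(K)[d]             # (n, K) indicator matrix
    return onehot.T @ w


def fake_survey(n):
    """Hypothetical survey: domain codes, weights, noisy replicate weights."""
    d = rng.integers(0, K, size=n)
    w = rng.uniform(50, 150, size=n)
    w_rep = w[:, None] * rng.uniform(0.8, 1.2, size=(n, R))
    return d, w, w_rep


d_c, w_c, w_c_rep = fake_survey(400)  # control survey (role of "National")
d_s, w_s, w_s_rep = fake_survey(300)  # survey to calibrate (role of "50-State")

t_c = totals(w_c, d_c)                                   # (K,)
t_c_rep = totals(w_c_rep, d_c)                           # (K, R)
t_c_adj = t_c[:, None] + a_r * (t_c_rep - t_c[:, None])  # adjusted controls


def poststratify(w, d, controls):
    """Scale weights so weighted domain counts equal the given controls."""
    return w * (controls / totals(w, d))[d]


def calibrated_rep_totals(rep_controls):
    """Calibrate each replicate to its own controls; return domain totals."""
    cols = [
        totals(poststratify(w_s_rep[:, r], d_s, rep_controls[:, r]), d_s)
        for r in range(R)
    ]
    return np.stack(cols, axis=1)


t_cal = totals(poststratify(w_s, d_s, t_c), d_s)         # matches t_c

naive = calibrated_rep_totals(np.tile(t_c[:, None], (1, R)))  # fixed controls
proposed = calibrated_rep_totals(t_c_adj)                     # perturbed controls

var_naive = A * ((naive - t_cal[:, None]) ** 2).sum(axis=1)
var_proposed = A * ((proposed - t_cal[:, None]) ** 2).sum(axis=1)
var_control = A_C * ((t_c_rep - t_c[:, None]) ** 2).sum(axis=1)
```

In this toy setting `var_naive` is zero for every calibrated domain, while `var_proposed` coincides with `var_control`, which is the pattern seen in the first four rows of Table 4.2.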


Table 4.2
Standard errors of population estimates before and after calibration, rounded to the nearest integer, after scaling by $10^{3}$
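| Domain | Before calibration: 50-State | Before calibration: National | After calibration: Naive | After calibration: Fuller | After calibration: Proposed |
| --- | --- | --- | --- | --- | --- |
| Residency: Urban | 1,922 | 2,664 | 0 | 2,664 | 2,664 |
| Residency: Rural | 1,922 | 2,598 | 0 | 2,598 | 2,598 |
| Sex: Male | 2,117 | 1,074 | 0 | 1,074 | 1,074 |
| Sex: Female | 2,117 | 1,112 | 0 | 1,112 | 1,112 |
| Urban: Male | 2,118 | 1,399 | 853 | 1,658 | 1,533 |
| Urban: Female | 2,514 | 1,797 | 853 | 1,964 | 1,970 |
| Rural: Male | 1,595 | 1,449 | 853 | 1,709 | 1,641 |
| Rural: Female | 979 | 1,271 | 853 | 1,470 | 1,547 |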
