Relative performance of methods based on model-assisted survey regression estimation: A simulation study
Section 2. Model-assisted estimation under probability sampling

2.1   GREG estimators

Consider the estimation of a finite population total $t_y = \sum_{i \in U} y_i$, where $U = \{1, \ldots, N\}$ is the set of units of the finite population and $y_i$ is the value of the survey variable of interest for the unit $i \in U$. Let $s \subset U$ be a sample selected according to a sampling design $p(\cdot)$, where $p(s)$ is the probability of selecting $s$. For $i \in U$, let $\pi_i = \Pr[i \in s]$ denote the first-order inclusion probabilities of the design. We assume $\pi_i > 0$ for all $i \in U$. Additionally, assume $d$ auxiliary variables, $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{id})^T$, are known for each $i \in U$. A standard approach is to use the Horvitz-Thompson estimator

$$\hat{t}_{y,\mathrm{HT}} = \sum_{i \in s} \frac{y_i}{\pi_i} = \sum_{i \in s} d_i y_i$$

where $d_i = \pi_i^{-1}$ denotes the design weights. Under this strictly design-based framework, the auxiliary data do not impact the form of the estimator but can impact the design weights, $d_i$, through the specification of the sampling design.
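As a minimal numerical sketch of the Horvitz-Thompson estimator, the Python snippet below weights each sampled value by $d_i = 1/\pi_i$ and sums. NumPy is assumed; the function name `horvitz_thompson_total` and the toy data are illustrative and do not come from the paper.

```python
import numpy as np

def horvitz_thompson_total(y_s, pi_s):
    """Horvitz-Thompson estimate of a population total.

    y_s  : values of the study variable for the sampled units
    pi_s : first-order inclusion probabilities of the same units
    """
    y_s = np.asarray(y_s, dtype=float)
    pi_s = np.asarray(pi_s, dtype=float)
    if np.any(pi_s <= 0) or np.any(pi_s > 1):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    d_s = 1.0 / pi_s                    # design weights d_i = 1 / pi_i
    return float(np.sum(d_s * y_s))

# Hypothetical toy data: simple random sampling without replacement,
# n = 4 units out of N = 20, so pi_i = n / N = 0.2 and d_i = 5 for every unit.
y_sample = [3.0, 7.0, 5.0, 9.0]
pi_sample = [0.2, 0.2, 0.2, 0.2]
t_hat_ht = horvitz_thompson_total(y_sample, pi_sample)
```

Under this equal-probability design the estimate reduces to $N$ times the sample mean; with unequal $\pi_i$ the same function applies unchanged.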

One strategy to use auxiliary data in estimation is to employ a model-assisted estimator of $t_y$ by specifying a working model for the mean of $y$ given $\mathbf{x}$ and using this model to predict $y$ values. Specifying a linear regression working model leads to the generalized regression (GREG) estimator (Cassel, Särndal and Wretman, 1976). The GREG estimator typically has smaller variance than the Horvitz-Thompson estimator if the working model has some predictive power for $y$. Here, we consider the GREG estimator under a linear regression working model

$$y_i = \mathbf{x}_i^T \boldsymbol{\beta} + \varepsilon_i \tag{2.1}$$

with $\boldsymbol{\beta} = (\beta_0, \beta_1, \ldots, \beta_p)^T$, $\varepsilon_i$ independent and identically distributed with mean zero and variance $\sigma^2$, and $\mathbf{x}_i = (1, x_{i1}, \ldots, x_{ip})^T$. The GREG estimator is given by

$$\hat{t}_{y,\mathrm{GREG}} = \sum_{i \in s} \frac{y_i - \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_s}{\pi_i} + \sum_{i \in U} \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_s \tag{2.2}$$

with the regression coefficients $\boldsymbol{\beta}$ estimated as

$$\hat{\boldsymbol{\beta}}_s = \underset{\boldsymbol{\beta}}{\arg\min}\, (\mathbf{Y}_s - \mathbf{X}_s \boldsymbol{\beta})^T \boldsymbol{\Pi}_s^{-1} (\mathbf{Y}_s - \mathbf{X}_s \boldsymbol{\beta}) = (\mathbf{X}_s^T \boldsymbol{\Pi}_s^{-1} \mathbf{X}_s)^{-1} \mathbf{X}_s^T \boldsymbol{\Pi}_s^{-1} \mathbf{Y}_s, \tag{2.3}$$

where $\mathbf{X}_s$ is an $n \times (p+1)$ matrix, $\mathbf{Y}_s$ is an $n$-vector and $\boldsymbol{\Pi}_s$ is an $n \times n$ diagonal matrix of first-order inclusion probabilities for the sampled units, with $n$ the number of sampled units.
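The two steps in (2.2) and (2.3) can be sketched in a few lines of Python. This is an illustrative sketch only, assuming NumPy; the function name `greg_total` and the toy population are hypothetical and not part of the paper. The weighted normal equations are solved directly rather than by forming an explicit inverse.

```python
import numpy as np

def greg_total(y_s, X_s, pi_s, X_U):
    """GREG estimate (2.2) with survey-weighted least squares coefficients (2.3).

    y_s  : (n,) study variable for the sampled units
    X_s  : (n, p+1) sample design matrix, first column all ones
    pi_s : (n,) first-order inclusion probabilities of the sampled units
    X_U  : (N, p+1) design matrix for every unit of the population
    """
    y_s = np.asarray(y_s, dtype=float)
    X_s = np.asarray(X_s, dtype=float)
    X_U = np.asarray(X_U, dtype=float)
    d_s = 1.0 / np.asarray(pi_s, dtype=float)

    # (2.3): solve (X_s^T Pi_s^{-1} X_s) beta = X_s^T Pi_s^{-1} Y_s
    XtD = X_s.T * d_s                       # scales column i of X_s^T by d_i
    beta_hat = np.linalg.solve(XtD @ X_s, XtD @ y_s)

    # (2.2): weighted sum of sample residuals plus sum of population predictions
    residual_term = np.sum(d_s * (y_s - X_s @ beta_hat))
    prediction_term = np.sum(X_U @ beta_hat)
    return float(residual_term + prediction_term), beta_hat

# Hypothetical toy population in which y is exactly linear in x,
# so the working model holds without error.
x_U = np.arange(10.0)
X_U = np.column_stack([np.ones_like(x_U), x_U])
y_U = 2.0 + 3.0 * x_U
sample = np.array([1, 4, 6, 9])
pi_s = np.full(sample.size, 0.4)

t_hat_greg, beta_hat = greg_total(y_U[sample], X_U[sample], pi_s, X_U)
```

Because the toy $y$ is an exact linear function of $x$, the sample residuals vanish and the estimate reproduces the true total for any sample, which illustrates why a working model with good predictive power reduces variance.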

The GREG estimator can also be written as a weighted sum of the variable of interest, $y$, yielding regression weights that are independent of $y$ and, therefore, can be applied to any study variable, $y$:

$$\hat{t}_{y,\mathrm{GREG}} = \sum_{i \in s} \left[ 1 + (\mathbf{t}_x - \hat{\mathbf{t}}_{x,\mathrm{HT}})^T \left( \sum_{k \in s} \mathbf{x}_k \mathbf{x}_k^T d_k \right)^{-1} \mathbf{x}_i \right] d_i y_i = \sum_{i \in s} w_i y_i \tag{2.4}$$

where $\mathbf{t}_x$ is the known population total vector of the covariates $\mathbf{x}$ and $\hat{\mathbf{t}}_{x,\mathrm{HT}}$ is the Horvitz-Thompson estimator vector of the covariate population totals $\mathbf{t}_x = \sum_{i \in U} \mathbf{x}_i$. The regression weights, $w_i$, are termed calibration weights because they satisfy the calibration constraint $\sum_{i \in s} w_i \mathbf{x}_i = \sum_{i \in U} \mathbf{x}_i$. The calibration weight $w_i$ does not depend on the study variable $y_i$. Note that the GREG estimator (2.4) can alternatively be expressed as

$$\hat{t}_{y,\mathrm{GREG}} = \hat{t}_{y,\mathrm{HT}} + (\mathbf{t}_x - \hat{\mathbf{t}}_{x,\mathrm{HT}})^T \hat{\boldsymbol{\beta}}_s$$

which requires only the known population totals $\mathbf{t}_x$. For the GREG estimator, the individual population values $\mathbf{x}_i$, $i \in U$, are not needed.
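The weighted form (2.4), the calibration constraint and the alternative expression can be checked numerically. The sketch below assumes NumPy; `greg_weights` and the toy numbers are hypothetical names and data chosen for illustration. Only the population totals $\mathbf{t}_x$ enter the computation, not the individual $\mathbf{x}_i$ outside the sample.

```python
import numpy as np

def greg_weights(X_s, pi_s, t_x):
    """Calibration weights w_i from (2.4).

    X_s  : (n, p+1) sample design matrix, first column all ones
    pi_s : (n,) first-order inclusion probabilities
    t_x  : (p+1,) known population totals of the columns of X
    """
    X_s = np.asarray(X_s, dtype=float)
    d_s = 1.0 / np.asarray(pi_s, dtype=float)
    t_x_ht = X_s.T @ d_s                      # Horvitz-Thompson totals of x
    M = (X_s.T * d_s) @ X_s                   # sum_k x_k x_k^T d_k
    g = 1.0 + X_s @ np.linalg.solve(M, t_x - t_x_ht)
    return g * d_s

# Hypothetical toy data with unequal inclusion probabilities.
x_U = np.arange(10.0)
X_U = np.column_stack([np.ones_like(x_U), x_U])
t_x = X_U.sum(axis=0)                         # known totals: (N, sum of x)
sample = np.array([1, 4, 6, 9])
X_s = X_U[sample]
pi_s = np.array([0.2, 0.4, 0.4, 0.6])
y_s = np.array([1.0, 4.0, 2.0, 8.0])

w = greg_weights(X_s, pi_s, t_x)
calibrated_totals = X_s.T @ w                 # should reproduce t_x

# Same estimate two ways: weighted sum (2.4) and HT total plus regression adjustment.
d_s = 1.0 / pi_s
beta_hat = np.linalg.solve((X_s.T * d_s) @ X_s, (X_s.T * d_s) @ y_s)
t_greg_weighted = float(w @ y_s)
t_greg_adjusted = float(d_s @ y_s + (t_x - X_s.T @ d_s) @ beta_hat)
```

The weights `w` are computed without reference to `y_s`, so the same vector could be reused for any other study variable measured on the same sample.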

If a variable selection procedure, such as a forward stepwise procedure, is implemented prior to fitting the linear regression model, then the calibration weights will depend on $y$ as the selected models may vary across study variables. This type of stepwise survey regression estimator is calibrated to the auxiliary variables selected by the variable selection procedure for a specific variable of interest, $y$.

Using a working linear regression model with many auxiliary variables, including interactions of categorical auxiliary variables, can produce substantially variable weights, and greatly increase the variance of the GREG estimator. Furthermore, some of the regression weights, $w_i$, $i \in s$, may be negative, thus losing the interpretation of a weight as the number of population units represented by the sampled unit.

2.2   Survey regression estimator with lasso

If the linear regression model in (2.1) is sparse, i.e., $p$ is large, and, say, only $p_0$ of the $p$ regression coefficients are nonzero, then the estimation of the zero coefficients in (2.3) leads to extra variation in the GREG estimator (2.2). In this case, model selection to remove extraneous variables could reduce the overall design variance of the GREG estimator, leading to more efficient estimates of finite population totals. The least absolute shrinkage and selection operator (lasso) method, developed by Tibshirani (1996), simultaneously performs model selection and coefficient estimation by shrinking some regression coefficients to zero. The lasso approach estimates coefficients by minimizing the sum of squared residuals subject to a penalty constraint on the sum of the absolute value of the regression coefficients.

McConville et al. (2017) proposed using survey-weighted lasso estimates of the regression coefficients, given by

$$\hat{\boldsymbol{\beta}}_{s,L} = \underset{\boldsymbol{\beta}}{\arg\min}\, (\mathbf{Y}_s - \mathbf{X}_s \boldsymbol{\beta})^T \boldsymbol{\Pi}_s^{-1} (\mathbf{Y}_s - \mathbf{X}_s \boldsymbol{\beta}) + \lambda \sum_{j=1}^{p} |\beta_j|,$$

where $\lambda \geq 0$. The lasso survey regression estimator for the total $t_y$ is then given by

$$\hat{t}_{y,\mathrm{LASSO}} = \sum_{i \in s} \frac{y_i - \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,L}}{\pi_i} + \sum_{i \in U} \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,L}.$$

The value of the penalty parameter $\lambda$ must be selected prior to obtaining the estimated coefficients. In general, this process of specifying hyperparameters prior to fitting the final model is called hyperparameter tuning. Several selection criteria can be used to choose the value of a hyperparameter, including the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC) and cross-validation. In our simulation study, we used a version of cross-validation that incorporates the design weights; see McConville (2011) for a discussion of the selection of the penalty parameter for survey-weighted lasso coefficient estimates.
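To make the estimator concrete, the following Python sketch minimizes the survey-weighted lasso criterion by cyclic coordinate descent and then forms the lasso survey regression estimator. It is an illustration under stated assumptions, not the implementation used in the simulation study: the function names `soft_threshold`, `weighted_lasso` and `lasso_total` are hypothetical, the penalty is applied to every coefficient exactly as in the displayed criterion, $\lambda$ is taken as given rather than tuned by design-weighted cross-validation, the columns of the design matrix are assumed to be nonzero, and the toy data are simulated.

```python
import numpy as np


def soft_threshold(a, t):
    """Soft-thresholding operator: sign(a) * max(|a| - t, 0)."""
    return np.sign(a) * max(abs(a) - t, 0.0)


def weighted_lasso(X, y, pi, lam, n_iter=500, tol=1e-10):
    """Minimize (y - Xb)' diag(1/pi) (y - Xb) + lam * sum_j |b_j|
    by cyclic coordinate descent (illustrative, hypothetical helper)."""
    w = 1.0 / pi                               # design weights 1 / pi_i
    p = X.shape[1]
    z = (w[:, None] * X ** 2).sum(axis=0)      # sum_i w_i x_ij^2, assumed > 0
    beta = np.zeros(p)
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(p):
            # Partial residual that excludes the contribution of column j.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = np.sum(w * X[:, j] * r_j)
            # Threshold is lam / 2 because the quadratic term is not halved.
            beta[j] = soft_threshold(rho, lam / 2.0) / z[j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta


def lasso_total(X_s, y_s, pi_s, X_U, lam):
    """Lasso survey regression estimator of the total t_y."""
    beta = weighted_lasso(X_s, y_s, pi_s, lam)
    ht_of_residuals = np.sum((y_s - X_s @ beta) / pi_s)   # sum over i in s
    total_of_fits = np.sum(X_U @ beta)                    # sum over i in U
    return ht_of_residuals + total_of_fits, beta


# Toy population and a simple random sample (all values are made up).
rng = np.random.default_rng(0)
N, n = 1000, 100
X_U = np.column_stack([np.ones(N), rng.normal(size=(N, 4))])
y_U = X_U @ np.array([5.0, 2.0, 0.0, 0.0, -1.0]) + rng.normal(size=N)
s = rng.choice(N, size=n, replace=False)
pi_s = np.full(n, n / N)
t_hat, beta_hat = lasso_total(X_U[s], y_U[s], pi_s, X_U, lam=50.0)
print(t_hat, y_U.sum(), beta_hat)
```

In this sketch, setting `lam = 0` reproduces the survey-weighted least squares coefficients when the design matrix has full column rank, whereas a sufficiently large `lam` sets every coefficient to zero, in which case the estimator reduces to the HT estimator of the total.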

2.3   Survey regression estimator with adaptive lasso

An issue with the use of the lasso criterion is that, by shrinking the regression coefficients towards zero, it yields biased estimates for regression coefficients that are far from zero. Under the adaptive lasso criterion (Zou, 2006), the coefficients in the $\ell_1$ penalty are weighted by the inverse of a root-$n$ consistent estimator of $\boldsymbol{\beta}$. Coefficients with large initial estimates therefore receive a smaller penalty, and the bias for large coefficients tends to be smaller.

McConville et al. (2017) considered an adaptive lasso survey regression estimator

$$\hat{t}_{y,\mathrm{ALASSO}} = \sum_{i \in s} \frac{y_i - \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,\mathrm{AL}}}{\pi_i} + \sum_{i \in U} \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,\mathrm{AL}},$$

where

$$\hat{\boldsymbol{\beta}}_{s,\mathrm{AL}} = \underset{\boldsymbol{\beta}}{\operatorname{arg\,min}}\; (\mathbf{Y}_s - \mathbf{X}_s\boldsymbol{\beta})^T \boldsymbol{\Pi}_s^{-1} (\mathbf{Y}_s - \mathbf{X}_s\boldsymbol{\beta}) + \lambda \sum_{j=1}^{p} \frac{|\beta_j|}{|\hat{\beta}_{sj}|}$$

and $\hat{\boldsymbol{\beta}}_s$ is given by (2.3). The reliance of the adaptive lasso method on the standard weighted linear regression coefficient estimates, $\hat{\boldsymbol{\beta}}_s$, leads to a loss of efficiency in settings when $p$ is large because the estimates $\hat{\boldsymbol{\beta}}_s$ tend to be very unstable.
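The adaptive criterion differs from the lasso criterion only in that coefficient $j$ carries its own penalty $\lambda / |\hat{\beta}_{sj}|$. The Python sketch below illustrates this with a coordinate descent that uses per-coefficient thresholds. It rests on several assumptions that should be read as such: the function names are hypothetical, the pilot estimate is taken to be the survey-weighted least squares solution $(\mathbf{X}_s^T \boldsymbol{\Pi}_s^{-1} \mathbf{X}_s)^{-1} \mathbf{X}_s^T \boldsymbol{\Pi}_s^{-1} \mathbf{Y}_s$ (our reading of the description of (2.3) as the standard weighted linear regression estimates), no pilot coefficient is exactly zero, and $\lambda$ is supplied rather than tuned.

```python
import numpy as np


def adaptive_weighted_lasso(X, y, pi, lam, n_iter=1000, tol=1e-10):
    """Survey-weighted adaptive lasso by cyclic coordinate descent.

    Returns the adaptive lasso coefficients and the pilot coefficients
    used to build the penalty weights (illustrative, hypothetical helper).
    """
    w = 1.0 / pi                                     # design weights 1 / pi_i
    XtW = X.T * w                                    # X' diag(w)
    beta_pilot = np.linalg.solve(XtW @ X, XtW @ y)   # assumed form of the pilot fit
    pen = lam / np.abs(beta_pilot)                   # lam / |beta_hat_sj|
    z = (w[:, None] * X ** 2).sum(axis=0)            # sum_i w_i x_ij^2
    beta = beta_pilot.copy()                         # warm start at the pilot fit
    for _ in range(n_iter):
        beta_old = beta.copy()
        for j in range(X.shape[1]):
            r_j = y - X @ beta + X[:, j] * beta[j]   # partial residual
            rho = np.sum(w * X[:, j] * r_j)
            # Coefficient-specific soft threshold pen[j] / 2.
            beta[j] = np.sign(rho) * max(abs(rho) - pen[j] / 2.0, 0.0) / z[j]
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta, beta_pilot


def adaptive_objective(X, y, pi, lam, beta, beta_pilot):
    """Value of the adaptive lasso criterion at beta."""
    r = y - X @ beta
    return np.sum(r ** 2 / pi) + lam * np.sum(np.abs(beta) / np.abs(beta_pilot))


# Toy data (made up): two of the five true coefficients are zero.
rng = np.random.default_rng(0)
n = 200
X_s = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
y_s = X_s @ np.array([5.0, 2.0, 0.0, 0.0, -1.0]) + rng.normal(size=n)
pi_s = np.full(n, 0.1)
beta_al, beta_pilot = adaptive_weighted_lasso(X_s, y_s, pi_s, lam=20.0)
print(beta_pilot)
print(beta_al)
```

Because the penalty weights are the reciprocals of the pilot estimates, any instability in those estimates carries over directly to the thresholds, which is one way to see why the method can lose efficiency when $p$ is large.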

2.4   Lasso calibration estimators

The lasso and adaptive lasso methods do not produce regression weights directly, as the estimators cannot be expressed as weighted combinations of the $y$-values. McConville et al. (2017) developed lasso survey regression weights using a model calibration approach and a ridge regression approximation. These lasso regression weights depend on the variable of interest, $y$.

The lasso calibration estimator is calculated by regressing the variable of interest, $y_i$, on an intercept and the lasso-fitted mean function $\mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,L}$. The lasso calibration estimator can be written in the same form as (2.4), where $\mathbf{x}_i$ is replaced by $\mathbf{x}_i^* = (1, \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,L})^T$:

$$\hat{t}_{y,\mathrm{CLASSO}} = \sum_{i \in s} \left[ 1 + \left( \mathbf{t}_{x^*} - \hat{\mathbf{t}}_{x^*,\mathrm{HT}} \right)^T \left( \sum_{k \in s} \mathbf{x}_k^* \mathbf{x}_k^{*T} d_k \right)^{-1} \mathbf{x}_i^* \right] d_i y_i. \tag{2.5}$$

Similarly, the adaptive lasso calibration estimator is given by

$$\hat{t}_{y,\mathrm{CALASSO}} = \sum_{i \in s} \left[ 1 + \left( \mathbf{t}_{x^{**}} - \hat{\mathbf{t}}_{x^{**},\mathrm{HT}} \right)^T \left( \sum_{k \in s} \mathbf{x}_k^{**} \mathbf{x}_k^{**T} d_k \right)^{-1} \mathbf{x}_i^{**} \right] d_i y_i,$$

where the lasso-fitted mean for $\mathbf{x}_i^*$ in (2.5) is replaced by the adaptive lasso fit, $\mathbf{x}_i^{**} = (1, \mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,\mathrm{AL}})^T$. The weights for the lasso calibration estimators are calibrated to the population size $N$ and to the population total of the lasso-fitted mean functions.
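The calibration property can be checked numerically. The sketch below computes the weights in square brackets in (2.5), multiplied by $d_i$, for a two-component calibration vector consisting of an intercept and a fitted mean. It is only a sketch under assumptions: the function name `calibration_weights` is hypothetical, $d_i$ is taken to be the design weight of unit $i$, and the fitted means are arbitrary simulated numbers standing in for $\mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,L}$ or $\mathbf{x}_i^T \hat{\boldsymbol{\beta}}_{s,\mathrm{AL}}$.

```python
import numpy as np


def calibration_weights(d_s, m_s, m_U):
    """Weights of the form used in (2.5), with x*_i = (1, m_i)'.

    d_s : design weights of the sampled units (assumed to be the d_i)
    m_s : fitted means for the sampled units
    m_U : fitted means for all population units
    """
    m_s = np.asarray(m_s, dtype=float)
    xs = np.column_stack([np.ones_like(m_s), m_s])     # n x 2 matrix of x*_i
    t_x = np.array([len(m_U), np.sum(m_U)])            # (N, sum over U of m_i)
    t_x_ht = xs.T @ d_s                                # HT estimate of t_x
    M = (xs * d_s[:, None]).T @ xs                     # sum_k x*_k x*_k' d_k
    g = 1.0 + xs @ np.linalg.solve(M, t_x - t_x_ht)    # bracketed factor
    return g * d_s


# Toy example: fitted means are made-up numbers, sample is every fifth unit.
rng = np.random.default_rng(0)
m_U = rng.normal(10.0, 2.0, size=500)
idx = np.arange(0, 500, 5)
d_s = np.full(idx.size, 5.0)
w_s = calibration_weights(d_s, m_U[idx], m_U)
print(w_s.sum(), np.sum(w_s * m_U[idx]), m_U.sum())
```

The first printed value should agree with $N = 500$ and the second with the third, up to rounding, which is the calibration to the population size and to the population total of the fitted means described above. Since the fitted means depend on $y$, so do the resulting weights.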

2.5   Regression tree estimator

The GREG estimator can also be expressed as

$$\hat{t}_{y,r} = \sum_{i \in s} \frac{y_i - \hat{h}_n(\mathbf{x}_i)}{\pi_i} + \sum_{i \in U} \hat{h}_n(\mathbf{x}_i), \tag{2.6}$$

where $\hat{h}_n(\mathbf{x}_i)$ is an estimator of the mean function of $Y_i$ given $\mathbf{X}_i = \mathbf{x}_i$, $h(\mathbf{x}_i) = E(Y_i \mid \mathbf{X}_i = \mathbf{x}_i)$, based on the sample data $(y_i, \mathbf{x}_i)$, $i \in s$. As an alternative to a linear regression model, McConville and Toth (2019) proposed estimating $h(\mathbf{x})$ with a regression tree model using the following algorithm:

The resulting regression tree model groups the categories of an auxiliary variable based on their relationship to the variable of interest and only includes auxiliary variables and interactions associated with this variable. Importantly, including a categorical variable does not require a split for each category, potentially reducing the model size substantially while still capturing important interactions.

After fitting a regression tree model, we obtain a set of boxes $Q_n = \{B_{n1}, B_{n2}, \ldots, B_{nq}\}$ which partition the data. Let $I(\mathbf{x}_i \in B_{nk}) = 1$ if $\mathbf{x}_i \in B_{nk}$ and 0 otherwise, for $k = 1, \ldots, q$. This means that $I(\mathbf{x}_i \in B_{nk}) = 1$ for exactly one box $B_{nk} \in Q_n$ for every $i \in s$. For every $\mathbf{x}_i \in B_{nk}$, the estimator of $h(\mathbf{x}_i)$ is given by

$$\tilde{h}_n(\mathbf{x}_i) = \tilde{\#}(B_{nk})^{-1} \sum_{i \in s} \pi_i^{-1} y_i I(\mathbf{x}_i \in B_{nk}) = \tilde{\mu}_{nk}, \tag{2.7}$$

where

$$\tilde{\#}(B_{nk}) = \sum_{i \in s} \pi_i^{-1} I(\mathbf{x}_i \in B_{nk})$$

is the HT estimator of the population size in box $B_{nk}$. The regression tree estimator $\hat{t}_{y,\mathrm{TREE}}$ is obtained by inserting equation (2.7) into the generalized regression estimator, given in equation (2.6), leading to the post-stratified estimator

$$\hat{t}_{y,\mathrm{TREE}} = \sum_{k} N_k \tilde{\mu}_{nk},$$

where $N_k$ is the number of units in $U$ that belong to box $k$.

Since $\tilde{h}_n(\mathbf{x}_i)$ can be written as a linear regression estimator with $q$ indicator function covariates, the regression tree estimator is also a post-stratified estimator, where each box $B_{nk}$ represents a post-stratum. This implies that this estimator is calibrated to the population total of each box, providing a data-driven mechanism, dependent on $y$, for selecting post-strata that ensures that none of them are empty. As a result, the regression weights are guaranteed to be non-negative.
The weights produced by this estimation procedure depend on the variable of interest, $y$. Therefore, unlike the GREG approach, a single set of generic weights to apply to all study variables is not available. Instead, a set of weights for each survey variable of interest is produced.
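Once the boxes are available, the estimator and its weights are simple to compute. The following Python sketch takes the box memberships as given and evaluates $\tilde{\#}(B_{nk})$, $\tilde{\mu}_{nk}$ and $\hat{t}_{y,\mathrm{TREE}}$. It does not fit the tree: the box labels, the function name `tree_total` and the toy data are hypothetical, and every box is assumed to contain at least one sampled unit.

```python
import numpy as np


def tree_total(y_s, pi_s, box_s, box_U, q):
    """Post-stratified form of the regression tree estimator.

    y_s, pi_s : survey variable and inclusion probabilities of sampled units
    box_s     : box label (0, ..., q - 1) of each sampled unit
    box_U     : box label of every population unit
    """
    d = 1.0 / pi_s
    n_hat = np.bincount(box_s, weights=d, minlength=q)              # HT box sizes
    mu = np.bincount(box_s, weights=d * y_s, minlength=q) / n_hat   # box means (2.7)
    N_k = np.bincount(box_U, minlength=q)                           # true box sizes
    weights = d * (N_k / n_hat)[box_s]                              # implied weights
    return np.sum(N_k * mu), weights


# Toy population of 100 units in q = 3 boxes (labels are made up).
box_U = np.repeat([0, 1, 2], [50, 30, 20])
y_U = np.arange(100, dtype=float)
idx = np.r_[0:10, 50:56, 80:84]            # 20 sampled units, every box hit
pi_s = np.full(idx.size, 0.2)
t_tree, w_s = tree_total(y_U[idx], pi_s, box_U[idx], box_U, q=3)
print(t_tree, np.sum(w_s * y_U[idx]), w_s.min())
```

In this form, each sampled unit in box $k$ receives the weight $\pi_i^{-1} N_k / \tilde{\#}(B_{nk})$, a ratio of non-negative quantities, which illustrates why the weights cannot be negative and why they sum to $N_k$ within each box.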

2.6   Variance estimation under stratified simple random sampling

Under stratified simple random sampling, a variance estimator of the model-assisted survey regression estimators described above is obtained by the Taylor linearization method and given by

$$\hat{V}(\hat{t}_y) = \sum_{h} \frac{N_h (N_h - n_h)}{n_h} \, \frac{1}{n_h - 1} \sum_{i \in s_h} (e_{hi} - \bar{e}_h)^2, \tag{2.8}$$

where $h$ indexes the strata, $N_h$ is the number of population units in stratum $h$, $n_h$ is the number of units in the sample $s_h$ from stratum $h$, $e_{hi} = y_{hi} - \hat{h}_n(\mathbf{x}_{hi})$ is the residual of sample unit $i$ in stratum $h$ under the regression model and $\bar{e}_h$ is the average residual in stratum $h$.
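
As an illustration, the variance estimator (2.8) can be sketched in a few lines of Python. This is a minimal sketch and not the code used in the study: the function name `linearization_variance` and its arguments are hypothetical, the residuals are assumed to have already been computed from whichever fitted model is in use, and every stratum is assumed to contain at least two sampled units.

```python
import numpy as np


def linearization_variance(residuals, strata, pop_sizes):
    """Variance estimate of the form (2.8) under stratified SRS.

    residuals : residuals e_hi of the sampled units (hypothetical input,
                assumed to come from an already fitted model)
    strata    : stratum label of each sampled unit, same length as residuals
    pop_sizes : dict mapping each stratum label h to its population size N_h
    """
    residuals = np.asarray(residuals, dtype=float)
    strata = np.asarray(strata)
    total = 0.0
    for h, N_h in pop_sizes.items():
        e_h = residuals[strata == h]      # residuals of the sample s_h
        n_h = e_h.size
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least two sampled units")
        # Sample variance of the residuals: sum (e_hi - e_bar_h)^2 / (n_h - 1)
        s2_h = e_h.var(ddof=1)
        total += N_h * (N_h - n_h) / n_h * s2_h
    return total


# Toy data: two strata with N_a = 10, n_a = 3 and N_b = 5, n_b = 2.
v_hat = linearization_variance(
    residuals=[1.0, -1.0, 0.0, 2.0, 4.0],
    strata=["a", "a", "a", "b", "b"],
    pop_sizes={"a": 10, "b": 5},
)
print(v_hat)
```

In the toy data, stratum `a` contributes $10 \times 7 / 3 \times 1 = 70/3$ and stratum `b` contributes $5 \times 3 / 2 \times 2 = 15$, so the estimate is $70/3 + 15 \approx 38.33$.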

This variance estimator readily extends to more complex sampling designs, but for simplicity we give the expression only for stratified simple random sampling, which is the design used in the simulation study of Section 3.

