Are probability surveys bound to disappear for the production of official statistics?
Section 4. Model-based approaches

Model-based approaches can eliminate the selection bias of the non-probability source and enable valid statistical inferences, provided that their underlying assumptions hold. The objective of the methods in Sections 4.1, 4.2 and 4.3 is to reduce respondent burden and costs by eliminating data collection for some variables of interest in a probability sample. The greater the number of variables of interest for which the values are not collected, the greater the reduction in data collection costs and respondent burden. However, these methods assume that the variables of interest are measured without error in the non-probability sample ($y_k^{*} = y_k$).

From the non-probability sample $s_{\mathrm{NP}}$, we can obtain the naive estimator $\hat{\theta}^{\mathrm{NP}} = N \sum_{k \in s_{\mathrm{NP}}} y_k / n^{\mathrm{NP}}$ of the total $\theta$, where $n^{\mathrm{NP}}$ is the number of units in $s_{\mathrm{NP}}$ and $N$ is the size of the population $U$. It is well known that the selection bias of the naive estimator may be significant (see, for example, Bethlehem, 2016). The objective of the methods in Sections 4.1, 4.2 and 4.3 is to reduce the bias of the naive estimator by using a vector of auxiliary variables, $\mathbf{x}_k$. We use $\mathbf{X}$ to denote the matrix that contains the values of vector $\mathbf{x}_k$, $k \in U$. We assume that $\mathbf{x}_k$ is measured without error in both samples $s_{\mathrm{NP}}$ and $s_{\mathrm{P}}$.
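
The selection bias of the naive estimator is easy to reproduce in a toy simulation. The sketch below is illustrative only: the synthetic population, the participation mechanism (probability increasing with a single auxiliary variable) and the function name `naive_total` are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def naive_total(y_np, N):
    """Naive estimator: N times the mean of y over the non-probability sample."""
    y_np = np.asarray(y_np, dtype=float)
    return N * y_np.sum() / y_np.size

# Hypothetical population in which y increases with an auxiliary variable x
rng = np.random.default_rng(12345)
N = 10_000
x = rng.uniform(0.0, 1.0, size=N)
y = 10.0 + 5.0 * x + rng.normal(0.0, 1.0, size=N)
theta = y.sum()  # true population total

# Hypothetical participation mechanism: units with large x join more often
p = 0.05 + 0.4 * x
delta = rng.uniform(size=N) < p  # indicator of inclusion in s_NP

theta_naive = naive_total(y[delta], N)
rel_bias = (theta_naive - theta) / theta
print(f"true total: {theta:.0f}, naive estimate: {theta_naive:.0f}, "
      f"relative error: {rel_bias:.3f}")
```

Because participation is related to x and x is related to y, the sample over-represents large values of y and the naive estimate overshoots the true total in this setup.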

Section 4.4 briefly discusses small area estimation and the area-level model of Fay and Herriot (1979). Small area estimation methods are generally used to improve the precision of estimates for population sub-groups (domains) that have a small probability sample size. They require collecting the variable $y$ in the probability sample, but not in the non-probability sample. Therefore, they do not require the condition $y_k^{*} = y_k$. Ideally, the non-probability sample contains variables correlated with $y$.

4.1  Calibration of the non-probability sample

The most natural approach to correcting the selection bias of a non-probability source is to model the relationship between the variable of interest $y_k$ and the auxiliary variables $\mathbf{x}_k$, and then predict the total $\theta$ by predicting the variable $y_k$ for each unit outside the non-probability sample. This prediction approach is described in Royall (1970) and generalized in Royall (1976); see also Elliott and Valliant (2017). Readers are referred to Valliant, Dorfman and Royall (2000) for more details. With this approach, inferences are conditional on $\boldsymbol{\delta}$ and $\mathbf{X}$. As a result, $\mathbf{Y}$ is considered random, as is $\boldsymbol{\Omega}$ (unless $\boldsymbol{\Omega} = \mathbf{X}$). If a probability sample is used, $\mathbf{I}$ is also considered random. It is usually assumed that the non-probability sample selection mechanism is not informative:

Assumption 3: $\mathbf{Y}$ and $\boldsymbol{\delta}$ are independent after conditioning on $\mathbf{X}$.

Assumption 3 is the key to eliminating the selection bias. The more access we have to auxiliary variables that are strongly related to both $y_k$ and $\delta_k$, the more plausible Assumption 3 becomes. In other words, the richer $\mathbf{X}$ is, the more realistic the conditional independence between $\mathbf{Y}$ and $\boldsymbol{\delta}$ becomes. This assumption, called the exchangeability assumption, is discussed in Mercer, Kreuter, Keeter and Stuart (2017). Schonlau and Couper (2017) also discuss the selection of auxiliary variables and emphasize their key role in reducing selection bias.

Often, a linear model is considered where it is assumed that the observations $y_k$ are mutually independent with $E(y_k \mid \mathbf{X}) = \mathbf{x}_k^{\prime} \boldsymbol{\beta}$ and $\mathrm{var}(y_k \mid \mathbf{X}) \propto v_k$, where $\boldsymbol{\beta}$ is a vector of unknown model parameters and $v_k$ is a known function of $\mathbf{x}_k$. The best linear unbiased predictor of $\theta$ (see, for example, Valliant, Dorfman and Royall, 2000) is given by

$$\hat{\theta}^{\mathrm{BLUP}} = \sum_{k \in s_{\mathrm{NP}}} y_k + \sum_{k \in U - s_{\mathrm{NP}}} \mathbf{x}_k^{\prime} \hat{\boldsymbol{\beta}} = \mathbf{T}_{\mathbf{x}}^{\prime} \hat{\boldsymbol{\beta}} + \sum_{k \in s_{\mathrm{NP}}} \left( y_k - \mathbf{x}_k^{\prime} \hat{\boldsymbol{\beta}} \right), \qquad (4.1)$$

where

$$\hat{\boldsymbol{\beta}} = \left( \sum_{k \in s_{\mathrm{NP}}} v_k^{-1} \mathbf{x}_k \mathbf{x}_k^{\prime} \right)^{-1} \sum_{k \in s_{\mathrm{NP}}} v_k^{-1} \mathbf{x}_k y_k .$$
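
As a rough numerical illustration of (4.1), the following sketch computes the weighted least squares estimate of the regression coefficients and evaluates both forms of the predictor. The function name `blup_total`, the synthetic population and the participation mechanism are hypothetical choices made for this example; the variance function is taken as constant ($v_k = 1$) purely for simplicity.

```python
import numpy as np

def blup_total(x_np, y_np, v_np, T_x):
    """Predictor (4.1) in its second form: T_x' beta_hat + sum of sample residuals."""
    x_np = np.asarray(x_np, dtype=float)         # shape (n, p), rows are x_k'
    y_np = np.asarray(y_np, dtype=float)         # shape (n,)
    v_inv = 1.0 / np.asarray(v_np, dtype=float)  # v_k^{-1}
    A = (x_np * v_inv[:, None]).T @ x_np         # sum of v_k^{-1} x_k x_k'
    b = (x_np * v_inv[:, None]).T @ y_np         # sum of v_k^{-1} x_k y_k
    beta_hat = np.linalg.solve(A, b)
    resid = y_np - x_np @ beta_hat
    return np.asarray(T_x, dtype=float) @ beta_hat + resid.sum(), beta_hat

# Hypothetical population: intercept plus one auxiliary variable
rng = np.random.default_rng(2024)
N = 5_000
x1 = rng.uniform(0.0, 1.0, size=N)
X = np.column_stack([np.ones(N), x1])
y = 10.0 + 5.0 * x1 + rng.normal(0.0, 1.0, size=N)
theta = y.sum()
T_x = X.sum(axis=0)  # control totals, assumed known here

# Hypothetical participation that depends on x only (non-informative given x)
delta = rng.uniform(size=N) < 0.05 + 0.4 * x1

theta_blup, beta_hat = blup_total(X[delta], y[delta], np.ones(delta.sum()), T_x)
# First form of (4.1): observed y in s_NP plus predictions outside s_NP
theta_blup_alt = y[delta].sum() + (X[~delta] @ beta_hat).sum()
theta_naive = N * y[delta].mean()

print(f"true: {theta:.0f}  naive: {theta_naive:.0f}  BLUP: {theta_blup:.0f}")
```

In this setup the two forms of (4.1) agree up to rounding, and the predictor is much closer to the true total than the naive estimator, because the linear model holds and participation depends only on the auxiliary variable.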

The predictor $\hat{\theta}^{\mathrm{BLUP}}$ can also be rewritten in the weighted form $\hat{\theta}^{\mathrm{BLUP}} = \sum_{k \in s_{\mathrm{NP}}} w_k^{C} y_k$, where

$$w_k^{C} = 1 + v_k^{-1} \mathbf{x}_k^{\prime} \left( \sum_{k \in s_{\mathrm{NP}}} v_k^{-1} \mathbf{x}_k \mathbf{x}_k^{\prime} \right)^{-1} \left( \mathbf{T}_{\mathbf{x}} - \sum_{k \in s_{\mathrm{NP}}} \mathbf{x}_k \right). \qquad (4.2)$$

It can easily be shown that $w_k^{C}$ is a calibrated weight that satisfies the calibration equation $\sum_{k \in s_{\mathrm{NP}}} w_k^{C} \mathbf{x}_k = \mathbf{T}_{\mathbf{x}}$. Therefore, the prediction approach is equivalent to calibration when a linear model is used to describe the relationship between $y_k$ and $\mathbf{x}_k$. The calibration equation satisfies what Mercer et al. (2017) call the composition assumption. This approach requires knowing the vector of control totals $\mathbf{T}_{\mathbf{x}}$. If it is unknown, an alternative is to replace it in (4.1) or (4.2) with an estimate, $\hat{\mathbf{T}}_{\mathbf{x}} = \sum_{k \in s_{\mathrm{P}}} w_k \mathbf{x}_k$, from a probability survey (Elliott and Valliant, 2017).
If Assumptions 1 to 3 are satisfied, it can be shown that the predictor $\hat{\theta}^{\mathrm{BLUP}}$ is unbiased, i.e., $E(\hat{\theta}^{\mathrm{BLUP}} - \theta \mid \boldsymbol{\delta}, \mathbf{X}) = 0$, whether $\mathbf{T}_{\mathbf{x}}$ or $\hat{\mathbf{T}}_{\mathbf{x}}$ is used, provided that the latter is design-unbiased, i.e., $E(\hat{\mathbf{T}}_{\mathbf{x}} \mid \boldsymbol{\Omega}_{P}) = \mathbf{T}_{\mathbf{x}}$. Of course, the unbiasedness property of the predictor $\hat{\theta}^{\mathrm{BLUP}}$ requires the linear model to be valid.
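
The calibration property of the weights in (4.2), and their equivalence with (4.1), follow from a short substitution. The derivation below is a sketch; the shorthand $\mathbf{A}$ and the summation index $l$ are introduced here for readability and are not notation from the paper.

```latex
\begin{align*}
\mathbf{A} &= \sum_{l \in s_{\mathrm{NP}}} v_l^{-1} \mathbf{x}_l \mathbf{x}_l^{\prime},
\qquad
w_k^{C} = 1 + v_k^{-1} \mathbf{x}_k^{\prime} \mathbf{A}^{-1}
  \Big( \mathbf{T}_{\mathbf{x}} - \sum_{l \in s_{\mathrm{NP}}} \mathbf{x}_l \Big), \\
\sum_{k \in s_{\mathrm{NP}}} w_k^{C} \mathbf{x}_k
  &= \sum_{k \in s_{\mathrm{NP}}} \mathbf{x}_k
   + \Big( \sum_{k \in s_{\mathrm{NP}}} v_k^{-1} \mathbf{x}_k \mathbf{x}_k^{\prime} \Big)
     \mathbf{A}^{-1}
     \Big( \mathbf{T}_{\mathbf{x}} - \sum_{l \in s_{\mathrm{NP}}} \mathbf{x}_l \Big)
   = \mathbf{T}_{\mathbf{x}}, \\
\sum_{k \in s_{\mathrm{NP}}} w_k^{C} y_k
  &= \sum_{k \in s_{\mathrm{NP}}} y_k
   + \hat{\boldsymbol{\beta}}^{\prime}
     \Big( \mathbf{T}_{\mathbf{x}} - \sum_{l \in s_{\mathrm{NP}}} \mathbf{x}_l \Big)
   = \mathbf{T}_{\mathbf{x}}^{\prime} \hat{\boldsymbol{\beta}}
   + \sum_{k \in s_{\mathrm{NP}}} \big( y_k - \mathbf{x}_k^{\prime} \hat{\boldsymbol{\beta}} \big).
\end{align*}
```

The second line uses the fact that $w_k^{C}$ is a scalar, so the bracketed sum is exactly $\mathbf{A}$ and cancels with $\mathbf{A}^{-1}$. The third line uses the symmetry of $\mathbf{A}$, which gives $\hat{\boldsymbol{\beta}}^{\prime} = \big( \sum_{k \in s_{\mathrm{NP}}} v_k^{-1} y_k \mathbf{x}_k^{\prime} \big) \mathbf{A}^{-1}$; its last expression is the second form of (4.1).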

Remark: In practice, auxiliary variables for which the population total is known are usually few in number and not sufficiently predictive of the variable $y$ to eliminate the selection bias. These may be supplemented with other auxiliary variables for which the total can be estimated using an existing probability survey. Therefore, the vector of population totals may be a blend of known and estimated totals. If the probability survey itself is calibrated to known population totals, then the estimated totals $\hat{\mathbf{T}}_{\mathbf{x}}$ from the probability survey can be used on their own, since they reproduce the known totals for the calibration variables.
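
The blend of known and estimated totals described in the remark can be sketched as follows. Everything specific here is an assumption made for illustration: the function name `calibration_weights`, the synthetic population, a simple random sample playing the role of the probability survey (design weights $N / n_P$), and a constant variance function ($v_k = 1$).

```python
import numpy as np

def calibration_weights(x_np, v_np, T_x):
    """Weights (4.2): 1 + v_k^{-1} x_k' A^{-1} (T_x - sum over s_NP of x_k)."""
    x_np = np.asarray(x_np, dtype=float)         # shape (n, p), rows are x_k'
    v_inv = 1.0 / np.asarray(v_np, dtype=float)  # v_k^{-1}
    A = (x_np * v_inv[:, None]).T @ x_np         # sum of v_k^{-1} x_k x_k'
    gap = np.asarray(T_x, dtype=float) - x_np.sum(axis=0)
    lam = np.linalg.solve(A, gap)
    return 1.0 + v_inv * (x_np @ lam)

# Hypothetical population: intercept plus one auxiliary variable
rng = np.random.default_rng(7)
N = 5_000
x1 = rng.uniform(0.0, 1.0, size=N)
X = np.column_stack([np.ones(N), x1])

# Hypothetical probability survey: simple random sample with weights N / n_P
n_P = 500
s_P = rng.choice(N, size=n_P, replace=False)
w_P = np.full(n_P, N / n_P)
T_hat_x = w_P @ X[s_P]  # estimated totals from the probability survey

# Blend: known population size N, estimated total for x1
T_blend = np.array([float(N), T_hat_x[1]])

# Hypothetical non-probability sample and its calibrated weights
delta = rng.uniform(size=N) < 0.05 + 0.4 * x1
w_C = calibration_weights(X[delta], np.ones(delta.sum()), T_blend)

print("calibrated totals:", w_C @ X[delta])
print("target totals:    ", T_blend)
```

In this example the design weights sum to N, so the first estimated total already equals the known population size; this is the situation in which the estimated totals can be used on their own.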

A linear model is not always appropriate. This is the case when the variable $y$ is categorical. Another typical example occurs when it is desired to estimate the total of a quantitative variable in a domain of interest. The variable $y$ is then defined as the product of that quantitative variable and a binary variable indicating domain membership. To model such a variable, it is natural to consider a mixture of a degenerate distribution at 0 and a continuous distribution.
When the relationship between $y_k$ and $\mathbf{x}_k$ is not linear, the model-assisted calibration of Wu and Sitter (2001) can be used to preserve the weighted form of the predictor of $\theta$ while taking into account the non-linearity of the relationship. Suppose that we replace the above linear model with a non-linear (or non-parametric) model such that $E(y_k \mid \mathbf{X}) = h(\mathbf{x}_k)$, where $h(\cdot)$ is some function.
The Wu and Sitter (2001) calibration first involves predicting y k MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=laadMhada WgaaWcbaGaam4Aaaqabaaaaa@3916@ by y ^ k = h ^ ( x k ) , MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=lqadMhaga qcamaaBaaaleaacaWGRbaabeaakiaaysW7caaMc8Uaeyypa0JaaGjb VlaaykW7ceWGObGbaKaacaaMc8+aaeWabeaacaWH4bWaaSbaaSqaai aadUgaaeqaaaGccaGLOaGaayzkaaGaaiilaaaa@474F@ k U , MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=laadUgaca aMe8UaaGPaVlabgIGiolaaysW7caaMc8UaamyvaiaacYcaaaa@412A@ where h ^ ( x k ) MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=lqadIgaga qcaiaaykW7daqadeqaaiaahIhadaWgaaWcbaGaam4Aaaqabaaakiaa wIcacaGLPaaaaaa@3D35@ is a model-based estimate of h ( x k ) . 
MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=laadIgaca aMc8+aaeWabeaacaWH4bWaaSbaaSqaaiaadUgaaeqaaaGccaGLOaGa ayzkaaGaaiOlaaaa@3DD7@ Then, the total T y ^ = k U y ^ k MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=laadsfada WgaaWcbaGabmyEayaajaaabeaakiaaysW7caaMc8Uaeyypa0JaaGjb VlaaykW7daaeqaqaaiaayIW7ceWG5bGbaKaadaWgaaWcbaGaam4Aaa qabaaabaGaam4AaiaaykW7cqGHiiIZcaaMc8Uaamyvaaqab0Gaeyye Iuoaaaa@4C46@ is calculated, and weights, w k MC , MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=laadEhada qhaaWcbaGaam4Aaaqaaiaab2eacaqGdbaaaOGaaiilaaaa@3B65@ k s NP , MathType@MTEF@5@5@+= feaagKart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiFu0Je9sqqrpepC0xbbL8F4rqqrpgpC0xe9GqFf0xc9 qqpeuf0xe9q8qiYRWFGCk9vi=dbbf9v8Gq0db9qqpm0dXdHqpq0=vr 0=vr0=edbaqaaeGaciGaaiaabeqaamaabaabaaGcbqau=laadUgaca aMe8UaaGPaVlabgIGiolaaysW7caaMc8Uaam4CamaaBaaaleaacaqG obGaaeiuaaqabaGccaGGSaaaaa@4322@ are found that satisfy the calibration equation:

$$\sum_{k \in s_{\mathrm{NP}}} w_k^{\mathrm{MC}} \begin{pmatrix} 1 \\ \hat{y}_k \end{pmatrix} = \begin{pmatrix} N \\ T_{\hat{y}} \end{pmatrix}.$$

In other words, equation (4.2) can be used, where $\mathbf{x}_k'$ is replaced with $(1, \hat{y}_k)$. This method requires knowing the population size $N$ as well as the vector $\mathbf{x}_k$ for all units in the population $U$. If $N$ and $T_{\hat{y}}$ are unknown, they can be replaced with estimates from a probability survey. For example, we can replace $N$ with $\hat{N} = \sum_{k \in s_P} w_k$ and $T_{\hat{y}}$ with $\hat{T}_{\hat{y}} = \sum_{k \in s_P} w_k \hat{y}_k$. The approach can also be extended to the case of multiple variables of interest.
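The two calibration constraints can be illustrated with a minimal Python sketch. It assumes linear calibration weights of the form $w_k = d\,(1 + \lambda_0 + \lambda_1 \hat{y}_k)$ with equal starting weights $d = N / n^{\mathrm{NP}}$; this form is chosen for illustration only and is not necessarily identical to equation (4.2). The function `h_hat`, the toy data and all variable names are hypothetical.

```python
def model_calibration_weights(y_hat_sample, N, T_y_hat):
    """Return weights w_k, k in s_NP, such that
    sum(w_k) = N and sum(w_k * y_hat_k) = T_y_hat.

    Assumed form (illustrative): w_k = d * (1 + lam0 + lam1 * y_hat_k),
    with equal starting weights d = N / n.
    """
    n = len(y_hat_sample)
    d = N / n
    # Entries of the 2x2 matrix sum_k d * z_k z_k', with z_k = (1, y_hat_k)
    s0 = d * n
    s1 = d * sum(y_hat_sample)
    s2 = d * sum(v * v for v in y_hat_sample)
    # Gap between the calibration targets and the starting weighted totals
    r0 = N - s0
    r1 = T_y_hat - s1
    det = s0 * s2 - s1 * s1
    if det == 0:
        raise ValueError("y_hat is constant in the sample; constraints are collinear")
    # Solve the 2x2 linear system for (lam0, lam1)
    lam0 = (s2 * r0 - s1 * r1) / det
    lam1 = (s0 * r1 - s1 * r0) / det
    return [d * (1.0 + lam0 + lam1 * v) for v in y_hat_sample]


def h_hat(x):
    """Hypothetical fitted non-linear model (stands in for any estimate of h)."""
    return x ** 2


# Toy population: x_k is known for every unit in U
x_population = [1, 2, 3, 4, 5, 6]
N = len(x_population)
T_y_hat = sum(h_hat(x) for x in x_population)

# Toy non-probability sample with observed y values
x_sample = [1, 2, 4, 5]
y_sample = [1.2, 3.9, 16.5, 24.1]
y_hat_sample = [h_hat(x) for x in x_sample]

w_mc = model_calibration_weights(y_hat_sample, N, T_y_hat)
theta_mc = sum(w * y for w, y in zip(w_mc, y_sample))  # weighted predictor of the total
```

Linear calibration can yield negative weights when the sample is very unbalanced with respect to $\hat{y}_k$; other distance functions avoid this at the cost of an iterative solution.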

We mentioned that the selection bias may be considerably reduced if $\mathbf{x}_k$ is rich and contains variables that are related to both $\delta_k$ and $y_k$, which makes assumption 3 more realistic. It can therefore be useful in practice to consider a large number of potential auxiliary variables and select the most relevant ones using a variable selection technique. Chen, Valliant and Elliott (2018) suggest the LASSO technique for selecting auxiliary variables and show its good properties.

It should be noted that the predictor $\hat{\theta}^{\mathrm{BLUP}}$ reduces to the naive estimator, $\hat{\theta}^{\mathrm{NP}}$, in the simplest case possible where only one constant auxiliary variable is used: $x_k = 1$, $k \in U$. The naive estimator is usually highly biased. Its bias can be significantly reduced if the population $U$ can be subdivided into $H$ disjoint and exhaustive post-strata, $U_h$, $h = 1, \ldots, H$, of size $N_h$. The post-stratification model, $E(y_k \mid \mathbf{X}) = \beta_h$, $k \in U_h$, is then postulated, which is an important special case of the above linear model. Assuming that the variance $\operatorname{var}(y_k \mid \mathbf{X})$ is constant for $k \in U_h$, the predictor $\hat{\theta}^{\mathrm{BLUP}}$ is written: $\hat{\theta}^{\mathrm{BLUP}} = \sum_{h=1}^{H} N_h \hat{\beta}_h$, where $\hat{\beta}_h = \sum_{k \in s_{\mathrm{NP},h}} y_k / n_h^{\mathrm{NP}}$, $s_{\mathrm{NP},h}$ is the set of units in $U_h$ that are part of the sample $s_{\mathrm{NP}}$ and $n_h^{\mathrm{NP}}$ is the size of $s_{\mathrm{NP},h}$. If the population sizes $N_h$ are unknown, they can be replaced with estimates, $\hat{N}_h = \sum_{k \in s_{P,h}} w_k$, from a probability survey, where $s_{P,h}$ is the set of units in $U_h$ that are part of the sample $s_P$. Regression trees could prove to be an interesting approach for forming post-strata, especially when the auxiliary variables are categorical.
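The post-stratified predictor and its reduction to the naive estimator can be sketched in a few lines of Python. The function names, stratum labels and toy data below are hypothetical and serve only to illustrate the formulas $\hat{\theta}^{\mathrm{BLUP}} = \sum_h N_h \hat{\beta}_h$ and $\hat{\theta}^{\mathrm{NP}} = N \sum_{k \in s_{\mathrm{NP}}} y_k / n^{\mathrm{NP}}$.

```python
from collections import defaultdict


def poststratified_total(y, strata, N_h):
    """Post-stratified predictor: sum over h of N_h times the sample mean of y in h."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for y_k, h in zip(y, strata):
        sums[h] += y_k
        counts[h] += 1
    empty = [h for h in N_h if counts[h] == 0]
    if empty:
        # beta_hat_h is undefined when a post-stratum has no sampled unit
        raise ValueError(f"no sampled units in post-strata: {empty}")
    return sum(N_h[h] * sums[h] / counts[h] for h in N_h)


def naive_total(y, N):
    """Naive estimator: N times the unweighted sample mean."""
    return N * sum(y) / len(y)


# Toy non-probability sample in which post-stratum "b" is over-represented
y = [10.0, 12.0, 30.0, 34.0, 32.0]
strata = ["a", "a", "b", "b", "b"]
N_h = {"a": 60, "b": 40}  # known (or estimated) post-stratum sizes

theta_ps = poststratified_total(y, strata, N_h)
theta_naive = naive_total(y, sum(N_h.values()))

# With a single post-stratum (x_k = 1 for all k), the two estimators coincide
theta_one_stratum = poststratified_total(y, ["all"] * len(y), {"all": 100})
```

In this toy example the naive estimator is pulled upward because units with large $y_k$ are over-represented, whereas the post-stratified predictor reweights each post-stratum to its population size.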

If multiple categorical auxiliary variables are available, it can be useful to form a large number of post-strata to reduce the selection bias. If many auxiliary variables are crossed, the sample sizes $n_h^{\mathrm{NP}}$ could become very small, thereby making the estimators $\hat{\beta}_h$ very unstable. Gelman and Little (1997) suggest using a multi-level regression model to obtain estimators $\tilde{\beta}_h$ that are more stable than $\hat{\beta}_h$. They then consider the post-stratified predictor: $\hat{\theta}^{\mathrm{MRP}} = \sum_{h=1}^{H} N_h \tilde{\beta}_h$. Nowadays, this method is known as Mr.P or MRP (Multilevel Regression and Poststratification); see, for example, Mercer et al. (2017). A similar approach would use small area estimation methods (Rao and Molina, 2015) to stabilize the estimators $\hat{\beta}_h$. Although such methods are likely to produce much more precise estimates of the average of variable $y$ over the population $U_h$, it remains to be determined whether such methods can produce significant efficiency gains for estimating the overall total $\theta$ compared to the simple post-stratified predictor $\hat{\theta}^{\mathrm{BLUP}} = \sum_{h=1}^{H} N_h \hat{\beta}_h$. It seems that regression trees provide another way to control the instability of the estimators $\hat{\beta}_h$ since a criterion is generally used to prevent an overly narrow subdivision of the population. These various methods warrant further investigation in future research. Precise estimation of population sizes $N_h$, if not known, is also a problem not to be overlooked when the population is divided into a large number of post-strata.
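To give a feel for how a multi-level model stabilizes the post-stratum estimators, the following Python sketch uses a deliberately simplified stand-in: a random-intercept model with no covariates, $\beta_h \sim N(\mu, \tau^2)$ and $\hat{\beta}_h \mid \beta_h \sim N(\beta_h, \sigma^2 / n_h^{\mathrm{NP}})$, with the variance components $\sigma^2$ and $\tau^2$ assumed known. This is not the full model of Gelman and Little (1997); the function name and toy values are hypothetical.

```python
def shrunken_means(beta_hat, n, sigma2, tau2):
    """Shrink each post-stratum mean towards a pooled mean.

    beta_tilde_h = gamma_h * beta_hat_h + (1 - gamma_h) * mu_hat,
    gamma_h = tau2 / (tau2 + sigma2 / n_h),
    mu_hat  = sum(gamma_h * beta_hat_h) / sum(gamma_h).
    Assumes tau2 > 0 and known variance components (illustrative only).
    """
    gamma = {h: tau2 / (tau2 + sigma2 / n[h]) for h in beta_hat}
    mu_hat = sum(gamma[h] * beta_hat[h] for h in beta_hat) / sum(gamma.values())
    beta_tilde = {h: gamma[h] * beta_hat[h] + (1.0 - gamma[h]) * mu_hat
                  for h in beta_hat}
    return beta_tilde, mu_hat


# Toy post-strata: "c" has only two sampled units, so its mean is unstable
beta_hat = {"a": 10.0, "b": 20.0, "c": 50.0}
n = {"a": 50, "b": 40, "c": 2}
N_h = {"a": 500, "b": 400, "c": 100}

beta_tilde, mu_hat = shrunken_means(beta_hat, n, sigma2=100.0, tau2=25.0)

theta_ps = sum(N_h[h] * beta_hat[h] for h in N_h)     # simple post-stratified predictor
theta_mrp = sum(N_h[h] * beta_tilde[h] for h in N_h)  # shrinkage-based predictor
```

The estimate for the sparsely sampled post-stratum is pulled strongly towards the pooled mean, while the well-sampled post-strata are left almost unchanged; whether this translates into a gain for the overall total is precisely the open question raised above.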

4.2  Statistical matching

Statistical matching, or data fusion, is an approach developed for combining data from two different sources that contain both source-specific variables and common variables. Readers are referred to D’Orazio, Di Zio and Scanu (2006) or Rässler (2012) for a review of statistical matching methods. In the context of this article, statistical matching involves modelling the relationship between $y_k$ and the auxiliary variables $\mathbf{x}_k$, which are common to both sources, using data from the non-probability sample. As with calibration, the non-probability sample selection mechanism is assumed to be non-informative, and the auxiliary variables must be chosen carefully in order to make assumption 3 as plausible as possible. Once a model has been determined, it is used to predict the $y$ values in a probability sample. Statistical matching can be viewed as an imputation problem with an imputation rate of 100%. The predictor of $\theta$, obtained from the probability sample, takes the form: $\hat{\theta}^{\mathrm{SM}} = \sum_{k \in s_P} w_k y_k^{\mathrm{imp}}$, where $y_k^{\mathrm{imp}}$ is the imputed value for the unit $k \in s_P$. As in calibration, inferences are conditional on $\boldsymbol{\delta}$ and $\mathbf{X}$. Assumption 3, in a statistical matching context, can be viewed as analogous to the Population Missing At Random (PMAR) assumption introduced by Berg, Kim and Skinner (2016) in a non-response context.
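The mechanics of statistical matching can be sketched in Python as follows. The sketch assumes, purely for illustration, an imputation model that is a simple linear regression with an intercept and one auxiliary variable, fitted by ordinary least squares on the non-probability sample; the function names and toy data are hypothetical.

```python
def fit_simple_ols(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x (one covariate plus intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((x_k - mean_x) ** 2 for x_k in x)
    if sxx == 0:
        raise ValueError("x is constant in the sample; the slope is not identified")
    sxy = sum((x_k - mean_x) * (y_k - mean_y) for x_k, y_k in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def statistical_matching_total(x_np, y_np, x_p, w_p):
    """Fit the model on s_NP, impute y for every unit of s_P, and weight up."""
    b0, b1 = fit_simple_ols(x_np, y_np)
    y_imp = [b0 + b1 * x_k for x_k in x_p]         # 100% imputation in s_P
    return sum(w_k * y_k for w_k, y_k in zip(w_p, y_imp))


# Non-probability sample: x and y are both observed
x_np = [1.0, 2.0, 3.0, 4.0]
y_np = [5.0, 8.0, 11.0, 14.0]

# Probability sample: only x and the survey weights are available
x_p = [2.0, 5.0, 10.0]
w_p = [10.0, 20.0, 5.0]

theta_sm = statistical_matching_total(x_np, y_np, x_p, w_p)

# Equivalent form: estimated totals of (1, x) times the fitted coefficients
b0, b1 = fit_simple_ols(x_np, y_np)
N_hat = sum(w_p)
T_x_hat = sum(w_k * x_k for w_k, x_k in zip(w_p, x_p))
theta_sm_alt = b0 * N_hat + b1 * T_x_hat
```

Note that the $y$ values are never collected in the probability sample: it contributes only the weights and the auxiliary variables, which is where the reduction in respondent burden comes from.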

If the linear regression model $E(y_k \mid \mathbf{X}) = \mathbf{x}_k^{\prime}\boldsymbol{\beta}$ is used, the imputed value for the unit $k \in s_P$ is $y_k^{\mathrm{imp}} = \mathbf{x}_k^{\prime}\hat{\boldsymbol{\beta}}$ and the resulting predictor is given by $\hat{\theta}^{\mathrm{SM}} = \hat{\mathbf{T}}_{\mathbf{x}}^{\prime}\hat{\boldsymbol{\beta}}$. If assumptions 1 to 3 are satisfied and $E(\hat{\mathbf{T}}_{\mathbf{x}} \mid \boldsymbol{\Omega}_P) = \mathbf{T}_{\mathbf{x}}$, statistical matching produces an unbiased predictor, $\hat{\theta}^{\mathrm{SM}}$, i.e., $E(\hat{\theta}^{\mathrm{SM}} - \theta \mid \boldsymbol{\delta}, \mathbf{X}) = 0$. Also, if $v_k = \mathbf{x}_k^{\prime}\boldsymbol{\lambda}$, for a certain known vector $\boldsymbol{\lambda}$, it can be shown that $\sum_{k \in s_{\mathrm{NP}}} (y_k - \mathbf{x}_k^{\prime}\hat{\boldsymbol{\beta}}) = 0$, and the predictor $\hat{\theta}^{\mathrm{SM}}$ is equivalent to the predictor $\hat{\theta}^{\mathrm{BLUP}}$ if we replace $\mathbf{T}_{\mathbf{x}}$ in (4.1) with $\hat{\mathbf{T}}_{\mathbf{x}}$. It can also be shown that, for a post-stratification model where we impute $y_k$, $k \in s_{P,h}$, with $y_k^{\mathrm{imp}} = \hat{\beta}_h$, the predictor $\hat{\theta}^{\mathrm{SM}}$ reduces to $\hat{\theta}^{\mathrm{SM}} = \sum_{h=1}^{H} \hat{N}_h \hat{\beta}_h$. Therefore, statistical matching and calibration produce similar predictors, even identical in some cases, when a linear model is postulated and the totals $\mathbf{T}_{\mathbf{x}}$ are estimated.
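
The regression-imputation predictor can be illustrated with a minimal Python sketch. It assumes ordinary least squares (constant $v_k$, with an intercept included in $\mathbf{x}_k$) and simulated data; the function name `regression_matching_total` and the arrays `x_np`, `y_np`, `x_p` and `w_p` are hypothetical and do not come from the source.

```python
import numpy as np

def regression_matching_total(x_np, y_np, x_p, w_p):
    """Statistical matching by linear regression imputation (illustrative sketch)."""
    # Fit the linear model on the non-probability sample by ordinary least
    # squares (assumption: constant v_k).
    beta_hat, *_ = np.linalg.lstsq(x_np, y_np, rcond=None)
    # Impute y for every unit of the probability sample.
    y_imp = x_p @ beta_hat
    # Survey-weighted sum of the imputed values.
    return float(w_p @ y_imp), beta_hat

# Simulated (hypothetical) data: an intercept plus one auxiliary variable.
rng = np.random.default_rng(0)
x_np = np.column_stack([np.ones(200), rng.normal(size=200)])
y_np = x_np @ np.array([2.0, 3.0]) + rng.normal(scale=0.1, size=200)
x_p = np.column_stack([np.ones(50), rng.normal(size=50)])
w_p = np.full(50, 20.0)  # survey weights of the probability sample

theta_sm, beta_hat = regression_matching_total(x_np, y_np, x_p, w_p)
t_hat_x = w_p @ x_p  # estimated totals of the auxiliary variables
```

In this sketch `theta_sm` coincides with `t_hat_x @ beta_hat` up to rounding, and the residuals on the non-probability sample sum to zero because the intercept makes a constant $v_k$ a linear combination of the components of $\mathbf{x}_k$.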

The choice between statistical matching and calibration can depend on the user’s perspective. For example, if it is the content of the non-probability source, in terms of variables of interest, that is relevant to the user, then it seems natural to weight the non-probability sample in the hope of reducing the selection bias for all variables of interest. The calibration technique or the methods in Section 4.3 are obvious choices for such weighting. Conversely, if it is the content of the probability survey that is relevant, then statistical matching is the appropriate choice. This method enriches the probability survey by imputing the missing variables of interest.

Statistical matching is easily generalized to non-linear or non-parametric models such that $E(y_k \mid \mathbf{X}) = h(\mathbf{x}_k)$. The imputed values $y_k^{\mathrm{imp}}$ are simply obtained by predicting the missing values $y_k$, $k \in s_P$, using the chosen model. The predictor $\hat{\theta}^{\mathrm{SM}} = \sum_{k \in s_P} w_k y_k^{\mathrm{imp}}$ remains unbiased if assumptions 1 to 3 are satisfied and if $E(y_k^{\mathrm{imp}} - y_k \mid \boldsymbol{\delta}, \mathbf{X}) = 0$. Donor or nearest neighbour imputation is a non-parametric imputation method commonly used for handling non-response (see, for example, Beaumont and Bocci, 2009) that does not require a linear relationship between $y_k$ and $\mathbf{x}_k$. In the context of matching non-probability and probability samples, donor imputation was popularized by Rivers (2007). For a given unit $k \in s_P$, the method involves finding the nearest donor, with respect to the auxiliary variables $\mathbf{x}$, among the units of the non-probability sample and replacing the missing value $y_k$ with the $y$ value from this donor. For donor imputation, the condition $E(y_k^{\mathrm{imp}} - y_k \mid \boldsymbol{\delta}, \mathbf{X}) = 0$ is satisfied if, for each recipient $k \in s_P$, the donor has exactly the same values of $\mathbf{x}$ as the recipient. When one or more auxiliary variables are continuous, this condition is satisfied only asymptotically in general. A very large non-probability sample provides a large pool of donors, which should help to approximately satisfy this condition.
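
Nearest-neighbour donor imputation as described above can be sketched in a few lines of Python. The Euclidean distance, the toy data and the names `nearest_donor_impute`, `x_np`, `y_np`, `x_p` and `w_p` are assumptions made for illustration; the source does not prescribe a distance function.

```python
import numpy as np

def nearest_donor_impute(x_np, y_np, x_p):
    """Impute y for each recipient from its nearest donor (illustrative sketch)."""
    # Squared Euclidean distance between every recipient (rows) and every
    # donor (columns); the choice of distance is an assumption.
    d2 = ((x_p[:, None, :] - x_np[None, :, :]) ** 2).sum(axis=2)
    # Index of the closest donor in the non-probability sample.
    donor = d2.argmin(axis=1)
    return y_np[donor], donor

# Toy (hypothetical) data: four donors, two recipients, one auxiliary variable.
x_np = np.array([[0.0], [1.0], [2.0], [3.0]])
y_np = np.array([10.0, 11.0, 12.0, 13.0])
x_p = np.array([[0.9], [2.2]])
w_p = np.array([5.0, 7.0])  # survey weights of the probability sample

y_imp, donor = nearest_donor_impute(x_np, y_np, x_p)
theta_sm = float(w_p @ y_imp)  # 5 * 11 + 7 * 12 = 139
```

The recipients here do not match their donors exactly on $\mathbf{x}$ (0.9 versus 1.0, 2.2 versus 2.0), which is the situation in which the unbiasedness condition holds only approximately.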

Remark: In some applications, a very large non-probability panel of volunteers, $s_{\mathrm{NP}}$, is available, which contains a few auxiliary variables for matching, $\mathbf{x}$, but no variable of interest. Ideally, the variables of interest would be collected for all units of the panel $s_{\mathrm{NP}}$, but that is impossible due to the cost and the burden on the panel members. Therefore, in practice, a sub-sample $s_{\mathrm{NP}}^{*}$ of $s_{\mathrm{NP}}$ is selected using random or non-random sampling methods. Quota sampling (e.g., Deville, 1991) is often considered in this context. In addition to collecting the variables of interest for all units of $s_{\mathrm{NP}}^{*}$, there may also be interest in collecting other auxiliary variables for matching in order to enhance the vector $\mathbf{x}$. The matching can then be done to the probability sample, often much smaller in size, as long as the latter contains the same auxiliary variables as those of the non-probability sub-sample $s_{\mathrm{NP}}^{*}$. By carefully choosing the auxiliary variables for the matching, the potential for bias reduction is increased (Schonlau and Cooper, 2017). The implementation proposed by Rivers (2007) is slightly different. Rivers (2007) suggests conducting the matching between the probability sample and the panel $s_{\mathrm{NP}}$ using the auxiliary variables available in both sources. The variables of interest are collected only for the set of donors in $s_{\mathrm{NP}}$ who have been matched to a unit in the probability sample, which allows for a significant reduction of data collection costs and burden. The implicit assumption is that the panel members, initially volunteers, are more likely to respond than individuals chosen at random in the population. Obviously, non-response is unavoidable, and this problem must be dealt with, potentially through imputation. The advantage of this method is that the matching is carried out using the panel $s_{\mathrm{NP}}$ rather than a sub-sample of this panel; the pool of donors is larger. However, the matching cannot be done using the enhanced vector of auxiliary variables because it is not available for the units in $s_{\mathrm{NP}}$, which limits the potential for bias reduction.

Lavallée and Brisbane (2016) point out the connection between statistical matching and indirect sampling (Lavallée, 2007; Deville and Lavallée, 2006). They propose an estimator obtained by imputing each missing value $y_k$, $k \in s_P$, by a weighted average of the $y$ values of the nearest donors. In fact, their estimator can equivalently be obtained by imputing the missing values using fractional donor imputation (for example, Kim and Fuller, 2004). Using more than one donor to impute the missing values typically yields a modest variance reduction.
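
Imputation by an average over several nearest donors can be sketched as follows. This is only an illustration under stated assumptions: equal weights among the nearest donors, Euclidean distance, and the hypothetical names `fractional_donor_impute` and `n_donors`. It is not the specific weighting proposed by Lavallée and Brisbane (2016).

```python
import numpy as np

def fractional_donor_impute(x_np, y_np, x_p, n_donors=2):
    """Impute y as the mean over the n_donors nearest donors (illustrative sketch)."""
    # Squared Euclidean distances between recipients (rows) and donors (columns).
    d2 = ((x_p[:, None, :] - x_np[None, :, :]) ** 2).sum(axis=2)
    # Indices of the n_donors closest donors for each recipient.
    nearest = np.argsort(d2, axis=1)[:, :n_donors]
    # Equal-weight average of the donors' y values (assumption: equal weights).
    return y_np[nearest].mean(axis=1)

# Toy (hypothetical) data: four donors, two recipients, one auxiliary variable.
x_np = np.array([[0.0], [1.0], [2.0], [3.0]])
y_np = np.array([10.0, 11.0, 12.0, 13.0])
x_p = np.array([[0.9], [2.2]])

y_imp = fractional_donor_impute(x_np, y_np, x_p, n_donors=2)
```

With these toy values the first recipient averages donors at 1.0 and 0.0, and the second averages donors at 2.0 and 3.0, giving imputed values 10.5 and 12.5.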

Several imputation methods used in practice can be considered linear (Beaumont and Bissonnette, 2011). This is the case for linear regression imputation, donor imputation and fractional donor imputation. An imputation method is said to be linear if the imputed value $y_k^{\mathrm{imp}}$, $k \in s_P$, can be written as $y_k^{\mathrm{imp}} = \sum_{l \in s_{\mathrm{NP}}} \omega_{kl}\, y_l$, where $\omega_{kl}$ is a function of $\boldsymbol{\delta}$ or $\mathbf{X}$ but not of $\mathbf{Y}$. For example, for donor or nearest-neighbour imputation, $\omega_{kl} = 1$ if the unit $l \in s_{\mathrm{NP}}$ is the donor for the recipient $k \in s_P$; otherwise $\omega_{kl} = 0$. For a linear imputation method, the estimator $\hat{\theta}^{\mathrm{SM}} = \sum_{k \in s_P} w_k y_k^{\mathrm{imp}}$ can be rewritten as a weighted sum over the non-probability sample: $\hat{\theta}^{\mathrm{SM}} = \sum_{l \in s_{\mathrm{NP}}} W_l\, y_l$, where $W_l = \sum_{k \in s_P} w_k \omega_{kl}$. Therefore, for linear imputation methods, statistical matching is an alternative to calibration and to the methods in Section 4.3 if the objective is to properly weight the non-probability sample.
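
The rewriting of the estimator as a weighted sum over the non-probability sample can be checked numerically. The sketch below builds the matrix of $\omega_{kl}$ for single-donor imputation and compares the two forms of the estimator; the donor assignments, weights and names (`donor_weight_matrix`, `donor`, `big_w`) are hypothetical.

```python
import numpy as np

def donor_weight_matrix(donor, n_np):
    """Matrix of omega_kl: 1 if donor l serves recipient k, 0 otherwise."""
    omega = np.zeros((donor.size, n_np))
    omega[np.arange(donor.size), donor] = 1.0
    return omega

# Toy (hypothetical) data: four donors and three recipients.
y_np = np.array([10.0, 11.0, 12.0, 13.0])
donor = np.array([1, 2, 1])      # donor index assigned to each recipient
w_p = np.array([5.0, 7.0, 3.0])  # survey weights of the recipients

omega = donor_weight_matrix(donor, y_np.size)
y_imp = omega @ y_np             # imputed values, one per recipient
big_w = w_p @ omega              # W_l = sum over k of w_k * omega_kl

theta_over_p = float(w_p @ y_imp)     # sum over the probability sample
theta_over_np = float(big_w @ y_np)   # sum over the non-probability sample
```

Both sums equal 172 here. Donors that are never used receive $W_l = 0$, and a donor used by several recipients accumulates their survey weights.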

So far, we have considered only the estimation of the total $\theta = \sum_{k \in U} y_k$. However, the probability sample contains other variables, and there may be interest in the relationship between two or more variables, some from the probability survey and others imputed from the non-probability sample. As an example, suppose that the estimation of the total $\theta = \sum_{k \in U} \tilde{y}_k y_k$ is of interest, where $\tilde{y}_k$ is a variable collected in the probability survey, but not available in the non-probability sample. It could, for example, define membership in a domain of interest. Statistical matching can be used to estimate this parameter by $\hat{\theta}^{\mathrm{SM}} = \sum_{k \in s_P} w_k \tilde{y}_k y_k^{\mathrm{imp}}$. We use $\tilde{\mathbf{Y}}$ to denote the vector that contains the values of the variable $\tilde{y}_k$, $k \in U$. It can be shown that $\hat{\theta}^{\mathrm{SM}}$ is unbiased, $E(\hat{\theta}^{\mathrm{SM}} - \theta \mid \boldsymbol{\delta}, \mathbf{X}, \tilde{\mathbf{Y}}) = 0$, if assumptions 1 to 3 are satisfied in addition to the following assumption:

Assumption 4: $\mathbf{Y}$ and $\tilde{\mathbf{Y}}$ are independent after conditioning on $\boldsymbol{\delta}$ and $\mathbf{X}$.

Assumption 4 is known as the conditional independence assumption in the statistical matching literature.
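The statistical matching estimator above can be illustrated with a small sketch. It assumes, purely for illustration, that $y_k^{\text{imp}}$ is obtained by nearest-neighbour donor imputation on a single auxiliary variable; the function names, the toy data and the choice of matching rule are hypothetical and are not taken from the paper.

```python
import numpy as np

def nearest_donor_impute(x_p, x_np, y_np):
    """For each probability-sample unit, borrow y from the non-probability
    unit with the closest x value (ties go to the first donor)."""
    x_p = np.asarray(x_p, dtype=float)
    x_np = np.asarray(x_np, dtype=float)
    y_np = np.asarray(y_np, dtype=float)
    dist = np.abs(x_p[:, None] - x_np[None, :])  # |x_p_i - x_np_j| for all pairs
    donors = dist.argmin(axis=1)                 # index of the closest donor
    return y_np[donors]

def theta_sm(w, y_tilde, y_imp):
    """Statistical matching estimate: sum over s_P of w_k * y_tilde_k * y_imp_k."""
    return float(np.sum(np.asarray(w) * np.asarray(y_tilde) * np.asarray(y_imp)))

# Toy non-probability sample: x and y are both observed.
x_np = [1.0, 2.0, 3.0]
y_np = [10.0, 20.0, 30.0]

# Toy probability sample: x, the survey weight w and the domain indicator
# y_tilde are observed, but y is not collected.
x_p = [1.1, 2.9, 2.2, 0.7]
w = [25.0, 25.0, 25.0, 25.0]
y_tilde = [1, 0, 1, 1]

y_imp = nearest_donor_impute(x_p, x_np, y_np)
estimate = theta_sm(w, y_tilde, y_imp)
```

In this toy example the imputed values are 10, 30, 20 and 10, so the estimated domain total is 25 × (10 + 20 + 10) = 1000. Whether such an estimate is unbiased still rests on assumptions 1 to 4, which the code cannot check.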

4.3  Inverse propensity score weighting

Instead of modelling the relationship between $y_k$ and $\mathbf{x}_k$, the relationship between $\delta_k$ and $\mathbf{x}_k$ could be modelled. The main advantage of this approach is to simplify the modelling effort when there are multiple variables of interest since there is always only one variable $\delta_k$. With this approach, inferences are conditional on $\mathbf{Y}$ and $\mathbf{X}$. Also, it is usually assumed that assumption 3 is valid and thus $\Pr(\delta_k = 1 \mid \mathbf{Y}, \mathbf{X}) = \Pr(\delta_k = 1 \mid \mathbf{X})$. The probability of participation $p_k = \Pr(\delta_k = 1 \mid \mathbf{X})$ is then estimated by $\hat{p}_k$, and the estimate $\hat{\theta}^{\text{PS}} = \sum_{k \in s_{\text{NP}}} w_k^{\text{PS}} y_k$ is calculated, where $w_k^{\text{PS}} = 1 / \hat{p}_k$. The assumption that $p_k > 0$, $k \in U$, must be made. It is called the positivity assumption by Mercer et al. (2017). It may also be required in the calibration and statistical matching approaches. For example, empty post-strata ($n_h^{\text{NP}} = 0$) may occur if it is not satisfied. To fix this issue, these empty post-strata are usually collapsed with other non-empty post-strata. This collapsing may jeopardize the validity of assumption 3 if the collapsed post-strata are different.
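As a minimal sketch of the weighting step described above, the following code computes the inverse propensity score weighted total from estimated participation probabilities that are assumed to be already available. The function name `ipw_total` and the toy numbers are illustrative only.

```python
import numpy as np

def ipw_total(y, p_hat):
    """Inverse propensity score weighted total over the non-probability
    sample: sum of y_k / p_hat_k."""
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    # Positivity: a zero (or negative) estimated probability has no valid weight.
    if np.any(p_hat <= 0):
        raise ValueError("positivity violated: every p_hat must be > 0")
    weights = 1.0 / p_hat
    return float(np.sum(weights * y))

# Three participants with estimated participation probabilities.
y_np = [4.0, 6.0, 10.0]
p_hat = [0.5, 0.25, 0.1]

theta_ps = ipw_total(y_np, p_hat)  # 4/0.5 + 6/0.25 + 10/0.1 = 132
```

The unit with the smallest estimated probability contributes 100 of the 132, which shows how sensitive the estimator can be to small values of the estimated probabilities.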

The estimation of $p_k$ can be achieved by postulating a parametric model $p_k = g(\mathbf{x}_k; \boldsymbol{\alpha})$, where $g$ is some function, normally bounded by 0 and 1, and $\boldsymbol{\alpha}$ is a vector of unknown model parameters. The logistic function $g(\mathbf{x}_k; \boldsymbol{\alpha}) = \exp(\mathbf{x}_k' \boldsymbol{\alpha}) / [1 + \exp(\mathbf{x}_k' \boldsymbol{\alpha})]$ predominates in the applications (see Kott, 2019, for a recent application). The estimator of $\boldsymbol{\alpha}$ is denoted by $\hat{\boldsymbol{\alpha}}$ and the estimated probability by $\hat{p}_k = g(\mathbf{x}_k; \hat{\boldsymbol{\alpha}})$. Ideally, $\boldsymbol{\alpha}$ would be estimated using $\mathbf{x}_k$ for all the units in the population $U$ similar to what would be done in a non-response context. For example, assuming the logistic function is used, $\boldsymbol{\alpha}$ could be estimated by solving the maximum likelihood equation:

$$\sum_{k \in U} \left[ \delta_k - p_k(\boldsymbol{\alpha}) \right] \mathbf{x}_k = \sum_{k \in s_{\text{NP}}} \mathbf{x}_k - \sum_{k \in U} p_k(\boldsymbol{\alpha}) \, \mathbf{x}_k = \mathbf{0}. \qquad (4.3)$$
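In the ideal case where $\mathbf{x}_k$ is available for every unit of $U$, equation (4.3) is the usual logistic regression score equation and can be solved by Newton-Raphson. The sketch below assumes a simulated population; the function name `solve_ml_equation`, the population size and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def solve_ml_equation(X, delta, n_iter=50, tol=1e-10):
    """Newton-Raphson for equation (4.3) under the logistic model.
    X: (N, q) auxiliary variables for the whole population U.
    delta: (N,) participation indicators (1 if the unit is in s_NP)."""
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    alpha = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ alpha))
        score = X.T @ (delta - p)                  # left-hand side of (4.3)
        if np.max(np.abs(score)) < tol:
            break
        info = (X * (p * (1.0 - p))[:, None]).T @ X  # minus the derivative of the score
        alpha = alpha + np.linalg.solve(info, score)
    return alpha

# Simulated population with an intercept and one auxiliary variable.
rng = np.random.default_rng(0)
N = 2000
X = np.column_stack([np.ones(N), rng.normal(size=N)])
p_true = 1.0 / (1.0 + np.exp(-(X @ np.array([-1.0, 0.5]))))
delta = (rng.uniform(size=N) < p_true).astype(float)

alpha_hat = solve_ml_equation(X, delta)
p_hat = 1.0 / (1.0 + np.exp(-X @ alpha_hat))
residual = X.T @ (delta - p_hat)  # should be numerically zero at the solution
```

Because the model contains an intercept, the first component of (4.3) forces the estimated probabilities to sum to the number of participants.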

Solving equation (4.3) is impossible when $\mathbf{x}_k$ is not known for all units $k \in U - s_{\text{NP}}$, which is almost always the case in practice. Iannacchione, Milne and Folsom (1991) proposed another unbiased estimating equation for $\boldsymbol{\alpha}$ (see also Deville and Dupont, 1993):

$$\sum_{k \in s_{\text{NP}}} \frac{\mathbf{x}_k}{p_k(\boldsymbol{\alpha})} - \sum_{k \in U} \mathbf{x}_k = \mathbf{0}. \qquad (4.4)$$

The main advantage of equation (4.4) is that it does not require knowing $\mathbf{x}_k$ for each unit $k \in U - s_{\text{NP}}$. However, it is necessary to have access to the vector of totals $\sum_{k \in U} \mathbf{x}_k$ from an external source. An interesting property of equation (4.4) is that the resulting weights $w_k^{\text{PS}} = 1 / p_k(\hat{\boldsymbol{\alpha}})$ satisfy the calibration equation $\sum_{k \in s_{\text{NP}}} w_k^{\text{PS}} \mathbf{x}_k = \sum_{k \in U} \mathbf{x}_k$, just like the weights $w_k^{C}$ given in (4.2). Indeed, it can be shown that solving (4.4) yields $w_k^{\text{PS}} = w_k^{C}$ if the model $p_k(\boldsymbol{\alpha}) = (1 + v_k^{-1} \mathbf{x}_k' \boldsymbol{\alpha})^{-1}$ is used. However, this is a less natural model than the above logistic model for modelling a probability.
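The calibration property of equation (4.4) can be checked numerically. The sketch below solves (4.4) under the logistic model, for which $1 / p_k(\boldsymbol{\alpha}) = 1 + \exp(-\mathbf{x}_k' \boldsymbol{\alpha})$, using a Newton iteration with step halving. The solver, its name `solve_calibration_equation` and the simulated data are assumptions made for illustration; only the participants' $\mathbf{x}_k$ and the population totals are used.

```python
import numpy as np

def solve_calibration_equation(X_s, T_x, max_iter=100, tol=1e-9):
    """Damped Newton solver for equation (4.4) under the logistic model.
    X_s: (n, q) auxiliary variables of the non-probability sample.
    T_x: (q,) known population totals of the auxiliary variables."""
    X_s = np.asarray(X_s, dtype=float)
    T_x = np.asarray(T_x, dtype=float)

    def lhs(alpha):
        inv_p = 1.0 + np.exp(-X_s @ alpha)   # 1 / p_k(alpha)
        return X_s.T @ inv_p - T_x           # left-hand side of (4.4)

    alpha = np.zeros(X_s.shape[1])
    F = lhs(alpha)
    scale = max(1.0, np.max(np.abs(T_x)))
    for _ in range(max_iter):
        if np.max(np.abs(F)) < tol * scale:
            break
        e = np.exp(-X_s @ alpha)
        jac = -(X_s * e[:, None]).T @ X_s    # derivative of lhs w.r.t. alpha
        step = np.linalg.solve(jac, -F)
        t = 1.0
        while t > 1e-8:                      # halve until the residual shrinks
            F_new = lhs(alpha + t * step)
            if np.linalg.norm(F_new) < np.linalg.norm(F):
                break
            t /= 2.0
        alpha, F = alpha + t * step, F_new
    return alpha

# Simulated population; participation follows a logistic model.
rng = np.random.default_rng(1)
N = 5000
X = np.column_stack([np.ones(N), rng.normal(size=N)])
p_true = 1.0 / (1.0 + np.exp(-(X @ np.array([-1.0, 0.5]))))
delta = rng.uniform(size=N) < p_true

X_s = X[delta]          # x_k for the participants only
T_x = X.sum(axis=0)     # totals assumed known from an external source

alpha_hat = solve_calibration_equation(X_s, T_x)
w_ps = 1.0 + np.exp(-X_s @ alpha_hat)   # w_k^PS = 1 / p_k(alpha_hat)
calibrated_totals = w_ps @ X_s          # should reproduce T_x
```

At the solution, `calibrated_totals` agrees with `T_x` up to numerical tolerance, which is the calibration equation stated above. With the logistic model every weight exceeds 1.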

To get around the problem of missing values $\mathbf{x}_k$, $k \in U - s_{\text{NP}}$, Chen et al. (2019) suggest estimating $\sum_{k \in U} p_k(\boldsymbol{\alpha}) \, \mathbf{x}_k$ in (4.3) using a probability survey. The equation to be solved becomes:

$$\sum_{k \in s_{\text{NP}}} \mathbf{x}_k - \sum_{k \in s_P} w_k \, p_k(\boldsymbol{\alpha}) \, \mathbf{x}_k = \mathbf{0}. \qquad (4.5)$$

Equation (4.5) is unbiased conditionally on $\mathbf{Y}$ and $\mathbf{X}$ provided that the probability survey allows for unbiased estimation, conditionally on $\mathbf{Y}$ and $\boldsymbol{\Omega}$, of any population total that is not a function of $\boldsymbol{\delta}$ such as $\sum_{k \in U} p_k(\boldsymbol{\alpha}) \, \mathbf{x}_k$. Assumptions 1 and 3 are required, but not assumption 2. Using the idea of Iannacchione et al. (1991), an alternative to (4.5) is obtained by solving:

$$\sum_{k \in s_{\text{NP}}} \frac{\mathbf{x}_k}{p_k(\boldsymbol{\alpha})} - \sum_{k \in s_P} w_k \mathbf{x}_k = \mathbf{0}. \qquad (4.6)$$

Equation (4.6) produces weights $w_k^{\text{PS}} = 1 / p_k(\hat{\boldsymbol{\alpha}})$ that satisfy the calibration equation $\sum_{k \in s_{\text{NP}}} w_k^{\text{PS}} \mathbf{x}_k = \sum_{k \in s_P} w_k \mathbf{x}_k$ (see also Lesage, 2017; Rao, 2020). The estimators of $\boldsymbol{\alpha}$ obtained using (4.5) or (4.6) are likely less efficient than those obtained using (4.3) or (4.4). If $\mathbf{x}_k$, $k \in U - s_{\text{NP}}$, or the vector $\sum_{k \in U} \mathbf{x}_k$ is known, then using (4.3) or (4.4) is preferable. Otherwise, the estimating equations (4.5) or (4.6) can be used provided that $\mathbf{x}_k$ is collected in a probability survey. Note that the indicators $\delta_k$ do not need to be observed in the probability sample.
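A minimal sketch of how equation (4.5) might be solved in practice is given below, under the logistic model and with a simple random probability sample drawn independently of participation. The solver `solve_eq_45`, the simulated population, the sample sizes and the damped Newton iteration are all illustrative assumptions. Note that the code never uses $\delta_k$ for the probability sample units.

```python
import numpy as np

def logistic(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def solve_eq_45(X_np, X_p, w_p, max_iter=100, tol=1e-9):
    """Damped Newton solver for equation (4.5) under the logistic model.
    X_np: (n_NP, q) auxiliary variables of the non-probability sample.
    X_p:  (n_P, q) auxiliary variables of the probability sample.
    w_p:  (n_P,) survey weights of the probability sample."""
    X_np = np.asarray(X_np, dtype=float)
    X_p = np.asarray(X_p, dtype=float)
    w_p = np.asarray(w_p, dtype=float)
    total_np = X_np.sum(axis=0)

    def lhs(alpha):
        return total_np - X_p.T @ (w_p * logistic(X_p @ alpha))

    alpha = np.zeros(X_p.shape[1])
    F = lhs(alpha)
    scale = max(1.0, np.max(np.abs(total_np)))
    for _ in range(max_iter):
        if np.max(np.abs(F)) < tol * scale:
            break
        p = logistic(X_p @ alpha)
        jac = -(X_p * (w_p * p * (1.0 - p))[:, None]).T @ X_p
        step = np.linalg.solve(jac, -F)
        t = 1.0
        while t > 1e-8:                      # halve until the residual shrinks
            F_new = lhs(alpha + t * step)
            if np.linalg.norm(F_new) < np.linalg.norm(F):
                break
            t /= 2.0
        alpha, F = alpha + t * step, F_new
    return alpha

# Simulated population: y depends on x, and so does participation.
rng = np.random.default_rng(2)
N = 10000
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = 2.0 + X[:, 1] + rng.normal(size=N)
delta = rng.uniform(size=N) < logistic(X @ np.array([-1.0, 0.8]))

# Simple random probability sample of size 500 with weights N / 500.
in_p = rng.choice(N, size=500, replace=False)
w_p = np.full(500, N / 500)

alpha_hat = solve_eq_45(X[delta], X[in_p], w_p)
p_hat_np = logistic(X[delta] @ alpha_hat)
theta_ps = float(np.sum(y[delta] / p_hat_np))   # IPW estimate of the total of y
residual = X[delta].sum(axis=0) - X[in_p].T @ (w_p * logistic(X[in_p] @ alpha_hat))
```

The same structure could be adapted to (4.6) by replacing the left-hand side with the weighted sum of $\mathbf{x}_k / p_k(\boldsymbol{\alpha})$ over the non-probability sample minus the weighted total from the probability sample.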

Equations (4.5) and (4.6) may be more difficult to solve than equations (4.3) and (4.4) and may not have a solution. Consider, for example, the case where there is only one auxiliary variable: $x_k = 1$. Using (4.5) or (4.6), it can be seen that the estimated probability reduces to: $\hat{p}_k = n^{\text{NP}} / \sum_{k \in s_P} w_k$. If the size of the probability sample is sufficiently large, it is expected that $0 < \hat{p}_k < 1$. For small sample sizes, it may happen that $\hat{p}_k > 1$ due to the variability of $\sum_{k \in s_P} w_k$. In that case, equations (4.5) and (4.6) would not have a solution if the logistic function is used since it requires that $0 < \hat{p}_k < 1$. To avoid this issue, it may be helpful to consider other functions not bounded by 1, such as $g(\mathbf{x}_k; \boldsymbol{\alpha}) = \exp(\mathbf{x}_k^{\prime} \boldsymbol{\alpha})$.
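The intercept-only case can be checked numerically. The following is a minimal sketch with made-up numbers (the sample sizes and weights are hypothetical, as is the helper name `intercept_only_probability`); it only evaluates the ratio of the non-probability sample size to the sum of the survey weights and shows that the ratio can exceed 1 when the estimated population size happens to be too small.

```python
import numpy as np

def intercept_only_probability(n_np, w_p):
    """Estimated participation probability when x_k = 1 for every unit:
    non-probability sample size divided by the sum of the survey weights."""
    return n_np / np.sum(w_p)

# Hypothetical non-probability sample of 900 units.
n_np = 900

# Hypothetical probability samples: the sum of the weights estimates N.
w_large = np.full(50, 20.0)   # estimated population size 1000
w_small = np.full(5, 170.0)   # estimated population size 850 (an unlucky small sample)

p_large = intercept_only_probability(n_np, w_large)   # inside (0, 1)
p_small = intercept_only_probability(n_np, w_small)   # greater than 1

# With the exponential function g = exp(alpha), alpha = log(p) is still defined
# when p > 1, whereas the logit log(p / (1 - p)) is not.
alpha_exp = np.log(p_small)
```

In this toy setting the exponential function still yields a finite parameter value, but the resulting "probability" is no longer bounded by 1, which is the trade-off described in the text.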

Kim and Wang (2019) suggest using the probability sample to estimate the participation probability. Assuming the logistic function is used, the equation to be solved is:

$$\sum_{k \in s_P} w_k \left[ \delta_k - p_k(\boldsymbol{\alpha}) \right] \mathbf{x}_k = \sum_{k \in s_P} w_k \delta_k \mathbf{x}_k - \sum_{k \in s_P} w_k p_k(\boldsymbol{\alpha}) \mathbf{x}_k = \mathbf{0}.$$

The method requires knowing the indicators $\delta_k$ in the probability sample and the validity of assumptions 1, 2 and 3 to ensure the estimating equation is unbiased. Also, the probability sample size is usually small relative to the non-probability sample size, and it can be numerically difficult to estimate $\boldsymbol{\alpha}$, especially when $\mathbf{x}_k$ contains a large number of variables and the overlap between the two samples is small.
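The estimating equation of Kim and Wang (2019) has the form of a survey-weighted logistic score equation, so it can be solved by Newton-Raphson iterations. The sketch below is illustrative only: the function name `solve_weighted_logistic`, the starting value and the toy data are assumptions, not part of the original method description.

```python
import numpy as np

def solve_weighted_logistic(X, delta, w, n_iter=50, tol=1e-10):
    """Solve sum_k w_k [delta_k - p_k(alpha)] x_k = 0 over the probability
    sample, with p_k(alpha) the logistic function, by Newton-Raphson."""
    alpha = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ alpha))
        score = X.T @ (w * (delta - p))                  # estimating function
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X  # minus its derivative
        step = np.linalg.solve(info, score)
        alpha = alpha + step
        if np.max(np.abs(step)) < tol:
            break
    return alpha

# Hypothetical probability sample of 8 units: intercept plus one binary covariate.
z = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X_p = np.column_stack([np.ones(8), z])
delta = np.array([1, 0, 0, 0, 1, 1, 0, 0])   # 1 if the unit is also in s_NP
w = np.array([10.0, 10.0, 10.0, 10.0, 5.0, 5.0, 5.0, 5.0])

alpha_hat = solve_weighted_logistic(X_p, delta, w)
p_hat = 1.0 / (1.0 + np.exp(-X_p @ alpha_hat))
```

In this saturated toy model the fitted probabilities equal the weighted participation rates in each group (0.25 and 0.5). With few overlapping units or many covariates, the matrix `info` can become close to singular, which is one way the numerical difficulty mentioned above shows up.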

Lee (2006), see also Rivers (2007), Valliant and Dever (2011) and Elliott and Valliant (2017), proposes to combine the two samples and then estimate $p_k$ using logistic regression. It seems that the author implicitly assumes that the two samples do not overlap, i.e., that $\delta_k = 0$ for all units in $s_P$. Using again the logistic function, the resulting estimating equation is:

$$\sum_{k \in s_{\text{NP}}} \eta_k^{\text{NP}} \left[ 1 - p_k(\boldsymbol{\alpha}) \right] \mathbf{x}_k - \sum_{k \in s_P} w_k p_k(\boldsymbol{\alpha}) \mathbf{x}_k = \mathbf{0}, \tag{4.7}$$

where $\eta_k^{\text{NP}}$ is a certain weight for the units in the non-probability sample. The method is somewhat similar to the one proposed by Chen et al. (2019), but the estimating equation (4.7) is not unbiased, conditionally on $\mathbf{Y}$ and $\mathbf{X}$, unlike equations (4.5) and (4.6). However, if we assume $\eta_k^{\text{NP}} = 1$ and if $\max \{ p_k ; k \in U \}$ is small, equation (4.7) becomes approximately equivalent to equation (4.5). Yet Lee (2006) does not directly use the estimated probabilities resulting from (4.7). The author uses them only to order the union of the two samples and then create homogeneous classes. Using homogeneous classes brings some robustness to model misspecification and can help prevent very small estimated probabilities and thus very large weights. In the context of non-response, forming homogeneous imputation or reweighting classes was studied by Little (1986), Eltinge and Yansaneh (1997), and Haziza and Beaumont (2007), among others. Haziza and Lesage (2016) illustrate the robustness of the method when the function $g(\mathbf{x}_k; \boldsymbol{\alpha})$ is misspecified. The method is used regularly in Statistics Canada surveys for dealing with non-response.
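The approximate equivalence between (4.7) and (4.5) mentioned above can be made explicit. The step below is a sketch under an assumption: equation (4.5) is not reproduced in this part of the text, and it is taken here to have the form $\sum_{k \in s_{\text{NP}}} \mathbf{x}_k - \sum_{k \in s_P} w_k p_k(\boldsymbol{\alpha}) \mathbf{x}_k = \mathbf{0}$ under the logistic function. Setting $\eta_k^{\text{NP}} = 1$ in (4.7) and expanding the first sum gives

$$\sum_{k \in s_{\text{NP}}} \left[ 1 - p_k(\boldsymbol{\alpha}) \right] \mathbf{x}_k - \sum_{k \in s_P} w_k p_k(\boldsymbol{\alpha}) \mathbf{x}_k = \left\{ \sum_{k \in s_{\text{NP}}} \mathbf{x}_k - \sum_{k \in s_P} w_k p_k(\boldsymbol{\alpha}) \mathbf{x}_k \right\} - \sum_{k \in s_{\text{NP}}} p_k(\boldsymbol{\alpha}) \mathbf{x}_k .$$

The term in braces is the left-hand side of (4.5) under the assumed form. The remaining term is bounded in norm by $\max \{ p_k ; k \in U \} \sum_{k \in s_{\text{NP}}} \lVert \mathbf{x}_k \rVert$, so it becomes negligible relative to $\sum_{k \in s_{\text{NP}}} \mathbf{x}_k$ when the largest participation probability is small.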

Rather than using (4.7), homogeneous classes could be formed by starting with the unbiased equations (4.5) or (4.6). The resulting initial estimated probabilities are denoted by $\hat{p}_k^0 = g(\mathbf{x}_k; \hat{\boldsymbol{\alpha}})$. The sample $s = s_P \cup s_{\text{NP}}$ can then be sorted by $\hat{p}_k^0$ and divided into $C$ homogeneous classes of equal or unequal sizes. The set of units in $s_P$ that are part of class $c$ is denoted by $s_{P,c}$ whereas the set of units in $s_{\text{NP}}$ that are part of class $c$ is denoted by $s_{\text{NP},c}$. The weight $w_k^{\text{PS}}$ for a unit $k \in s_{\text{NP},c}$ is equal to the inverse of the estimated participation rate in class $c$ and is given by $w_k^{\text{PS}} = \hat{N}_c / n_c^{\text{NP}}$, where $\hat{N}_c = \sum_{k \in s_{P,c}} w_k$ and $n_c^{\text{NP}}$ is the number of units in $s_{\text{NP},c}$. This weight ensures the calibration property: $\sum_{k \in s_{\text{NP},c}} w_k^{\text{PS}} = \hat{N}_c$. The number of classes must be large enough to capture a high percentage of the variability of the initial probabilities $\hat{p}_k^0$, thereby reducing the bias. On the other hand, it must not be too large, so as to prevent the occurrence of empty classes, since the weights $w_k^{\text{PS}} = \hat{N}_c / n_c^{\text{NP}}$ cannot be calculated if $n_c^{\text{NP}} = 0$. Regression trees can prove to be an effective alternative for forming classes. In a non-response context, they have been studied by Phipps and Toth (2012). The estimator $\hat{\theta}^{\text{PS}} = \sum_{k \in s_{\text{NP}}} w_k^{\text{PS}} y_k$ obtained after forming homogeneous classes has exactly the same form as the post-stratified estimator described in the calibration approach in Section 4.1; the only difference is that the classes are built by modelling $\delta_k$ rather than $y_k$.
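The class-based weighting described above can be sketched in a few lines. The example assumes that the initial probabilities have already been estimated for the units of both samples and forms classes of roughly equal size from the quantiles of the pooled values; the function name `class_weights`, the quantile rule and the toy data are illustrative choices, not prescriptions from the text.

```python
import numpy as np

def class_weights(p0_p, w_p, p0_np, n_classes):
    """Weights N_hat_c / n_c^NP for the non-probability units, with classes
    defined by quantiles of the initial probabilities on the pooled sample."""
    pooled = np.concatenate([p0_p, p0_np])
    # Interior cut points giving classes of roughly equal size.
    cuts = np.quantile(pooled, np.linspace(0.0, 1.0, n_classes + 1)[1:-1])
    class_p = np.searchsorted(cuts, p0_p, side="right")
    class_np = np.searchsorted(cuts, p0_np, side="right")
    n_hat = np.bincount(class_p, weights=w_p, minlength=n_classes)  # N_hat_c
    n_np = np.bincount(class_np, minlength=n_classes)               # n_c^NP
    if np.any(n_np == 0):
        raise ValueError("empty class in the non-probability sample; use fewer classes")
    return n_hat[class_np] / n_np[class_np], class_np, n_hat

# Hypothetical inputs: initial probabilities, survey weights and y-values.
p0_p = np.array([0.10, 0.20, 0.60, 0.70])
w_p = np.array([100.0, 100.0, 50.0, 50.0])
p0_np = np.array([0.15, 0.25, 0.65, 0.75])
y_np = np.array([1.0, 2.0, 3.0, 4.0])

w_ps, class_np, n_hat = class_weights(p0_p, w_p, p0_np, n_classes=2)
theta_ps = np.sum(w_ps * y_np)   # estimator of the total
```

The explicit check for empty classes mirrors the warning in the text: the weight is undefined whenever a class contains no non-probability unit, so the number of classes has to be reduced in that case.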

Assumption 3 may not be realistic in some contexts so that $\Pr(\delta_k = 1 \mid \mathbf{Y}, \mathbf{X}) \neq \Pr(\delta_k = 1 \mid \mathbf{X})$. In this case, the participation probability $p_k = \Pr(\delta_k = 1 \mid \mathbf{Y}, \mathbf{X})$ might be modelled using a vector of explanatory variables $\mathbf{x}_k^*$, defined using the variable of interest $y_k$ (or variables of interest if there are several) and potentially other auxiliary variables $\mathbf{x}_k$. A parametric model, $p_k = g(\mathbf{x}_k^*; \boldsymbol{\alpha})$, can be considered for modelling the participation probability. Equations (4.5) and (4.6) cannot be used to estimate $\boldsymbol{\alpha}$ because $y_k$ (and therefore $\mathbf{x}_k^*$) is not available in the probability sample. However, an equation similar to (4.6) can be used:

$$\sum_{k \in s_{\text{NP}}} \frac{\mathbf{x}_k^I}{g(\mathbf{x}_k^*; \boldsymbol{\alpha})} - \sum_{k \in s_P} w_k \mathbf{x}_k^I = \mathbf{0}. \tag{4.8}$$

The vector $\mathbf{x}_k^I$, of the same size as $\boldsymbol{\alpha}$, contains calibration variables, also called instrumental variables in the econometric literature. We use $\mathbf{X}^I$ to denote the matrix that contains the values of vector $\mathbf{x}_k^I$, $k \in U$. Equation (4.8) requires knowing the calibration variables $\mathbf{x}_k^I$ for both samples. However, the explanatory variables $\mathbf{x}_k^*$ can be observed only for the units in the non-probability sample. Equation (4.8) produces weights $w_k^{\text{PS}} = 1 / g(\mathbf{x}_k^*; \hat{\boldsymbol{\alpha}})$ that satisfy the calibration equation $\sum_{k \in s_{\text{NP}}} w_k^{\text{PS}} \mathbf{x}_k^I = \sum_{k \in s_P} w_k \mathbf{x}_k^I$. An equation similar to (4.8) was originally proposed by Deville (1998) to deal with non-response (see also Kott, 2006; Haziza and Beaumont, 2017). Equation (4.8) is unbiased, conditionally on $\mathbf{Y}$, $\mathbf{X}$ and $\mathbf{X}^I$, if the instrumental variables $\mathbf{x}_k^I$ can be selected such that the following assumption is satisfied:

Assumption 5: $\boldsymbol{\delta}$ and $\mathbf{X}^I$ are independent after conditioning on $\mathbf{Y}$ and $\mathbf{X}$.

Assumption 3 is no longer required, but it is replaced with assumption 5. In practice, choosing instrumental variables $\mathbf{x}_k^I$ that satisfy assumption 5 is not always obvious. They must not be predictive of $\delta_k$ after conditioning on $\mathbf{x}_k^*$. Ideally, for efficiency reasons, the instrumental variables are selected so as to be predictive of $\mathbf{x}_k^*$ without compromising assumption 5. Unlike equations (4.5) and (4.6), equation (4.8) cannot be used to form homogeneous classes because the participation probabilities $\hat{p}_k = g(\mathbf{x}_k^*; \hat{\boldsymbol{\alpha}})$ cannot be calculated for the units in the probability sample. As such, the robustness property that comes with homogeneous classes is lost. Because of these drawbacks, equation (4.8) should be considered only when there are strong reasons to believe that assumption 3 is not appropriate.
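To make the calibration equation concrete, the following sketch writes it out under the assumption, made here for illustration only, that $g$ is the logistic function. The symbol $\mathbf{F}$ is introduced for this sketch and does not appear elsewhere in the text.

```latex
% Assumption: logistic participation model for g
g(\mathbf{x}_k^{*}; \boldsymbol{\alpha})
  = \frac{1}{1 + \exp(-\mathbf{x}_k^{*\prime}\boldsymbol{\alpha})}
\quad\Longrightarrow\quad
w_k^{\mathrm{PS}}
  = \frac{1}{g(\mathbf{x}_k^{*}; \hat{\boldsymbol{\alpha}})}
  = 1 + \exp(-\mathbf{x}_k^{*\prime}\hat{\boldsymbol{\alpha}}).

% The calibration equation then becomes a system of equations in alpha-hat
\mathbf{F}(\hat{\boldsymbol{\alpha}})
  = \sum_{k \in s_{\mathrm{NP}}}
      \left[ 1 + \exp(-\mathbf{x}_k^{*\prime}\hat{\boldsymbol{\alpha}}) \right]
      \mathbf{x}_k^{I}
  - \sum_{k \in s_{P}} w_k \mathbf{x}_k^{I}
  = \mathbf{0}.
```

When $\mathbf{x}_k^I$ and $\mathbf{x}_k^*$ have the same dimension, this system has as many equations as unknowns. It has no closed-form solution in general and would typically be solved numerically, for example with Newton's method.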

Once weights $w_k^{\mathrm{PS}}$ have been calculated using one of the methods in this section, they can still be adjusted through calibration. The objective of this calibration is to improve the precision of the estimator $\hat{\theta}^{\mathrm{PS}}$ and also obtain a double robustness property (see Chen et al., 2019).
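As a minimal sketch of such a calibration step, the code below adjusts a set of weights $w_k^{\mathrm{PS}}$ so that they reproduce a given total of a single auxiliary variable, using linear (chi-square distance) calibration. The function name `linear_calibrate`, the restriction to one auxiliary variable and the toy numbers are all assumptions made for illustration; the text does not prescribe a particular calibration method or set of calibration totals.

```python
def linear_calibrate(weights, x, total_x):
    """Linear calibration of weights to one auxiliary total (illustrative).

    Returns w_k * (1 + lam * x_k), where lam is chosen so that the
    calibrated weights satisfy sum(w_cal_k * x_k) == total_x.
    """
    sum_wx = sum(w * xk for w, xk in zip(weights, x))
    sum_wxx = sum(w * xk * xk for w, xk in zip(weights, x))
    if sum_wxx == 0:
        raise ValueError("auxiliary variable is identically zero")
    lam = (total_x - sum_wx) / sum_wxx
    return [w * (1.0 + lam * xk) for w, xk in zip(weights, x)]


# Toy data (hypothetical): propensity-score weights and one auxiliary variable.
w_ps = [10.0, 12.5, 8.0, 20.0]
x = [1.0, 2.0, 3.0, 1.5]
total_x = 100.0  # assumed known (or well-estimated) population total of x

w_cal = linear_calibrate(w_ps, x, total_x)
calibrated_total = sum(w * xk for w, xk in zip(w_cal, x))
```

With several auxiliary variables the scalar `lam` becomes a vector obtained by solving a linear system. Linear calibration can also produce negative weights when the initial weights are far from reproducing the target totals.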

In general, the variable $y$ is observed for the entire non-probability sample, and the inverse propensity-score weighted estimator, $\hat{\theta}^{\mathrm{PS}} = \sum_{k \in s_{\mathrm{NP}}} w_k^{\mathrm{PS}} y_k$, or a weighted estimator obtained by calibration or statistical matching can be used. Sometimes, the non-probability sample is too large and the variable $y$ can only be collected for a sub-sample of $s_{\mathrm{NP}}$. Quota sampling (e.g., Deville, 1991) is a commonly used method for drawing the sub-sample if auxiliary variables are available for $k \in s_{\mathrm{NP}}$. An alternative to quota sampling is to calculate the weights $w_k^{\mathrm{PS}}$ for the entire non-probability sample and use them to select a random sub-sample with probabilities proportional to the weights. The variable $y$ is then collected only for the sub-sample, and the estimates are obtained as if the sub-sample had been drawn from the population using an equal probability design. This approach is called inverse sampling in the literature on probability surveys (see, for example, Hinkins, Oh and Scheuren, 1997; or Rao, Scott and Benhin, 2003) and was proposed by Kim and Wang (2019) for non-probability samples.
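The inverse sampling idea can be sketched in a few lines. The code below assumes with-replacement selection with probabilities proportional to the weights, and it uses the sum of the weights as the population size when scaling the sub-sample mean to a total. Both choices, along with the function names and toy values, are assumptions made for illustration and are not taken from the cited papers.

```python
import random


def inverse_sample(weights, size, seed=None):
    """Draw a with-replacement sub-sample with probabilities proportional
    to the weights; returns indices into the non-probability sample."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=size)


def estimate_total(y_sub, n_hat):
    """Treat the sub-sample as an equal-probability sample:
    estimated total = n_hat * (sub-sample mean of y)."""
    return n_hat * sum(y_sub) / len(y_sub)


# Toy data (hypothetical): weights for the whole non-probability sample.
w_ps = [10.0, 12.5, 8.0, 20.0, 5.0]
n_hat = sum(w_ps)  # population size implied by the weights (assumption)

selected = inverse_sample(w_ps, size=3, seed=1)

# y is "collected" only for the selected units (made-up constant values).
y_collected = {k: 2.0 for k in selected}
theta_hat = estimate_total([y_collected[k] for k in selected], n_hat)
```

Under this with-replacement scheme, the expected sub-sample mean equals the weighted mean of $y$ over the full non-probability sample. This is why the sub-sample can then be analysed with equal weights.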

4.4  Small area estimation

In most surveys, it is desired to estimate the total of the variable $y$ not just for the entire population $U$, but also for different subgroups of the population, called domains. Probability surveys conducted by national statistical agencies generally produce reliable estimates for domains with a sufficient number of sample units. Their bias is controlled through the various sampling and data collection procedures, and their variance is typically small enough to draw accurate conclusions. When the domain of interest contains few sample units, the survey estimates may become unstable to the point of being unusable even when their bias stays under control. To remedy a lack of data in a domain of interest, small area estimation methods may be considered. These methods offset the lack of observed data in a domain through model assumptions that link auxiliary data to survey data. Two types of models are commonly used: unit-level models and area-level models. The area-level model of Fay and Herriot (1979) is undoubtedly the most popular. It requires auxiliary data to be available at the domain level only, unlike unit-level models, which require auxiliary variables for each unit of the population $U$. Readers are referred to Rao and Molina (2015) for excellent coverage of the various approaches. Below, we focus on the Fay-Herriot model.

Suppose it is desired to estimate $D$ totals, $\theta_d = \sum_{k \in U_d} y_k$, $d = 1, \ldots, D$, where $U_d$ are $D$ disjoint subsets of the population. Using a probability survey, $\theta_d$ can be estimated by $\hat{\theta}_d = \sum_{k \in s_{P,d}} w_k y_k$, where $s_{P,d}$ is the set of sample units that fall within domain $d$. The estimator $\hat{\theta}_d$ is called the direct estimator of $\theta_d$ because it only uses $y$ values of units belonging to domain $d$. Small area estimation techniques generally lead to indirect estimators that combine the sample $y$ values of domain $d$ with $y$ values of units outside domain $d$. We assume that a vector of auxiliary variables is available at the area level, and these variables come from sources independent of the probability sample. This vector for domain $d$ is denoted by $\mathbf{x}_d$. For example, the vector $\mathbf{x}_d^{\prime} = (N_d, N_d \hat{\mu}_d^{\mathrm{NP}})$ could be considered, where $N_d$ is the population size in domain $d$, $\hat{\mu}_d^{\mathrm{NP}} = \sum_{k \in s_{\mathrm{NP},d}} y_k^* / n_d^{\mathrm{NP}}$ is the average of variable $y^*$ in a non-probability sample, $s_{\mathrm{NP},d}$ is the set of units in the non-probability sample that are in domain $d$ and $n_d^{\mathrm{NP}}$ is the size of the non-probability sample in domain $d$. If the population size $N_d$ is unknown, it can be replaced with an estimate independent of the probability survey. We use $\mathbf{X}$ to denote the matrix that contains the values of vector $\mathbf{x}_d$, $d = 1, \ldots, D$. Note that the vector $\boldsymbol{\delta}$ is hidden in the matrix $\mathbf{X}$ in this section.
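The two ingredients defined above, the direct estimator of a domain total and the area-level vector built from a non-probability sample, can be computed as in the following sketch. The function names, domain labels and data values are hypothetical and serve only to illustrate the formulas.

```python
from collections import defaultdict


def direct_estimates(domains, weights, y):
    """Direct estimator: sum of w_k * y_k over the probability-sample
    units that fall in each domain d."""
    totals = defaultdict(float)
    for d, w, yk in zip(domains, weights, y):
        totals[d] += w * yk
    return dict(totals)


def area_level_covariates(np_domains, np_y_star, pop_sizes):
    """Build x_d = (N_d, N_d * mean of y* over the non-probability
    sample units in domain d) for every domain observed in that sample."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d, ys in zip(np_domains, np_y_star):
        sums[d] += ys
        counts[d] += 1
    return {d: (pop_sizes[d], pop_sizes[d] * sums[d] / counts[d])
            for d in counts}


# Toy probability sample (hypothetical values).
p_domains = ["A", "A", "B"]
p_weights = [100.0, 150.0, 200.0]
p_y = [2.0, 4.0, 1.0]
theta_direct = direct_estimates(p_domains, p_weights, p_y)

# Toy non-probability sample and assumed-known domain population sizes.
np_domains = ["A", "B", "B", "A"]
np_y_star = [3.0, 1.0, 2.0, 5.0]
pop_sizes = {"A": 250, "B": 200}
x_by_domain = area_level_covariates(np_domains, np_y_star, pop_sizes)
```

A domain with no units in the non-probability sample gets no entry in `x_by_domain`. How to handle such domains is a modelling decision that this sketch leaves open.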

The Fay-Herriot model has two components: the sampling model and the linking model. The sampling model is based on the assumption that, conditionally on $\boldsymbol{\Omega}_P$, the direct estimators $\hat{\theta}_d$ are independent and unbiased, i.e., $E(\hat{\theta}_d \mid \boldsymbol{\Omega}_P) = \theta_d$. Their design variance is denoted by $\psi_d = \mathrm{var}(\hat{\theta}_d \mid \boldsymbol{\Omega}_P)$. The sampling model is usually written in the form:

$$\hat{\theta}_d = \theta_d + e_d, \tag{4.9}$$

where $e_d$ is the sampling error such that $E(e_d \mid \boldsymbol{\Omega}_P) = 0$ and $\mathrm{var}(e_d \mid \boldsymbol{\Omega}_P) = \psi_d$. The independence assumption of the estimators $\hat{\theta}_d$ (and therefore of the sampling errors $e_d$) can be questioned when the strata do not coincide with the domains of interest. Section 8.2 of Rao and Molina (2015) discusses methods that take into account correlated sampling errors. In practice, it is often assumed that these correlations are weak, and they are ignored.

The linking model assumes that, conditionally on $\mathbf{X}$, the totals $\theta_d$ are independent, $E(\theta_d \mid \mathbf{X}) = \mathbf{x}_d^{\prime} \boldsymbol{\beta}$ and $\mathrm{var}(\theta_d \mid \mathbf{X}) = b_d^2 \sigma_v^2$, where $b_d$ are known constants used for controlling heteroscedasticity and $\boldsymbol{\beta}$ and $\sigma_v^2$ are unknown model parameters. The linking model is usually written in the form:

$$\theta_d = \mathbf{x}_d'\boldsymbol{\beta} + b_d v_d, \tag{4.10}$$

where $v_d$ is the model error such that $E(v_d \mid \mathbf{X}) = 0$ and $\operatorname{var}(v_d \mid \mathbf{X}) = \sigma_v^2$. When the parameters of interest, $\theta_d$, are totals, it is often appropriate to let $b_d = N_d$. From (4.9) and (4.10), we obtain the combined model:

$$\hat{\theta}_d = \mathbf{x}_d'\boldsymbol{\beta} + a_d, \tag{4.11}$$

where $a_d = b_d v_d + e_d$ is the combined error. When using the Fay-Herriot model (4.11), inferences are usually made conditionally on $\mathbf{X}$. It can easily be shown that $E(a_d \mid \mathbf{X}) = 0$ and $\operatorname{var}(a_d \mid \mathbf{X}) = b_d^2\sigma_v^2 + \tilde{\psi}_d$, where $\tilde{\psi}_d = E(\psi_d \mid \mathbf{X})$ is called the smooth design variance (Beaumont and Bocci, 2016; and Hidiroglou, Beaumont and Yung, 2019).
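
The two moments of the combined error can be checked with the laws of total expectation and total variance. The outline below assumes, following the sampling model (4.9), that $E(e_d \mid \boldsymbol{\Omega}_P) = 0$ and $\operatorname{var}(e_d \mid \boldsymbol{\Omega}_P) = \psi_d$, and that $v_d$ and $\mathbf{X}$ are fixed once we condition on $\boldsymbol{\Omega}_P$. These conditions are taken as assumptions here, so the derivation is a sketch rather than a formal proof.

```latex
\begin{align*}
E(a_d \mid \mathbf{X})
  &= b_d\, E(v_d \mid \mathbf{X})
   + E\{ E(e_d \mid \boldsymbol{\Omega}_P) \mid \mathbf{X} \} = 0, \\
\operatorname{var}(a_d \mid \mathbf{X})
  &= E\{ \operatorname{var}(a_d \mid \boldsymbol{\Omega}_P) \mid \mathbf{X} \}
   + \operatorname{var}\{ E(a_d \mid \boldsymbol{\Omega}_P) \mid \mathbf{X} \} \\
  &= E(\psi_d \mid \mathbf{X}) + \operatorname{var}(b_d v_d \mid \mathbf{X})
   = \tilde{\psi}_d + b_d^2 \sigma_v^2 .
\end{align*}
```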

Now suppose that it is desired to predict the total $\theta_d$ using a linear predictor $\hat{\theta}_d^{\mathrm{LIN}} = \sum_{i=1}^{D} \lambda_{di}\hat{\theta}_i$, where $\lambda_{di}$ are constants to be determined. A linear predictor uses all the data from the probability sample for predicting $\theta_d$, not just the data from domain $d$. This explains how it derives its efficiency. However, not all linear predictors are appropriate for predicting $\theta_d$. A strategy often used for determining the constants $\lambda_{di}$ is to minimize the variance of the prediction error, $\operatorname{var}(\hat{\theta}_d^{\mathrm{LIN}} - \theta_d \mid \mathbf{X})$, subject to the constraint that the predictor must be unbiased, $E(\hat{\theta}_d^{\mathrm{LIN}} - \theta_d \mid \mathbf{X}) = 0$. The resulting predictor, called the Best Linear Unbiased Predictor (BLUP), is denoted by $\hat{\theta}_d^{\mathrm{BLUP}}$, and can be written in the form (see, for example, Rao and Molina, 2015):

$$\hat{\theta}_d^{\mathrm{BLUP}} = \gamma_d \hat{\theta}_d + (1 - \gamma_d)\, \mathbf{x}_d'\hat{\boldsymbol{\beta}}, \tag{4.12}$$

where $\gamma_d = b_d^2\sigma_v^2 / (b_d^2\sigma_v^2 + \tilde{\psi}_d)$ is bounded by 0 and 1, and

$$\hat{\boldsymbol{\beta}} = \left( \sum_{d=1}^{D} \frac{\mathbf{x}_d \mathbf{x}_d'}{b_d^2\sigma_v^2 + \tilde{\psi}_d} \right)^{-1} \sum_{d=1}^{D} \frac{\mathbf{x}_d}{b_d^2\sigma_v^2 + \tilde{\psi}_d}\, \hat{\theta}_d .$$
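
As an illustration of (4.12), the following minimal sketch computes $\hat{\boldsymbol{\beta}}$, $\gamma_d$ and the BLUP when $\sigma_v^2$ and $\tilde{\psi}_d$ are treated as known. The function names (`solve_linear`, `fay_herriot_blup`), the four-domain data set and the choice $b_d = 1$ are all hypothetical and serve only to make the formulas concrete.

```python
def solve_linear(a, rhs):
    """Solve a small square system a @ z = rhs by Gaussian elimination
    with partial pivoting (plain Python lists, no external libraries)."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def fay_herriot_blup(theta_hat, x, b, sigma2_v, psi_tilde):
    """Return (blup, gamma, beta_hat) following (4.12), variances assumed known."""
    p = len(x[0])
    # v_d = b_d^2 sigma_v^2 + psi_tilde_d is var(a_d | X) in the combined model
    v = [bd ** 2 * sigma2_v + ps for bd, ps in zip(b, psi_tilde)]
    # Weighted normal equations: (sum x_d x_d' / v_d) beta = sum x_d theta_hat_d / v_d
    xtx = [[sum(xd[j] * xd[k] / vd for xd, vd in zip(x, v)) for k in range(p)]
           for j in range(p)]
    xty = [sum(xd[j] * th / vd for xd, th, vd in zip(x, theta_hat, v))
           for j in range(p)]
    beta_hat = solve_linear(xtx, xty)
    gamma = [bd ** 2 * sigma2_v / vd for bd, vd in zip(b, v)]
    synthetic = [sum(xj * bj for xj, bj in zip(xd, beta_hat)) for xd in x]
    blup = [g * th + (1 - g) * sy
            for g, th, sy in zip(gamma, theta_hat, synthetic)]
    return blup, gamma, beta_hat


# Made-up data: four domains, an intercept and one auxiliary variable.
theta_hat = [12.0, 18.5, 31.0, 39.0]
x = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
b = [1.0, 1.0, 1.0, 1.0]
psi_tilde = [1.0, 4.0, 9.0, 100.0]  # last domain: very noisy direct estimator
blup, gamma, beta_hat = fay_herriot_blup(theta_hat, x, b, 4.0, psi_tilde)
for d, (g, est) in enumerate(zip(gamma, blup), start=1):
    print(f"domain {d}: gamma = {g:.3f}, BLUP = {est:.2f}")
```

In this toy setting the last domain has a large smooth design variance, so its $\gamma_d$ is close to 0 and its BLUP sits close to the synthetic value $\mathbf{x}_d'\hat{\boldsymbol{\beta}}$.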

The predictor (4.12) is a weighted average of the direct estimator $\hat{\theta}_d$ and a prediction, $\mathbf{x}_d'\hat{\boldsymbol{\beta}}$, often called the synthetic estimator. More weight is given to the direct estimator when the smooth design variance, $\tilde{\psi}_d$, is small relative to the variance of the linking model, $b_d^2\sigma_v^2$. The predictor $\hat{\theta}_d^{\mathrm{BLUP}}$ is then similar to the direct estimator. This situation normally occurs when the sample size in the domain is large. Conversely, if the direct estimator is unstable and has a large smooth design variance, more weight is given to the synthetic estimator.
If the number of domains is large, the prediction variance of $\hat{\theta}_d^{\mathrm{BLUP}}$, $\operatorname{var}(\hat{\theta}_d^{\mathrm{BLUP}} - \theta_d \mid \mathbf{X})$, is approximately equal to $\gamma_d\tilde{\psi}_d$. Since $\operatorname{var}(\hat{\theta}_d - \theta_d \mid \mathbf{X}) = \tilde{\psi}_d$, the constant $\gamma_d$ can be interpreted as a variance reduction factor resulting from using $\hat{\theta}_d^{\mathrm{BLUP}}$ instead of $\hat{\theta}_d$. Therefore, the variance reduction is greater when $\gamma_d$ is small, i.e., when the direct estimator is not precise. On the other hand, if the linking model is not properly specified, there is greater risk of significant bias when $\gamma_d$ is small. To better understand this point, suppose that the true linking model is such that $E(\theta_d \mid \mathbf{X}) = \mu(\mathbf{x}_d)$ for some function $\mu(\cdot)$. Under this model, it can be shown that the bias of the predictor $\hat{\theta}_d^{\mathrm{BLUP}}$ is given by

$$E(\hat{\theta}_d^{\mathrm{BLUP}} - \theta_d \mid \mathbf{X}) = -(1 - \gamma_d)\left( \mu(\mathbf{x}_d) - \mathbf{x}_d'\boldsymbol{\beta}_0 \right), \tag{4.13}$$

where

$$\boldsymbol{\beta}_0 = \left( \sum_{d=1}^{D} \frac{\mathbf{x}_d \mathbf{x}_d'}{b_d^2\sigma_v^2 + \tilde{\psi}_d} \right)^{-1} \sum_{d=1}^{D} \frac{\mathbf{x}_d}{b_d^2\sigma_v^2 + \tilde{\psi}_d}\, \mu(\mathbf{x}_d) .$$
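
A sketch of where (4.13) comes from may help. It assumes that $\gamma_d$ and the weights entering $\hat{\boldsymbol{\beta}}$ are fixed given $\mathbf{X}$, and that the direct estimator is design-unbiased, so that $E(\hat{\theta}_d \mid \mathbf{X}) = E(\theta_d \mid \mathbf{X}) = \mu(\mathbf{x}_d)$. Under these assumptions, taking the expectation of (4.12) term by term gives:

```latex
\begin{align*}
E(\hat{\boldsymbol{\beta}} \mid \mathbf{X})
  &= \left( \sum_{d=1}^{D} \frac{\mathbf{x}_d \mathbf{x}_d'}{b_d^2\sigma_v^2 + \tilde{\psi}_d} \right)^{-1}
     \sum_{d=1}^{D} \frac{\mathbf{x}_d}{b_d^2\sigma_v^2 + \tilde{\psi}_d}\, \mu(\mathbf{x}_d)
   = \boldsymbol{\beta}_0, \\
E(\hat{\theta}_d^{\mathrm{BLUP}} - \theta_d \mid \mathbf{X})
  &= \gamma_d\, \mu(\mathbf{x}_d) + (1 - \gamma_d)\, \mathbf{x}_d'\boldsymbol{\beta}_0 - \mu(\mathbf{x}_d) \\
  &= -(1 - \gamma_d) \left\{ \mu(\mathbf{x}_d) - \mathbf{x}_d'\boldsymbol{\beta}_0 \right\}.
\end{align*}
```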

If the linear model $\mu(\mathbf{x}_d) = \mathbf{x}_d'\boldsymbol{\beta}$ is valid, the bias disappears. Otherwise, the bias is generally not zero and increases in absolute value as $\gamma_d$ decreases or as the specification error of the linking model, $\mu(\mathbf{x}_d) - \mathbf{x}_d'\boldsymbol{\beta}_0$, increases in absolute value. When $\gamma_d$ is close to 1, the bias is usually negligible, but so is the variance reduction.
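
The trade-off described above can be seen numerically. The sketch below evaluates the bias (4.13) in a deliberately simplified setting: a single auxiliary variable with no intercept, so that $\boldsymbol{\beta}_0$ is a scalar. The curved mean function, the data values and the function name `blup_bias` are hypothetical choices made for illustration only.

```python
def blup_bias(x, mu, b, sigma2_v, psi_tilde):
    """Bias (4.13) in each domain for a single covariate x_d (no intercept).

    mu holds the values mu(x_d) of the assumed true linking-model mean."""
    v = [bd ** 2 * sigma2_v + ps for bd, ps in zip(b, psi_tilde)]
    # beta_0: weighted least-squares fit of mu(x_d) on x_d with weights 1 / v_d
    numerator = sum(xd * m / vd for xd, m, vd in zip(x, mu, v))
    denominator = sum(xd * xd / vd for xd, vd in zip(x, v))
    beta0 = numerator / denominator
    gamma = [bd ** 2 * sigma2_v / vd for bd, vd in zip(b, v)]
    bias = [-(1 - g) * (m - xd * beta0) for g, m, xd in zip(gamma, mu, x)]
    return beta0, gamma, bias


# Made-up example: the true mean is curved, but a straight line through
# the origin is fitted.
x = [1.0, 2.0, 3.0, 4.0]
mu_curved = [xd + 0.5 * xd ** 2 for xd in x]
b = [1.0] * 4
psi_tilde = [1.0] * 4
for sigma2_v in (1.0, 9.0):
    beta0, gamma, bias = blup_bias(x, mu_curved, b, sigma2_v, psi_tilde)
    print(f"sigma2_v = {sigma2_v}: gamma = {gamma[0]:.2f}, "
          f"bias = {[round(value, 3) for value in bias]}")
```

Because the design variances are equal across domains here, $\boldsymbol{\beta}_0$ does not change with $\sigma_v^2$, and the bias in every domain shrinks in proportion to $1 - \gamma_d$ as $\sigma_v^2$ grows.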

Remark: Note that the predictor $\hat{\theta}_d^{\mathrm{BLUP}}$ and the bias (4.13) depend on the variance $\sigma_v^2$. If the linear model (4.10) is not valid, the parameters $\boldsymbol{\beta}$ and $\sigma_v^2$ no longer exist. Yet, the linking model (4.10) can still be postulated and its parameters can be estimated from the observed data as if the model were valid.
The model variance $\sigma_v^2$, which enters in the calculation of the predictor $\hat{\theta}_d^{\mathrm{BLUP}}$ and the bias (4.13), can be viewed as being the value towards which an estimator of $\sigma_v^2$ converges.

The predictor (4.12) cannot be calculated because it depends on the unknown variances $\sigma_v^2$ and $\tilde{\psi}_d$. When $\sigma_v^2$ and $\tilde{\psi}_d$ in (4.12) are replaced with estimators $\hat{\sigma}_v^2$ and $\hat{\tilde{\psi}}_d$, the BLUP (4.12) becomes the empirical best linear unbiased predictor, denoted as $\hat{\theta}_d^{\mathrm{EBLUP}}$. There are a number of methods for estimating $\sigma_v^2$ (see Rao and Molina, 2015). One of the most commonly used methods is restricted maximum likelihood.
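
To make the restricted maximum likelihood step concrete, here is a minimal sketch for the simplified case of a single auxiliary variable with no intercept, assuming normally distributed combined errors and treating the design variance estimates as known. The numerical search (a coarse grid followed by a golden-section refinement) is a convenience chosen for this illustration, not a method prescribed by the text, and it assumes the restricted log-likelihood has a single maximum on the search interval. The names `restricted_loglik` and `reml_sigma2_v` and the data are hypothetical.

```python
import math


def restricted_loglik(sigma2_v, theta_hat, x, b, psi_hat):
    """Restricted log-likelihood of sigma2_v (up to an additive constant)
    for the combined model (4.11) with one covariate and no intercept."""
    v = [bd ** 2 * sigma2_v + ps for bd, ps in zip(b, psi_hat)]
    sxx = sum(xd * xd / vd for xd, vd in zip(x, v))
    beta = sum(xd * th / vd for xd, th, vd in zip(x, theta_hat, v)) / sxx
    rss = sum((th - xd * beta) ** 2 / vd for xd, th, vd in zip(x, theta_hat, v))
    return -0.5 * (sum(math.log(vd) for vd in v) + math.log(sxx) + rss)


def reml_sigma2_v(theta_hat, x, b, psi_hat, upper, n_grid=200, n_refine=60):
    """Maximize the restricted log-likelihood over [0, upper]."""
    def f(s):
        return restricted_loglik(s, theta_hat, x, b, psi_hat)

    step = upper / n_grid
    best = max(range(n_grid + 1), key=lambda k: f(k * step))
    lo, hi = max(0.0, (best - 1) * step), min(upper, (best + 1) * step)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):  # golden-section search for a maximum
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if f(c) < f(d):
            lo = c
        else:
            hi = d
    return 0.5 * (lo + hi)


# Made-up data: six domains with equal design variance estimates.
theta_hat = [3.0, 1.0, 9.5, 6.0, 13.0, 9.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [1.0] * 6
psi_hat = [1.0] * 6
sigma2_v_hat = reml_sigma2_v(theta_hat, x, b, psi_hat, upper=100.0)
print(f"REML estimate of sigma2_v: {sigma2_v_hat:.3f}")
```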
To estimate $\tilde{\psi}_d$, we assume that a design-unbiased estimator of $\psi_d$ is available, denoted by $\hat{\psi}_d$. This assumption is formally written: $E(\hat{\psi}_d \mid \boldsymbol{\Omega}_P) = \psi_d$. It follows that $E(\hat{\psi}_d \mid \mathbf{X}) = \tilde{\psi}_d$. Therefore, the estimator $\hat{\psi}_d$ is unbiased for $\tilde{\psi}_d$, but can be very unstable when the domain sample size is small.
A more efficient approach for estimating $\tilde{\psi}_d$ involves modelling $\hat{\psi}_d$ given the auxiliary variables $\mathbf{x}_d$. In practice, a linear model is often used for $\log(\hat{\psi}_d)$, and it is assumed that the model errors follow a normal distribution (for example, Rivest and Belmonte, 2000). Beaumont and Bocci (2016) (see also Hidiroglou et al., 2019) provide a method of moments for estimating $\tilde{\psi}_d$ that does not require the normality assumption.
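
The log-linear smoothing of the direct variance estimates described above can be sketched as follows. This is a minimal illustration, not the exact procedure of any of the cited authors: the function name `smooth_variances` is hypothetical, the model is fitted by ordinary least squares, and the back-transformation $\exp(\text{fitted} + \hat{\sigma}_\varepsilon^2 / 2)$ relies on the normality assumption for the model errors (it is the mean of a lognormal variable).

```python
import numpy as np

def smooth_variances(psi_hat, X):
    """Smooth direct variance estimates with a log-linear model (sketch).

    psi_hat : direct (design-unbiased) variance estimates, one per domain, all > 0
    X       : domain-level auxiliary variables, shape (m,) or (m, p)
    Returns smoothed variances, one per domain.
    """
    psi_hat = np.asarray(psi_hat, dtype=float)
    # Design matrix with an intercept column (an assumption of this sketch).
    Z = np.column_stack([np.ones(len(psi_hat)), np.asarray(X, dtype=float)])
    log_psi = np.log(psi_hat)

    # Ordinary least squares fit of log(psi_hat) on the auxiliary variables.
    coef, *_ = np.linalg.lstsq(Z, log_psi, rcond=None)
    fitted = Z @ coef

    # Residual variance with the usual degrees-of-freedom correction.
    resid = log_psi - fitted
    dof = len(psi_hat) - Z.shape[1]
    sigma2_eps = resid @ resid / dof

    # Lognormal mean: E(psi_hat | x) = exp(x'alpha + sigma2_eps / 2)
    # when the errors on the log scale are normal.
    return np.exp(fitted + sigma2_eps / 2.0)

# Illustrative call with made-up numbers.
example = smooth_variances([1.2, 0.8, 2.5, 1.9, 3.1], [10.0, 8.0, 20.0, 15.0, 24.0])
```

The smoothed values borrow strength across domains, so they are typically much less variable than the direct estimates when domain sample sizes are small. The price is the reliance on the log-linear model and on the normality of its errors.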

The Fay-Herriot model requires the availability of auxiliary data only at the domain level. The variable $y$ must be measured without error in the probability survey, but it is not essential for the auxiliary source to be perfect. This leaves the door open to all kinds of files external to the probability survey such as big data files. Kim, Wang, Zhu and Cruze (2018) is a recent example where an extension of the Fay-Herriot model was used with auxiliary data from satellite images. Small area estimation methods often achieve significant and sometimes impressive variance reductions (see, for example, Hidiroglou et al., 2019). The trade-off for obtaining these gains is the introduction of model assumptions and the risk that these assumptions do not hold. Therefore, model validation is a critical step in producing small area estimates, as in any model-based approach.
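
To make the Fay-Herriot EBLUP more concrete, here is a minimal sketch assuming the standard form of the predictor, $\hat{\theta}_d^{\text{EBLUP}} = \hat{\gamma}_d \hat{\theta}_d + (1 - \hat{\gamma}_d)\, \mathbf{x}_d^{\top} \hat{\boldsymbol{\beta}}$ with $\hat{\gamma}_d = \hat{\sigma}_v^2 / (\hat{\sigma}_v^2 + \psi_d)$, where `psi` stands for the (smoothed) design variances. The function names are hypothetical, and $\sigma_v^2$ is estimated by maximizing the restricted likelihood over a crude grid only to keep the example dependency-free; a production implementation would use a proper optimizer or scoring algorithm.

```python
import numpy as np

def fit_beta(theta_hat, Z, psi, sigma2_v):
    """Weighted least squares estimate of beta for a given sigma2_v."""
    w = 1.0 / (sigma2_v + psi)
    A = Z.T @ (w[:, None] * Z)          # sum of x_d x_d' / (sigma2_v + psi_d)
    b = Z.T @ (w * theta_hat)
    return np.linalg.solve(A, b), A

def neg_restricted_loglik(sigma2_v, theta_hat, Z, psi):
    """Negative restricted log-likelihood of the Fay-Herriot model (up to a constant)."""
    beta, A = fit_beta(theta_hat, Z, psi, sigma2_v)
    resid = theta_hat - Z @ beta
    v = sigma2_v + psi
    _, logdet = np.linalg.slogdet(A)
    return 0.5 * (np.sum(np.log(v)) + logdet + np.sum(resid**2 / v))

def fay_herriot_eblup(theta_hat, X, psi, n_grid=2001):
    """Return (EBLUP per domain, estimated sigma2_v, estimated beta)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    psi = np.asarray(psi, dtype=float)
    Z = np.column_stack([np.ones(len(theta_hat)), np.asarray(X, dtype=float)])

    # Heuristic upper bound for the grid (an assumption of this sketch).
    upper = np.var(theta_hat) + psi.max()
    grid = np.linspace(0.0, upper, n_grid)
    values = [neg_restricted_loglik(s, theta_hat, Z, psi) for s in grid]
    sigma2_v = grid[int(np.argmin(values))]

    beta, _ = fit_beta(theta_hat, Z, psi, sigma2_v)
    gamma = sigma2_v / (sigma2_v + psi)  # weight given to the direct estimate
    eblup = gamma * theta_hat + (1.0 - gamma) * (Z @ beta)
    return eblup, sigma2_v, beta
```

Domains with a large design variance relative to $\hat{\sigma}_v^2$ receive a small $\hat{\gamma}_d$ and are pulled towards the regression prediction; domains with precise direct estimates are left nearly unchanged. The function takes only domain-level inputs, which reflects the point made above that the auxiliary data are needed only at the domain level.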

Small area estimation methods are generally used to improve the efficiency of estimators for domains with a small sample size. They could also be used to reduce the data collection costs and respondent burden by reducing the overall sample size of a probability survey for a few, if not all, survey variables. The estimates obtained from the reduced sample and the Fay-Herriot model, for example, could thus have a precision similar to the direct estimates from the probability survey obtained from the full sample. In this context, small area estimation methods would not be used to improve the precision for domains containing few units, but instead to reduce the overall data collection effort while preserving the quality of the estimates.
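
A rough back-of-the-envelope calculation illustrates the idea. It rests on simplifying assumptions that are not made in the text: $\boldsymbol{\beta}$ and $\sigma_v^2$ are treated as known, so that the model mean squared error of the BLUP is $\gamma_d \tilde{\psi}_d$ with $\gamma_d = \sigma_v^2 / (\sigma_v^2 + \tilde{\psi}_d)$, and the design variance is taken to be inversely proportional to the domain sample size, so that keeping a fraction $f$ of the sample inflates $\tilde{\psi}_d$ to $\tilde{\psi}_d / f$. Under these assumptions, the BLUP from the reduced sample is as precise as the direct estimate from the full sample when

$$
\frac{\sigma_v^2 \, \tilde{\psi}_d / f}{\sigma_v^2 + \tilde{\psi}_d / f}
= \frac{\sigma_v^2 \, \tilde{\psi}_d}{f \sigma_v^2 + \tilde{\psi}_d}
= \tilde{\psi}_d
\quad \Longleftrightarrow \quad
f = 1 - \frac{\tilde{\psi}_d}{\sigma_v^2},
\qquad \tilde{\psi}_d < \sigma_v^2 .
$$

For example, if $\tilde{\psi}_d = \sigma_v^2 / 2$, then $f = 1/2$: half the sample would suffice in this idealized setting. In practice, the achievable reduction is smaller, because $\boldsymbol{\beta}$ and $\sigma_v^2$ must be estimated and the model may not hold exactly.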

