Are probability surveys bound to disappear for the production of official statistics?
Section 3. Design-based approaches

Design-based approaches yield design-consistent estimators of $\theta$ even when the non-probability source produces estimates with a significant selection bias. In this context, the purpose of using a non-probability sample is to reduce the variance of estimators of $\theta$. The efficiency gains achieved can be used to justify a reduction of the probability sample size, and thereby a reduction of the data collection costs and respondent burden. The methods that we consider in Sections 3.1 and 3.2 require collecting the values of the variable of interest $y$ in the probability sample, just like the small area estimation methods described in Section 4.4. However, the efficiency gains are usually expected to be more modest than those obtained using small area estimation methods.
In Section 3.1, we consider the scenario $y_k^* = y_k$, whereas in Section 3.2, we consider the scenario $y_k^* \neq y_k$.

3.1  Weighting by the inverse of the probability of inclusion in the combined sample

The ideal case occurs when the non-probability sample is a census, i.e., $s_{NP} = U$. In that case, the value of the parameter of interest $\theta = \sum_{k \in U} y_k$ can be directly calculated without worrying about bias or variance since $y_k^* = y_k$ is assumed in this section. In general, we expect under-coverage in the sense that $s_{NP}$ is smaller than the population $U$. In a design-based approach, the potential under-coverage bias can be addressed by selecting a probability sample $s_P$ from $U$ and collecting the values of the variable $y$ for the sample units.
Ideally, the probability sample is drawn from $U - s_{NP}$, but it is possible that the units in $s_{NP}$ cannot be linked to those of the sampling frame $U$ to establish the set $U - s_{NP}$. In general, the larger the non-probability sample, the more it is possible to reduce the size of the probability sample without jeopardizing the desired precision of the estimates.

It seems desirable to estimate $\theta$ using all the data collected in the combined sample $s = s_P \cup s_{NP}$. The inclusion indicator in $s$ can be defined as $\tilde{I}_k = \delta_k + (1 - \delta_k) I_k$. To obtain a design-unbiased estimator of $\theta$, each unit $k \in s$ is weighted by $\tilde{w}_k = \tilde{\pi}_k^{-1}$, where $\tilde{\pi}_k = E(\tilde{I}_k \mid \boldsymbol{\Omega}_P)$. Under assumptions 1 and 2, $E(I_k \mid \boldsymbol{\Omega}_P) = \pi_k$ and we obtain

$$\tilde{\pi}_k = E(\tilde{I}_k \mid \boldsymbol{\Omega}_P) = \delta_k + (1 - \delta_k)\,\pi_k.$$

The resulting estimator is written:

$$\hat{\theta} = \sum_{k \in s} \tilde{w}_k y_k = \sum_{k \in s_{NP}} y_k + \sum_{k \in s_P} \frac{1}{\pi_k} (1 - \delta_k)\, y_k. \tag{3.1}$$
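As a minimal sketch of how estimator (3.1) could be computed, the Python snippet below adds the unweighted total over the non-probability sample to the design-weighted total over the units of the probability sample that are not in the non-probability sample. The function name `combined_estimator`, the tuple layout and the toy values are assumptions made for illustration only; they do not come from the paper.

```python
def combined_estimator(y_np, sample_p):
    """Sketch of estimator (3.1).

    y_np     : y values for every unit of the non-probability sample s_NP
    sample_p : (y_k, pi_k, delta_k) tuples for the units of s_P, where
               delta_k = 1 if the unit also belongs to s_NP, else 0
    """
    # Units of s_NP are included with certainty, so their weight is 1.
    total_np = sum(y_np)
    # Units of s_P outside s_NP receive the design weight 1 / pi_k;
    # units of s_P inside s_NP are already counted in total_np.
    total_uncovered = sum((1 - d) * y / pi for y, pi, d in sample_p)
    return total_np + total_uncovered


# Hypothetical toy data: 4 units in s_NP, 3 units in s_P (one overlapping).
y_np = [10.0, 12.0, 8.0, 15.0]
sample_p = [(12.0, 0.25, 1), (20.0, 0.25, 0), (6.0, 0.5, 0)]

theta_hat = combined_estimator(y_np, sample_p)
print(theta_hat)  # 45 + 20/0.25 + 6/0.5 = 137.0
```

The overlapping unit has an indicator equal to 1, so it contributes only through the first sum, which avoids double counting.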

Note that estimator (3.1) requires the indicator $\delta_k$ to be available for all units in the sample $s_P$. For the units $k \in s_P \cap s_{NP}$, we have two values: $y_k$ and $y_k^*$. In principle, we should have $y_k^* = y_k$, but it is possible that this relationship is not exactly satisfied. These units can be used to validate the assumption $y_k^* \approx y_k$. If significant differences are observed, it may be preferable to not consider this approach and to rely on the methods in Section 3.2 that use data from the non-probability source as auxiliary data.
If we trust the data quality of the non-probability source, it may be advisable not to collect the variable $y$ in the probability sample for the units also present in the non-probability sample in order to reduce the data collection costs and respondent burden.
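The text does not prescribe a particular way of checking the agreement between the two values on the overlapping units. One simple possibility, sketched below in Python, is to summarize the differences between the non-probability value and the survey value and to count the units whose relative difference exceeds a tolerance. The function name `compare_overlap`, the 5% tolerance and the toy values are hypothetical choices, and the comparison is unweighted for simplicity.

```python
def compare_overlap(pairs, rel_tol=0.05):
    """Summarize y_star - y for units observed in both samples.

    pairs   : list of (y_k, y_star_k) for units in both s_P and s_NP
    rel_tol : hypothetical relative tolerance used to flag a discrepancy
    """
    n = len(pairs)
    diffs = [y_star - y for y, y_star in pairs]
    mean_diff = sum(diffs) / n
    # A unit is flagged when |y_star - y| exceeds rel_tol * |y|.
    flagged = sum(
        1 for y, y_star in pairs
        if abs(y_star - y) > rel_tol * max(abs(y), 1e-12)
    )
    return {"n": n, "mean_diff": mean_diff, "share_flagged": flagged / n}


# Hypothetical overlap of four units: (survey value, non-probability value).
pairs = [(100.0, 100.0), (50.0, 52.0), (80.0, 80.0), (40.0, 30.0)]
summary = compare_overlap(pairs)
print(summary)
```

A large mean difference or a high share of flagged units would point toward the methods of Section 3.2 rather than estimator (3.1).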

We can view the problem as if we had two sampling frames: $U$ and $s_{NP}$. A sample $s_P$ is drawn randomly from $U$ and a census is taken from $s_{NP}$.
The probability of selection in the sample $s$, $\Pr(k \in s \mid \boldsymbol{\Omega}_P)$, can then be calculated for each unit $k \in U$, and the estimator (3.1) is recovered by weighting each unit $k \in s$ by the inverse of that probability. This approach was proposed by Bankier (1986) to address the problem of multiple sampling frames. In the context of integrating a probability and non-probability sample, estimator (3.1) was proposed by Kim and Tam (2020).
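The two-frame view can be checked numerically on a small example. The Python sketch below treats membership in the non-probability sample as fixed, draws the probability sample by simple random sampling without replacement, weights each unit of the combined sample by the inverse of its combined inclusion probability, and averages the estimator over every possible probability sample. The population values, the membership indicators and the sampling design are assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical population of N = 6 units; delta[k] = 1 marks membership in s_NP.
y = [3.0, 7.0, 1.0, 9.0, 4.0, 6.0]
delta = [1, 1, 0, 0, 1, 0]
N, n = len(y), 3
pi = n / N  # inclusion probability in s_P under simple random sampling

theta = sum(y)
s_np = {k for k in range(N) if delta[k] == 1}

# Combined inclusion probability: 1 for units of s_NP, pi otherwise.
pi_tilde = [d + (1 - d) * pi for d in delta]

estimates = []
for s_p in combinations(range(N), n):
    s = set(s_p) | s_np  # combined sample
    estimates.append(sum(y[k] / pi_tilde[k] for k in s))

# Every sample is equally likely, so the plain average is the design expectation.
expected = sum(estimates) / len(estimates)
print(theta, expected)
```

In this toy setting the average over all 20 possible probability samples equals the population total, which is what design-unbiasedness of (3.1) requires.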

The last sum of (3.1) is a design-unbiased estimator of $\sum_{k \in U} (1 - \delta_k)\, y_k = \sum_{k \in U - s_{NP}} y_k$. If a vector of auxiliary variables, $\mathbf{x}_k$, is available for $k \in s_P$ as well as the total $\mathbf{T}_{\mathbf{x}} = \sum_{k \in U} \mathbf{x}_k$, then the weight $1/\pi_k$ in (3.1) can be replaced with a calibrated weight $w_k$ (e.g., Deville and Särndal, 1992; Haziza and Beaumont, 2017). The calibrated weights minimize a distance function between $w_k$ and $1/\pi_k$, $k \in s_P$, under the constraint of satisfying the calibration equation $\sum_{k \in s_P} w_k \mathbf{x}_k = \mathbf{T}_{\mathbf{x}}$. Ideally, the calibration is done only on the portion not covered by the non-probability sample, $U - s_{NP}$; i.e., the calibration vector $(1 - \delta_k)\, \mathbf{x}_k$ is used, and the calibration equation becomes: $\sum_{k \in s_P} w_k (1 - \delta_k)\, \mathbf{x}_k = \sum_{k \in U - s_{NP}} \mathbf{x}_k$. This is not possible when $\sum_{k \in U - s_{NP}} \mathbf{x}_k$ is unknown.
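To make the calibration on the uncovered portion concrete, the Python sketch below uses one common choice of distance function, the chi-square distance, with a single scalar auxiliary variable. In that special case the calibrated weights have a closed form. The function name `linear_calibration`, the scalar setting, the toy sample and the assumed known total over the uncovered portion are illustrative assumptions; other distance functions or a vector of auxiliary variables would need a different solver.

```python
def linear_calibration(d, z, target):
    """Chi-square-distance calibration for a scalar calibration variable.

    Minimizes sum((w_k - d_k)**2 / d_k) subject to sum(w_k * z_k) == target,
    whose solution is w_k = d_k * (1 + lam * z_k).
    """
    ht_total = sum(dk * zk for dk, zk in zip(d, z))
    denom = sum(dk * zk * zk for dk, zk in zip(d, z))
    if denom == 0:
        raise ValueError("calibration variable is zero for every sampled unit")
    lam = (target - ht_total) / denom
    return [dk * (1 + lam * zk) for dk, zk in zip(d, z)]


# Hypothetical probability sample of 4 units.
pi = [0.25, 0.25, 0.5, 0.5]
delta = [1, 0, 0, 0]          # first unit also belongs to s_NP
x = [2.0, 3.0, 1.0, 4.0]
y = [12.0, 20.0, 6.0, 25.0]
x_total_uncovered = 30.0      # assumed known total of x over U - s_NP

d = [1 / p for p in pi]                          # design weights 1 / pi_k
z = [(1 - dk) * xk for dk, xk in zip(delta, x)]  # calibration variable (1 - delta_k) x_k
w = linear_calibration(d, z, x_total_uncovered)

calibrated_total = sum(wk * zk for wk, zk in zip(w, z))
y_uncovered_hat = sum(wk * (1 - dk) * yk for wk, dk, yk in zip(w, delta, y))
print(calibrated_total, y_uncovered_hat)
```

Units that belong to the non-probability sample have a calibration variable equal to zero, so their weights are left unchanged, and they do not enter the last sum of (3.1) in any case. With this distance function the calibrated weights are not constrained to be positive.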

Remark: If assumption 2 is not appropriate, then $E(I_k \mid \boldsymbol{\Omega}_P) \neq E(I_k \mid \mathbf{Z}) = \pi_k$. To get around this problem, all the units for which the data were collected after selecting the sample $s_P$ can be removed from $s_{NP}$. Assumption 2 is then satisfied, but a lot of available data may be omitted.
To take advantage of the full set $s_{NP}$, it is necessary to make a few assumptions and partially depart from the design-based approach. Assuming that $E(I_k \mid \boldsymbol{\Omega}_P) = \Pr(I_k = 1 \mid \delta_k, \mathbf{Y}, \boldsymbol{\Omega})$, we can use Bayes' theorem to show that

$$\Pr(I_k = 1 \mid \delta_k = 0, \mathbf{Y}, \boldsymbol{\Omega}) = \frac{1 - \Pr(\delta_k = 1 \mid I_k = 1, \mathbf{Y}, \boldsymbol{\Omega})}{1 - \Pr(\delta_k = 1 \mid \mathbf{Y}, \boldsymbol{\Omega})}\, \pi_k,$$

for the units $k \in U - s_{\text{NP}}$. Therefore, estimating $E(I_k \mid \boldsymbol{\Omega}_P)$ requires postulating a model for $\delta_k$. Under some assumptions, $\Pr(\delta_k = 1 \mid I_k = 1, \mathbf{Y}, \boldsymbol{\Omega})$ can be estimated using the data from the probability sample and, for example, a logistic regression model. Estimating $\Pr(\delta_k = 1 \mid \mathbf{Y}, \boldsymbol{\Omega})$ can be done using the methods described in Section 4.3 that do not rely on the validity of assumption 2, such as the method by Chen, Li and Wu (2019). These methods require that the auxiliary variables used to model this probability be available for all units of the combined sample $s = s_P \cup s_{\text{NP}}$. Unlike in Section 4.3, here we can take advantage of the availability of $y_k$ for all units of both samples, and we can use the variable of interest as an auxiliary variable. Then, $\theta$ is estimated by replacing $\pi_k$ in (3.1) with an estimate of $\Pr(I_k = 1 \mid \delta_k = 0, \mathbf{Y}, \boldsymbol{\Omega})$. Similar approaches were proposed by Beaumont, Bocci and Hidiroglou (2014) to take into account late respondents in Statistics Canada’s National Household Survey, i.e., households that responded to the initial questionnaire after the follow-up probability sample of non-respondents was drawn.
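As an illustration, the Bayes adjustment of $\pi_k$ can be sketched in a few lines of Python. This is only a minimal sketch: the function name `adjusted_inclusion_prob` and its argument names are hypothetical, the two participation probabilities are assumed to have been estimated beforehand (for example, with a logistic regression and with one of the Section 4.3 methods), and the numerical values are made up.

```python
def adjusted_inclusion_prob(pi_k, p_np_given_sampled, p_np):
    """Return Pr(I_k = 1 | delta_k = 0, Y, Omega) for a unit outside s_NP.

    pi_k               : design inclusion probability of unit k
    p_np_given_sampled : estimate of Pr(delta_k = 1 | I_k = 1, Y, Omega)
    p_np               : estimate of Pr(delta_k = 1 | Y, Omega)
    (All three argument names are illustrative assumptions.)
    """
    if not 0.0 < pi_k <= 1.0:
        raise ValueError("pi_k must lie in (0, 1]")
    if not 0.0 <= p_np_given_sampled <= 1.0:
        raise ValueError("p_np_given_sampled must lie in [0, 1]")
    if not 0.0 <= p_np < 1.0:
        raise ValueError("p_np must lie in [0, 1)")
    # Bayes' theorem: multiply pi_k by the ratio of non-participation
    # probabilities (among sampled units over the whole population).
    return (1.0 - p_np_given_sampled) / (1.0 - p_np) * pi_k


# Hypothetical values for one unit of the probability sample with delta_k = 0.
pi_k = 0.10
p_given_sampled = 0.50
p_overall = 0.25
adjusted = adjusted_inclusion_prob(pi_k, p_given_sampled, p_overall)
```

With these made-up values the adjusted probability is roughly 0.067. When the two participation probabilities coincide, the ratio equals one and $\pi_k$ is left unchanged.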

3.2 Calibration of the probability sample to the non-probability source

Data from non-probability sources, such as those provided by web panel respondents, can be fraught with measurement errors large enough to cast doubt on the assumption that $y_k^* \approx y_k$. Therefore, such data cannot be used to directly replace the values of the variable $y$. However, they can be used as auxiliary data to enhance the probability survey using the calibration technique. The non-probability source contains the values $y_k^*$ for $k \in s_{\text{NP}}$ and potentially the values of other variables. From all these variables, it is possible to form a vector of auxiliary variables $\mathbf{x}_k^*$, available for $k \in s_{\text{NP}}$, that could include an intercept. Its total is denoted as $\mathbf{T}_{\mathbf{x}^*} = \sum_{k \in s_{\text{NP}}} \mathbf{x}_k^* = \sum_{k \in U} \delta_k \mathbf{x}_k^*$. Another vector of auxiliary variables, $\mathbf{x}_k$, may also be available for $k \in s_P$, as well as its total for the entire population $U$, $\mathbf{T}_{\mathbf{x}} = \sum_{k \in U} \mathbf{x}_k$. The calibrated weights $w_k$, $k \in s_P$, are obtained by minimizing a distance function between $w_k$ and $1/\pi_k$, $k \in s_P$, under the constraint of satisfying the calibration equation

$$\sum_{k \in s_P} w_k \begin{pmatrix} \mathbf{x}_k \\ \delta_k \mathbf{x}_k^* \end{pmatrix} = \begin{pmatrix} \mathbf{T}_{\mathbf{x}} \\ \mathbf{T}_{\mathbf{x}^*} \end{pmatrix}.$$

Note that this calibration can be done only if $\mathbf{x}_k^*$ is available in the probability sample for all units $k \in s_P \cap s_{\text{NP}}$. The estimator of $\theta$ is again written as $\hat{\theta} = \sum_{k \in s_P} w_k y_k$, where $w_k$ is the calibrated weight satisfying the above calibration equation. No model assumption is required for the validity of the approach, and the resulting estimates remain design-consistent regardless of the strength of the relationship between $y_k$ and the auxiliary variables $\mathbf{x}_k$ and $\mathbf{x}_k^*$. A strong relationship will help reduce the design variance of $\hat{\theta}$, $\operatorname{var}(\hat{\theta} \mid \boldsymbol{\Omega}_P)$. Kim and Tam (2020) discuss the use of such calibration.
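The calibration step can be illustrated with a minimal Python sketch. It assumes a chi-square distance, a choice made here for illustration only since the text leaves the distance function unspecified. Under that choice the calibrated weights have the closed form $w_k = d_k (1 + \mathbf{z}_k^{\top} \boldsymbol{\lambda})$, with $d_k = 1/\pi_k$ and $\mathbf{z}_k$ the stacked vector of $\mathbf{x}_k$ and $\delta_k \mathbf{x}_k^*$. The function names and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular calibration system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def linear_calibration(design_weights, z, totals):
    """Chi-square-distance calibration: w_k = d_k * (1 + z_k' lam)."""
    p = len(totals)
    pairs = list(zip(design_weights, z))
    # Weighted cross-product matrix: sum over the sample of d_k z_k z_k'.
    a = [[sum(d * zk[i] * zk[j] for d, zk in pairs) for j in range(p)]
         for i in range(p)]
    # Gap between the known totals and their design-weighted estimates.
    gap = [totals[i] - sum(d * zk[i] for d, zk in pairs) for i in range(p)]
    lam = solve_linear_system(a, gap)
    return [d * (1.0 + sum(l * v for l, v in zip(lam, zk))) for d, zk in pairs]


# Toy data (made up): 4 sampled units from a population of N = 100,
# with x_k = 1 (intercept) and x_k^* = 1, so that z_k = (1, delta_k).
design_weights = [25.0, 25.0, 25.0, 25.0]
delta = [1, 0, 0, 1]                      # membership in s_NP
z = [[1.0, float(d)] for d in delta]
totals = [100.0, 40.0]                    # T_x = N and T_x* = size of s_NP
weights = linear_calibration(design_weights, z, totals)

y = [3.0, 5.0, 4.0, 2.0]                  # y_k observed in the probability sample
theta_hat = sum(w * yk for w, yk in zip(weights, y))
```

In this toy case the calibrated weights reproduce both totals exactly. Other distance functions, such as the one leading to raking, would require an iterative solution rather than this closed form.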

Canada’s Labour Force Survey (LFS) provides an example of a potential application of this calibration method. The unemployment rate, defined as the number of unemployed persons divided by the number of persons in the labour force, is a key parameter that the LFS estimates. To improve the precision of the LFS estimates, a calibration variable indicating whether an individual is receiving employment insurance could be effective, because receiving employment insurance is clearly related to being unemployed. The total of this calibration variable, the number of employment insurance beneficiaries, is needed to implement the calibration and is available from an administrative source. However, applying this method would require either adding a question to the LFS to identify respondents who are receiving employment insurance or obtaining this information through a linkage between the LFS and the administrative source. It remains to be determined whether such a calibration variable would yield significant gains in the LFS.
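To make the idea concrete, the following Python sketch applies it to a toy data set. It is not the LFS weighting procedure: all figures are invented, the function names are hypothetical, and it assumes that the population size is known in addition to the number of beneficiaries. Under that assumption, calibrating on an intercept and the employment insurance indicator reduces to rescaling the weights separately for recipients and non-recipients.

```python
def calibrate_on_ei(design_weights, receives_ei, n_beneficiaries, population_size):
    """Rescale weights so weighted counts match the two known totals."""
    pairs = list(zip(design_weights, receives_ei))
    est_ei = sum(d for d, b in pairs if b)
    est_other = sum(d for d, b in pairs if not b)
    if est_ei == 0 or est_other == 0:
        raise ValueError("both groups must be represented in the sample")
    # One adjustment factor per group (recipients, non-recipients).
    f_ei = n_beneficiaries / est_ei
    f_other = (population_size - n_beneficiaries) / est_other
    return [d * (f_ei if b else f_other) for d, b in pairs]


def unemployment_rate(weights, unemployed, in_labour_force):
    """Weighted unemployed count divided by weighted labour force count."""
    num = sum(w for w, u in zip(weights, unemployed) if u)
    den = sum(w for w, l in zip(weights, in_labour_force) if l)
    return num / den


# Invented toy sample of six respondents, each with design weight 100.
design_weights = [100.0] * 6
receives_ei = [True, True, False, False, False, False]
unemployed = [True, False, True, False, False, False]
in_labour_force = [True, True, True, True, True, False]

# Hypothetical administrative total and population size.
weights = calibrate_on_ei(design_weights, receives_ei,
                          n_beneficiaries=150.0, population_size=600.0)
rate = unemployment_rate(weights, unemployed, in_labour_force)
```

Because the rate is a ratio of two weighted totals, both its numerator and its denominator are affected by the calibration. Whether the variance actually decreases depends on how strongly employment insurance receipt is related to unemployment status in the real data.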

