Comments on “Statistical inference with non-probability survey samples” – Miniaturizing data defect correlation: A versatile strategy for handling non-probability samples
Section 5. Quasi-randomization and super-population implementations

Once a joint model for $\{R_i, y_i\}$ is set up, of course we can use it for estimating both $\pi_i$ and the regression function $m(\mathbf{x})$, each of which is made possible by the availability of the auxiliary probability sample and the assumption of missing at random. But as shown before, correctly specifying and estimating one of them is sufficient for miniaturizing $c_{\tilde{R}, z}$. However, from (4.3), in order for the covariance/correlation to be zero, neither the multiplicative correction to $\pi_I$ via $W_I$ nor the additive adjustment for $\mathrm{E}(y_I \mid \mathbf{x}_I)$ via $m(\mathbf{x}_I)$ needs to be correct. All we need is that, after the correction and the adjustment, what is left of the two is uncorrelated with each other.
The aforementioned framework of Collaborative TMLE was built essentially on this insight (e.g., see Section 3.1 of van der Laan and Gruber, 2009), though the heavy mathematical treatment in its literature might have discouraged readers from seeking such an intuitive understanding.

To provide a simple illustration, consider a finite population that is an i.i.d. sample from a super-population model:

$$\mathrm{E}[y \mid x] = \sum_{k=0}^{3} \beta_k x^k, \qquad x \sim N(0, 1). \tag{5.1}$$

The non-probability sample is generated by a mechanism $R$ such that $\Pr(R = 1 \mid y, x) = \pi(|x|)$, that is, it is determined by the magnitude of $x$ only.
Suppose we mis-specify the functional form for $\pi$ (e.g., the divine model may not be monotone in $|x|$, but the device model, such as the conventional logistic link, is), as well as the regression model by choosing $m(x) = b_0 + b_1 x + b_2 x^2$. Since $x^2$ is uncorrelated with $x$ or $x^3$ under $x \sim N(0, 1)$, we know that our least-squares estimator for $b_2$ would still be valid for $\beta_2$ even under the mis-specified regression model.
This turns out to be sufficient to ensure the asymptotic unbiasedness (as $N \to \infty$) of the following “doubly robust” estimator for $\mu = \bar{y}_N$, the finite-population mean,

$$\hat{\mu}_{+} = \frac{\sum_{i=1}^{N} R_i\, w(|x_i|)\,\bigl(y_i - \hat{m}(x_i)\bigr)}{\sum_{i=1}^{N} R_i\, w(|x_i|)} + \frac{\sum_{i=1}^{N} R_i^{*}\, \hat{m}(x_i)}{\sum_{i=1}^{N} R_i^{*}}, \tag{5.2}$$

where $R^{*}$ indicates the auxiliary sample (of $\mathbf{x}$ only). Or equivalently,

$$\hat{\mu}_{+} - \bar{y}_N = \frac{\mathrm{Cov}_I\bigl(R_I\, w(|x_I|),\; y_I - \hat{m}(x_I)\bigr)}{\mathrm{E}_I\bigl(R_I\, w(|x_I|)\bigr)} + \frac{\mathrm{Cov}_I\bigl(R_I^{*},\; \hat{m}(x_I)\bigr)}{\mathrm{E}_I\bigl(R_I^{*}\bigr)}, \tag{5.3}$$

which makes it clearer that any bias in $\hat{\mu}_{+}$ is controlled by the covariance (or correlation) involving $R$, since the covariance involving $R^{*}$ is already miniaturized by the assumption that the auxiliary sample is probabilistic (which, for simplicity, is assumed to be a simple random sample).
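The step from (5.2) to (5.3) can be sketched as follows. This is only a sketch, and it assumes that $\mathrm{E}_I$ and $\mathrm{Cov}_I$ denote the finite-population average and covariance over an index $I$ uniform on $\{1, \ldots, N\}$; the shorthand $w_i = w(|x_i|)$, $z_i = y_i - \hat{m}(x_i)$ and $\hat{m}_i = \hat{m}(x_i)$ is introduced here for brevity only.

$$
\begin{aligned}
\frac{\sum_{i=1}^{N} R_i w_i z_i}{\sum_{i=1}^{N} R_i w_i}
&= \frac{\mathrm{E}_I(R_I w_I z_I)}{\mathrm{E}_I(R_I w_I)}
= \frac{\mathrm{Cov}_I(R_I w_I,\, z_I)}{\mathrm{E}_I(R_I w_I)} + \mathrm{E}_I(z_I), \\
\frac{\sum_{i=1}^{N} R_i^{*} \hat{m}_i}{\sum_{i=1}^{N} R_i^{*}}
&= \frac{\mathrm{E}_I(R_I^{*} \hat{m}_I)}{\mathrm{E}_I(R_I^{*})}
= \frac{\mathrm{Cov}_I(R_I^{*},\, \hat{m}_I)}{\mathrm{E}_I(R_I^{*})} + \mathrm{E}_I(\hat{m}_I).
\end{aligned}
$$

Adding the two lines and using $\mathrm{E}_I(z_I) + \mathrm{E}_I(\hat{m}_I) = \mathrm{E}_I(y_I) = \bar{y}_N$ leaves exactly the two covariance ratios on the right-hand side of (5.3).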

Here $w(x)$ is any weight function such that $\mathrm{E}_{\phi}\bigl[|x|^{3}\, w(|x|)\bigr] < \infty$, where the expectation is with respect to $x \sim N(0, 1)$, and $\hat{m}(x) = b_0 + b_1 x + \hat{\beta}_2 x^2$, with $\hat{\beta}_2$ being the least-squares estimator for $\beta_2$ from the biased sample, while $b_0$ and $b_1$ can be chosen arbitrarily.
Because the finite-population covariance/correlation between $\pi(|x_I|)\, w(|x_I|)$ and $x_I^{k}$ is $O_p(N^{-1/2})$, for $k = 1$ and $k = 3$, the misfitted parts for $\pi$ or $m$ do not contribute to the ddc (asymptotically) since they are uncorrelated with each other under the super-population model, leading to further robustness going beyond “double robustness”. This of course does not mean that we can misfit a model arbitrarily and still obtain valid estimators, but it does imply that having at least one model being correct is a sufficient, but not necessary, condition for the validity of the doubly robust estimators.
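The illustration built around (5.1) and (5.2) can be tried numerically. The following is a minimal simulation sketch, not the author's own computation: the coefficient values in `beta`, the non-monotone "divine" propensity `pi`, the logistic-type weight `w`, the arbitrary constants `b0` and `b1`, the sample sizes and the function name `dr_demo` are all assumptions made purely for illustration.

```python
import numpy as np


def dr_demo(N=200_000, n_aux=20_000, seed=0):
    """Simulate the cubic super-population (5.1) and evaluate estimator (5.2)."""
    rng = np.random.default_rng(seed)

    # Finite population drawn i.i.d. from (5.1); beta values are hypothetical.
    beta = np.array([1.0, 1.0, -1.0, 0.5])
    x = rng.standard_normal(N)
    y = beta[0] + beta[1] * x + beta[2] * x**2 + beta[3] * x**3
    y = y + rng.standard_normal(N)

    # Assumed true selection mechanism: depends on |x| only, not monotone in |x|.
    pi = 0.1 + 0.6 * np.exp(-(np.abs(x) - 2.0) ** 2)
    R = rng.random(N) < pi

    # Auxiliary probability sample: a simple random sample observing x only.
    R_star = np.zeros(N, dtype=bool)
    R_star[rng.choice(N, size=n_aux, replace=False)] = True

    # Mis-specified weight: inverse of a logistic propensity monotone in |x|,
    # i.e. w = 1 / expit(0.5 - |x|) = 1 + exp(|x| - 0.5).
    w = 1.0 + np.exp(np.abs(x) - 0.5)

    # Mis-specified quadratic regression fitted by least squares on the biased sample.
    X = np.column_stack([np.ones(R.sum()), x[R], x[R] ** 2])
    coef, *_ = np.linalg.lstsq(X, y[R], rcond=None)
    beta2_hat = coef[2]

    # b0 and b1 are chosen arbitrarily; only beta2_hat comes from the fit.
    b0, b1 = 2.0, -1.0
    m_hat = b0 + b1 * x + beta2_hat * x**2

    # Estimator (5.2): weighted residual mean plus auxiliary-sample mean of m_hat.
    mu_dr = np.sum(w[R] * (y[R] - m_hat[R])) / np.sum(w[R]) + m_hat[R_star].mean()

    return {
        "ybar_N": y.mean(),      # finite-population mean (the target)
        "naive": y[R].mean(),    # unadjusted mean of the non-probability sample
        "mu_dr": mu_dr,
        "beta2_hat": beta2_hat,
    }


if __name__ == "__main__":
    for name, value in dr_demo().items():
        print(f"{name:>10s}: {value: .4f}")
```

Under these assumed settings both working models are wrong, yet the error of `mu_dr` should be small relative to that of the unadjusted sample mean, because the only misfit that matters for the ddc is the coefficient on $x^2$, which the biased-sample least-squares fit still recovers.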

It is also worth stressing that, in formulating the regression model, we do not necessarily need to invoke a device probability, e.g., a super-population regression model, because the FPI variable provides a finite-population regression via applying the least-squares method to regress $y_i$ on $\mathbf{x}_i$, $i \in \mathcal{N}$. This regression fitting itself says little about whether the resulting regression line $y = \hat{m}(\mathbf{x})$ is a good fit to $(y_i, \mathbf{x}_i)$ or not.
However, the example above indicates that, for the purpose of estimating the population average of $y$, the lack of fit may not matter that much, as long as the “residual” $z_I = y_I - \hat{m}(\mathbf{x}_I)$ has little correlation with $W_I \pi_I$, as two functions of the FPI variable $I$. Indeed, as discussed in Section 3, we can consider including $\hat{\pi}_I$ in the regression model $\hat{m}(\mathbf{x}_I, \hat{\pi}_I)$. How effective this strategy is in general is a topic of further research.
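The point that lack of fit need not matter can be checked with a small diagnostic: compute the finite-population correlation between the residual $z_I$ and $W_I \pi_I$ alongside a crude measure of fit. The sketch below is hypothetical throughout: the population, the propensity `pi`, the weight `W`, and the deliberately poor working regression `m_hat` (wrong intercept and slope, no cubic term, and the quadratic coefficient simply set to its assumed true value) are illustrative choices, as is the function name.

```python
import numpy as np


def residual_weight_correlation(N=200_000, seed=1):
    """Finite-population correlation between z_I = y_I - m_hat(x_I) and W_I * pi_I."""
    rng = np.random.default_rng(seed)

    # Hypothetical finite population following a cubic model of the form (5.1).
    x = rng.standard_normal(N)
    y = 1.0 + x - x**2 + 0.5 * x**3 + rng.standard_normal(N)

    # Assumed propensity and (mis-specified) weight, both functions of |x| only.
    pi = 0.1 + 0.6 * np.exp(-(np.abs(x) - 2.0) ** 2)
    W = 1.0 + np.exp(np.abs(x) - 0.5)

    # Deliberately poor working regression: arbitrary intercept and slope,
    # no cubic term; only the x^2 coefficient matches the population model.
    m_hat = 2.0 - 1.0 * x - 1.0 * x**2
    z = y - m_hat

    corr = np.corrcoef(W * pi, z)[0, 1]
    return corr, z.var(), y.var()


if __name__ == "__main__":
    corr, var_z, var_y = residual_weight_correlation()
    print(f"corr(W*pi, z) = {corr:.4f}")
    print(f"Var(z) = {var_z:.2f}, Var(y) = {var_y:.2f}")
```

In this assumed setup the residual varies more than $y$ itself, so the working regression is a poor fit by any usual standard, while its correlation with $W_I \pi_I$ stays close to zero, which is the quantity that drives the bias.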

