A method to find an efficient and robust sampling strategy under model uncertainty
Section 7. Conclusions
The strategy that couples πps sampling with the difference estimator is optimal when the parameters of the superpopulation model are known. Because these parameters are seldom known in practice, Section 3 showed, and Subsection 6.1 illustrated, that this optimality breaks down even under small misspecifications of the model.
In Section 4 we proposed a method for choosing the sampling design, which Section 5 extends to the GREG estimator. The method takes the uncertainty about the model parameters into account by placing a prior distribution on them. Although it could be argued that the prior adds a source of subjectivity, our view is that choosing the design without any assessment of the assumptions is more subjective. Furthermore, inference remains design-based, as the prior is used only for choosing the design.
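The idea of letting a prior guide only the choice of design can be sketched in a few lines. The sketch below is a simplified stand-in, not the risk measure defined in Section 4. It assumes a model variance of the form sigma2 * x_k**(2 * delta2), candidate designs with inclusion probabilities proportional to x_k**gamma, and a discrete prior on delta2. It uses the anticipated variance of the difference estimator under a correctly specified model mean, sum_k (1/pi_k - 1) * sigma_k^2, as the quantity to average over the prior. All function names, the grid of gamma values and the prior weights are hypothetical and chosen for illustration only.

```python
def inclusion_probs(x, n, gamma):
    """Inclusion probabilities proportional to x_k**gamma, summing to n."""
    sizes = [xk ** gamma for xk in x]
    total = sum(sizes)
    pi = [n * s / total for s in sizes]
    if max(pi) > 1:
        raise ValueError("some inclusion probabilities exceed 1")
    return pi


def anticipated_variance(x, pi, delta2, sigma2=1.0):
    """sum_k (1/pi_k - 1) * sigma_k^2, with sigma_k^2 = sigma2 * x_k**(2*delta2).

    Assumed variance form; valid for the difference estimator when the
    model mean is correctly specified.
    """
    return sum((1 / p - 1) * sigma2 * xk ** (2 * delta2)
               for xk, p in zip(x, pi))


def expected_risk(x, n, gamma, prior):
    """Average the anticipated variance over a discrete prior on delta2.

    `prior` maps each candidate delta2 value to its prior weight.
    """
    pi = inclusion_probs(x, n, gamma)
    return sum(w * anticipated_variance(x, pi, d) for d, w in prior.items())


def choose_design(x, n, gammas, prior):
    """Return the gamma with the smallest prior-expected risk, and all risks."""
    risks = [expected_risk(x, n, g, prior) for g in gammas]
    best = min(range(len(gammas)), key=lambda i: risks[i])
    return gammas[best], risks


if __name__ == "__main__":
    # Hypothetical auxiliary variable: 200 values evenly spaced on [1, 10].
    x = [1 + 9 * k / 199 for k in range(200)]
    gammas = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5]
    # Hypothetical prior expressing uncertainty about the variance exponent.
    prior = {0.5: 0.2, 0.75: 0.5, 1.0: 0.3}
    best_gamma, risks = choose_design(x, 20, gammas, prior)
    print("chosen gamma:", best_gamma)
```

When the prior puts all its mass on a single value of delta2, the sketch returns gamma equal to that value whenever it is on the grid, which mirrors the optimality of sampling proportional to the model standard deviation when the model is fully known. Once the design is chosen, the prior plays no further role, so estimation stays design-based.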
The method was illustrated with a real dataset, yielding satisfactory results. It should be noted that although the illustrations used stratified simple random sampling, the method in this article is valid for any sampling design.
Appendix
Proof of (4.2)
Proof. The following expectations are required in the proof,
and
are obtained using (A.1) and (A.2),
Now, using (A.3), (A.4) and (A.5) we get
Using (A.6) and (A.7), we obtain an approximation to the correlation
coefficient,
Solving (A.8) for
we get (4.2), as desired. The proof of (5.6)
is analogous.