Survey Methodology
Relative performance of methods based on model-assisted survey regression estimation: A simulation study
by Erin R. Lundy and J.N.K. Rao
- Release date: June 21, 2022
Abstract
Use of auxiliary data to improve the efficiency of estimators of totals and means through model-assisted survey regression estimation has received considerable attention in recent years. Generalized regression (GREG) estimators, based on a working linear regression model, are currently used in establishment surveys at Statistics Canada and several other statistical agencies. GREG estimators use common survey weights for all study variables and calibrate to known population totals of auxiliary variables. Increasingly, many auxiliary variables are available, some of which may be extraneous. This leads to unstable GREG weights when all the available auxiliary variables, including interactions among categorical variables, are used in the working linear regression model. On the other hand, new machine learning methods, such as regression trees and lasso, automatically select significant auxiliary variables and lead to stable nonnegative weights and possible efficiency gains over GREG. In this paper, a simulation study, based on a real business survey sample data set treated as the target population, is conducted to study the relative performance of GREG, regression trees and lasso in terms of efficiency of the estimators and properties of associated regression weights. Both probability sampling and non-probability sampling scenarios are studied.
Key Words: Model-assisted inference; Calibration estimation; Model selection; Generalized regression estimator.
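To illustrate the calibration property mentioned in the abstract, the following is a minimal sketch of GREG weights under a working linear regression model with constant error variance. It is not the authors' simulation code. The function names `greg_weights` and `greg_total`, the synthetic population, and the simple random sampling design are all assumptions made for this example.

```python
import numpy as np


def greg_weights(X, d, tx):
    """GREG weights for a sample (illustrative helper, not from the paper).

    X  : (n, p) auxiliary variables for the sampled units
    d  : (n,)   design weights (inverse inclusion probabilities)
    tx : (p,)   known population totals of the auxiliary variables
    """
    X = np.asarray(X, dtype=float)
    d = np.asarray(d, dtype=float)
    tx = np.asarray(tx, dtype=float)
    # Horvitz-Thompson estimates of the auxiliary totals
    tx_ht = X.T @ d
    # Weighted cross-product matrix X' D X
    M = X.T @ (d[:, None] * X)
    # Adjustment factors g_i = 1 + x_i' M^{-1} (tx - tx_ht)
    lam = np.linalg.solve(M, tx - tx_ht)
    g = 1.0 + X @ lam
    return d * g


def greg_total(y, w):
    """Estimated total of a study variable y using weights w."""
    return float(np.dot(w, y))


# Synthetic population (assumed for illustration only)
rng = np.random.default_rng(0)
N, n = 1000, 100
x_pop = rng.uniform(0.0, 10.0, size=N)
y_pop = 2.0 + 3.0 * x_pop + rng.normal(0.0, 1.0, size=N)

# Simple random sample without replacement, so d_i = N / n
idx = rng.choice(N, size=n, replace=False)
X_s = np.column_stack([np.ones(n), x_pop[idx]])
d_s = np.full(n, N / n)
tx_pop = np.array([float(N), x_pop.sum()])

w_s = greg_weights(X_s, d_s, tx_pop)
y_total_greg = greg_total(y_pop[idx], w_s)
```

The same weights `w_s` apply to every study variable, and by construction `X_s.T @ w_s` reproduces `tx_pop`. Nothing in this construction prevents individual weights from being negative, which is one of the weight properties the abstract contrasts with regression trees and lasso.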
Table of contents
- Section 1. Introduction
- Section 2. Model-assisted estimation under probability sampling
- Section 3. Simulation study using Financing and Growth of Small and Medium Enterprises Survey data
- Section 4. Results of the simulation study
- Section 5. Estimation under non-probability sampling
- Section 6. Conclusions
- Acknowledgements
- References
How to cite
Lundy, E.R., and Rao, J.N.K. (2022). Relative performance of methods based on model-assisted survey regression estimation: A simulation study. Survey Methodology, Statistics Canada, Catalogue No. 12-001-X, Vol. 48, No. 1. Paper available at http://www.statcan.gc.ca/pub/12-001-x/2022001/article/00003-eng.htm.