Survey Methodology
Relative performance of methods based on model-assisted survey regression estimation: A simulation study

by Erin R. Lundy and J.N.K. Rao

  • Release date: June 21, 2022

Abstract

Use of auxiliary data to improve the efficiency of estimators of totals and means through model-assisted survey regression estimation has received considerable attention in recent years. Generalized regression (GREG) estimators, based on a working linear regression model, are currently used in establishment surveys at Statistics Canada and several other statistical agencies. GREG estimators use common survey weights for all study variables and calibrate to known population totals of auxiliary variables. Increasingly, many auxiliary variables are available, some of which may be extraneous. This leads to unstable GREG weights when all the available auxiliary variables, including interactions among categorical variables, are used in the working linear regression model. On the other hand, new machine learning methods, such as regression trees and lasso, automatically select significant auxiliary variables and lead to stable nonnegative weights and possible efficiency gains over GREG. In this paper, a simulation study, based on a real business survey sample data set treated as the target population, is conducted to study the relative performance of GREG, regression trees and lasso in terms of efficiency of the estimators and properties of associated regression weights. Both probability sampling and non-probability sampling scenarios are studied.

Key Words: Model-assisted inference; Calibration estimation; Model selection; Generalized regression estimator.
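
To make the calibration property mentioned in the abstract concrete, the following is a minimal sketch of GREG weights under a working linear model. It is not taken from the paper: the function names `greg_weights` and `greg_total`, the toy numbers, and the choice of unit variance weights in the working model are all illustrative assumptions. The weights take the form w_i = d_i g_i, where d_i is the design weight and g_i is an adjustment factor chosen so that the weighted sample totals of the auxiliary variables match their known population totals.

```python
import numpy as np


def greg_weights(X, d, totals):
    """Sketch of GREG (linear calibration) weights.

    X      : (n, p) auxiliary values for the sampled units
             (include a column of ones to calibrate to the population size)
    d      : (n,) design weights, e.g. inverse inclusion probabilities
    totals : (p,) known population totals of the auxiliary variables
    Assumes a working linear model with constant error variance.
    """
    X = np.asarray(X, dtype=float)
    d = np.asarray(d, dtype=float)
    totals = np.asarray(totals, dtype=float)

    ht_totals = X.T @ d                  # design-weighted estimates of the totals
    M = X.T @ (d[:, None] * X)           # sum over the sample of d_i * x_i x_i^T
    lam = np.linalg.solve(M, totals - ht_totals)
    g = 1.0 + X @ lam                    # adjustment factors g_i
    return d * g                         # the same weights serve every study variable


def greg_total(y, w):
    """Estimated population total of a study variable y using weights w."""
    return float(np.dot(w, np.asarray(y, dtype=float)))


# Hypothetical toy data: 4 sampled units from a population of 20.
X = np.array([[1.0, 2.0], [1.0, 4.0], [1.0, 6.0], [1.0, 8.0]])
d = np.full(4, 5.0)                      # equal design weights N / n
totals = np.array([20.0, 110.0])         # assumed known population size and total of x
y = np.array([3.0, 5.0, 8.0, 9.0])

w = greg_weights(X, d, totals)
estimate = greg_total(y, w)
```

Nothing in this construction constrains the adjustment factors to be positive, so with many auxiliary variables some weights can become negative or very large, which is the instability the abstract refers to.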

How to cite

Lundy, E.R., and Rao, J.N.K. (2022). Relative performance of methods based on model-assisted survey regression estimation: A simulation study. Survey Methodology, Statistics Canada, Catalogue No. 12-001-X, Vol. 48, No. 1. Paper available at http://www.statcan.gc.ca/pub/12-001-x/2022001/article/00003-eng.htm.
