Some flexible regression techniques for complex surveys

Articles and reports: 11-522-X20020016749
Description:

Survey sampling has been slow to take advantage of flexible regression methods. This technical paper discusses two approaches that could make these methods accessible: adapting the techniques to the complex survey design that was used, or sampling the survey data so that the standard techniques become applicable.

Following the former route, we introduce techniques for scatterplot smoothing and additive models that account for the complex survey structure of the data. The use of penalized least squares in the sampling context is studied as a tool for analysing a general trend in a finite population. We focus on smooth regression with a normal error model. Ties in covariates abound in large-scale surveys, so the scatterplot smoothers are applied to means. The estimation of smooths (for example, smoothing splines) depends on the sampling design only through the sampling weights, which means that standard software can be used for estimation. Inference for these curves is more challenging because of the correlations induced by the sampling design. We propose and illustrate tests that account for the sampling design. Illustrative examples using the Ontario Health Survey are given, including scatterplot smoothing, additive models and model diagnostics. Turning to the latter route, in which the survey data file is sampled appropriately, we discuss some of the hurdles that this approach faces.
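The point that estimation depends on the design only through the sampling weights can be illustrated with a small sketch. The code below is not the authors' method: it assumes NumPy, uses hypothetical function names (`weighted_means_by_x`, `penalized_smooth`), and swaps the smoothing spline for a simpler discrete second-difference penalty, which is only sensible when the distinct covariate values are equally spaced (for example, age in whole years). Tied observations are first collapsed to weighted means, and those means are then smoothed by weighted penalized least squares.

```python
import numpy as np


def weighted_means_by_x(x, y, w):
    """Collapse tied covariate values to weighted means.

    Returns the sorted distinct x values, the weighted mean of y at each
    one, and the total sampling weight at each one.
    """
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    xs, inv = np.unique(x, return_inverse=True)
    total_w = np.bincount(inv, weights=w)
    ybar = np.bincount(inv, weights=w * y) / total_w
    return xs, ybar, total_w


def penalized_smooth(ybar, total_w, lam):
    """Weighted penalized least squares on the means (illustrative only).

    Minimizes  sum_j W_j (ybar_j - f_j)^2 + lam * sum_j (second diff of f)_j^2,
    assuming the distinct covariate values are equally spaced.
    """
    m = len(ybar)
    # (m - 2) x m second-difference matrix
    D = np.diff(np.eye(m), n=2, axis=0)
    A = np.diag(total_w) + lam * (D.T @ D)
    return np.linalg.solve(A, total_w * ybar)


# Hypothetical toy data: covariate with ties, response, sampling weights.
x = np.array([20, 20, 21, 22, 22, 22, 23, 24, 24])
y = np.array([5.1, 4.7, 5.6, 6.2, 5.8, 6.0, 6.9, 7.1, 7.4])
w = np.array([120.0, 80.0, 150.0, 60.0, 90.0, 110.0, 200.0, 75.0, 95.0])

xs, ybar, total_w = weighted_means_by_x(x, y, w)
fit = penalized_smooth(ybar, total_w, lam=50.0)
```

Larger values of `lam` pull the fitted values towards a straight line, while `lam = 0` returns the weighted means unchanged. The sketch covers estimation only; as the abstract notes, inference needs separate treatment because the design induces correlations.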

Issue Number: 2002001
Author(s): Bellhouse, D.R.; Chipman, H.; Stafford, J.E.
Main Product: Statistics Canada International Symposium Series: Proceedings
Format    Release date
CD-ROM    September 13, 2004
PDF       September 13, 2004