Pseudo-likelihood-based Bayesian information criterion for variable selection in survey data

Articles and reports: 12-001-X201300211871
Description:

Regression models are routinely used in the analysis of survey data, where a common goal is to identify influential factors associated with certain behavioral, social, or economic indices within a target population. When data are collected through complex surveys, the properties of classical variable selection approaches developed in i.i.d. non-survey settings need to be re-examined. In this paper, we derive a pseudo-likelihood-based BIC for variable selection in the analysis of survey data and suggest a sample-based penalized likelihood approach for its implementation. The sampling weights are appropriately assigned to correct the biased selection result caused by the distortion between the sample and the target population. Under a joint randomization framework, we establish the consistency of the proposed selection procedure. The finite-sample performance of the approach is assessed through analysis and computer simulations based on data from the hypertension component of the 2009 Survey on Living with Chronic Diseases in Canada.
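
As a rough illustration of the general idea only, the sketch below scores candidate regression models with a BIC-style criterion built from a survey-weighted Gaussian pseudo-log-likelihood and picks the lowest-scoring subset by exhaustive search. This is not the authors' criterion or their penalized likelihood implementation: the Gaussian working model, the rescaling of weights to sum to the sample size, the k log n penalty, and the function names weighted_bic and best_subset are all assumptions made for this example.

```python
from itertools import combinations

import numpy as np


def weighted_bic(design, y, w):
    """BIC-style score from a survey-weighted Gaussian pseudo-log-likelihood.

    Assumption for this sketch: weights are rescaled to sum to n and the
    penalty is k * log(n); the paper's exact criterion may differ.
    """
    n, k = design.shape
    sw = np.sqrt(w)
    # Weighted least squares: maximizes the weighted Gaussian pseudo-likelihood.
    beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    resid = y - design @ beta
    w_norm = w * n / w.sum()
    sigma2 = np.sum(w_norm * resid**2) / n
    # Pseudo-log-likelihood evaluated at the weighted estimates.
    pseudo_loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return -2.0 * pseudo_loglik + k * np.log(n)


def best_subset(X, y, w):
    """Return (score, columns) for the covariate subset with the lowest score."""
    n, p = X.shape
    best = None
    for size in range(p + 1):
        for cols in combinations(range(p), size):
            # Every candidate model includes an intercept column.
            design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
            score = weighted_bic(design, y, w)
            if best is None or score < best[0]:
                best = (score, cols)
    return best
```

Exhaustive search is only practical for a handful of covariates. The abstract's penalized likelihood approach is presumably aimed at larger problems, which this sketch does not attempt to reproduce.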

Issue Number: 2013002
Author(s): Chen, Jiahua; Mantel, Harold; Xu, Chen
Main Product: Survey Methodology
Formats: HTML, PDF
Release date: January 15, 2014