Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject to the Government of Canada Web Standards and has not been altered or updated since it was archived. Please contact us to request a format other than those available.
Imputation

Scope and purpose

Imputation is the process used to determine and assign replacement values for missing, invalid or inconsistent data that have failed edits. This is done by changing some of the responses or assigning values when they are missing on the record being edited to ensure that estimates are of high quality and that a plausible, internally consistent record is created. Many of these problems would have been solved earlier through follow-up with the respondent or through review and manual correction of the questionnaire. However, it is generally impossible to resolve all problems at these early stages due to concerns of response burden, cost and timeliness. Since it is usually desirable to produce a complete and consistent microdata file containing imputed data, imputation is used to handle the remaining edit failures.

Although imputation can improve the quality of the final data by correcting for missing, invalid or inconsistent responses, care must be exercised in choosing an appropriate imputation methodology. Some methods of imputation do not preserve the relationships between variables and can actually distort underlying distributions. Therefore, imputation must be taken into account when producing estimates and their associated variance estimates.

Principles

Imputation is best done by those with full access to the microdata and in possession of good auxiliary information. It may be automated, manual or a combination of both. Good imputation attempts to limit the bias caused by not having observed all of the desired values, has an audit trail for evaluation purposes and ensures that imputed records are internally consistent. Good imputation processes are automated, objective, reproducible and efficient. Under the Fellegi-Holt principles (1976), changes are made to the minimum number of fields to ensure that the completed record passes all of the edits.
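The minimum-change idea can be illustrated with a small sketch. The following Python example is not the algorithm of Fellegi and Holt (1976) or of any production system; it is a brute-force search over a toy record, and the edits, field names and value domains are all hypothetical. It looks for the smallest set of fields that could be changed so that the record passes every edit.

```python
from itertools import combinations, product

# Hypothetical edits: each function returns True when the record passes.
EDITS = [
    lambda r: not (r["age"] < 15 and r["marital_status"] == "married"),
    lambda r: not (r["age"] < 15 and r["employed"]),
]

# Hypothetical finite domains of admissible values for each field.
DOMAINS = {
    "age": [10, 20, 40, 70],
    "marital_status": ["single", "married"],
    "employed": [False, True],
}


def passes(record):
    """True when the record satisfies every edit."""
    return all(edit(record) for edit in EDITS)


def minimal_fields_to_change(record):
    """Return a smallest tuple of fields that can be changed so the record
    passes all edits, or None if no combination of values works."""
    fields = list(DOMAINS)
    # Try changing 0 fields, then 1, then 2, ... so the first hit is minimal.
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if passes(candidate):
                    return subset
    return None


record = {"age": 10, "marital_status": "married", "employed": True}
print(minimal_fields_to_change(record))  # ('age',)
```

Here the record fails both edits, and changing only the age resolves both failures, whereas correcting marital status and employment separately would require changing two fields. Exhaustive search of this kind is only practical for very small examples.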
Imputation methods can be classified as either deterministic or stochastic, depending upon whether or not there is some degree of randomness in the imputed data (Kalton and Kasprzyk, 1986; Kovar and Whitridge, 1995).

Deterministic imputation methods include logical imputation, historical imputation, mean imputation, ratio and regression imputation and single donor nearest-neighbour imputation. These methods can be further divided into methods that rely solely on deducing the imputed value from data available for the nonrespondent and other auxiliary data (logical and historical) and those that make use of the observed data for other responding units for the given survey. Use of observed data from responding units can be made directly by transferring data from a chosen donor record or by means of models (ratio and regression).

Stochastic imputation methods include the hot deck, nearest neighbour imputation where a random selection is made from several “closest” nearest neighbours, regression with random residuals, and any other deterministic method with random residuals added.

Guidelines
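The distinction between deterministic and stochastic methods described under Principles can be illustrated with a short sketch. The data, variable names and function names below are hypothetical and chosen only for illustration: `x` is an auxiliary size measure available for every unit and `y` is the survey variable, missing for one unit. The first three functions are deterministic (mean, ratio and single donor nearest-neighbour imputation); the last is a simple random hot deck, which is stochastic.

```python
import random

# Hypothetical data: (auxiliary value x, reported value y); None marks a missing y.
records = [(10, 21.0), (20, 39.0), (30, 62.0), (42, None), (50, 98.0)]
respondents = [(x, y) for x, y in records if y is not None]


def mean_impute(x):
    """Deterministic: the mean of the respondents' y values (x is ignored)."""
    return sum(y for _, y in respondents) / len(respondents)


def ratio_impute(x):
    """Deterministic: y is predicted as R * x, with R = sum(y) / sum(x) over respondents."""
    ratio = sum(y for _, y in respondents) / sum(xr for xr, _ in respondents)
    return ratio * x


def nearest_neighbour_impute(x):
    """Deterministic: copy y from the single respondent whose x is closest."""
    donor = min(respondents, key=lambda d: abs(d[0] - x))
    return donor[1]


def random_hot_deck_impute(x, rng):
    """Stochastic: copy y from a respondent drawn at random (x is ignored)."""
    return rng.choice(respondents)[1]


missing_x = 42
print(mean_impute(missing_x))               # 55.0
print(ratio_impute(missing_x))              # 84.0
print(nearest_neighbour_impute(missing_x))  # 98.0
print(random_hot_deck_impute(missing_x, random.Random(2024)))  # one of the reported y values
```

Repeated calls to the deterministic functions always return the same value for a given unit, while the hot deck result depends on the random draw. Passing a seeded generator keeps the run reproducible.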
References

Bankier, M., Lachance, M. and Poirier, P. (1999). A generic implementation of the New Imputation Methodology. Proceedings of the Survey Research Methods Section, American Statistical Association, 548-553.

Fellegi, I.P. and Holt, D. (1976). A systematic approach to automatic edit and imputation. Journal of the American Statistical Association, 71, 17-35.

Kalton, G. and Kasprzyk, D. (1986). The treatment of missing survey data. Survey Methodology, 12, 1-16.

Kovar, J.G. and Whitridge, P. (1995). Imputation of business survey data. In Business Survey Methods, B.G. Cox et al. (eds.), Wiley, New York, 403-423.

Kovar, J.G., MacMillan, J. and Whitridge, P. (1988). Overview and strategy for the Generalized Edit and Imputation System. Statistics Canada, Methodology Branch Working Paper No. BSMD 88-007 E/F.

Lee, H., Rancourt, E. and Särndal, C.-E. (2002). Variance estimation from survey data under single imputation. In Survey Nonresponse, R.M. Groves et al. (eds.), Wiley, New York, 315-328.

Statistics Canada (2000a). Functional description of the Generalized Edit and Imputation System. Statistics Canada technical report.

Statistics Canada (2000d). Policy on Informing Users of Data Quality and Methodology. Policy Manual, 2.3.