Disclosure risk and variance estimation

Articles and reports: 11-522-X200600110434

Description:

Protecting respondents from disclosure of their identity in publicly released survey data is of practical concern to many government agencies. Methods for doing so include suppressing cluster and stratum identifiers and altering or swapping record values between respondents. Unfortunately, stratum and cluster identifiers are usually needed both for variance estimation by linearization and for replication methods, since resampling is typically done on first-stage sampling units within strata. One might expect that releasing a set of replicate weights with the stratum and cluster identifiers suppressed would circumvent this problem to some extent, especially with a random resampling method such as the bootstrap. In this article, we first demonstrate that, by viewing the replicate weights as observations in a high-dimensional space, one can easily use clustering algorithms to reconstruct the cluster identifiers irrespective of the resampling method, even if the resampling weights are randomly altered. We then propose a fast algorithm for swapping the cluster and stratum identifiers of ultimate units before creating replicate weights, without significantly affecting the resulting variance estimates of characteristics of interest. The methods are illustrated by application to publicly released data from the National Health and Nutrition Examination Surveys, where such disclosure issues are extremely important.
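
The reconstruction idea in the description can be illustrated with a small simulation. This is a minimal sketch under stated assumptions and not the authors' algorithm: the design (5 strata, 2 first-stage units per stratum, 4 respondents per unit), the Rao-Wu style rescaling bootstrap, the 5% multiplicative noise, the release of the full-sample weight alongside the replicate weights, and the simple threshold-based "leader" clustering in `leader_cluster` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: all sizes below are illustrative assumptions.
n_strata, psus_per_stratum, units_per_psu, n_reps = 5, 2, 4, 50
n_psus = n_strata * psus_per_stratum

psu_ids = np.repeat(np.arange(n_psus), units_per_psu)      # true (suppressed) cluster ids
stratum_of_psu = np.repeat(np.arange(n_strata), psus_per_stratum)
base_w = rng.uniform(50, 150, size=psu_ids.size)            # full-sample weights

# Rescaling bootstrap (assumed): in each stratum draw n_h - 1 first-stage
# units with replacement; the adjustment factor is n_h / (n_h - 1) * count.
factors = np.empty((n_psus, n_reps))
for h in range(n_strata):
    members = np.flatnonzero(stratum_of_psu == h)
    n_h = members.size
    draws = rng.integers(0, n_h, size=(n_reps, n_h - 1))
    counts = np.stack([np.bincount(d, minlength=n_h) for d in draws], axis=1)
    factors[members] = n_h / (n_h - 1) * counts

# Released replicate weights, randomly altered by up to 5% (assumed noise).
noise = rng.uniform(0.95, 1.05, size=(psu_ids.size, n_reps))
rep_w = base_w[:, None] * factors[psu_ids] * noise

def leader_cluster(X, tol):
    """Group rows of X whose max coordinate-wise distance to a leader row is <= tol."""
    labels = np.full(len(X), -1)
    next_label = 0
    for i in range(len(X)):
        if labels[i] >= 0:
            continue  # already assigned to an earlier leader
        dist = np.abs(X - X[i]).max(axis=1)
        labels[(dist <= tol) & (labels < 0)] = next_label
        next_label += 1
    return labels

# Intruder's view: each respondent is a point in n_reps-dimensional space.
ratios = rep_w / base_w[:, None]
recovered = leader_cluster(ratios, tol=0.5)

same_true = psu_ids[:, None] == psu_ids[None, :]
same_rec = recovered[:, None] == recovered[None, :]
print("clusters found:", recovered.max() + 1)
print("partition recovered exactly:", np.array_equal(same_true, same_rec))
```

Respondents in the same first-stage unit share the same pattern of adjustment factors across replicates, so their ratio vectors stay close even after the noise is applied, while respondents in different units differ sharply in at least some replicates. That separation is what makes the suppressed identifiers recoverable in this toy setting.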

Issue Number: 2006001

Author(s): Lu, Wilson; Sitter, Randy R.

Main Product: Statistics Canada International Symposium Series: Proceedings

Format	Release date	More information
CD-ROM	March 17, 2008
PDF	March 17, 2008

Related information

Subjects and keywords

Subjects

Statistical methods
- Disclosure control and data dissemination
- Weighting and estimation

Date modified: 2024-09-20
