Inference for partially synthetic, public use microdata sets

Articles and reports: 12-001-X20030026785

Description:

To avoid disclosures, one approach is to release partially synthetic, public use microdata sets. These comprise the units originally surveyed, but some collected values, for example sensitive values at high risk of disclosure or values of key identifiers, are replaced with multiple imputations. Although partially synthetic approaches are currently used to protect public use data, valid methods of inference have not been developed for them. This article presents such methods. They are based on the concepts of multiple imputation for missing data but use different rules for combining point and variance estimates. The combining rules also differ from those for fully synthetic data sets developed by Raghunathan, Reiter and Rubin (2003). The validity of these new rules is illustrated in simulation studies.
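
As a rough illustration of what combining rules of this kind look like, the sketch below assumes the commonly cited form of the partially synthetic rule, in which the variance of the combined point estimate is estimated by T_p = u_bar + b_m / m. Here b_m is the between-data-set variance of the m point estimates and u_bar is the average of the m within-data-set variance estimates. For contrast it also computes the standard missing-data rule, T = u_bar + (1 + 1/m) * b_m. The abstract does not state these formulas, so the function names and example inputs are hypothetical, and the article remains the authoritative source for the definitions and for the reference distribution used for interval estimates.

```python
from statistics import mean, variance


def _summaries(point_estimates, variance_estimates):
    """Return (m, q_bar, b_m, u_bar) for m synthetic or imputed data sets."""
    m = len(point_estimates)
    if m < 2 or len(variance_estimates) != m:
        raise ValueError("need matching lists from at least two data sets")
    q_bar = mean(point_estimates)     # combined point estimate
    b_m = variance(point_estimates)   # between-data-set variance, divisor m - 1
    u_bar = mean(variance_estimates)  # average within-data-set variance
    return m, q_bar, b_m, u_bar


def combine_partially_synthetic(point_estimates, variance_estimates):
    """Assumed partially synthetic rule: T_p = u_bar + b_m / m."""
    m, q_bar, b_m, u_bar = _summaries(point_estimates, variance_estimates)
    return q_bar, u_bar + b_m / m


def combine_missing_data(point_estimates, variance_estimates):
    """Standard missing-data rule: T = u_bar + (1 + 1/m) * b_m."""
    m, q_bar, b_m, u_bar = _summaries(point_estimates, variance_estimates)
    return q_bar, u_bar + (1 + 1 / m) * b_m


# Made-up estimates from m = 3 hypothetical synthetic data sets.
q = [10.0, 12.0, 11.0]
u = [4.0, 5.0, 3.0]
q_bar, t_p = combine_partially_synthetic(q, u)
_, t_missing = combine_missing_data(q, u)
```

Both rules return the same point estimate, the average of the m estimates. They differ only in how much weight the between-data-set variance receives, which is why applying the missing-data rule to partially synthetic data would give a different variance estimate.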

Issue Number: 2003002

Author(s): Reiter, J.P.

Main Product: Survey Methodology

Format	Release date	More information
PDF	January 27, 2004

Related information

Subjects and keywords

Subjects

Statistical methods
- Editing and imputation
- Inference and foundations

Date modified: 2024-10-06
