A generalized Fellegi-Holt paradigm for automatic error localization
by Sander Scholtus
- Release date: June 22, 2016
The aim of automatic editing is to use a computer to detect and amend erroneous values in a data set, without human intervention. Most automatic editing methods that are currently used in official statistics are based on the seminal work of Fellegi and Holt (1976). Applications of this methodology in practice have shown systematic differences between data that are edited manually and automatically, because human editors may perform complex edit operations. In this paper, a generalization of the Fellegi-Holt paradigm is proposed that can incorporate a large class of edit operations in a natural way. In addition, an algorithm is outlined that solves the resulting generalized error localization problem. It is hoped that this generalization may be used to increase the suitability of automatic editing in practice, and hence to improve the efficiency of data editing processes. Some first results on synthetic data are promising in this respect.
Key Words: Automatic editing; Edit operations; Maximum likelihood; Numerical data; Linear edits.
Table of contents
- 1. Introduction
- 2. Background and related work
- 3. Edit operations
- 4. A generalized error localization problem
- 5. Implied edits for general edit operations
- 6. An error localization algorithm
- 7. Simulation study
- 8. Conclusion