A generalized Fellegi-Holt paradigm for automatic error localization
by Sander Scholtus (Note 1)
- Release date: June 22, 2016
The aim of automatic editing is to use a computer to detect and amend erroneous values in a data set, without human intervention. Most automatic editing methods that are currently used in official statistics are based on the seminal work of Fellegi and Holt (1976). Applications of this methodology in practice have shown systematic differences between data that are edited manually and automatically, because human editors may perform complex edit operations. In this paper, a generalization of the Fellegi-Holt paradigm is proposed that can incorporate a large class of edit operations in a natural way. In addition, an algorithm is outlined that solves the resulting generalized error localization problem. It is hoped that this generalization may be used to increase the suitability of automatic editing in practice, and hence to improve the efficiency of data editing processes. Some first results on synthetic data are promising in this respect.
Key Words: Automatic editing; Edit operations; Maximum likelihood; Numerical data; Linear edits.
Table of contents
- 1. Introduction
- 2. Background and related work
- 3. Edit operations
- 4. A generalized error localization problem
- 5. Implied edits for general edit operations
- 6. An error localization algorithm
- 7. Simulation study
- 8. Conclusion