Survey Methodology
Models of linkage error for capture-recapture estimation without clerical reviews

by Abel Dasylva, Arthur Goussanou and Christian-Olivier Nambeu

  • Release date: December 20, 2024

Abstract

The capture-recapture method can be applied to measure the coverage of administrative and big data sources in official statistics. In its basic form, it involves linking two sources while assuming perfect linkage, along with other standard assumptions. In practice, linkage errors arise and are a potential source of bias when the linkage is based on quasi-identifiers. These errors include false positives and false negatives: the former arise when a pair of records from different units is linked, and the latter when a pair of records from the same unit is not linked. So far, existing solutions have either resorted to costly clerical reviews or made the restrictive conditional independence assumption. In this work, these requirements are relaxed by modeling the number of links from a record instead. The same approach may be taken to estimate the linkage accuracy without clerical reviews when linking two sources that each have some undercoverage.

Key Words: Big data; Data integration; Data matching; Dual system estimation; Quality; Record linkage.
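
As a rough illustration of the basic form of capture-recapture mentioned in the abstract, the sketch below computes the classical dual-system (Lincoln-Petersen) estimate and shows how linkage errors shift it. It is not the model proposed in the paper, which works with the number of links from a record. The function name and all counts are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n1: int, n2: int, linked: int) -> float:
    """Classical dual-system estimate of population size: n1 * n2 / linked.

    n1, n2 : number of records in each source
    linked : number of record pairs linked between the two sources
    """
    if linked <= 0:
        raise ValueError("at least one linked pair is required")
    return n1 * n2 / linked


# Hypothetical counts (assumptions for illustration, not from the paper).
n1, n2 = 800, 750
true_matches = 600

# Perfect linkage: every true match is linked and nothing else is.
perfect = lincoln_petersen(n1, n2, true_matches)

# Imperfect linkage: some true matches are missed (false negatives)
# and some pairs from different units are linked (false positives).
false_negatives = 60
false_positives = 20
observed_links = true_matches - false_negatives + false_positives
with_errors = lincoln_petersen(n1, n2, observed_links)

print(f"perfect linkage:   {perfect:.1f}")
print(f"with link errors:  {with_errors:.1f}")
```

In this toy setting the false negatives outnumber the false positives, so the observed number of links is too small and the population estimate is too large. The opposite imbalance would push the estimate down.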

How to cite

Dasylva, A., Goussanou, A. and Nambeu, C.-O. (2024). Models of linkage error for capture-recapture estimation without clerical reviews. Survey Methodology, 50(2), 375-408. Paper available at http://www.statcan.gc.ca/pub/12-001-x/2024002/article/00007-eng.pdf.
