A case study of using Splink: Census duplicate matching
Articles and reports: 11-522-X202200100002
Description: The authors used Splink, the probabilistic linkage package developed by the UK Ministry of Justice, to link census data from England and Wales to itself in order to find duplicate census responses. A large gold standard of confirmed census duplicates was available, meaning that the results of the Splink implementation could be quality assured. This paper describes the implementation and features of Splink, details the settings and parameters the authors used to tune Splink for this project, and presents the results obtained.
Issue Number: 2022001
Author(s): Cleaton, Mary; Hall, Johanna; Shipsey, Rachel; White, Zoe; Xhaferaj, Kristina
Main Product: Statistics Canada International Symposium Series: Proceedings