Propensity Score Estimation and Optimal Sampling Design when Integrating Probability Samples with Non-probability Data

Articles and reports: 11-522-X202500100032
Description: Although non-probability data sources are not new to official statistics, interest in them has revived under pressure from falling survey response rates and rising data collection costs, and from a desire to exploit the new data sources created by ongoing societal digitalisation. Because such sources exclude certain segments of the target population, inference based solely on a non-probability data source is likely to be biased. This work addresses that bias by integrating non-probability data with reference probability samples. The focus is on methods that use the accompanying reference sample to model the propensity of inclusion in the non-probability dataset; the modelled propensities are then applied in an inverse probability weighting approach to produce population estimates. The reference sample is sometimes taken as given. In this presentation, however, the objective is to find an optimal strategy, that is, the best combination of a data integration-based estimator and a sample design for the reference probability sample. Recent work is discussed that takes advantage of the good unit identification possibilities in business surveys to study an estimator based on propensities and to derive optimal (unequal) selection probabilities for the reference sample.
Issue Number: 2025001
Author(s): Holmberg, Anders; Ang, Lyndon; Clark, Robert G.; Loong, Bronwyn
Main Product: Statistics Canada International Symposium Series: Proceedings
Format: PDF
Release date: September 8, 2025
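
The approach outlined in the description (model the propensity of inclusion in the non-probability dataset using the reference sample, then weight by the inverse of that propensity) can be illustrated with a minimal sketch. It assumes that units can be matched between the two sources, so that each reference-sample unit carries an indicator of whether it also appears in the non-probability dataset, and it fits a design-weighted logistic regression to that indicator. This is one common way to set the problem up and is not necessarily the estimator studied by the authors; the function names `fit_weighted_logistic`, `predict_propensity` and `ipw_total` are hypothetical.

```python
import numpy as np


def fit_weighted_logistic(X, z, w, n_iter=50, tol=1e-10):
    """Design-weighted logistic regression fitted by Newton-Raphson.

    X: (n, p) covariates of the reference-sample units (include a column of 1s).
    z: (n,) 1 if the unit also appears in the non-probability dataset, else 0.
    w: (n,) design weights of the reference sample (1 / selection probability).
    Assumption: units are matched across sources, so z is observed.
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        # Weighted score vector and information matrix of the logistic model.
        grad = X.T @ (w * (z - p))
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def predict_propensity(X, beta):
    """Estimated inclusion propensity for units with covariates X."""
    return 1.0 / (1.0 + np.exp(-np.asarray(X, dtype=float) @ beta))


def ipw_total(y, propensity):
    """Inverse probability weighted estimate of a population total,
    computed over the units of the non-probability dataset."""
    return float(np.sum(np.asarray(y, dtype=float) / propensity))
```

In this sketch, unequal selection probabilities for the reference sample enter only through the weights `w`, which is where a choice of sample design would affect the estimated propensities and hence the final estimate.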

Related information

Subjects and keywords

Subjects