Results

All (18) (0 to 10 of 18 results)

  • Articles and reports: 12-001-X202400100002
    Description: We provide comparisons among three parametric methods for the estimation of participation probabilities and some brief comments on homogeneous groups and post-stratification.
    Release date: 2024-06-25

  • Articles and reports: 12-001-X202300200005
    Description: Population undercoverage is one of the main hurdles faced by statistical analysis with non-probability survey samples. We discuss two typical scenarios of undercoverage, namely, stochastic undercoverage and deterministic undercoverage. We argue that existing estimation methods under the positivity assumption on the propensity scores (i.e., the participation probabilities) can be directly applied to handle the scenario of stochastic undercoverage. We explore strategies for mitigating biases in estimating the mean of the target population under deterministic undercoverage. In particular, we examine a split population approach based on a convex hull formulation, and construct estimators with reduced biases. A doubly robust estimator can be constructed if a follow-up subsample of the reference probability survey with measurements on the study variable becomes feasible. The performance of six competing estimators is investigated through a simulation study, and issues which require further investigation are briefly discussed.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200007
    Description: Conformal prediction is an assumption-lean approach to generating distribution-free prediction intervals or sets, for nearly arbitrary predictive models, with guaranteed finite-sample coverage. Conformal methods are an active research topic in statistics and machine learning, but only recently have they been extended to non-exchangeable data. In this paper, we invite survey methodologists to begin using and contributing to conformal methods. We introduce how conformal prediction can be applied to data from several common complex sample survey designs, under a framework of design-based inference for a finite population, and we point out gaps where survey methodologists could fruitfully apply their expertise. Our simulations empirically bear out the theoretical guarantees of finite-sample coverage, and our real-data example demonstrates how conformal prediction can be applied to complex sample survey data in practice.
    Release date: 2024-01-03

  • Stats in brief: 89-20-00062023001
    Description: This course is intended for Government of Canada employees who would like to learn about evaluating the quality of data for a particular use. Whether you are a new employee interested in learning the basics, or an experienced subject matter expert looking to refresh your skills, this course is here to help.
    Release date: 2023-07-17

  • Articles and reports: 12-001-X202300100002
    Description: We consider regression analysis in the context of data integration. To combine partial information from external sources, we employ the idea of model calibration which introduces a “working” reduced model based on the observed covariates. The working reduced model is not necessarily correctly specified but can be a useful device to incorporate the partial information from the external data. The actual implementation is based on a novel application of the information projection and model calibration weighting. The proposed method is particularly attractive for combining information from several sources with different missing patterns. The proposed method is applied to a real data example combining survey data from Korean National Health and Nutrition Examination Survey and big data from National Health Insurance Sharing Service in Korea.
    Release date: 2023-06-30

  • Articles and reports: 12-001-X202300100005
    Description: Weight smoothing is a useful technique in improving the efficiency of design-based estimators at the risk of bias due to model misspecification. As an extension of the work of Kim and Skinner (2013), we propose using weight smoothing to construct the conditional likelihood for efficient analytic inference under informative sampling. The Beta prime distribution can be used to build a parameter model for weights in the sample. A score test is developed to test for model misspecification in the weight model. A pretest estimator using the score test can be developed naturally. The pretest estimator is nearly unbiased and can be more efficient than the design-based estimator when the weight model is correctly specified, or the original weights are highly variable. A limited simulation study is presented to investigate the performance of the proposed methods.
    Release date: 2023-06-30

  • Articles and reports: 12-001-X202300100010
    Description: Precise and unbiased estimates of response propensities (RPs) play a decisive role in the monitoring, analysis, and adaptation of data collection. In a fixed survey climate, those parameters are stable and their estimates ultimately converge when sufficient historic data is collected. In survey practice, however, response rates gradually vary in time. Understanding time-dependent variation in predicting response rates is key when adapting survey design. This paper illuminates time-dependent variation in response rates through multi-level time-series models. Reliable predictions can be generated by learning from historic time series and updating with new data in a Bayesian framework. As an illustrative case study, we focus on Web response rates in the Dutch Health Survey from 2014 to 2019.
    Release date: 2023-06-30

  • Articles and reports: 36-28-0001202300100003
    Description: Quality of life and well-being research often involves survey content that is subjective in nature, for example questions pertaining to life satisfaction. Two phenomena impacting responses to self-reported life satisfaction are studied across a range of social surveys: the framing effect, where a respondent’s answer is influenced by the theme of the survey or its content; and the mode effect, where a respondent’s answer is influenced by the method in which survey data is collected (with an interviewer, through an online collection portal, etc.). The objective of this paper is to document the effect that survey collection and survey content have on Canadians’ self-reported satisfaction with their lives. The impact of these effects on life satisfaction responses is measured across three Statistics Canada survey series: the General Social Survey, the Canadian Community Health Survey, and the Canadian Social Survey.
    Release date: 2023-01-25

  • Articles and reports: 12-001-X202200200002
    Description: We provide a critical review and some extended discussions on theoretical and practical issues with analysis of non-probability survey samples. We attempt to present rigorous inferential frameworks and valid statistical procedures under commonly used assumptions, and address issues on the justification and verification of assumptions in practical applications. Some current methodological developments are showcased, and problems which require further investigation are mentioned. While the focus of the paper is on non-probability samples, the essential role of probability survey samples with rich and relevant information on auxiliary variables is highlighted.
    Release date: 2022-12-15

  • Articles and reports: 12-001-X202200200007
    Description: Statistical inference with non-probability survey samples is a notoriously challenging problem in statistics. We introduce two new methods of nonparametric propensity score technique for weighting in the non-probability samples. One is the information projection approach and the other is the uniform calibration in the reproducing kernel Hilbert space.
    Release date: 2022-12-15

  • Articles and reports: 12-001-X202200200008
    Description: This response contains additional remarks on a few selected issues raised by the discussants.
    Release date: 2022-12-15