Analysis

Results

All (11) (0 to 10 of 11 results)

1. Comments by Changbao Wu on “Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data”
Articles and reports: 12-001-X202400100002
Description: We provide comparisons among three parametric methods for the estimation of participation probabilities and some brief comments on homogeneous groups and post-stratification.
Release date: 2024-06-25
2. Dealing with undercoverage for non-probability survey samples
Articles and reports: 12-001-X202300200005
Description: Population undercoverage is one of the main hurdles faced by statistical analysis with non-probability survey samples. We discuss two typical scenarios of undercoverage, namely, stochastic undercoverage and deterministic undercoverage. We argue that existing estimation methods under the positivity assumption on the propensity scores (i.e., the participation probabilities) can be directly applied to handle the scenario of stochastic undercoverage. We explore strategies for mitigating biases in estimating the mean of the target population under deterministic undercoverage. In particular, we examine a split population approach based on a convex hull formulation, and construct estimators with reduced biases. A doubly robust estimator can be constructed if a followup subsample of the reference probability survey with measurements on the study variable becomes feasible. Performances of six competing estimators are investigated through a simulation study and issues which require further investigation are briefly discussed.
Release date: 2024-01-03
3. Statistical inference with non-probability survey samples
Articles and reports: 12-001-X202200200002
Description:
We provide a critical review and some extended discussions on theoretical and practical issues with analysis of non-probability survey samples. We attempt to present rigorous inferential frameworks and valid statistical procedures under commonly used assumptions, and address issues on the justification and verification of assumptions in practical applications. Some current methodological developments are showcased, and problems which require further investigation are mentioned. While the focus of the paper is on non-probability samples, the essential role of probability survey samples with rich and relevant information on auxiliary variables is highlighted.
Release date: 2022-12-15
4. Author’s response to comments on “Statistical inference with non-probability survey samples”
Articles and reports: 12-001-X202200200008
Description:
This response contains additional remarks on a few selected issues raised by the discussants.

Release date: 2022-12-15
5. Sparse and efficient replication variance estimation for complex surveys (Archived)
Articles and reports: 12-001-X201300111826
Description:
It is routine practice for survey organizations to provide replication weights as part of survey data files. These replication weights are meant to produce valid and efficient variance estimates for a variety of estimators in a simple and systematic manner. Most existing methods for constructing replication weights, however, are only valid for specific sampling designs and typically require a very large number of replicates. In this paper we first show how to produce replication weights based on the method outlined in Fay (1984) such that the resulting replication variance estimator is algebraically equivalent to the fully efficient linearization variance estimator for any given sampling design. We then propose a novel weight-calibration method to simultaneously achieve efficiency and sparsity in the sense that a small number of sets of replication weights can produce valid and efficient replication variance estimators for key population parameters. Our proposed method can be used in conjunction with existing resampling techniques for large-scale complex surveys. Validity of the proposed methods and extensions to some balanced sampling designs are also discussed. Simulation results showed that our proposed variance estimators perform very well in tracking coverage probabilities of confidence intervals. Our proposed strategies will likely have impact on how public-use survey data files are produced and how these data sets are analyzed.
Release date: 2013-06-28
6. Pseudo empirical likelihood inference for multiple surveys and multiple-frame surveys (Archived)
Articles and reports: 11-536-X200900110806
Description:
Recent work using a pseudo empirical likelihood (EL) method for finite population inferences with complex survey data focused primarily on a single survey sample, non-stratified or stratified, with considerable effort devoted to computational procedures. In this talk we present a pseudo empirical likelihood approach to inference from multiple surveys and multiple-frame surveys, two commonly encountered problems in survey practice. We show that inferences about the common parameter of interest and the effective use of various types of auxiliary information can be conveniently carried out through the constrained maximization of joint pseudo EL function. We obtain asymptotic results which are used for constructing the pseudo EL ratio confidence intervals, either using a chi-square approximation or a bootstrap calibration. All related computational problems can be handled using existing algorithms on stratified sampling after suitable re-formulation.
Release date: 2009-08-11
7. Simulation-based randomized systematic PPS sampling under substitution of units (Archived)
Articles and reports: 11-522-X200600110424
Description:
The International Tobacco Control (ITC) Policy Evaluation China Survey uses a multi-stage unequal probability sampling design with upper level clusters selected by the randomized systematic PPS sampling method. A difficulty arises in the execution of the survey: several selected upper level clusters refuse to participate in the survey and have to be replaced by substitute units, selected from units not included in the initial sample and once again using the randomized systematic PPS sampling method. Under such a scenario the first order inclusion probabilities of the final selected units are very difficult to calculate and the second order inclusion probabilities become virtually intractable. In this paper we develop a simulation-based approach for computing the first and the second order inclusion probabilities when direct calculation is prohibitive or impossible. The efficiency and feasibility of the proposed approach are demonstrated through both theoretical considerations and numerical examples. Several R/S-PLUS functions and codes for the proposed procedure are included. The approach can be extended to handle more complex refusal/substitution scenarios one may encounter in practice.
Release date: 2008-06-26
8. Simulation-based randomized systematic PPS sampling under substitution of units (Archived)
Articles and reports: 12-001-X200800110613
Description:
The International Tobacco Control (ITC) Policy Evaluation Survey of China uses a multi-stage unequal probability sampling design with upper level clusters selected by the randomized systematic PPS sampling method. A difficulty arises in the execution of the survey: several selected upper level clusters refuse to participate in the survey and have to be replaced by substitute units, selected from units not included in the initial sample and once again using the randomized systematic PPS sampling method. Under such a scenario the first order inclusion probabilities of the final selected units are very difficult to calculate and the second order inclusion probabilities become virtually intractable. In this paper we develop a simulation-based approach for computing the first and the second order inclusion probabilities when direct calculation is prohibitive or impossible. The efficiency and feasibility of the proposed approach are demonstrated through both theoretical considerations and numerical examples. Several R/S-PLUS functions and codes for the proposed procedure are included. The approach can be extended to handle more complex refusal/substitution scenarios one may encounter in practice.
Release date: 2008-06-26
9. Algorithms and R codes for the pseudo empirical likelihood method in survey sampling (Archived)
Articles and reports: 12-001-X20050029051
Description:
We present computational algorithms for the recently proposed pseudo empirical likelihood method for the analysis of complex survey data. Several key algorithms for computing the maximum pseudo empirical likelihood estimators and for constructing the pseudo empirical likelihood ratio confidence intervals are implemented using the popular statistical software R and S-PLUS. Major codes are written in the form of R/S-PLUS functions and therefore can directly be used for survey applications and/or simulation studies.
Release date: 2006-02-17
10. Combining information from multiple surveys through the empirical likelihood method (Archived)
Articles and reports: 11-522-X20030017711
Description:
This article uses the recently developed pseudo-empirical likelihood method to construct estimators that not only meet the consistency and efficiency requirements but have more attractive features.
Release date: 2005-01-26

Date modified: 2025-05-29