Analysis

Results

All (65) (0 to 10 of 65 results)

1. The Importance of Disaggregated Data: An Introduction (part 1)
Stats in brief: 89-20-00062024001
Description: This short video explains how it can be very effective for all levels of governments and organizations that serve communities to use disaggregated data to make evidence-informed public policy decisions. By using disaggregated data, policymakers are able to design more appropriate and effective policies that meet the needs of each diverse and unique Canadian.
Release date: 2024-07-16
2. The Importance of Disaggregated Data: An Introduction (part 2)
Stats in brief: 89-20-00062024002
Description: This short video explains how the use of disaggregated data can help policymakers to develop more targeted and effective policies by identifying the unique needs and challenges faced by different demographic groups.
Release date: 2024-07-16
3. Heterogeneous causal effects of labour market programs: A machine learning approach (Archived)
Articles and reports: 11-522-X202200100017
Description: In this paper, we look for presence of heterogeneity in conducting impact evaluations of the Skills Development intervention delivered under the Labour Market Development Agreements. We use linked longitudinal administrative data covering a sample of Skills Development participants from 2010 to 2017. We apply a causal machine-learning estimator as in Lechner (2019) to estimate the individualized program impacts at the finest aggregation level. These granular impacts reveal the distribution of net impacts facilitating further investigation as to what works for whom. The findings suggest statistically significant improvements in labour market outcomes for participants overall and for subgroups of policy interest.
Release date: 2024-06-28
4. Statistics Canada International Symposium Series: Proceedings
Journals and periodicals: 11-522-X
Description: Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.
Release date: 2024-06-28
5. Authors’ response to comments on “Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data”: Some new developments on likelihood approaches to estimation of participation probabilities for non-probability samples
Articles and reports: 12-001-X202400100001
Description: Inspired by the two excellent discussions of our paper, we offer some new insights and developments into the problem of estimating participation probabilities for non-probability samples. First, we propose an improvement of the method of Chen, Li and Wu (2020), based on best linear unbiased estimation theory, that more efficiently leverages the available probability and non-probability sample data. We also develop a sample likelihood approach, similar in spirit to the method of Elliott (2009), that properly accounts for the overlap between both samples when it can be identified in at least one of the samples. We use best linear unbiased prediction theory to handle the scenario where the overlap is unknown. Interestingly, our two proposed approaches coincide in the case of unknown overlap. Then, we show that many existing methods can be obtained as a special case of a general unbiased estimating function. Finally, we conclude with some comments on nonparametric estimation of participation probabilities.
Release date: 2024-06-25
6. Comments by Changbao Wu on “Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data”
Articles and reports: 12-001-X202400100002
Description: We provide comparisons among three parametric methods for the estimation of participation probabilities and some brief comments on homogeneous groups and post-stratification.
Release date: 2024-06-25
7. Comments by Julie Gershunskaya and Vladislav Beresovsky on “Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data”
Articles and reports: 12-001-X202400100003
Description: Beaumont, Bosa, Brennan, Charlebois and Chu (2024) propose innovative model selection approaches for estimation of participation probabilities for non-probability sample units. We focus our discussion on the choice of a likelihood and parameterization of the model, which are key for the effectiveness of the techniques developed in the paper. We consider alternative likelihood and pseudo-likelihood based methods for estimation of participation probabilities and present simulations implementing and comparing the AIC based variable selection. We demonstrate that, under important practical scenarios, the approach based on a likelihood formulated over the observed pooled non-probability and probability samples performed better than the pseudo-likelihood based alternatives. The contrast in sensitivity of the AIC criteria is especially large for small probability sample sizes and low overlap in covariates domains.
Release date: 2024-06-25
8. Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data
Articles and reports: 12-001-X202400100004
Description: Non-probability samples are being increasingly explored in National Statistical Offices as an alternative to probability samples. However, it is well known that the use of a non-probability sample alone may produce estimates with significant bias due to the unknown nature of the underlying selection mechanism. Bias reduction can be achieved by integrating data from the non-probability sample with data from a probability sample provided that both samples contain auxiliary variables in common. We focus on inverse probability weighting methods, which involve modelling the probability of participation in the non-probability sample. First, we consider the logistic model along with pseudo maximum likelihood estimation. We propose a variable selection procedure based on a modified Akaike Information Criterion (AIC) that properly accounts for the data structure and the probability sampling design. We also propose a simple rank-based method of forming homogeneous post-strata. Then, we extend the Classification and Regression Trees (CART) algorithm to this data integration scenario, while again properly accounting for the probability sampling design. A bootstrap variance estimator is proposed that reflects two sources of variability: the probability sampling design and the participation model. Our methods are illustrated using Statistics Canada’s crowdsourcing and survey data.
Release date: 2024-06-25
9. Author’s response to comments on “Exchangeability assumption in propensity-score based adjustment methods for population mean estimation using non-probability samples”
Articles and reports: 12-001-X202400100005
Description: In this rejoinder, I address the comments from the discussants, Dr. Takumi Saegusa, Dr. Jae-Kwang Kim and Ms. Yonghyun Kwon. Dr. Saegusa’s comments about the differences between the conditional exchangeability (CE) assumption for causal inferences versus the CE assumption for finite population inferences using nonprobability samples, and the distinction between design-based versus model-based approaches for finite population inference using nonprobability samples, are elaborated and clarified in the context of my paper. Subsequently, I respond to Dr. Kim and Ms. Kwon’s comprehensive framework for categorizing existing approaches for estimating propensity scores (PS) into conditional and unconditional approaches. I expand their simulation studies to vary the sampling weights, allow for misspecified PS models, and include an additional estimator, i.e., scaled adjusted logistic propensity estimator (Wang, Valliant and Li (2021), denoted by sWBS). In my simulations, it is observed that the sWBS estimator consistently outperforms or is comparable to the other estimators under the misspecified PS model. The sWBS, as well as WBS or ABS described in my paper, do not assume that the overlapped units in both the nonprobability and probability reference samples are negligible, nor do they require the identification of overlap units as needed by the estimators proposed by Dr. Kim and Ms. Kwon.
Release date: 2024-06-25
10. Comments by Takumi Saegusa on “Exchangeability assumption in propensity-score based adjustment methods for population mean estimation using non-probability samples”: Causal inference, non-probability sample, and finite population
Articles and reports: 12-001-X202400100006
Description: In some of the non-probability sample literature, the conditional exchangeability assumption is considered to be necessary for valid statistical inference. This assumption is rooted in causal inference, though its potential outcome framework differs greatly from that of non-probability samples. We describe similarities and differences between the two frameworks and discuss issues to consider when adopting the conditional exchangeability assumption in non-probability sample setups. We also discuss the role of finite population inference in different approaches of propensity scores and outcome regression modeling to non-probability samples.
Release date: 2024-06-25

Stats in brief (3) (3 results)

1. The Importance of Disaggregated Data: An Introduction (part 1)
Stats in brief: 89-20-00062024001
Description: This short video explains how it can be very effective for all levels of governments and organizations that serve communities to use disaggregated data to make evidence-informed public policy decisions. By using disaggregated data, policymakers are able to design more appropriate and effective policies that meet the needs of each diverse and unique Canadian.
Release date: 2024-07-16
2. The Importance of Disaggregated Data: An Introduction (part 2)
Stats in brief: 89-20-00062024002
Description: This short video explains how the use of disaggregated data can help policymakers to develop more targeted and effective policies by identifying the unique needs and challenges faced by different demographic groups.
Release date: 2024-07-16
3. Agenda 2030 Sustainable Development Goals Report
Stats in brief: 11-637-X
Description: This product presents data on the Sustainable Development Goals. It provides an overview of the 17 Goals through infographics, leveraging currently available data to report on Canada’s progress towards the 2030 Agenda for Sustainable Development.
Release date: 2024-01-25

Articles and reports (58) (0 to 10 of 58 results)

1. Heterogeneous causal effects of labour market programs: A machine learning approach (Archived)
Articles and reports: 11-522-X202200100017
Description: In this paper, we look for presence of heterogeneity in conducting impact evaluations of the Skills Development intervention delivered under the Labour Market Development Agreements. We use linked longitudinal administrative data covering a sample of Skills Development participants from 2010 to 2017. We apply a causal machine-learning estimator as in Lechner (2019) to estimate the individualized program impacts at the finest aggregation level. These granular impacts reveal the distribution of net impacts facilitating further investigation as to what works for whom. The findings suggest statistically significant improvements in labour market outcomes for participants overall and for subgroups of policy interest.
Release date: 2024-06-28
2. Authors’ response to comments on “Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data”: Some new developments on likelihood approaches to estimation of participation probabilities for non-probability samples
Articles and reports: 12-001-X202400100001
Description: Inspired by the two excellent discussions of our paper, we offer some new insights and developments into the problem of estimating participation probabilities for non-probability samples. First, we propose an improvement of the method of Chen, Li and Wu (2020), based on best linear unbiased estimation theory, that more efficiently leverages the available probability and non-probability sample data. We also develop a sample likelihood approach, similar in spirit to the method of Elliott (2009), that properly accounts for the overlap between both samples when it can be identified in at least one of the samples. We use best linear unbiased prediction theory to handle the scenario where the overlap is unknown. Interestingly, our two proposed approaches coincide in the case of unknown overlap. Then, we show that many existing methods can be obtained as a special case of a general unbiased estimating function. Finally, we conclude with some comments on nonparametric estimation of participation probabilities.
Release date: 2024-06-25
3. Comments by Changbao Wu on “Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data”
Articles and reports: 12-001-X202400100002
Description: We provide comparisons among three parametric methods for the estimation of participation probabilities and some brief comments on homogeneous groups and post-stratification.
Release date: 2024-06-25
4. Comments by Julie Gershunskaya and Vladislav Beresovsky on “Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data”
Articles and reports: 12-001-X202400100003
Description: Beaumont, Bosa, Brennan, Charlebois and Chu (2024) propose innovative model selection approaches for estimation of participation probabilities for non-probability sample units. We focus our discussion on the choice of a likelihood and parameterization of the model, which are key for the effectiveness of the techniques developed in the paper. We consider alternative likelihood and pseudo-likelihood based methods for estimation of participation probabilities and present simulations implementing and comparing the AIC based variable selection. We demonstrate that, under important practical scenarios, the approach based on a likelihood formulated over the observed pooled non-probability and probability samples performed better than the pseudo-likelihood based alternatives. The contrast in sensitivity of the AIC criteria is especially large for small probability sample sizes and low overlap in covariates domains.
Release date: 2024-06-25
5. Handling non-probability samples through inverse probability weighting with an application to Statistics Canada’s crowdsourcing data
Articles and reports: 12-001-X202400100004
Description: Non-probability samples are being increasingly explored in National Statistical Offices as an alternative to probability samples. However, it is well known that the use of a non-probability sample alone may produce estimates with significant bias due to the unknown nature of the underlying selection mechanism. Bias reduction can be achieved by integrating data from the non-probability sample with data from a probability sample provided that both samples contain auxiliary variables in common. We focus on inverse probability weighting methods, which involve modelling the probability of participation in the non-probability sample. First, we consider the logistic model along with pseudo maximum likelihood estimation. We propose a variable selection procedure based on a modified Akaike Information Criterion (AIC) that properly accounts for the data structure and the probability sampling design. We also propose a simple rank-based method of forming homogeneous post-strata. Then, we extend the Classification and Regression Trees (CART) algorithm to this data integration scenario, while again properly accounting for the probability sampling design. A bootstrap variance estimator is proposed that reflects two sources of variability: the probability sampling design and the participation model. Our methods are illustrated using Statistics Canada’s crowdsourcing and survey data.
Release date: 2024-06-25
6. Author’s response to comments on “Exchangeability assumption in propensity-score based adjustment methods for population mean estimation using non-probability samples”
Articles and reports: 12-001-X202400100005
Description: In this rejoinder, I address the comments from the discussants, Dr. Takumi Saegusa, Dr. Jae-Kwang Kim and Ms. Yonghyun Kwon. Dr. Saegusa’s comments about the differences between the conditional exchangeability (CE) assumption for causal inferences versus the CE assumption for finite population inferences using nonprobability samples, and the distinction between design-based versus model-based approaches for finite population inference using nonprobability samples, are elaborated and clarified in the context of my paper. Subsequently, I respond to Dr. Kim and Ms. Kwon’s comprehensive framework for categorizing existing approaches for estimating propensity scores (PS) into conditional and unconditional approaches. I expand their simulation studies to vary the sampling weights, allow for misspecified PS models, and include an additional estimator, i.e., scaled adjusted logistic propensity estimator (Wang, Valliant and Li (2021), denoted by sWBS). In my simulations, it is observed that the sWBS estimator consistently outperforms or is comparable to the other estimators under the misspecified PS model. The sWBS, as well as WBS or ABS described in my paper, do not assume that the overlapped units in both the nonprobability and probability reference samples are negligible, nor do they require the identification of overlap units as needed by the estimators proposed by Dr. Kim and Ms. Kwon.
Release date: 2024-06-25
7. Comments by Takumi Saegusa on “Exchangeability assumption in propensity-score based adjustment methods for population mean estimation using non-probability samples”: Causal inference, non-probability sample, and finite population
Articles and reports: 12-001-X202400100006
Description: In some of the non-probability sample literature, the conditional exchangeability assumption is considered to be necessary for valid statistical inference. This assumption is rooted in causal inference, though its potential outcome framework differs greatly from that of non-probability samples. We describe similarities and differences between the two frameworks and discuss issues to consider when adopting the conditional exchangeability assumption in non-probability sample setups. We also discuss the role of finite population inference in different approaches of propensity scores and outcome regression modeling to non-probability samples.
Release date: 2024-06-25
8. Comments by Jae Kwang Kim and Yonghyun Kwon on “Exchangeability assumption in propensity-score based adjustment methods for population mean estimation using non-probability samples”
Articles and reports: 12-001-X202400100007
Description: Pseudo weight construction for data integration can be understood in the two-phase sampling framework. Using the two-phase sampling framework, we discuss two approaches to the estimation of propensity scores and develop a new way to construct the propensity score function for data integration using the conditional maximum likelihood method. Results from a limited simulation study are also presented.
Release date: 2024-06-25
9. Exchangeability assumption in propensity-score based adjustment methods for population mean estimation using non-probability samples
Articles and reports: 12-001-X202400100008
Description: Nonprobability samples emerge rapidly to address time-sensitive priority topics in different areas. These data are timely but subject to selection bias. To reduce selection bias, there has been wide literature in survey research investigating the use of propensity-score (PS) adjustment methods to improve the population representativeness of nonprobability samples, using probability-based survey samples as external references. Conditional exchangeability (CE) assumption is one of the key assumptions required by PS-based adjustment methods. In this paper, I first explore the validity of the CE assumption conditional on various balancing score estimates that are used in existing PS-based adjustment methods. An adaptive balancing score is proposed for unbiased estimation of population means. The population mean estimators under the three CE assumptions are evaluated via Monte Carlo simulation studies and illustrated using the NIH SARS-CoV-2 seroprevalence study to estimate the proportion of U.S. adults with COVID-19 antibodies from April 01-August 04, 2020.
Release date: 2024-06-25
10. Authors’ response to comments on “Exploring the assumption that commercial online nonprobability survey respondents are answering in good faith”
Articles and reports: 12-001-X202400100009
Description: Our comments respond to discussion from Sen, Brick, and Elliott. We weigh the potential upside and downside of Sen’s suggestion of using machine learning to identify bogus respondents through interactions and improbable combinations of variables. We join Brick in reflecting on bogus respondents’ impact on the state of commercial nonprobability surveys. Finally, we consider Elliott’s discussion of solutions to the challenge raised in our study.
Release date: 2024-06-25

Journals and periodicals (4) (4 results)

1. Statistics Canada International Symposium Series: Proceedings
Journals and periodicals: 11-522-X
Description: Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.
Release date: 2024-06-28
2. Survey Methodology
Journals and periodicals: 12-001-X
Geography: Canada
Description: The journal publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves.
Release date: 2024-06-25
3. Income Research Paper Series
Journals and periodicals: 75F0002M
Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
Release date: 2024-04-26
4. Analytical Studies: Methods and References
Journals and periodicals: 11-633-X
Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
Release date: 2024-01-22

Date modified: 2024-09-06
