Keyword search

Skip to main content
Skip to footer

Language selection

Français

Search and menus

Search and menus

Search

Results

All (167)

All (167) (0 to 10 of 167 results)

1. Improvements to the Canadian Income Survey Methodology for the 2022 Reference Year
Articles and reports: 75F0002M2024005
Description: The Canadian Income Survey (CIS) has introduced improvements to the methods and data sources used to produce income and poverty estimates with the release of its 2022 reference year estimates. Foremost among these improvements is a significant increase in the sample size for a large subset of the CIS content. The weighting methodology was also improved and the target population of the CIS was changed from persons aged 16 years and over to persons aged 15 years and over. This paper describes the changes made and presents the approximate net result of these changes on the income estimates and data quality of the CIS using 2021 data. The changes described in this paper highlight the ways in which data quality has been improved while having little impact on key CIS estimates and trends.
Release date: 2024-04-26
2. Enhancing data for rural Canada: Small area estimation of remote work opportunities
Articles and reports: 18-001-X2024001
Description: This study applies small area estimation (SAE) and a new geographic concept called Self-contained Labor Area (SLA) to the Canadian Survey on Business Conditions (CSBC) with a focus on remote work opportunities in rural labor markets. Through SAE modelling, we estimate the proportions of businesses, classified by general industrial sector (service providers and goods producers), that would primarily offer remote work opportunities to their workforce.
Release date: 2024-04-22
3. Ensuring that the Labour Force Survey Remains the Cornerstone of Canada’s Labour Market Information Ecosystem
Articles and reports: 75-005-M2024001
Description: From 2010 to 2019, the Labour Force Survey (LFS) response rate – or the proportion of selected households who complete an LFS interview – had been on a slow downward trend, due to a range of social and technological changes which have made it more challenging to contact selected households and to persuade Canadians to participate when they are contacted. These factors were exacerbated by the COVID-19 pandemic, which resulted in the suspension of face-to-face interviewing between April 2020 and fall 2022. Statistics Canada is committed to restoring LFS response rates to the greatest extent possible. This technical paper discusses two initiatives that are underway to ensure that the LFS estimates continue to provide an accurate and representative portrait of the Canadian labour market.
Release date: 2024-02-16
4. Market Basket Measure research paper: Applying the Market Basket Measure methodology to an administrative data source
Articles and reports: 75F0002M2024002
Description: This discussion paper describes considerations for applying the Market Basket Measure (MBM) methodology onto a purely administrative data source. The paper will begin by outlining a rationale for estimating MBM poverty statistics using administrative income data sources. It then explains a proposal for creating annual samples along with the caveats of creating these samples, followed by a brief analysis using the proposed samples. The paper concludes with potential future improvements to the samples and provides the opportunity for reader’s feedback.
Release date: 2024-02-08
5. Longitudinal Immigration Database (IMDB) Technical Report, 2022
Articles and reports: 11-633-X2024001
Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
Release date: 2024-01-22
6. Distributions of Household Economic Accounts, estimates of asset, liability and net worth distributions, 2010 to 2023, technical methodology and quality report
Articles and reports: 13-604-M2024001
Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
Release date: 2024-01-22
7. Model-based stratification of payment populations in Medicare integrity investigations
Articles and reports: 12-001-X202300200001
Description: When a Medicare healthcare provider is suspected of billing abuse, a population of payments X made to that provider over a fixed timeframe is isolated. A certified medical reviewer, in a time-consuming process, can determine the overpayment Y = X - (amount justified by the evidence) associated with each payment. Typically, there are too many payments in the population to examine each with care, so a probability sample is selected. The sample overpayments are then used to calculate a 90% lower confidence bound for the total population overpayment. This bound is the amount demanded for recovery from the provider. Unfortunately, classical methods for calculating this bound sometimes fail to provide the 90% confidence level, especially when using a stratified sample.
In this paper, 166 redacted samples from Medicare integrity investigations are displayed and described, along with 156 associated payment populations. The 7,588 examined (Y, X) sample pairs show (1) Medicare audits have high error rates: more than 76% of these payments were considered to have been paid in error; and (2) the patterns in these samples support an “All-or-Nothing” mixture model for (Y, X) previously defined in the literature. Model-based Monte Carlo testing procedures for Medicare sampling plans are discussed, as well as stratification methods based on anticipated model moments. In terms of viability (achieving the 90% confidence level) a new stratification method defined here is competitive with the best of the many existing methods tested and seems less sensitive to choice of operating parameters. In terms of overpayment recovery (equivalent to precision) the new method is also comparable to the best of the many existing methods tested. Unfortunately, no stratification algorithm tested was ever viable for more than about half of the 104 test populations.
Release date: 2024-01-03
8. A method for estimating the effect of classification errors on statistics for two domains
Articles and reports: 12-001-X202300200002
Description: Being able to quantify the accuracy (bias, variance) of published output is crucial in official statistics. Output in official statistics is nearly always divided into subpopulations according to some classification variable, such as mean income by categories of educational level. Such output is also referred to as domain statistics. In the current paper, we limit ourselves to binary classification variables. In practice, misclassifications occur and these contribute to the bias and variance of domain statistics. Existing analytical and numerical methods to estimate this effect have two disadvantages. The first disadvantage is that they require that the misclassification probabilities are known beforehand and the second is that the bias and variance estimates are biased themselves. In the current paper we present a new method, a Gaussian mixture model estimated by an Expectation-Maximisation (EM) algorithm combined with a bootstrap, referred to as the EM bootstrap method. This new method does not require that the misclassification probabilities are known beforehand, although it is more efficient when a small audit sample is used that yields a starting value for the misclassification probabilities in the EM algorithm. We compared the performance of the new method with currently available numerical methods: the bootstrap method and the SIMEX method. Previous research has shown that for non-linear parameters the bootstrap outperforms the analytical expressions. For nearly all conditions tested, the bias and variance estimates that are obtained by the EM bootstrap method are closer to their true values than those obtained by the bootstrap and SIMEX methods. We end this paper by discussing the results and possible future extensions of the method.
Release date: 2024-01-03
9. Small area prediction of general small area parameters for unit-level count data
Articles and reports: 12-001-X202300200003
Description: We investigate small area prediction of general parameters based on two models for unit-level counts. We construct predictors of parameters, such as quartiles, that may be nonlinear functions of the model response variable. We first develop a procedure to construct empirical best predictors and mean square error estimators of general parameters under a unit-level gamma-Poisson model. We then use a sampling importance resampling algorithm to develop predictors for a generalized linear mixed model (GLMM) with a Poisson response distribution. We compare the two models through simulation and an analysis of data from the Iowa Seat-Belt Use Survey.
Release date: 2024-01-03
10. Bayesian small area models under inequality constraints with benchmarking and double shrinkage
Articles and reports: 12-001-X202300200004
Description: We present a novel methodology to benchmark county-level estimates of crop area totals to a preset state total subject to inequality constraints and random variances in the Fay-Herriot model. For planted area of the National Agricultural Statistics Service (NASS), an agency of the United States Department of Agriculture (USDA), it is necessary to incorporate the constraint that the estimated totals, derived from survey and other auxiliary data, are no smaller than administrative planted area totals prerecorded by other USDA agencies except NASS. These administrative totals are treated as fixed and known, and this additional coherence requirement adds to the complexity of benchmarking the county-level estimates. A fully Bayesian analysis of the Fay-Herriot model offers an appealing way to incorporate the inequality and benchmarking constraints, and to quantify the resulting uncertainties, but sampling from the posterior densities involves difficult integration, and reasonable approximations must be made. First, we describe a single-shrinkage model, shrinking the means while the variances are assumed known. Second, we extend this model to accommodate double shrinkage, borrowing strength across means and variances. This extended model has two sources of extra variation, but because we are shrinking both means and variances, it is expected that this second model should perform better in terms of goodness of fit (reliability) and possibly precision. The computations are challenging for both models, which are applied to simulated data sets with properties resembling the Illinois corn crop.
Release date: 2024-01-03

Data (0)

Data (0) (0 results)

No content available at this time.

Analysis (167)

Analysis (167) (0 to 10 of 167 results)

1. Improvements to the Canadian Income Survey Methodology for the 2022 Reference Year
Articles and reports: 75F0002M2024005
Description: The Canadian Income Survey (CIS) has introduced improvements to the methods and data sources used to produce income and poverty estimates with the release of its 2022 reference year estimates. Foremost among these improvements is a significant increase in the sample size for a large subset of the CIS content. The weighting methodology was also improved and the target population of the CIS was changed from persons aged 16 years and over to persons aged 15 years and over. This paper describes the changes made and presents the approximate net result of these changes on the income estimates and data quality of the CIS using 2021 data. The changes described in this paper highlight the ways in which data quality has been improved while having little impact on key CIS estimates and trends.
Release date: 2024-04-26
2. Enhancing data for rural Canada: Small area estimation of remote work opportunities
Articles and reports: 18-001-X2024001
Description: This study applies small area estimation (SAE) and a new geographic concept called Self-contained Labor Area (SLA) to the Canadian Survey on Business Conditions (CSBC) with a focus on remote work opportunities in rural labor markets. Through SAE modelling, we estimate the proportions of businesses, classified by general industrial sector (service providers and goods producers), that would primarily offer remote work opportunities to their workforce.
Release date: 2024-04-22
3. Ensuring that the Labour Force Survey Remains the Cornerstone of Canada’s Labour Market Information Ecosystem
Articles and reports: 75-005-M2024001
Description: From 2010 to 2019, the Labour Force Survey (LFS) response rate – or the proportion of selected households who complete an LFS interview – had been on a slow downward trend, due to a range of social and technological changes which have made it more challenging to contact selected households and to persuade Canadians to participate when they are contacted. These factors were exacerbated by the COVID-19 pandemic, which resulted in the suspension of face-to-face interviewing between April 2020 and fall 2022. Statistics Canada is committed to restoring LFS response rates to the greatest extent possible. This technical paper discusses two initiatives that are underway to ensure that the LFS estimates continue to provide an accurate and representative portrait of the Canadian labour market.
Release date: 2024-02-16
4. Market Basket Measure research paper: Applying the Market Basket Measure methodology to an administrative data source
Articles and reports: 75F0002M2024002
Description: This discussion paper describes considerations for applying the Market Basket Measure (MBM) methodology onto a purely administrative data source. The paper will begin by outlining a rationale for estimating MBM poverty statistics using administrative income data sources. It then explains a proposal for creating annual samples along with the caveats of creating these samples, followed by a brief analysis using the proposed samples. The paper concludes with potential future improvements to the samples and provides the opportunity for reader’s feedback.
Release date: 2024-02-08
5. Longitudinal Immigration Database (IMDB) Technical Report, 2022
Articles and reports: 11-633-X2024001
Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
Release date: 2024-01-22
6. Distributions of Household Economic Accounts, estimates of asset, liability and net worth distributions, 2010 to 2023, technical methodology and quality report
Articles and reports: 13-604-M2024001
Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
Release date: 2024-01-22
7. Model-based stratification of payment populations in Medicare integrity investigations
Articles and reports: 12-001-X202300200001
Description: When a Medicare healthcare provider is suspected of billing abuse, a population of payments X made to that provider over a fixed timeframe is isolated. A certified medical reviewer, in a time-consuming process, can determine the overpayment Y = X - (amount justified by the evidence) associated with each payment. Typically, there are too many payments in the population to examine each with care, so a probability sample is selected. The sample overpayments are then used to calculate a 90% lower confidence bound for the total population overpayment. This bound is the amount demanded for recovery from the provider. Unfortunately, classical methods for calculating this bound sometimes fail to provide the 90% confidence level, especially when using a stratified sample.
In this paper, 166 redacted samples from Medicare integrity investigations are displayed and described, along with 156 associated payment populations. The 7,588 examined (Y, X) sample pairs show (1) Medicare audits have high error rates: more than 76% of these payments were considered to have been paid in error; and (2) the patterns in these samples support an “All-or-Nothing” mixture model for (Y, X) previously defined in the literature. Model-based Monte Carlo testing procedures for Medicare sampling plans are discussed, as well as stratification methods based on anticipated model moments. In terms of viability (achieving the 90% confidence level) a new stratification method defined here is competitive with the best of the many existing methods tested and seems less sensitive to choice of operating parameters. In terms of overpayment recovery (equivalent to precision) the new method is also comparable to the best of the many existing methods tested. Unfortunately, no stratification algorithm tested was ever viable for more than about half of the 104 test populations.
Release date: 2024-01-03
8. A method for estimating the effect of classification errors on statistics for two domains
Articles and reports: 12-001-X202300200002
Description: Being able to quantify the accuracy (bias, variance) of published output is crucial in official statistics. Output in official statistics is nearly always divided into subpopulations according to some classification variable, such as mean income by categories of educational level. Such output is also referred to as domain statistics. In the current paper, we limit ourselves to binary classification variables. In practice, misclassifications occur and these contribute to the bias and variance of domain statistics. Existing analytical and numerical methods to estimate this effect have two disadvantages. The first disadvantage is that they require that the misclassification probabilities are known beforehand and the second is that the bias and variance estimates are biased themselves. In the current paper we present a new method, a Gaussian mixture model estimated by an Expectation-Maximisation (EM) algorithm combined with a bootstrap, referred to as the EM bootstrap method. This new method does not require that the misclassification probabilities are known beforehand, although it is more efficient when a small audit sample is used that yields a starting value for the misclassification probabilities in the EM algorithm. We compared the performance of the new method with currently available numerical methods: the bootstrap method and the SIMEX method. Previous research has shown that for non-linear parameters the bootstrap outperforms the analytical expressions. For nearly all conditions tested, the bias and variance estimates that are obtained by the EM bootstrap method are closer to their true values than those obtained by the bootstrap and SIMEX methods. We end this paper by discussing the results and possible future extensions of the method.
Release date: 2024-01-03
9. Small area prediction of general small area parameters for unit-level count data
Articles and reports: 12-001-X202300200003
Description: We investigate small area prediction of general parameters based on two models for unit-level counts. We construct predictors of parameters, such as quartiles, that may be nonlinear functions of the model response variable. We first develop a procedure to construct empirical best predictors and mean square error estimators of general parameters under a unit-level gamma-Poisson model. We then use a sampling importance resampling algorithm to develop predictors for a generalized linear mixed model (GLMM) with a Poisson response distribution. We compare the two models through simulation and an analysis of data from the Iowa Seat-Belt Use Survey.
Release date: 2024-01-03
10. Bayesian small area models under inequality constraints with benchmarking and double shrinkage
Articles and reports: 12-001-X202300200004
Description: We present a novel methodology to benchmark county-level estimates of crop area totals to a preset state total subject to inequality constraints and random variances in the Fay-Herriot model. For planted area of the National Agricultural Statistics Service (NASS), an agency of the United States Department of Agriculture (USDA), it is necessary to incorporate the constraint that the estimated totals, derived from survey and other auxiliary data, are no smaller than administrative planted area totals prerecorded by other USDA agencies except NASS. These administrative totals are treated as fixed and known, and this additional coherence requirement adds to the complexity of benchmarking the county-level estimates. A fully Bayesian analysis of the Fay-Herriot model offers an appealing way to incorporate the inequality and benchmarking constraints, and to quantify the resulting uncertainties, but sampling from the posterior densities involves difficult integration, and reasonable approximations must be made. First, we describe a single-shrinkage model, shrinking the means while the variances are assumed known. Second, we extend this model to accommodate double shrinkage, borrowing strength across means and variances. This extended model has two sources of extra variation, but because we are shrinking both means and variances, it is expected that this second model should perform better in terms of goodness of fit (reliability) and possibly precision. The computations are challenging for both models, which are applied to simulated data sets with properties resembling the Illinois corn crop.
Release date: 2024-01-03

Reference (0)

Reference (0) (0 results)

No content available at this time.

Report a problem or mistake on this page

Date modified:: 2024-06-10