Statistical methods

Key indicators

Changing any selection will automatically update the page content.

Selected geographical area: Canada

Selected geographical area: Newfoundland and Labrador

Selected geographical area: Prince Edward Island

Selected geographical area: Nova Scotia

Selected geographical area: New Brunswick

Selected geographical area: Quebec

Selected geographical area: Ontario

Selected geographical area: Manitoba

Selected geographical area: Saskatchewan

Selected geographical area: Alberta

Selected geographical area: British Columbia

Selected geographical area: Yukon

Selected geographical area: Northwest Territories

Selected geographical area: Nunavut

Sort Help
entries

Results

All (2,299)

All (2,299) (30 to 40 of 2,299 results)

  • Stats in brief: 11-637-X
    Description: This product presents data on the Sustainable Development Goals. They present an overview of the 17 Goals through infographics by leveraging data currently available to report on Canada’s progress towards the 2030 Agenda for Sustainable Development.
    Release date: 2024-01-25

  • Stats in brief: 11-001-X202402237898
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2024-01-22

  • Articles and reports: 11-633-X2024001
    Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
    Release date: 2024-01-22

  • Articles and reports: 13-604-M2024001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2024-01-22

  • Journals and periodicals: 11-633-X
    Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
    Release date: 2024-01-22

  • Articles and reports: 12-001-X202300200001
    Description: When a Medicare healthcare provider is suspected of billing abuse, a population of payments X made to that provider over a fixed timeframe is isolated. A certified medical reviewer, in a time-consuming process, can determine the overpayment Y = X - (amount justified by the evidence) associated with each payment. Typically, there are too many payments in the population to examine each with care, so a probability sample is selected. The sample overpayments are then used to calculate a 90% lower confidence bound for the total population overpayment. This bound is the amount demanded for recovery from the provider. Unfortunately, classical methods for calculating this bound sometimes fail to provide the 90% confidence level, especially when using a stratified sample.

    In this paper, 166 redacted samples from Medicare integrity investigations are displayed and described, along with 156 associated payment populations. The 7,588 examined (Y, X) sample pairs show (1) Medicare audits have high error rates: more than 76% of these payments were considered to have been paid in error; and (2) the patterns in these samples support an “All-or-Nothing” mixture model for (Y, X) previously defined in the literature. Model-based Monte Carlo testing procedures for Medicare sampling plans are discussed, as well as stratification methods based on anticipated model moments. In terms of viability (achieving the 90% confidence level) a new stratification method defined here is competitive with the best of the many existing methods tested and seems less sensitive to choice of operating parameters. In terms of overpayment recovery (equivalent to precision) the new method is also comparable to the best of the many existing methods tested. Unfortunately, no stratification algorithm tested was ever viable for more than about half of the 104 test populations.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200002
    Description: Being able to quantify the accuracy (bias, variance) of published output is crucial in official statistics. Output in official statistics is nearly always divided into subpopulations according to some classification variable, such as mean income by categories of educational level. Such output is also referred to as domain statistics. In the current paper, we limit ourselves to binary classification variables. In practice, misclassifications occur and these contribute to the bias and variance of domain statistics. Existing analytical and numerical methods to estimate this effect have two disadvantages. The first disadvantage is that they require that the misclassification probabilities are known beforehand and the second is that the bias and variance estimates are biased themselves. In the current paper we present a new method, a Gaussian mixture model estimated by an Expectation-Maximisation (EM) algorithm combined with a bootstrap, referred to as the EM bootstrap method. This new method does not require that the misclassification probabilities are known beforehand, although it is more efficient when a small audit sample is used that yields a starting value for the misclassification probabilities in the EM algorithm. We compared the performance of the new method with currently available numerical methods: the bootstrap method and the SIMEX method. Previous research has shown that for non-linear parameters the bootstrap outperforms the analytical expressions. For nearly all conditions tested, the bias and variance estimates that are obtained by the EM bootstrap method are closer to their true values than those obtained by the bootstrap and SIMEX methods. We end this paper by discussing the results and possible future extensions of the method.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200003
    Description: We investigate small area prediction of general parameters based on two models for unit-level counts. We construct predictors of parameters, such as quartiles, that may be nonlinear functions of the model response variable. We first develop a procedure to construct empirical best predictors and mean square error estimators of general parameters under a unit-level gamma-Poisson model. We then use a sampling importance resampling algorithm to develop predictors for a generalized linear mixed model (GLMM) with a Poisson response distribution. We compare the two models through simulation and an analysis of data from the Iowa Seat-Belt Use Survey.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200004
    Description: We present a novel methodology to benchmark county-level estimates of crop area totals to a preset state total subject to inequality constraints and random variances in the Fay-Herriot model. For planted area of the National Agricultural Statistics Service (NASS), an agency of the United States Department of Agriculture (USDA), it is necessary to incorporate the constraint that the estimated totals, derived from survey and other auxiliary data, are no smaller than administrative planted area totals prerecorded by other USDA agencies except NASS. These administrative totals are treated as fixed and known, and this additional coherence requirement adds to the complexity of benchmarking the county-level estimates. A fully Bayesian analysis of the Fay-Herriot model offers an appealing way to incorporate the inequality and benchmarking constraints, and to quantify the resulting uncertainties, but sampling from the posterior densities involves difficult integration, and reasonable approximations must be made. First, we describe a single-shrinkage model, shrinking the means while the variances are assumed known. Second, we extend this model to accommodate double shrinkage, borrowing strength across means and variances. This extended model has two sources of extra variation, but because we are shrinking both means and variances, it is expected that this second model should perform better in terms of goodness of fit (reliability) and possibly precision. The computations are challenging for both models, which are applied to simulated data sets with properties resembling the Illinois corn crop.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200005
    Description: Population undercoverage is one of the main hurdles faced by statistical analysis with non-probability survey samples. We discuss two typical scenarios of undercoverage, namely, stochastic undercoverage and deterministic undercoverage. We argue that existing estimation methods under the positivity assumption on the propensity scores (i.e., the participation probabilities) can be directly applied to handle the scenario of stochastic undercoverage. We explore strategies for mitigating biases in estimating the mean of the target population under deterministic undercoverage. In particular, we examine a split population approach based on a convex hull formulation, and construct estimators with reduced biases. A doubly robust estimator can be constructed if a followup subsample of the reference probability survey with measurements on the study variable becomes feasible. Performances of six competing estimators are investigated through a simulation study and issues which require further investigation are briefly discussed.
    Release date: 2024-01-03
Data (9)

Data (9) ((9 results))

No content available at this time.

Analysis (1,874)

Analysis (1,874) (20 to 30 of 1,874 results)

  • Articles and reports: 11-522-X202200100018
    Description: The Longitudinal Social Data Development Program (LSDDP) is a social data integration approach aimed at providing longitudinal analytical opportunities without imposing additional burden on respondents. The LSDDP uses a multitude of signals from different data sources for the same individual, which helps to better understand their interactions and track changes over time. This article looks at how the ethnicity status of people in Canada can be estimated at the most detailed disaggregated level possible using the results from a variety of business rules applied to linked data and to the LSDDP denominator. It will then show how improvements were obtained using machine learning methods, such as decision trees and random forest techniques.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100019
    Description: The purpose of this article is to compare the linkage results for individuals from French tax sources with those of the 2019 Enquête Annuelle de Recensement (EAR), obtained through different methods. Such a comparison will decide whether the Répertoires Statistiques d'Individus et de Logements (Résil) program should be equipped with a probabilistic matching tool for its administrative source identification and matching engine.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100020
    Description: The reconciliation of 2021 census dwellings with the new Statistical Building Register (SBgR) presented linkage challenges. The Census of Population collected information from various dwelling types. For a large proportion of the population, mailing addresses were at the centre: they were used for reaching out to people and collected as contact info. In parallel, the register environment has been evolving. The agency is transitioning from the Address Register (AR) to the SBgR holding both mailing and location addresses, while also covering non-residential buildings. The reconciliation was conducted using a combination of systems, notably the new Register Matching Engine (RME) for difficult cases. The RME holds an interesting range of sophisticated string comparators. A deterministic linkage approach was used, while incorporating some data knowledge like the entropy. Through metadata, the matching expert could also reduce the amounts of false positives and false negatives.
    Release date: 2024-03-25

  • Journals and periodicals: 11-522-X
    Description: Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.
    Release date: 2024-03-25

  • Articles and reports: 75-005-M2024001
    Description: From 2010 to 2019, the Labour Force Survey (LFS) response rate – or the proportion of selected households who complete an LFS interview – had been on a slow downward trend, due to a range of social and technological changes which have made it more challenging to contact selected households and to persuade Canadians to participate when they are contacted. These factors were exacerbated by the COVID-19 pandemic, which resulted in the suspension of face-to-face interviewing between April 2020 and fall 2022. Statistics Canada is committed to restoring LFS response rates to the greatest extent possible. This technical paper discusses two initiatives that are underway to ensure that the LFS estimates continue to provide an accurate and representative portrait of the Canadian labour market.
    Release date: 2024-02-16

  • Articles and reports: 75F0002M2024002
    Description: This discussion paper describes considerations for applying the Market Basket Measure (MBM) methodology onto a purely administrative data source. The paper will begin by outlining a rationale for estimating MBM poverty statistics using administrative income data sources. It then explains a proposal for creating annual samples along with the caveats of creating these samples, followed by a brief analysis using the proposed samples. The paper concludes with potential future improvements to the samples and provides the opportunity for reader’s feedback.
    Release date: 2024-02-08

  • Stats in brief: 11-637-X
    Description: This product presents data on the Sustainable Development Goals. They present an overview of the 17 Goals through infographics by leveraging data currently available to report on Canada’s progress towards the 2030 Agenda for Sustainable Development.
    Release date: 2024-01-25

  • Stats in brief: 11-001-X202402237898
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2024-01-22

  • Articles and reports: 11-633-X2024001
    Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
    Release date: 2024-01-22

  • Articles and reports: 13-604-M2024001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2024-01-22
Reference (363)

Reference (363) (10 to 20 of 363 results)

  • Surveys and statistical programs – Documentation: 11-633-X2021005
    Description:

    The Analytical Studies and Modelling Branch (ASMB) is the research arm of Statistics Canada mandated to provide high-quality, relevant and timely information on economic, health and social issues that are important to Canadians. The branch strategically makes use of expert knowledge and a broad range of data sources and modelling techniques to address the information needs of a broad range of government, academic and public sector partners and stakeholders through analysis and research, modeling and predictive analytics, and data development. The branch strives to deliver relevant, high-quality, timely, comprehensive, horizontal and integrated research and to enable the use of its research through capacity building and strategic dissemination to meet the user needs of policy makers, academics and the general public.

    This Multi-year Consolidated Plan for Research, Modelling and Data Development outlines the priorities for the branch over the next two years.

    Release date: 2021-08-12

  • Surveys and statistical programs – Documentation: 11-633-X2021002
    Description:

    The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years. The IMDB includes Immigration, Refugees and Citizenship Canada (IRCC) administrative records which contain exhaustive information about immigrants who were admitted to Canada since 1952. It also includes data about non-permanent residents who have been issued temporary resident permits since 1980. This report will discuss the IMDB data sources, concepts and variables, record linkage, data processing, dissemination, data evaluation and quality indicators, comparability with other immigration datasets, and the analyses possible with the IMDB.

    Release date: 2021-02-01

  • Surveys and statistical programs – Documentation: 11-633-X2019005
    Description:

    The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years. The IMDB includes Immigration, Refugees and Citizenship Canada (IRCC) administrative records which contain exhaustive information about immigrants who were admitted to Canada since 1952. It also includes data about non-permanent residents who have been issued temporary resident permits since 1980. This report will discuss the IMDB data sources, concepts and variables, record linkage, data processing, dissemination, data evaluation and quality indicators, comparability with other immigration datasets, and the analyses possible with the IMDB.

    The IMDB was released in stages. The sections 2.2 and 7 of this report were revised to take the updates into account.

    Release date: 2020-07-20

  • Notices and consultations: 98-26-0001
    Description:

    This white paper presents Statistics Canada’s planned approach to the 2021 Census of Population and provides a clear explanation of the processes behind the census program, touching on historical, legal, operational and content aspects. Statistics Canada recognizes that it is important to not only successfully conduct the census, but also to be transparent and informative about the way in which those efforts are accomplished. Painting a Portrait of Canada: The 2021 Census of Population gives readers an exclusive, detailed look at how census data is collected, analyzed and given back to Canadians, in the form of high-quality statistical information, used to make evidence-based decisions in Canadian society.

    Release date: 2020-07-20

  • Surveys and statistical programs – Documentation: 98-20-00012020020
    Description:

    This fact sheet provides detailed insight into the design and methodology of the content test component of the 2019 Census Test. This test evaluated changes to the wording and flow of some questions, as well as the potential addition of new questions, to help determine the content of the 2021 Census of Population.

    Release date: 2020-07-20

  • Surveys and statistical programs – Documentation: 89-26-0003
    Description:

    Statistics Canada Data Strategy (SCDS) provides a course of action for managing and leveraging the agency’s data assets to ensure their optimal use and value while maintaining public trust. As Statistics Canada is the nation’s trusted provider of high-quality data and information to support evidence-based policy and decision making, the SCDS also naturally includes the agency’s plan for providing support and data expertise to other government organizations (federal, provincial and territorial), non-governmental organizations, the private sector, academia, and other national and international communities).

    The SCDS provides a roadmap for how Statistics Canada will continue to govern and manage its valuable data assets as part of its modernization agenda and in alignment with and response to other federal government strategies and initiatives. These federal strategies include the Data Strategy for the Federal Public Service, Canada’s 2018-2020 National Action Plan on Open Government, and the Treasury Board Secretariat Digital Operations Strategic Plan: 2018-2022.

    Release date: 2020-04-30

  • Surveys and statistical programs – Documentation: 34-26-0002
    Description:

    As of reference year 2018, the Annual Capital and Repair Expenditures Survey (CAPEX) has added additional content allowing to produce estimates of capital and repair expenditures on infrastructure assets. In addition to the existing content, the new questionnaire asks for a breakdown of expenditures by function (or purpose) as well as the source of funding of capital expenditures from government grants and subsidies.

    This product will decribe the sources and methods used to produce capital and repair expenditure estimates specific to infrastructure assets by function.

    Release date: 2020-04-01

  • Surveys and statistical programs – Documentation: 75F0002M2020001
    Description:

    This note provides the definition of a first-time homebuyer concept used in the 2018 Canadian Housing Survey (CHS). It also includes the methodology used to identify first-time homebuyers and provides sensitivity analysis under alternative methodologies.

    Release date: 2020-01-15

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04

  • Surveys and statistical programs – Documentation: 98-303-X
    Description:

    The Coverage Technical Report will present the error included in census data that results from either persons being missed (not enumerated) or from persons being enumerated more than once by the 2016 Census. The population coverage error is one of the most important types of errors because it affects not only the accuracy of population counts, but also the accuracy of all the census data results describing the characteristics of the population universe.

    Release date: 2019-11-13

Browse our partners page to find a complete list of our partners and their associated products.

Date modified: