Statistical methods

Results

All (2,299) (30 to 40 of 2,299 results)

  • Stats in brief: 11-637-X
    Description: This product presents data on the Sustainable Development Goals. It gives an overview of the 17 Goals through infographics, using currently available data to report on Canada’s progress towards the 2030 Agenda for Sustainable Development.
    Release date: 2024-01-25

  • Journals and periodicals: 11-633-X
    Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
    Release date: 2024-01-22

  • Articles and reports: 11-633-X2024001
    Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
    Release date: 2024-01-22

  • Articles and reports: 13-604-M2024001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2024-01-22

  • Stats in brief: 11-001-X202402237898
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2024-01-22

  • Articles and reports: 12-001-X202300200001
    Description: When a Medicare healthcare provider is suspected of billing abuse, a population of payments X made to that provider over a fixed timeframe is isolated. A certified medical reviewer, in a time-consuming process, can determine the overpayment Y = X - (amount justified by the evidence) associated with each payment. Typically, there are too many payments in the population to examine each with care, so a probability sample is selected. The sample overpayments are then used to calculate a 90% lower confidence bound for the total population overpayment. This bound is the amount demanded for recovery from the provider. Unfortunately, classical methods for calculating this bound sometimes fail to provide the 90% confidence level, especially when using a stratified sample.

    In this paper, 166 redacted samples from Medicare integrity investigations are displayed and described, along with 156 associated payment populations. The 7,588 examined (Y, X) sample pairs show (1) Medicare audits have high error rates: more than 76% of these payments were considered to have been paid in error; and (2) the patterns in these samples support an “All-or-Nothing” mixture model for (Y, X) previously defined in the literature. Model-based Monte Carlo testing procedures for Medicare sampling plans are discussed, as well as stratification methods based on anticipated model moments. In terms of viability (achieving the 90% confidence level) a new stratification method defined here is competitive with the best of the many existing methods tested and seems less sensitive to choice of operating parameters. In terms of overpayment recovery (equivalent to precision) the new method is also comparable to the best of the many existing methods tested. Unfortunately, no stratification algorithm tested was ever viable for more than about half of the 104 test populations.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200002
    Description: Being able to quantify the accuracy (bias, variance) of published output is crucial in official statistics. Output in official statistics is nearly always divided into subpopulations according to some classification variable, such as mean income by categories of educational level. Such output is also referred to as domain statistics. In the current paper, we limit ourselves to binary classification variables. In practice, misclassifications occur and these contribute to the bias and variance of domain statistics. Existing analytical and numerical methods to estimate this effect have two disadvantages. The first disadvantage is that they require that the misclassification probabilities are known beforehand and the second is that the bias and variance estimates are biased themselves. In the current paper we present a new method, a Gaussian mixture model estimated by an Expectation-Maximisation (EM) algorithm combined with a bootstrap, referred to as the EM bootstrap method. This new method does not require that the misclassification probabilities are known beforehand, although it is more efficient when a small audit sample is used that yields a starting value for the misclassification probabilities in the EM algorithm. We compared the performance of the new method with currently available numerical methods: the bootstrap method and the SIMEX method. Previous research has shown that for non-linear parameters the bootstrap outperforms the analytical expressions. For nearly all conditions tested, the bias and variance estimates that are obtained by the EM bootstrap method are closer to their true values than those obtained by the bootstrap and SIMEX methods. We end this paper by discussing the results and possible future extensions of the method.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200003
    Description: We investigate small area prediction of general parameters based on two models for unit-level counts. We construct predictors of parameters, such as quartiles, that may be nonlinear functions of the model response variable. We first develop a procedure to construct empirical best predictors and mean square error estimators of general parameters under a unit-level gamma-Poisson model. We then use a sampling importance resampling algorithm to develop predictors for a generalized linear mixed model (GLMM) with a Poisson response distribution. We compare the two models through simulation and an analysis of data from the Iowa Seat-Belt Use Survey.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200004
    Description: We present a novel methodology to benchmark county-level estimates of crop area totals to a preset state total subject to inequality constraints and random variances in the Fay-Herriot model. For planted area of the National Agricultural Statistics Service (NASS), an agency of the United States Department of Agriculture (USDA), it is necessary to incorporate the constraint that the estimated totals, derived from survey and other auxiliary data, are no smaller than administrative planted area totals prerecorded by other USDA agencies except NASS. These administrative totals are treated as fixed and known, and this additional coherence requirement adds to the complexity of benchmarking the county-level estimates. A fully Bayesian analysis of the Fay-Herriot model offers an appealing way to incorporate the inequality and benchmarking constraints, and to quantify the resulting uncertainties, but sampling from the posterior densities involves difficult integration, and reasonable approximations must be made. First, we describe a single-shrinkage model, shrinking the means while the variances are assumed known. Second, we extend this model to accommodate double shrinkage, borrowing strength across means and variances. This extended model has two sources of extra variation, but because we are shrinking both means and variances, it is expected that this second model should perform better in terms of goodness of fit (reliability) and possibly precision. The computations are challenging for both models, which are applied to simulated data sets with properties resembling the Illinois corn crop.
    Release date: 2024-01-03

  • Articles and reports: 12-001-X202300200005
    Description: Population undercoverage is one of the main hurdles faced by statistical analysis with non-probability survey samples. We discuss two typical scenarios of undercoverage, namely, stochastic undercoverage and deterministic undercoverage. We argue that existing estimation methods under the positivity assumption on the propensity scores (i.e., the participation probabilities) can be directly applied to handle the scenario of stochastic undercoverage. We explore strategies for mitigating biases in estimating the mean of the target population under deterministic undercoverage. In particular, we examine a split population approach based on a convex hull formulation, and construct estimators with reduced biases. A doubly robust estimator can be constructed if a followup subsample of the reference probability survey with measurements on the study variable becomes feasible. Performances of six competing estimators are investigated through a simulation study and issues which require further investigation are briefly discussed.
    Release date: 2024-01-03
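
The Medicare sampling entry above (12-001-X202300200001) describes computing a 90% lower confidence bound for the total population overpayment from a probability sample. The following is only a minimal sketch of that idea, assuming a simple random sample without replacement and a normal approximation; it is not the stratified procedure studied in the paper, and the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def lower_bound_total_overpayment(sample_overpayments, population_size, confidence=0.90):
    """Return (point_estimate, lower_bound) for the total overpayment.

    Assumptions (not taken from the paper): simple random sampling without
    replacement, and a one-sided normal-approximation bound.
    """
    n = len(sample_overpayments)
    if n < 2 or n > population_size:
        raise ValueError("need 2 <= sample size <= population size")

    y_bar = mean(sample_overpayments)    # mean sampled overpayment Y
    s2 = variance(sample_overpayments)   # sample variance (n - 1 denominator)
    fpc = 1 - n / population_size        # finite population correction

    point = population_size * y_bar      # expansion estimate of the total
    se_total = population_size * math.sqrt(fpc * s2 / n)

    z = NormalDist().inv_cdf(confidence)  # one-sided normal quantile
    return point, point - z * se_total


# Illustrative, made-up overpayments for 10 sampled payments out of 1,000.
sample = [0, 100, 0, 50, 100, 0, 25, 100, 0, 75]
point, lower = lower_bound_total_overpayment(sample, population_size=1000)
print(f"point estimate: {point:.2f}, 90% lower bound: {lower:.2f}")
```

As the entry notes, classical bounds of this kind can fail to reach the nominal 90% level for skewed overpayment populations, so the sketch illustrates the calculation rather than a recommended method.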

Data (9) (9 results)

No content available at this time.

Analysis (1,874) (20 to 30 of 1,874 results)

  • Articles and reports: 11-522-X202200100018
    Description: The Longitudinal Social Data Development Program (LSDDP) is a social data integration approach aimed at providing longitudinal analytical opportunities without imposing additional burden on respondents. The LSDDP uses a multitude of signals from different data sources for the same individual, which helps to better understand their interactions and track changes over time. This article looks at how the ethnicity status of people in Canada can be estimated at the most detailed disaggregated level possible using the results from a variety of business rules applied to linked data and to the LSDDP denominator. It will then show how improvements were obtained using machine learning methods, such as decision trees and random forest techniques.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100019
    Description: The purpose of this article is to compare the linkage results for individuals from French tax sources with those of the 2019 Enquête Annuelle de Recensement (EAR), obtained through different methods. Such a comparison will decide whether the Répertoires Statistiques d'Individus et de Logements (Résil) program should be equipped with a probabilistic matching tool for its administrative source identification and matching engine.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100020
    Description: The reconciliation of 2021 census dwellings with the new Statistical Building Register (SBgR) presented linkage challenges. The Census of Population collected information from various dwelling types. For a large proportion of the population, mailing addresses were at the centre: they were used for reaching out to people and collected as contact info. In parallel, the register environment has been evolving. The agency is transitioning from the Address Register (AR) to the SBgR holding both mailing and location addresses, while also covering non-residential buildings. The reconciliation was conducted using a combination of systems, notably the new Register Matching Engine (RME) for difficult cases. The RME holds an interesting range of sophisticated string comparators. A deterministic linkage approach was used, while incorporating some data knowledge like the entropy. Through metadata, the matching expert could also reduce the amounts of false positives and false negatives.
    Release date: 2024-03-25

  • Journals and periodicals: 11-522-X
    Description: Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.
    Release date: 2024-03-25

  • Articles and reports: 75-005-M2024001
    Description: From 2010 to 2019, the Labour Force Survey (LFS) response rate – or the proportion of selected households who complete an LFS interview – had been on a slow downward trend, due to a range of social and technological changes which have made it more challenging to contact selected households and to persuade Canadians to participate when they are contacted. These factors were exacerbated by the COVID-19 pandemic, which resulted in the suspension of face-to-face interviewing between April 2020 and fall 2022. Statistics Canada is committed to restoring LFS response rates to the greatest extent possible. This technical paper discusses two initiatives that are underway to ensure that the LFS estimates continue to provide an accurate and representative portrait of the Canadian labour market.
    Release date: 2024-02-16

  • Articles and reports: 75F0002M2024002
    Description: This discussion paper describes considerations for applying the Market Basket Measure (MBM) methodology to a purely administrative data source. The paper begins by outlining a rationale for estimating MBM poverty statistics using administrative income data sources. It then explains a proposal for creating annual samples, along with the caveats of creating these samples, followed by a brief analysis using the proposed samples. The paper concludes with potential future improvements to the samples and provides an opportunity for reader feedback.
    Release date: 2024-02-08

  • Stats in brief: 11-637-X
    Description: This product presents data on the Sustainable Development Goals. It gives an overview of the 17 Goals through infographics, using currently available data to report on Canada’s progress towards the 2030 Agenda for Sustainable Development.
    Release date: 2024-01-25

  • Journals and periodicals: 11-633-X
    Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
    Release date: 2024-01-22

  • Articles and reports: 11-633-X2024001
    Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
    Release date: 2024-01-22

  • Articles and reports: 13-604-M2024001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2024 for the reference years 2010 to 2023. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2024-01-22

Reference (363) (0 to 10 of 363 results)

  • Notices and consultations: 13-605-X
    Description: This product contains articles related to the latest methodological and conceptual developments in the Canadian System of Macroeconomic Accounts as well as the analysis of the Canadian economy. It includes articles detailing new methods, concepts and statistical techniques used to compile the Canadian System of Macroeconomic Accounts. It also includes information related to new or expanded data products, and provides updates and supplements to information found in various guides and analytical articles touching upon a broad range of topics related to the Canadian economy.
    Release date: 2024-02-29

  • Surveys and statistical programs – Documentation: 32-26-0007
    Description: Census of Agriculture data provide statistical information on farms and farm operators at fine geographic levels and for small subpopulations. Quality evaluation activities are essential to ensure that census data are reliable and that they meet user needs.

    This report provides data quality information pertaining to the Census of Agriculture, such as sources of error, error detection, disclosure control methods, data quality indicators, response rates and collection rates.
    Release date: 2024-02-06

  • Surveys and statistical programs – Documentation: 75-005-M2023001
    Description: This document provides information on the evolution of response rates for the Labour Force Survey (LFS) and a discussion of the evaluation of two aspects of data quality that ensure the LFS estimates continue providing an accurate portrait of the Canadian labour market.
    Release date: 2023-10-30

  • Surveys and statistical programs – Documentation: 98-306-X
    Description: This report describes sampling, weighting and estimation procedures used in the Census of Population. It provides operational and theoretical justifications for them, and presents the results of the evaluations of these procedures.
    Release date: 2023-10-04

  • Surveys and statistical programs – Documentation: 84-538-X
    Geography: Canada
    Description: This electronic publication presents the methodology underlying the production of the life tables for Canada, provinces and territories.
    Release date: 2023-08-28

  • Surveys and statistical programs – Documentation: 32-26-0006
    Description: This report provides data quality information pertaining to the Agriculture–Population Linkage, such as sources of error, matching process, response rates, imputation rates, sampling, weighting, disclosure control methods and data quality indicators.
    Release date: 2023-08-25

  • Surveys and statistical programs – Documentation: 75-514-G
    Description: The Guide to the Job Vacancy and Wage Survey contains a dictionary of concepts and definitions, and covers topics such as survey methodology, data collection, processing, and data quality. The guide covers both components of the survey: the job vacancy component, which is quarterly, and the wage component, which is annual.
    Release date: 2023-05-25

  • Surveys and statistical programs – Documentation: 32-26-0002
    Description: This reference guide may be useful to both new and experienced users who wish to familiarize themselves with and find specific information about the Census of Agriculture.

    It provides an overview of the Census of Agriculture communications, content determination, collection, processing, data quality evaluation and dissemination activities. It also summarizes the key changes to the census and other useful information.
    Release date: 2022-04-14

  • Geographic files and documentation: 12-572-X
    Description: The Standard Geographical Classification (SGC) provides a systematic classification structure that categorizes all of the geographic area of Canada. The SGC is the official classification used in the Census of Population and other Statistics Canada surveys.

    The classification is organized in two volumes: Volume I, The Classification and Volume II, Reference Maps.

    Volume II contains reference maps showing boundaries, names, codes and locations of the geographic areas in the classification. The reference maps show census subdivisions, census divisions, census metropolitan areas, census agglomerations, census metropolitan influenced zones and economic regions. Definitions for these terms are found in Volume I, The Classification. Volume I describes the classification and related standard geographic areas and place names.

    The maps in Volume II can be downloaded in PDF format from our website.
    Release date: 2022-02-09

  • Surveys and statistical programs – Documentation: 12-004-X
    Description: Statistics: Power from Data! is a web resource that was created in 2001 to assist secondary students and teachers of Mathematics and Information Studies in getting the most from statistics. Over the past 20 years, this product has become one of Statistics Canada's most popular references for students, teachers, and many other members of the general population. This product was last updated in 2021.
    Release date: 2021-09-02
