Statistical methods

Results

All (2,478) (10 to 20 of 2,478 results)

  • Articles and reports: 12-001-X202500200003
    Description: In this paper a model-based inference procedure based on a multivariate structural time series model is developed for the production of monthly figures about consumer confidence. The inputs to the model are five series of direct estimates for the indices that measure consumer confidence, which are derived from the Dutch Consumer Survey. The model improves the accuracy of the direct estimates, since it provides a better separation of measurement errors and sampling errors from estimated target parameters. The standard errors for the month-to-month changes are clearly smaller under the time series model. A second problem addressed in this paper is related to the transition to a new survey process in 2017. Structural time series models in combination with a parallel run are applied to estimate discontinuities induced by the redesign. An algorithm designed for the consumer confidence variables is developed to construct uninterrupted input series for the aforementioned structural time series model. This inference method facilitated a smooth transition to a new survey design and resulted in uninterrupted series about consumer confidence that date back to 1986. The method is implemented for the production of official monthly figures on consumer confidence in the Netherlands.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200004
    Description: The class of generalized linear models (GLM) is a flexible generalization of ordinary least squares regression that allows the linear model to be related to the response variable via a link function and assumes the magnitude of the variance of each measurement to be a function of its predicted value. Multicollinearity in GLMs can inflate variances of the estimated coefficients and cause poor prediction in certain regions of the regression space. It may also cause a nonsignificant Wald statistic even when the predictors are highly predictive in a model of the family of GLMs. Little previous research has closely investigated the diagnostics of multicollinearity in GLMs, especially when complex survey data are used. In this paper, we develop variance inflation factors (VIFs) that measure the amount that the variance of a parameter estimator is increased due to multicollinearity in GLMs. We also extend VIFs and condition indexes to apply to complex survey data, accounting for design features, e.g. weights, clusters, and strata. Illustrations of these methods are given using data from a household survey of health and nutrition.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200005
    Description: The use of non-probability data sources for statistical purposes and for official statistics has become increasingly popular in recent years. However, statistical inference based on non-probability samples is made more difficult by their inherent bias and lack of representativity. In this paper we propose a quantile balancing inverse probability weighting (QBIPW) estimator for non-probability samples. We apply the idea of Harms and Duchesne (2006), which allows the use of quantile information in the estimation process to reproduce known totals and the distribution of auxiliary variables. We discuss the estimation of the QBIPW probabilities and its variance. Our simulation study has demonstrated that the proposed estimators are robust against model mis-specification and, as a result, help to reduce bias and mean squared error. Finally, we applied the proposed methods to estimate the share of job vacancies aimed at Ukrainian workers in Poland using an integrated set of administrative and survey data about job vacancies.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200006
    Description: National Statistical Institutes (NSIs) are directing resources into advancing the use of administrative data in official statistics. Administrative data, however, are not developed for the purpose of producing statistics but rather arise as a result of an event or transaction relating to administrative procedures of organizations, public administrations and government agencies. Therefore, it is essential to check the quality of the administrative data with respect to sources of error, particularly representativeness of the target population. In this paper, we utilize the strength of probability-based reference samples or censuses that can be used to detect the lack of representativeness in administrative data and introduce quality indicators based on distance metrics and representativity indicators (R-indicators). We demonstrate their application with a simulation study and discuss a real application to a UK Office for National Statistics (ONS) administrative dataset.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200007
    Description: Although probability samples have been regarded as the gold standard for collecting information in population-based studies, non-probability samples have been used frequently in practice due to low cost, convenience, and the lack of a sampling frame for the survey. Naïve estimates based on non-probability samples without any adjustments may be misleading due to selection bias. Recently, a valid data integration approach that includes mass imputation, propensity score weighting, and calibration has been used to improve the representativeness of non-probability samples. The effectiveness of the mass imputation approach depends on the underlying model assumptions. In this paper, we propose using deep learning for mass imputation when combining probability and non-probability samples and compare it with several modern machine learning-based mass imputation approaches, including generalized additive modeling, regression tree, random forest, and XG-boosting. In the simulation study, deep learning-based approaches have been shown to be more robust and effective than other mass imputation approaches against the failure of underlying model assumptions under non-linearity scenarios.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200008
    Description: Classical design-based survey estimation relies on a properly specified sampling design for valid inference. We consider the properties of regression estimation under a misspecified sample design, in which the nominal and true inclusion probabilities do not necessarily match. This general misspecified sample design setting encompasses many challenges in the modern survey environment. Under this setting, an asymptotic analysis of the regression estimator, an expression of the bias, and an expression of the variance are presented. Further, a consistent variance estimator is derived and an expression which estimates the bias in part or in whole is discussed. This latter expression may be used by a practitioner as an indicator of the presence of bias due to misspecification. A simulation study is conducted to support the presented theory.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200009
    Description: We present and apply methodology to improve inference for small area parameters by using data from several sources. This work extends Cahoy and Sedransk (2023), who showed how to integrate summary statistics from several sources. Our methodology uses hierarchical global-local prior distributions to make inferences for the proportion of individuals in Florida’s counties who do not have health insurance. Results from an extensive simulation study show that this methodology will provide improved inference by using several data sources. Among the five model variants evaluated, the ones using horseshoe priors for all variances have better performance than the ones using lasso priors for the local variances.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200010
    Description: In this paper, we study the performance of hierarchical Bayes (HB) small area estimators using noninformative and informative priors. We apply the Bayesian models of You and Chapman (2006) and You (2021) to the Canadian Labour Force Survey (LFS) data and evaluate the impact of the priors on the HB estimators. A Bayesian model comparison and simulation study are also conducted. Our results indicate that a correct informative prior can lead to very good results, and noninformative priors can also perform very well. Incorrect informative priors can lead to poor results in terms of large bias and large coefficient of variation (CV). Noninformative priors are recommended in practice for HB small area estimation unless correctly specified informative priors are available. Informative priors are particularly useful when the number of small areas is relatively small.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200011
    Description: We propose an approximate hierarchical Bayes approach that uses the Natural Exponential Family with Quadratic Variance Function (NEF-QVF) in combining information from multiple sources to improve traditional survey estimates of finite population means for small areas. Unlike other Bayesian approaches in finite population sampling, we do not assume a model for all units of the finite population and do not require linking sampled units to the finite population frame. We assume a model only for the finite population units in which the outcome variable is observed because, for these units, the assumed model can be checked using existing statistical tools. We do not posit an elaborate model on the true means for unobserved units. Instead, we assume that population means of cells with the same combination of factor levels are identical across small areas, and that the population mean for a cell is identical to the mean of the observed units in that cell. We apply our proposed methodology to a real-life survey, linking information from multiple disparate data sources. We also provide practical ways of model selection that can be applied to a wider class of models under a similar setting but for a diverse range of scientific problems.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200012
    Description: The observed best prediction (OBP) under a nested-error regression (NER) model was previously proposed using a design-based mean squared prediction error (MSPE) as a tool to derive the best predictive estimator (BPE). A recent study showed the OBP under the NER model may suffer from numerical instability when computing the BPE. We propose several modifications of the OBP under the NER model, including ones using a model-based MSPE to derive the BPE, to improve the numerical stability and predictive performance. We compare the performance of the modified OBP strategies with the existing methods in a simulation study. A real-data example is discussed.
    Release date: 2025-12-23
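
Entry 12-001-X202500200004 above refers to variance inflation factors (VIFs). As a point of orientation only, the following is a minimal sketch of the classical, unweighted VIF for an ordinary linear model, computed as the diagonal of the inverse of the predictors' correlation matrix. It is not the GLM or complex-survey extension developed in that paper; the function names and the toy data are illustrative assumptions.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of equal-length predictor columns."""
    n = len(columns[0])
    centred = []
    for col in columns:
        mean = sum(col) / n
        centred.append([x - mean for x in col])
    # A constant column has zero norm and cannot be used as a predictor here.
    norms = [math.sqrt(sum(x * x for x in c)) for c in centred]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centred, norms)]
            for ci, ni in zip(centred, norms)]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def variance_inflation_factors(columns):
    """Classical VIF_j = j-th diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Hypothetical toy data: two nearly collinear predictors.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 2.0, 3.0, 5.0]
print(variance_inflation_factors([x1, x2]))
```

A VIF of 1 means a predictor is uncorrelated with the others; large values indicate that its coefficient variance is inflated by collinearity.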

Data (10 results)

  • Public use microdata: 89F0002X
    Description: The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It is comprised of a database, a series of tax/transfer algorithms and models, analytical software and user documentation.
    Release date: 2026-02-12

  • Profile of a community or region: 46-26-0002
    Description: The National Address Register (NAR) is a list of commercial and residential addresses in Canada that are extracted from Statistics Canada's Building Register and deemed non-confidential.
    Release date: 2025-12-19

  • Table: 89-26-0006
    Description: PASSAGES is an open-source dynamic microsimulation model aimed at supporting policy analysis and research relating to Canadian retirement income system outcomes at the individual and family level. The publicly available version includes a synthetic starting database, a model, and documentation. A confidential starting database is also available.
    Release date: 2025-03-12

  • Data Visualization: 71-607-X2020010
    Description: The Canadian Statistical Geospatial Explorer empowers users to discover geo-enabled data holdings of Statistics Canada at various levels of geography, including the neighbourhood level. Users are able to visualize, thematically map, spatially explore and analyze, export and consume data in various formats. Users can also view the data superimposed on satellite imagery, topographic and street layers.
    Release date: 2024-08-21

  • Table: 11-10-0074-01
    Geography: Census tract
    Frequency: Occasional
    Description: The divergence index (D-index) describes the degree to which families with different income levels are mixing together in neighbourhoods. It compares neighbourhood (census tract, CT) discrete income distributions to a base distribution, which is the income quintiles of the neighbourhood’s census metropolitan area (CMA).
    Release date: 2020-06-22

  • Data Visualization: 71-607-X2019010
    Description: The Housing Data Viewer is a visualization tool that allows users to explore Statistics Canada data on a map. Users can use the tool to navigate, compare and export data.
    Release date: 2019-10-30

  • Table: 53-500-X
    Description: This report presents the results of a pilot survey conducted by Statistics Canada to measure the fuel consumption of on-road motor vehicles registered in Canada. This study was carried out in connection with the Canadian Vehicle Survey (CVS) which collects information on road activity such as distance traveled, number of passengers and trip purpose.
    Release date: 2004-10-21

  • Table: 13-220-X
    Description: In the 1997 edition, new and revised benchmarks were introduced for 1992 and 1988. The indicators are used to monitor supply, demand and employment for tourism in Canada on a timely basis. The annual tables are derived using the National Income and Expenditure Accounts (NIEA) and various industry and travel surveys. Tables providing actual data and percentage changes, for seasonally adjusted current and constant price estimates are included. In addition, an analytical section provides graphs, and time series of first differences, percentage changes, and seasonal factors for selected indicators. Data are published from 1987 and the publication will be available on the day of release. New data are included in the demand tables for non-tourism commodities produced by non-tourism industries and in the employment tables covering direct tourism employment generated by non-tourism industries. This product was commissioned by the Canadian Tourism Commission to provide annual updates for the Tourism Satellite Account.
    Release date: 2003-01-08

  • Table: 11-516-X
    Description:

    The second edition of Historical statistics of Canada was jointly produced by the Social Science Federation of Canada and Statistics Canada in 1983. This volume contains about 1,088 statistical tables on the social, economic and institutional conditions of Canada from the start of Confederation in 1867 to the mid-1970s. The tables are arranged in sections with an introduction explaining the content of each section, the principal sources of data for each table, and general explanatory notes regarding the statistics. In most cases, there is sufficient description of the individual series to enable the reader to use them without consulting the numerous basic sources referenced in the publication.

    The electronic version of this historical publication is accessible on the Internet site of Statistics Canada as a free downloadable document: text as HTML pages and all tables as individual spreadsheets in a comma delimited format (CSV) (which allows online viewing or downloading).

    Release date: 1999-07-29

  • Table: 82-567-X
    Description:

    The National Population Health Survey (NPHS) is designed to enhance the understanding of the processes affecting health. The survey collects cross-sectional as well as longitudinal data. In 1994/95 the survey interviewed a panel of 17,276 individuals, then returned to interview them a second time in 1996/97. The response rate for these individuals was 96% in 1996/97. Data collection from the panel will continue for up to two decades. For cross-sectional purposes, data were collected for a total of 81,000 household residents in all provinces (except people on Indian reserves or on Canadian Forces bases) in 1996/97.

    This overview illustrates the variety of information available by presenting data on perceived health, chronic conditions, injuries, repetitive strains, depression, smoking, alcohol consumption, physical activity, consultations with medical professionals, use of medications and use of alternative medicine.

    Release date: 1998-07-29
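
Entry 11-10-0074-01 above describes the divergence index (D-index) as a comparison between a neighbourhood's discrete income distribution and the income quintiles of its CMA. The catalogue entry does not give the formula. The sketch below assumes a Kullback-Leibler style relative entropy, which is one common way to compare two discrete distributions; the function name and the example shares are hypothetical.

```python
import math

def divergence_index(local_shares, base_shares):
    """Relative entropy of a local distribution with respect to a base one.

    Assumption: the index is sum(p * log(p / q)) over income groups, where
    p is the neighbourhood share and q the base (CMA) share of each group.
    """
    if len(local_shares) != len(base_shares):
        raise ValueError("both distributions need the same number of groups")
    total = 0.0
    for p, q in zip(local_shares, base_shares):
        if p > 0:  # a group with no local families contributes nothing
            total += p * math.log(p / q)
    return total

# By construction, CMA quintiles each hold 20% of families.
cma_quintiles = [0.2, 0.2, 0.2, 0.2, 0.2]

# Hypothetical neighbourhood that mirrors its CMA: index is 0.
print(divergence_index([0.2, 0.2, 0.2, 0.2, 0.2], cma_quintiles))

# Hypothetical neighbourhood concentrated in the top quintile: index is larger.
print(divergence_index([0.05, 0.05, 0.1, 0.2, 0.6], cma_quintiles))
```

Under this assumption, a value of zero means the neighbourhood has the same income mix as its CMA, and larger values mean the neighbourhood's mix departs further from it.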

Analysis (2,036) (1,990 to 2,000 of 2,036 results)

  • Articles and reports: 12-001-X197800154829
    Description: This paper advances the case that administrative records are a powerful source of statistics and in support of this conclusion provides an overview of the extensive utilization in Canada of administrative records for statistical purposes. The paper discusses recent developments and the changing environment which are seen as major determinants of both the creation of administrative data bases as well as their utilization. The capabilities of the computer, combined with the extensive demand for statistics and the limited financial resources available to meet that demand, are seen as combining to lead to more extensive use of administrative records. A variety of problems associated with the use of administrative records is specified and the development of strategies to meet these problems and permit utilization of administrative records is described. Recent developments in Canada intended to support the use of administrative records are indicated.
    Release date: 1978-06-15

  • Articles and reports: 12-001-X197700254828
    Description: Non-response exists in any survey, but its magnitude depends upon the type of survey, the interviewers’ ability to conduct an interview, and the respondents’ motivation to respond to survey questions. This paper discusses non-response in relation to a number of household surveys and in particular the behaviour of non-response rates over time in a continuous survey such as the Canadian Labour Force Survey.

    A profile of interviewers employed by Statistics Canada shows that the correlation between non-response and a number of interviewer characteristics is not significant. Respondents themselves, and their motivation, are the key elements in an interview process and therefore in respondent relations.

    This article draws on the results of various studies conducted to investigate the effects of response burden, choice of respondent and response incentives to provide some insight into the characteristics of non-respondents.

    Release date: 1977-12-15

  • Articles and reports: 12-001-X197700254829
    Description: Results of an earlier paper on the use of raking ratio estimators are extended to the case of cluster sampling. An empirical study is discussed.
    Release date: 1977-12-12

  • Articles and reports: 12-001-X197700254830
    Description: The distribution of questionnaires to Canadian residents returning by land from the U.S. has been substantially modified, in an effort to improve sample yield at minimal additional cost. For each border crossing ("port") involved, a systematic sample of multi-day distribution stints has been selected. The sample selection method is described, the constraints which determined it are discussed, and some preliminary data on the method’s effectiveness are presented.
    Release date: 1977-12-12

  • Articles and reports: 12-001-X197700254831
    Description: This article describes briefly the methodology of the Occupational Employment Survey, which has been conducted every second year since 1973. The article presents the scope of the survey, the sampling plan and the estimation procedure.
    Release date: 1977-12-12

  • Articles and reports: 12-001-X197700254832
    Description: In periodic household surveys, area samples are usually selected in geographic strata with probability of selection of areal units proportional to population size in these units. The design-based estimates for areas composed of domains within strata can have poor precision due to cluster sampling with a few primary sampling units per stratum. In this paper, synthetic estimates are investigated as an alternative to these estimates. An empirical evaluation based on the design of the Canadian Labour Force Survey is given.
    Release date: 1977-12-12

  • Articles and reports: 12-001-X197700254833
    Description: The paper first identifies some of the factors which have recently made it more difficult for statistical agencies to satisfy society's growing needs for information, while at the same time reassuring respondents that their privacy is adequately protected.

    The conceptual basis of privacy is then discussed, as well as the privacy provisions of the new Canadian Human Rights Act. The paper next reviews the confidentiality provisions of Canada's Statistics Act by which the privacy rights of respondents are protected. There then follows an account of the circumstances under which the confidential treatment of corporate information is being challenged, and the way in which Statistics Canada is endeavouring to meet governmental needs for access to individual corporate returns in a foreign ownership context without prejudicing traditional confidentiality practices in mainstream statistical reporting.

    Finally, the paper notes two subjects which are likely to feature in future discussions of confidentiality: first, scholarly access to historical statistical records; and second, the possibility of future freedom of information legislation in Canada.
    Release date: 1977-12-12

  • Articles and reports: 12-001-X197700100001
    Description: This article summarizes the findings of a study of the feasibility of an on-going labour force survey in the Yukon Territory. The major aspects of methodology considered are the choice of a sampling frame and the determination of a sample size and allocation. It is shown that area sampling would be preferable to the use of available lists, although substantial field testing would be required because of conditions particular to the Yukon. It is also observed that sampling fractions as high as 15% may be required to produce basic labour force data, because of the small population.
    Release date: 1977-06-20

  • Articles and reports: 12-001-X197700100002
    Description: In multi-stage sampling when selection is without replacement at the first stage, estimation of the variance of the estimate of the population total is often done assuming sampling with replacement. This estimate is biased and the degree of bias is not negligible. In this paper, a procedure which gives unbiased estimates of the variance making use of only estimated primary sampling unit totals is suggested for the case when sampling at the second and subsequent stages is simple random without replacement. This procedure is based on sub-samples drawn from the selected second and subsequent stage units.
    Release date: 1977-06-20

  • Articles and reports: 12-001-X197700100003
    Description: This paper describes the methodology of the Response Incentives Experiment which was carried out in the Canadian Labour Force Survey in order to determine the effectiveness of a response incentive on improving respondent relations and interviewer performance. Included in the paper are various results relating to non-response rates and refusal rates as well as results of an evaluation questionnaire which was completed by all interviewers at the conclusion of the experiment.
    Release date: 1977-06-20
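
Entry 12-001-X197700254829 above concerns raking ratio estimators. As background, here is a minimal sketch of basic two-way raking (iterative proportional fitting), which alternately rescales the rows and columns of a table of weighted sample counts until its margins match known population totals. It does not reproduce the cluster-sampling extension studied in that paper; the function name, the fixed iteration count and the toy numbers are illustrative assumptions.

```python
def rake(table, row_targets, col_targets, iterations=50):
    """Scale a two-way table so its margins match the target totals.

    Assumes all cells are positive and that the row and column targets
    sum to the same grand total.
    """
    cells = [list(row) for row in table]  # work on a copy
    for _ in range(iterations):
        # Row step: make each row sum equal its target.
        for i, target in enumerate(row_targets):
            row_sum = sum(cells[i])
            cells[i] = [c * target / row_sum for c in cells[i]]
        # Column step: make each column sum equal its target.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in cells)
            for row in cells:
                row[j] *= target / col_sum
    return cells

# Hypothetical weighted sample counts and known population margins.
sample = [[10.0, 20.0], [30.0, 40.0]]
raked = rake(sample, row_targets=[40.0, 60.0], col_targets=[45.0, 55.0])
for row in raked:
    print([round(c, 3) for c in row])
```

The ratio of each raked cell to its original value can be read as the adjustment factor applied to the weights of the sample units in that cell.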

Reference (380) (10 to 20 of 380 results)

  • Surveys and statistical programs – Documentation: 89-657-X2024009
    Description: The Survey on the Official Language Minority Population (SOLMP) user guide contains a description of the survey, along with survey concepts and definitions and an overview of the content development. The target and survey populations, the sample design and sample size are described in the Methodology section. Finally, in the Data Collection module, the collection period and instrument, modes of collection, collection and communications strategies and response rates are provided.
    Release date: 2024-12-16

  • Surveys and statistical programs – Documentation: 11-633-X2024004
    Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 40 years.
    Release date: 2024-12-09

  • Surveys and statistical programs – Documentation: 11-633-X2024005
    Description: The Analytical Studies and Modelling Branch (ASMB) is the research, modelling, training and access hub of Statistics Canada. It focuses on leveraging the agency’s vast data holdings to generate in-depth insights that support evidence-based policy making and to enable others to do so through analytical training and data access. The ASMB, like other program areas in the agency, works to support Statistics Canada’s overall mission of delivering insights through data for a better Canada.
    Release date: 2024-12-06

  • Surveys and statistical programs – Documentation: 98-303-X
    Description: The Coverage Technical Report will present the errors included in census data that result from persons who are either missed (not enumerated) or enumerated more than once. The population coverage error is one of the most important types of errors because it affects the accuracy of not only population counts, but also all the census data results that describe the characteristics of the population universe.
    Release date: 2024-10-23

  • Surveys and statistical programs – Documentation: 89-653-X2024002
    Description: This guide is intended to provide a detailed review of both the 2022 IPS and IPS–NIS with respect to subject matter and methodological approaches. It is designed to help data users by serving as a guide to the concepts and measures of the survey as well as the technical details of the survey’s design, field work and data processing. This guide is meant to provide users with helpful information on how to use and interpret survey results. The discussion on data quality also allows users to review the strengths and limitations of the data for their particular needs.

    Chapter 1 of this guide provides an overview of the 2022 IPS and IPS–NIS by introducing the survey background and objectives. Chapter 2 outlines the survey’s themes and explains the key concepts and definitions used for the survey. Chapters 3 to 6 cover important aspects of the survey methodology, sampling design, data collection and processing. Chapters 7 and 8 review issues of data quality and caution users about comparing 2022 IPS or IPS–NIS data with data from other sources. Chapter 9 outlines the survey products available to the public, including data tables, analytical articles and reference material. The appendices provide a comprehensive list of survey indicators, extra coding categories and standard classifications used on both the IPS and the IPS–NIS. Lastly, a glossary of survey terms and information on confidence intervals is also provided.
    Release date: 2024-08-14

  • Surveys and statistical programs – Documentation: 75-514-G
    Description: The Guide to the Job Vacancy and Wage Survey contains a dictionary of concepts and definitions, and covers topics such as survey methodology, data collection, processing, and data quality. The guide covers both components of the survey: the job vacancy component, which is quarterly, and the wage component, which is annual.
    Release date: 2024-06-18

  • Surveys and statistical programs – Documentation: 32-26-0007
    Description: Census of Agriculture data provide statistical information on farms and farm operators at fine geographic levels and for small subpopulations. Quality evaluation activities are essential to ensure that census data are reliable and that they meet user needs.

    This report provides data quality information pertaining to the Census of Agriculture, such as sources of error, error detection, disclosure control methods, data quality indicators, response rates and collection rates.
    Release date: 2024-02-06

  • Surveys and statistical programs – Documentation: 11-633-X2024001
    Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
    Release date: 2024-01-22

  • Surveys and statistical programs – Documentation: 75-005-M2023001
    Description: This document provides information on the evolution of response rates for the Labour Force Survey (LFS) and a discussion of the evaluation of two aspects of data quality that ensure the LFS estimates continue providing an accurate portrait of the Canadian labour market.
    Release date: 2023-10-30

  • Surveys and statistical programs – Documentation: 98-306-X
    Description: This report describes sampling, weighting and estimation procedures used in the Census of Population. It provides operational and theoretical justifications for them, and presents the results of the evaluations of these procedures.
    Release date: 2023-10-04