Statistical methods

Skip to filters. View results.

Key indicators

Changing any selection will automatically update the page content.

Selected geographical area:Canada

Selected geographical area:Newfoundland and Labrador

Selected geographical area:Prince Edward Island

Selected geographical area:Nova Scotia

Selected geographical area:New Brunswick

Selected geographical area:Quebec

Selected geographical area:Ontario

Selected geographical area:Manitoba

Selected geographical area:Saskatchewan

Selected geographical area:Alberta

Selected geographical area:British Columbia

Selected geographical area:Yukon

Selected geographical area:Northwest Territories

Selected geographical area:Nunavut

Sort Help
entries

Results

All (2,478)

All (2,478) (0 to 10 of 2,478 results)

  • Surveys and statistical programs – Documentation: 19-20-00012026001
    Description: This reference document provides nontechnical answers on selected topics related to the use and interpretation of seasonally adjusted data. It is designed to complement more technical discussions of seasonal adjustment found in Statistics Canada publications and reference manuals.
    Release date: 2026-05-11

  • Surveys and statistical programs – Documentation: 19-20-0001
    Description: Documents in this series provide insight into the statistical methods used by Statistics Canada to produce official statistics. They include introductory material, in-depth descriptions of techniques and methods, best practices, and guidelines. All documents have undergone review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted methodological standards and practices.
    Release date: 2026-05-11

  • Notices and consultations: 13-605-X
    Description: This product contains articles related to the latest methodological, conceptual developments in the Canadian System of Macroeconomic Accounts as well as the analysis of the Canadian economy. It includes articles detailing new methods, concepts and statistical techniques used to compile the Canadian System of Macroeconomic Accounts. It also includes information related to new or expanded data products, provides updates and supplements to information found in various guides and analytical articles touching upon a broad range of topics related to the Canadian economy.
    Release date: 2026-05-04

  • Surveys and statistical programs – Documentation: 11-633-X2026002
    Description: Recent changes in Canada’s immigration levels have heightened interest in understanding how immigration affects housing demand. This article develops a methodological framework for projecting housing use associated with permanent residents (PRs) and non-permanent residents (NPRs) under alternative immigration scenarios. The framework applies observed per capita housing use rates from the Census of Population to estimate incremental housing use by tenure over time.
    Release date: 2026-04-24

  • Journals and periodicals: 11-633-X
    Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
    Release date: 2026-04-24

  • Surveys and statistical programs – Documentation: 11-633-X2026001
    Description: This report defines key concepts related to area-level analysis and introduces area-level measures developed and utilized at Statistics Canada for health analysis. It also provides a decision-making framework and practical recommendations to help researchers select appropriate methods. The goal is to guide readers on when area-level analysis is appropriate and what type of area-level measure is suitable to achieve research objectives.
    Release date: 2026-03-05

  • Public use microdata: 89F0002X
    Description: The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It is comprised of a database, a series of tax/transfer algorithms and models, analytical software and user documentation.
    Release date: 2026-02-12

  • Articles and reports: 13-604-M2026001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2026 for the reference years 2010 to 2025. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2026-01-29

  • Articles and reports: 12-001-X202500200001
    Description: Nested error regression models are commonly used to incorporate unit specific auxiliary variables to improve small area estimates. When the mean structure of the model is misspecified, the design-based mean squared prediction error (MSPE) of Empirical Best Linear Unbiased Predictors (EBLUP) generally increases. The Observed Best Prediction (OBP) method has been proposed with the intent to improve on the design-based MSPE over EBLUP. In this paper, we conduct a Monte Carlo simulation experiments to understand the effect of misspsecification of mean structures on different small area estimators. Our findings suggest that the OBP using unit-level auxiliary variables does not outperform the EBLUP in terms of design-based MSPE, unless the number of small areas m is extremely large. Conversely, the performance of OBP significantly improves when area-level auxiliary variables are employed. This paper includes both analytical and numerical evidence to demonstrate these observations, providing practical insights for addressing model misspecification in small area estimation (SAE).
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200002
    Description: This study examines interviewer effects on household nonresponse in three waves of the Household Finance and Consumption Survey (HFCS) in Austria using a multilevel model. Addressing nonresponse at its source is crucial for maintaining survey data quality and representativeness. Our findings indicate that the variation in response behavior explained by interviewer effects decreased from about one-third in the first wave to 7% in the third wave. Effective interviewers tend to have a university degree, be married, homeowners, and have a larger workload. Additionally, higher mean wages in the household’s municipality negatively affect survey participation. These insights suggest targeted interviewer selection and training strategies to improve response rates.
    Release date: 2025-12-23
Data (10)

Data (10) ((10 results))

  • Public use microdata: 89F0002X
    Description: The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It is comprised of a database, a series of tax/transfer algorithms and models, analytical software and user documentation.
    Release date: 2026-02-12

  • Profile of a community or region: 46-26-0002
    Description: The National Address Register (NAR) is a list of commercial and residential addresses in Canada that are extracted from Statistics Canada's Building Register and deemed non-confidential.
    Release date: 2025-12-19

  • Table: 89-26-0006
    Description: PASSAGES is an open-source dynamic microsimulation model aimed at supporting policy analysis and research relating to Canadian retirement income system outcomes at the individual and family level. The publicly available version includes a synthetic starting database, a model, and documentation. A confidential starting database is also available.
    Release date: 2025-03-12

  • Data Visualization: 71-607-X2020010
    Description: The Canadian Statistical Geospatial Explorer empowers users to discover geo enabled data holdings of Statistics Canada at various levels of geography including at the neighbourhood level. Users are able to visualize, thematically map, spatially explore and analyze, export and consume data in various formats. Users can also view the data superimposed on satellite imagery, topographic and street layers.
    Release date: 2024-08-21

  • Table: 11-10-0074-01
    Geography: Census tract
    Frequency: Occasional
    Description:

    The divergence index (D-index) describes the degree that families with different income levels are mixing together in neighbourhoods. It compares neighbourhood (census tract, CT) discrete income distributions to a base distribution, which is the income quintiles of the neighbourhood’s census metropolitan area (CMA).

    Release date: 2020-06-22

  • Data Visualization: 71-607-X2019010
    Description: The Housing Data Viewer is a visualization tool that allows users to explore Statistics Canada data on a map. Users can use the tool to navigate, compare and export data.
    Release date: 2019-10-30

  • Table: 53-500-X
    Description:

    This report presents the results of a pilot survey conducted by Statistics Canada to measure the fuel consumption of on-road motor vehicles registered in Canada. This study was carried out in connection with the Canadian Vehicle Survey (CVS) which collects information on road activity such as distance traveled, number of passengers and trip purpose.

    Release date: 2004-10-21

  • Table: 13-220-X
    Description: In the 1997 edition, new and revised benchmarks were introduced for 1992 and 1988. The indicators are used to monitor supply, demand and employment for tourism in Canada on a timely basis. The annual tables are derived using the National Income and Expenditure Accounts (NIEA) and various industry and travel surveys. Tables providing actual data and percentage changes, for seasonally adjusted current and constant price estimates are included. In addition, an analytical section provides graphs, and time series of first differences, percentage changes, and seasonal factors for selected indicators. Data are published from 1987 and the publication will be available on the day of release. New data are included in the demand tables for non-tourism commodities produced by non-tourism industries and in the employment tables covering direct tourism employment generated by non-tourism industries. This product was commissioned by the Canadian Tourism Commission to provide annual updates for the Tourism Satellite Account.
    Release date: 2003-01-08

  • Table: 11-516-X
    Description:

    The second edition of Historical statistics of Canada was jointly produced by the Social Science Federation of Canada and Statistics Canada in 1983. This volume contains about 1,088 statistical tables on the social, economic and institutional conditions of Canada from the start of Confederation in 1867 to the mid-1970s. The tables are arranged in sections with an introduction explaining the content of each section, the principal sources of data for each table, and general explanatory notes regarding the statistics. In most cases, there is sufficient description of the individual series to enable the reader to use them without consulting the numerous basic sources referenced in the publication.

    The electronic version of this historical publication is accessible on the Internet site of Statistics Canada as a free downloadable document: text as HTML pages and all tables as individual spreadsheets in a comma delimited format (CSV) (which allows online viewing or downloading).

    Release date: 1999-07-29

  • Table: 82-567-X
    Description:

    The National Population Health Survey (NPHS) is designed to enhance the understanding of the processes affecting health. The survey collects cross-sectional as well as longitudinal data. In 1994/95 the survey interviewed a panel of 17,276 individuals, then returned to interview them a second time in 1996/97. The response rate for these individuals was 96% in 1996/97. Data collection from the panel will continue for up to two decades. For cross-sectional purposes, data were collected for a total of 81,000 household residents in all provinces (except people on Indian reserves or on Canadian Forces bases) in 1996/97.

    This overview illustrates the variety of information available by presenting data on perceived health, chronic conditions, injuries, repetitive strains, depression, smoking, alcohol consumption, physical activity, consultations with medical professionals, use of medications and use of alternative medicine.

    Release date: 1998-07-29
Analysis (2,036)

Analysis (2,036) (0 to 10 of 2,036 results)

  • Surveys and statistical programs – Documentation: 19-20-0001
    Description: Documents in this series provide insight into the statistical methods used by Statistics Canada to produce official statistics. They include introductory material, in-depth descriptions of techniques and methods, best practices, and guidelines. All documents have undergone review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted methodological standards and practices.
    Release date: 2026-05-11

  • Journals and periodicals: 11-633-X
    Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
    Release date: 2026-04-24

  • Articles and reports: 13-604-M2026001
    Description: This documentation outlines the methodology used to develop the Distributions of household economic accounts published in January 2026 for the reference years 2010 to 2025. It describes the framework and the steps implemented to produce distributional information aligned with the National Balance Sheet Accounts and other national accounts concepts. It also includes a report on the quality of the estimated distributions.
    Release date: 2026-01-29

  • Articles and reports: 12-001-X202500200001
    Description: Nested error regression models are commonly used to incorporate unit specific auxiliary variables to improve small area estimates. When the mean structure of the model is misspecified, the design-based mean squared prediction error (MSPE) of Empirical Best Linear Unbiased Predictors (EBLUP) generally increases. The Observed Best Prediction (OBP) method has been proposed with the intent to improve on the design-based MSPE over EBLUP. In this paper, we conduct a Monte Carlo simulation experiments to understand the effect of misspsecification of mean structures on different small area estimators. Our findings suggest that the OBP using unit-level auxiliary variables does not outperform the EBLUP in terms of design-based MSPE, unless the number of small areas m is extremely large. Conversely, the performance of OBP significantly improves when area-level auxiliary variables are employed. This paper includes both analytical and numerical evidence to demonstrate these observations, providing practical insights for addressing model misspecification in small area estimation (SAE).
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200002
    Description: This study examines interviewer effects on household nonresponse in three waves of the Household Finance and Consumption Survey (HFCS) in Austria using a multilevel model. Addressing nonresponse at its source is crucial for maintaining survey data quality and representativeness. Our findings indicate that the variation in response behavior explained by interviewer effects decreased from about one-third in the first wave to 7% in the third wave. Effective interviewers tend to have a university degree, be married, homeowners, and have a larger workload. Additionally, higher mean wages in the household’s municipality negatively affect survey participation. These insights suggest targeted interviewer selection and training strategies to improve response rates.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200003
    Description: In this paper a model-based inference procedure based on a multivariate structural time series model is developed for the production of monthly figures about consumer confidence. The input for the model are five series of direct estimates for the indices that measure consumer confidence, which are derived from the Dutch Consumer Survey. The model improves the accuracy of the direct estimates, since it provides a better separation of measurement errors and sampling errors from estimated target parameters. The standard errors for the month-to-month changes are clearly smaller under the time series model. A second problem addressed in this paper is related to the transition to a new survey process in 2017. Structural time series models in combination with a parallel run are applied to estimate discontinuities induced by the redesign. An algorithm designed for the consumer confidence variables is developed to construct uninterrupted input series for the aforementioned structural time series model. This inference method facilitated a smooth transition to a new survey design and resulted in uninterrupted series about consumer confidence that date back to 1986. The method is implemented for the production of official monthly figures on consumer confidence in the Netherlands.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200004
    Description: The class of generalized linear models (GLM) is a flexible generalization of ordinary least squares regression that allows the linear model to be related to the response variable via a link function and assumes the magnitude of the variance of each measurement to be a function of its predicted value. Multicollinearity in GLMs can inflate variances of the estimated coefficients and cause poor prediction in certain regions of the regression space. It may also cause a nonsignificant Wald statistic even when the predictors are highly predictive in a model of the family of GLMs. Little previous research has closely investigated the diagnostics of multicollinearity in GLMs, especially when complex survey data are used. In this paper, we develop variance inflation factors (VIFs) that measure the amount that the variance of a parameter estimator is increased due to multicollinearity in GLMs. We also extend VIFs and condition indexes to apply to complex survey data, accounting for design features, e.g. weights, clusters, and strata. Illustrations of these methods are given using data from a household survey of health and nutrition.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200005
    Description: The use of non-probability data sources for statistical purposes and for official statistics has become increasingly popular in recent years. However, statistical inference based on non-probability samples is made more difficult by nature of their biasedness and lack of representativity. In this paper we propose quantile balancing inverse probability weighting estimator (QBIPW) for non-probability samples. We apply the idea of Harms and Duchesne (2006) allowing the use of quantile information in the estimation process to reproduce known totals and the distribution of auxiliary variables. We discuss the estimation of the QBIPW probabilities and its variance. Our simulation study has demonstrated that the proposed estimators are robust against model mis-specification and, as a result, help to reduce bias and mean squared error. Finally, we applied the proposed methods to estimate the share of job vacancies aimed at Ukrainian workers in Poland using an integrated set of administrative and survey data about job vacancies.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200006
    Description: National Statistical Institutes (NSIs) are directing resources into advancing the use of administrative data in official statistics. Administrative data, however, are not developed for the purpose of producing statistics rather as a result of an event or transaction relating to administrative procedures of organizations, public administrations and government agencies. Therefore, it is essential to check the quality of the administrative data with respect to sources of error, particularly representativeness to the target population. In this paper, we utilize the strength of probability-based reference samples or censuses that can be used to detect the lack of representativeness in administrative data and introduce quality indicators based on distance metrics and representativity indicators (R-indicators). We demonstrate their application with a simulation study and discuss a real application applied on a UK Office for National Statistics (ONS) administrative dataset.
    Release date: 2025-12-23

  • Articles and reports: 12-001-X202500200007
    Description: Although probability samples have been regarded as the gold standard to collect information for population-based study, non-probability samples have been used frequently in practice due to low cost, convenience, and the lack of the sampling frame for the survey. Naïve estimates based on non-probability samples without any adjustments may be misleading due to selection bias. Recently, a valid data integration approach that includes mass imputation, propensity score weighting, and calibration has been used to improve the representativeness of non-probability samples. The effectiveness of the mass imputation approach depends on the underlying model assumptions. In this paper, we propose using deep learning for the mass imputation in the combining of probability and non-probability samples and compare it with several modern machine learning-based mass imputation approaches, including generalized additive modeling, regression tree, random forest, and XG-boosting. In the simulation study, deep learning-based approaches have been shown to be more robust and effective than other mass imputation approaches against the failure of underlying model assumptions under non-linearity scenarios.
    Release date: 2025-12-23
Reference (380)

Reference (380) (60 to 70 of 380 results)

  • Surveys and statistical programs – Documentation: 11-522-X201700014741
    Description:

    Statistics Canada’s mandate includes producing statistical data to shed light on current business issues. The linking of business records is an important aspect of the development, production, evaluation and analysis of these statistical data. As record linkage can intrude on one’s privacy, Statistics Canada uses it only when the public good is clear and outweighs the intrusion. Record linkage is experiencing a revival triggered by a greater use of administrative data in many statistical programs. There are many challenges to business record linkage. For example, many administrative files not have common identifiers, information is recorded is in non-standardized formats, information contains typographical errors, administrative data files are usually large in size, and finally the evaluation of multiple record pairings makes absolute comparison impractical and sometimes impossible. Due to the importance and challenges associated with record linkage, Statistics Canada has been developing a record linkage standard to help users optimize their business record linkage process. For example, this process includes building on a record linkage blocking strategy that reduces the amount of record-pairs to compare and match, making use of Statistics Canada’s internal software to conduct deterministic and probabilistic matching, and creating standard business name and address fields on Statistics Canada’s Business Register. This article gives an overview of the business record linkage methodology and looks at various economic projects which use record linkage at Statistics Canada, these include projects in the National Accounts, International Trade, Agriculture and the Business Register.

    Release date: 2016-03-24

  • Surveys and statistical programs – Documentation: 11-522-X201700014747
    Description:

    The Longitudinal Immigration Database (IMDB) combines the Immigrant Landing File (ILF) with annual tax files. This record linkage is performed using a tax filer database. The ILF includes all immigrants who have landed in Canada since 1980. In looking to enhance the IMDB, the possibility of adding temporary residents (TR) and immigrants who landed between 1952 and 1979 (PRE80) was studied. Adding this information would give a more complete picture of the immigrant population living in Canada. To integrate the TR and PRE80 files into the IMDB, record linkages between these two files and the tax filer database, were performed. This exercise was challenging in part due to the presence of duplicates in the files and conflicting links between the different record linkages.

    Release date: 2016-03-24

  • Surveys and statistical programs – Documentation: 11-522-X201700014749
    Description:

    As part of the Tourism Statistics Program redesign, Statistics Canada is developing the National Travel Survey (NTS) to collect travel information from Canadian travellers. This new survey will replace the Travel Survey of Residents of Canada and the Canadian resident component of the International Travel Survey. The NTS will take advantage of Statistics Canada’s common sampling frames and common processing tools while maximizing the use of administrative data. This paper discusses the potential uses of administrative data such as Passport Canada files, Canada Border Service Agency files and Canada Revenue Agency files, to increase the efficiency of the NTS sample design.

    Release date: 2016-03-24

  • Surveys and statistical programs – Documentation: 11-522-X201700014751
    Description:

    Practically all major retailers use scanners to record the information on their transactions with clients (consumers). These data normally include the product code, a brief description, the price and the quantity sold. This is an extremely relevant data source for statistical programs such as Statistics Canada’s Consumer Price Index (CPI), one of Canada’s most important economic indicators. Using scanner data could improve the quality of the CPI by increasing the number of prices used in calculations, expanding geographic coverage and including the quantities sold, among other things, while lowering data collection costs. However, using these data presents many challenges. An examination of scanner data from a first retailer revealed a high rate of change in product identification codes over a one-year period. The effects of these changes pose challenges from a product classification and estimate quality perspective. This article focuses on the issues associated with acquiring, classifying and examining these data to assess their quality for use in the CPI.

    Release date: 2016-03-24

  • Surveys and statistical programs – Documentation: 11-018-X
    Description: Reports on Plans and Priorities (RPP) are individual expenditure plans for each department and agency. These reports provide increased levels of detail over a three-year period on an organization's main priorities by strategic outcome, program and planned/expected results, including links to related resource requirements presented in the Main Estimates. In conjunction with the Main Estimates, Reports on Plans and Priorities serve to inform members of Parliament on planned expenditures of departments and agencies, and support Parliament's consideration of supply bills. The RPPs are typically tabled soon after the Main Estimates by the President of the Treasury Board.
    Release date: 2016-03-07

  • Surveys and statistical programs – Documentation: 89-654-X2016003
    Description:

    This paper describes the process that led to the creation of the new Disability Screening Questions (DSQ), jointly developped by Statistics Canada and Employment and Social Development Canada. The DSQ form a new module which can be put on general population surveys to allow comparisons of persons with and without a disability. The paper explains why there are two versions of the DSQ—a long and a short one—, the difference between the two, and how each version can be used.

    Release date: 2016-02-29

  • Surveys and statistical programs – Documentation: 75F0002M2015003
    Description:

    This note discusses revised income estimates from the Survey of Labour and Income Dynamics (SLID). These revisions to the SLID estimates make it possible to compare results from the Canadian Income Survey (CIS) to earlier years. The revisions address the issue of methodology differences between SLID and CIS.

    Release date: 2015-12-17

  • Surveys and statistical programs – Documentation: 13-605-X201500414166
    Description:

    Estimates of the underground economy by province and territory for the period 2007 to 2012 are now available for the first time. The objective of this technical note is to explain how the methodology employed to derive upper-bound estimates of the underground economy for the provinces and territories differs from that used to derive national estimates.

    Release date: 2015-04-29

  • Surveys and statistical programs – Documentation: 75-005-M2015001
    Description:

    Using the experimental Workplace Survey conducted in 2011, this technical document summarizes the main results and evaluates the quality of the data.

    Release date: 2015-04-28

  • Notices and consultations: 12-002-X
    Description:

    The Research Data Centres (RDCs) Information and Technical Bulletin (ITB) is a forum by which Statistics Canada analysts and the research community can inform each other on survey data uses and methodological techniques. Articles in the ITB focus on data analysis and modelling, data management, and best or ineffective statistical, computational, and scientific practices. Further, ITB topics will include essays on data content, implications of questionnaire wording, comparisons of datasets, reviews on methodologies and their application, data peculiarities, problematic data and solutions, and explanations of innovative tools using RDC surveys and relevant software. All of these essays may provide advice and detailed examples outlining commands, habits, tricks and strategies used to make problem-solving easier for the RDC user.

    The main aims of the ITB are:

    - the advancement and dissemination of knowledge surrounding Statistics Canada's data; - the exchange of ideas among the RDC-user community;- the support of new users; - the co-operation with subject matter experts and divisions within Statistics Canada.

    The ITB is interested in quality articles that are worth publicizing throughout the research community, and that will add value to the quality of research produced at Statistics Canada's RDCs.

    Release date: 2015-03-25