Statistical methods

Results

All (2,481) (0 to 10 of 2,481 results)

  • Surveys and statistical programs – Documentation: 19-20-00012026003
    Description: This article provides nontechnical answers to questions related to the production, use and interpretation of advance indicators for Statistics Canada’s Monthly Survey of Manufacturing, Monthly Wholesale Trade Survey and Monthly Retail Trade Survey.
    Release date: 2026-06-16

  • Surveys and statistical programs – Documentation: 19-20-0001
    Description: Documents in this series provide insight into the statistical methods used by Statistics Canada to produce official statistics. They include introductory material, in-depth descriptions of techniques and methods, best practices, and guidelines. All documents have undergone review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted methodological standards and practices.
    Release date: 2026-06-16

  • Surveys and statistical programs – Documentation: 19-20-00012026002
    Description: This reference document provides answers on selected topics related to the use, interpretation, and calculation of trend-cycle estimates for seasonally adjusted data. It is designed to complement more technical discussions of seasonal adjustment and trend-cycle estimation found in Statistics Canada publications and reference manuals.
    Release date: 2026-06-08

  • Articles and reports: 36-28-0001202600500003
    Description: This spotlight article outlines practical methods for assessing the economic impacts of public programs delivered by federal agencies and Crown corporations. It summarizes key steps in conducting quantitative impact analysis, including data linkage, cohort construction and implementation of quasi-causal estimators.
    Release date: 2026-05-27

  • Journals and periodicals: 11-633-X
    Description: Papers in this series provide background discussions of the methods used to develop data for economic, health, and social analytical studies at Statistics Canada. They are intended to provide readers with information on the statistical methods, standards and definitions used to develop databases for research purposes. All papers in this series have undergone peer and institutional review to ensure that they conform to Statistics Canada's mandate and adhere to generally accepted standards of good professional practice.
    Release date: 2026-05-27

  • Journals and periodicals: 75F0002M
    Description: This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.
    Release date: 2026-05-20

  • Surveys and statistical programs – Documentation: 19-20-00012026001
    Description: This reference document provides nontechnical answers on selected topics related to the use and interpretation of seasonally adjusted data. It is designed to complement more technical discussions of seasonal adjustment found in Statistics Canada publications and reference manuals.
    Release date: 2026-05-11

  • Notices and consultations: 13-605-X
    Description: This product contains articles related to the latest methodological and conceptual developments in the Canadian System of Macroeconomic Accounts, as well as analysis of the Canadian economy. It includes articles detailing new methods, concepts and statistical techniques used to compile the Canadian System of Macroeconomic Accounts. It also includes information related to new or expanded data products and provides updates and supplements to information found in various guides and analytical articles on a broad range of topics related to the Canadian economy.
    Release date: 2026-05-04

  • Surveys and statistical programs – Documentation: 11-633-X2026002
    Description: Recent changes in Canada’s immigration levels have heightened interest in understanding how immigration affects housing demand. This article develops a methodological framework for projecting housing use associated with permanent residents (PRs) and non-permanent residents (NPRs) under alternative immigration scenarios. The framework applies observed per capita housing use rates from the Census of Population to estimate incremental housing use by tenure over time.
    Release date: 2026-04-24

  • Surveys and statistical programs – Documentation: 11-633-X2026001
    Description: This report defines key concepts related to area-level analysis and introduces area-level measures developed and utilized at Statistics Canada for health analysis. It also provides a decision-making framework and practical recommendations to help researchers select appropriate methods. The goal is to guide readers on when area-level analysis is appropriate and what type of area-level measure is suitable to achieve research objectives.
    Release date: 2026-03-05

Data (10) (10 results)

  • Public use microdata: 89F0002X
    Description: The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It comprises a database, a series of tax/transfer algorithms and models, analytical software and user documentation.
    Release date: 2026-02-12

  • Profile of a community or region: 46-26-0002
    Description: The National Address Register (NAR) is a list of commercial and residential addresses in Canada that are extracted from Statistics Canada's Building Register and deemed non-confidential.
    Release date: 2025-12-19

  • Table: 89-26-0006
    Description: PASSAGES is an open-source dynamic microsimulation model aimed at supporting policy analysis and research relating to Canadian retirement income system outcomes at the individual and family level. The publicly available version includes a synthetic starting database, a model, and documentation. A confidential starting database is also available.
    Release date: 2025-03-12

  • Data Visualization: 71-607-X2020010
    Description: The Canadian Statistical Geospatial Explorer empowers users to discover geo-enabled data holdings of Statistics Canada at various levels of geography, including the neighbourhood level. Users are able to visualize, thematically map, spatially explore and analyze, export and consume data in various formats. Users can also view the data superimposed on satellite imagery, topographic and street layers.
    Release date: 2024-08-21

  • Table: 11-10-0074-01
    Geography: Census tract
    Frequency: Occasional
    Description:

    The divergence index (D-index) describes the degree to which families with different income levels are mixing together in neighbourhoods. It compares neighbourhood (census tract, CT) discrete income distributions to a base distribution, which is the income quintiles of the neighbourhood’s census metropolitan area (CMA).

    Release date: 2020-06-22

  • Data Visualization: 71-607-X2019010
    Description: The Housing Data Viewer is a visualization tool that allows users to explore Statistics Canada data on a map. Users can use the tool to navigate, compare and export data.
    Release date: 2019-10-30

  • Table: 53-500-X
    Description:

    This report presents the results of a pilot survey conducted by Statistics Canada to measure the fuel consumption of on-road motor vehicles registered in Canada. This study was carried out in connection with the Canadian Vehicle Survey (CVS) which collects information on road activity such as distance traveled, number of passengers and trip purpose.

    Release date: 2004-10-21

  • Table: 13-220-X
    Description: In the 1997 edition, new and revised benchmarks were introduced for 1992 and 1988. The indicators are used to monitor supply, demand and employment for tourism in Canada on a timely basis. The annual tables are derived using the National Income and Expenditure Accounts (NIEA) and various industry and travel surveys. Tables providing actual data and percentage changes for seasonally adjusted current and constant price estimates are included. In addition, an analytical section provides graphs and time series of first differences, percentage changes, and seasonal factors for selected indicators. Data are published from 1987 and the publication will be available on the day of release. New data are included in the demand tables for non-tourism commodities produced by non-tourism industries and in the employment tables covering direct tourism employment generated by non-tourism industries. This product was commissioned by the Canadian Tourism Commission to provide annual updates for the Tourism Satellite Account.
    Release date: 2003-01-08

  • Table: 11-516-X
    Description:

    The second edition of Historical statistics of Canada was jointly produced by the Social Science Federation of Canada and Statistics Canada in 1983. This volume contains about 1,088 statistical tables on the social, economic and institutional conditions of Canada from the start of Confederation in 1867 to the mid-1970s. The tables are arranged in sections with an introduction explaining the content of each section, the principal sources of data for each table, and general explanatory notes regarding the statistics. In most cases, there is sufficient description of the individual series to enable the reader to use them without consulting the numerous basic sources referenced in the publication.

    The electronic version of this historical publication is accessible on the Internet site of Statistics Canada as a free downloadable document: text as HTML pages and all tables as individual spreadsheets in a comma delimited format (CSV) (which allows online viewing or downloading).

    Release date: 1999-07-29

  • Table: 82-567-X
    Description:

    The National Population Health Survey (NPHS) is designed to enhance the understanding of the processes affecting health. The survey collects cross-sectional as well as longitudinal data. In 1994/95 the survey interviewed a panel of 17,276 individuals, then returned to interview them a second time in 1996/97. The response rate for these individuals was 96% in 1996/97. Data collection from the panel will continue for up to two decades. For cross-sectional purposes, data were collected for a total of 81,000 household residents in all provinces (except people on Indian reserves or on Canadian Forces bases) in 1996/97.

    This overview illustrates the variety of information available by presenting data on perceived health, chronic conditions, injuries, repetitive strains, depression, smoking, alcohol consumption, physical activity, consultations with medical professionals, use of medications and use of alternative medicine.

    Release date: 1998-07-29
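
    The divergence index described under Table 11-10-0074-01 above compares a census tract's income distribution with the CMA income quintiles. The entry does not give the formula, so the following is only a minimal sketch that assumes a Kullback-Leibler style divergence with natural logarithms; the function name and the example shares are hypothetical and are not taken from the table.

```python
import math


def divergence_index(local_shares, base_shares):
    """Kullback-Leibler divergence of a tract's income-group shares
    from a base distribution (assumed form; not confirmed by the source)."""
    if len(local_shares) != len(base_shares):
        raise ValueError("distributions must have the same number of groups")
    total = 0.0
    for p, q in zip(local_shares, base_shares):
        # Groups with a zero local share contribute nothing to the sum.
        if p > 0:
            total += p * math.log(p / q)
    return total


# Base distribution: CMA income quintiles, so each group holds 20% of families.
cma_quintiles = [0.2, 0.2, 0.2, 0.2, 0.2]

# Hypothetical tracts: one mirrors the CMA, one is skewed toward high incomes.
mixed_tract = [0.2, 0.2, 0.2, 0.2, 0.2]
skewed_tract = [0.05, 0.10, 0.15, 0.25, 0.45]

mixed_value = divergence_index(mixed_tract, cma_quintiles)
skewed_value = divergence_index(skewed_tract, cma_quintiles)
```

    Under this assumed form, a tract whose distribution matches the CMA quintiles scores zero, and the value grows as the tract's income mix departs from the base.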

Analysis (2,037) (60 to 70 of 2,037 results)

  • Journals and periodicals: 11-522-X
    Description: Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987.
    Release date: 2025-09-08

  • Articles and reports: 12-001-X202500100001
    Description: Geoffrey J.C. Hole (or Geoff, as he likes to be called) was born on January 24, 1940 at Shardeloes, Amersham, Buckinghamshire, England, to Charles William Hole and Sybil Winifred Hole, formerly Morge. He completed a BSc Honours in Mathematics in 1961, and a Postgraduate Diploma in Statistics at Manchester University the following year. He started his career as a mathematical statistician in London, England, working successively for the National Coal Board (1962-63), the Central Electricity Generating Board (1963-66), and the Electricity Council (1966-67), where his title was Economist. He moved to Canada in 1967 to join the Dominion Bureau of Statistics (DBS) as a survey methodologist. In 1971-72, he was Chief of Census Operations, Methodology and Quality Control Section, and Assistant Coordinator, Socio-Economic Survey Methods Section. He then took a one-year leave of absence to complete an MSc (Econ) in Statistics at the London School of Economics. In 1973, Geoff returned to the DBS, which had become Statistics Canada, as Chief, Methodology Group V, Business Survey Methods Division. In 1974, he was appointed Director, Institutions and Agriculture Survey Methods Division, and, as of 1986, Director, Business Survey Methods Division. His career culminated when he became Director, Social Survey Methods Division, in 1987. He held that position until his retirement, on September 29, 2004. In addition to his long-term involvement at Statistics Canada, including as a member of the Editorial Board of Survey Methodology between 1983 and 1987, Geoff was very active in the Statistical Society of Canada (SSC), serving among others as Chair of the Program Committee for the 1986 Annual Meeting at the Banff Centre, in Alberta, and President of the SSC in 1989-90. 
    He was also Program Chair for a joint conference of the International Association of Survey Statisticians and the International Association for Official Statistics, which was held in Aguascalientes, Mexico, in 1998.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100002
    Description: Ivan Fellegi is an expert in statistical science and a public servant who was the Chief Statistician of Canada from 1985 to 2008. This article briefly recounts his early life, long career and influential research contributions. It includes an interview conducted in February 2017 to mark Ivan Fellegi’s 60th year of service at Statistics Canada.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100003
    Description: In recent years, there has been a significant interest in machine learning in national statistical offices. Thanks to their flexibility, these methods may prove useful at the nonresponse treatment stage. In this article, we conduct an empirical investigation in order to compare several machine learning procedures in terms of bias and efficiency. In addition to the classical machine learning procedures, we assess the performance of ensemble approaches that make use of different machine learning procedures to produce a set of weights adjusted for nonresponse.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100004
    Description: Survey data collection often is plagued by unit and item nonresponse. To reduce reliance on strong assumptions about the missingness mechanisms, statisticians can use information about population marginal distributions known, for example, from censuses or administrative databases. One approach that does so is the Missing Data with Auxiliary Margins, or MD-AM, framework, which uses multiple imputation for both unit and item nonresponse so that survey-weighted estimates accord with the known marginal distributions. However, this framework relies on specifying and estimating a joint distribution for the survey data and nonresponse indicators, which can be computationally and practically daunting in data with many variables of mixed types. We propose two adaptations to the MD-AM framework to simplify the imputation task. First, rather than specifying a joint model for unit respondents’ data, we use random hot deck imputation while still leveraging the known marginal distributions. Second, instead of sampling from conditional distributions implied by the joint model for the missing data due to item nonresponse, we apply multiple imputation by chained equations for item nonresponse before imputation for unit nonresponse. Using simulation studies with nonignorable missingness mechanisms, we demonstrate that the proposed approach can provide more accurate point and interval estimates than models that do not leverage the auxiliary information. We illustrate the approach using data on voter turnout from the U.S. Current Population Survey.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100005
    Description: In this paper, we derive a second-order unbiased (or nearly unbiased) mean squared prediction error (MSPE) estimator of the empirical best linear unbiased predictor (EBLUP) of a small area mean for a semi-parametric extension to the well-known Fay-Herriot model. Specifically, we derive our MSPE estimator essentially assuming certain moment conditions on both the sampling errors and random effects distributions. The normality-based Prasad-Rao MSPE estimator has a surprising robustness property in that it remains second-order unbiased under the non-normality of random effects when a simple Prasad-Rao method-of-moments estimator is used for the variance component and the sampling error distribution is normal. We show that the normality-based MSPE estimator is no longer second-order unbiased when the sampling error distribution has non-zero kurtosis or when the Fay-Herriot moment method is used to estimate the variance component, even when the sampling error distribution is normal. Interestingly, when the simple method-of-moments estimator is used for the variance component, our proposed MSPE estimator does not require the estimation of kurtosis of the random effects. Results of a simulation study on the accuracy of the proposed MSPE estimator, under non-normality of both sampling and random effects distributions, are also presented.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100006
    Description: Survey practitioners have increasingly embraced the benefits of modern machine learning techniques, including classification and regression tree algorithms, in the development of nonresponse adjustments. These methods, which do not require a predefined functional relationship between outcomes and predictors, offer a practical means of conducting variable selection and deriving interpretable structures that link response propensity with explanatory variables. However, when applying these algorithms to survey data, it is common to overlook crucial factors like sampling weights, as well as sample design features such as stratification and clustering. To bridge this shortcoming, we propose an extension of the Chi-square Automatic Interaction Detector (CHAID) approach, and we describe the design-based asymptotic properties of the resulting “survey CHAID” (sCHAID) method. To facilitate the practical use of sCHAID, we incorporate a Rao-Scott correction into the splitting criterion, accounting for the survey design. Using data from the U.S. American Community Survey, we illustrate the use of the method and evaluate its performance through comparisons with existing weighted and unweighted algorithms.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100007
    Description: We introduce a novel approach to model-assisted calibration estimation in survey sampling using generalized entropy. The method builds upon recent work by Kwon, Kim and Qiu (2024) and extends it to a model-assisted framework. Unlike traditional calibration techniques, this approach employs a generalized entropy function as the objective for optimization and incorporates a debiasing calibration constraint to ensure design consistency. The proposed estimator is shown to be asymptotically equivalent to an augmented generalized regression (GREG) estimator. It allows for unequal model variance, potentially improving efficiency when the sampling design is informative. The paper presents both design-based and model-based justifications for the method, along with asymptotic properties and variance estimation techniques. Computational aspects are discussed, including an unconstrained optimization approach that facilitates implementation, especially for high-dimensional auxiliary variables. The method’s performance is evaluated through a simulation study, demonstrating its effectiveness in improving estimation efficiency, particularly when the sampling design is informative.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100008
    Description: Tightened budgets, the continuing decrease of response rates in traditional probability surveys and increasing pressure from users for more timely data have stimulated research on the use of nonprobability sample data, such as administrative records, web scraping, mobile phone data and voluntary internet surveys, for inference on finite population parameters like means and totals. These data are often easier, faster and cheaper to collect than traditional probability samples. However, a major concern with the use of this kind of data for official statistics is their nonrepresentativeness due to possible selection bias, which, if not accounted for properly, could bias the inference. In this article, we review and discuss methods considered in the literature to deal with this problem and propose new methods, distinguishing between methods based on integration of the nonprobability sample with an appropriate probability sample, and methods that base the inference solely on the nonprobability sample. Empirical illustrations based on simulated data are provided.
    Release date: 2025-06-30

  • Articles and reports: 12-001-X202500100009
    Description: BigData users and the BigData research community are expanding rapidly, while statisticians at large are seemingly becoming divided between those who are enthusiastic and those who are concerned, if not downright hostile. Is BigData also a big step ahead, truly advancing our ability to extract meaningful information and actual knowledge from data? Is BigData underplaying traditional statistical inference as we know it, supplanting survey methodology as a low-cost futuristic option? In this paper I will attempt to unravel the multifaceted relationship bridging BigData to sampling methodology. Starting by reasoning why it should be interesting to look at BigData from a sampling statistician’s perspective, I will delve deeper into the somewhat ambiguous definition of BigData and share some very personal considerations and views on the matter. In the process, several open questions will arise while discussing a personal selection of insights that are traceable through the vast body of statistical literature around BigData and sampling methodology. The discussion will take various angles explored across nine key points, and it will conclude with a forward-looking perspective on a main challenge for future research: addressing the strong assumptions needed to manage deviations from purely randomized data collection.
    Release date: 2025-06-30

Reference (382) (270 to 280 of 382 results)

  • Surveys and statistical programs – Documentation: 11-522-X19990015680
    Description:

    To augment the amount of available information, data from different sources are increasingly being combined. These databases are often combined using record linkage methods. When there is no unique identifier, a probabilistic linkage is used. In that case, a record on a first file is associated with a probability that it is linked to a record on a second file, and then a decision is taken on whether a possible link is a true link or not. This usually requires a non-negligible amount of manual resolution. It might then be legitimate to evaluate if manual resolution can be reduced or even eliminated. This issue is addressed in this paper where one tries to produce an estimate of a total (or a mean) of one population, when using a sample selected from another population linked somehow to the first population. In other words, having two populations linked through probabilistic record linkage, we try to avoid any decision concerning the validity of links and still be able to produce an unbiased estimate for a total of one of the two populations. To achieve this goal, we suggest the use of the Generalised Weight Share Method (GWSM) described by Lavallée (1995).

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015682
    Description:

    The application of dual system estimation (DSE) to matched Census / Post Enumeration Survey (PES) data in order to measure net undercount is well understood (Hogan, 1993). However, this approach has so far not been used to measure net undercount in the UK. The 2001 PES in the UK will use this methodology. This paper presents the general approach to design and estimation for this PES (the 2001 Census Coverage Survey). The estimation combines DSE with standard ratio and regression estimation. A simulation study using census data from the 1991 Census of England and Wales demonstrates that the ratio model is in general more robust than the regression model.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015684
    Description:

    Often, the same information is gathered almost simultaneously for several different surveys. In France, this practice is institutionalized for household surveys that have a common set of demographic variables, i.e., employment, residence and income. These variables are important co-factors for the variables of interest in each survey, and if used carefully, can reinforce the estimates derived from each survey. Techniques for calibrating uncertain data can apply naturally in this context. This involves finding the best unbiased estimator in common variables and calibrating each survey based on that estimator. The estimator thus obtained in each survey is always a linear estimator, the weightings of which can be easily explained and the variance can be obtained with no new problems, as can the variance estimate. To supplement the list of regression estimators, this technique can also be seen as a ridge-regression estimator, or as a Bayesian-regression estimator.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015686
    Description:

    The U.S. Consumer Expenditure Survey uses two instruments, a diary and an in-person interview, to collect data on many categories of consumer expenditures. Consequently, it is important to use these data efficiently to estimate mean expenditures and related parameters. Three options are: (1) use only data from the diary source; (2) use only data from the interview source; and (3) use generalized least squares, or related methods, to combine the diary and interview data. Historically, the U.S. Bureau of Labor Statistics has focused on options (1) and (2) for estimation at the five or six-digit Universal Classification Code level. Evaluation and possible implementation of option (3) depends on several factors, including possible measurement biases in the diary and interview data; the empirical magnitude of these biases, relative to the standard errors of customary mean estimators; and the degree of homogeneity of these biases across strata and periods. This paper reviews some issues related to options (1) through (3); describes a relatively simple generalized least squares method for implementation of option (3); and discusses the need for diagnostics to evaluate the feasibility and relative efficiency of the generalized least squares method.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015688
    Description:

    The geographical and temporal relationship between outdoor air pollution and asthma was examined by linking together data from multiple sources. These included the administrative records of 59 general practices widely dispersed across England and Wales for half a million patients and all their consultations for asthma, supplemented by a socio-economic interview survey. Postcode enabled linkage with: (i) computed local road density; (ii) emission estimates of sulphur dioxide and nitrogen dioxide; and (iii) measured/interpolated concentration of black smoke, sulphur dioxide, nitrogen dioxide and other pollutants at practice level. Parallel Poisson time series analysis took into account between-practice variations to examine daily correlations in practices close to air quality monitoring stations. Preliminary analyses show small and generally non-significant geographical associations between consultation rates and pollution markers. The methodological issues relevant to combining such data, and the interpretation of these results, will be discussed.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015690
    Description:

    The artificial sample was generated in two steps. The first step, based on a master panel, was a Multiple Correspondence Analysis (MCA) carried out on basic variables. Then, "dummy" individuals were generated randomly using the distribution of each "significant" factor in the analysis. Finally, for each individual, a value was generated for each basic variable most closely linked to one of the previous factors. This method ensured that sets of variables were drawn independently. The second step consisted of grafting other databases, based on certain property requirements. A variable was generated to be added on the basis of its estimated distribution, using a generalized linear model for common variables and those already added. The same procedure was then used to graft the other samples. This method was applied to the generation of an artificial sample taken from two surveys. The artificial sample that was generated was validated using sample comparison testing. The results were positive, demonstrating the feasibility of this method.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015692
    Description:

    Electricity rates that vary by time-of-day have the potential to significantly increase economic efficiency in the energy market. A number of utilities have undertaken economic studies of time-of-use rate schemes for their residential customers. This paper uses meta-analysis to examine the impact of time-of-use rates on electricity demand, pooling the results of thirty-eight separate programs. There are four key findings. First, very large peak to off-peak price ratios are needed to significantly affect peak demand. Second, summer peak rates are relatively effective compared to winter peak rates. Third, permanent time-of-use rates are relatively effective compared to experimental ones. Fourth, demand charges rival ordinary time-of-use rates in terms of impact.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 11-522-X19990015694
    Description:

    We use data on 14 populations of coho salmon to estimate critical parameters that are vital for management of fish populations. Parameter estimates from individual data sets are inefficient and can be highly biased, and we investigate methods to overcome these problems. Combination of data sets using nonlinear mixed effects models provides more useful results; however, questions of influence and robustness are raised. For comparison, robust estimates are obtained. Model-robustness is also explored using a family of alternative functional forms. Our results allow ready calculation of the limits of exploitation and may help to prevent extinction of fish stocks. Similar methods can be applied in other contexts where parameter estimation is part of a larger decision-making process.

    Release date: 2000-03-02

  • Surveys and statistical programs – Documentation: 21-601-M1999042
    Description:

    This paper reconstructs the development and evolution of the Canadian agricultural statistical system. It describes the expanding and increasingly important role of administrative data, which is integrated into survey and census information in order to complement, supplement or replace survey information or to assist with frame maintenance.

    Release date: 2000-01-14

  • Surveys and statistical programs – Documentation: 21-601-M1998034
    Description:

    This paper describes the experiences, the issues and the expectations of the many different players involved in the implementation of document imaging for the Canadian Census of Agriculture.

    Release date: 2000-01-13
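
    The dual system estimation (DSE) mentioned under 11-522-X19990015682 above can be illustrated with the basic capture-recapture estimator. This is only a sketch of the textbook form: the paper combines DSE with ratio and regression estimation, which is not shown here, and the function names and counts below are hypothetical.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Basic dual system (capture-recapture) population estimate.

    census_count  -- people counted in the census for an area
    survey_count  -- people counted in the post-enumeration survey
    matched_count -- people found in both sources
    Assumes the two sources are independent (textbook simplification).
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    return census_count * survey_count / matched_count


def net_undercount_rate(census_count, estimated_population):
    """Share of the estimated population missed by the census."""
    return (estimated_population - census_count) / estimated_population


# Hypothetical area: 900 census records, 100 survey records, 90 matched.
estimate = dual_system_estimate(900, 100, 90)
undercount = net_undercount_rate(900, estimate)
```

    With these made-up counts, the survey finds 90% of its respondents in the census, so the census is taken to cover about 90% of the population.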