Statistical methods

Results

All (2,008) (0 to 10 of 2,008 results)

  • Surveys and statistical programs – Documentation: 75-005-M2019001
    Description:

    The production of statistics from the Labour Force Survey (LFS) involves many activities, one of which is data processing. This step involves the verification and correction of survey data when required in order to produce microdata files. Beginning in January 2019, LFS processing will be transitioned to a new system, the Social Survey Processing Environment. This document describes the development and testing that preceded the implementation of the new system, and demonstrates that the transition is expected to have minimal impact on LFS estimates and be transparent to users of LFS data.

    Release date: 2019-02-08

  • Journals and periodicals: 75F0002M
    Geography: Canada
    Description:

    This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.

    Release date: 2019-01-29

  • Notices and consultations: 13-605-X
    Description:

    This product contains articles related to the latest methodological and conceptual developments in the Canadian System of Macroeconomic Accounts as well as the analysis of the Canadian economy. It includes articles detailing new methods, concepts and statistical techniques used to compile the Canadian System of Macroeconomic Accounts. It also includes information related to new or expanded data products and provides updates and supplements to information found in various guides and analytical articles touching upon a broad range of topics related to the Canadian economy.

    Release date: 2019-01-25

  • Articles and reports: 12-001-X201800254952
    Description:

    Panel surveys are frequently used to measure the evolution of parameters over time. Panel samples may suffer from different types of unit non-response, which is currently handled by estimating the response probabilities and by reweighting respondents. In this work, we consider estimation and variance estimation under unit non-response for panel surveys. Extending the work by Kim and Kim (2007) to several time points, we consider a propensity score adjusted estimator accounting for initial non-response and attrition, and propose a suitable variance estimator. It is then extended to cover most estimators encountered in surveys, including calibrated estimators, complex parameters and longitudinal estimators. The properties of the proposed variance estimator and of a simplified variance estimator are evaluated through a simulation study. An illustration of the proposed methods on data from the ELFE survey is also presented.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254953
    Description:

    Sample coordination seeks to create a probabilistic dependence between the selection of two or more samples drawn from the same population or from overlapping populations. Positive coordination increases the expected sample overlap, while negative coordination decreases it. There are numerous applications for sample coordination with varying objectives. A spatially balanced sample is a sample that is well-spread in some space. Forcing a spread within the selected samples is a general and very efficient variance reduction technique for the Horvitz-Thompson estimator. The local pivotal method and the spatially correlated Poisson sampling are two general schemes for achieving well-spread samples. We aim to introduce coordination for these sampling methods based on the concept of permanent random numbers. The goal is to coordinate such samples while preserving spatial balance. The proposed methods are motivated by examples from forestry, environmental studies, and official statistics.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254954
    Description:

    In recent years, balanced sampling techniques have seen renewed interest. They constrain the Horvitz-Thompson estimators of the totals of auxiliary variables to be equal, at least approximately, to the corresponding true totals, to avoid the occurrence of bad samples. Several procedures are available to carry out balanced sampling: the cube method (see Deville and Tillé, 2004) and an alternative, the rejective algorithm introduced by Hájek (1964). After a brief review of these sampling methods, motivated by the planning of an angler survey, we use Monte Carlo simulations to investigate the survey designs produced by these two sampling algorithms.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254955
    Description:

    Many studies conducted by various electric utilities around the world are based on the analysis of mean electricity consumption curves for various subpopulations, particularly geographic in nature. Those mean curves are estimated from samples of thousands of curves measured at very short intervals over long periods. Estimation for small subpopulations, also called small domains, is a very timely topic in sampling theory.

    In this article, we will examine this problem based on functional data and we will try to estimate the mean curves for small domains. For this, we propose four methods: functional linear regression; modelling the scores of a principal component analysis by unit-level linear mixed models; and two non-parametric estimators, with one based on regression trees and the other on random forests, adapted to the curves. All these methods have been tested and compared using real electricity consumption data for households in France.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254956
    Description:

    In Italy, the Labor Force Survey (LFS) is conducted quarterly by the National Statistical Institute (ISTAT) to produce estimates of the labor force status of the population at different geographical levels. In particular, ISTAT provides LFS estimates of employed and unemployed counts for local Labor Market Areas (LMAs). LMAs are 611 sub-regional clusters of municipalities and are unplanned domains for which direct estimates have overly large sampling errors. This implies the need for Small Area Estimation (SAE) methods. In this paper, we develop a new area level SAE method that uses a Latent Markov Model (LMM) as the linking model. In LMMs, the characteristic of interest, and its evolution in time, is represented by a latent process that follows a Markov chain, usually of first order. Therefore, areas are allowed to change their latent state across time. The proposed model is applied to quarterly data from the LFS for the period 2004 to 2014 and fitted within a hierarchical Bayesian framework using a data augmentation Gibbs sampler. Estimates are compared with those obtained by the classical Fay-Herriot model, by a time-series area level SAE model, and on the basis of data coming from the 2011 Population Census.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254957
    Description:

    When a linear imputation method is used to correct non-response based on certain assumptions, total variance can be assigned to non-responding units. Linear imputation is not as limited as it seems, given that the most common methods – ratio, donor, mean and auxiliary value imputation – are all linear imputation methods. We will discuss the inference framework and the unit-level decomposition of variance due to non-response. Simulation results will also be presented. This decomposition can be used to prioritize non-response follow-up or manual corrections, or simply to guide data analysis.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254958
    Description:

    Domains (or subpopulations) with small sample sizes are called small areas. Traditional direct estimators for small areas do not provide adequate precision because the area-specific sample sizes are small. On the other hand, demand for reliable small area statistics has greatly increased. Model-based indirect estimators of small area means or totals are currently used to address difficulties with direct estimation. These estimators are based on linking models that borrow information across areas to increase the efficiency. In particular, empirical best (EB) estimators under area level and unit level linear regression models with random small area effects have received a lot of attention in the literature. Model mean squared error (MSE) of EB estimators is often used to measure the variability of the estimators. Linearization-based estimators of model MSE as well as jackknife and bootstrap estimators are widely used. On the other hand, National Statistical Agencies are often interested in estimating the design MSE of EB estimators in line with traditional design MSE estimators associated with direct estimators for large areas with adequate sample sizes. Estimators of design MSE of EB estimators can be obtained for area level models but they tend to be unstable when the area sample size is small. Composite MSE estimators are proposed in this paper and they are obtained by taking a weighted sum of the design MSE estimator and the model MSE estimator. Properties of the MSE estimators under the area level model are studied in terms of design bias, relative root mean squared error and coverage rate of confidence intervals. The case of a unit level model is also examined under simple random sampling within each area. Results of a simulation study show that the proposed composite MSE estimators provide a good compromise in estimating the design MSE.

    Release date: 2018-12-20
Data (28) (0 to 10 of 28 results)

  • Public use microdata: 89F0002X
    Description:

    The Social Policy Simulation Database and Model (SPSD/M) is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It comprises a database, a series of tax/transfer algorithms and models, analytical software and user documentation.

    Release date: 2018-12-18

  • Data Visualization: 11-627-M2016005
    Description:

    This infographic presents a new interactive data visualization application on domestic regional trade flows in Canada for goods moved by truck and rail, 2004 to 2012. Through chord diagrams, users can look at the interconnectedness of different regions in Canada via their trade ties. They can also use interactive maps to get a picture of geographic trends in trade.

    Release date: 2016-09-22

  • Table: 53-500-X
    Description:

    This report presents the results of a pilot survey conducted by Statistics Canada to measure the fuel consumption of on-road motor vehicles registered in Canada. This study was carried out in connection with the Canadian Vehicle Survey (CVS) which collects information on road activity such as distance traveled, number of passengers and trip purpose.

    Release date: 2004-10-21

  • Table: 13-220-X
    Description:

    In the 1997 edition, new and revised benchmarks were introduced for 1992 and 1988. The indicators are used to monitor supply, demand and employment for tourism in Canada on a timely basis. The annual tables are derived using the National Income and Expenditure Accounts (NIEA) and various industry and travel surveys. Tables providing actual data and percentage changes for seasonally adjusted current and constant price estimates are included. In addition, an analytical section provides graphs and time series of first differences, percentage changes, and seasonal factors for selected indicators. Data are published from 1987 and the publication will be available on the day of release. New data are included in the demand tables for non-tourism commodities produced by non-tourism industries and in the employment tables covering direct tourism employment generated by non-tourism industries. This product was commissioned by the Canadian Tourism Commission to provide annual updates for the Tourism Satellite Account.

    Release date: 2003-01-08

  • Table: 11-516-X198300111298
    Description:

    The statistics in this section are mainly from two sources. Series A1-349 are from censuses, or derived from censuses, published by Statistics Canada or its predecessors. Series A350-416 are from the official records of the Department of Employment and Immigration or its predecessors.

    Release date: 1999-07-29

  • Table: 11-516-X198300111299
    Description:

    Statistics in the tables of Section B are in two divisions. Series B1-81 contain data on vital statistics and series B82-543 on health. Data on social welfare, formerly contained in this section, are presented separately in Section C.

    Release date: 1999-07-29

  • Table: 11-516-X198300111300
    Description:

    The statistics in this section are in six main divisions: federal income security programs (series C1-195); federal and provincial income insurance programs (series C196-286); cost-shared federal-provincial income security programs (series C287-442); federal and provincial social service programs (series C443-507); provincial-municipal income security programs (series C508-559); government expenditures on social security by broad program areas (series C560-599).

    Release date: 1999-07-29

  • Table: 11-516-X198300111301
    Description:

    This section provides series relating to the labour force, employment, unemployment and job vacancies. For the most part, the series are obtained from publications of Statistics Canada, formerly the Dominion Bureau of Statistics. Some of the older series are directly from census tabulations while others are derived from such tabulations but incorporate adjustments to improve the consistency of the series through time. Many of the series of more recent vintage are derived from the Labour Force Survey. Also included are series from the Statistics Canada Employment Survey, the Statistics Canada Job Vacancy Survey, the set of Help-Wanted Indexes developed in the Department of Finance and taken over subsequently by Statistics Canada, and a few other series.

    Release date: 1999-07-29

  • Table: 11-516-X198300111302
    Description:

    The statistics of this section are in eight parts as follows: labour income (series E1-40); employment, earnings and hours of work (series E41-135); employer labour cost (series E136-151); unemployment insurance (series E152-171); employment service (series E172-174); labour unions and strikes and lockouts (series E175-197); index numbers of wage rates, wage rates and salaries (series E198-375); workmen's compensation (series E376-389).

    Release date: 1999-07-29

  • Table: 11-516-X198300111303
    Description:

    The statistical data of this section are in five subsections. They contain data on national income and expenditure and related aggregates from 1926 to 1976 in series F1-152; on income produced, by industry, from 1919 to 1926 and on gross capital formation from 1901 to 1930 in series F153-182; on the stock of tangible capital from 1926 onwards in series F183-220 and on inventory book values in series F221-224; on real gross domestic product by industry in series F225-240; and on indexes of labour productivity in series F241-294.

    Release date: 1999-07-29
Analysis (1,573) (0 to 10 of 1,573 results)

  • Journals and periodicals: 75F0002M
    Geography: Canada
    Description:

    This series provides detailed documentation on income developments, including survey design issues, data quality evaluation and exploratory research.

    Release date: 2019-01-29

  • Articles and reports: 12-001-X201800254952
    Description:

    Panel surveys are frequently used to measure the evolution of parameters over time. Panel samples may suffer from different types of unit non-response, which is currently handled by estimating the response probabilities and by reweighting respondents. In this work, we consider estimation and variance estimation under unit non-response for panel surveys. Extending the work by Kim and Kim (2007) to several time points, we consider a propensity score adjusted estimator accounting for initial non-response and attrition, and propose a suitable variance estimator. It is then extended to cover most estimators encountered in surveys, including calibrated estimators, complex parameters and longitudinal estimators. The properties of the proposed variance estimator and of a simplified variance estimator are evaluated through a simulation study. An illustration of the proposed methods on data from the ELFE survey is also presented.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254953
    Description:

    Sample coordination seeks to create a probabilistic dependence between the selection of two or more samples drawn from the same population or from overlapping populations. Positive coordination increases the expected sample overlap, while negative coordination decreases it. There are numerous applications for sample coordination with varying objectives. A spatially balanced sample is a sample that is well-spread in some space. Forcing a spread within the selected samples is a general and very efficient variance reduction technique for the Horvitz-Thompson estimator. The local pivotal method and the spatially correlated Poisson sampling are two general schemes for achieving well-spread samples. We aim to introduce coordination for these sampling methods based on the concept of permanent random numbers. The goal is to coordinate such samples while preserving spatial balance. The proposed methods are motivated by examples from forestry, environmental studies, and official statistics.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254954
    Description:

    In recent years, balanced sampling techniques have seen renewed interest. They constrain the Horvitz-Thompson estimators of the totals of auxiliary variables to be equal, at least approximately, to the corresponding true totals, to avoid the occurrence of bad samples. Several procedures are available to carry out balanced sampling: the cube method (see Deville and Tillé, 2004) and an alternative, the rejective algorithm introduced by Hájek (1964). After a brief review of these sampling methods, motivated by the planning of an angler survey, we use Monte Carlo simulations to investigate the survey designs produced by these two sampling algorithms.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254955
    Description:

    Many studies conducted by various electric utilities around the world are based on the analysis of mean electricity consumption curves for various subpopulations, particularly geographic in nature. Those mean curves are estimated from samples of thousands of curves measured at very short intervals over long periods. Estimation for small subpopulations, also called small domains, is a very timely topic in sampling theory.

    In this article, we will examine this problem based on functional data and we will try to estimate the mean curves for small domains. For this, we propose four methods: functional linear regression; modelling the scores of a principal component analysis by unit-level linear mixed models; and two non-parametric estimators, with one based on regression trees and the other on random forests, adapted to the curves. All these methods have been tested and compared using real electricity consumption data for households in France.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254956
    Description:

    In Italy, the Labor Force Survey (LFS) is conducted quarterly by the National Statistical Institute (ISTAT) to produce estimates of the labor force status of the population at different geographical levels. In particular, ISTAT provides LFS estimates of employed and unemployed counts for local Labor Market Areas (LMAs). LMAs are 611 sub-regional clusters of municipalities and are unplanned domains for which direct estimates have overly large sampling errors. This implies the need for Small Area Estimation (SAE) methods. In this paper, we develop a new area level SAE method that uses a Latent Markov Model (LMM) as the linking model. In LMMs, the characteristic of interest, and its evolution in time, is represented by a latent process that follows a Markov chain, usually of first order. Therefore, areas are allowed to change their latent state across time. The proposed model is applied to quarterly data from the LFS for the period 2004 to 2014 and fitted within a hierarchical Bayesian framework using a data augmentation Gibbs sampler. Estimates are compared with those obtained by the classical Fay-Herriot model, by a time-series area level SAE model, and on the basis of data coming from the 2011 Population Census.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254957
    Description:

    When a linear imputation method is used to correct non-response based on certain assumptions, total variance can be assigned to non-responding units. Linear imputation is not as limited as it seems, given that the most common methods – ratio, donor, mean and auxiliary value imputation – are all linear imputation methods. We will discuss the inference framework and the unit-level decomposition of variance due to non-response. Simulation results will also be presented. This decomposition can be used to prioritize non-response follow-up or manual corrections, or simply to guide data analysis.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254958
    Description:

    Domains (or subpopulations) with small sample sizes are called small areas. Traditional direct estimators for small areas do not provide adequate precision because the area-specific sample sizes are small. On the other hand, demand for reliable small area statistics has greatly increased. Model-based indirect estimators of small area means or totals are currently used to address difficulties with direct estimation. These estimators are based on linking models that borrow information across areas to increase the efficiency. In particular, empirical best (EB) estimators under area level and unit level linear regression models with random small area effects have received a lot of attention in the literature. Model mean squared error (MSE) of EB estimators is often used to measure the variability of the estimators. Linearization-based estimators of model MSE as well as jackknife and bootstrap estimators are widely used. On the other hand, National Statistical Agencies are often interested in estimating the design MSE of EB estimators in line with traditional design MSE estimators associated with direct estimators for large areas with adequate sample sizes. Estimators of design MSE of EB estimators can be obtained for area level models but they tend to be unstable when the area sample size is small. Composite MSE estimators are proposed in this paper and they are obtained by taking a weighted sum of the design MSE estimator and the model MSE estimator. Properties of the MSE estimators under the area level model are studied in terms of design bias, relative root mean squared error and coverage rate of confidence intervals. The case of a unit level model is also examined under simple random sampling within each area. Results of a simulation study show that the proposed composite MSE estimators provide a good compromise in estimating the design MSE.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254959
    Description:

    This article proposes a criterion for calculating the trade-off in so-called “mixed” allocations, which combine two classic allocations in sampling theory. In INSEE (National Institute of Statistics and Economic Studies) business surveys, it is common to use the arithmetic mean of a proportional allocation and a Neyman allocation (corresponding to a trade-off of 0.5). It is possible to obtain a trade-off value resulting in better properties for the estimators. This value belongs to a region that is obtained by solving an optimization program. Different methods for calculating the trade-off will be presented. An application for business surveys is presented, as well as a comparison with other usual trade-off allocations.

    Release date: 2018-12-20

  • Articles and reports: 12-001-X201800254960
    Description:

    Based on auxiliary information, calibration is often used to improve the precision of estimates. However, calibration weighting may not be appropriate for all variables of interest of the survey, particularly those not related to the auxiliary variables used in calibration. In this paper, we propose a criterion to assess, for any variable of interest, the impact of calibration weighting on the precision of the estimated total. This criterion can be used to decide on the weights associated with each survey variable of interest and determine the variables for which calibration weighting is appropriate.

    Release date: 2018-12-20
Reference (455) (0 to 10 of 455 results)

  • Surveys and statistical programs – Documentation: 75-005-M2019001
    Description:

    The production of statistics from the Labour Force Survey (LFS) involves many activities, one of which is data processing. This step involves the verification and correction of survey data when required in order to produce microdata files. Beginning in January 2019, LFS processing will be transitioned to a new system, the Social Survey Processing Environment. This document describes the development and testing that preceded the implementation of the new system, and demonstrates that the transition is expected to have minimal impact on LFS estimates and be transparent to users of LFS data.

    Release date: 2019-02-08

  • Notices and consultations: 13-605-X
    Description:

    This product contains articles related to the latest methodological and conceptual developments in the Canadian System of Macroeconomic Accounts as well as the analysis of the Canadian economy. It includes articles detailing new methods, concepts and statistical techniques used to compile the Canadian System of Macroeconomic Accounts. It also includes information related to new or expanded data products and provides updates and supplements to information found in various guides and analytical articles touching upon a broad range of topics related to the Canadian economy.

    Release date: 2019-01-25

  • Surveys and statistical programs – Documentation: 98-306-X
    Description:

    This report describes sampling, weighting and estimation procedures used in the 2016 Census of Population. It provides operational and theoretical justifications for them, and presents the results of the evaluations of these procedures.

    Release date: 2018-09-11

  • Surveys and statistical programs – Documentation: 75-514-G
    Description:

    The Guide to the Job Vacancy and Wage Survey contains a dictionary of concepts and definitions, and covers topics such as survey methodology, data collection, processing, and data quality. The guide covers both components of the survey: the job vacancy component, which is quarterly, and the wage component, which is annual.

    Release date: 2018-07-12

  • Surveys and statistical programs – Documentation: 84-538-X
    Geography: Canada
    Description:

    This document presents the methodology underlying the production of the life tables for Canada, provinces and territories, from reference period 1980/1982 and onward.

    Release date: 2018-06-28

  • Surveys and statistical programs – Documentation: 71-526-X
    Description:

    The Canadian Labour Force Survey (LFS) is the official source of monthly estimates of total employment and unemployment. Following the 2011 census, the LFS underwent a sample redesign to account for the evolution of the population and labour market characteristics, to adjust to changes in the information needs and to update the geographical information used to carry out the survey. The redesign program following the 2011 census culminated with the introduction of a new sample at the beginning of 2015. This report is a reference on the methodological aspects of the LFS, covering stratification, sampling, collection, processing, weighting, estimation, variance estimation and data quality.

    Release date: 2017-12-21

  • Surveys and statistical programs – Documentation: 12-606-X
    Description:

    This is a toolkit intended to aid data producers and data users external to Statistics Canada.

    Release date: 2017-09-27

  • Surveys and statistical programs – Documentation: 91F0015M2017013
    Description:

    Using record linkage, this article compares the place of residence in the 2011 Census to that of the 2010 T1 Family File (T1FF). The main result is that although the overall level of consistency in the place of residence is relatively high, it decreases, sometimes substantially, for some segments of the population.

    Release date: 2017-09-26

  • Surveys and statistical programs – Documentation: 11-633-X2017007
    Description:

    The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 30 years. The IMDB combines administrative files on immigrant admissions and non-permanent resident permits from Immigration, Refugees and Citizenship Canada (IRCC) with tax files from the Canada Revenue Agency (CRA). Information is available for immigrant taxfilers admitted since 1980. Tax records for 1982 and subsequent years are available for immigrant taxfilers.

    This report will discuss the IMDB data sources, concepts and variables, record linkage, data processing, dissemination, data evaluation and quality indicators, comparability with other immigration datasets, and the analyses possible with the IMDB.

    Release date: 2017-06-16

  • Surveys and statistical programs – Documentation: 12-586-X
    Description:

    The Quality Assurance Framework (QAF) serves as the highest-level governance tool for quality management at Statistics Canada. The QAF gives an overview of the quality management and risk mitigation strategies used by the Agency’s program areas. The QAF is used in conjunction with Statistics Canada management practices, such as those described in the Quality Guidelines.

    Release date: 2017-04-21
