Results

All (472) (0 to 10 of 472 results)

  • Public use microdata: 89F0002X
    Description: The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It comprises a database, a series of tax/transfer algorithms and models, analytical software and user documentation.
    Release date: 2024-02-02
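
To make the "static microsimulation" idea above concrete, the toy sketch below applies invented flat-tax and child-benefit rules to a few synthetic household records and aggregates the results. The rules, rates and field names are illustrative only; they are not the SPSD/M algorithms or parameters.

```python
# Toy static microsimulation: apply made-up tax/transfer rules to synthetic
# household records and aggregate. Not the actual SPSD/M rules or parameters.
households = [
    {"income": 40_000, "children": 2},
    {"income": 95_000, "children": 0},
    {"income": 18_000, "children": 1},
]

def simulate(hh, tax_rate=0.20, benefit_per_child=2_000):
    tax = tax_rate * hh["income"]                  # hypothetical flat income tax
    transfer = benefit_per_child * hh["children"]  # hypothetical child benefit
    return {"tax": tax, "transfer": transfer, "net": hh["income"] - tax + transfer}

results = [simulate(hh) for hh in households]
print(sum(r["tax"] for r in results), sum(r["transfer"] for r in results))
```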

  • Articles and reports: 11-522-X202100100009
    Description:

    Use of auxiliary data to improve the efficiency of estimators of totals and means through model-assisted survey regression estimation has received considerable attention in recent years. Generalized regression (GREG) estimators, based on a working linear regression model, are currently used in establishment surveys at Statistics Canada and several other statistical agencies.  GREG estimators use common survey weights for all study variables and calibrate to known population totals of auxiliary variables. Increasingly, many auxiliary variables are available, some of which may be extraneous. This leads to unstable GREG weights when all the available auxiliary variables, including interactions among categorical variables, are used in the working linear regression model. On the other hand, new machine learning methods, such as regression trees and lasso, automatically select significant auxiliary variables and lead to stable nonnegative weights and possible efficiency gains over GREG.  In this paper, a simulation study, based on a real business survey sample data set treated as the target population, is conducted to study the relative performance of GREG, regression trees and lasso in terms of efficiency of the estimators.

    Key Words: Model assisted inference; calibration estimation; model selection; generalized regression estimator.

    Release date: 2021-10-29
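
For readers unfamiliar with the GREG estimator discussed in the abstract above, here is a minimal sketch of the standard model-assisted form: a weighted least-squares fit of the working linear model, plus a regression adjustment toward known auxiliary totals. The function and variable names are ours; this is not Statistics Canada production code.

```python
import numpy as np

def greg_total(y, X, d, t_x_pop):
    """GREG estimate of a population total.
    y: study variable on the sample; X: auxiliary variables on the sample;
    d: design weights; t_x_pop: known population totals of the auxiliaries."""
    # Weighted least-squares fit of the working linear regression model
    beta = np.linalg.solve(X.T @ (d[:, None] * X), X.T @ (d * y))
    # Horvitz-Thompson estimates of the y-total and the auxiliary totals
    t_y_ht = d @ y
    t_x_ht = d @ X
    # GREG: HT estimate plus a regression adjustment toward the known totals
    return t_y_ht + (t_x_pop - t_x_ht) @ beta
```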

  • Articles and reports: 11-522-X202100100001
    Description:

    We consider regression analysis in the context of data integration. To combine partial information from external sources, we employ the idea of model calibration which introduces a “working” reduced model based on the observed covariates. The working reduced model is not necessarily correctly specified but can be a useful device to incorporate the partial information from the external data. The actual implementation is based on a novel application of the empirical likelihood method. The proposed method is particularly attractive for combining information from several sources with different missing patterns. The proposed method is applied to a real data example combining survey data from Korean National Health and Nutrition Examination Survey and big data from National Health Insurance Sharing Service in Korea.

    Key Words: Big data; Empirical likelihood; Measurement error models; Missing covariates.

    Release date: 2021-10-15
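
As a rough sketch of the calibration idea in the abstract above (the notation here is ours, not the paper's), empirical likelihood assigns a weight p_i to each of the n sample observations by solving

$$
\max_{p_1,\dots,p_n}\ \sum_{i=1}^{n} \log p_i
\quad \text{subject to} \quad
p_i > 0,\qquad \sum_{i=1}^{n} p_i = 1,\qquad \sum_{i=1}^{n} p_i\, g(x_i) = \tilde{g},
$$

where g(·) is built from the "working" reduced model fitted to the observed covariates and g̃ is the corresponding quantity obtained from the external source; the resulting weights are then carried into the regression analysis.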

  • Surveys and statistical programs – Documentation: 11-633-X2021005
    Description:

    The Analytical Studies and Modelling Branch (ASMB) is the research arm of Statistics Canada mandated to provide high-quality, relevant and timely information on economic, health and social issues that are important to Canadians. The branch strategically makes use of expert knowledge and a broad range of data sources and modelling techniques to address the information needs of a wide range of government, academic and public sector partners and stakeholders through analysis and research, modelling and predictive analytics, and data development. The branch strives to deliver relevant, high-quality, timely, comprehensive, horizontal and integrated research and to enable the use of its research through capacity building and strategic dissemination to meet the needs of policy makers, academics and the general public.

    This Multi-year Consolidated Plan for Research, Modelling and Data Development outlines the priorities for the branch over the next two years.

    Release date: 2021-08-12

  • Articles and reports: 62F0026M2020001
    Description:

    Since the 2010 Survey of Household Spending redesign, statistics on the annual proportion of households reporting expenditures and the annual average expenditure per reporting household have not been available for many goods and services categories. To help fill this data gap for users, a statistical model was developed to produce approximations of these statistics. This product consists of data tables and a user guide.

    Release date: 2021-01-07

  • 36-23-0001
    Description: Input-output (IO) models are generally used to simulate the economic impacts of an expenditure on a given basket of goods and services or the output of one or several industries. The simulation results from a “shock” to an IO model will show the direct, indirect and induced impacts on Gross Domestic Product (GDP), which industries benefit the most, the number of jobs created, estimates of indirect taxes and subsidies generated, etc. The model also includes estimates of the impacts on energy use (expressed in terajoules) and greenhouse gas emissions (carbon dioxide equivalent, expressed in kilotonnes). IO price, energy, and tax models may also be available depending on the availability of resources. For more details, ask us for the Guide to using the input-output simulation model, available upon request.
    Release date: 2020-11-23
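
The textbook calculation behind the kind of simulation described above is the Leontief inverse: total output requirements are (I - A)^-1 times the final-demand shock. The sketch below uses an invented two-industry technical coefficients matrix and shows only direct plus indirect output impacts, not the induced, employment, tax, energy or emissions detail of the Statistics Canada model.

```python
import numpy as np

# Invented 2-industry technical coefficients matrix (dollars of intermediate
# inputs required per dollar of output) and a $100 final-demand shock to industry 1.
A = np.array([[0.20, 0.10],
              [0.15, 0.30]])
shock = np.array([100.0, 0.0])

# Total (direct + indirect) output impacts: (I - A)^-1 @ shock
leontief_inverse = np.linalg.inv(np.eye(2) - A)
output_impact = leontief_inverse @ shock
print(output_impact)  # output required from each industry to satisfy the shock
```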

  • 36-23-0002
    Description: Input-output (IO) models are generally used to simulate the economic impacts of an expenditure on a given basket of goods and services or the output of one or several industries. The simulation results from a “shock” to an IO model will show the direct, indirect and induced impacts on Gross Domestic Product (GDP), which industries benefit the most, the number of jobs created, estimates of indirect taxes and subsidies generated, etc. The model also includes an estimate of the impact on interprovincial trade flows. IO price, energy, and tax models may also be available depending on the availability of resources. For more details, ask us for the Guide to using the input-output simulation model, available upon request.
    Release date: 2020-11-23

  • Articles and reports: 82-003-X202001100002
    Description:

    Using data from the 2003 to 2013 cycles of the Canadian Community Health Survey, this study’s objective was to characterize smoking history by sex using birth cohorts beginning in 1920. Smoking histories for each birth cohort included age at smoking initiation and cessation, which were used to construct smoking prevalence estimates for each calendar year from 1971 to 2041. A secondary objective was to characterize smoking history by socioeconomic status.

    Release date: 2020-11-18
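
As a rough illustration of the cohort reconstruction described above, the snippet below flags whether a respondent was a smoker in a given calendar year from a birth year, an age at initiation and an age at cessation. The field names and example values are hypothetical, and the study's actual modelling is more involved.

```python
# Was this respondent a current smoker in `year`?
# age_started / age_quit are None if the event never occurred.
def smoked_in_year(birth_year, age_started, age_quit, year):
    if age_started is None:
        return False
    age_in_year = year - birth_year
    started = age_in_year >= age_started
    not_yet_quit = age_quit is None or age_in_year < age_quit
    return started and not_yet_quit

# Cohort prevalence for one year: the (ideally survey-weighted) share flagged as smokers.
cohort = [(1950, 16, 40), (1950, 20, None), (1950, None, None)]
prevalence_1985 = sum(smoked_in_year(b, s, q, 1985) for b, s, q in cohort) / len(cohort)
print(prevalence_1985)  # 2 of 3 -> ~0.67
```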

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04

  • Surveys and statistical programs – Documentation: 15F0004X
    Description:

    Input-output (IO) models are generally used to simulate the economic impacts of an expenditure on a given basket of goods and services or the output of one or several industries. The simulation results from a "shock" to an IO model will show the direct, indirect and induced impacts on GDP, which industries benefit the most, the number of jobs created, estimates of indirect taxes and subsidies generated, etc. For more details, ask us for the Guide to using the input-output simulation model, available free of charge upon request.

    At various times, clients have requested the use of IO price, energy, tax and market models. Given their availability, arrangements can be made to use these models on request.

    The national IO model was not released in 2015 or 2016.

    Release date: 2019-04-04
Data (3) (3 results)

  • Public use microdata: 89F0002X
    Description: The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It comprises a database, a series of tax/transfer algorithms and models, analytical software and user documentation.
    Release date: 2024-02-02

  • Public use microdata: 12M0014X
    Geography: Province or territory
    Description:

    This report presents a brief overview of the information collected in Cycle 14 of the General Social Survey (GSS). Cycle 14 is the first cycle to collect detailed information on access to and use of information and communications technology in Canada. Topics include general use of technology and computers, technology in the workplace, development of computer skills, frequency of Internet and e-mail use, non-users, and security and information on the Internet. The target population of the GSS is all individuals aged 15 and over living in a private household in one of the ten provinces.

    Release date: 2001-06-29

  • Public use microdata: 82M0009X
    Description:

    The National Population Health Survey (NPHS) used the Labour Force Survey sampling frame to draw the initial sample of approximately 20,000 households starting in 1994 and for the sample top-up in this third cycle. The survey is conducted every two years. Sample collection is distributed over four quarterly periods followed by a follow-up period, and the whole process takes a year. In each household, some limited health information is collected from all household members, and one person in each household is randomly selected for a more in-depth interview.

    The survey is designed to collect information on the health of the Canadian population and related socio-demographic information. The first cycle of data collection began in 1994, and collection continues every second year thereafter. The survey is designed to produce both cross-sectional and longitudinal estimates. The questionnaire includes content related to health status, use of health services, determinants of health, a health index, chronic conditions and activity restrictions. The use of health services is probed through visits to health care providers, both traditional and non-traditional, and the use of drugs and other medications. Health determinants include smoking, alcohol use and physical activity. Special focus content for this cycle includes family medical history, with questions about certain chronic conditions among immediate family members and when they were acquired. A section on self-care has also been included this cycle. The socio-demographic information includes age, sex, education, ethnicity, household income and labour force status.

    Release date: 2000-12-19
Analysis (435) (0 to 10 of 435 results)

  • Articles and reports: 11-522-X202100100009
    Description:

    Use of auxiliary data to improve the efficiency of estimators of totals and means through model-assisted survey regression estimation has received considerable attention in recent years. Generalized regression (GREG) estimators, based on a working linear regression model, are currently used in establishment surveys at Statistics Canada and several other statistical agencies.  GREG estimators use common survey weights for all study variables and calibrate to known population totals of auxiliary variables. Increasingly, many auxiliary variables are available, some of which may be extraneous. This leads to unstable GREG weights when all the available auxiliary variables, including interactions among categorical variables, are used in the working linear regression model. On the other hand, new machine learning methods, such as regression trees and lasso, automatically select significant auxiliary variables and lead to stable nonnegative weights and possible efficiency gains over GREG.  In this paper, a simulation study, based on a real business survey sample data set treated as the target population, is conducted to study the relative performance of GREG, regression trees and lasso in terms of efficiency of the estimators.

    Key Words: Model assisted inference; calibration estimation; model selection; generalized regression estimator.

    Release date: 2021-10-29

  • Articles and reports: 11-522-X202100100001
    Description:

    We consider regression analysis in the context of data integration. To combine partial information from external sources, we employ the idea of model calibration which introduces a “working” reduced model based on the observed covariates. The working reduced model is not necessarily correctly specified but can be a useful device to incorporate the partial information from the external data. The actual implementation is based on a novel application of the empirical likelihood method. The proposed method is particularly attractive for combining information from several sources with different missing patterns. The proposed method is applied to a real data example combining survey data from Korean National Health and Nutrition Examination Survey and big data from National Health Insurance Sharing Service in Korea.

    Key Words: Big data; Empirical likelihood; Measurement error models; Missing covariates.

    Release date: 2021-10-15

  • Articles and reports: 62F0026M2020001
    Description:

    Since the 2010 Survey of Household Spending redesign, statistics on the annual proportion of households reporting expenditures and the annual average expenditure per reporting household have not been available for many goods and services categories. To help fill this data gap for users, a statistical model was developed to produce approximations of these statistics. This product consists of data tables and a user guide.

    Release date: 2021-01-07

  • Articles and reports: 82-003-X202001100002
    Description:

    Using data from the 2003 to 2013 cycles of the Canadian Community Health Survey, this study’s objective was to characterize smoking history by sex using birth cohorts beginning in 1920. Smoking histories for each birth cohort included age at smoking initiation and cessation, which were used to construct smoking prevalence estimates for each calendar year from 1971 to 2041. A secondary objective was to characterize smoking history by socioeconomic status.

    Release date: 2020-11-18

  • Articles and reports: 12-001-X201800154928
    Description:

    A two-phase process was used by the Substance Abuse and Mental Health Services Administration to estimate the proportion of US adults with serious mental illness (SMI). The first phase was the annual National Survey on Drug Use and Health (NSDUH), while the second phase was a random subsample of adult respondents to the NSDUH. Respondents to the second phase of sampling were clinically evaluated for serious mental illness. A logistic prediction model was fit to this subsample, with the SMI status (yes or no) determined by the second-phase instrument treated as the dependent variable and related variables collected on the NSDUH from all adults as the model’s explanatory variables. Estimates were then computed for SMI prevalence among all adults and within adult subpopulations by assigning an SMI status to each NSDUH respondent based on comparing his or her estimated probability of having SMI to a chosen cut point on the distribution of the predicted probabilities. We investigate alternatives to this standard cut point estimator, such as the probability estimator. The latter assigns an estimated probability of having SMI to each NSDUH respondent. The estimated prevalence of SMI is the weighted mean of those estimated probabilities. Using data from NSDUH and its subsample, we show that, although the probability estimator has a smaller mean squared error when estimating SMI prevalence among all adults, it has a greater tendency to be biased at the subpopulation level than the standard cut point estimator.

    Release date: 2018-06-21
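
The two estimators compared in the abstract above reduce to a simple contrast once the model's predicted probabilities are in hand. The sketch below is ours, with hypothetical inputs (p = predicted probabilities of SMI, w = survey weights, c = cut point).

```python
import numpy as np

def cut_point_prevalence(p, w, c):
    # Assign SMI status 1/0 by comparing each predicted probability to the cut
    # point, then take the weighted proportion classified as having SMI.
    return np.sum(w * (p >= c)) / np.sum(w)

def probability_prevalence(p, w):
    # The probability estimator: the weighted mean of the predicted probabilities.
    return np.sum(w * p) / np.sum(w)
```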

  • Articles and reports: 12-001-X201800154963
    Description:

    The probability-sampling-based framework has dominated survey research because it provides precise mathematical tools to assess sampling variability. However, increasing costs and declining response rates are expanding the use of non-probability samples, particularly in general population settings, where samples of individuals pulled from web surveys are becoming increasingly cheap and easy to access. But non-probability samples are at risk of selection bias due to differential access, degrees of interest, and other factors. Calibration to known statistical totals in the population provides a means of potentially diminishing the effect of selection bias in non-probability samples. Here we show that model calibration using adaptive LASSO can yield a consistent estimator of a population total as long as a subset of the true predictors is included in the prediction model, thus allowing large numbers of possible covariates to be included without risk of overfitting. We show that model calibration using adaptive LASSO provides improved estimation with respect to mean squared error relative to standard competitors such as generalized regression (GREG) estimators when a large number of covariates are required to determine the true model, with effectively no loss in efficiency over GREG when smaller models will suffice. We also derive closed-form variance estimators of population totals, and compare their behavior with bootstrap estimators. We conclude with a real-world example using data from the National Health Interview Survey.

    Release date: 2018-06-21
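
A minimal sketch of the idea, under assumptions of ours: a lasso working model is fitted on the sample and used in a model-assisted (difference-type) estimator of the total, assuming the auxiliary variables are available for every population unit. Plain scikit-learn Lasso stands in for the adaptive lasso in the paper, and the closed-form variance estimators are not shown.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_assisted_total(y_s, X_s, w_s, X_pop, alpha=0.1):
    """y_s, X_s, w_s: study variable, covariates and design weights on the sample;
    X_pop: covariates for all population units (hypothetical setup)."""
    model = Lasso(alpha=alpha).fit(X_s, y_s)
    fitted_pop_sum = model.predict(X_pop).sum()  # sum of predictions over the population
    residuals = y_s - model.predict(X_s)         # design-weighted residual correction
    return fitted_pop_sum + np.dot(w_s, residuals)
```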

  • Articles and reports: 11-633-X2017008
    Description:

    The DYSEM microsimulation modelling platform provides a demographic and socioeconomic core that can be readily built upon to develop custom dynamic microsimulation models or applications. This paper describes DYSEM and provides an overview of its intended uses, as well as the methods and data used in its development.

    Release date: 2017-07-28

  • Articles and reports: 13-604-M2017083
    Description:

    Statistics Canada regularly publishes macroeconomic indicators on household assets, liabilities and net worth as part of the quarterly National Balance Sheet Accounts (NBSA). These accounts are aligned with the most recent international standards and are the source of estimates of national wealth for all sectors of the economy, including households, non-profit institutions, governments and corporations along with Canada’s wealth position vis-a-vis the rest of the world. While the NBSA provide high quality information on the overall position of households relative to other economic sectors, they lack the granularity required to understand vulnerabilities of specific groups and the resulting implications for economic wellbeing and financial stability.

    Release date: 2017-03-15

  • Journals and periodicals: 91-621-X
    Description:

    This document briefly describes Demosim, the microsimulation population projection model, how it works as well as its methods and data sources. It is a methodological complement to the analytical products produced using Demosim.

    Release date: 2017-01-25

  • Articles and reports: 12-001-X201600114538
    Description:

    The aim of automatic editing is to use a computer to detect and amend erroneous values in a data set, without human intervention. Most automatic editing methods that are currently used in official statistics are based on the seminal work of Fellegi and Holt (1976). Applications of this methodology in practice have shown systematic differences between data that are edited manually and automatically, because human editors may perform complex edit operations. In this paper, a generalization of the Fellegi-Holt paradigm is proposed that can incorporate a large class of edit operations in a natural way. In addition, an algorithm is outlined that solves the resulting generalized error localization problem. It is hoped that this generalization may be used to increase the suitability of automatic editing in practice, and hence to improve the efficiency of data editing processes. Some first results on synthetic data are promising in this respect.

    Release date: 2016-06-22
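
For context on the Fellegi-Holt paradigm mentioned above: error localization looks for a smallest set of fields that can be changed so that a record satisfies every edit rule. The toy brute-force sketch below illustrates the idea only; the edit rules, fields and domains are invented, and real implementations (and the generalization proposed in the paper) use far more efficient formulations.

```python
from itertools import combinations, product

# Invented edit rules and value domains for a toy record.
edits = [
    lambda r: r["age"] >= 16 or r["marital_status"] == "single",
    lambda r: not (r["employed"] == "no" and r["hours"] > 0),
]
domains = {"age": range(0, 100), "marital_status": ["single", "married"],
           "employed": ["yes", "no"], "hours": range(0, 80)}

def minimal_change_set(record):
    fields = list(record)
    for k in range(len(fields) + 1):                # smallest change sets first
        for subset in combinations(fields, k):
            # Try every combination of replacement values for the chosen fields.
            for values in product(*(domains[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if all(edit(candidate) for edit in edits):
                    return set(subset)

record = {"age": 12, "marital_status": "married", "employed": "no", "hours": 35}
print(minimal_change_set(record))  # e.g. {'age', 'employed'}
```
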
Reference (32) (0 to 10 of 32 results)

  • Surveys and statistical programs – Documentation: 11-633-X2021005
    Description:

    The Analytical Studies and Modelling Branch (ASMB) is the research arm of Statistics Canada mandated to provide high-quality, relevant and timely information on economic, health and social issues that are important to Canadians. The branch strategically makes use of expert knowledge and a broad range of data sources and modelling techniques to address the information needs of a wide range of government, academic and public sector partners and stakeholders through analysis and research, modelling and predictive analytics, and data development. The branch strives to deliver relevant, high-quality, timely, comprehensive, horizontal and integrated research and to enable the use of its research through capacity building and strategic dissemination to meet the needs of policy makers, academics and the general public.

    This Multi-year Consolidated Plan for Research, Modelling and Data Development outlines the priorities for the branch over the next two years.

    Release date: 2021-08-12

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04

  • Surveys and statistical programs – Documentation: 15F0004X
    Description:

    Input-output (IO) models are generally used to simulate the economic impacts of an expenditure on a given basket of goods and services or the output of one or several industries. The simulation results from a "shock" to an IO model will show the direct, indirect and induced impacts on GDP, which industries benefit the most, the number of jobs created, estimates of indirect taxes and subsidies generated, etc. For more details, ask us for the Guide to using the input-output simulation model, available free of charge upon request.

    At various times, clients have requested the use of IO price, energy, tax and market models. Given their availability, arrangements can be made to use these models on request.

    The national IO model was not released in 2015 or 2016.

    Release date: 2019-04-04

  • Surveys and statistical programs – Documentation: 15F0009X
    Description:

    Input-output (IO) models are generally used to simulate the economic impacts of an expenditure on a given basket of goods and services or the output of one or several industries. The simulation results from a "shock" to an IO model will show the direct, indirect and induced impacts on GDP, which industries benefit the most, the number of jobs created, estimates of indirect taxes and subsidies generated, etc. For more details, ask us for the Guide to using the input-output simulation model, available free of charge upon request.

    At various times, clients have requested the use of IO price, energy, tax and market models. Given their availability, arrangements can be made to use these models on request.

    The interprovincial IO model was not released in 2015 or 2016.

    Release date: 2019-04-04

  • Surveys and statistical programs – Documentation: 71-526-X
    Description:

    The Canadian Labour Force Survey (LFS) is the official source of monthly estimates of total employment and unemployment. Following the 2011 census, the LFS underwent a sample redesign to account for the evolution of the population and labour market characteristics, to adjust to changes in the information needs and to update the geographical information used to carry out the survey. The redesign program following the 2011 census culminated with the introduction of a new sample at the beginning of 2015. This report is a reference on the methodological aspects of the LFS, covering stratification, sampling, collection, processing, weighting, estimation, variance estimation and data quality.

    Release date: 2017-12-21

  • Notices and consultations: 92-140-X2016001
    Description:

    The 2016 Census Program Content Test was conducted from May 2 to June 30, 2014. The Test was designed to assess the impact of any proposed content changes to the 2016 Census Program and to measure the impact of including a social insurance number (SIN) question on the data quality.

    This quantitative test used a split-panel design involving 55,000 dwellings, divided into 11 panels of 5,000 dwellings each: five panels were dedicated to the Content Test while the remaining six panels were for the SIN Test. Two models of test questionnaires were developed to meet the objectives, namely a model with all the proposed changes EXCEPT the SIN question and a model with all the proposed changes INCLUDING the SIN question. A third, 'control' model with the 2011 content was also developed. The population living in a private dwelling in mail-out areas in one of the ten provinces was targeted for the test. Paper and electronic response channels were also part of the Test.

    This report presents the Test objectives, the design and a summary of the analysis in order to determine potential content for the 2016 Census Program. Results from the data analysis of the Test were not the only elements used to determine the content for 2016. Other elements were also considered, such as response burden, comparison over time and users’ needs.

    Release date: 2016-04-01

  • Surveys and statistical programs – Documentation: 62F0026M2005006
    Description:

    This report describes the quality indicators produced for the 2003 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.

    Release date: 2005-10-06
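
Of the quality indicators listed above, the coefficient of variation is the most commonly quoted. As a reminder (our notation, not the report's), it is simply the estimated standard error expressed relative to the estimate:

```python
def coefficient_of_variation(estimate, standard_error):
    # Usually reported as a percentage of the estimate.
    return 100.0 * standard_error / estimate
```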

  • Surveys and statistical programs – Documentation: 15-002-M2001001
    Description:

    This document describes the sources, concepts and methods utilized by the Canadian Productivity Accounts and discusses how they compare with their U.S. counterparts.

    Release date: 2004-12-24

  • Notices and consultations: 13-605-X20020038512
    Description:

    As of September 30, 2002, the monthly GDP by industry estimates will incorporate the Chain Fisher formula. This change will be applied from January 1997 and will be pushed back to January 1961 within a year.

    Release date: 2002-09-30

  • Notices and consultations: 13-605-X20010018529
    Description:

    As of May 31, 2001, the Quarterly Income and Expenditure Accounts have adopted the Chain Fisher formula.

    Release date: 2001-05-31
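
For reference on the Chain Fisher formula mentioned in the two notices above: the Fisher volume index for one link is the geometric mean of the Laspeyres and Paasche indexes, and a chained series multiplies the successive period-over-period links. The sketch below uses invented price and quantity vectors and is not the National Accounts implementation.

```python
import numpy as np

def fisher_volume_link(p0, q0, p1, q1):
    """One period-over-period Fisher volume (quantity) index link."""
    laspeyres = np.dot(p0, q1) / np.dot(p0, q0)  # new quantities at old prices
    paasche = np.dot(p1, q1) / np.dot(p1, q0)    # new quantities at new prices
    return np.sqrt(laspeyres * paasche)

# Chained index: cumulative product of the links across periods.
p = [np.array([1.0, 2.0]), np.array([1.1, 2.1]), np.array([1.2, 2.3])]
q = [np.array([10.0, 5.0]), np.array([11.0, 5.2]), np.array([11.5, 5.5])]
links = [fisher_volume_link(p[t], q[t], p[t + 1], q[t + 1]) for t in range(2)]
print(np.cumprod(links))
```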