Other content related to Statistical methods


Results

All (162) (20 to 30 of 162 results)

  • Articles and reports: 11-522-X202100100012
    Description: The modernization of price statistics by National Statistical Offices (NSO) such as Statistics Canada focuses on the adoption of alternative data sources that include the near-universe of all products sold in the country, a scale that requires machine learning classification of the data. The process of evaluating classifiers to select appropriate ones for production, as well as monitoring classifiers once in production, needs to be based on robust metrics to measure misclassification. Because commonly used metrics such as the Fβ-score may not, in all cases, account for key aspects of price statistics, such as the unequal importance of categories, careful consideration of the metric space is necessary to select appropriate methods to evaluate classifiers. This working paper provides insight into the metric space applicable to price statistics and proposes an operational framework to evaluate and monitor classifiers, focusing specifically on the needs of the Canadian Consumer Price Index and demonstrating the discussed metrics using a publicly available dataset.

    Key Words: Consumer price index; supervised classification; evaluation metrics; taxonomy

    Release date: 2021-11-05
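
    The abstract's concern about the Fβ-score and categories of unequal importance can be illustrated with a minimal sketch. The product labels, the example predictions and the category weights below are hypothetical, and the weighted average is only one simple way to reflect unequal importance; it is not the framework proposed in the paper.

```python
from collections import Counter


def f_beta(precision, recall, beta=1.0):
    """F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def per_class_f_beta(y_true, y_pred, beta=1.0):
    """Compute one F-beta score per category from paired labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted category gains a false positive
            fn[truth] += 1  # true category gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = f_beta(precision, recall, beta)
    return scores


def weighted_f_beta(scores, weights):
    """Average per-category scores using (hypothetical) importance weights."""
    total = sum(weights[label] for label in scores)
    return sum(weights[label] * scores[label] for label in scores) / total


# Hypothetical product classifications, for illustration only.
y_true = ["bread", "bread", "milk", "milk", "fuel", "fuel"]
y_pred = ["bread", "milk", "milk", "milk", "fuel", "bread"]
weights = {"bread": 1.0, "milk": 1.0, "fuel": 2.0}  # assumed importance

scores = per_class_f_beta(y_true, y_pred, beta=1.0)
unweighted = sum(scores.values()) / len(scores)
weighted = weighted_f_beta(scores, weights)
print(scores, unweighted, weighted)
```

    In this toy example the unweighted and weighted averages differ because the category with the largest assumed weight is also one the classifier handles imperfectly, which is the kind of effect an unweighted score can hide.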

  • Articles and reports: 11-522-X202100100013
    Description: Statistics Canada’s Labour Force Survey (LFS) plays a fundamental role in the mandate of Statistics Canada. The labour market information provided by the LFS is among the most timely and important measures of the Canadian economy’s overall performance. An integral part of the LFS monthly data processing is the coding of respondents’ industry according to the North American Industry Classification System (NAICS), occupation according to the National Occupational Classification System (NOC) and the Primary Class of Workers (PCOW). Each month, up to 20,000 records are coded manually. In 2020, Statistics Canada worked on developing machine learning models using fastText to code responses to the LFS questionnaire according to the three classifications mentioned previously. This article provides an overview of the methodology developed and the results obtained from a potential application of fastText to the LFS coding process.

    Key Words: Machine Learning; Labour Force Survey; Text classification; fastText.

    Release date: 2021-11-05

  • Articles and reports: 11-522-X202100100028
    Description:

    Many Government of Canada groups are developing code to process and visualize various kinds of data, often duplicating each other’s efforts, with sub-optimal efficiency and limited code quality review. This paper informally presents a working-level approach to addressing this technical problem. The idea is to collaboratively build a common repository of code and a knowledge base for use by anyone in the public sector to perform many common data science tasks and, in doing so, to help each other master both data science coding skills and industry-standard collaborative practices. The paper explains why R is used as the language of choice for collaborative data science code development. It summarizes R’s advantages and addresses its limitations, establishes the taxonomy of discussion topics of highest interest to GC data scientists working with R, provides an overview of the collaborative platforms used, and presents the results obtained to date. Even though the code knowledge base is developed mainly in R, it is meant to be valuable also for data scientists coding in Python and other development environments.

    Key Words: Collaboration; Data science; Data Engineering; R; Open Government; Open Data; Open Science

    Release date: 2021-10-29

  • Articles and reports: 11-522-X202100100001
    Description:

    We consider regression analysis in the context of data integration. To combine partial information from external sources, we employ the idea of model calibration, which introduces a “working” reduced model based on the observed covariates. The working reduced model is not necessarily correctly specified but can be a useful device to incorporate the partial information from the external data. The actual implementation is based on a novel application of the empirical likelihood method. The proposed method is particularly attractive for combining information from several sources with different missing patterns. The proposed method is applied to a real data example combining survey data from the Korean National Health and Nutrition Examination Survey and big data from the National Health Insurance Sharing Service in Korea.

    Key Words: Big data; Empirical likelihood; Measurement error models; Missing covariates.

    Release date: 2021-10-15

  • Articles and reports: 11-522-X202100100002
    Description:

    A framework for the responsible use of machine learning processes has been developed at Statistics Canada. The framework includes guidelines for the responsible use of machine learning and a checklist, which are organized into four themes: respect for people, respect for data, sound methods, and sound application. All four themes work together to ensure the ethical use of both the algorithms and results of machine learning. The framework is anchored in a vision that seeks to create a modern workplace and provide direction and support to those who use machine learning techniques. It applies to all statistical programs and projects conducted by Statistics Canada that use machine learning algorithms. This includes supervised and unsupervised learning algorithms. The framework and associated guidelines will be presented first. The process of reviewing projects that use machine learning, i.e., how the framework is applied to Statistics Canada projects, will then be explained. Finally, future work to improve the framework will be described.

    Keywords: Responsible machine learning, explainability, ethics

    Release date: 2021-10-15

  • Articles and reports: 11-522-X202100100003
    Description:

    The increasing size and richness of digital data allow for modeling more complex relationships and interactions, which is the strong point of machine learning. Here we applied gradient boosting to the Dutch system of social statistical datasets to estimate transition probabilities into and out of poverty. Individual estimates are reasonable, but the main advantages of the approach in combination with SHAP and global surrogate models are the simultaneous ranking of hundreds of features by their importance, detailed insight into their relationship with the transition probabilities, and the data-driven identification of subpopulations with relatively high and low transition probabilities. In addition, we decompose the difference in feature importance between the general population and a subpopulation into a frequency effect and a feature effect. We caution against misinterpretation and discuss future directions.

    Key Words: Classification; Explainability; Gradient boosting; Life event; Risk factors; SHAP decomposition.

    Release date: 2021-10-15

  • Articles and reports: 11-522-X202100100019
    Description: Official statistical agencies must continually seek new methods and techniques that can increase both program efficiency and product relevance. The U.S. Census Bureau’s measurement of construction activity is currently a resource-intensive endeavor, relying heavily on monthly survey response via questionnaires and extensive field data collection. While our data users continually require more timely and granular data products, the traditional survey approach and the associated collection cost and respondent burden limit our ability to meet that need. In 2019, we began research on whether the application of machine learning techniques to satellite imagery could accurately estimate housing starts and completions while meeting existing monthly indicator timelines at a cost equal to or less than existing methods. Using historical Census construction survey data in combination with targeted satellite imagery, the team trained, tested, and validated convolutional neural networks capable of classifying images by their stage of construction, demonstrating the viability of a data science-based approach to producing official measures of construction activity.

    Key Words: Official Statistics; Housing Starts; Machine Learning; Satellite Imagery

    Release date: 2021-10-15

  • 19-22-0007
    Description:

    Course Duration: 2 days

    Course Cost: There is no cost for Statistics Canada employees. The cost for external participants is $200 per day.

    Course Language: Offered in English and in French

    Pre-requisites: Knowledge of SAS is highly recommended. Knowledge equivalent to the SAS 9 Programming 1: Essentials course is a minimum.

    To familiarize participants with raking methods and software. Raking deals with the problem of restoring cross-sectional aggregation constraints in time series systems. Optionally, temporal constraints can also be preserved. We also use the words reconciliation and balancing.

    Benefits to Participants: Upon completion of the course, the participants will be able to understand some of the raking techniques in use at Statistics Canada. They will acquire the technical knowledge to run PROC TSRAKING, a SAS procedure developed at Statistics Canada. The course is practical, technical and theoretical.

    Course outline: Introduction; One and two dimensional raking with or without annual constraints; Alterability coefficients; Pro-rating and proportional iterative raking methods; Raking method implemented in PROC TSRAKING: numerical optimization approach with alterability coefficients; Time series system with multiple raking rules; Movement preservation.

    Release date: 2021-10-13
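
    The pro-rating and proportional iterative raking methods named in the course outline can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `prorate` and `rake` and the example figures are hypothetical, the marginal totals are assumed to be mutually consistent, and the code does not reproduce PROC TSRAKING, which the outline describes as a numerical optimization approach with alterability coefficients.

```python
def prorate(values, total):
    """One-dimensional pro-rating: scale values so they sum to `total`."""
    current = sum(values)
    return [v * total / current for v in values]


def rake(table, row_totals, col_totals, tol=1e-9, max_iter=1000):
    """Two-dimensional proportional iterative raking.

    Alternately scales rows and columns until the cells agree with both
    sets of marginal totals. Assumes sum(row_totals) == sum(col_totals)
    and strictly positive cells.
    """
    cells = [row[:] for row in table]  # work on a copy
    for _ in range(max_iter):
        # Step 1: scale each row to its target total.
        for i, row in enumerate(cells):
            factor = row_totals[i] / sum(row)
            cells[i] = [v * factor for v in row]
        # Step 2: scale each column to its target total.
        for j, target in enumerate(col_totals):
            factor = target / sum(row[j] for row in cells)
            for row in cells:
                row[j] *= factor
        # Columns now match; stop once the rows match as well.
        worst = max(abs(sum(row) - row_totals[i])
                    for i, row in enumerate(cells))
        if worst < tol:
            break
    return cells


# Hypothetical figures, for illustration only.
adjusted = prorate([10.0, 20.0, 30.0], 66.0)
raked = rake([[10.0, 20.0], [30.0, 40.0]],
             row_totals=[33.0, 77.0],
             col_totals=[44.0, 66.0])
print(adjusted)
print(raked)
```

    Pro-rating changes every cell by the same factor, whereas iterative raking lets the factors differ by row and column so that both sets of aggregation constraints hold at once.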

  • Surveys and statistical programs – Documentation: 11-633-X2021005
    Description:

    The Analytical Studies and Modelling Branch (ASMB) is the research arm of Statistics Canada mandated to provide high-quality, relevant and timely information on economic, health and social issues that are important to Canadians. The branch strategically makes use of expert knowledge and a broad range of data sources and modelling techniques to address the information needs of a broad range of government, academic and public sector partners and stakeholders through analysis and research, modeling and predictive analytics, and data development. The branch strives to deliver relevant, high-quality, timely, comprehensive, horizontal and integrated research and to enable the use of its research through capacity building and strategic dissemination to meet the user needs of policy makers, academics and the general public.

    This Multi-year Consolidated Plan for Research, Modelling and Data Development outlines the priorities for the branch over the next two years.

    Release date: 2021-08-12

  • Stats in brief: 89-20-00062020002
    Description:

    This video is intended to teach viewers the differences between three fundamental statistical concepts. First, the mean, then the median and finally, the mode.

    Release date: 2021-05-03
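
    The three concepts named in the video description can be shown on a small data set. The sketch below uses Python's standard `statistics` module; the function name `summarize` and the sample values are made up for illustration and do not come from the video.

```python
import statistics


def summarize(values):
    """Return the mean, median and mode of a list of numbers."""
    return {
        "mean": statistics.mean(values),      # sum divided by count
        "median": statistics.median(values),  # middle value when sorted
        "mode": statistics.mode(values),      # most frequent value
    }


# Made-up sample: the large value pulls the mean above the median.
sample = [2, 3, 3, 5, 7, 10]
summary = summarize(sample)
print(summary)
```

    With an even number of observations, the median is the average of the two middle values, so it need not be one of the observed values.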

Data (1) (1 result)

  • Table: 82-567-X
    Description:

    The National Population Health Survey (NPHS) is designed to enhance the understanding of the processes affecting health. The survey collects cross-sectional as well as longitudinal data. In 1994/95 the survey interviewed a panel of 17,276 individuals, then returned to interview them a second time in 1996/97. The response rate for these individuals was 96% in 1996/97. Data collection from the panel will continue for up to two decades. For cross-sectional purposes, data were collected for a total of 81,000 household residents in all provinces (except people on Indian reserves or on Canadian Forces bases) in 1996/97.

    This overview illustrates the variety of information available by presenting data on perceived health, chronic conditions, injuries, repetitive strains, depression, smoking, alcohol consumption, physical activity, consultations with medical professionals, use of medications and use of alternative medicine.

    Release date: 1998-07-29

Analysis (102) (40 to 50 of 102 results)

  • Articles and reports: 62F0014M2019005
    Description:

    This document describes the updated methodology for Investment Banking Services Price Index (IBSPI).

    Release date: 2019-07-08

  • Articles and reports: 75F0002M2019005
    Description:

    This note describes methodological changes made to the Market Basket Measure (MBM) in calendar year 2019. These revisions mainly affect MBM estimates for 2008 and 2009, but they also affect the overall interpretation of the trends in the MBM over the 2000s.

    Release date: 2019-02-26

  • Articles and reports: 89-653-X2018001
    Description:

    This Concepts and Methods Guide is intended to provide a detailed review of the 2017 APS with respect to its subject matter and methodological approaches. It is designed to assist APS data users by serving as a guide to the concepts and measures of the survey as well as the technical details of the survey's design, field work and data processing. This guide is meant to provide users with helpful information on how to use and interpret survey results. The discussion on data quality also allows users to review the strengths and limitations of the data for their particular needs.

    Chapter 1 of this guide provides an overview of the 2017 APS by introducing the survey's background and objectives. Chapter 2 outlines the survey's themes and explains the key concepts and definitions used for the survey. Chapters 3 to 6 cover important aspects of the APS survey methodology, sampling design, data collection and processing. Chapters 7 and 8 review issues of data quality and caution users about comparing 2017 APS data with data from other sources. Chapter 9 outlines the survey products available to the public, including data tables, analytical articles and reference material. The Appendices provide a comprehensive list of survey indicators, extra coding categories and standard classifications used on the APS. Lastly, a glossary of survey terms is also provided.

    Release date: 2018-11-26

  • Articles and reports: 13-605-X201800154922
    Description:

    Over the last number of months, Statistics Canada has been updating the national statistical system to measure the production, consumption and distribution of non-medical cannabis. To date, this work has involved updating classification standards (such as the North American Product Classification), developing models that take existing information (mainly from health and social surveys) and transform it into estimates of consumption and expenditure, as well as undertaking new surveys on cannabis consumption.

    Release date: 2018-02-22

  • Articles and reports: 13-605-X201700114840
    Description:

    Statistics Canada is presently preparing the statistical system to be able to gauge the impact of the transition from illegal to legal non-medical cannabis use and to shed light on the social and economic activities related to the use of cannabis thereafter. While the system of social statistics captures some information on the use of cannabis, updates will be required to more accurately measure health effects and the impact on the judicial system. Current statistical infrastructure used to more comprehensively measure the use and impacts of substances such as tobacco and alcohol could be adapted to do the same for cannabis. However, available economic statistics are largely silent on the role illegal drugs play in the economy. Both social and economic statistics will need to be updated to reflect the legalization of cannabis, and the challenge is especially great for economic statistics. This paper provides a summary of the work that is now under way toward these ends.

    Release date: 2017-09-15

  • Stats in brief: 11-001-X201707516321
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2017-03-16

  • Journals and periodicals: 11-634-X
    Description:

    This publication is a catalogue of strategies and mechanisms that a statistical organization should consider adopting, according to its particular context. This compendium is based on lessons learned and best practices of leadership and management of statistical agencies within the scope of Statistics Canada’s International Statistical Fellowship Program (ISFP). It contains four broad sections including, characteristics of an effective national statistical system; core management practices; improving, modernizing and finding efficiencies; and, strategies to better inform and engage key stakeholders.

    Release date: 2016-07-06

  • Stats in brief: 11-001-X201516812543
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2015-06-17

  • Articles and reports: 82-003-X201500314143
    Description:

    This study evaluates the representativeness of the pooled 2007/2009-2009/2011 Canadian Health Measures Survey immigrant sample by comparing it with socio-demographic distributions from the 2006 Census and the 2011 National Household Survey, and with selected self-reported health and health behaviour indicators from the 2009/2010 Canadian Community Health Survey.

    Release date: 2015-03-18

  • Stats in brief: 11-001-X201503410941
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2015-02-03

Reference (54) (0 to 10 of 54 results)

  • Surveys and statistical programs – Documentation: 75-514-G
    Description: The Guide to the Job Vacancy and Wage Survey contains a dictionary of concepts and definitions, and covers topics such as survey methodology, data collection, processing, and data quality. The guide covers both components of the survey: the job vacancy component, which is quarterly, and the wage component, which is annual.
    Release date: 2023-05-25

  • Surveys and statistical programs – Documentation: 32-26-0002
    Description:

    This reference guide may be useful to both new and experienced users who wish to familiarize themselves with and find specific information about the Census of Agriculture.

    It provides an overview of the Census of Agriculture communications, content determination, collection, processing, data quality evaluation and dissemination activities. It also summarizes the key changes to the census and other useful information.

    Release date: 2022-04-14

  • Surveys and statistical programs – Documentation: 11-633-X2021005
    Description:

    The Analytical Studies and Modelling Branch (ASMB) is the research arm of Statistics Canada mandated to provide high-quality, relevant and timely information on economic, health and social issues that are important to Canadians. The branch strategically makes use of expert knowledge and a broad range of data sources and modelling techniques to address the information needs of a broad range of government, academic and public sector partners and stakeholders through analysis and research, modeling and predictive analytics, and data development. The branch strives to deliver relevant, high-quality, timely, comprehensive, horizontal and integrated research and to enable the use of its research through capacity building and strategic dissemination to meet the user needs of policy makers, academics and the general public.

    This Multi-year Consolidated Plan for Research, Modelling and Data Development outlines the priorities for the branch over the next two years.

    Release date: 2021-08-12

  • Surveys and statistical programs – Documentation: 89-26-0003
    Description:

    The Statistics Canada Data Strategy (SCDS) provides a course of action for managing and leveraging the agency’s data assets to ensure their optimal use and value while maintaining public trust. As Statistics Canada is the nation’s trusted provider of high-quality data and information to support evidence-based policy and decision making, the SCDS also naturally includes the agency’s plan for providing support and data expertise to other government organizations (federal, provincial and territorial), non-governmental organizations, the private sector, academia, and other national and international communities.

    The SCDS provides a roadmap for how Statistics Canada will continue to govern and manage its valuable data assets as part of its modernization agenda and in alignment with and response to other federal government strategies and initiatives. These federal strategies include the Data Strategy for the Federal Public Service, Canada’s 2018-2020 National Action Plan on Open Government, and the Treasury Board Secretariat Digital Operations Strategic Plan: 2018-2022.

    Release date: 2020-04-30

  • Surveys and statistical programs – Documentation: 99-011-X
    Description:

    This topic presents data on the Aboriginal peoples of Canada and their demographic characteristics. Depending on the application, estimates using any of the following concepts may be appropriate for the Aboriginal population: (1) Aboriginal identity, (2) Aboriginal ancestry, (3) Registered or Treaty Indian status and (4) Membership in a First Nation or Indian band. Data from the 2011 National Household Survey are available for the geographical locations where these populations reside, including 'on reserve' census subdivisions and Inuit communities of Inuit Nunangat as well as other geographic areas such as the national (Canada), provincial and territorial levels.

    Analytical products

    The analytical document provides analysis of the key findings and trends in the data, and is complemented by the short articles found in NHS in Brief and the NHS Focus on Geography Series.

    Data products

    The NHS Profile is one data product that provides a statistical overview of user selected geographic areas based on several detailed variables and/or groups of variables. Other data products include data tables which represent a series of cross tabulations ranging in complexity and are available for various levels of geography.

    Release date: 2019-10-29

  • Surveys and statistical programs – Documentation: 11-621-M2018105
    Description:

    Statistics Canada needs to respond to the legalization of cannabis for non-medical use by measuring various aspects of the introduction of cannabis in the Canadian economy and society. An important part of measuring the economy and society is using statistical classifications. Classifications are commonly updated and revised as new industries, products, occupations and educational programs are introduced into the Canadian economy and society. This paper describes the changes to the various statistical classifications used by Statistics Canada in order to measure the introduction of legal non-medical cannabis.

    Release date: 2019-07-24

  • Surveys and statistical programs – Documentation: 11-633-X2019001
    Description:

    The mandate of the Analytical Studies Branch (ASB) is to provide high-quality, relevant and timely information on economic, health and social issues that are important to Canadians. The branch strategically makes use of expert knowledge and a large range of statistical sources to describe, draw inferences from, and make objective and scientifically supported deductions about the evolving nature of the Canadian economy and society. Research questions are addressed by applying leading-edge methods, including microsimulation and predictive analytics using a range of linked and integrated administrative and survey data. In supporting greater access to data, ASB linked data are made available to external researchers and policy makers to support evidence-based decision making. Research results are disseminated by the branch using a range of mediums (i.e., research papers, studies, infographics, videos, and blogs) to meet user needs. The branch also provides analytical support and training, feedback, and quality assurance to the wide range of programs within and outside Statistics Canada.

    Release date: 2019-05-29

  • Surveys and statistical programs – Documentation: 75-005-M2019001
    Description:

    The production of statistics from the Labour Force Survey (LFS) involves many activities, one of which is data processing. This step involves the verification and correction of survey data when required in order to produce microdata files. Beginning in January 2019, LFS processing will be transitioned to a new system, the Social Survey Processing Environment. This document describes the development and testing that preceded the implementation of the new system, and demonstrates that the transition is expected to have minimal impact on LFS estimates and be transparent to users of LFS data.

    Release date: 2019-02-08

  • Surveys and statistical programs – Documentation: 71-526-X
    Description:

    The Canadian Labour Force Survey (LFS) is the official source of monthly estimates of total employment and unemployment. Following the 2011 census, the LFS underwent a sample redesign to account for the evolution of the population and labour market characteristics, to adjust to changes in the information needs and to update the geographical information used to carry out the survey. The redesign program following the 2011 census culminated with the introduction of a new sample at the beginning of 2015. This report is a reference on the methodological aspects of the LFS, covering stratification, sampling, collection, processing, weighting, estimation, variance estimation and data quality.

    Release date: 2017-12-21

  • Surveys and statistical programs – Documentation: 11-633-X2017007
    Description:

    The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 30 years. The IMDB combines administrative files on immigrant admissions and non-permanent resident permits from Immigration, Refugees and Citizenship Canada (IRCC) with tax files from the Canada Revenue Agency (CRA). Information is available for immigrant taxfilers admitted since 1980. Tax records for 1982 and subsequent years are available for immigrant taxfilers.

    This report will discuss the IMDB data sources, concepts and variables, record linkage, data processing, dissemination, data evaluation and quality indicators, comparability with other immigration datasets, and the analyses possible with the IMDB.

    Release date: 2017-06-16