Results

All (10)

  • Articles and reports: 11-621-M2021005
    Description:

    Multinational enterprises (MNEs) have been drivers of globalization. These enterprises have taken advantage of innovations in logistics and communications technology over the past four decades to diversify their supply chains and expand into new markets. Operating internationally, however, also allows MNEs to take advantage of tax systems that were designed for a less integrated era. For example, MNEs can arrange for profits to be 'shifted' by charging affiliates in high-tax locations prices above market rates in transactions with affiliates in lower-tax regions. These behaviours are referred to as base erosion and profit shifting (BEPS), and, although not illegal, they affect government revenues worldwide.

    Release date: 2021-12-02
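
The profit-shifting mechanism described in this abstract can be illustrated with a small worked calculation. This is only a sketch: the function name `group_tax`, the profit figures and the tax rates are hypothetical and are not taken from the report.

```python
def group_tax(profit_high, profit_low, rate_high, rate_low, overcharge=0.0):
    """Total tax paid by a two-affiliate group.

    `overcharge` is the amount by which the low-tax affiliate prices its
    sales to the high-tax affiliate above the market rate. It moves that
    much pre-tax profit from the high-tax to the low-tax jurisdiction.
    All figures are illustrative assumptions.
    """
    taxable_high = profit_high - overcharge
    taxable_low = profit_low + overcharge
    return taxable_high * rate_high + taxable_low * rate_low


# Hypothetical group: equal profits, 30% and 10% statutory rates.
baseline = group_tax(1000.0, 1000.0, 0.30, 0.10)
shifted = group_tax(1000.0, 1000.0, 0.30, 0.10, overcharge=200.0)
tax_saved = baseline - shifted  # overcharge * (rate_high - rate_low)
```

Under these assumed numbers, total group profit is unchanged, but the tax bill falls by the shifted amount times the difference between the two rates.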

  • Articles and reports: 11-522-X202100100024
    Description: The Economic Directorate of the U.S. Census Bureau is developing coordinated design and sample selection procedures for the Annual Integrated Economic Survey. The unified sample will replace the directorate’s existing practice of independently developing sampling frames and sampling procedures for a suite of separate annual surveys, which optimizes sample design features at the cost of increased response burden. Size attributes of business populations, e.g., revenues and employment, are highly skewed. A high percentage of companies operate in more than one industry. Therefore, many companies are sampled into multiple surveys compounding the response burden, especially for “medium sized” companies.

    This component of response burden is reduced by selecting a single coordinated sample but will not be completely alleviated. Response burden is a function of several factors, including (1) questionnaire length and complexity, (2) accessibility of data, (3) expected number of repeated measures, and (4) frequency of collection. The sample design can have profound effects on the third and fourth factors. To help inform decisions about the integrated sample design, we use regression trees to identify covariates from the sampling frame that are related to response burden. Using historic frame and response data from four independently sampled surveys, we test a variety of algorithms, then grow regression trees that explain relationships between expected levels of response burden (as measured by response rate) and frame covariates common to more than one survey. We validate initial findings by cross-validation, examining results over time. Finally, we make recommendations on how to incorporate our robust findings into the coordinated sample design.
    Release date: 2021-10-29

  • Articles and reports: 11-522-X202100100017
    Description: The outbreak of the COVID-19 pandemic required the Government of Canada to provide relevant and timely information to support decision-making around a host of issues, including personal protective equipment (PPE) procurement and deployment. Our team built a compartmental epidemiological model from an existing code base to project PPE demand under a range of epidemiological scenarios. This model was further enhanced using data science techniques, which allowed for the rapid development and dissemination of model results to inform policy decisions.

    Key Words: COVID-19; SARS-CoV-2; Epidemiological model; Data science; Personal Protective Equipment (PPE); SEIR

    Release date: 2021-10-22
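
For readers unfamiliar with compartmental models, the following is a minimal sketch of a basic SEIR model (the keyword listed above), integrated with a simple forward-Euler scheme. It is not the authors' model or code base. The parameter names, the example values and the assumption that PPE demand is proportional to the number of infectious people are all hypothetical.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate a basic SEIR model and return one (S, E, I, R) tuple per day.

    beta  : transmission rate per day
    sigma : rate of leaving the exposed compartment (1 / incubation period)
    gamma : recovery rate (1 / infectious period)
    """
    n = s0 + e0 + i0 + r0
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


def ppe_demand(history, units_per_case_per_day):
    """Assumed demand rule: PPE units scale linearly with infectious cases."""
    return [i * units_per_case_per_day for (_, _, i, _) in history]


# Hypothetical scenario: population of 10,000 with 10 initial infections.
history = simulate_seir(beta=0.5, sigma=0.2, gamma=0.1,
                        s0=9990, e0=0, i0=10, r0=0, days=30)
demand = ppe_demand(history, units_per_case_per_day=4.0)
```

Running the same function over several `beta` values is one simple way to compare demand under a range of epidemiological scenarios.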

  • Articles and reports: 11-522-X202100100003
    Description:

    The increasing size and richness of digital data allow for modeling more complex relationships and interactions, which is the strong point of machine learning. Here we applied gradient boosting to the Dutch system of social statistical datasets to estimate transition probabilities into and out of poverty. Individual estimates are reasonable, but the main advantages of the approach in combination with SHAP and global surrogate models are the simultaneous ranking of hundreds of features by their importance, detailed insight into their relationship with the transition probabilities, and the data-driven identification of subpopulations with relatively high and low transition probabilities. In addition, we decompose the difference in feature importance between the general population and a subpopulation into a frequency effect and a feature effect. We caution against misinterpretation and discuss future directions.

    Key Words: Classification; Explainability; Gradient boosting; Life event; Risk factors; SHAP decomposition.

    Release date: 2021-10-15

  • Articles and reports: 11-522-X202100100004
    Description:

    With labour market uncertainty increasing across Canada, there is a need for innovative ways to help displaced workers to re-skill/up-skill and potentially pivot to in-demand occupations. In our study, we present a unique approach to bridge the gap between the displaced and in-demand occupations and provide a machine learning framework that may be able to forecast employment by NAICS for 6 months. We have combined the monthly employment data from Statistics Canada’s Survey of Employment and Payroll Hours, and the monthly job ads counts from Burning Glass to achieve our goal. Our approach consists of three steps:

    Step 1. Finding the displaced occupations in Alberta over the last 7 years based on the integrated actual employment and job ads count data.
    Step 2. Using the list of displaced occupations, a unique pivot graph is developed to map a displaced occupation to a list of in-demand occupations which have skills similar to the chosen displaced occupation.
    Step 3. Applying SARIMA and SARIMAX models to forecast employment for 6 months.

    The above approaches are aimed at assisting public policy and planning.

    Key Words: Employment; Labour Market; Job Ads; Skills; Time Series Analysis; Forecasting.

    Release date: 2021-10-15

  • Articles and reports: 11-522-X202100100020
    Description: Seasonal adjustment of time series at Statistics Canada is performed using the X-12-ARIMA method. For most statistical programs performing seasonal adjustment, subject matter experts (SMEs) are responsible for managing the program and for verification, analysis and dissemination of the data, while methodologists from the Time Series Research and Analysis Center (TSRAC) are responsible for developing and maintaining the seasonal adjustment process and for providing support on seasonal adjustment to SMEs. A visual summary report called the seasonal adjustment dashboard has been developed in R Shiny by the TSRAC to build capacity to interpret seasonally adjusted data and to reduce the resources needed to support seasonal adjustment. It is currently being made available internally to assist SMEs to interpret and explain seasonally adjusted results. The summary report includes graphs of the series across time, as well as summaries of individual seasonal and calendar effects and patterns. Additionally, key seasonal adjustment diagnostics are presented and the net effect of seasonal adjustment is decomposed into its various components. This paper gives a visual representation of the seasonal adjustment process, while demonstrating the dashboard and its interactive functionality.

    Key Words: Time Series; X-12-ARIMA; Summary Report; R Shiny.

    Release date: 2021-10-15

  • Articles and reports: 11-522-X202100100021
    Description: Istat has started a new project on its short-term statistical processes to satisfy the forthcoming EU Regulation requiring estimates to be released in a shorter time. The assessment and analysis of the current Short Term Survey on Turnover in Services (FAS) process aims to identify how the best features of the current methods and practices can be exploited to design a more “efficient” process. In particular, the project is expected to release methods that would allow important economies of scale, scope and knowledge to be applied in general to the STS production context, which usually works with a limited number of resources. The analysis of the AS-IS process revealed that the FAS survey incurs substantial E&I costs, especially due to intensive follow-up and interactive editing that is used for every type of detected error. With this in mind, we tried to exploit the lessons learned by participating in the High-Level Group for the Modernisation of Official Statistics (HLG-MOS, UNECE) project on the Use of Machine Learning in Official Statistics. In this work, we present a first experiment using Random Forest models to: (i) predict which units represent “suspicious” data, (ii) assess the potential use of the predictions on new data and (iii) explore data to identify hidden rules and patterns. In particular, we focus on the use of Random Forest modelling to compare some alternative methods in terms of error prediction efficiency and to address the major aspects of the new design of the E&I scheme.
    Release date: 2021-10-15

  • Articles and reports: 11F0019M2021006
    Description:

    The overall objective of this paper is to provide an overview of selected approaches to measuring and reporting well-being in Canada and internationally, and to identify opportunities to move forward with new and enhanced measures to address current social, economic and environmental issues facing Canada that may impact the well-being of its population. This report highlights six trends and proposes a range of data development and measurement activities to advance well-being measurement in the following key areas: digitization, affordability and economic uncertainty, the quality of jobs, social cohesion, neighbourhoods and the built environment and climate change.

    Release date: 2021-07-12

  • Articles and reports: 12-001-X202100100008
    Description:

    Changes in the design of a repeated survey generally result in systematic effects in the sample estimates, which are further referred to as discontinuities. To avoid confounding real period-to-period change with the effects of a redesign, discontinuities are often quantified by conducting the old and the new design in parallel for some period of time. Sample sizes of such parallel runs are generally too small to apply direct estimators for domain discontinuities. A bivariate hierarchical Bayesian Fay-Herriot (FH) model is proposed to obtain more precise predictions for domain discontinuities and is applied to a redesign of the Dutch Crime Victimization Survey. This method is compared with a univariate FH model where the direct estimates under the regular approach are used as covariates in an FH model for the alternative approach conducted on a reduced sample size, and with a univariate FH model where the direct estimates for the discontinuities are modeled directly. An adjusted step forward selection procedure is proposed that minimizes the WAIC until the reduction of the WAIC is smaller than the standard error of this criterion. With this approach more parsimonious models are selected, which prevents selecting complex models that tend to overfit the data.

    Release date: 2021-06-24
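
The stopping rule in the step forward selection described above can be sketched as a greedy loop. This is an illustration rather than the authors' procedure. The `evaluate` callback, which returns a (WAIC, standard error) pair for a set of covariates, is hypothetical, and comparing the reduction against the standard error of the candidate model's WAIC is an assumption about which standard error is meant.

```python
def forward_select(candidates, evaluate):
    """Greedy forward selection on WAIC with a standard-error stopping rule.

    candidates : iterable of covariate names
    evaluate   : hypothetical callback mapping a tuple of covariates to
                 (waic, standard_error); lower WAIC is better
    """
    selected = []
    best_waic, _ = evaluate(tuple(selected))
    remaining = list(candidates)
    while remaining:
        # Score every model that adds exactly one more covariate.
        scored = [(evaluate(tuple(selected + [c])), c) for c in remaining]
        (waic, se), choice = min(scored, key=lambda item: item[0][0])
        # Stop once the WAIC reduction is smaller than its standard error.
        if best_waic - waic < se:
            break
        selected.append(choice)
        remaining.remove(choice)
        best_waic = waic
    return selected, best_waic


# Toy scores, invented for illustration only.
toy_scores = {
    (): (100.0, 2.0),
    ("a",): (90.0, 2.0),
    ("b",): (95.0, 2.0),
    ("a", "b"): (89.0, 2.0),
}
chosen, final_waic = forward_select(
    ["a", "b"], lambda covs: toy_scores[tuple(sorted(covs))]
)
```

In the toy example, adding `b` after `a` lowers the WAIC by less than the standard error, so the more parsimonious one-covariate model is kept.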

  • Stats in brief: 11-627-M2021025
    Description:

    This infographic highlights a selection of statistics on restaurants, bars and caterers in Canada.

    Release date: 2021-03-25