Data analysis

Results

All (289) (10 to 20 of 289 results)

  • Stats in brief: 89-20-00062024003
    Description: This video is intended for professionals, policymakers, and researchers who are interested in understanding how data linkage can be used to gain deeper insights into various issues. It demonstrates how combining data from different sources can help address gaps in information, leading to better-informed policies and improved outcomes.
    Release date: 2024-11-25

  • Data Visualization: 71-607-X2020010
    Description: The Canadian Statistical Geospatial Explorer empowers users to discover geo-enabled data holdings of Statistics Canada at various levels of geography, including the neighbourhood level. Users are able to visualize, thematically map, spatially explore and analyze, export and consume data in various formats. Users can also view the data superimposed on satellite imagery, topographic and street layers.
    Release date: 2024-08-21

  • Articles and reports: 11-522-X202200100004
    Description: In accordance with Statistics Canada’s long-term Disaggregated Data Action Plan (DDAP), several initiatives have been implemented in the Labour Force Survey (LFS). One of the more direct initiatives was a targeted increase in the size of the monthly LFS sample. Furthermore, a regular Supplement program was introduced, in which an additional series of questions is asked of a subset of LFS respondents and analyzed in a monthly or quarterly production cycle. Finally, the production of modelled estimates based on Small Area Estimation (SAE) methodologies resumed for the LFS and will have a wider scope and more analytical value than in the past. This paper gives an overview of these three initiatives.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100006
    Description: The Australian Bureau of Statistics (ABS) is committed to improving access to more microdata, while ensuring that privacy and confidentiality are maintained, through its virtual DataLab, which helps researchers undertake complex research more efficiently. Currently, DataLab research outputs must follow strict rules that minimise disclosure risks before they are cleared. However, the clerical-review process is not cost-effective and has the potential to introduce errors. The increasing number of statistical outputs from different projects can introduce differencing risks even when each output has met the strict output rules. The ABS has been exploring the possibility of providing automatic output checking using the ABS cellkey methodology to ensure that all outputs across different projects are protected consistently, minimising differencing risks and reducing the costs associated with output checking.
    Release date: 2024-03-25

  • Articles and reports: 11-522-X202200100009
    Description: Education and training are acknowledged as fundamental to the development of a society. They form a complex multidimensional phenomenon whose determinants are attributable to several interrelated family and socio-economic conditions. To meet the demand for statistical information that supports policymaking and its monitoring and evaluation, the Italian National Statistical Institute (Istat) is renewing its education and training statistical production system by implementing a new thematic statistical register. The register will be part of the Istat Integrated System of Registers, making it possible to relate education and training to other relevant phenomena, e.g. the transition to work.
    Release date: 2024-03-25

  • 19-22-0012
    Description: Overall course objective: Learn what disaggregated data is and how disaggregated data can be used at different stages of the policy-making cycle.

    Target audience: Junior policy analysts, or those who have less experience with working with data.

    Format: Virtual instructor-led course over 3 consecutive days (from 10am to 3pm each day), with a one-hour lunch break.

    Course structure: Six modules.

    Price: $500 per participant. 

    Contact information: For general information about this course and how to register, contact the Analytical Studies and Modelling Branch: statcan.asbtraining-deaformation.statcan@statcan.gc.ca.

    https://www.statcan.gc.ca/en/training/surveys/19220012
    Release date: 2024-03-07

  • Surveys and statistical programs – Documentation: 11-633-X2024001
    Description: The Longitudinal Immigration Database (IMDB) is a comprehensive source of data that plays a key role in the understanding of the economic behaviour of immigrants. It is the only annual Canadian dataset that allows users to study the characteristics of immigrants to Canada at the time of admission and their economic outcomes and regional (inter-provincial) mobility over a time span of more than 35 years.
    Release date: 2024-01-22

  • Stats in brief: 11-001-X202402237898
    Description: Release published in The Daily – Statistics Canada’s official release bulletin
    Release date: 2024-01-22

  • Articles and reports: 12-001-X202300200007
    Description: Conformal prediction is an assumption-lean approach to generating distribution-free prediction intervals or sets, for nearly arbitrary predictive models, with guaranteed finite-sample coverage. Conformal methods are an active research topic in statistics and machine learning, but only recently have they been extended to non-exchangeable data. In this paper, we invite survey methodologists to begin using and contributing to conformal methods. We introduce how conformal prediction can be applied to data from several common complex sample survey designs, under a framework of design-based inference for a finite population, and we point out gaps where survey methodologists could fruitfully apply their expertise. Our simulations empirically bear out the theoretical guarantees of finite-sample coverage, and our real-data example demonstrates how conformal prediction can be applied to complex sample survey data in practice.
    Release date: 2024-01-03

  • Articles and reports: 45-20-00022023004
    Description: Gender-based Analysis Plus (GBA Plus) is an analytical tool developed by Women and Gender Equality Canada (WAGE) to support the development of responsive and inclusive policies, programs, and other initiatives. This information sheet shows how GBA Plus can be used to disaggregate and analyze data to identify the groups most affected by certain issues, such as overqualification.
    Release date: 2023-11-27
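
The conformal prediction entry above (12-001-X202300200007) describes distribution-free prediction intervals with finite-sample coverage. As a rough illustration only, the sketch below shows basic split conformal prediction for exchangeable data; it is not the survey-design extension that the paper develops. The function name `split_conformal_interval`, the toy predictor and the calibration data are all hypothetical.

```python
import math

def split_conformal_interval(predict, calib_x, calib_y, new_x, alpha=0.1):
    """Return a (lower, upper) prediction interval for new_x.

    Assumes the calibration points and the new point are exchangeable
    and that `predict` was fitted on data separate from the calibration set.
    """
    # Nonconformity scores: absolute residuals on the calibration set.
    scores = sorted(abs(y - predict(x)) for x, y in zip(calib_x, calib_y))
    n = len(scores)
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    centre = predict(new_x)
    return (centre - q, centre + q)

# Hypothetical toy data and a hypothetical pre-fitted predictor.
predict = lambda x: 2.0 * x
calib_x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
calib_y = [2.5, 3.0, 6.5, 8.0, 9.0, 12.5, 14.0, 17.0, 18.0]
lower, upper = split_conformal_interval(predict, calib_x, calib_y, 10, alpha=0.2)
print(lower, upper)
```

With complex sample designs the exchangeability assumption generally does not hold as stated here, which is the gap the paper addresses.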
Data (2) (2 results)

  • Data Visualization: 71-607-X2020010
    Description: The Canadian Statistical Geospatial Explorer empowers users to discover geo-enabled data holdings of Statistics Canada at various levels of geography, including the neighbourhood level. Users are able to visualize, thematically map, spatially explore and analyze, export and consume data in various formats. Users can also view the data superimposed on satellite imagery, topographic and street layers.
    Release date: 2024-08-21

  • Data Visualization: 71-607-X2019010
    Description: The Housing Data Viewer is a visualization tool that allows users to explore Statistics Canada data on a map. Users can use the tool to navigate, compare and export data.
    Release date: 2019-10-30
Analysis (256) (220 to 230 of 256 results)

  • Articles and reports: 12-001-X199400214421
    Description:

    This paper discusses testing a single hypothesis about linear regression coefficients based on sample survey data. It suggests that when the design-based linearization variance estimator for a regression coefficient is used it should be adjusted to reduce its slight model bias and that a Satterthwaite-like estimation of its effective degrees of freedom be made. A very important special case of this analysis is its application to domain means.

    Release date: 1994-12-15

  • Articles and reports: 11F0019M1994070
    Geography: Canada
    Description:

    This paper uses job turnover data to compare how job creation, job destruction and net job change differ for small and large establishments in the Canadian manufacturing sector. It uses several different techniques to correct for the regression-to-the-mean problem that, it has been suggested, might incorrectly lead to the conclusion that small establishments create a disproportionate number of new jobs. It finds that net job creation for smaller establishments is greater than that of large establishments after such corrections are made. The paper also compares the importance of small and large establishments in the manufacturing sectors of Canada and the United States. The Canadian manufacturing sector is shown both to have a larger proportion of employment in smaller establishments and to have a small-establishment sector that is growing in importance relative to that of the United States.

    Release date: 1994-11-16

  • Articles and reports: 12-001-X199300214452
    Description:

    Surveys across time can serve many objectives. The first half of the paper reviews the abilities of alternative survey designs across time - repeated surveys, panel surveys, rotating panel surveys and split panel surveys - to meet these objectives. The second half concentrates on panel surveys. It discusses the decisions that need to be made in designing a panel survey, the problems of wave nonresponse, time-in-sample bias and the seam effect, and some methods for the longitudinal analysis of panel survey data.

    Release date: 1993-12-15

  • Articles and reports: 12-001-X199300214460
    Description:

    Methods for estimating response bias in surveys require “unbiased” remeasurements for at least a subsample of observations. The usual estimator of response bias is the difference between the mean of the original observations and the mean of the unbiased observations. In this article, we explore a number of alternative estimators of response bias derived from a model prediction approach. The assumed sampling design is a stratified two-phase design implementing simple random sampling in each phase. We assume that the characteristic, y, is observed for each unit selected in phase 1 while the true value of the characteristic, μ, is obtained for each unit in the subsample selected at phase 2. We further assume that an auxiliary variable x is known for each unit in the phase 1 sample and that the population total of x is known. A number of models relating y, μ and x are assumed which yield alternative estimators of E(y - μ), the response bias. The estimators are evaluated using a bootstrap procedure for estimating variance, bias, and mean squared error. Our bootstrap procedure is an extension of the Bickel-Freedman single phase method to the case of a stratified two-phase design. As an illustration, the methodology is applied to data from the National Agricultural Statistics Service reinterview program. For these data, we show that the usual difference estimator is outperformed by the model-assisted estimator suggested by Särndal, Swensson and Wretman (1991), thus indicating that improvements over the traditional estimator are possible using the model prediction approach.

    Release date: 1993-12-15

  • Articles and reports: 12-001-X199300114476
    Description:

    This paper focuses on how to deal with record linkage errors when engaged in regression analysis. Recent work by Rubin and Belin (1991) and by Winkler and Thibaudeau (1991) provides the theory, computational algorithms, and software necessary for estimating matching probabilities. These advances allow us to update the work of Neter, Maynes, and Ramanathan (1965). Adjustment procedures are outlined and some successful simulations are described. Our results are preliminary and intended largely to stimulate further work.

    Release date: 1993-06-15

  • Articles and reports: 12-001-X199200214484
    Description:

    Maximum likelihood estimation from complex sample data requires additional modeling due to the information in the sample selection. Alternatively, pseudo maximum likelihood methods that consist of maximizing estimates of the census score function can be applied. In this article we review some of the approaches considered in the literature and compare them with a new approach derived from the ideas of ‘weighted distributions’. The focus of the comparisons is on situations where some or all of the design variables are unknown or misspecified. The results obtained for the new method are encouraging, but the study is limited so far to simple situations.

    Release date: 1992-12-15

  • Articles and reports: 12-001-X199200214487
    Description:

    This paper reviews the idea of robustness for randomisation and model-based inference for descriptive and analytic surveys. The lack of robustness for model-based procedures can be partially overcome by careful design. In this paper a robust model-based approach to analysis is proposed based on smoothing methods.

    Release date: 1992-12-15

  • Articles and reports: 12-001-X199200114494
    Description:

    This article presents a selected annotated bibliography of the literature on capture-recapture (dual system) estimation of population size, on extensions to the basic methodology, and the application of these techniques in the context of census undercount estimation.

    Release date: 1992-06-15

  • Articles and reports: 12-001-X199200114498
    Description:

    One way to assess the undercount at subnational levels (e.g. the state level) is to obtain sample data from a post-enumeration survey, and then smooth those data based on a linear model of explanatory variables. The relative importance of sampling-error variances to corresponding model-error variances determines the amount of smoothing. Maximum likelihood estimation can lead to oversmoothing, thereby making the assessment of undercount over-reliant on the linear model. Restricted maximum likelihood (REML) estimators do not suffer from this drawback. Empirical Bayes prediction of undercount based on REML is presented in this article and compared to maximum likelihood and a method of moments by both simulation and example. Large-sample distributional properties of the REML estimators allow accurate mean squared prediction errors of the REML-based smoothers to be computed.

    Release date: 1992-06-15

  • Articles and reports: 12-001-X199200114499
    Description:

    This paper reviews some of the arguments for and against adjusting the U.S. census of 1980, and the decision of the court.

    Release date: 1992-06-15
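
The capture-recapture (dual system) entry above (12-001-X199200114494) concerns estimating population size from two overlapping enumerations. As a minimal sketch, the code below computes the basic Lincoln-Petersen form of the dual-system estimator, assuming the two lists are independent and records are matched without error. The function name `dual_system_estimate` and all counts are hypothetical; the extensions covered by the bibliography are not shown.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Lincoln-Petersen estimate of total population size.

    census_count  -- people counted in the first enumeration (e.g. a census)
    survey_count  -- people counted in the second enumeration (e.g. a
                     post-enumeration survey) covering the same area
    matched_count -- people found in both enumerations
    """
    if matched_count <= 0:
        raise ValueError("at least one matched record is required")
    return census_count * survey_count / matched_count

# Hypothetical counts for one area.
n_hat = dual_system_estimate(census_count=900, survey_count=100, matched_count=90)
# Implied share of the estimated population missed by the first enumeration.
undercount_rate = 1 - 900 / n_hat
print(n_hat, round(undercount_rate, 3))
```

The estimate relies on the share of survey respondents who were also found in the census (90 of 100 here) being a fair estimate of census coverage.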
Reference (26) (20 to 26 of 26 results)

  • Surveys and statistical programs – Documentation: 11F0019M2003207
    Geography: Canada
    Description:

    The estimation of intergenerational earnings mobility is rife with measurement problems since the researcher does not observe permanent, lifetime earnings. Nearly all studies make corrections for mean variation in earnings because of the age differences among respondents. Recent work employs average earnings or instrumental variable methods to address the effects of measurement error resulting from transitory earnings shocks and misreporting. However, empirical studies of intergenerational mobility have paid no attention to the changes in earnings variance across the life cycle suggested by economic models of human capital investment.

    Using information from the Intergenerational Income Data from Canada and the National Longitudinal Survey and Panel Study of Income Dynamics from the United States, this study finds a strong association between age at observation and estimated earnings persistence. Part of this age-dependence is related to a general increase in transitory earnings variance during the collection of data. An independent effect of life cycle investment is also identified. These findings are then applied to the variation among intergenerational earnings persistence studies. Among studies with similar methodologies, one-third of the variance in published estimates of earnings persistence is attributable to cross-study differences in the age of responding fathers. Finally, these results call into question tests for the importance of credit constraints based on measures of earnings at different points in the life cycle.

    Release date: 2003-08-05

  • Surveys and statistical programs – Documentation: 12-584-G
    Description:

    This book introduces technical aspects of the Statistics Canada Total Work Accounts System (TWAS). The TWAS is designed to facilitate the analysis of issues that require simultaneous consideration of both paid work and unpaid productive work. Its key contribution is to allocate the deemed output of each episode of unpaid work activity to a specific beneficiary or group of beneficiaries (called "destinations"). The guide presents the criteria used to decide the allocation of each work episode to one of the destinations, as well as the pseudo code for DESTIN, the key variable of the System. This pseudo code allows programmers to quickly create the actual programming code needed to derive the DESTIN variable in their own microdata files of diary-based time-use records. The guide also discusses illustrative applications of the System, as well as its key limitations.

    Release date: 2002-02-12

  • Notices and consultations: 87-003-X19970012882
    Geography: Canada
    Description:

    The purpose of this article is to inform Travel-log readers of the availability of a new analytical tool - the National Tourism Indicators. These estimates, which measure trends in tourism in Canada, are placed in perspective here, taking into account the concepts and definitions used in developing them.

    Release date: 1997-01-08

  • Surveys and statistical programs – Documentation: 11F0019M1995083
    Geography: Canada
    Description:

    This paper examines the robustness of a measure of the average complete duration of unemployment in Canada to a host of assumptions used in its derivation. In contrast to the average incomplete duration of unemployment, which is a lagging cyclical indicator, this statistic is a coincident indicator of the business cycle. The impact of using a steady-state as opposed to a non-steady-state assumption, as well as the impact of various corrections for response bias, is explored. It is concluded that a non-steady-state estimator would be a valuable complement to the statistics on unemployment duration that are currently released by many statistical agencies, and particularly Statistics Canada.

    Release date: 1995-12-30

  • Surveys and statistical programs – Documentation: 75F0002M1993014
    Description:

    This paper presents the results from test 3A of the Survey of Labour and Income Dynamics (SLID), conducted in January 1993, with a view to identifying any necessary changes to the questions or to the algorithm used to derive labour force status.

    Release date: 1995-12-30

  • Surveys and statistical programs – Documentation: 75F0002M1994018
    Description:

    This document describes the demographic, cultural and geographic derived variables for the Survey of Labour and Income Dynamics (SLID).

    Release date: 1995-12-30