Results

All (466)

Data (4)

  • Public use microdata: 89F0002X
    Description:

    The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It comprises a database, a series of tax/transfer algorithms and models, analytical software and user documentation. (A minimal sketch of this kind of static tax/transfer calculation follows this list.)

    Release date: 2018-12-18

  • Public use microdata: 81M0011X
    Description:

    This survey was designed to determine such factors as: the extent to which graduates of postsecondary programs had been successful in obtaining employment since graduation; the relationship between the graduates' programs of study and the employment subsequently obtained; the graduates' job and career satisfaction; the rates of under-employment and unemployment; the type of employment obtained related to career expectations and qualification requirements; and the influence of postsecondary education on occupational achievement. The information is directed towards policy makers, researchers, educators, employers and young adults interested in postsecondary education and the transition from school to work of trade/vocational, college and university graduates.

    Release date: 2015-01-12

  • Public use microdata: 12M0014X
    Geography: Province or territory
    Description:

    This report presents a brief overview of the information collected in Cycle 14 of the General Social Survey (GSS). Cycle 14 is the first cycle to collect detailed information on access to and use of information communication technology in Canada. Topics include general use of technology and computers, technology in the workplace, development of computer skills, frequency of Internet and E-mail use, non-users, and security and information on the Internet. The target population of the GSS is all individuals aged 15 and over living in a private household in one of the ten provinces.

    Release date: 2001-06-29

  • Public use microdata: 82M0009X
    Description:

    The National Population Health Survey (NPHS) used the Labour Force Survey sampling frame to draw the initial sample of approximately 20,000 households starting in 1994 and for the sample top-up in this third cycle. The survey is conducted every two years. Sample collection is distributed over four quarterly periods followed by a follow-up period, and the whole process takes a year. In each household, some limited health information is collected from all household members and one person in each household is randomly selected for a more in-depth interview.

    The survey is designed to collect information on the health of the Canadian population and related socio-demographic information. The first cycle of data collection began in 1994, and continues every second year thereafter. The survey is designed to produce both cross-sectional and longitudinal estimates. The questionnaire includes content related to health status, use of health services, determinants of health, a health index, chronic conditions and activity restrictions. The use of health services is probed through visits to health care providers, both traditional and non-traditional, and the use of drugs and other medications. Health determinants include smoking, alcohol use and physical activity. Special focus content for this cycle includes family medical history, with questions about certain chronic conditions among immediate family members and when they were acquired. A section on self-care has also been included this cycle. The socio-demographic information includes age, sex, education, ethnicity, household income and labour force status.

    Release date: 2000-12-19
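The SPSD/M entry above (89F0002X) describes a static microsimulation model: a household database run through tax/transfer algorithms to compute taxes paid and transfers received, then weighted up to population aggregates. The sketch below illustrates that general pattern only; the rules, rates and thresholds are hypothetical and are not SPSD/M's actual parameters or algorithms.

```python
# Minimal sketch of a static tax/transfer microsimulation (hypothetical rules,
# not SPSD/M's). Each record is processed independently and results are
# aggregated with survey weights.
from dataclasses import dataclass

@dataclass
class Household:
    employment_income: float
    n_children: int
    weight: float                     # number of households represented

FLAT_TAX_RATE = 0.20                  # hypothetical
CHILD_BENEFIT = 2000.0                # hypothetical, per child
BENEFIT_CUTOFF = 60000.0              # hypothetical income test

def simulate(hh: Household) -> dict:
    """Apply the (hypothetical) tax and transfer rules to one household."""
    tax = FLAT_TAX_RATE * hh.employment_income
    transfer = CHILD_BENEFIT * hh.n_children if hh.employment_income < BENEFIT_CUTOFF else 0.0
    return {"tax": tax, "transfer": transfer,
            "disposable": hh.employment_income - tax + transfer}

households = [Household(45000.0, 2, 350.0), Household(90000.0, 1, 410.0)]

# Weighted aggregates, as a microsimulation model would report them.
total_tax = sum(simulate(h)["tax"] * h.weight for h in households)
total_transfer = sum(simulate(h)["transfer"] * h.weight for h in households)
print(f"aggregate tax: {total_tax:,.0f}   aggregate transfers: {total_transfer:,.0f}")
```

A model of this kind keeps the database, the parameter set and the algorithms separate so that alternative tax/transfer rules can be compared against a base run.
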
Analysis (414) (first 10 of 414 results shown)

  • Articles and reports: 12-001-X201800154928
    Description:

    A two-phase process was used by the Substance Abuse and Mental Health Services Administration to estimate the proportion of US adults with serious mental illness (SMI). The first phase was the annual National Survey on Drug Use and Health (NSDUH), while the second phase was a random subsample of adult respondents to the NSDUH. Respondents to the second phase of sampling were clinically evaluated for serious mental illness. A logistic prediction model was fit to this subsample with the SMI status (yes or no) determined by the second-phase instrument treated as the dependent variable and related variables collected on the NSDUH from all adults as the model’s explanatory variables. Estimates were then computed for SMI prevalence among all adults and within adult subpopulations by assigning an SMI status to each NSDUH respondent based on comparing his or her estimated probability of having SMI to a chosen cut point on the distribution of the predicted probabilities. We investigate alternatives to this standard cut point estimator, such as the probability estimator. The latter assigns an estimated probability of having SMI to each NSDUH respondent. The estimated prevalence of SMI is the weighted mean of those estimated probabilities. Using data from NSDUH and its subsample, we show that, although the probability estimator has a smaller mean squared error when estimating SMI prevalence among all adults, it has a greater tendency to be biased at the subpopulation level than the standard cut point estimator. (A small numerical sketch of the two estimators follows this list.)

    Release date: 2018-06-21

  • Articles and reports: 12-001-X201800154963
    Description:

    The probability-sampling-based framework has dominated survey research because it provides precise mathematical tools to assess sampling variability. However, increasing costs and declining response rates are expanding the use of non-probability samples, particularly in general population settings, where samples of individuals pulled from web surveys are becoming increasingly cheap and easy to access. But non-probability samples are at risk for selection bias due to differential access, degrees of interest, and other factors. Calibration to known statistical totals in the population provides a means of potentially diminishing the effect of selection bias in non-probability samples. Here we show that model calibration using adaptive LASSO can yield a consistent estimator of a population total as long as a subset of the true predictors is included in the prediction model, thus allowing large numbers of possible covariates to be included without risk of overfitting. We show that model calibration using adaptive LASSO provides improved estimation with respect to mean square error relative to standard competitors such as generalized regression (GREG) estimators when a large number of covariates are required to determine the true model, with effectively no loss in efficiency over GREG when smaller models will suffice. We also derive closed-form variance estimators of population totals, and compare their behavior with bootstrap estimators. We conclude with a real-world example using data from the National Health Interview Survey.

    Release date: 2018-06-21

  • Articles and reports: 11-633-X2017008
    Description:

    The DYSEM microsimulation modelling platform provides a demographic and socioeconomic core that can be readily built upon to develop custom dynamic microsimulation models or applications. This paper describes DYSEM and provides an overview of its intended uses, as well as the methods and data used in its development.

    Release date: 2017-07-28

  • Articles and reports: 13-604-M2017083
    Description:

    Statistics Canada regularly publishes macroeconomic indicators on household assets, liabilities and net worth as part of the quarterly National Balance Sheet Accounts (NBSA). These accounts are aligned with the most recent international standards and are the source of estimates of national wealth for all sectors of the economy, including households, non-profit institutions, governments and corporations, along with Canada’s wealth position vis-à-vis the rest of the world. While the NBSA provide high-quality information on the overall position of households relative to other economic sectors, they lack the granularity required to understand vulnerabilities of specific groups and the resulting implications for economic well-being and financial stability.

    Release date: 2017-03-15

  • Journals and periodicals: 91-621-X
    Description:

    This document briefly describes Demosim, the microsimulation population projection model: how it works, its methods and its data sources. It is a methodological complement to the analytical products produced using Demosim.

    Release date: 2017-01-25

  • Articles and reports: 12-001-X201600114538
    Description:

    The aim of automatic editing is to use a computer to detect and amend erroneous values in a data set, without human intervention. Most automatic editing methods that are currently used in official statistics are based on the seminal work of Fellegi and Holt (1976). Applications of this methodology in practice have shown systematic differences between data that are edited manually and automatically, because human editors may perform complex edit operations. In this paper, a generalization of the Fellegi-Holt paradigm is proposed that can incorporate a large class of edit operations in a natural way. In addition, an algorithm is outlined that solves the resulting generalized error localization problem. It is hoped that this generalization may be used to increase the suitability of automatic editing in practice, and hence to improve the efficiency of data editing processes. Some first results on synthetic data are promising in this respect. (A toy error localization example follows this list.)

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201600114539
    Description:

    Statistical matching is a technique for integrating two or more data sets when information available for matching records for individual participants across data sets is incomplete. Statistical matching can be viewed as a missing data problem where a researcher wants to perform a joint analysis of variables that are never jointly observed. A conditional independence assumption is often used to create imputed data for statistical matching. We consider a general approach to statistical matching using parametric fractional imputation of Kim (2011) to create imputed data under the assumption that the specified model is fully identified. The proposed method does not have a convergent EM sequence if the model is not identified. We also present variance estimators appropriate for the imputation procedure. We explain how the method applies directly to the analysis of data from split questionnaire designs and measurement error models.

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201600114540
    Description:

    In this paper, we compare the EBLUP and pseudo-EBLUP estimators for small area estimation under the nested error regression model and three area level model-based estimators using the Fay-Herriot model. We conduct a design-based simulation study to compare the model-based estimators for unit level and area level models under informative and non-informative sampling. In particular, we are interested in the confidence interval coverage rate of the unit level and area level estimators. We also compare the estimators if the model has been misspecified. Our simulation results show that estimators based on the unit level model perform better than those based on the area level. The pseudo-EBLUP estimator is the best among unit level and area level estimators.

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201600114541
    Description:

    In this work we compare nonparametric estimators for finite population distribution functions based on two types of fitted values: the fitted values from the well-known Kuo estimator and a modified version of them, which incorporates a nonparametric estimate for the mean regression function. For each type of fitted values we consider the corresponding model-based estimator and, after incorporating design weights, the corresponding generalized difference estimator. We show under fairly general conditions that the leading term in the model mean square error is not affected by the modification of the fitted values, even though it slows down the convergence rate for the model bias. Second order terms of the model mean square errors are difficult to obtain and will not be derived in the present paper. It thus remains an open question whether the modified fitted values bring about some benefit from the model-based perspective. We also discuss design-based properties of the estimators and propose a variance estimator for the generalized difference estimator based on the modified fitted values. Finally, we perform a simulation study. The simulation results suggest that the modified fitted values lead to a considerable reduction of the design mean square error if the sample size is small.

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201600114542
    Description:

    The restricted maximum likelihood (REML) method is generally used to estimate the variance of the random area effect under the Fay-Herriot model (Fay and Herriot 1979) to obtain the empirical best linear unbiased prediction (EBLUP) estimator of a small area mean. When the REML estimate is zero, the weight of the direct sample estimator is zero and the EBLUP becomes a synthetic estimator. This is often undesirable. As a solution to this problem, Li and Lahiri (2011) and Yoshimori and Lahiri (2014) developed adjusted maximum likelihood (ADM) consistent variance estimators, which always yield positive variance estimates. Some of the ADM estimators always yield positive estimates but they have a large bias and this affects the estimation of the mean squared error (MSE) of the EBLUP. We propose to use a MIX variance estimator, defined as a combination of the REML and ADM methods. We show that it is unbiased up to the second order and it always yields a positive variance estimate. Furthermore, we propose an MSE estimator under the MIX method and show via a model-based simulation that in many situations, it performs better than other ‘Taylor linearization’ MSE estimators proposed recently.

    Release date: 2016-06-22
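The 12-001-X201800154928 abstract contrasts two ways of turning model-predicted probabilities of serious mental illness into a prevalence estimate: a cut point estimator, which classifies each respondent by comparing the predicted probability to a threshold, and a probability estimator, which averages the predicted probabilities themselves. The numbers below are invented to show the arithmetic only; this is not the SAMHSA/NSDUH implementation.

```python
import numpy as np

# Hypothetical respondents: model-predicted probability of SMI and survey weight.
p_hat = np.array([0.05, 0.12, 0.40, 0.63, 0.81])
w = np.array([900.0, 1100.0, 800.0, 1200.0, 1000.0])

# Cut point estimator: classify against a chosen threshold, then take the
# weighted proportion classified as having SMI.
cut_point = 0.5                              # illustrative threshold
prev_cut = np.sum(w * (p_hat >= cut_point)) / np.sum(w)

# Probability estimator: weighted mean of the predicted probabilities.
prev_prob = np.sum(w * p_hat) / np.sum(w)

print(f"cut point estimate:   {prev_cut:.3f}")
print(f"probability estimate: {prev_prob:.3f}")
```

The abstract's finding concerns the trade-off between these two: the probability estimator tends to have smaller mean squared error overall but larger bias within subpopulations.
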
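The 12-001-X201600114538 abstract works in the Fellegi-Holt tradition of automatic editing, where the error localization step finds a smallest set of fields whose values can be changed so that a record satisfies every edit rule. The brute-force search below illustrates that idea; the record, edit rules and value domains are hypothetical, and production systems use far more efficient algorithms.

```python
from itertools import combinations, product

# Toy record and edit rules (hypothetical). A record is valid if every rule holds.
record = {"age": 8, "marital_status": "married", "hours_worked": 45}

edits = [
    lambda r: not (r["age"] < 15 and r["marital_status"] == "married"),
    lambda r: not (r["age"] < 15 and r["hours_worked"] > 0),
]

# Small illustrative value domains to try when a field is allowed to change.
domains = {
    "age": range(0, 100),
    "marital_status": ["single", "married"],
    "hours_worked": range(0, 81, 5),
}

def passes(r):
    return all(edit(r) for edit in edits)

def localize(r):
    """Return a smallest set of fields that can be changed so all edits pass."""
    fields = list(r)
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(domains[f] for f in subset)):
                candidate = dict(r, **dict(zip(subset, values)))
                if passes(candidate):
                    return set(subset)
    return set(fields)

print(localize(record))   # {'age'}: changing age alone can satisfy both edits
```

The generalization proposed in the paper extends this minimal-change idea from changing individual field values to a larger class of edit operations.
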
Reference (77) (first 10 of 77 results shown)

  • Surveys and statistical programs – Documentation: 15F0004X
    Description:

    The input-output (IO) models are generally used to simulate the economic impacts of an expenditure on a given basket of goods and services or the output of one or several industries. The simulation results from a "shock" to an IO model will show the direct, indirect and induced impacts on GDP, which industries benefit the most, the number of jobs created, estimates of indirect taxes and subsidies generated, etc. For more details, ask us for the Guide to using the input-output simulation model, available free of charge upon request. (A minimal numerical illustration of such a shock calculation follows this list.)

    At various times, clients have requested the use of IO price, energy, tax and market models. Given their availability, arrangements can be made to use these models on request.

    The national IO model was not released in 2015 or 2016.

    Release date: 2019-04-04

  • Surveys and statistical programs – Documentation: 15F0009X
    Description:

    The input-output (IO) models are generally used to simulate the economic impacts of an expenditure on a given basket of goods and services or the output of one or several industries. The simulation results from a "shock" to an IO model will show the direct, indirect and induced impacts on GDP, which industries benefit the most, the number of jobs created, estimates of indirect taxes and subsidies generated, etc. For more details, ask us for the Guide to using the input-output simulation model, available free of charge upon request.

    At various times, clients have requested the use of IO price, energy, tax and market models. Given their availability, arrangements can be made to use these models on request.

    The interprovincial IO model was not released in 2015 or 2016.

    Release date: 2019-04-04

  • Surveys and statistical programs – Documentation: 71-526-X
    Description:

    The Canadian Labour Force Survey (LFS) is the official source of monthly estimates of total employment and unemployment. Following the 2011 census, the LFS underwent a sample redesign to account for the evolution of the population and labour market characteristics, to adjust to changes in information needs and to update the geographical information used to carry out the survey. The redesign program following the 2011 census culminated with the introduction of a new sample at the beginning of 2015. This report is a reference on the methodological aspects of the LFS, covering stratification, sampling, collection, processing, weighting, estimation, variance estimation and data quality.

    Release date: 2017-12-21

  • Notices and consultations: 92-140-X2016001
    Description:

    The 2016 Census Program Content Test was conducted from May 2 to June 30, 2014. The Test was designed to assess the impact of any proposed content changes to the 2016 Census Program and to measure the impact of including a social insurance number (SIN) question on data quality.

    This quantitative test used a split-panel design involving 55,000 dwellings, divided into 11 panels of 5,000 dwellings each: five panels were dedicated to the Content Test while the remaining six panels were for the SIN Test. Two models of test questionnaires were developed to meet the objectives, namely a model with all the proposed changes EXCEPT the SIN question and a model with all the proposed changes INCLUDING the SIN question. A third, 'control' model with the 2011 content was also developed. The population living in a private dwelling in mail-out areas in one of the ten provinces was targeted for the test. Paper and electronic response channels were part of the Test as well.

    This report presents the Test objectives, the design and a summary of the analysis in order to determine potential content for the 2016 Census Program. Results from the data analysis of the Test were not the only elements used to determine the content for 2016. Other elements were also considered, such as response burden, comparison over time and users’ needs.

    Release date: 2016-04-01

  • Surveys and statistical programs – Documentation: 12-001-X201400114002
    Description:

    We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. This approach has several appealing features: imputations are generated from coherent, Bayesian joint models that automatically capture complex dependencies and readily scale to large numbers of variables. We outline a Gibbs sampling algorithm for implementing the approach, and we illustrate its potential with a repeated sampling study using public use census microdata from the state of New York, U.S.A.

    Release date: 2014-06-27

  • Surveys and statistical programs – Documentation: 12-001-X201300211887
    Description:

    Multi-level models are extensively used for analyzing survey data with the design hierarchy matching the model hierarchy. We propose a unified approach, based on a design-weighted log composite likelihood, for two-level models that leads to design-model consistent estimators of the model parameters even when the within-cluster sample sizes are small, provided the number of sample clusters is large. This method can handle both linear and generalized linear two-level models and it requires level 2 and level 1 inclusion probabilities and level 1 joint inclusion probabilities, where level 2 represents a cluster and level 1 an element within a cluster. Results of a simulation study demonstrating superior performance of the proposed method relative to existing methods under informative sampling are also reported.

    Release date: 2014-01-15

  • Surveys and statistical programs – Documentation: 12-001-X201200211755
    Description:

    Non-response in longitudinal studies is addressed by assessing the accuracy of response propensity models constructed to discriminate between and predict different types of non-response. Particular attention is paid to summary measures derived from receiver operating characteristic (ROC) curves and logit rank plots. The ideas are applied to data from the UK Millennium Cohort Study. The results suggest that the ability to discriminate between and predict non-respondents is not high. Weights generated from the response propensity models lead to only small adjustments in employment transitions. Conclusions are drawn in terms of the potential of interventions to prevent non-response. (A small simulated illustration of this ROC-based assessment follows this list.)

    Release date: 2012-12-19

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2009-10-05

  • Surveys and statistical programs – Documentation: 12-001-X200900110882
    Description:

    The bootstrap technique is becoming more and more popular in sample surveys conducted by national statistical agencies. In most of its implementations, several sets of bootstrap weights accompany the survey microdata file given to analysts. So far, the use of the technique in practice seems to have been mostly limited to variance estimation problems. In this paper, we propose a bootstrap methodology for testing hypotheses about a vector of unknown model parameters when the sample has been drawn from a finite population. The probability sampling design used to select the sample may be informative or not. Our method uses model-based test statistics that incorporate the survey weights. Such statistics are usually easily obtained using classical software packages. We approximate the distribution under the null hypothesis of these weighted model-based statistics by using bootstrap weights. An advantage of our bootstrap method over existing methods of hypothesis testing with survey data is that, once sets of bootstrap weights are provided to analysts, it is very easy to apply even when no specialized software dealing with complex surveys is available. Also, our simulation results suggest that, overall, it performs similarly to the Rao-Scott procedure and better than the Wald and Bonferroni procedures when testing hypotheses about a vector of linear regression model parameters.

    Release date: 2009-06-22

  • Surveys and statistical programs – Documentation: 12-001-X200800210757
    Description:

    Sample weights can be calibrated to reflect the known population totals of a set of auxiliary variables. Predictors of finite population totals calculated using these weights have low bias if these variables are related to the variable of interest, but can have high variance if too many auxiliary variables are used. This article develops an "adaptive calibration" approach, where the auxiliary variables to be used in weighting are selected using sample data. Adaptively calibrated estimators are shown to have lower mean squared error and better coverage properties than non-adaptive estimators in many cases.

    Release date: 2008-12-23
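The two input-output entries above (15F0004X and 15F0009X) describe shocking an IO model with a final-demand expenditure and reading off direct and indirect output impacts. The sketch below shows the core open-model Leontief calculation with a made-up three-industry coefficient matrix; it is not Statistics Canada's IO simulation model, and induced impacts (which require closing the model with respect to households) are not computed.

```python
import numpy as np

# Hypothetical technical coefficient matrix A: A[i, j] is the dollar value of
# inputs from industry i needed per dollar of output of industry j.
A = np.array([
    [0.10, 0.20, 0.05],
    [0.15, 0.05, 0.10],
    [0.05, 0.10, 0.15],
])

# "Shock": a final-demand expenditure of $100 on industry 0's output.
delta_f = np.array([100.0, 0.0, 0.0])

# Total output needed to satisfy the shock: delta_x = (I - A)^-1 delta_f.
leontief_inverse = np.linalg.inv(np.eye(3) - A)
delta_x = leontief_inverse @ delta_f

direct = delta_f                  # output delivered to final demand itself
indirect = delta_x - delta_f      # upstream supply-chain requirements

print("direct impact:  ", np.round(direct, 2))
print("indirect impact:", np.round(indirect, 2))
print("total output:   ", np.round(delta_x, 2))
```

In a full model, GDP, employment and tax impacts are then obtained by applying coefficient vectors (value added, jobs and taxes per dollar of output) to the output impacts.
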
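The 12-001-X201200211755 abstract assesses how well response propensity models discriminate between respondents and non-respondents, summarizing accuracy with ROC-based measures. The sketch below fits a logistic propensity model to simulated data and reports the area under the ROC curve; the covariates and data are simulated for illustration and are unrelated to the UK Millennium Cohort Study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Simulated prior-wave covariates and a response indicator for the next wave.
n = 2000
X = rng.normal(size=(n, 3))                       # stand-ins for age, education, etc.
logit = -0.3 + 0.8 * X[:, 0] - 0.5 * X[:, 1]      # simulated true propensity model
responded = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Fit a response propensity model and summarise its discrimination with ROC AUC.
model = LogisticRegression().fit(X, responded)
p_respond = model.predict_proba(X)[:, 1]
print(f"ROC AUC: {roc_auc_score(responded, p_respond):.3f}")

# Inverse-propensity weights for respondents would then adjust later-wave estimates.
ipw = 1.0 / p_respond[responded == 1]
```

An AUC near 0.5 indicates weak discrimination, which is the situation the abstract reports; propensity-based weights then make only small adjustments.
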