Results


Data (3) (3 results)

  • Public use microdata: 81M0011X
    Description:

    This survey was designed to collect details on topics such as: i) the extent to which graduates of postsecondary programs have been successful in obtaining employment since graduation; ii) the relationship between the graduates' program of study and the employment subsequently obtained; iii) the type of employment obtained and qualification requirements; iv) sources of funding for postsecondary education; and v) government-sponsored student loans and other sources of student debt. The survey results are directed towards policy makers, researchers, educators, employers and persons interested in public postsecondary education and graduates' transition from school to work.

    Release date: 2020-01-14

  • Public use microdata: 82M0010X
    Description:

    The National Population Health Survey (NPHS) program is designed to collect information related to the health of the Canadian population. The first cycle of data collection began in 1994. The institutional component includes long-term residents (expected to stay longer than six months) in health care facilities with four or more beds in Canada with the principal exclusion of the Yukon and the Northwest Territories. The document has been produced to facilitate the manipulation of the 1996-1997 microdata file containing survey results. The main variables include: demography, health status, chronic conditions, restriction of activity, socio-demographic, and others.

    Release date: 2000-08-02

  • Public use microdata: 12M0010X
    Description:

    Cycle 10 collected data from persons 15 years and older and concentrated on the respondent's family. Topics covered include marital history, common-law unions, biological, adopted and step children, family origins, child leaving and fertility intentions.

    The target population of the GSS (General Social Survey) consisted of all individuals aged 15 and over living in a private household in one of the ten provinces.

    Release date: 1997-02-28

Analysis (57) (0 to 10 of 57 results)

  • Journals and periodicals: 62F0026M
    Description:

    This series provides detailed documentation on the issues, concepts, methodology, data quality and other relevant research related to household expenditures from the Survey of Household Spending, the Homeowner Repair and Renovation Survey and the Food Expenditure Survey.

    Release date: 2021-01-22

  • Journals and periodicals: 12-206-X
    Description:

    This report summarizes the achievements of the program sponsored by the three methodology divisions of Statistics Canada. This program covers research and development activities in statistical methods with potentially broad application in the Agency's survey programs, which would not otherwise have been carried out during the provision of methodology services to those survey programs. These activities also include tasks that provided client support in the application of past successful developments in order to promote the utilization of the results of research and development work.

    Release date: 2020-09-29

  • Articles and reports: 12-001-X202000100006
    Description:

    In surveys, logical boundaries among variables or among waves of surveys make imputation of missing values complicated. We propose a new regression-based multiple imputation method to deal with survey nonresponses with two-sided logical boundaries. This imputation method automatically satisfies the boundary conditions without an additional acceptance/rejection procedure and utilizes the boundary information to derive an imputed value and to determine the suitability of the imputed value. Simulation results show that our new imputation method outperforms the existing imputation methods for both mean and quantile estimations regardless of missing rates, error distributions, and missing-mechanisms. We apply our method to impute the self-reported variable “years of smoking” in successive health screenings of Koreans.

    Release date: 2020-06-30

  • Articles and reports: 12-001-X201800254957
    Description:

    When a linear imputation method is used to correct non-response based on certain assumptions, total variance can be assigned to non-responding units. Linear imputation is not as limited as it seems, given that the most common methods – ratio, donor, mean and auxiliary value imputation – are all linear imputation methods. We will discuss the inference framework and the unit-level decomposition of variance due to non-response. Simulation results will also be presented. This decomposition can be used to prioritize non-response follow-up or manual corrections, or simply to guide data analysis.

    Release date: 2018-12-20

  • Articles and reports: 11-633-X2017006
    Description:

    This paper describes a method of imputing missing postal codes in a longitudinal database. The 1991 Canadian Census Health and Environment Cohort (CanCHEC), which contains information on individuals from the 1991 Census long-form questionnaire linked with T1 tax return files for the 1984-to-2011 period, is used to illustrate and validate the method. The cohort contains up to 28 consecutive fields for postal code of residence, but because of frequent gaps in postal code history, missing postal codes must be imputed. To validate the imputation method, two experiments were devised where 5% and 10% of all postal codes from a subset with full history were randomly removed and imputed.

    Release date: 2017-03-13

  • Articles and reports: 12-001-X201600114539
    Description:

    Statistical matching is a technique for integrating two or more data sets when information available for matching records for individual participants across data sets is incomplete. Statistical matching can be viewed as a missing data problem where a researcher wants to perform a joint analysis of variables that are never jointly observed. A conditional independence assumption is often used to create imputed data for statistical matching. We consider a general approach to statistical matching using parametric fractional imputation of Kim (2011) to create imputed data under the assumption that the specified model is fully identified. The proposed method does not have a convergent EM sequence if the model is not identified. We also present variance estimators appropriate for the imputation procedure. We explain how the method applies directly to the analysis of data from split questionnaire designs and measurement error models.

    Release date: 2016-06-22

  • Articles and reports: 12-001-X201500114193
    Description:

    Imputed micro data often contain conflicting information. The situation may arise, for example, from partial imputation, where one part of the imputed record consists of the observed values of the original record and the other of the imputed values. Edit rules that involve variables from both parts of the record will often be violated. Inconsistency may also be caused by adjustment for errors in the observed data, also referred to as imputation in editing. Under the assumption that the remaining inconsistency is not due to systematic errors, we propose to make adjustments to the micro data such that all constraints are simultaneously satisfied and the adjustments are minimal according to a chosen distance metric. Different approaches to the distance metric are considered, as well as several extensions of the basic situation, including the treatment of categorical data, unit imputation and macro-level benchmarking. The properties and interpretations of the proposed methods are illustrated using business-economic data.

    Release date: 2015-06-29

  • Articles and reports: 12-001-X201400214089
    Description:

    This manuscript describes the use of multiple imputation to combine information from multiple surveys of the same underlying population. We use a newly developed method to generate synthetic populations nonparametrically using a finite population Bayesian bootstrap that automatically accounts for complex sample designs. We then analyze each synthetic population with standard complete-data software for simple random samples and obtain valid inference by combining the point and variance estimates using extensions of existing combining rules for synthetic data. We illustrate the approach by combining data from the 2006 National Health Interview Survey (NHIS) and the 2006 Medical Expenditure Panel Survey (MEPS).

    Release date: 2014-12-19

  • Articles and reports: 12-001-X201400214091
    Description:

    Parametric fractional imputation (PFI), proposed by Kim (2011), is a tool for general purpose parameter estimation under missing data. We propose a fractional hot deck imputation (FHDI) which is more robust than PFI or multiple imputation. In the proposed method, the imputed values are chosen from the set of respondents and assigned proper fractional weights. The weights are then adjusted to meet certain calibration conditions, which makes the resulting FHDI estimator efficient. Two simulation studies are presented to compare the proposed method with existing methods.

    Release date: 2014-12-19

  • Articles and reports: 12-001-X201400114002
    Description:

    We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. This approach has several appealing features: imputations are generated from coherent, Bayesian joint models that automatically capture complex dependencies and readily scale to large numbers of variables. We outline a Gibbs sampling algorithm for implementing the approach, and we illustrate its potential with a repeated sampling study using public use census microdata from the state of New York, U.S.A.

    Release date: 2014-06-27

Reference (24) (0 to 10 of 24 results)

  • Surveys and statistical programs – Documentation: 12-539-X
    Description:

    This document brings together guidelines and checklists on many issues that need to be considered in the pursuit of quality objectives in the execution of statistical activities. Its focus is on how to assure quality through effective and appropriate design or redesign of a statistical project or program from inception through to data evaluation, dissemination and documentation. These guidelines draw on the collective knowledge and experience of many Statistics Canada employees. It is expected that Quality Guidelines will be useful to staff engaged in the planning and design of surveys and other statistical projects, as well as to those who evaluate and analyze the outputs of these projects.

    Release date: 2019-12-04

  • Surveys and statistical programs – Documentation: 99-012-X2011006
    Geography: Canada
    Description:

    This reference guide provides information that enables users to effectively use, apply and interpret data from the 2011 National Household Survey (NHS). This guide contains definitions and explanations of concepts, classifications, data quality and comparability to other sources. Additional information is included for specific variables to help general users better understand the concepts and questions used in the NHS.

    Release date: 2013-06-26

  • Surveys and statistical programs – Documentation: 99-012-X2011007
    Description:

    This reference guide provides information that enables users to effectively use, apply and interpret data from the 2011 National Household Survey (NHS). This guide contains definitions and explanations of concepts, classifications, data quality and comparability to other sources. Additional information is included for specific variables to help general users better understand the concepts and questions used in the NHS.

    Release date: 2013-06-26

  • Surveys and statistical programs – Documentation: 99-012-X2011008
    Description:

    This reference guide provides information that enables users to effectively use, apply and interpret data from the 2011 National Household Survey (NHS). This guide contains definitions and explanations of concepts, classifications, data quality and comparability to other sources. Additional information is included for specific variables to help general users better understand the concepts and questions used in the NHS.

    Release date: 2013-06-26

  • Surveys and statistical programs – Documentation: 99-013-X2011006
    Description:

    This reference guide provides information that enables users to effectively use, apply and interpret data from the 2011 National Household Survey (NHS). This guide contains definitions and explanations of concepts, classifications, data quality and comparability to other sources. Additional information is included for specific variables to help general users better understand the concepts and questions used in the NHS.

    Release date: 2013-06-26

  • Surveys and statistical programs – Documentation: 62F0026M2010004
    Description:

    This report describes the quality indicators produced for the 2007 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.

    Release date: 2010-12-13

  • Surveys and statistical programs – Documentation: 62F0026M2010005
    Description:

    This report describes the quality indicators produced for the 2008 Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.

    Release date: 2010-12-13

  • Surveys and statistical programs – Documentation: 75F0002M2008005
    Description:

    The Survey of Labour and Income Dynamics (SLID) is a longitudinal survey initiated in 1993. The survey was designed to measure changes in the economic well-being of Canadians as well as the factors affecting these changes. Sample surveys are subject to sampling errors. To account for these errors, each estimate presented in the "Income Trends in Canada" series comes with a quality indicator based on the coefficient of variation. However, other factors must also be considered to make sure data are properly used. Statistics Canada devotes considerable time and effort to controlling errors at every stage of the survey and to maximising fitness for use. Nevertheless, the survey design and the data processing could restrict fitness for use. It is the policy at Statistics Canada to furnish users with measures of data quality so that the user is able to interpret the data properly. This report summarizes the set of quality measures of SLID data. Among the measures included in the report are sample composition and attrition rates, sampling errors, coverage errors in the form of slippage rates, response rates, tax permission and tax linkage rates, and imputation rates.

    Release date: 2008-08-20

  • Surveys and statistical programs – Documentation: 92-393-X
    Description:

    This report is a brief guide to users of census income data. It provides a general description of the various 2001 Census phases, from data collection, through processing for non-response, to dissemination. Descriptions of, and summary data on, the changes to income data that occurred during the processing stages are given. Comparative data from national accounts and tax data sources at a highly aggregated level are also presented to put the quality of the 2001 Census income data into perspective. For users wishing to compare census income data over time, changes in income content and universe coverage over the years are explained. Finally, a complete description of all census products containing income data is also supplied.

    Release date: 2004-09-16

  • Surveys and statistical programs – Documentation: 92-390-X
    Description:

    This report includes a definition of the 2001 place of work concept and the place of work geography, standard text on data collection and coverage (including data collection methods, special coverage studies, sampling and weighting, edit and follow-up, coverage and content considerations). Both standard and subject-matter specific text pieces are also included for data assimilation (automated as well as interactive coding), edit and imputation and data evaluation. Finally, this technical report includes a section on historical comparability.

    Release date: 2004-08-26