Analysis

Results

All (20) (0 to 10 of 20 results)

1. Imputation for nonmonotone last-value-dependent nonrespondents in longitudinal surveys Archived
Articles and reports: 12-001-X200800210756
Description:
In longitudinal surveys nonresponse often occurs in a pattern that is not monotone. We consider estimation of time-dependent means under the assumption that the nonresponse mechanism is last-value-dependent. Since the last value itself may be missing when nonresponse is nonmonotone, the nonresponse mechanism under consideration is nonignorable. We propose an imputation method by first deriving some regression imputation models according to the nonresponse mechanism and then applying nonparametric regression imputation. We assume that the longitudinal data follow a Markov chain with finite second-order moments. No other assumption is imposed on the joint distribution of longitudinal data and their nonresponse indicators. A bootstrap method is applied for variance estimation. Some simulation results and an example concerning the Current Employment Survey are presented.
Release date: 2008-12-23
2. Focus Groups with Respondents and Non-respondents to the Survey of Consumer Finances Archived
Articles and reports: 75F0002M1992009
Description:
There are many issues to consider when developing and conducting a survey. Length, complexity and timing of the survey are all factors that may influence potential respondents' likelihood to participate in a survey. One important issue that affects this decision is the extent to which a questionnaire appears to be an invasion of privacy. Information on income and finances is one type of information that many people are reluctant to share but that is important for policy and research purposes.
Collecting such information for the Survey of Consumer Finances (SCF) has proven difficult, and has resulted in a higher-than-average non-response rate for a supplemental survey to the Labour Force Survey. Given the similarity between the SCF and an upcoming survey, the Survey of Labour and Income Dynamics (SLID), it is important to examine the reasons behind the SCF's higher non-response rate and obtain suggestions for increasing the response rate and gaining commitment from respondents to the 6-year SLID.
Statistics Canada asked Price Waterhouse to conduct focus groups and in-depth interviews with respondents and non-respondents to the SCF. The objectives of these focus groups and in-depth interviews were to explore reasons for response and non-response, issues of privacy and confidentiality and understanding of the terms used in the survey, and to test reactions to the appearance of a draft SLID package.
Release date: 2008-10-21
3. The feasibility of establishing correction factors to adjust self-reported estimates of obesity Archived
Articles and reports: 82-003-X200800310680
Geography: Canada
Description:
This study examines the feasibility of developing correction factors to adjust self-reported measures of body mass index to more closely approximate measured values. Data are from the 2005 Canadian Community Health Survey, in which respondents were asked to report their height and weight, and were subsequently measured.
Release date: 2008-09-17
4. Report on Regional Discussions on Aboriginal Identification Questions Archived
Journals and periodicals: 89-629-X
Geography: Canada
Description:
Statistics Canada regularly reviews the questions used on the Census and other surveys to ensure that the resulting data are representative of the population. As a first step in the process to review the questions used to produce data about First Nations, Inuit and Métis populations, regional discussions were held with more than 350 users of Aboriginal data in over 40 locations across Canada during the winter, spring and early summer of 2007.
This report summarizes the main issues raised in these meetings. Four questions used to identify Aboriginal people from the Census and surveys were considered in the discussions.
Release date: 2008-05-27
5. Effects of measurement on obesity and morbidity Archived
Articles and reports: 82-003-X200800210564
Geography: Canada
Description:
This article compares associations between body mass index categories based on self-reported versus measured data with selected health conditions. The goal is to determine if the misclassification resulting from the use of self-reported data alters associations between excess weight and these health conditions. The analysis is based on 2,667 respondents aged 40 or older from the 2005 Canadian Community Health Survey.
Release date: 2008-05-14
6. Estimates of obesity based on self-report versus direct measures Archived
Articles and reports: 82-003-X200800210569
Geography: Canada
Description:
Based on a representative sample of the Canadian population, this article quantifies the bias resulting from the use of self-reported rather than directly measured height, weight and body mass index. The analysis is based on 4,567 respondents to the 2005 Canadian Community Health Survey, who provided self-reported values for height and weight and were then measured.
Release date: 2008-05-14
7. Mixed linear nonlinear aggregate level models for small area estimation from surveys of binary counts Archived
Articles and reports: 11-522-X200600110390
Description:
We propose an aggregate level generalized linear model with additive random components (GLMARC) for binary count data from surveys. It has both linear (for random effects) and nonlinear (for fixed effects) parts in modeling the mean function and hence belongs to a class termed mixed linear nonlinear models. The model allows for a linear mixed model (LMM)-type approach to small area estimation (SAE) somewhat similar to the well-known Fay-Herriot (1979) method and thus takes full account of the sampling design. Unlike the alternative hierarchical Bayes (HB) approach of You and Rao (2002), the proposed method gives rise to easily interpretable SAEs and frequentist diagnostics as well as self-benchmarking to reliable large area direct estimates. The usual LMM methodology is not appropriate for the problem with count data because of the lack of range restrictions on the mean function and the possibility of unrealistic (e.g. zero in the context of SAE) estimates of the variance component, as the model does not allow the random effect part of the conditional mean function to depend on the marginal mean. The proposed method is an improvement of the earlier method due to Vonesh and Carter (1992), which also uses mixed linear nonlinear models but in which the variance-mean relationship was not accounted for, although this is typically done via range restrictions on the random effect. Also, neither the implications of the survey design nor the estimation of random effects was considered. In our application for SAE, however, it is important to obtain suitable estimates of both fixed and random effects. It may be noted that unlike the generalized linear mixed model (GLMM), GLMARC, like LMM, offers considerable simplicity in model fitting. This was made possible by replacing the original fixed and random effects of GLMM with a new set of parameters of GLMARC with quite a different interpretation, as the random effect is no longer inside the nonlinear predictor function. However, this is of no consequence for SAE because the small area parameters correspond to the overall conditional means and not to individual model parameters. We propose a method of iterative BLUP for parameter estimation which allows for self-benchmarking after a suitable model enlargement. The problem of small areas with small or no sample sizes or zero direct estimates is addressed by collapsing domains only for the stage of parameter estimation. Application to the 2000-01 Canadian Community Health Survey for estimation of the proportion of daily smokers in subpopulations defined by provincial health regions by age-sex groups is presented as an illustration.
Release date: 2008-03-17
8. Complex sampling design based familial longitudinal health data analysis: an overview Archived
Articles and reports: 11-522-X200600110399
Description:
In health studies, it is quite common to collect binary or count repeated responses along with a set of multi-dimensional covariates over a small period of time from a large number of independent families, where the families are selected from a finite population by using certain complex sampling designs. It is of interest to examine the effects of the covariates on the familial longitudinal responses after taking the variation in the family effects as well as the longitudinal correlations of the repeated responses into account. In this paper, I review the advantages and drawbacks of the existing methodologies for the estimation of the regression effects, the variance of the family effects and the longitudinal correlations. We then outline the advantages of a new unified generalized quasilikelihood approach in analyzing the complex design based familial longitudinal data. Some existing numerical studies are discussed as illustrations of the methodologies considered in the paper.
Release date: 2008-03-17
9. Combining information from two surveys to improve on analyses of self-reported data in estimating measures of health Archived
Articles and reports: 11-522-X200600110408
Description:
Despite advances that have improved the health of the United States population, disparities in health remain among various racial/ethnic and socio-economic groups. Common data sources for assessing the health of a population of interest include large-scale surveys that often pose questions requiring a self-report, such as, "Has a doctor or other health professional ever told you that you have health condition of interest?" Answers to such questions might not always reflect the true prevalences of health conditions (for example, if a respondent does not have access to a doctor or other health professional). Similarly, self-reported data on quantities such as height and weight might be subject to reporting errors. Such "measurement error" in health data could affect inferences about measures of health and health disparities. In this work, we fit measurement-error models to data from the National Health and Nutrition Examination Survey, which asks self-report questions during an interview component and also obtains physical measurements during an examination component. We then develop methods for using the fitted models to improve on analyses of self-reported data from another survey that does not include an examination component. The methods, which involve multiply imputing examination-based data values for the survey that has only self-reported data, are applied to the National Health Interview Survey in examples involving diabetes, hypertension, and obesity. Preliminary results suggest that the adjustments for measurement error can result in non-negligible changes in estimates of measures of health.
Release date: 2008-03-17
10. The analysis of population-based case control studies Archived
Articles and reports: 11-522-X200600110415
Description:
We discuss methods for the analysis of case-control studies in which the controls are drawn using a complex sample survey. The most straightforward method is the standard survey approach based on weighted versions of population estimating equations. We also look at more efficient methods and compare their robustness to model mis-specification in simple cases. Case-control family studies, where the within-cluster structure is of interest in its own right, are also discussed briefly.
Release date: 2008-03-17

Articles and reports (19)

Articles and reports (19) (0 to 10 of 19 results)

10. Causal inference in observational studies using health administrative data Archived
Articles and reports: 11-522-X200600110419
Description:
Health services research generally relies on observational data to compare outcomes of patients receiving different therapies. Comparisons of patient groups in observational studies may be biased, in that outcomes differ due to both the effects of treatment and the effects of patient prognosis. In some cases, especially when data are collected on detailed clinical risk factors, these differences can be controlled for using statistical or epidemiological methods. In other cases, when unmeasured characteristics of the patient population affect both the decision to provide therapy and the outcome, these differences cannot be removed using standard techniques. Use of health administrative data requires particular cautions in undertaking observational studies since important clinical information does not exist. We discuss several statistical and epidemiological approaches to remove overt (measurable) and hidden (unmeasurable) bias in observational studies. These include regression model-based case-mix adjustment, propensity-based matching, redefining the exposure variable of interest, and the econometric technique of instrumental variable (IV) analysis. These methods are illustrated using examples from the medical literature including prediction of one-year mortality following heart attack; the return to health care spending in higher spending U.S. regions in terms of clinical and financial benefits; and the long-term survival benefits of invasive cardiac management of heart attack patients. It is possible to use health administrative data for observational studies provided careful attention is paid to addressing issues of reverse causation and unmeasured confounding.
Release date: 2008-03-17


Date modified: 2024-07-16