Keyword search

Results

All (44)

All (44) (0 to 10 of 44 results)

1. The Statistics Canada Productivity Program: Methodology 2000 Archived
Surveys and statistical programs – Documentation: 15-002-M2001001
Description:
This document describes the sources, concepts and methods utilized by the Canadian Productivity Accounts and discusses how they compare with their U.S. counterparts.
Release date: 2004-12-24
2. Postsecondary Field of Study and the Canadian Labour Market Outcomes of Immigrants and Non-immigrants Archived
Articles and reports: 11F0019M2004233
Geography: Canada
Description:
In Canada's federal system for economic (skilled) class immigrant selection, education is treated as if it is homogeneous and only differs in quantity. Some provinces, however, differentiate based on postsecondary field of study. This study explores the economic implications of field of study for each sex, and for two subgroups of immigrants, those educated in Canada and those educated elsewhere.
Field of study is not observed to explain much of the earnings difference between immigrants and the Canadian born, though it is relatively more important for males than females in doing so. Interestingly, while there are a few exceptions, a general pattern is observed whereby the differences between high- and low-earning fields are not as large for immigrants as for the Canadian born. Similarly, social assistance receipt has smaller variance across fields for immigrants than for the Canadian born. Nevertheless, substantial inter-field differences are observed for each immigrant group.
Release date: 2004-10-28
3. Bias reduction in standard errors for linear regression with multi-stage samples Archived
Articles and reports: 11-522-X20020016430
Description:
Linearization (or Taylor series) methods are widely used to estimate standard errors for the co-efficients of linear regression models fit to multi-stage samples. When the number of primary sampling units (PSUs) is large, linearization can produce accurate standard errors under quite general conditions. However, when the number of PSUs is small or a co-efficient depends primarily on data from a small number of PSUs, linearization estimators can have large negative bias.
In this paper, we characterize features of the design matrix that produce large bias in linearization standard errors for linear regression co-efficients. We then propose a new method, bias reduced linearization (BRL), based on residuals adjusted to better approximate the covariance of the true errors. When the errors are independent and identically distributed (i.i.d.), the BRL estimator is unbiased for the variance. Furthermore, a simulation study shows that BRL can greatly reduce the bias, even if the errors are not i.i.d. We also propose using a Satterthwaite approximation to determine the degrees of freedom of the reference distribution for tests and confidence intervals about linear combinations of co-efficients based on the BRL estimator. We demonstrate that the jackknife estimator also tends to be biased in situations where linearization is biased. However, the jackknife's bias tends to be positive. Our bias-reduced linearization estimator can be viewed as a compromise between the traditional linearization and jackknife estimators.
Release date: 2004-09-13
4. Comparison of design-based and model-based methods in analyzing complex health survey data: A case study Archived
Articles and reports: 11-522-X20020016708
Description:
In this paper, we discuss the analysis of complex health survey data by using multivariate modelling techniques. Main interests are in various design-based and model-based methods that aim at accounting for the design complexities, including clustering, stratification and weighting. Methods covered include generalized linear modelling based on pseudo-likelihood and generalized estimating equations, linear mixed models estimated by restricted maximum likelihood, and hierarchical Bayes techniques using Markov Chain Monte Carlo (MCMC) methods. The methods will be compared empirically, using data from an extensive health interview and examination survey conducted in Finland in 2000 (Health 2000 Study).
The data of the Health 2000 Study were collected using personal interviews, questionnaires and clinical examinations. A stratified two-stage cluster sampling design was used in the survey. The sampling design involved positive intra-cluster correlation for many study variables. For a closer investigation, we selected a small number of study variables from the health interview and health examination phases. In many cases, the different methods produced similar numerical results and supported similar statistical conclusions. Methods that failed to account for the design complexities sometimes led to conflicting conclusions. We also discuss the application of the methods in this paper by using standard statistical software products.
Release date: 2004-09-13
5. Interval censoring of smoking cessation in the National Population Health Survey Archived
Articles and reports: 11-522-X20020016712
Description:
In this paper, we consider the effect of the interval censoring of cessation time on intensity parameter estimation with regard to smoking cessation and pregnancy. The three waves of the National Population Health Survey allow the methodology of event history analysis to be applied to smoking initiation, cessation and relapse. One issue of interest is the relationship between smoking cessation and pregnancy. If a longitudinal respondent who is a smoker at the first cycle ceases smoking by the second cycle, we know the cessation time to within an interval of length at most a year, since the respondent is asked for the age at which she stopped smoking, and her date of birth is known. We also know whether she is pregnant at the time of the second cycle, and whether she has given birth since the time of the first cycle. For many such subjects, we know the date of conception to within a relatively small interval. If we knew the time of smoking cessation and pregnancy period exactly for each member who experienced one or other of these events between cycles, we could model their temporal relationship through their joint intensities.
Release date: 2004-09-13
6. Application of the delete-a-group jackknife variance estimator to analyses of data from a complex longitudinal survey Archived
Articles and reports: 11-522-X20020016714
Description:
In this highly technical paper, we illustrate the application of the delete-a-group jack-knife variance estimator approach to a particular complex multi-wave longitudinal study, demonstrating its utility for linear regression and other analytic models. The delete-a-group jack-knife variance estimator is proving a very useful tool for measuring variances under complex sampling designs. This technique divides the first-phase sample into mutually exclusive and nearly equal variance groups, deletes one group at a time to create a set of replicates and makes analogous weighting adjustments in each replicate to those done for the sample as a whole. Variance estimation proceeds in the standard (unstratified) jack-knife fashion.
Our application is to the Chicago Health and Aging Project (CHAP), a community-based longitudinal study examining risk factors for chronic health problems of older adults. A major aim of the study is the investigation of risk factors for incident Alzheimer's disease. The current design of CHAP has two components: (1) Every three years, all surviving members of the cohort are interviewed on a variety of health-related topics. These interviews include cognitive and physical function measures. (2) At each of these waves of data collection, a stratified Poisson sample is drawn from among the respondents to the full population interview for detailed clinical evaluation and neuropsychological testing. To investigate risk factors for incident disease, a 'disease-free' cohort is identified at the preceding time point and forms one major stratum in the sampling frame.
We provide proofs of the theoretical applicability of the delete-a-group jack-knife for particular estimators under this Poisson design, paying needed attention to the distinction between finite-population and infinite-population (model) inference. In addition, we examine the issue of determining the 'right number' of variance groups.
Release date: 2004-09-13
7. Area-level models using data from multiple surveys Archived
Articles and reports: 11-522-X20020016717
Description:
In the United States, the National Health and Nutrition Examination Survey (NHANES) is linked to the National Health Interview Survey (NHIS) at the primary sampling unit level (the same counties, but not necessarily the same persons, are in both surveys). The NHANES examines about 5,000 persons per year, while the NHIS samples about 100,000 persons per year. In this paper, we present and develop properties of models that allow NHIS and administrative data to be used as auxiliary information for estimating quantities of interest in the NHANES. The methodology, related to Fay-Herriot (1979) small-area models and to calibration estimators in Deville and Särndal (1992), accounts for the survey designs in the error structure.
Release date: 2004-09-13
8. Obtaining cancer risk factor prevalence estimates in small areas Archived
Articles and reports: 11-522-X20020016718
Description:
Cancer surveillance research requires accurate estimates of risk factors at the small area level. These risk factors are often obtained from surveys such as the National Health Interview Survey (NHIS) or the Behavioral Risk Factor Surveillance System (BRFSS). Unfortunately, no one population-based survey provides ideal prevalence estimates of such risk factors. One strategy is to combine information from multiple surveys, using the complementary strengths of one survey to compensate for the weakness of the other. The NHIS is a nationally representative, face-to-face survey with a high response rate; however, it cannot produce state or substate estimates of risk factor prevalence because sample sizes are too small. The BRFSS is a state-level telephone survey that excludes non-telephone households and has a lower response rate, but does provide reasonable sample sizes in all states and many counties. Several methods are available for constructing small-area estimators that combine information from both the NHIS and the BRFSS, including direct estimators, estimators under hierarchical Bayes models and model-assisted estimators. In this paper, we focus on the last of these, constructing generalized regression (GREG) and 'minimum-distance' estimators and using existing and newly developed small-area smoothing techniques to smooth the resulting estimators.
Release date: 2004-09-13
9. A comparison of approaches to modelling health and environment Archived
Articles and reports: 11-522-X20020016719
Description:
This study takes a look at the modelling methods used for public health data. Public health has a renewed interest in the impact of the environment on health. Ecological or contextual studies ideally investigate these relationships using public health data augmented with environmental characteristics in multilevel or hierarchical models. In these models, individual respondents in health data are the first level and community data are the second level. Most public health data use complex sample survey designs, which require analyses accounting for the clustering, nonresponse, and poststratification to obtain representative estimates of prevalence of health risk behaviours.
This study uses the Behavioral Risk Factor Surveillance System (BRFSS), a state-specific US health risk factor surveillance system conducted by the Centers for Disease Control and Prevention, which assesses health risk factors in over 200,000 adults annually. BRFSS data are now available at the metropolitan statistical area (MSA) level and provide quality health information for studies of environmental effects. MSA-level analyses combining health and environmental data are further complicated by joint requirements of the survey sample design and the multilevel analyses.
We compare three modelling methods in a study of physical activity and selected environmental factors using BRFSS 2000 data. Each of the methods described here is a valid way to analyse complex sample survey data augmented with environmental information, although each accounts for the survey design and multilevel data structure in a different manner and is thus appropriate for slightly different research questions.
Release date: 2004-09-13
10. Modelling the impacts of colorectal cancer screening in Canada using POHEM Archived
Articles and reports: 11-522-X20020016722
Geography: Canada
Description:
Colorectal cancer (CRC) is the second leading cause of cancer deaths in Canada. Randomized controlled trials (RCT) have shown the efficacy of screening using faecal occult blood tests (FOBT). A comprehensive evaluation of the costs and consequences of CRC screening for the Canadian population is required before implementing such a program. This paper evaluates whether CRC screening is cost-effective. The results of these simulations will be provided to the Canadian National Committee on Colorectal Cancer Screening to help formulate national policy recommendations for CRC screening.
Statistics Canada's Population Health Microsimulation Model was updated to incorporate a comprehensive CRC screening module based on Canadian data and RCT efficacy results. The module incorporated sensitivity and specificity of FOBT and colonoscopy, participation rates, incidence, staging, diagnostic and therapeutic options, disease progression, mortality and direct health care costs for different screening scenarios. Reproducing the mortality reduction observed in the Funen screening trial validated the model.
Release date: 2004-09-13

Data (0)

Data (0) (0 results)

No content available at this time.

Analysis (43)

Analysis (43) (0 to 10 of 43 results)

1. Postsecondary Field of Study and the Canadian Labour Market Outcomes of Immigrants and Non-immigrants Archived
Articles and reports: 11F0019M2004233
Geography: Canada
Description:
In Canada's federal system for economic (skilled) class immigrant selection, education is treated as if it is homogeneous and only differs in quantity. Some provinces, however, differentiate based on postsecondary field of study. This study explores the economic implications of field of study for each sex, and for two subgroups of immigrants, those educated in Canada and those educated elsewhere.
Field of study is not observed to explain much of the earnings difference between immigrants and the Canadian born, though it is relatively more important for males than females in doing so. Interestingly, while there are a few exceptions, a general pattern is observed whereby the differences between high- and low-earning fields are not as large for immigrants as for the Canadian born. Similarly, social assistance receipt has smaller variance across fields for immigrants than for the Canadian born. Nevertheless, substantial inter-field differences are observed for each immigrant group.
Release date: 2004-10-28
2. Bias reduction in standard errors for linear regression with multi-stage samples Archived
Articles and reports: 11-522-X20020016430
Description:
Linearization (or Taylor series) methods are widely used to estimate standard errors for the co-efficients of linear regression models fit to multi-stage samples. When the number of primary sampling units (PSUs) is large, linearization can produce accurate standard errors under quite general conditions. However, when the number of PSUs is small or a co-efficient depends primarily on data from a small number of PSUs, linearization estimators can have large negative bias.
In this paper, we characterize features of the design matrix that produce large bias in linearization standard errors for linear regression co-efficients. We then propose a new method, bias reduced linearization (BRL), based on residuals adjusted to better approximate the covariance of the true errors. When the errors are independent and identically distributed (i.i.d.), the BRL estimator is unbiased for the variance. Furthermore, a simulation study shows that BRL can greatly reduce the bias, even if the errors are not i.i.d. We also propose using a Satterthwaite approximation to determine the degrees of freedom of the reference distribution for tests and confidence intervals about linear combinations of co-efficients based on the BRL estimator. We demonstrate that the jackknife estimator also tends to be biased in situations where linearization is biased. However, the jackknife's bias tends to be positive. Our bias-reduced linearization estimator can be viewed as a compromise between the traditional linearization and jackknife estimators.
Release date: 2004-09-13
3. Comparison of design-based and model-based methods in analyzing complex health survey data: A case study Archived
Articles and reports: 11-522-X20020016708
Description:
In this paper, we discuss the analysis of complex health survey data by using multivariate modelling techniques. Main interests are in various design-based and model-based methods that aim at accounting for the design complexities, including clustering, stratification and weighting. Methods covered include generalized linear modelling based on pseudo-likelihood and generalized estimating equations, linear mixed models estimated by restricted maximum likelihood, and hierarchical Bayes techniques using Markov Chain Monte Carlo (MCMC) methods. The methods will be compared empirically, using data from an extensive health interview and examination survey conducted in Finland in 2000 (Health 2000 Study).
The data of the Health 2000 Study were collected using personal interviews, questionnaires and clinical examinations. A stratified two-stage cluster sampling design was used in the survey. The sampling design involved positive intra-cluster correlation for many study variables. For a closer investigation, we selected a small number of study variables from the health interview and health examination phases. In many cases, the different methods produced similar numerical results and supported similar statistical conclusions. Methods that failed to account for the design complexities sometimes led to conflicting conclusions. We also discuss the application of the methods in this paper by using standard statistical software products.
Release date: 2004-09-13
4. Interval censoring of smoking cessation in the National Population Health Survey Archived
Articles and reports: 11-522-X20020016712
Description:
In this paper, we consider the effect of the interval censoring of cessation time on intensity parameter estimation with regard to smoking cessation and pregnancy. The three waves of the National Population Health Survey allow the methodology of event history analysis to be applied to smoking initiation, cessation and relapse. One issue of interest is the relationship between smoking cessation and pregnancy. If a longitudinal respondent who is a smoker at the first cycle ceases smoking by the second cycle, we know the cessation time to within an interval of length at most a year, since the respondent is asked for the age at which she stopped smoking, and her date of birth is known. We also know whether she is pregnant at the time of the second cycle, and whether she has given birth since the time of the first cycle. For many such subjects, we know the date of conception to within a relatively small interval. If we knew the time of smoking cessation and pregnancy period exactly for each member who experienced one or other of these events between cycles, we could model their temporal relationship through their joint intensities.
Release date: 2004-09-13
5. Application of the delete-a-group jackknife variance estimator to analyses of data from a complex longitudinal survey Archived
Articles and reports: 11-522-X20020016714
Description:
In this highly technical paper, we illustrate the application of the delete-a-group jack-knife variance estimator approach to a particular complex multi-wave longitudinal study, demonstrating its utility for linear regression and other analytic models. The delete-a-group jack-knife variance estimator is proving a very useful tool for measuring variances under complex sampling designs. This technique divides the first-phase sample into mutually exclusive and nearly equal variance groups, deletes one group at a time to create a set of replicates and makes analogous weighting adjustments in each replicate to those done for the sample as a whole. Variance estimation proceeds in the standard (unstratified) jack-knife fashion.
Our application is to the Chicago Health and Aging Project (CHAP), a community-based longitudinal study examining risk factors for chronic health problems of older adults. A major aim of the study is the investigation of risk factors for incident Alzheimer's disease. The current design of CHAP has two components: (1) Every three years, all surviving members of the cohort are interviewed on a variety of health-related topics. These interviews include cognitive and physical function measures. (2) At each of these waves of data collection, a stratified Poisson sample is drawn from among the respondents to the full population interview for detailed clinical evaluation and neuropsychological testing. To investigate risk factors for incident disease, a 'disease-free' cohort is identified at the preceding time point and forms one major stratum in the sampling frame.
We provide proofs of the theoretical applicability of the delete-a-group jack-knife for particular estimators under this Poisson design, paying needed attention to the distinction between finite-population and infinite-population (model) inference. In addition, we examine the issue of determining the 'right number' of variance groups.
Release date: 2004-09-13
6. Area-level models using data from multiple surveys Archived
Articles and reports: 11-522-X20020016717
Description:
In the United States, the National Health and Nutrition Examination Survey (NHANES) is linked to the National Health Interview Survey (NHIS) at the primary sampling unit level (the same counties, but not necessarily the same persons, are in both surveys). The NHANES examines about 5,000 persons per year, while the NHIS samples about 100,000 persons per year. In this paper, we present and develop properties of models that allow NHIS and administrative data to be used as auxiliary information for estimating quantities of interest in the NHANES. The methodology, related to Fay-Herriot (1979) small-area models and to calibration estimators in Deville and Särndal (1992), accounts for the survey designs in the error structure.
Release date: 2004-09-13
7. Obtaining cancer risk factor prevalence estimates in small areas Archived
Articles and reports: 11-522-X20020016718
Description:
Cancer surveillance research requires accurate estimates of risk factors at the small area level. These risk factors are often obtained from surveys such as the National Health Interview Survey (NHIS) or the Behavioral Risk Factor Surveillance System (BRFSS). Unfortunately, no one population-based survey provides ideal prevalence estimates of such risk factors. One strategy is to combine information from multiple surveys, using the complementary strengths of one survey to compensate for the weakness of the other. The NHIS is a nationally representative, face-to-face survey with a high response rate; however, it cannot produce state or substate estimates of risk factor prevalence because sample sizes are too small. The BRFSS is a state-level telephone survey that excludes non-telephone households and has a lower response rate, but does provide reasonable sample sizes in all states and many counties. Several methods are available for constructing small-area estimators that combine information from both the NHIS and the BRFSS, including direct estimators, estimators under hierarchical Bayes models and model-assisted estimators. In this paper, we focus on the last of these, constructing generalized regression (GREG) and 'minimum-distance' estimators and using existing and newly developed small-area smoothing techniques to smooth the resulting estimators.
Release date: 2004-09-13
8. A comparison of approaches to modelling health and environment Archived
Articles and reports: 11-522-X20020016719
Description:
This study takes a look at the modelling methods used for public health data. Public health has a renewed interest in the impact of the environment on health. Ecological or contextual studies ideally investigate these relationships using public health data augmented with environmental characteristics in multilevel or hierarchical models. In these models, individual respondents in health data are the first level and community data are the second level. Most public health data use complex sample survey designs, which require analyses accounting for the clustering, nonresponse, and poststratification to obtain representative estimates of prevalence of health risk behaviours.
This study uses the Behavioral Risk Factor Surveillance System (BRFSS), a state-specific US health risk factor surveillance system conducted by the Centers for Disease Control and Prevention, which assesses health risk factors in over 200,000 adults annually. BRFSS data are now available at the metropolitan statistical area (MSA) level and provide quality health information for studies of environmental effects. MSA-level analyses combining health and environmental data are further complicated by joint requirements of the survey sample design and the multilevel analyses.
We compare three modelling methods in a study of physical activity and selected environmental factors using BRFSS 2000 data. Each of the methods described here is a valid way to analyse complex sample survey data augmented with environmental information, although each accounts for the survey design and multilevel data structure in a different manner and is thus appropriate for slightly different research questions.
Release date: 2004-09-13
9. Modelling the impacts of colorectal cancer screening in Canada using POHEM Archived
Articles and reports: 11-522-X20020016722
Geography: Canada
Description:
Colorectal cancer (CRC) is the second leading cause of cancer deaths in Canada. Randomized controlled trials (RCT) have shown the efficacy of screening using faecal occult blood tests (FOBT). A comprehensive evaluation of the costs and consequences of CRC screening for the Canadian population is required before implementing such a program. This paper evaluates whether CRC screening is cost-effective. The results of these simulations will be provided to the Canadian National Committee on Colorectal Cancer Screening to help formulate national policy recommendations for CRC screening.
Statistics Canada's Population Health Microsimulation Model was updated to incorporate a comprehensive CRC screening module based on Canadian data and RCT efficacy results. The module incorporated sensitivity and specificity of FOBT and colonoscopy, participation rates, incidence, staging, diagnostic and therapeutic options, disease progression, mortality and direct health care costs for different screening scenarios. Reproducing the mortality reduction observed in the Funen screening trial validated the model.
Release date: 2004-09-13
10. The analysis of survey data using Stata: Some recent developments Archived
Articles and reports: 11-522-X20020016724
Description:
Some of the most commonly used statistical models are fitted using maximum likelihood (ML) or some extension of ML. Stata's ML command provides researchers and data analysts with a tool to develop estimation commands to fit their models using their data. Such models may include multiple equations, clustered observations, sampling weights and other survey design characteristics. These elements are discussed in this paper.
Release date: 2004-09-13

Reference (1)

Reference (1) (1 result)

1. The Statistics Canada Productivity Program: Methodology 2000 Archived
Surveys and statistical programs – Documentation: 15-002-M2001001
Description:
This document describes the sources, concepts and methods utilized by the Canadian Productivity Accounts and discusses how they compare with their U.S. counterparts.
Release date: 2004-12-24

Date modified: 2024-04-19