Results

All (44) (0 to 10 of 44 results)

  • Surveys and statistical programs – Documentation: 15-002-M2001001
    Description:

    This document describes the sources, concepts and methods utilized by the Canadian Productivity Accounts and discusses how they compare with their U.S. counterparts.

    Release date: 2004-12-24

  • Articles and reports: 11F0019M2004233
    Geography: Canada
    Description:

    In Canada's federal system for economic (skilled) class immigrant selection, education is treated as if it is homogeneous and differs only in quantity. Some provinces, however, differentiate based on postsecondary field of study. This study explores the economic implications of field of study for each sex, and for two subgroups of immigrants: those educated in Canada and those educated elsewhere.

    Field of study is not observed to explain much of the earnings difference between immigrants and the Canadian born, though it is relatively more important for males than females in doing so. Interestingly, while there are a few exceptions, a general pattern is observed whereby the differences between high- and low-earning fields are not as large for immigrants as for the Canadian born. Similarly, social assistance receipt has smaller variance across fields for immigrants than for the Canadian born. Nevertheless, substantial inter-field differences are observed for each immigrant group.

    Release date: 2004-10-28

  • Articles and reports: 11-522-X20020016430
    Description:

    Linearization (or Taylor series) methods are widely used to estimate standard errors for the coefficients of linear regression models fit to multi-stage samples. When the number of primary sampling units (PSUs) is large, linearization can produce accurate standard errors under quite general conditions. However, when the number of PSUs is small or a coefficient depends primarily on data from a small number of PSUs, linearization estimators can have large negative bias.

    In this paper, we characterize features of the design matrix that produce large bias in linearization standard errors for linear regression coefficients. We then propose a new method, bias-reduced linearization (BRL), based on residuals adjusted to better approximate the covariance of the true errors. When the errors are independent and identically distributed (i.i.d.), the BRL estimator is unbiased for the variance. Furthermore, a simulation study shows that BRL can greatly reduce the bias, even if the errors are not i.i.d. We also propose using a Satterthwaite approximation to determine the degrees of freedom of the reference distribution for tests and confidence intervals about linear combinations of coefficients based on the BRL estimator. We demonstrate that the jackknife estimator also tends to be biased in situations where linearization is biased. However, the jackknife's bias tends to be positive. Our bias-reduced linearization estimator can be viewed as a compromise between the traditional linearization and jackknife estimators.

    Release date: 2004-09-13
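
The contrast between linearization and bias-reduced linearization described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the simplest possible case: a single regressor with no intercept, unweighted least squares, and an identity working covariance. Under those assumptions, adjusting each cluster's residuals by the inverse square root of (I - H_gg) reduces to dividing that cluster's squared score by one minus its leverage. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def linearization_and_brl_variance(x, y, cluster):
    """Fit y = beta * x (no intercept) and return beta with two variance estimates.

    var_lin: cluster-level linearization (sandwich) variance estimate.
    var_brl: the same estimate with each cluster's squared score inflated by
             1 / (1 - leverage), the scalar form of the residual adjustment
             under the assumptions stated in the lead-in.
    """
    s_xx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / s_xx

    score = defaultdict(float)  # sum of x_i * residual_i within each cluster
    x_sq = defaultdict(float)   # sum of x_i ** 2 within each cluster
    for xi, yi, g in zip(x, y, cluster):
        score[g] += xi * (yi - beta * xi)
        x_sq[g] += xi * xi

    var_lin = sum(s * s for s in score.values()) / s_xx ** 2

    var_brl = 0.0
    for g, s in score.items():
        leverage = x_sq[g] / s_xx  # share of the design carried by cluster g
        if leverage >= 1.0:
            raise ValueError("one cluster carries all of the design information")
        var_brl += s * s / (1.0 - leverage)
    var_brl /= s_xx ** 2

    return beta, var_lin, var_brl


# Hypothetical toy data: two clusters (PSUs) of two observations each.
beta, var_lin, var_brl = linearization_and_brl_variance(
    x=[1.0, 2.0, 1.0, 2.0],
    y=[1.0, 2.0, 2.0, 3.0],
    cluster=[0, 0, 1, 1],
)
print(beta, var_lin, var_brl)
```

With only two clusters, each holds half of the design information, so the adjusted estimate is twice the plain linearization estimate. This matches the abstract's point that linearization is biased downward when a coefficient depends on few PSUs.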

  • Articles and reports: 11-522-X20020016708
    Description:

    In this paper, we discuss the analysis of complex health survey data using multivariate modelling techniques. Our main interest is in design-based and model-based methods that account for design complexities, including clustering, stratification and weighting. Methods covered include generalized linear modelling based on pseudo-likelihood and generalized estimating equations, linear mixed models estimated by restricted maximum likelihood, and hierarchical Bayes techniques using Markov Chain Monte Carlo (MCMC) methods. The methods are compared empirically, using data from an extensive health interview and examination survey conducted in Finland in 2000 (Health 2000 Study).

    The data of the Health 2000 Study were collected using personal interviews, questionnaires and clinical examinations. A stratified two-stage cluster sampling design was used in the survey. The sampling design involved positive intra-cluster correlation for many study variables. For a closer investigation, we selected a small number of study variables from the health interview and health examination phases. In many cases, the different methods produced similar numerical results and supported similar statistical conclusions. Methods that failed to account for the design complexities sometimes led to conflicting conclusions. We also discuss how the methods in this paper can be applied using standard statistical software products.

    Release date: 2004-09-13

  • Articles and reports: 11-522-X20020016712
    Description:

    In this paper, we consider the effect of the interval censoring of cessation time on intensity parameter estimation with regard to smoking cessation and pregnancy. The three waves of the National Population Health Survey allow the methodology of event history analysis to be applied to smoking initiation, cessation and relapse. One issue of interest is the relationship between smoking cessation and pregnancy. If a longitudinal respondent who is a smoker at the first cycle ceases smoking by the second cycle, we know the cessation time to within an interval of length at most a year, since the respondent is asked for the age at which she stopped smoking, and her date of birth is known. We also know whether she is pregnant at the time of the second cycle, and whether she has given birth since the time of the first cycle. For many such subjects, we know the date of conception to within a relatively small interval. If we knew the time of smoking cessation and pregnancy period exactly for each member who experienced one or other of these events between cycles, we could model their temporal relationship through their joint intensities.

    Release date: 2004-09-13

  • Articles and reports: 11-522-X20020016714
    Description:

    In this highly technical paper, we illustrate the application of the delete-a-group jack-knife variance estimator approach to a particular complex multi-wave longitudinal study, demonstrating its utility for linear regression and other analytic models. The delete-a-group jack-knife variance estimator is proving a very useful tool for measuring variances under complex sampling designs. This technique divides the first-phase sample into mutually exclusive and nearly equal variance groups, deletes one group at a time to create a set of replicates and makes analogous weighting adjustments in each replicate to those done for the sample as a whole. Variance estimation proceeds in the standard (unstratified) jack-knife fashion.

    Our application is to the Chicago Health and Aging Project (CHAP), a community-based longitudinal study examining risk factors for chronic health problems of older adults. A major aim of the study is the investigation of risk factors for incident Alzheimer's disease. The current design of CHAP has two components: (1) Every three years, all surviving members of the cohort are interviewed on a variety of health-related topics. These interviews include cognitive and physical function measures. (2) At each of these waves of data collection, a stratified Poisson sample is drawn from among the respondents to the full population interview for detailed clinical evaluation and neuropsychological testing. To investigate risk factors for incident disease, a 'disease-free' cohort is identified at the preceding time point and forms one major stratum in the sampling frame.

    We provide proofs of the theoretical applicability of the delete-a-group jack-knife for particular estimators under this Poisson design, paying needed attention to the distinction between finite-population and infinite-population (model) inference. In addition, we examine the issue of determining the 'right number' of variance groups.

    Release date: 2004-09-13
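
The delete-a-group jackknife steps summarized in the abstract above (form variance groups, delete one group at a time, reweight, then apply the standard jackknife formula) can be sketched as follows. This is a minimal illustration, not the CHAP analysis. It assumes random assignment of units to groups and a simple rescaling of the remaining weights by G / (G - 1) in place of the fuller replicate weighting adjustments the paper describes. The function names and the toy data are hypothetical.

```python
import random


def weighted_mean(values, weights):
    """Weighted mean, used here as the example estimator."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def delete_a_group_jackknife(values, weights, n_groups, estimator=weighted_mean, seed=0):
    """Return (full-sample estimate, delete-a-group jackknife variance estimate)."""
    n = len(values)
    if not 2 <= n_groups <= n:
        raise ValueError("n_groups must be between 2 and the sample size")

    # Assign units at random to mutually exclusive, nearly equal-sized groups.
    order = list(range(n))
    random.Random(seed).shuffle(order)
    group_of = [0] * n
    for position, unit in enumerate(order):
        group_of[unit] = position % n_groups

    full_estimate = estimator(values, weights)
    factor = n_groups / (n_groups - 1)  # simplified replicate weight adjustment

    replicate_estimates = []
    for g in range(n_groups):
        keep = [i for i in range(n) if group_of[i] != g]  # delete group g
        rep_values = [values[i] for i in keep]
        rep_weights = [weights[i] * factor for i in keep]
        replicate_estimates.append(estimator(rep_values, rep_weights))

    # Standard (unstratified) jackknife variance formula over the G replicates.
    variance = (n_groups - 1) / n_groups * sum(
        (r - full_estimate) ** 2 for r in replicate_estimates
    )
    return full_estimate, variance


# Hypothetical toy data: four equally weighted units, one unit per group.
estimate, variance = delete_a_group_jackknife(
    values=[1.0, 2.0, 3.0, 4.0], weights=[1.0, 1.0, 1.0, 1.0], n_groups=4
)
print(estimate, variance)
```

When every unit is its own group and the weights are equal, the result coincides with the textbook variance of a sample mean (sample variance divided by n), which gives a quick sanity check on the formula. The choice of the number of groups is the issue the abstract raises, and the sketch leaves it as a parameter.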

  • Articles and reports: 11-522-X20020016717
    Description:

    In the United States, the National Health and Nutrition Examination Survey (NHANES) is linked to the National Health Interview Survey (NHIS) at the primary sampling unit level (the same counties, but not necessarily the same persons, are in both surveys). The NHANES examines about 5,000 persons per year, while the NHIS samples about 100,000 persons per year. In this paper, we present and develop properties of models that allow NHIS and administrative data to be used as auxiliary information for estimating quantities of interest in the NHANES. The methodology, related to Fay-Herriot (1979) small-area models and to calibration estimators in Deville and Särndal (1992), accounts for the survey designs in the error structure.

    Release date: 2004-09-13

  • Articles and reports: 11-522-X20020016718
    Description:

    Cancer surveillance research requires accurate estimates of risk factors at the small area level. These risk factors are often obtained from surveys such as the National Health Interview Survey (NHIS) or the Behavioral Risk Factor Surveillance System (BRFSS). Unfortunately, no one population-based survey provides ideal prevalence estimates of such risk factors. One strategy is to combine information from multiple surveys, using the complementary strengths of one survey to compensate for the weakness of the other. The NHIS is a nationally representative, face-to-face survey with a high response rate; however, it cannot produce state or substate estimates of risk factor prevalence because sample sizes are too small. The BRFSS is a state-level telephone survey that excludes non-telephone households and has a lower response rate, but does provide reasonable sample sizes in all states and many counties. Several methods are available for constructing small-area estimators that combine information from both the NHIS and the BRFSS, including direct estimators, estimators under hierarchical Bayes models and model-assisted estimators. In this paper, we focus on the last of these, constructing generalized regression (GREG) and 'minimum-distance' estimators and using existing and newly developed small-area smoothing techniques to smooth the resulting estimators.

    Release date: 2004-09-13

  • Articles and reports: 11-522-X20020016719
    Description:

    This study examines the modelling methods used for public health data. Public health has a renewed interest in the impact of the environment on health. Ecological or contextual studies ideally investigate these relationships using public health data augmented with environmental characteristics in multilevel or hierarchical models. In these models, individual respondents in health data are the first level and community data are the second level. Most public health data use complex sample survey designs, which require analyses accounting for the clustering, nonresponse, and poststratification to obtain representative estimates of prevalence of health risk behaviours.

    This study uses the Behavioral Risk Factor Surveillance System (BRFSS), a state-specific US health risk factor surveillance system conducted by the Centers for Disease Control and Prevention, which assesses health risk factors in over 200,000 adults annually. BRFSS data are now available at the metropolitan statistical area (MSA) level and provide quality health information for studies of environmental effects. MSA-level analyses combining health and environmental data are further complicated by joint requirements of the survey sample design and the multilevel analyses.

    We compare three modelling methods in a study of physical activity and selected environmental factors using BRFSS 2000 data. Each of the methods described here is a valid way to analyse complex sample survey data augmented with environmental information, although each accounts for the survey design and multilevel data structure in a different manner and is thus appropriate for slightly different research questions.

    Release date: 2004-09-13

  • Articles and reports: 11-522-X20020016722
    Geography: Canada
    Description:

    Colorectal cancer (CRC) is the second leading cause of cancer deaths in Canada. Randomized controlled trials (RCTs) have shown the efficacy of screening using faecal occult blood tests (FOBT). A comprehensive evaluation of the costs and consequences of CRC screening for the Canadian population is required before implementing such a program. This paper evaluates whether CRC screening is cost-effective. The results of these simulations will be provided to the Canadian National Committee on Colorectal Cancer Screening to help formulate national policy recommendations for CRC screening.

    Statistics Canada's Population Health Microsimulation Model was updated to incorporate a comprehensive CRC screening module based on Canadian data and RCT efficacy results. The module incorporated sensitivity and specificity of FOBT and colonoscopy, participation rates, incidence, staging, diagnostic and therapeutic options, disease progression, mortality and direct health care costs for different screening scenarios. The model was validated by reproducing the mortality reduction observed in the Funen screening trial.

    Release date: 2004-09-13

  • Articles and reports: 11-522-X20020016724
    Description:

    Some of the most commonly used statistical models are fitted using maximum likelihood (ML) or some extension of ML. Stata's ml command provides researchers and data analysts with a tool to develop estimation commands to fit their models using their data. Such models may include multiple equations, clustered observations, sampling weights and other survey design characteristics. These elements are discussed in this paper.

    Release date: 2004-09-13