Results
All (186) (0 to 10 of 186 results)
- 1. Estimating municipal life expectancy and health-adjusted life expectancy in Canada, 2019 and 2020
Articles and reports: 82-003-X202500800001
Description: Data measuring life expectancy (LE) and health-adjusted life expectancy (HALE) in Canada are available for large geographical areas, such as provinces, territories, and health regions. However, to date, no study has analyzed LE and HALE at the municipal level. To address issues related to sparse administrative and survey data in small geographic areas, this study applies multilevel regression models and poststratification methods that have been shown to provide reliable estimates of population- and small area-level quantities from health surveys.
Release date: 2025-08-20
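The poststratification step named in the abstract above can be illustrated with a minimal sketch. It assumes that a fitted multilevel model has already produced one estimate per demographic cell within a municipality; the function name, cell labels and numbers below are hypothetical and are not taken from the study.

```python
def poststratify(cell_estimates, cell_counts):
    """Combine per-cell estimates into one area-level estimate.

    cell_estimates: dict mapping a cell label to a model-based estimate
                    (hypothetical values; in practice these would come
                    from a fitted multilevel regression model).
    cell_counts:    dict mapping the same labels to known population
                    counts for the area (for example, from a census).
    """
    total = sum(cell_counts[cell] for cell in cell_estimates)
    if total <= 0:
        raise ValueError("population counts must sum to a positive number")
    # Population-weighted average of the cell-level estimates.
    weighted = sum(cell_estimates[cell] * cell_counts[cell]
                   for cell in cell_estimates)
    return weighted / total


# Hypothetical municipality with two age groups.
estimates = {"under_65": 0.2, "65_plus": 0.6}
counts = {"under_65": 300, "65_plus": 100}
area_estimate = poststratify(estimates, counts)
```

Weighting by known population counts rather than by sample counts is what lets a sparse survey sample in a small area borrow its structure from census totals.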
- Public use microdata: 99M0001X
Description: The Individuals File, 2011 National Household Survey (Public Use Microdata Files) provides data on the characteristics of the Canadian population. The file contains a 2.7% sample of anonymous responses to the 2011 National Household Survey (NHS) questionnaire. The files have been carefully scrutinized to ensure the complete confidentiality of the individual responses and geographic identifiers have been restricted to provinces/territories and metropolitan areas. With 133 variables, this comprehensive tool is excellent for policy analysts, pollsters, social researchers and anyone interested in modelling and performing statistical regression analysis using National Household Survey data.
Microdata files uniquely provide users access to non-aggregated data. The PUMFs user can group and manipulate these variables to suit data and research requirements. Tabulations excluded from other NHS products can be created or relationships between variables can be analyzed using different statistical tests. PUMFs provide quick access to a comprehensive social and economic database about Canada and its people.
This product, offered on DVD-ROM, contains the data file (in ASCII format); user documentation and supporting information; all licence agreements; and SAS, SPSS and Stata program source codes to enable users to read the set of records. It is important to note that users will require knowledge of data manipulation packages (or software) such as SAS, SPSS or Stata to use this product.
Release date: 2023-09-12
- Articles and reports: 12-001-X202300100002
Description: We consider regression analysis in the context of data integration. To combine partial information from external sources, we employ the idea of model calibration which introduces a “working” reduced model based on the observed covariates. The working reduced model is not necessarily correctly specified but can be a useful device to incorporate the partial information from the external data. The actual implementation is based on a novel application of the information projection and model calibration weighting. The proposed method is particularly attractive for combining information from several sources with different missing patterns. The proposed method is applied to a real data example combining survey data from the Korean National Health and Nutrition Examination Survey and big data from the National Health Insurance Sharing Service in Korea.
Release date: 2023-06-30
- Articles and reports: 11-522-X202100100009
Description:
Use of auxiliary data to improve the efficiency of estimators of totals and means through model-assisted survey regression estimation has received considerable attention in recent years. Generalized regression (GREG) estimators, based on a working linear regression model, are currently used in establishment surveys at Statistics Canada and several other statistical agencies. GREG estimators use common survey weights for all study variables and calibrate to known population totals of auxiliary variables. Increasingly, many auxiliary variables are available, some of which may be extraneous. This leads to unstable GREG weights when all the available auxiliary variables, including interactions among categorical variables, are used in the working linear regression model. On the other hand, new machine learning methods, such as regression trees and lasso, automatically select significant auxiliary variables and lead to stable nonnegative weights and possible efficiency gains over GREG. In this paper, a simulation study, based on a real business survey sample data set treated as the target population, is conducted to study the relative performance of GREG, regression trees and lasso in terms of efficiency of the estimators.
Key Words: Model assisted inference; calibration estimation; model selection; generalized regression estimator.
Release date: 2021-10-29
- 5. Refugees and Canadian Post-Secondary Education: Characteristics and Economic Outcomes in Comparison (Archived)
Articles and reports: 89-657-X2018001
Description:
This study draws on data from the Longitudinal Immigration Database to examine participation in Canadian post-secondary education (PSE) among adult immigrants in the 2002-2005 landing cohort, with an explicit focus on resettled refugees. The study describes the demographic characteristics of participants, the qualities of participation, and the economic returns on investment in Canadian PSE. It also employs multivariate regression analysis to further examine the effects of participation in Canadian training on employment incidence and the income of those employed, while controlling for other factors associated with successful economic integration.
Release date: 2018-11-14
- 6. A comparison between nonparametric estimators for finite population distribution functions (Archived)
Articles and reports: 12-001-X201600114541
Description:
In this work we compare nonparametric estimators for finite population distribution functions based on two types of fitted values: the fitted values from the well-known Kuo estimator and a modified version of them, which incorporates a nonparametric estimate for the mean regression function. For each type of fitted values we consider the corresponding model-based estimator and, after incorporating design weights, the corresponding generalized difference estimator. We show under fairly general conditions that the leading term in the model mean square error is not affected by the modification of the fitted values, even though it slows down the convergence rate for the model bias. Second order terms of the model mean square errors are difficult to obtain and will not be derived in the present paper. It remains thus an open question whether the modified fitted values bring about some benefit from the model-based perspective. We discuss also design-based properties of the estimators and propose a variance estimator for the generalized difference estimator based on the modified fitted values. Finally, we perform a simulation study. The simulation results suggest that the modified fitted values lead to a considerable reduction of the design mean square error if the sample size is small.
Release date: 2016-06-22
- Articles and reports: 12-001-X201600114543
Description:
The regression estimator is extensively used in practice because it can improve the reliability of the estimated parameters of interest such as means or totals. It uses control totals of variables known at the population level that are included in the regression set up. In this paper, we investigate the properties of the regression estimator that uses control totals estimated from the sample, as well as those known at the population level. This estimator is compared to the regression estimators that strictly use the known totals both theoretically and via a simulation study.
Release date: 2016-06-22
- Articles and reports: 12-001-X201600114545
Description:
The estimation of quantiles is an important topic not only in the regression framework, but also in sampling theory. Expectiles are a natural alternative or addition to quantiles. As a generalization of the mean, expectiles have become popular in recent years because they not only give a more detailed picture of the data than the ordinary mean, but can also serve as a basis for calculating quantiles through their close relationship. We show how to estimate expectiles under sampling with unequal probabilities and how expectiles can be used to estimate the distribution function. The resulting fitted distribution function estimator can be inverted, leading to quantile estimates. We run a simulation study to investigate and compare the efficiency of the expectile-based estimator.
Release date: 2016-06-22
- 9. The Impact of Annual Wages on Interprovincial Mobility, Interprovincial Employment, and Job Vacancies (Archived)
Articles and reports: 11F0019M2016376
Geography: Canada, Province or territory
Description: The degree to which workers move across geographic areas in response to emerging employment opportunities or negative labour demand shocks is a key element in the adjustment process of an economy, and its ability to reach a desired allocation of resources.
This study estimates the causal impact of real after-tax annual wages and salaries on the propensity of young men to migrate to Alberta or to accept jobs in that province while maintaining residence in their home province. To do so, it exploits the cross-provincial variation in earnings growth plausibly induced by increases in world oil prices that occurred during the 2000s.
Release date: 2016-04-11
- 10. Do Workplace Pensions Crowd Out Other Retirement Savings? Evidence from Canadian Tax Records (Archived)
Articles and reports: 11F0019M2015371
Description:
This paper investigates whether registered pension plans (RPPs) help households prepare financially for retirement or simply substitute for other forms of private saving. This issue is addressed using a panel of 1.8 million Canadian households, from 1991 to 2010, which appear in the Longitudinal Administrative Databank. The analysis controls for correlations in savings across accounts due to unobserved tastes for saving by exploiting the fact that employer contribution rates increase discontinuously on earnings above the average industrial wage, a unique feature of occupational pensions in Canada, the effect being estimated in a Regression Kink Design.
Release date: 2015-12-21
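A regression kink design, as named in the abstract above, draws its estimate from a change in slope at a known threshold: the slope change in the outcome is divided by the slope change in the treatment. The sketch below is a simplified illustration on synthetic data, not the paper's specification. It fits separate straight lines on each side of the kink, whereas applied work typically uses local polynomial fits within a bandwidth. All function names, variable names and numbers are hypothetical.

```python
import numpy as np


def slope_change_at_kink(running, values, kink):
    """Difference between the fitted linear slopes right and left of the kink."""
    running = np.asarray(running, dtype=float)
    values = np.asarray(values, dtype=float)
    left = running < kink
    right = ~left
    # np.polyfit returns coefficients from highest degree down,
    # so index 0 is the slope of a degree-1 fit.
    slope_left = np.polyfit(running[left], values[left], 1)[0]
    slope_right = np.polyfit(running[right], values[right], 1)[0]
    return slope_right - slope_left


def kink_estimate(running, outcome, treatment, kink):
    """Ratio of the outcome slope change to the treatment slope change."""
    return (slope_change_at_kink(running, outcome, kink)
            / slope_change_at_kink(running, treatment, kink))


# Synthetic data: earnings relative to a threshold at 1.0 (hypothetical).
earnings = np.linspace(0.0, 2.0, 41)
# The contribution rate rises more steeply above the threshold.
contributions = 0.05 * earnings + 0.03 * np.maximum(earnings - 1.0, 0.0)
# Other saving falls by 0.4 per unit of contributions (assumed for the demo).
other_saving = 1.0 + 0.2 * earnings - 0.4 * contributions

effect = kink_estimate(earnings, other_saving, contributions, kink=1.0)
```

Because the synthetic data are noise-free and exactly piecewise linear, the ratio recovers the assumed coefficient; with real data the choice of bandwidth and polynomial order would matter.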
Data (2) (2 results)
- Public use microdata: 99M0001X
Description: The Individuals File, 2011 National Household Survey (Public Use Microdata Files) provides data on the characteristics of the Canadian population. The file contains a 2.7% sample of anonymous responses to the 2011 National Household Survey (NHS) questionnaire. The files have been carefully scrutinized to ensure the complete confidentiality of the individual responses and geographic identifiers have been restricted to provinces/territories and metropolitan areas. With 133 variables, this comprehensive tool is excellent for policy analysts, pollsters, social researchers and anyone interested in modelling and performing statistical regression analysis using National Household Survey data.
Microdata files uniquely provide users access to non-aggregated data. The PUMFs user can group and manipulate these variables to suit data and research requirements. Tabulations excluded from other NHS products can be created or relationships between variables can be analyzed using different statistical tests. PUMFs provide quick access to a comprehensive social and economic database about Canada and its people.
This product, offered on DVD-ROM, contains the data file (in ASCII format); user documentation and supporting information; all licence agreements; and SAS, SPSS and Stata program source codes to enable users to read the set of records. It is important to note that users will require knowledge of data manipulation packages (or software) such as SAS, SPSS or Stata to use this product.
Release date: 2023-09-12
- 2. Bilingualism and earnings (Archived)
Table: 75-001-X19890022277
Description:
This study compares the earnings of bilingual and unilingual workers in three urban centres: Montreal, Toronto and Ottawa-Hull. Differences in the earnings of bilingual and unilingual workers are considered in the light of several demographic and job-related traits.
Release date: 1989-06-30
Analysis (174) (50 to 60 of 174 results)
- Articles and reports: 12-001-X200700210488
Description:
Calibration is the principal theme in many recent articles on estimation in survey sampling. Words such as "calibration approach" and "calibration estimators" are frequently used. As article authors like to point out, calibration provides a systematic way to incorporate auxiliary information in the procedure.
Calibration has established itself as an important methodological instrument in large-scale production of statistics. Several national statistical agencies have developed software designed to compute weights, usually calibrated to auxiliary information available in administrative registers and other accurate sources.
This paper presents a review of the calibration approach, with an emphasis on progress achieved in the past decade or so. The literature on calibration is growing rapidly; selected issues are discussed in this paper. The paper starts with a definition of the calibration approach. Its important features are reviewed. The calibration approach is contrasted with (generalized) regression estimation, which is an alternative but conceptually different way to take auxiliary information into account. The computational aspects of calibration are discussed, including methods for avoiding extreme weights. In the early sections of the paper, simple applications of calibration are examined: The estimation of a population total in direct, single phase sampling. Generalization to more complex parameters and more complex sampling designs are then considered. A common feature of more complex designs (sampling in two or more phases or stages) is that the available auxiliary information may consist of several components or layers. The uses of calibration in such cases of composite information are reviewed. Later in the paper, examples are given to illustrate how the results of the calibration thinking may contrast with answers given by earlier established approaches. Finally, applications of calibration in the presence of nonsampling error are discussed, in particular methods for nonresponse bias adjustment.
Release date: 2008-01-03
- 52. Literacy and the Labour Market: The Generation of Literacy and Its Impact on Earnings for Native Born Canadians (Archived)
Articles and reports: 89-552-M2007018
Geography: Canada
Description:
This study examines the distribution of literacy skills in the Canadian economy and the ways in which they are generated. In large part, the generation of literacy skills has to do with formal schooling and parental inputs into their children's education. The nature of literacy generation in the years after individuals have left formal schooling and are in the labour market is also investigated. Once the core facts about literacy in the economy have been established, the study turns to examining the impact of increased literacy on individual earnings. Both the causal impact of literacy on earnings and the joint distribution of literacy and income are explored. The authors argue that the latter provides a more complete measure of how well an individual is able to function in society.
The study focuses mainly on data from the Canadian component of the 2003 International Adult Literacy and Skills Survey (IALSS), composed of a sample of over 22,000 respondents. The Canadian component of the 1994 International Adult Literacy Survey (IALS) is also used in order to obtain a more complete picture of how literacy changes with age and across birth cohorts.
Release date: 2007-11-30
- Articles and reports: 12-001-X20070019847
Description:
We investigate the impact of cluster sampling on standard errors in the analysis of longitudinal survey data. We consider a widely used class of regression models for longitudinal data and a standard class of point estimators of a generalized least squares type. We argue theoretically that the impact of ignoring clustering in standard error estimation will tend to increase with the number of waves in the analysis, under some patterns of clustering which are realistic for many social surveys. The implication is that it is, in general, at least as important to allow for clustering in standard errors for longitudinal analyses as for cross-sectional analyses. We illustrate this theoretical argument with empirical evidence from a regression analysis of longitudinal data on gender role attitudes from the British Household Panel Survey. We also compare two approaches to variance estimation in the analysis of longitudinal survey data: a survey sampling approach based upon linearization and a multilevel modelling approach. We conclude that the impact of clustering can be seriously underestimated if it is simply handled by including an additive random effect to represent the clustering in a multilevel model.
Release date: 2007-06-28
- Articles and reports: 12-001-X20070019848
Description:
We investigate some modifications of the classical single-spell Cox model in order to handle multiple spells from the same individual when the data are collected in a longitudinal survey based on a complex sample design. One modification is the use of a design-based approach for the estimation of the model coefficients and their variances; in the variance estimation each individual is treated as a cluster of spells, bringing an extra stage of clustering into the survey design. Other modifications to the model allow a flexible specification of the baseline hazard to account for possibly differential dependence of hazard on the order and duration of successive spells, and also allow for differential effects of the covariates on the spells of different orders. These approaches are illustrated using data from the Canadian Survey of Labour and Income Dynamics (SLID).
Release date: 2007-06-28
- Articles and reports: 12-001-X20070019849
Description:
In sample surveys where units have unequal probabilities of inclusion in the sample, associations between the probability of inclusion and the statistic of interest can induce bias. Weights equal to the inverse of the probability of inclusion are often used to counteract this bias. Highly disproportional sample designs have large weights, which can introduce undesirable variability in statistics such as the population mean estimator or population regression estimator. Weight trimming reduces large weights to a fixed cutpoint value and adjusts weights below this value to maintain the untrimmed weight sum, reducing variability at the cost of introducing some bias. Most standard approaches are ad-hoc in that they do not use the data to optimize bias-variance tradeoffs. Approaches described in the literature that are data-driven are a little more efficient than fully-weighted estimators. This paper develops Bayesian methods for weight trimming of linear and generalized linear regression estimators in unequal probability-of-inclusion designs. An application to estimate injury risk of children rear-seated in compact extended-cab pickup trucks using the Partners for Child Passenger Safety surveillance survey is considered.
Release date: 2007-06-28
- Articles and reports: 12-001-X20070019850
Description:
Auxiliary information is often used to improve the precision of survey estimators of finite population means and totals through ratio or linear regression estimation techniques. Resulting estimators have good theoretical and practical properties, including invariance, calibration and design consistency. However, it is not always clear that ratio or linear models are good approximations to the true relationship between the auxiliary variables and the variable of interest in the survey, resulting in efficiency loss when the model is not appropriate. In this article, we explain how regression estimation can be extended to incorporate semiparametric regression models, in both simple and more complicated designs. While maintaining the good theoretical and practical properties of the linear models, semiparametric models are better able to capture complicated relationships between variables. This often results in substantial gains in efficiency. The applicability of the approach for complex designs using multiple types of auxiliary variables will be illustrated by estimating several acidification-related characteristics for a survey of lakes in the Northeastern US.
Release date: 2007-06-28
- Articles and reports: 12-001-X20070019852
Description:
A common class of survey designs involves selecting all people within selected households. Generalized regression estimators can be calculated at either the person or household level. Implementing the estimator at the household level has the convenience of equal estimation weights for people within households. In this article the two approaches are compared theoretically and empirically for the case of simple random sampling of households and selection of all persons in each selected household. We find that the household level approach is theoretically more efficient in large samples and any empirical inefficiency in small samples is limited.
Release date: 2007-06-28
- 58. Mean-adjusted bootstrap for two-phase sampling (Archived)
Articles and reports: 12-001-X20070019853
Description:
Two-phase sampling is a useful design when the auxiliary variables are unavailable in advance. Variance estimation under this design, however, is complicated particularly when sampling fractions are high. This article addresses a simple bootstrap method for two-phase simple random sampling without replacement at each phase with high sampling fractions. It works for the estimation of distribution functions and quantiles since no rescaling is performed. The method can be extended to stratified two-phase sampling by independently repeating the proposed procedure in different strata. Variance estimation of some conventional estimators, such as the ratio and regression estimators, is studied for illustration. A simulation study is conducted to compare the proposed method with existing variance estimators for estimating distribution functions and quantiles.
Release date: 2007-06-28
- 59. Innovativeness and Export Orientation Among Establishments in Knowledge-Intensive Business Services (KIBS) (Archived)
Articles and reports: 88F0006X2007001
Description: This study examines the factors that explain export orientation among Canadian Knowledge-Intensive Business Services (KIBS) firms, particularly innovativeness, while controlling for foreign control, size of establishment, training level of workforce, use of intellectual property protection and industry type. The data are based on the 2003 Survey of Innovation.
Release date: 2007-04-03
- 60. Regression with latent variables (Archived)
Articles and reports: 11-522-X20050019473
Description:
This talk will provide a brief overview of some techniques, highlighting the advantages and disadvantages of each, with particular reference to the data types usually encountered in the social sciences. The overview will touch on naïve methods based on the use of latent variable scores, and on methods for correcting and/or avoiding the biases associated with such analyses. The talk will conclude with a brief description of some recent applications to probit and logistic regression with latent predictor variables, and with suggestions for future research.
Release date: 2007-03-02
Reference (10) (10 results)
- Surveys and statistical programs – Documentation: 11-522-X20010016308
Description:
This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.
The Census Bureau uses response error analysis to evaluate the effectiveness of survey questions. For a given survey, questions that are deemed critical to the survey or considered problematic from past examination are selected for analysis. New or revised questions are prime candidates for re-interview. Re-interview is a new interview where a subset of questions from the original interview are re-asked to a sample of the survey respondents. For each re-interview question, the proportion of respondents who give inconsistent responses is evaluated. The "Index of Inconsistency" is used as the measure of response variance. Each question is labelled low, moderate, or high in response variance. In high response variance cases, the questions are put through cognitive testing, and modifications to the question are recommended.
The Schools and Staffing Survey (SASS) sponsored by The National Center for Education Statistics (NCES), is also investigated for response error analysis and the possible relationships between inconsistent responses and characteristics of the schools and teachers in that survey. Results of this analysis can be used to change survey procedures and improve data quality.
Release date: 2002-09-12
- 2. Particulate matter and daily mortality: Combining time series information from eight U.S. cities (Archived)
Surveys and statistical programs – Documentation: 11-522-X19990015656
Description:
Time series studies have shown associations between air pollution concentrations and morbidity and mortality. These studies have largely been conducted within single cities, and with varying methods. Critics of these studies have questioned the validity of the data sets used and the statistical techniques applied to them; the critics have noted inconsistencies in findings among studies and even in independent re-analyses of data from the same city. In this paper we review some of the statistical methods used to analyze a subset of a national data base of air pollution, mortality and weather assembled during the National Morbidity and Mortality Air Pollution Study (NMMAPS).
Release date: 2000-03-02
- 3. A donor imputation system to create a census database fully adjusted for underenumeration (Archived)
Surveys and statistical programs – Documentation: 11-522-X19990015668
Description:
Following the problems with estimating underenumeration in the 1991 Census of England and Wales, the aim for the 2001 Census is to create a database that is fully adjusted for net underenumeration. To achieve this, the paper investigates weighted donor imputation methodology that utilises information from both the census and census coverage survey (CCS). The US Census Bureau has considered a similar approach for their 2000 Census (see Isaki et al. 1998). The proposed procedure distinguishes between individuals who are not counted by the census because their household is missed and those who are missed in counted households. Census data is linked to data from the CCS. Multinomial logistic regression is used to estimate the probabilities that households are missed by the census and the probabilities that individuals are missed in counted households. Household and individual coverage weights are constructed from the estimated probabilities and these feed into the donor imputation procedure.
Release date: 2000-03-02
- Surveys and statistical programs – Documentation: 11-522-X19990015682
Description:
The application of dual system estimation (DSE) to matched Census / Post Enumeration Survey (PES) data in order to measure net undercount is well understood (Hogan, 1993). However, this approach has so far not been used to measure net undercount in the UK. The 2001 PES in the UK will use this methodology. This paper presents the general approach to design and estimation for this PES (the 2001 Census Coverage Survey). The estimation combines DSE with standard ratio and regression estimation. A simulation study using census data from the 1991 Census of England and Wales demonstrates that the ratio model is in general more robust than the regression model.
Release date: 2000-03-02
- 5. Simultaneous calibration of several surveys (Archived)
Surveys and statistical programs – Documentation: 11-522-X19990015684
Description:
Often, the same information is gathered almost simultaneously for several different surveys. In France, this practice is institutionalized for household surveys that have a common set of demographic variables, i.e., employment, residence and income. These variables are important co-factors for the variables of interest in each survey, and if used carefully, can reinforce the estimates derived from each survey. Techniques for calibrating uncertain data can apply naturally in this context. This involves finding the best unbiased estimator in common variables and calibrating each survey based on that estimator. The estimator thus obtained in each survey is always a linear estimator, the weightings of which can be easily explained and the variance can be obtained with no new problems, as can the variance estimate. To supplement the list of regression estimators, this technique can also be seen as a ridge-regression estimator, or as a Bayesian-regression estimator.
Release date: 2000-03-02
- Surveys and statistical programs – Documentation: 11-522-X19990015688
Description:
The geographical and temporal relationship between outdoor air pollution and asthma was examined by linking together data from multiple sources. These included the administrative records of 59 general practices widely dispersed across England and Wales for half a million patients and all their consultations for asthma, supplemented by a socio-economic interview survey. Postcode enabled linkage with: (i) computed local road density; (ii) emission estimates of sulphur dioxide and nitrogen dioxides; (iii) measured/interpolated concentration of black smoke, sulphur dioxide, nitrogen dioxide and other pollutants at practice level. Parallel Poisson time series analysis took into account between-practice variations to examine daily correlations in practices close to air quality monitoring stations. Preliminary analyses show small and generally non-significant geographical associations between consultation rates and pollution markers. The methodological issues relevant to combining such data, and the interpretation of these results, will be discussed.
Release date: 2000-03-02 - Surveys and statistical programs – Documentation: 11-522-X19990015692Description:
Electricity rates that vary by time-of-day have the potential to significantly increase economic efficiency in the energy market. A number of utilities have undertaken economic studies of time-of-use rate schemes for their residential customers. This paper uses meta-analysis to examine the impact of time-of-use rates on electricity demand, pooling the results of thirty-eight separate programs. There are four key findings. First, very large peak to off-peak price ratios are needed to significantly affect peak demand. Second, summer peak rates are relatively effective compared to winter peak rates. Third, permanent time-of-use rates are relatively effective compared to experimental ones. Fourth, demand charges rival ordinary time-of-use rates in terms of impact.
Release date: 2000-03-02 - Surveys and statistical programs – Documentation: 11-522-X19980015017Description:
Longitudinal studies with repeated observations on individuals permit better characterizations of change and assessment of possible risk factors, but there has been little experience applying sophisticated models for longitudinal data to the complex survey setting. We present results from a comparison of different variance estimation methods for random effects models of change in cognitive function among older adults. The sample design is a stratified sample of people 65 and older, drawn as part of a community-based study designed to examine risk factors for dementia. The model summarizes the population heterogeneity in overall level and rate of change in cognitive function using random effects for intercept and slope. We discuss an unweighted regression including covariates for the stratification variables, a weighted regression, and bootstrapping; we also did preliminary work into using balanced repeated replication and jackknife repeated replication.
Release date: 1999-10-22 - Surveys and statistical programs – Documentation: 11-522-X19980015029Description:
In longitudinal surveys, sample subjects are observed over several time points. This feature typically leads to dependent observations on the same subject, in addition to the customary correlations across subjects induced by the sample design. Much research in the literature has focussed on modeling the marginal mean of a response as a function of covariates. Liang and Zeger (1986) used generalized estimating equations (GEE), requiring only correct specification of the marginal mean, and obtained standard errors of regression parameter estimates and associated Wald tests, assuming a "working" correlation structure for the repeated measurements on a sample subject. Rotnitzky and Jewell (1990) developed quasi-score tests and Rao-Scott adjustments to "working" quasi-score tests under marginal models. These methods are asymptotically robust to misspecification of the within-subject correlation structure, but they assume independence of sample subjects, which is not satisfied for complex longitudinal survey data based on stratified multi-stage sampling. We propose asymptotically valid Wald and quasi-score tests for longitudinal survey data, using the Taylor linearization and jackknife methods. Alternative tests, based on Rao-Scott adjustments to naive tests that ignore survey design features and on Bonferroni-t, are also developed. These tests are particularly useful when the effective degrees of freedom, usually taken as the total number of sample primary units (clusters) minus the number of strata, is small.
Release date: 1999-10-22 - 10. Estimation with partial overlap longitudinal samples Archived Surveys and statistical programs – Documentation: 11-522-X19980015035Description:
In a longitudinal survey conducted for k periods, some units may be observed for fewer than k of the periods. Examples include surveys designed with partially overlapping subsamples, a pure panel survey with nonresponse, and a panel survey supplemented with additional samples for some of the time periods. Estimators of the regression type are exhibited for such surveys. An application to special studies associated with the National Resources Inventory is discussed.
Release date: 1999-10-22