Keyword search

Results

All (186) (20 to 30 of 186 results)

21. Variance estimation under composite imputation: The methodology behind SEVANI Archived
Articles and reports: 12-001-X201100211605
Description:
Composite imputation is often used in business surveys. The term "composite" means that more than a single imputation method is used to impute missing values for a variable of interest. The literature on variance estimation in the presence of composite imputation is rather limited. To deal with this problem, we consider an extension of the methodology developed by Särndal (1992). Our extension is quite general and easy to implement provided that linear imputation methods are used to fill in the missing values. This class of imputation methods contains linear regression imputation, donor imputation and auxiliary value imputation, sometimes called cold-deck or substitution imputation. It thus covers the most common methods used by national statistical agencies for the imputation of missing values. Our methodology has been implemented in the System for the Estimation of Variance due to Nonresponse and Imputation (SEVANI) developed at Statistics Canada. Its performance is evaluated in a simulation study.
Release date: 2011-12-21
22. Maximum likelihood estimation for contingency tables and logistic regression with incorrectly linked data Archived
Articles and reports: 12-001-X201100111444
Description:
Data linkage is the act of bringing together records that are believed to belong to the same unit (e.g., person or business) from two or more files. It is a very common way to enhance dimensions such as time and breadth or depth of detail. Data linkage is often not an error-free process and can lead to linking a pair of records that do not belong to the same unit. There is an explosion of record linkage applications, yet there has been little work on assuring the quality of analyses using such linked files. Naively treating such a linked file as if it were linked without errors will, in general, lead to biased estimates. This paper develops a maximum likelihood estimator for contingency tables and logistic regression with incorrectly linked records. The estimation technique is simple and is implemented using the well-known EM algorithm. A well-known method of linking records in the present context is probabilistic data linking. The paper demonstrates the effectiveness of the proposed estimators in an empirical study which uses probabilistic data linkage.
Release date: 2011-06-29
23. Replication variance estimation under two-phase sampling Archived
Articles and reports: 12-001-X201100111448
Description:
In two-phase sampling for stratification, the second-phase sample is selected by stratified sampling based on the information observed in the first-phase sample. We develop a replication-based, bias-adjusted variance estimator that extends the method of Kim, Navarro and Fuller (2006). The proposed method is also applicable when the first-phase sampling rate is not negligible and when second-phase sample selection is unequal probability Poisson sampling within each stratum. The proposed method can be extended to variance estimation for two-phase regression estimators. Results from a limited simulation study are presented.
Release date: 2011-06-29
24. Comparison of survey regression techniques in the context of small area estimation of poverty Archived
Articles and reports: 12-001-X201000211378
Description:
One key to poverty alleviation or eradication in the third world is reliable information on the poor and their location, so that interventions and assistance can be effectively targeted to the neediest people. Small area estimation is one statistical technique that is used to monitor poverty and to decide on aid allocation in pursuit of the Millennium Development Goals. Elbers, Lanjouw and Lanjouw (ELL) (2003) proposed a small area estimation methodology for income-based or expenditure-based poverty measures, which is implemented by the World Bank in its poverty mapping projects via the involvement of the central statistical agencies in many third world countries, including Cambodia, Lao PDR, the Philippines, Thailand and Vietnam, and is incorporated into the World Bank software program PovMap. In this paper, the ELL methodology which consists of first modeling survey data and then applying that model to census information is presented and discussed with strong emphasis on the first phase, i.e., the fitting of regression models and on the estimated standard errors at the second phase. Other regression model fitting procedures such as the General Survey Regression (GSR) (as described in Lohr (1999) Chapter 11) and those used in existing small area estimation techniques: Pseudo-Empirical Best Linear Unbiased Prediction (Pseudo-EBLUP) approach (You and Rao 2002) and Iterative Weighted Estimating Equation (IWEE) method (You, Rao and Kovacevic 2003) are presented and compared with the ELL modeling strategy. The most significant difference between the ELL method and the other techniques is in the theoretical underpinning of the ELL model fitting procedure. 
An example based on the Philippines Family Income and Expenditure Survey is presented to show the differences in both the parameter estimates and their corresponding standard errors, and in the variance components generated from the different methods and the discussion is extended to the effect of these on the estimated accuracy of the final small area estimates themselves. The need for sound estimation of variance components, as well as regression estimates and estimates of their standard errors for small area estimation of poverty is emphasized.
Release date: 2010-12-21
25. Linearization variance estimation for generalized raking estimators in the presence of nonresponse Archived
Articles and reports: 12-001-X201000211380
Description:
Alternative forms of linearization variance estimators for generalized raking estimators are defined via different choices of the weights applied (a) to residuals and (b) to the estimated regression coefficients used in calculating the residuals. Some theory is presented for three forms of generalized raking estimator, the classical raking ratio estimator, the 'maximum likelihood' raking estimator and the generalized regression estimator, and for associated linearization variance estimators. A simulation study is undertaken, based upon a labour force survey and an income and expenditure survey. Properties of the estimators are assessed with respect to both sampling and nonresponse. The study displays little difference between the properties of the alternative raking estimators for a given sampling scheme and nonresponse model. Amongst the variance estimators, the approach which weights residuals by the design weight can be severely biased in the presence of nonresponse. The approach which weights residuals by the calibrated weight tends to display much less bias. Varying the choice of the weights used to construct the regression coefficients has little impact.
Release date: 2010-12-21
26. Linearization variance estimators for model parameters from complex survey data Archived
Articles and reports: 12-001-X201000211381
Description:
Taylor linearization methods are often used to obtain variance estimators for calibration estimators of totals and nonlinear finite population (or census) parameters, such as ratios, regression and correlation coefficients, which can be expressed as smooth functions of totals. Taylor linearization is generally applicable to any sampling design, but it can lead to multiple variance estimators that are asymptotically design unbiased under repeated sampling. The choice among the variance estimators requires other considerations such as (i) approximate unbiasedness for the model variance of the estimator under an assumed model, and (ii) validity under a conditional repeated sampling framework. Demnati and Rao (2004) proposed a unified approach to deriving Taylor linearization variance estimators that leads directly to a unique variance estimator that satisfies the above considerations for general designs. When analyzing survey data, finite populations are often assumed to be generated from super-population models, and analytical inferences on model parameters are of interest. If the sampling fractions are small, then the sampling variance captures almost the entire variation generated by the design and model random processes. However, when the sampling fractions are not negligible, the model variance should be taken into account in order to construct valid inferences on model parameters under the combined process of generating the finite population from the assumed super-population model and the selection of the sample according to the specified sampling design. In this paper, we obtain an estimator of the total variance, using the Demnati-Rao approach, when the characteristics of interest are assumed to be random variables generated from a super-population model. We illustrate the method using ratio estimators and estimators defined as solutions to calibration weighted estimating equations. 
Simulation results on the performance of the proposed variance estimator for model parameters are also presented.
Release date: 2010-12-21
27. Some contributions to jackknifing two-phase sampling estimators Archived
Articles and reports: 12-001-X201000111247
Description:
In this paper, the problem of estimating the variance of various estimators of the population mean in two-phase sampling is considered by jackknifing the two-phase calibrated weights of Hidiroglou and Särndal (1995, 1998). Several estimators of the population mean available in the literature are shown to be special cases of the technique developed here, including those suggested by Rao and Sitter (1995) and Sitter (1997). By following Raj (1965) and Srivenkataramana and Tracy (1989), some new estimators of the population mean are introduced and their variances are estimated through the proposed jackknife procedure. The variances of the chain ratio and regression type estimators due to Chand (1975) are also estimated using the jackknife. A simulation study is conducted to assess the efficiency of the proposed jackknife estimators relative to the usual estimators of variance.
Release date: 2010-06-29
28. Bayesian penalized spline model-based inference for finite population proportion in unequal probability sampling Archived
Articles and reports: 12-001-X201000111250
Description:
We propose a Bayesian Penalized Spline Predictive (BPSP) estimator for a finite population proportion in an unequal probability sampling setting. This new method allows the probabilities of inclusion to be directly incorporated into the estimation of a population proportion, using a probit regression of the binary outcome on the penalized spline of the inclusion probabilities. The posterior predictive distribution of the population proportion is obtained using Gibbs sampling. The advantages of the BPSP estimator over the Hájek (HK), Generalized Regression (GR), and parametric model-based prediction estimators are demonstrated by simulation studies and a real example in tax auditing. Simulation studies show that the BPSP estimator is more efficient, and its 95% credible interval provides better confidence coverage with shorter average width than the HK and GR estimators, especially when the population proportion is close to zero or one or when the sample is small. Compared to linear model-based predictive estimators, the BPSP estimators are robust to model misspecification and influential observations in the sample.
Release date: 2010-06-29
29. The Health of First Nations Living Off-Reserve, Inuit, and Métis Adults in Canada: The Impact of Socio-economic Status on Inequalities in Health Archived
Articles and reports: 82-622-X2010004
Geography: Canada
Description:
Aboriginal people - First Nations, Métis and Inuit - comprise a growing proportion of the Canadian population. Despite the younger average age of these populations, First Nations, Métis and Inuit people tend to suffer a greater burden of morbidity and mortality than non-Aboriginal Canadians. This may be due, in part, to higher rates of socio-economic disadvantage in Aboriginal populations.
Release date: 2010-06-23
30. Evaluating the hyperactivity/inattention subscale of the National Longitudinal Survey of Children and Youth Archived
Articles and reports: 82-003-X201000211234
Geography: Canada
Description:
This article evaluates the parent-reported Hyperactivity/Inattention Subscale of the National Longitudinal Survey of Children and Youth with data from cycle 1 (1994/1995) of the survey.
Release date: 2010-06-16

Data (2) (2 results)

1. Individuals File, 2011 National Household Survey (Public Use Microdata Files)
Public use microdata: 99M0001X
Description: The Individuals File, 2011 National Household Survey (Public Use Microdata Files) provides data on the characteristics of the Canadian population. The file contains a 2.7% sample of anonymous responses to the 2011 National Household Survey (NHS) questionnaire. The files have been carefully scrutinized to ensure the complete confidentiality of the individual responses and geographic identifiers have been restricted to provinces/territories and metropolitan areas. With 133 variables, this comprehensive tool is excellent for policy analysts, pollsters, social researchers and anyone interested in modelling and performing statistical regression analysis using National Household Survey data.

Microdata files uniquely provide users access to non-aggregated data. The PUMFs user can group and manipulate these variables to suit data and research requirements. Tabulations excluded from other NHS products can be created or relationships between variables can be analyzed using different statistical tests. PUMFs provide quick access to a comprehensive social and economic database about Canada and its people.

This product, offered on DVD-ROM, contains the data file (in ASCII format); user documentation and supporting information; all licence agreements; and SAS, SPSS and Stata program source codes to enable users to read the set of records. It is important to note that users will require knowledge of data manipulation packages (or software) such as SAS, SPSS or Stata to use this product.
Release date: 2023-09-12
2. Bilingualism and earnings Archived
Table: 75-001-X19890022277
Description:
This study compares the earnings of bilingual and unilingual workers in three urban centres: Montreal, Toronto and Ottawa-Hull. Differences in the earnings of bilingual and unilingual workers are considered in the light of several demographic and job-related traits.
Release date: 1989-06-30

Analysis (174) (0 to 10 of 174 results)

1. Estimating municipal life expectancy and health-adjusted life expectancy in Canada, 2019 and 2020
Articles and reports: 82-003-X202500800001
Description: Data measuring life expectancy (LE) and health-adjusted life expectancy (HALE) in Canada are available for large geographical areas, such as provinces, territories, and health regions. However, to date, no study has analyzed LE and HALE at the municipal level. To address issues related to sparse administrative and survey data in small geographic areas, this study applies multilevel regression models and poststratification methods that have been shown to provide reliable estimates of population- and small area-level quantities from health surveys.
Release date: 2025-08-20
2. Survey data integration for regression analysis using model calibration
Articles and reports: 12-001-X202300100002
Description: We consider regression analysis in the context of data integration. To combine partial information from external sources, we employ the idea of model calibration, which introduces a “working” reduced model based on the observed covariates. The working reduced model is not necessarily correctly specified but can be a useful device to incorporate the partial information from the external data. The actual implementation is based on a novel application of the information projection and model calibration weighting. The proposed method is particularly attractive for combining information from several sources with different missing patterns. The proposed method is applied to a real data example combining survey data from the Korean National Health and Nutrition Examination Survey and big data from the National Health Insurance Sharing Service in Korea.
Release date: 2023-06-30
3. Relative Performance of Methods Based on Model-Assisted Survey Regression Estimation Archived
Articles and reports: 11-522-X202100100009
Description:
Use of auxiliary data to improve the efficiency of estimators of totals and means through model-assisted survey regression estimation has received considerable attention in recent years. Generalized regression (GREG) estimators, based on a working linear regression model, are currently used in establishment surveys at Statistics Canada and several other statistical agencies. GREG estimators use common survey weights for all study variables and calibrate to known population totals of auxiliary variables. Increasingly, many auxiliary variables are available, some of which may be extraneous. This leads to unstable GREG weights when all the available auxiliary variables, including interactions among categorical variables, are used in the working linear regression model. On the other hand, new machine learning methods, such as regression trees and lasso, automatically select significant auxiliary variables and lead to stable nonnegative weights and possible efficiency gains over GREG. In this paper, a simulation study, based on a real business survey sample data set treated as the target population, is conducted to study the relative performance of GREG, regression trees and lasso in terms of efficiency of the estimators.
Key Words: Model assisted inference; calibration estimation; model selection; generalized regression estimator.

Release date: 2021-10-29
4. Refugees and Canadian Post-Secondary Education: Characteristics and Economic Outcomes in Comparison Archived
Articles and reports: 89-657-X2018001
Description:
This study draws on data from the Longitudinal Immigration Database to examine participation in Canadian post-secondary education (PSE) among adult immigrants in the 2002-2005 landing cohort, with an explicit focus on resettled refugees. The study describes the demographic characteristics of participants, the qualities of participation, and the economic returns on investment in Canadian PSE. It also employs multivariate regression analysis to further examine the effects of participation in Canadian training on employment incidence and the income of those employed, while controlling for other factors associated with successful economic integration.
Release date: 2018-11-14
5. A comparison between nonparametric estimators for finite population distribution functions Archived
Articles and reports: 12-001-X201600114541
Description:
In this work we compare nonparametric estimators for finite population distribution functions based on two types of fitted values: the fitted values from the well-known Kuo estimator and a modified version of them, which incorporates a nonparametric estimate for the mean regression function. For each type of fitted values we consider the corresponding model-based estimator and, after incorporating design weights, the corresponding generalized difference estimator. We show under fairly general conditions that the leading term in the model mean square error is not affected by the modification of the fitted values, even though it slows down the convergence rate for the model bias. Second order terms of the model mean square errors are difficult to obtain and will not be derived in the present paper. It thus remains an open question whether the modified fitted values bring about some benefit from the model-based perspective. We also discuss design-based properties of the estimators and propose a variance estimator for the generalized difference estimator based on the modified fitted values. Finally, we perform a simulation study. The simulation results suggest that the modified fitted values lead to a considerable reduction of the design mean square error if the sample size is small.
Release date: 2016-06-22
6. A note on regression estimation with unknown population size Archived
Articles and reports: 12-001-X201600114543
Description:
The regression estimator is extensively used in practice because it can improve the reliability of the estimated parameters of interest such as means or totals. It uses control totals of variables known at the population level that are included in the regression set up. In this paper, we investigate the properties of the regression estimator that uses control totals estimated from the sample, as well as those known at the population level. This estimator is compared to the regression estimators that strictly use the known totals both theoretically and via a simulation study.
Release date: 2016-06-22
7. A short note on quantile and expectile estimation in unequal probability samples Archived
Articles and reports: 12-001-X201600114545
Description:
The estimation of quantiles is an important topic not only in the regression framework, but also in sampling theory. A natural alternative or addition to quantiles are expectiles. Expectiles, as a generalization of the mean, have become popular in recent years as they not only give a more detailed picture of the data than the ordinary mean, but also can serve as a basis to calculate quantiles by using their close relationship. We show how to estimate expectiles under sampling with unequal probabilities and how expectiles can be used to estimate the distribution function. The resulting fitted distribution function estimator can be inverted, leading to quantile estimates. We run a simulation study to investigate and compare the efficiency of the expectile-based estimator.
Release date: 2016-06-22
8. The Impact of Annual Wages on Interprovincial Mobility, Interprovincial Employment, and Job Vacancies Archived
Articles and reports: 11F0019M2016376
Geography: Canada, Province or territory
Description: The degree to which workers move across geographic areas in response to emerging employment opportunities or negative labour demand shocks is a key element in the adjustment process of an economy, and its ability to reach a desired allocation of resources.

This study estimates the causal impact of real after-tax annual wages and salaries on the propensity of young men to migrate to Alberta or to accept jobs in that province while maintaining residence in their home province. To do so, it exploits the cross-provincial variation in earnings growth plausibly induced by increases in world oil prices that occurred during the 2000s.
Release date: 2016-04-11
9. Do Workplace Pensions Crowd Out Other Retirement Savings? Evidence from Canadian Tax Records Archived
Articles and reports: 11F0019M2015371
Description:
This paper investigates whether registered pension plans (RPPs) help households prepare financially for retirement or simply substitute for other forms of private saving. This issue is addressed using a panel of 1.8 million Canadian households, from 1991 to 2010, which appear in the Longitudinal Administrative Databank. The analysis controls for correlations in savings across accounts due to unobserved tastes for saving by exploiting the fact that employer contribution rates increase discontinuously on earnings above the average industrial wage, a unique feature of occupational pensions in Canada. The effect is estimated in a Regression Kink Design.
Release date: 2015-12-21
10. A design effect measure for calibration weighting in single-stage samples Archived
Articles and reports: 12-001-X201500214236
Description:
We propose a model-assisted extension of weighting design-effect measures. We develop a summary-level statistic for different variables of interest, in single-stage sampling and under calibration weight adjustments. Our proposed design effect measure captures the joint effects of a non-epsem sampling design, unequal weights produced using calibration adjustments, and the strength of the association between an analysis variable and the auxiliaries used in calibration. We compare our proposed measure to existing design effect measures in simulations using variables like those collected in establishment surveys and telephone surveys of households.
Release date: 2015-12-17

Reference (10) (10 results)

1. Response error reinterview of the 1999-2000 Schools and Staffing Survey Archived
Surveys and statistical programs – Documentation: 11-522-X20010016308
Description:
This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.
The Census Bureau uses response error analysis to evaluate the effectiveness of survey questions. For a given survey, questions that are deemed critical to the survey or considered problematic from past examination are selected for analysis. New or revised questions are prime candidates for re-interview. Re-interview is a new interview where a subset of questions from the original interview are re-asked to a sample of the survey respondents. For each re-interview question, the proportion of respondents who give inconsistent responses is evaluated. The "Index of Inconsistency" is used as the measure of response variance. Each question is labelled low, moderate, or high in response variance. In high response variance cases, the questions are put through cognitive testing, and modifications to the question are recommended.
The Schools and Staffing Survey (SASS), sponsored by the National Center for Education Statistics (NCES), is also investigated for response error analysis and the possible relationships between inconsistent responses and characteristics of the schools and teachers in that survey. Results of this analysis can be used to change survey procedures and improve data quality.
Release date: 2002-09-12
2. Particulate matter and daily mortality: Combining time series information from eight U.S. cities Archived
Surveys and statistical programs – Documentation: 11-522-X19990015656
Description:
Time series studies have shown associations between air pollution concentrations and morbidity and mortality. These studies have largely been conducted within single cities, and with varying methods. Critics of these studies have questioned the validity of the data sets used and the statistical techniques applied to them; the critics have noted inconsistencies in findings among studies and even in independent re-analyses of data from the same city. In this paper we review some of the statistical methods used to analyze a subset of a national data base of air pollution, mortality and weather assembled during the National Morbidity and Mortality Air Pollution Study (NMMAPS).
Release date: 2000-03-02
3. A donor imputation system to create a census database fully adjusted for underenumeration Archived
Surveys and statistical programs – Documentation: 11-522-X19990015668
Description:
Following the problems with estimating underenumeration in the 1991 Census of England and Wales, the aim for the 2001 Census is to create a database that is fully adjusted for net underenumeration. To achieve this, the paper investigates weighted donor imputation methodology that utilises information from both the census and census coverage survey (CCS). The US Census Bureau has considered a similar approach for their 2000 Census (see Isaki et al. 1998). The proposed procedure distinguishes between individuals who are not counted by the census because their household is missed and those who are missed in counted households. Census data is linked to data from the CCS. Multinomial logistic regression is used to estimate the probabilities that households are missed by the census and the probabilities that individuals are missed in counted households. Household and individual coverage weights are constructed from the estimated probabilities and these feed into the donor imputation procedure.
Release date: 2000-03-02
4. Dual system estimation and the 2001 Census coverage surveys of the U.K. Archived
Surveys and statistical programs – Documentation: 11-522-X19990015682
Description:
The application of dual system estimation (DSE) to matched Census / Post Enumeration Survey (PES) data in order to measure net undercount is well understood (Hogan, 1993). However, this approach has so far not been used to measure net undercount in the UK. The 2001 PES in the UK will use this methodology. This paper presents the general approach to design and estimation for this PES (the 2001 Census Coverage Survey). The estimation combines DSE with standard ratio and regression estimation. A simulation study using census data from the 1991 Census of England and Wales demonstrates that the ratio model is in general more robust than the regression model.
Release date: 2000-03-02
5. Simultaneous calibration of several surveys Archived
Surveys and statistical programs – Documentation: 11-522-X19990015684
Description:
Often, the same information is gathered almost simultaneously for several different surveys. In France, this practice is institutionalized for household surveys that have a common set of demographic variables, i.e., employment, residence and income. These variables are important co-factors for the variables of interest in each survey, and if used carefully, can reinforce the estimates derived from each survey. Techniques for calibrating uncertain data can apply naturally in this context. This involves finding the best unbiased estimator in common variables and calibrating each survey based on that estimator. The estimator thus obtained in each survey is always a linear estimator, the weightings of which can be easily explained and the variance can be obtained with no new problems, as can the variance estimate. To supplement the list of regression estimators, this technique can also be seen as a ridge-regression estimator, or as a Bayesian-regression estimator.
Release date: 2000-03-02
6. Combining data sources: Air pollution and asthma consultations in 59 general practices throughout England and Wales - A case study Archived
Surveys and statistical programs – Documentation: 11-522-X19990015688
Description:
The geographical and temporal relationship between outdoor air pollution and asthma was examined by linking together data from multiple sources. These included the administrative records of 59 general practices widely dispersed across England and Wales for half a million patients and all their consultations for asthma, supplemented by a socio-economic interview survey. Postcode enabled linkage with: (i) computed local road density; (ii) emission estimates of sulphur dioxide and nitrogen dioxides; and (iii) measured/interpolated concentration of black smoke, sulphur dioxide, nitrogen dioxide and other pollutants at practice level. Parallel Poisson time series analysis took into account between-practice variations to examine daily correlations in practices close to air quality monitoring stations. Preliminary analyses show small and generally non-significant geographical associations between consultation rates and pollution markers. The methodological issues relevant to combining such data, and the interpretation of these results, will be discussed.
Release date: 2000-03-02
7. Using meta-analysis to understand the impact of time-of-use rates Archived
Surveys and statistical programs – Documentation: 11-522-X19990015692
Description:
Electricity rates that vary by time-of-day have the potential to significantly increase economic efficiency in the energy market. A number of utilities have undertaken economic studies of time-of-use rate schemes for their residential customers. This paper uses meta-analysis to examine the impact of time-of-use rates on electricity demand, pooling the results of thirty-eight separate programs. There are four key findings. First, very large peak to off-peak price ratios are needed to significantly affect peak demand. Second, summer peak rates are relatively effective compared to winter peak rates. Third, permanent time-of-use rates are relatively effective compared to experimental ones. Fourth, demand charges rival ordinary time-of-use rates in terms of impact.
Release date: 2000-03-02
8. Random effects models for longitudinal data from complex samples Archived
Surveys and statistical programs – Documentation: 11-522-X19980015017
Description:
Longitudinal studies with repeated observations on individuals permit better characterizations of change and assessment of possible risk factors, but there has been little experience applying sophisticated models for longitudinal data to the complex survey setting. We present results from a comparison of different variance estimation methods for random effects models of change in cognitive function among older adults. The sample design is a stratified sample of people 65 and older, drawn as part of a community-based study designed to examine risk factors for dementia. The model summarizes the population heterogeneity in overall level and rate of change in cognitive function using random effects for intercept and slope. We discuss an unweighted regression including covariates for the stratification variables, a weighted regression, and bootstrapping; we also did preliminary work into using balanced repeated replication and jackknife repeated replication.
Release date: 1999-10-22
9. Marginal models for repeated observations: Inference with survey data Archived
Surveys and statistical programs – Documentation: 11-522-X19980015029
Description:
In longitudinal surveys, sample subjects are observed over several time points. This feature typically leads to dependent observations on the same subject, in addition to the customary correlations across subjects induced by the sample design. Much research in the literature has focussed on modeling the marginal mean of a response as a function of covariates. Liang and Zeger (1986) used generalized estimating equations (GEE), requiring only correct specification of the marginal mean, and obtained standard errors of regression parameter estimates and associated Wald tests, assuming a "working" correlation structure for the repeated measurements on a sample subject. Rotnitzky and Jewell (1990) developed quasi-score tests and Rao-Scott adjustments to "working" quasi-score tests under marginal models. These methods are asymptotically robust to misspecification of the within-subject correlation structure, but assume independence of sample subjects which is not satisfied for complex longitudinal survey data based on stratified multi-stage sampling. We proposed asymptotically valid Wald and quasi-score tests for longitudinal survey data, using the Taylor Linearization and jackknife methods. Alternative tests, based on Rao-Scott adjustments to naive tests that ignore survey design features and on Bonferroni-t, are also developed. These tests are particularly useful when the effective degrees of freedom, usually taken as the total number of sample primary units (clusters) minus the number of strata, is small.
Release date: 1999-10-22
10. Estimation with partial overlap longitudinal samples Archived
Surveys and statistical programs – Documentation: 11-522-X19980015035
Description:
In a longitudinal survey conducted for k periods, some units may be observed for fewer than k of the periods. Examples include surveys designed with partially overlapping subsamples, a pure panel survey with nonresponse, and a panel survey supplemented with additional samples for some of the time periods. Estimators of the regression type are exhibited for such surveys. An application to special studies associated with the National Resources Inventory is discussed.
Release date: 1999-10-22

Date modified: 2026-06-04
