  • Surveys and statistical programs – Documentation: 11-522-X19980015021

    The U.S. Bureau of the Census implemented major changes to the design of the Survey of Income and Program Participation (SIPP) with the panel begun in 1996. The revised survey design emphasized longitudinal applications and the Census Bureau attempted to understand and resolve the seam bias common to longitudinal surveys. In addition to the substantive and administrative redesign of the survey, the Census Bureau is improving the data processing procedures which yield microdata files for the public to analyse. The wave-by-wave data products are being edited and imputed with a longitudinal element rather than cross-sectionally, carrying forward information from a prior wave that is missing in the current wave. The longitudinal data products will be enhanced, both by the redesigned survey and new processing procedures. Simple methods of imputing data over time are being replaced with more sophisticated methods that do not attenuate seam bias. The longitudinal sample is expanding to include more observations which were nonrespondents in one or more waves. Longitudinal weights will be applied to the file to support person-based longitudinal analysis for calendar years or longer periods of time (up to four years).

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015022

    This article extends and further develops the method proposed by Pfeffermann, Skinner and Humphreys (1998) for the estimation of gross flows in the presence of classification errors. The main feature of that method is the use of auxiliary information at the individual level which circumvents the need for validation data for estimating the misclassification rates. The new developments in this article are the establishment of conditions for model identification, a study of the properties of a model goodness of fit statistic and modifications to the sample likelihood to account for missing data and informative sampling. The new developments are illustrated by a small Monte-Carlo simulation study.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015023

    The study of social mobility, between labour market statuses or between income levels, for example, is often based on the analysis of mobility matrices. When comparing these transition matrices, with a view to evaluating behavioural changes, one often forgets that the data derive from a sample survey and are therefore affected by sampling variances. Similarly, it is assumed that the responses collected correspond to the 'true value.'

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015024

    A longitudinal study on a cohort of pupils in secondary school has been conducted in an Italian region since 1986 in order to study the transition from school to working life. The information has been collected at every sweep by a mail questionnaire and, at the final sweep, by a face-to-face interview, where retrospective questions referring back to the whole observation period have been asked. The gross flows between different discrete states - still in the school system, in the labour force without a job, in the labour force with a job - may then be estimated both from prospective and retrospective data, and the recall effect may be evaluated. Moreover, the conditions observed by the two different techniques may be regarded as two indicators of the 'true' unobservable condition, thus leading to the specification and estimation of a latent class model. In this framework, a Markov chain hypothesis may be introduced and evaluated in order to estimate the transition probabilities between the states, once they are corrected for the classification errors. Since the information collected by mail shows a given amount of missing data in terms of unit nonresponse, the 'missing' category is also introduced in the model specification.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015025

    The log-linear modelling of categorical longitudinal survey data on income is studied. An emphasis is on inference about change. Special attention is paid to modelling of longitudinal data from two waves. A small illustration is based on data from the Canadian Survey of Labour and Income Dynamics.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015026

    The purpose of the present study is to utilize panel data from the Current Population Survey (CPS) to examine the effects of unit nonresponse. Because most nonrespondents to the CPS are respondents during at least one month-in-sample, data from other months can be used to compare the characteristics of complete respondents and panel nonrespondents and to evaluate nonresponse adjustment procedures. In the current paper we present analyses utilizing CPS panel data to illustrate the effects of unit nonresponse. After adjusting for nonresponse, additional comparisons are also made to evaluate the effects of nonresponse adjustment. The implications of the findings and suggestions for further research are discussed.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015027

    The disseminated results of annual business surveys inevitably contain statistics that are changing. Since the economic sphere is increasingly dynamic, a simple difference of aggregates between n-1 and n is no longer sufficient to provide an overall description of what has happened. The change calculation module in the new generation of annual business surveys divides overall change into various components (births, deaths, inter-industry migration) and calculates change on the basis of a constant field, assigning special importance to restructurings. The main difficulties lie in establishing subsamples, reweighting, calibrating according to calculable changes, and taking account of restructuring.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015028

    We address the problem of estimation for the income dynamics statistics calculated from complex longitudinal surveys. In addition, we compare two design-based estimators of longitudinal proportions and transition rates in terms of variability under large attrition rates. One estimator is based on the cross-sectional samples for the estimation of the income class boundaries at each time period and on the longitudinal sample for the estimation of the longitudinal counts; the other estimator is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts. We develop Taylor linearization-type variance estimators for both the longitudinal and the mixed estimator under the assumption of no change in the population, and for the mixed estimator when there is change.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015029

    In longitudinal surveys, sample subjects are observed over several time points. This feature typically leads to dependent observations on the same subject, in addition to the customary correlations across subjects induced by the sample design. Much research in the literature has focussed on modeling the marginal mean of a response as a function of covariates. Liang and Zeger (1986) used generalized estimating equations (GEE), requiring only correct specification of the marginal mean, and obtained standard errors of regression parameter estimates and associated Wald tests, assuming a "working" correlation structure for the repeated measurements on a sample subject. Rotnitzky and Jewell (1990) developed quasi-score tests and Rao-Scott adjustments to "working" quasi-score tests under marginal models. These methods are asymptotically robust to misspecification of the within-subject correlation structure, but assume independence of sample subjects which is not satisfied for complex longitudinal survey data based on stratified multi-stage sampling. We proposed asymptotically valid Wald and quasi-score tests for longitudinal survey data, using the Taylor Linearization and jackknife methods. Alternative tests, based on Rao-Scott adjustments to naive tests that ignore survey design features and on Bonferroni-t, are also developed. These tests are particularly useful when the effective degrees of freedom, usually taken as the total number of sample primary units (clusters) minus the number of strata, is small.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015030

    Two-phase sampling designs have been conducted in waves to estimate the incidence of a rare disease such as dementia. Estimation of disease incidence from a longitudinal dementia study has to appropriately adjust for data missing by death as well as the sampling design used at each study wave. In this paper we adopt a selection model approach to model the missing data by death and use a likelihood approach to derive incidence estimates. A modified EM algorithm is used to deal with data missing by sampling selection. The non-parametric jackknife variance estimator is used to derive variance estimates for the model parameters and the incidence estimates. The proposed approaches are applied to data from the Indianapolis-Ibadan Dementia Study.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 92-370-X

    Series description

    This series includes five general reference products - the Preview of Products and Services; the Catalogue; the Dictionary; the Handbook and the Technical Reports - as well as geography reference products - GeoSuite and Reference Maps.

    Product description

    Technical Reports examine the quality of data from the 1996 Census, a large and complex undertaking. While considerable effort was taken to ensure high quality standards throughout each step, the results are subject to a certain degree of error. Each report looks at the collection and processing operations and presents results from data evaluation, as well as notes on historical comparability.

    Technical Reports are aimed at moderate and sophisticated users but are written in a manner which could make them useful to all census data users. Most of the technical reports have been cancelled, with the exception of three: Age, Sex, Marital Status and Common-law Status; Coverage; and Sampling and Weighting. These reports will be available as bilingual publications and, in both official languages, as free products on the Internet.

    This report deals with coverage errors, which occurred when persons, households, dwellings or families were missed by the 1996 Census or enumerated in error. Coverage errors are one of the most important types of error since they affect not only the accuracy of the counts of the various census universes but also the accuracy of all of the census data describing the characteristics of these universes. With this information, users can determine the risks involved in basing conclusions or decisions on census data.

    Release date: 1999-12-14

  • Surveys and statistical programs – Documentation: 92-371-X

    This report deals with sampling and weighting, a process whereby certain characteristics are collected and processed for a random sample of dwellings and persons identified in the complete census enumeration. Data for the whole population are then obtained by scaling up the results for the sample to the full population level. The use of sampling may lead to substantial reductions in costs and respondent burden, or alternatively, can allow the scope of a census to be broadened at the same cost.

    Release date: 1999-12-07
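
    As a rough illustration of the scaling-up step mentioned in the abstract above, the sketch below assumes a simple random sample in which every sampled unit carries the same weight (population size divided by sample size). The function name and the figures are hypothetical and are not taken from the report; the weighting procedure the report describes may be considerably more elaborate.

```python
def scale_up(sample_total: float, sample_size: int, population_size: int) -> float:
    """Estimate a population total from a sample total.

    Assumes a simple random sample, so each sampled unit represents
    population_size / sample_size units of the full population.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if population_size < sample_size:
        raise ValueError("population_size cannot be smaller than sample_size")
    weight = population_size / sample_size  # equal weight for every sampled unit
    return weight * sample_total


# Hypothetical figures: 200 dwellings sampled out of 1,000 enumerated,
# with 120 sampled dwellings reporting the characteristic of interest.
estimated_total = scale_up(sample_total=120, sample_size=200, population_size=1000)
```

    With these illustrative numbers each sampled dwelling stands for five dwellings in the full enumeration, which is the sense in which sample results are scaled up to the population level.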

  • Surveys and statistical programs – Documentation: 92-351-U

    Series Description:

    This series includes five general reference products - the Preview of Products and Services; the Catalogue; the Dictionary; the Handbook and Technical Reports - as well as two geography reference products - GeoSuite and Reference Maps.

    Product Description:

    The 1996 Census Dictionary provides detailed information on all of the concepts, variables and geographic elements of the 1996 Census. Information provided for each variable includes a definition, the associated census questions, applicable response categories or classifications and special remarks, namely on historical aspects. Users should make use of this edition of the 1996 Census Dictionary for the most up-to-date information. This final edition is also available on our web site as a free downloadable product.

    Release date: 1999-10-25

  • Surveys and statistical programs – Documentation: 11-522-X19980015007

    The National Population Health Survey (NPHS) is a family of surveys with multiple objectives, one of which is to provide information on a panel of people who will be followed over time to reflect the dynamic process of health and illness. Data for the first cycle of the NPHS - Households Survey were collected from June 1994 to June 1995, and were released in September 1995. Data for the second cycle were collected from June 1996 to August 1997. One of the primary outputs for the second cycle is a longitudinal master file. This paper will describe six major strategies that were developed to process the longitudinal master file.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015015

    In epidemiology, analysis of longitudinal data is commonly accepted as providing the most robust measures of association between putative risk and selected outcomes such as death or cancer. SMARTIE is a SAS application for efficient analysis of longitudinal data. Based on person days at risk, it can handle multiple exits from and re-entries to risk, and derives outcome measures such as survival rates, Standardised Mortality Ratios (SMRs) and Cancer Incidence Ratios (SIRs). Summary data can be produced in a format easily ported to any modelling package such as Stats 5.0. We discuss the background to its development, the overall program structure, its command language, and finally the organization of outputs. Findings from survival studies using the Longitudinal Study of the Office for National Statistics (ONS) are used to demonstrate features of SMARTIE. This study is based on one per cent of the population of England and Wales. It is continually updated with the addition of new members and with information from birth, death and cancer records, and from the census.

    Release date: 1999-10-22
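
    To make the idea of ratios based on person days at risk more concrete, the sketch below computes a standardised mortality ratio in the conventional way: observed deaths divided by the deaths expected if reference rates applied to the study group's time at risk, multiplied by 100. It is written in Python rather than SAS, it is not SMARTIE's code or command language, and the strata, rates and counts are invented for illustration only.

```python
DAYS_PER_YEAR = 365.25  # assumed conversion from person days to person years


def expected_events(person_days: dict, reference_rates: dict) -> float:
    """Sum, over strata, of person years at risk times the reference rate.

    person_days: stratum -> total person days at risk in the study group
    reference_rates: stratum -> events per person year in the reference population
    """
    return sum(
        (days / DAYS_PER_YEAR) * reference_rates[stratum]
        for stratum, days in person_days.items()
    )


def standardised_ratio(observed: float, expected: float) -> float:
    """Observed over expected events, on the usual base of 100."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return 100.0 * observed / expected


# Hypothetical age strata: 1,000 and 500 person years at risk.
person_days = {"45-54": 365_250.0, "55-64": 182_625.0}
reference_rates = {"45-54": 0.004, "55-64": 0.010}

expected = expected_events(person_days, reference_rates)
smr = standardised_ratio(observed=12, expected=expected)
```

    Working in person days means that someone who leaves and later re-enters the population at risk simply contributes the days actually spent at risk to the relevant stratum total.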

  • Surveys and statistical programs – Documentation: 11-522-X19980015016

    Models for fitting longitudinal binary responses are explored using a panel study of voting intentions. A standard repeated measures multilevel logistic model is shown to be inadequate due to the presence of a substantial proportion of respondents who maintain a constant response over time. A multivariate binary response model is shown to be a better fit to the data.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015017

    Longitudinal studies with repeated observations on individuals permit better characterizations of change and assessment of possible risk factors, but there has been little experience applying sophisticated models for longitudinal data to the complex survey setting. We present results from a comparison of different variance estimation methods for random effects models of change in cognitive function among older adults. The sample design is a stratified sample of people 65 and older, drawn as part of a community-based study designed to examine risk factors for dementia. The model summarizes the population heterogeneity in overall level and rate of change in cognitive function using random effects for intercept and slope. We discuss an unweighted regression including covariates for the stratification variables, a weighted regression, and bootstrapping; we also did preliminary work into using balanced repeated replication and jackknife repeated replication.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015018

    This paper presents a method for handling longitudinal data in which individuals belong to more than one unit at a higher level, and also where there is missing information on the identification of the units to which they belong. In education, for example, a student might be classified as belonging sequentially to a particular combination of primary and secondary school, but for some students, the identity of either the primary or secondary school may be unknown. Likewise, in a longitudinal study, students may change school or class from one period to the next, so 'belonging' to more than one higher level unit. The procedures used to model these structures are extensions of a random effects cross-classified multilevel model.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015019

    The British Labour Force Survey (LFS) is a quarterly household survey with a rotating sample design that can potentially be used to produce longitudinal data, including estimates of labour force gross flows. However, these estimates may be biased due to the effect of non-response. Weighting adjustments are a commonly used method to account for non-response bias. We find that weighting may not fully account for the effect of non-response bias because non-response may depend on the unobserved labour force flows, i.e., the non-response is non-ignorable. To adjust for the effects of non-ignorable non-response, we propose a model for the complex non-response patterns in the LFS which controls for the correlated within-household non-response behaviour found in the survey. The results of modelling suggest that non-response may be non-ignorable in the LFS, causing the weighting estimates to be biased.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015020

    At the end of 1993, Eurostat launched a 'community' panel of households. The first wave, carried out in 1994 in the 12 countries of the European Union, included some 7,300 households in France, and at least 14,000 adults aged 17 years or over. Each individual was then followed up and interviewed each year, even if they had moved. The individuals leaving the sample present a particular profile. In the first part, we present a sketch of how our sample evolves and an analysis of the main characteristics of the non-respondents. We then propose two models to correct for non-response per homogeneous category. We then describe the longitudinal weight distribution obtained from the two models, and the cross-sectional weights using the weight share method. Finally, we compare some indicators calculated using both weighting methods.

    Release date: 1999-10-22
