Results

All (8)

  • Articles and reports: 12-001-X200900211045
    Description:

    In analysis of sample survey data, degrees-of-freedom quantities are often used to assess the stability of design-based variance estimators. For example, these degrees-of-freedom values are used in the construction of confidence intervals based on t distribution approximations, and of related t tests. In addition, a small degrees-of-freedom term provides a qualitative indication of the possible limitations of a given variance estimator in a specific application. Degrees-of-freedom calculations sometimes are based on forms of the Satterthwaite approximation. These Satterthwaite-based calculations depend primarily on the relative magnitudes of stratum-level variances. However, for designs involving a small number of primary units selected per stratum, standard stratum-level variance estimators provide limited information on the true stratum variances. For such cases, customary Satterthwaite-based calculations can be problematic, especially in analyses for subpopulations that are concentrated in a relatively small number of strata. To address this problem, this paper uses estimated within-primary-sample-unit (within PSU) variances to provide auxiliary information regarding the relative magnitudes of the overall stratum-level variances. Analytic results indicate that the resulting degrees-of-freedom estimator will be better than modified Satterthwaite-type estimators provided: (a) the overall stratum-level variances are approximately proportional to the corresponding within-stratum variances; and (b) the variances of the within-PSU variance estimators are relatively small. In addition, this paper develops errors-in-variables methods that can be used to check conditions (a) and (b) empirically. For these model checks, we develop simulation-based reference distributions, which differ substantially from reference distributions based on customary large-sample normal approximations. The proposed methods are applied to four variables from the U.S. Third National Health and Nutrition Examination Survey (NHANES III).

    Release date: 2009-12-23
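
The customary Satterthwaite-based calculation referred to in this abstract has a standard form; a minimal sketch follows. The function name and inputs are illustrative assumptions, and the paper's modified estimator is not shown.

```python
# Minimal sketch of the customary Satterthwaite degrees-of-freedom
# approximation for a stratified variance estimator v = sum_h v_h, where the
# stratum-h contribution v_h carries n_h - 1 degrees of freedom. Names and
# inputs are illustrative; this is not the paper's modified estimator.

def satterthwaite_df(stratum_variances, psus_per_stratum):
    """Approximate df: (sum v_h)^2 / sum(v_h^2 / (n_h - 1))."""
    total = sum(stratum_variances)
    denom = sum(v ** 2 / (n - 1)
                for v, n in zip(stratum_variances, psus_per_stratum))
    return total ** 2 / denom

# Equal stratum variances with 2 PSUs per stratum recover the simple
# "PSUs minus strata" count; very unequal stratum variances shrink the
# value, which is the sensitivity to relative stratum-variance magnitudes
# noted in the abstract.
df_equal = satterthwaite_df([3.0] * 10, [2] * 10)   # 10.0
df_skewed = satterthwaite_df([1.0, 9.0], [2, 2])    # 100/82, about 1.22
```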

  • Articles and reports: 11-522-X200800010991
    Description:

    In the evaluation of prospective survey designs, statistical agencies generally must consider a large number of design factors that may have a substantial impact on both survey costs and data quality. Assessments of trade-offs between cost and quality are often complicated by limitations on the amount of information available regarding fixed and marginal costs related to: instrument redesign and field testing; the number of primary sample units and sample elements included in the sample; assignment of instrument sections and collection modes to specific sample elements; and (for longitudinal surveys) the number and periodicity of interviews. Similarly, designers often have limited information on the impact of these design factors on data quality.

    This paper extends standard design-optimization approaches to account for uncertainty in the abovementioned components of cost and quality. Special attention is directed toward the level of precision required for cost and quality information to provide useful input into the design process; sensitivity of cost-quality trade-offs to changes in assumptions regarding functional forms; and implications for preliminary work focused on collection of cost and quality information. In addition, the paper considers distinctions between cost and quality components encountered in field testing and production work, respectively; incorporation of production-level cost and quality information into adaptive design work; as well as costs and operational risks arising from the collection of detailed cost and quality data during production work. The proposed methods are motivated by, and applied to, work with partitioned redesign of the interview and diary components of the U.S. Consumer Expenditure Survey.

    Release date: 2009-12-03

  • Articles and reports: 11-522-X200600110410
    Description:

    The U.S. Survey of Occupational Illnesses and Injuries (SOII) is a large-scale establishment survey conducted by the Bureau of Labor Statistics to measure incidence rates and impact of occupational illnesses and injuries within specified industries at the national and state levels. This survey currently uses relatively simple procedures for detection and treatment of outliers. The outlier-detection methods center on comparison of reported establishment-level incidence rates to the corresponding distribution of reports within specified cells defined by the intersection of state and industry classifications. The treatment methods involve replacement of standard probability weights with a weight set equal to one, followed by a benchmark adjustment.

    One could use more complex methods for detection and treatment of outliers for the SOII, e.g., detection methods that use influence functions, probability weights and multivariate observations; or treatment methods based on Winsorization or M-estimation. Evaluation of the practical benefits of these more complex methods requires one to consider three important factors. First, severe outliers are relatively rare, but when they occur, they may have a severe impact on SOII estimators in cells defined by the intersection of states and industries. Consequently, practical evaluation of the impact of outlier methods focuses primarily on the tails of the distributions of estimators, rather than standard aggregate performance measures like variance or mean squared error. Second, the analytic and data-based evaluations focus on the incremental improvement obtained through use of the more complex methods, relative to the performance of the simple methods currently in place. Third, development of the abovementioned tools requires somewhat nonstandard asymptotics that reflect trade-offs in effects associated with, respectively, increasing sample sizes; increasing numbers of publication cells; and changing tails of underlying distributions of observations.

    Release date: 2008-03-17
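
Winsorization, one of the alternative treatment methods named in this abstract, can be sketched simply. The quantile-based cutoff rule and the invented rates below are illustrative assumptions, not the SOII's procedure.

```python
# Illustrative Winsorization of establishment-level incidence rates within a
# publication cell: values above a cutoff are pulled back to the cutoff,
# damping the influence of severe outliers on cell-level estimates. The
# cutoff rule (an empirical quantile of the cell) is an assumed choice.

def winsorize(values, cutoff):
    """Cap each value at the cutoff, leaving smaller values unchanged."""
    return [min(v, cutoff) for v in values]

def upper_quantile(values, q=0.95):
    """Simple empirical quantile by the nearest-rank rule."""
    s = sorted(values)
    idx = min(len(s) - 1, int(q * len(s)))
    return s[idx]

rates = [1.2, 0.8, 2.1, 1.5, 40.0]              # one extreme report
capped = winsorize(rates, upper_quantile(rates, 0.75))
```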

  • Articles and reports: 11-522-X20040018755
    Description:

    This paper reviews the robustness of methods dealing with response errors for rare populations. It also reviews problems with weighting schemes for these populations. It develops an asymptotic framework intended to deal with such problems.

    Release date: 2005-10-27

  • Articles and reports: 11-522-X20030017700
    Description:

    This paper suggests a useful framework for exploring the effects of moderate deviations from idealized conditions. It offers evaluation criteria for point estimators and interval estimators.

    Release date: 2005-01-26

  • Articles and reports: 11-522-X20020016750
    Description:

    Analyses of data from social and economic surveys sometimes use generalized variance function models to approximate the design variance of point estimators of population means and proportions. Analysts may use the resulting standard error estimates to compute associated confidence intervals or test statistics for the means and proportions of interest. In comparison with design-based variance estimators computed directly from survey microdata, generalized variance function models have several potential advantages, as will be discussed in this paper, including operational simplicity; increased stability of standard errors; and, for cases involving public-use datasets, reduction of disclosure limitation problems arising from the public release of stratum and cluster indicators.

    These potential advantages, however, may be offset in part by several inferential issues. First, the properties of inferential statistics based on generalized variance functions (e.g., confidence interval coverage rates and widths) depend heavily on the relative empirical magnitudes of the components of variability associated, respectively, with:

    (a) the random selection of a subset of items used in estimation of the generalized variance function model;
    (b) the selection of sample units under a complex sample design;
    (c) the lack of fit of the generalized variance function model; and
    (d) the generation of a finite population under a superpopulation model.

    Second, under certain conditions, one may link each of components (a) through (d) with different empirical measures of the predictive adequacy of a generalized variance function model. Consequently, these measures of predictive adequacy can offer us some insight into the extent to which a given generalized variance function model may be appropriate for inferential use in specific applications.

    Some of the proposed diagnostics are applied to data from the U.S. Survey of Doctoral Recipients and the U.S. Current Employment Survey. For the Survey of Doctoral Recipients, components (a), (c) and (d) are of principal concern. For the Current Employment Survey, components (b), (c) and (d) receive principal attention, and the availability of population microdata allows the development of especially detailed models for components (b) and (c).

    Release date: 2004-09-13
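
A generalized variance function of the common form relvariance(x) = a + b/x can be fitted by ordinary least squares; a minimal sketch follows. The functional form is a standard textbook choice, the data are invented for illustration, and none of the diagnostics for components (a) through (d) are implemented.

```python
# Hypothetical sketch of fitting a generalized variance function of the
# common form  relvariance(x) = a + b / x  by least squares, then using it
# to produce a model-based standard error for a point estimate. Data are
# invented; the abstract's predictive-adequacy diagnostics are not shown.
import numpy as np

def fit_gvf(estimates, relvariances):
    """Least-squares fit of relvar = a + b / estimate; returns (a, b)."""
    X = np.column_stack([np.ones_like(estimates), 1.0 / estimates])
    coef, *_ = np.linalg.lstsq(X, relvariances, rcond=None)
    return coef

def gvf_standard_error(coef, estimate):
    """Model-based SE: sqrt(relvar(estimate)) * estimate."""
    a, b = coef
    return float(np.sqrt(a + b / estimate) * estimate)

x = np.array([100.0, 500.0, 1000.0, 5000.0])   # point estimates
rv = np.array([0.052, 0.012, 0.007, 0.003])    # direct relvariances
a, b = fit_gvf(x, rv)
se = gvf_standard_error((a, b), 2000.0)
```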

  • Articles and reports: 12-001-X19970013103
    Description:

    This paper discusses the use of some simple diagnostics to guide the formation of nonresponse adjustment cells. Following Little (1986), we consider construction of adjustment cells by grouping sample units according to their estimated response probabilities or estimated survey items. Four issues receive principal attention: assessment of the sensitivity of adjusted mean estimates to changes in k, the number of cells used; identification of specific cells that require additional refinement; comparison of adjusted and unadjusted mean estimates; and comparison of estimation results from estimated-probability and estimated-item based cells. The proposed methods are motivated and illustrated with an application involving estimation of mean consumer unit income from the U.S. Consumer Expenditure Survey.

    Release date: 1997-08-18
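
The estimated-probability adjustment cells described in this abstract can be sketched as a simple weighting-class adjustment, in the spirit of the approach attributed to Little (1986). Propensity estimation (e.g., a logistic regression fitted upstream) and the collapsing of sparse cells, one of the refinement issues the paper examines, are assumed to happen elsewhere.

```python
# Hypothetical sketch of response-propensity adjustment cells: sort units by
# estimated response probability, cut them into k roughly equal-sized cells,
# and inflate respondent weights by the inverse of each cell's observed
# response rate. Propensity estimation and sparse-cell collapsing are
# assumed done elsewhere; names are illustrative.

def adjustment_factors(propensities, responded, k=5):
    """Per-unit weight adjustment factor 1 / (response rate of unit's cell)."""
    n = len(propensities)
    order = sorted(range(n), key=lambda i: propensities[i])
    factors = {}
    for c in range(k):
        cell = order[c * n // k:(c + 1) * n // k]
        rate = sum(responded[i] for i in cell) / len(cell)  # assumes rate > 0
        for i in cell:
            factors[i] = 1.0 / rate
    return factors

# Four units, two cells: the low-propensity cell has a 50% response rate,
# so its respondents get their weights doubled.
f = adjustment_factors([0.2, 0.4, 0.6, 0.8], [0, 1, 1, 1], k=2)
```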

  • Articles and reports: 12-001-X19960022982
    Description:

    In work with sample surveys, we often use estimators of the variance components associated with sampling within and between primary sample units. For these applications, it can be important to have some indication of whether the variance component estimators are stable, i.e., have relatively low variance. This paper discusses several data-based measures of the stability of design-based variance component estimators and related quantities. The development emphasizes methods that can be applied to surveys with moderate or large numbers of strata and small numbers of primary sample units per stratum. We direct principal attention toward the design variance of a within-PSU variance estimator, and two related degrees-of-freedom terms. A simulation-based method allows one to assess whether an observed stability measure is consistent with standard assumptions regarding variance estimator stability. We also develop two sets of stability measures for design-based estimators of between-PSU variance components and the ratio of the overall variance to the within-PSU variance. The proposed methods are applied to interview and examination data from the U.S. Third National Health and Nutrition Examination Survey (NHANES III). These results indicate that the true stability properties may vary substantially across variables. In addition, for some variables, within-PSU variance estimators appear to be considerably less stable than one would anticipate from a simple count of secondary units within each stratum.

    Release date: 1997-01-30
Stats in brief (0)

No content available at this time.

Articles and reports (8)
Journals and periodicals (0)

No content available at this time.
