Keyword search

Results

All (13) (0 to 10 of 13 results)

  • Articles and reports: 12-001-X200900211038
    Description:

    We examine how to overcome the overestimation that link nonresponse causes when the generalized weight share method (GWSM) is used in indirect sampling. A few adjustment methods that incorporate link nonresponse into the GWSM have been constructed for situations both with and without auxiliary variables. A simulation study on a longitudinal survey is presented using some of the adjustment methods we recommend. The simulation results show that these adjusted GWSMs perform well in reducing both estimation bias and variance. The reduction in bias is significant.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211045
    Description:

    In analysis of sample survey data, degrees-of-freedom quantities are often used to assess the stability of design-based variance estimators. For example, these degrees-of-freedom values are used in construction of confidence intervals based on t distribution approximations; and of related t tests. In addition, a small degrees-of-freedom term provides a qualitative indication of the possible limitations of a given variance estimator in a specific application. Degrees-of-freedom calculations sometimes are based on forms of the Satterthwaite approximation. These Satterthwaite-based calculations depend primarily on the relative magnitudes of stratum-level variances. However, for designs involving a small number of primary units selected per stratum, standard stratum-level variance estimators provide limited information on the true stratum variances. For such cases, customary Satterthwaite-based calculations can be problematic, especially in analyses for subpopulations that are concentrated in a relatively small number of strata. To address this problem, this paper uses estimated within-primary-sample-unit (within PSU) variances to provide auxiliary information regarding the relative magnitudes of the overall stratum-level variances. Analytic results indicate that the resulting degrees-of-freedom estimator will be better than modified Satterthwaite-type estimators provided: (a) the overall stratum-level variances are approximately proportional to the corresponding within-stratum variances; and (b) the variances of the within-PSU variance estimators are relatively small. In addition, this paper develops errors-in-variables methods that can be used to check conditions (a) and (b) empirically. For these model checks, we develop simulation-based reference distributions, which differ substantially from reference distributions based on customary large-sample normal approximations. The proposed methods are applied to four variables from the U.S. Third National Health and Nutrition Examination Survey (NHANES III).

    Release date: 2009-12-23
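
    As a rough illustration of the Satterthwaite approximation mentioned in the entry above, the sketch below computes the generic textbook degrees-of-freedom value for a variance estimator written as a sum of independent stratum-level components. It is not the within-PSU estimator proposed in the paper; the function name and the example inputs are assumptions made for illustration.

```python
def satterthwaite_df(components, dfs):
    """Generic Satterthwaite approximation (illustrative sketch).

    components: stratum-level contributions to the overall variance estimate
    dfs: degrees of freedom of each contribution (e.g. n_h - 1 per stratum)
    """
    total = sum(components)
    # (sum of components)^2 divided by the sum of squared components over their df
    return total ** 2 / sum(c ** 2 / d for c, d in zip(components, dfs))


# Hypothetical design with four strata and two PSUs per stratum (1 df each).
equal_df = satterthwaite_df([1.0, 1.0, 1.0, 1.0], [1, 1, 1, 1])
# Same design, but one stratum dominates the variance.
dominated_df = satterthwaite_df([100.0, 1.0, 1.0, 1.0], [1, 1, 1, 1])
```

    With equal components the value equals the number of strata; when one stratum dominates, it falls close to 1, which shows how strongly the calculation depends on the relative magnitudes of the stratum-level variances.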

  • Articles and reports: 12-001-X200900211046
    Description:

    A semiparametric regression model is developed for complex surveys. In this model, the explanatory variables are represented separately as a nonparametric part and a parametric linear part. The estimation techniques combine nonparametric local polynomial regression estimation and least squares estimation. Asymptotic results such as consistency and normality of the estimators of regression coefficients and the regression functions have also been developed. The performance of the methods and the properties of the estimates are demonstrated through simulation and empirical examples using the 1990 Ontario Health Survey.

    Release date: 2009-12-23

  • Articles and reports: 11-522-X200800010957
    Description:

    Business surveys differ from surveys of populations of individual persons or households in many respects. Two of the most important differences are (a) that respondents in business surveys do not answer questions about characteristics of themselves (such as their experiences, behaviours, attitudes and feelings) but about characteristics of organizations (such as their size, revenues, policies, and strategies) and (b) that they answer these questions as an informant for that organization. Academic business surveys differ from other business surveys, such as those of national statistical agencies, in many respects as well. The most important difference is that academic business surveys usually do not aim at generating descriptive statistics but at testing hypotheses, i.e. relations between variables. Response rates in academic business surveys are very low, which implies a huge risk of non-response bias. Usually no attempt is made to assess the extent of non-response bias and published survey results might, therefore, not be a correct reflection of actual relations within the population, which in turn increases the likelihood that the reported test result is not correct.

    This paper provides an analysis of how (the risk of) non-response bias is discussed in research papers published in top management journals. It demonstrates that non-response bias is not assessed to a sufficient degree and that, if attempted at all, correction of non-response bias is difficult or very costly in practice. Three approaches to dealing with this problem are presented and discussed: (a) obtaining data by other means than questionnaires; (b) conducting surveys of very small populations; and (c) conducting surveys of very small samples.

    It will be discussed why these approaches are appropriate means of testing hypotheses in populations. Trade-offs regarding the selection of an approach will be discussed as well.

    Release date: 2009-12-03

  • Articles and reports: 11-522-X200800010959
    Description:

    The Unified Enterprise Survey (UES) at Statistics Canada is an annual business survey that unifies more than 60 surveys from different industries. Two types of collection follow-up score functions are currently used in the UES data collection. The objective of using a score function is to maximize the economically weighted response rates of the survey in terms of the primary variables of interest, under the constraint of a limited follow-up budget. Since the two types of score functions are based on different methodologies, they could have different impacts on the final estimates.

    This study compares the two types of score functions using the collection data obtained from the two most recent years. For comparison purposes, this study applies each score function method to the same data and computes various estimates of the published financial and commodity variables, their deviation from the true pseudo value and their mean square deviation, based on each method. These estimates of deviation and mean square deviation based on each method are then used to measure the impact of each score function on the final estimates of the financial and commodity variables.

    Release date: 2009-12-03

  • Articles and reports: 11-522-X200800010967
    Description:

    In this paper the background of the eXtensible Business Reporting Language and the involvement of Statistics Netherlands in the Dutch Taxonomy Project are discussed. The discussion predominantly focuses on the statistical context of using XBRL and the Dutch Taxonomy for expressing data terms to companies.

    Release date: 2009-12-03

  • Articles and reports: 11-536-X200900110803
    Description:

    "Classical GREG estimator" is used here to refer to the generalized regression estimator extensively discussed, for example, in Särndal, Swensson and Wretman (1992). This paper summarizes some recent extensions of the classical GREG estimator when applied to the estimation of totals for population subgroups or domains. GREG estimation was introduced for domain estimation in Särndal (1981, 1984), Hidiroglou and Särndal (1985) and Särndal and Hidiroglou (1989), and was developed further in Estevao, Hidiroglou and Särndal (1995). For the classical GREG estimator, a fixed-effects linear model serves as the underlying working or assisting model, and aggregate-level auxiliary totals are incorporated in the estimation procedure. In some recent developments, access to unit-level auxiliary data is assumed for GREG estimation for domains. Access to micro-merged register and survey data offers much flexibility for domain estimation. This view has been adopted for GREG estimation, for example, in Lehtonen and Veijanen (1998), Lehtonen, Särndal and Veijanen (2003, 2005), and Lehtonen, Myrskylä, Särndal and Veijanen (2007). These extensions cover the cases of continuous and binary or polytomous response variables, use of generalized linear mixed models as assisting models, and unequal probability sampling designs. Relative merits and challenges of the various GREG estimators will be discussed.

    Release date: 2009-08-11

  • Articles and reports: 11-536-X200900110807
    Description:

    Model calibration (Wu & Sitter, JASA, 2001) has been shown to provide more efficient estimates than classical calibration when the values of one or more auxiliary variables are available for each unit in the population and the relationship between such variables and the variable of interest is more complex than a linear one. Model calibration, though, provides a different set of weights for each variable of interest. To overcome this problem an estimator is proposed: calibration is pursued with respect to both the auxiliary variable values and the fitted values of the variables of interest obtained with parametric and/or nonparametric models. This allows for coherence among estimates and more efficiency if the model is well specified. The asymptotic properties of the resulting estimator are studied with respect to the sampling design. The issue of high variability of the weights is addressed by relaxing binding constraints on the variables included for efficiency purposes in the calibration equations. A simulation study is also presented to better understand the finite-sample behavior of the proposed estimator.

    Release date: 2009-08-11

  • Articles and reports: 11-536-X200900110811
    Description:

    Composite imputation is often used in business surveys. It occurs when several imputation methods are used to impute a single variable of interest. The choice of one method instead of another depends on the availability or not of some auxiliary variables. For instance, ratio imputation could be used to impute a missing value when an auxiliary variable is available and, otherwise, mean imputation could be used.

    Although composite imputation is frequent in practice, the literature on variance estimation when composite imputation is used is limited. We consider the general methodology proposed by Särndal et al. (1992), which requires the validity of an imputation model, i.e., a model for the variable being imputed. At first glance, the extension of this methodology to composite imputation seems quite tedious until we notice that most imputation methods used in practice lead to imputed estimators that are linear in the observed values of the variable of interest. This considerably simplifies the derivation of a variance estimator even when there is a single imputation method. Regarding the estimation of the sampling portion of the total variance, we use a methodology slightly different from the one proposed by Särndal et al. (1992). Our methodology is similar to the sampling variance estimator under multiple imputation with an infinite number of imputations.

    This methodology is the central part of version 2.0 of the System for Estimation of Variance due to Nonresponse and Imputation (SEVANI), which is being developed at Statistics Canada. Using SEVANI, we will illustrate our method through an example based on real data.

    Release date: 2009-08-11
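
    The composite imputation rule described in the entry above (ratio imputation when an auxiliary value is available, mean imputation otherwise) can be sketched as follows. This is a minimal, unweighted illustration under assumed inputs; it ignores survey weights and variance estimation, it is not the SEVANI methodology, and the function name and data are hypothetical.

```python
from statistics import mean


def composite_impute(y, x):
    """Fill missing y values (None): ratio imputation if x is known, else mean imputation."""
    # Respondents are the units with an observed y value.
    respondents = [(yi, xi) for yi, xi in zip(y, x) if yi is not None]
    y_mean = mean(yi for yi, _ in respondents)
    # The ratio is estimated from respondents that also have the auxiliary value.
    pairs = [(yi, xi) for yi, xi in respondents if xi is not None]
    ratio = sum(yi for yi, _ in pairs) / sum(xi for _, xi in pairs)

    imputed = []
    for yi, xi in zip(y, x):
        if yi is not None:
            imputed.append(yi)          # observed value kept as is
        elif xi is not None:
            imputed.append(ratio * xi)  # ratio imputation
        else:
            imputed.append(y_mean)      # mean imputation fallback
    return imputed


# Hypothetical data: the third unit has x but no y, the fourth has neither.
y = [10.0, 20.0, None, None, 30.0]
x = [5.0, 10.0, 8.0, None, None]
imputed = composite_impute(y, x)
```

    Both imputed values are linear combinations of the observed y values, which is the property the abstract relies on to simplify the variance derivation.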

  • Articles and reports: 12-001-X200900110888
    Description:

    In the selection of a sample, a current practice is to define a sampling design stratified on subpopulations. This reduces the variance of the Horvitz-Thompson estimator in comparison with direct sampling if the strata are highly homogeneous with respect to the variable of interest. If auxiliary variables are available for each individual, sampling can be improved through balanced sampling within each stratum, and the Horvitz-Thompson estimator will be more precise if the auxiliary variables are strongly correlated with the variable of interest. However, if the sample allocation is small in some strata, balanced sampling will be only very approximate. In this paper, we propose a method of selecting a sample that is balanced across the entire population while maintaining a fixed allocation within each stratum. We show that in the important special case of size-2 sampling in each stratum, the precision of the Horvitz-Thompson estimator is improved if the variable of interest is well explained by balancing variables over the entire population. An application to rotational sampling is also presented.

    Release date: 2009-06-22
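
    For readers unfamiliar with the Horvitz-Thompson estimator referred to in the entry above, the sketch below computes it for a stratified simple random sample with two units per stratum. It only illustrates the estimator itself, not the balanced sampling method the paper proposes; the stratum sizes and sample values are made up for the example.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total as the sum of sampled values over inclusion probabilities."""
    return sum(y / p for y, p in zip(values, inclusion_probs))


# Hypothetical strata: (population size N_h, sampled values), with n_h = 2 in each.
strata = [
    (10, [3.0, 5.0]),
    (20, [1.0, 2.0]),
]

values, probs = [], []
for pop_size, sample in strata:
    for y in sample:
        values.append(y)
        # Under simple random sampling within a stratum, the inclusion probability is n_h / N_h.
        probs.append(len(sample) / pop_size)

total = horvitz_thompson_total(values, probs)
```

    Here each sampled unit is weighted by N_h / n_h, so the estimated total is 5 * (3 + 5) + 10 * (1 + 2) = 70.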