Release date: December 20, 2016
The journal Survey Methodology Volume 42, Number 2 (December 2016) contains the following 6 papers, 2 short notes and one corrigendum:
Tests for evaluating nonresponse bias in surveys
How do we tell whether weighting adjustments reduce nonresponse bias? If a variable is measured for everyone in the selected sample, then the design weights can be used to calculate an approximately unbiased estimate of the population mean or total for that variable. A second estimate of the population mean or total can be calculated using the survey respondents only, with weights that have been adjusted for nonresponse. If the two estimates disagree, then there is evidence that the weight adjustments may not have removed the nonresponse bias for that variable. In this paper we develop the theoretical properties of linearization and jackknife variance estimators for evaluating the bias of an estimated population mean or total by comparing estimates calculated from overlapping subsets of the same data with different sets of weights, when poststratification or inverse propensity weighting is used for the nonresponse adjustments to the weights. We provide sufficient conditions on the population, sample, and response mechanism for the variance estimators to be consistent, and demonstrate their small-sample properties through a simulation study.
Reducing the response imbalance: Is the accuracy of the survey estimates improved?
by Carl-Erik Särndal, Kaur Lumiste and Imbi Traat
We present theoretical evidence that efforts during data collection to balance the survey response with respect to selected auxiliary variables will improve the chances for low nonresponse bias in the estimates that are ultimately produced by calibrated weighting. One of our results shows that the variance of the bias – measured here as the deviation of the calibration estimator from the (unrealized) full-sample unbiased estimator – decreases linearly as a function of the response imbalance that we assume measured and controlled continuously over the data collection period. An attractive prospect is thus a lower risk of bias if one can manage the data collection to get low imbalance. The theoretical results are validated in a simulation study with real data from an Estonian household survey.
Statistical inference based on judgment post-stratified samples in finite population
by Omer Ozturk
This paper draws statistical inference for finite population mean based on judgment post stratified (JPS) samples. The JPS sample first selects a simple random sample and then stratifies the selected units into H judgment classes based on their relative positions (ranks) in a small set of size H. This leads to a sample with random sample sizes in judgment classes. Ranking process can be performed either using auxiliary variables or visual inspection to identify the ranks of the measured observations. The paper develops unbiased estimator and constructs confidence interval for population mean. Since judgment ranks are random variables, by conditioning on the measured observations we construct Rao-Blackwellized estimators for the population mean. The paper shows that Rao-Blackwellized estimators perform better than usual JPS estimators. The proposed estimators are applied to 2012 United States Department of Agriculture Census Data.
Adaptive rectangular sampling: An easy, incomplete, neighbourhood-free adaptive cluster sampling design
by Bardia Panahbehagh
This paper introduces an incomplete adaptive cluster sampling design that is easy to implement, controls the sample size well, and does not need to follow the neighbourhood. In this design, an initial sample is first selected, using one of the conventional designs. If a cell satisfies a prespecified condition, a specified radius around the cell is sampled completely. The population mean is estimated using the estimator. If all the inclusion probabilities are known, then an unbiased estimator is available; if, depending on the situation, the inclusion probabilities are not known for some of the final sample units, then they are estimated. To estimate the inclusion probabilities, a biased estimator is constructed. However, the simulations show that if the sample size is large enough, the error of the inclusion probabilities is negligible, and the relative estimator is almost unbiased. This design rivals adaptive cluster sampling because it controls the final sample size and is easy to manage. It rivals adaptive two-stage sequential sampling because it considers the cluster form of the population and reduces the cost of moving across the area. Using real data on a bird population and simulations, the paper compares the design with adaptive two-stage sequential sampling. The simulations show that the design has significant efficiency in comparison with its rival.
Unequal probability inverse sampling
by Yves Tillé
In an economic survey of a sample of enterprises, occupations are randomly selected from a list until a number r of occupations in a local unit has been identified. This is an inverse sampling problem for which we are proposing a few solutions. Simple designs with and without replacement are processed using negative binomial distributions and negative hypergeometric distributions. We also propose estimators for when the units are selected with unequal probabilities, with or without replacement.
A cautionary note on Clark Winsorization
Winsorization procedures replace extreme values with less extreme values, effectively moving the original extreme values toward the center of the distribution. Winsorization therefore both detects and treats influential values. Mulry, Oliver and Kaputa (2014) compare the performance of the one-sided Winsorization method developed by Clark (1995) and described by Chambers, Kokic, Smith and Cruddas (2000) to the performance of M-estimation (Beaumont and Alavi 2004) in highly skewed business population data. One aspect of particular interest for methods that detect and treat influential values is the range of values designated as influential, called the detection region. The Clark Winsorization algorithm is easy to implement and can be extremely effective. However, the resultant detection region is highly dependent on the number of influential values in the sample, especially when the survey totals are expected to vary greatly by collection period. In this note, we examine the effect of the number and magnitude of influential values on the detection regions from Clark Winsorization using data simulated to realistically reflect the properties of the population for the Monthly Retail Trade Survey (MRTS) conducted by the U.S. Census Bureau. Estimates from the MRTS and other economic surveys are used in economic indicators, such as the Gross Domestic Product (GDP).
A few remarks on a small example by Jean-Claude Deville regarding non-ignorable non-response
by Yves Tillé
An example presented by Jean-Claude Deville in 2005 is subjected to three estimation methods: the method of moments, the maximum likelihood method, and generalized calibration. The three methods yield exactly the same results for the two non-response models. A discussion follows on how to choose the most appropriate model.
A note on the concept of invariance in two-phase sampling designs
Two-phase sampling designs are often used in surveys when the sampling frame contains little or no auxiliary information. In this note, we shed some light on the concept of invariance, which is often mentioned in the context of two-phase sampling designs. We define two types of invariant two-phase designs: strongly invariant and weakly invariant two-phase designs. Some examples are given. Finally, we describe the implications of strong and weak invariance from an inference point of view.
Statistical matching using fractional imputation
- Date modified: