Estimation of response propensities and indicators of representative response using population-level information
Section 6. Discussion
The extension of sample-based to population-based estimators of R-indicators comprises two steps: 1) the estimation of response propensities, and 2) the estimation of the R-indicators based on these propensities. The population-based estimation of response propensities is straightforward when linear models are assumed for response propensities and response influences. The linear link function is reasonable when estimating response propensities under the response rates typical of large-scale national social surveys, as shown in the evaluation study in Section 4. The sample-based estimators contain sample covariance matrices and sample frequencies that can be replaced by population covariance matrices or population frequencies. We identified two types of settings: when population cross-products are available, and when auxiliary information is restricted to marginal population counts only. We labelled the corresponding estimators as Type 1 and Type 2 estimators, respectively. The Type 2 setting is more restrictive than the Type 1 setting.
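Under the linear link, the sample-based propensity estimator reduces to ordinary least squares of the response indicator on the auxiliary variables, and the Type 1 estimator replaces the sample cross-product matrix by its scaled population counterpart. The following is a minimal simulation sketch of this idea, not the paper's exact estimators; all data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of N units: intercept plus two auxiliary variables.
N = 100_000
X_pop = np.column_stack([np.ones(N), rng.normal(size=N), rng.binomial(1, 0.4, N)])

# "True" linear response propensities (illustrative coefficients).
beta_true = np.array([0.6, 0.05, -0.1])
rho_true = np.clip(X_pop @ beta_true, 0.01, 0.99)

# Draw a simple random sample and generate its response set.
n = 2_000
s = rng.choice(N, size=n, replace=False)
X_s = X_pop[s]
r = rng.binomial(1, rho_true[s])          # response indicator on the sample

# Sample-based estimator: OLS of r on X using sample cross-products.
# Note X_s' r only involves respondents (r = 0 for nonrespondents).
beta_sample = np.linalg.solve(X_s.T @ X_s, X_s.T @ r)

# Type 1 population-based estimator: replace the sample cross-product
# matrix by its population counterpart, scaled by the sampling fraction.
beta_type1 = np.linalg.solve((n / N) * (X_pop.T @ X_pop), X_s.T @ r)

rho_hat = X_s @ beta_type1                # estimated propensities on the sample
```

The key practical point visible in the sketch is that the Type 1 normal equations need individual auxiliary data only for respondents; the cross-product matrix comes from the population side.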
Following the estimation of population-based response propensities, we constructed population-based estimators for the R-indicator and examined their properties both theoretically and empirically. The estimators were applied to samples drawn from real data from the 1995 Israel Census, where “true” propensities were calculated according to realistic assumptions for national household social surveys. We have thus addressed the first two research questions posed at the beginning of the paper: how to extend sample-based response propensities and R-indicators to population-based response propensities and R-indicators, and what the statistical properties of population-based R-indicators are.
There are many options for the estimation of R-indicators, depending on the response to the survey. We used propensity-weighted response means since the propensities are available. However, any calibration method can be used, such as linear weighting or adjustment classes. In fact, the set of auxiliary variables used for the estimation of the R-indicators may be a subset of the auxiliary variables used for the estimation of propensities and influences. Parsimonious models may prove more efficient, as it is known that propensity weighting may seriously affect the precision of the estimators. This is a topic for future research.
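Given estimated propensities, the R-indicator itself follows the usual definition R = 1 − 2S(ρ), where S(ρ) is the standard deviation of the response propensities (Schouten, Cobben and Bethlehem, 2009). A small sketch, using a plain weighted variance rather than any particular bias-adjusted form from the paper:

```python
import numpy as np

def r_indicator(propensities, weights=None):
    """R-indicator: R = 1 - 2 * S(rho), where S(rho) is the (optionally
    design-weighted) standard deviation of the response propensities."""
    rho = np.asarray(propensities, dtype=float)
    w = np.ones_like(rho) if weights is None else np.asarray(weights, dtype=float)
    mean = np.sum(w * rho) / np.sum(w)
    var = np.sum(w * (rho - mean) ** 2) / np.sum(w)
    return 1.0 - 2.0 * np.sqrt(var)

# Perfectly representative response: all propensities equal, so R = 1.
print(r_indicator([0.5, 0.5, 0.5]))
```

R ranges from 1 (identical propensities, fully representative response) down toward 0 as the propensities spread out.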
The two properties we examined are the bias and standard errors of the proposed population-based R-indicators. As expected, the bias and standard errors depend on the size of the sample and the type of auxiliary information available: the smaller the sample, the larger the bias and the standard error. When samples are smaller, it becomes more difficult to distinguish sampling variation from response variation. Clearly, the confidence intervals become wider as there is less information in small samples.
The bias-adjusted Type 1 estimators (population cross-products) perform better than the bias-adjusted Type 2 estimators (population marginal counts). This is as expected, given that they employ more information. However, the unadjusted Type 2 estimators have better RRMSE properties than the unadjusted Type 1 estimators. This is a surprising result and points to a suboptimal use of the population cross-products when they are used as “plug-ins” that do not account for any sampling variation. The standard errors of the population-based estimators are larger than those of their sample-based counterparts.
The evaluation study in scenario RR3 shows that, for very high response rates, the population-based R-indicators have larger standard errors and larger bias, mainly due to propensities being estimated outside the interval [0, 1]. For this reason, we proposed a composite estimator with smoothing parameters that vary with the response rate. Standard errors were reduced, but at the cost of increased bias.
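As a rough illustration of the composite idea (the specific smoothing weights used in the paper are not reproduced here), one can shrink the estimated propensities toward the overall response rate, shrinking more at high response rates, and then clip to [0, 1]:

```python
import numpy as np

def composite_propensities(rho_hat, response_rate, lam=None):
    """Illustrative composite (shrinkage) estimator: pull estimated
    propensities toward the overall response rate, then clip to [0, 1].
    The choice lam = 1 - response_rate is a hypothetical rule that smooths
    more aggressively at high response rates, where linear-link estimates
    often fall outside [0, 1]; the paper's scheme may differ."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    if lam is None:
        lam = 1.0 - response_rate
    rho_c = lam * rho_hat + (1.0 - lam) * response_rate
    return np.clip(rho_c, 0.0, 1.0)
```

Shrinking toward a constant reduces the spread of the propensities, which lowers the standard error of the resulting R-indicator but biases it toward 1, matching the trade-off reported in the evaluation.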
From the analyses it becomes apparent that the bias of the Type 1 and Type 2 estimators depends on the number of auxiliary variables, although this dependence was modest in our evaluations. The bias may increase when detailed models with many variables are used for the estimation of response propensities. The rationale is that detailed models allow more sampling variation to be picked up as bias.
The population-based R-indicators have a number of caveats:
Firstly, the choice of auxiliary information that is available at a national level may be more limiting than sample-based auxiliary information, depending on the availability of registers and administrative data. The selection of auxiliary variables should depend on whether they are correlated with the survey target variables. It is also strongly recommended to use population statistics based on registers or administrative data rather than weighted survey counts from other surveys, since the latter may not reflect the true population distribution accurately. One would draw erroneous conclusions about the representativeness of the response if the population estimates are biased.
Secondly, we assume that the survey measures the same quantities as the population information, and we do not investigate the effect of possible departures from this assumption. However, we note that there is an inherent risk of measurement error when comparing survey questions to population statistics. It must be ascertained that the survey questions employed have the same definitions and classifications as the population tables. Hence, it is best to avoid questions that are prone to measurement error, such as questions that require strong cognitive effort or that may elicit socially desirable answers.
Thirdly, in settings where only population information is available, options to improve representativeness during data collection are much more limited, since there is no individual auxiliary information available for the nonrespondents. Nonetheless, in these settings, assessments of representativeness may still be useful in the design of advance and reminder letters, in interviewer training and in paradata collection.
Finally, we do not consider hybrid settings where the R-indicator is based on both linked data and population tables. Nor do we deal with the case where weighted survey estimates could be used when no aggregated population information is available. These choices affect both the bias and the variance estimates of the population-based R-indicators. Such extensions are relatively straightforward but are left to future papers.
Research into population-based R-indicators is still at an early stage, and it is too early to provide a definitive answer to the last research question presented in the introduction, regarding the feasibility and practicability of R-indicators based on aggregate population auxiliary information. As mentioned in the introduction, further uses of these R-indicators are being explored in the context of evaluating and monitoring streamed administrative data and assessing the representativeness of linked records. In addition, Schouten et al. (2011) introduced partial R-indicators under sample-based auxiliary information for evaluating the lack of representativeness due to a specific auxiliary variable or category; these were used for monitoring and evaluating data collection. Schouten and Shlomo (2017) demonstrate the use of partial R-indicators for adaptive survey designs. It is similarly straightforward to define population-based partial R-indicators, and this will be a subject of future work.
The evaluation study on survey representativeness presented in Section 4 is based on real data under realistic assumptions about response probabilities typically found in social surveys conducted at national statistical institutes. Future research needs to assess whether alternative estimators can be constructed that are more precise and, consequently, allow for stronger conclusions regarding the nature of response. A natural avenue to explore is an iterative approach through a modification of the EM algorithm, in which the scores of the nonrespondents on the auxiliary variables are estimated and used to update the response propensity estimates.
We did not consider population-based estimation for other types of models, such as logistic or probit regression. As shown in the numerical evaluation in Section 4, differences between sample-based estimators under the linear and logistic link functions are in general small, but they become more evident as response rates approach 1. For these cases, developing other link functions for population-based estimation is a subject of future research. This would be a useful and natural extension to the theory of R-indicators, as these models are often used in practice and avoid propensities outside the [0, 1] interval.
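The contrast can be illustrated with a small simulated example (all data hypothetical): at a very high response rate, ordinary least squares under the linear link produces fitted propensities above 1, while a logistic fit, here by Newton-Raphson, stays inside (0, 1) by construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# High-response-rate setting: true propensities from a logistic model
# with mean response rate around 0.93 (cf. the RR3-type scenario).
n = 5_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
eta = 3.0 + 1.0 * X[:, 1]
rho_true = 1.0 / (1.0 + np.exp(-eta))
r = rng.binomial(1, rho_true)

# Linear link: OLS of the response indicator on X.
beta_lin = np.linalg.solve(X.T @ X, X.T @ r)
rho_lin = X @ beta_lin

# Logistic link: Newton-Raphson steps on the logistic log-likelihood.
beta_log = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-(X @ beta_log)))
    grad = X.T @ (r - p)
    hess = (X * (p * (1.0 - p))[:, None]).T @ X
    beta_log = beta_log + np.linalg.solve(hess, grad)
rho_log = 1.0 / (1.0 + np.exp(-(X @ beta_log)))

print(rho_lin.max())   # exceeds 1 in this high-response-rate setting
print(rho_log.min(), rho_log.max())   # always strictly inside (0, 1)
```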
Acknowledgements
Part of the research presented here was developed within project RISQ (Representativity Indicators for Survey Quality, www.risq-project.eu), funded by the European 7th Framework Programme. We thank the members of the RISQ project: Katja Rutar from Statistični Urad Republike Slovenije, Geert Loosveldt and Koen Beullens from Katholieke Universiteit Leuven, Øyvin Kleven, Johan Fosen and Li-Chun Zhang from Statistisk Sentralbyrå, Norway, Ana Marujo from the University of Southampton, UK, and Paul Knottnerus, Centraal Bureau voor de Statistiek, for their valuable input. The first author was supported by a STSM Grant from the COST Action IS1004 and by the ex 60% University of Bergamo, Biffignandi grant.
Appendix A
Analytic approximation to the bias of Type 1
estimators
First,
we compute the bias of
under general sampling design. Letting
and
then we can write
Note that
and
where
and
are the second-order sample inclusion
probabilities. Hence, the bias of
with respect to the joint distribution of
sampling design and the response mechanism is given by
Under simple
random sampling without replacement, (A.2) can be simplified to
A response-set
based estimator of
is
More generally,
the Horvitz-Thompson response-set estimator for (A.2) under complex sampling is
given by
Appendix B
Analytic approximation to the bias of Type 2
estimators
The
strategy to compute an analytical bias adjustment for
is to first approximate
by a linear estimator using Taylor linearization techniques. Next,
compute an approximate bias adjustment for
by inserting the linear approximation for
into
In the
following, define, for
and
the estimated totals
where
and
Let
be a
-vector with components
and
be the symmetric
-matrix with
elements
We may write
Define now the
population totals
Notice that
is
unbiased for
is
unbiased for
and
is
unbiased for
Let
Proposition 1. The estimator
defined
in (2.7) may be approximated by
Proof. Following standard Taylor linearization (see Särndal, Swensson and Wretman, 1992 and Bethlehem, 1988), the estimator
may be approximated by
where
and
where
is a
matrix with ones in positions
and
and
zeros elsewhere and
is a
-vector with the
component equal to one and zeros elsewhere.
Inserting the partial derivatives into (B.1) gives the result.
Proposition 2. Under simple random sampling, an
approximate bias for
with
respect to the joint distribution of sampling design and the response mechanism
is given by
where
and
A
response-set based estimator of
is
Proof. Thanks to Proposition 1,
defined
in Appendix A may be approximated as follows
The expected values of the terms
and
are
and
It follows that, under simple random sampling,
becomes
So the total bias under simple random sampling
is obtained by inserting
computed above into (A.1) and following the
proof in Appendix A for the other terms.
The response-set based estimator
of
is obtained by substituting
with
with
with
and
with
Note
that the bias adjustment
corresponds to “plugging-in” Type 2 quantities
instead of
matrix
instead of
and
instead of
into the analytical bias adjustment
developed for
with two additional terms due to the
linearization of
.
More
generally, the Horvitz-Thompson response-set estimator under complex sampling
for the bias adjustment of Type 2 population-based R-indicator is given by
References
Beaumont, J.-F., Bocci, C. and Haziza, D. (2014). An adaptive data collection procedure for call prioritization. Journal of Official Statistics, 30, 607-621.

Bethlehem, J. (1988). Reduction of nonresponse bias through regression estimation. Journal of Official Statistics, 4, 251-260.

Booth, J.G., Butler, R.W. and Hall, P. (1994). Bootstrap methods for finite populations. Journal of the American Statistical Association, 89 (428), 1282-1289.

Brick, J.M., and Jones, M.E. (2008). Propensity to respond and nonresponse bias. METRON – International Journal of Statistics, LXVI (1), 51-73.

Copas, J.B. (1983). Regression, prediction and shrinkage. Journal of the Royal Statistical Society, Series B, 45, 311-354.

Copas, J.B. (1993). The shrinkage of point scoring methods. Journal of the Royal Statistical Society, Series C, 42, 315-331.

De Heij, V., Schouten, B. and Shlomo, N. (2015). RISQ manual 2.1. Tools in SAS and R for the computation of R-indicators and partial R-indicators, available at www.risq-project.eu.

Deville, J.-C., and Särndal, C.-E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376-382.

Efron, B., and Tibshirani, R.J. (1993). An Introduction to the Bootstrap. New York: Chapman and Hall.

Kreuter, F. (2013). Improving Surveys with Process and Paradata, Edited monograph, New Jersey: John Wiley & Sons, Inc.

Little, R.J.A. (1986). Survey nonresponse adjustments for estimates of means. International Statistical Review, 54, 139-157.

Little, R.J.A. (1988). Missing-data adjustments in large surveys. Journal of Business and Economic Statistics, 6, 287-301.

Little, R.J.A., and Rubin, D.B. (2002). Statistical Analysis with Missing Data, Hoboken, New Jersey: John Wiley & Sons, Inc.

Lundquist, P., and Särndal, C.-E. (2013). Aspects of responsive design with applications to the Swedish Living Conditions Survey. Journal of Official Statistics, 29 (4), 557-582.

MOA (2015). User Instruction Gold Standard, Dutch Market Research Association, available at www.moaweb.nl/sevrices/services/gouden-standaard.html.

Rosenbaum, P.R., and Rubin, D.B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41-55.

Särndal, C.-E. (2011). The 2010 Morris Hansen Lecture: Dealing with survey nonresponse in data collection, in estimation. Journal of Official Statistics, 27 (1), 1-21.

Särndal, C.-E., and Lundquist, P. (2014). Accuracy in estimation with nonresponse: A function of degree of imbalance and degree of explanation. Journal of Survey Statistics and Methodology, 2 (4), 361-387.

Särndal, C.-E., and Lundström, S. (2005). Estimation in Surveys with Nonresponse, New York: John Wiley & Sons, Inc.

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling, New York: Springer.

Schouten, B., and Shlomo, N. (2017). Selecting adaptive survey design strata with partial R-indicators. International Statistical Review, 85 (1), 143-163.

Schouten, B., Calinescu, M. and Luiten, A. (2013). Optimizing quality of response through adaptive survey designs. Survey Methodology, 39, 1, 29-58. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/2013001/article/11824-eng.pdf.

Schouten, B., Cobben, F. and Bethlehem, J. (2009). Indicators for the representativeness of survey response. Survey Methodology, 35, 1, 101-113. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/2009001/article/10887-eng.pdf.

Schouten, B., Shlomo, N. and Skinner, C. (2011). Indicators for monitoring and improving representativeness of response. Journal of Official Statistics, 27, 231-253.

Schouten, B., Cobben, F., Lundquist, P. and Wagner, J. (2016). Does more balanced survey response imply less non-response bias? Journal of the Royal Statistical Society, Series A, 179 (3), 727-748.

Schouten, B., Bethlehem, J., Beulens, K., Kleven, Ø., Loosveldt, G., Rutar, K., Shlomo, N. and Skinner, C. (2012). Evaluating, comparing, monitoring and improving representativeness of survey response through R-indicators and partial R-indicators. International Statistical Review, 80 (3), 382-399.

Shlomo, N., Skinner, C. and Schouten, B. (2012). Estimation of an indicator of the representativeness of survey response. Journal of Statistical Planning and Inference, 142, 201-211.

Van der Laan, D., and Bakker, B. (2015). Indicators for the representativeness of linked sources, NTTS 2015 Proceedings, available at https://ec.europa.eu/eurostat/cros/system/files/NTTS2015%20proceedings.pdf.

Wagner, J. (2012). A comparison of alternative indicators for the risk of nonresponse bias. Public Opinion Quarterly, 76 (3), 555-575.

Wagner, J. (2013). Adaptive contact strategies in telephone and face-to-face surveys. Survey Research Methods, 7 (1), 45-55.

Wagner, J., and Hubbard, F. (2014). Producing unbiased estimates of propensity models during data collection. Journal of Survey Statistics and Methodology, 2, 323-342.

Wolter, K.M. (2007). Introduction to Variance Estimation, 2nd Ed. New York: Springer.