Non-response follow-up for business surveys
Section 2. Proposed follow-up strategy
Consider a finite population $U$ of $N$ units, partitioned into $H$ strata, $U_1, \ldots, U_H$, of size $N_1, \ldots, N_H$ respectively, such that $U = \bigcup_{h=1}^{H} U_h$ and $N = \sum_{h=1}^{H} N_h$. We are interested in estimating the population
total $\theta = \sum_{k \in U} y_k$, where $y_k$ is the value of the variable of interest for unit $k$. From each stratum $U_h$, a sample $s_h$ of size $n_h$ is selected according to simple random
sampling without replacement. The resulting total sample, $s = \bigcup_{h=1}^{H} s_h$, is of size $n = \sum_{h=1}^{H} n_h$. We denote by $\pi_k = n_h / N_h$ the probability that unit $k \in U_h$ is selected in $s$. The sampled units in stratum $h$ are invited, either by post or email, to
complete an online electronic questionnaire. We call this the “mail-out”. If
all sampled units respond to the mail-out, one could use the unbiased expansion
estimator of $\theta$, also called the full sample estimator:
$$\hat{\theta}_F = \sum_{k \in s} w_k y_k, \tag{2.1}$$
where $w_k = 1/\pi_k$ denotes
the design weight associated with unit $k$.
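The full sample estimator can be illustrated with a minimal Python sketch, assuming every sampled unit responds. The function name `expansion_estimate` and the dictionary inputs are hypothetical and chosen only for illustration; each sampled value is weighted by the inverse of its stratum sampling fraction.

```python
import random


def expansion_estimate(strata, sample_sizes, seed=0):
    """Expansion estimate of a population total under stratified SRSWOR.

    strata: dict mapping a stratum label to the list of y-values of all
        population units in that stratum (hypothetical input; in a real
        survey only the sampled values are observed).
    sample_sizes: dict mapping a stratum label to its sample size.
    """
    rng = random.Random(seed)
    total = 0.0
    for label, y_values in strata.items():
        stratum_size = len(y_values)
        sample_size = sample_sizes[label]
        # Simple random sampling without replacement within the stratum.
        sample = rng.sample(y_values, sample_size)
        # Design weight: inverse of the selection probability n_h / N_h.
        weight = stratum_size / sample_size
        total += weight * sum(sample)
    return total
```

When every unit in a stratum has the same value, or when the sample is a census, the estimate reproduces the population total exactly, which gives a quick sanity check.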
In practice, not all sampled units respond to the
mail-out. Suppose that, after a certain period of time, $n_{rh}$ of the $n_h$ sampled units respond in stratum $h$. We denote the set of respondents in stratum $h$ by $s_{rh}$, with $s_r = \bigcup_{h=1}^{H} s_{rh}$, and the response probability for unit $k$ by $p_k$. A sample of $n_f$ units, $s_f$, is then selected from the set of all
non-respondents to the mail-out, $s_m = s \setminus s_r$. We denote by $s_{fh}$ the set of units selected for a follow-up in stratum $h$ among
the set of non-respondents to the mail-out in stratum $h$, $s_{mh} = s_h \setminus s_{rh}$. We denote the probability that the mail-out non-respondent $k$ is selected in the follow-up sample by $\pi_{fk}$. We assume that this probability can be written
as $\pi_{fk} = n_f \tilde{\pi}_{fk}$, where $\tilde{\pi}_{fk}$ does not depend on the follow-up sample size $n_f$ and satisfies the condition
$$\sum_{k \in s_m} \tilde{\pi}_{fk} = 1. \tag{2.2}$$
This condition is satisfied for simple random sampling,
stratified simple random sampling with proportional or Neyman allocation, and
probability proportional to size sampling.
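The factorization of the follow-up selection probability can be sketched in Python. The sketch assumes the condition is that the factors not depending on the follow-up sample size sum to one over the mail-out non-respondents, which holds for any fixed-size design. All function names are hypothetical.

```python
def srs_base_probabilities(n_nonrespondents):
    """Sample-size-free factors for simple random sampling: 1 / n_m each."""
    return [1.0 / n_nonrespondents] * n_nonrespondents


def pps_base_probabilities(size_measures):
    """Sample-size-free factors for probability proportional to size sampling."""
    total_size = sum(size_measures)
    return [x / total_size for x in size_measures]


def followup_inclusion_probabilities(base_probabilities, followup_size):
    """Selection probability = follow-up sample size times the base factor."""
    probabilities = [followup_size * b for b in base_probabilities]
    # A probability above one means the size measure is too large for this
    # follow-up sample size; such units would need special handling.
    if any(p > 1.0 + 1e-12 for p in probabilities):
        raise ValueError("inclusion probability exceeds 1")
    return probabilities
```

In both designs the base factors sum to one, so the selection probabilities sum to the follow-up sample size.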
Units of the follow-up sample $s_f$ are followed up via telephone. If all units respond to the follow-up, the unbiased Hansen and Hurwitz (1946)
estimator of the population total can be used:
$$\hat{\theta}_{HH} = \sum_{k \in s_r} w_k y_k + \sum_{k \in s_f} w_k w_{fk} y_k, \tag{2.3}$$
where $w_k$ is the design weight of unit $k$ and $w_{fk} = 1/\pi_{fk}$ is the
follow-up design weight of unit $k$. The objective of the sample $s_f$ is to estimate the unknown total $\sum_{k \in s_m} w_k y_k$ over the set $s_m$ of mail-out non-respondents. If a variable $x$ strongly related to the variable of interest $y$ is available before sample selection for all
the mail-out non-respondents, it seems natural to use $x$ as an auxiliary variable for stratification or
as a size measure for probability proportional to size sampling.
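A minimal Python sketch of a Hansen and Hurwitz type estimate follows, assuming all follow-up units respond. The function name and the tuple layout of the inputs are hypothetical. Mail-out respondents contribute with their design weight, and follow-up units contribute with the design weight multiplied by the inverse of their follow-up selection probability.

```python
def hansen_hurwitz_estimate(mailout_respondents, followup_sample):
    """Two-phase estimate of a population total with full follow-up response.

    mailout_respondents: list of (design_weight, y) pairs.
    followup_sample: list of (design_weight, followup_probability, y) triples
        for the units subsampled among the mail-out non-respondents.
    """
    mailout_part = sum(w * y for w, y in mailout_respondents)
    # Dividing by the follow-up selection probability applies the
    # follow-up design weight.
    followup_part = sum(w * y / pi_f for w, pi_f, y in followup_sample)
    return mailout_part + followup_part
```

For instance, one follow-up unit selected with probability 0.5 stands in for two mail-out non-respondents with the same design weight.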
As pointed out
by a reviewer, it is important to wait until mail-out data collection is closed
before selecting the follow-up sample. If units respond to the mail-out after
the follow-up sample has been selected, some decisions on how to handle these
late respondents are required. If they are not discarded, it may be difficult
to obtain an unbiased estimator like (2.3) without introducing model
assumptions (see Beaumont, Bocci and Hidiroglou, 2014). This issue may also
have implications for the length of the collection period.
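As pointed out in the introduction, it is unlikely that
all the follow-up sample units will respond. Suppose that after the end of the
data collection period, $n_{frh}$ units have responded to the follow-up in
stratum $h$. We denote by $s_{frh}$ the set of the follow-up respondents in stratum $h$, and by $s_{fr} = \bigcup_{h=1}^{H} s_{frh}$ the set of all follow-up respondents. We consider the non-response-adjusted version
of the Hansen and Hurwitz (1946) estimator:
$$\hat{\theta}_{NA} = \sum_{k \in s_r} w_k y_k + \sum_{k \in s_{fr}} w_k w_{fk} a_k y_k, \tag{2.4}$$
where $w_k$ and $w_{fk}$ are the design weight and the follow-up design weight of unit $k$, $s_r$ is the set of mail-out respondents, and $a_k$ is a
non-response weight adjustment. Under uniform non-response, a suitable weight
adjustment is the inverse of the overall weighted response rate:
$$a_k = \frac{\sum_{l \in s_f} w_l w_{fl}}{\sum_{l \in s_{fr}} w_l w_{fl}}. \tag{2.5}$$
A less restrictive assumption is uniform non-response
within strata. Under this assumption, a suitable weight adjustment would be the
inverse of the stratum weighted response rate:
$$a_k = \frac{\sum_{l \in s_{fh}} w_l w_{fl}}{\sum_{l \in s_{frh}} w_l w_{fl}}, \quad k \in s_{frh}. \tag{2.6}$$
Note that the non-response weight adjustment (2.6) is
computable only if $n_{frh} > 0$ for all strata. Alternatively, unweighted
versions of (2.5) and (2.6) could also be considered.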
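The two weight adjustments can be sketched in Python as inverses of weighted response rates. This is a sketch under the assumption that the weight entering the response rate is the product of the design weight and the follow-up design weight; the dictionary keys `stratum`, `weight` and `responded` are hypothetical.

```python
from collections import defaultdict


def overall_adjustment(units):
    """Inverse of the weighted response rate over the given follow-up units."""
    total_weight = sum(u["weight"] for u in units)
    responding_weight = sum(u["weight"] for u in units if u["responded"])
    if responding_weight == 0:
        # Without any respondent the adjustment is not computable.
        raise ValueError("no respondents: adjustment is not computable")
    return total_weight / responding_weight


def stratum_adjustments(units):
    """Inverse of the weighted response rate, computed within each stratum."""
    by_stratum = defaultdict(list)
    for unit in units:
        by_stratum[unit["stratum"]].append(unit)
    return {h: overall_adjustment(group) for h, group in by_stratum.items()}
```

The stratum-level version raises an error as soon as one stratum has no follow-up respondent, which mirrors the computability requirement stated above. Setting every weight to 1 gives the unweighted versions.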
As mentioned
earlier, follow-up of non-respondents who have been selected in the follow-up sample $s_f$ is performed via telephone. In our proposed
data collection procedure, a calling queue is first created by randomly
ordering units in $s_f$. These units are then called sequentially until
the queue is empty or the entire follow-up budget has been expended, whichever
comes first. Each call attempt made to units in $s_f$ results in one of these three outcomes:
- Response: A
response is obtained from the unit. The unit is removed from the calling queue
so that it does not get called again.
- Final non-response: The unit is finalized as a non-respondent; it should not be
called back again and is removed from the calling queue. The most common
example of this outcome is a refusal to respond to the survey.
- Still in progress: The unit is not finalized and needs to be called again; it is
therefore returned to the end of the calling queue. An example of this outcome
is an attempt where no contact is made or
an attempt where an appointment is made for a callback.
The “response” and “final
non-response” outcomes are both final outcomes, in the sense that the unit is
removed from the calling queue and the
collection process. This is in contrast to the “still-in-progress”
outcome where the unit is returned to the calling queue so that it can be
called again. A unit that completes the data collection process with an outcome
of “response” or “final non-response” is said to be finalized or resolved;
otherwise, it is said to be unresolved. There
are two types of non-respondents after data collection: i) Finalized units
with a “final non-response” outcome; and ii) Unresolved units. Both types
of non-respondents are accounted for in estimation using the
non-response-adjusted estimator (2.4).
We
assume that, for a given sample unit, the outcomes of the call attempts are
independent, and the probability associated with each of the three possible
outcomes remains constant throughout the entire data collection period. For a
given sampled unit $k$, the probability of
a “response” is denoted as $\rho_k$, the probability of
a “final non-response” is denoted as $\nu_k$, and the probability
of a “still-in-progress” outcome is denoted as $\tau_k$, with $\rho_k + \nu_k + \tau_k = 1$. In practice, the
independence and constant probability assumptions may not hold exactly. The
independence assumption is expected to be more plausible if the probabilities
are conditional on strong predictors and if the time gap between two successive
call attempts on the same unit is not too short. The constant probability
assumption is not satisfied when the probabilities depend on predictors that
can vary during data collection, such as the time of day or day of the week of
the call attempt. Although it might be possible to extend our model to
time-varying predictors, it would complicate our theoretical developments and
simulation study. These assumptions are made throughout the paper to simplify
our analyses. This is a limitation of our investigations that should be kept in
mind when interpreting our results.
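The calling procedure, combined with the independence and constant probability assumptions, can be simulated with a short Python sketch. It assumes the follow-up budget is expressed as a maximum total number of call attempts, which the text does not specify; the function name and input layout are hypothetical.

```python
import random
from collections import deque


def run_calling_queue(units, budget, seed=0):
    """Simulate the calling queue until it is empty or the budget is spent.

    units: dict mapping a unit id to (p_response, p_final_nonresponse);
        the remaining probability is the still-in-progress outcome.
    budget: maximum total number of call attempts (an assumption here).
    Returns (status by unit id, number of call attempts made).
    """
    rng = random.Random(seed)
    order = list(units)
    rng.shuffle(order)  # random ordering of the follow-up sample
    queue = deque(order)
    status = {unit: "unresolved" for unit in units}
    calls = 0
    while queue and calls < budget:
        unit = queue.popleft()
        calls += 1
        p_response, p_final = units[unit]
        draw = rng.random()  # independent draw with constant probabilities
        if draw < p_response:
            status[unit] = "response"
        elif draw < p_response + p_final:
            status[unit] = "final non-response"
        else:
            queue.append(unit)  # still in progress: back to the end
    return status, calls
```

Units still in the queue when the budget runs out keep the status "unresolved", matching the second type of non-respondent described earlier.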
Multiple phone call attempts may be necessary to reach
and resolve a unit. Data collection managers may wish to impose an upper limit
on the number of call attempts that can be made to any follow-up sample unit.
If a unit is still in progress after reaching the limit, it is removed from the
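calling queue and remains unresolved at the end of data collection. Let $K$ be that upper limit on the number of call
attempts. Assuming each unresolved unit at the end of data collection always
reaches the maximum number of attempts $K$, the probability that unit $k$ responds when selected in the sample $s_f$ can be written as $p_{fk} = \sum_{j=1}^{K} p_{fk}^{(j)}$, where $p_{fk}^{(j)}$ is the probability that unit $k$ responds exactly at the $j$th attempt when selected in $s_f$. Under our assumptions, it is easy to see that $p_{fk}^{(j)} = \tau_k^{\,j-1} \rho_k$, where $\rho_k$ and $\tau_k$ are the per-attempt probabilities of a “response” and a “still-in-progress” outcome for unit $k$. As a result, we have
$$p_{fk} = \rho_k \, \frac{1 - \tau_k^{K}}{1 - \tau_k}. \tag{2.7}$$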
In the next
section, equation (2.7) will be used to determine an appropriate follow-up
sample size.
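The response probability under a cap on call attempts can be checked numerically with a small Python sketch. Under independent attempts with constant probabilities, a response at a given attempt requires every earlier attempt to be still in progress, so the probabilities form a geometric series. The function names are hypothetical.

```python
def response_probability_by_sum(p_response, p_in_progress, max_attempts):
    """Sum over attempts of P(still in progress before) * P(response now)."""
    return sum(
        p_in_progress ** (attempt - 1) * p_response
        for attempt in range(1, max_attempts + 1)
    )


def response_probability_closed_form(p_response, p_in_progress, max_attempts):
    """Closed form of the same geometric sum."""
    if p_in_progress == 1.0:
        # Every attempt stays in progress, so the unit never responds.
        return 0.0
    return p_response * (1.0 - p_in_progress ** max_attempts) / (1.0 - p_in_progress)
```

With a per-attempt response probability of 0.3 and a still-in-progress probability of 0.5, three attempts give 0.3 × (1 + 0.5 + 0.25) = 0.525, and both functions agree on that value.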
ISSN : 1492-0921
Editorial policy
Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© His Majesty the King in Right of Canada as represented by the Minister of Industry, 2022
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa