Non-response follow-up for business surveys
Section 1. Introduction

Data collection research is a topic of interest amongst national statistical agencies looking to increase response rates and/or reduce data collection costs. With the high costs of collecting survey data, even a small increase in the efficiency of data collection procedures can translate into significant monetary savings. Given that response rates have declined over the past twenty years in both social and economic surveys, there has also been a growing concern over non-response bias.

In one of the first papers to discuss non-response, Hansen and Hurwitz (1946) proposed drawing a sub-sample of non-respondents, also called a non-response follow-up sample, to eliminate non-response bias. Their set-up was as follows: questionnaires were mailed out and after a certain period, a sample of non-respondents was followed up by personal interviewers to obtain their responses. They showed how the responses to the initial mail-out could be combined with those from the non-response follow-up sample to obtain an unbiased estimator of a population total or mean. They made the strong assumption that every unit of the follow-up sample responds. However, in today’s environment, this assumption is not realistic as businesses and individuals are becoming increasingly reluctant to respond to surveys.
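
To make the combination of initial responses and follow-up responses concrete, the following is a minimal sketch of the classical Hansen-Hurwitz estimator of a population total. It assumes a simple random sample of n units from a population of N units, a simple random follow-up sub-sample of the non-respondents, and full response in the follow-up sample (the strong assumption noted above). The function and argument names are hypothetical and chosen for illustration only; this is not the non-response-adjusted estimator studied in this paper.

```python
def hansen_hurwitz_total(N, n, y_respondents, n_nonrespondents, y_followup):
    """Sketch of the classical Hansen-Hurwitz estimator of a population total.

    Assumptions (illustrative, not taken from this paper):
      - the initial sample is a simple random sample of n units out of N;
      - y_respondents holds the values reported at the initial mail-out;
      - the follow-up sample is a simple random sub-sample of the
        n_nonrespondents non-respondents, and every follow-up unit responds,
        so y_followup holds one value per followed-up unit.
    """
    n1 = len(y_respondents)
    n2 = n_nonrespondents
    m = len(y_followup)

    if n1 + n2 != n:
        raise ValueError("respondents plus non-respondents must equal n")
    if n2 > 0 and m == 0:
        raise ValueError("a non-empty follow-up sample is required")
    if m > n2:
        raise ValueError("follow-up sample cannot exceed the non-respondents")

    # Initial respondents each represent themselves within the sample.
    respondent_part = sum(y_respondents)

    # Each follow-up unit stands in for n2 / m non-respondents.
    followup_part = n2 * (sum(y_followup) / m) if n2 > 0 else 0.0

    # Expand the sample-level total to the population with weight N / n.
    return (N / n) * (respondent_part + followup_part)


# Example: 10 units sampled from 1,000; 6 respond, 2 of the 4
# non-respondents are followed up and both respond.
estimate = hansen_hurwitz_total(
    N=1000,
    n=10,
    y_respondents=[10, 20, 30, 40, 50, 60],
    n_nonrespondents=4,
    y_followup=[100, 200],
)
```

In this sketch the follow-up responses are weighted up by the inverse of the sub-sampling fraction, which is what removes the non-response bias when every followed-up unit responds. When some follow-up units do not respond, that guarantee no longer holds without further adjustment.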

Much of the research published in the literature in the last 15 years has focused on adaptive collection designs, also called adaptive survey designs, responsive collection designs, responsive survey designs, or simply responsive designs. Groves and Heeringa (2006) defined a responsive survey design as one that uses paradata, or process data, to guide changes in the features of data collection to achieve higher quality estimates per unit cost. Beaumont, Bocci and Haziza (2014) noted that the literature on adaptive collection designs has mainly focused on developing procedures that aim to reduce the non-response bias of an estimator that is not adjusted for non-response (see for example Schouten, Cobben and Bethlehem, 2009; and Peytchev, Riley, Rosen, Murphy and Lindblad, 2010). Beaumont et al. (2014) argued that any information (e.g., auxiliary data, paradata) that can be used during data collection to reduce non-response bias can also be used at the estimation stage. In other words, the non-response bias that can be removed at the collection stage through an adaptive collection procedure can also be removed at the estimation stage through appropriate non-response weight adjustments. They suggested that adaptive collection procedures, such as call prioritization, cannot reduce the non-response bias to a greater extent than a proper non-response weight adjustment. Limitations of adaptive collection procedures to reduce non-response bias and costs were also noted in the review paper by Tourangeau, Brick, Lohr and Li (2017).

So far, the literature on collection research has mostly targeted household surveys, and little has been reported on this subject for business surveys, two exceptions being Bosa, Godbout, Mills and Picard (2018) and Thompson, Kaputa and Bechtel (2018). Bosa et al. (2018) derived an item score that reflects the importance of following up a particular sample unit and suggested an adaptive collection procedure using this score. Units with a large item score contribute the most to reducing the variance of point estimators. These units are given priority for expensive collection operations such as telephone follow-up. Thompson et al. (2018) considered sub-sampling of non-respondents and investigated the problem of sub-sample allocation subject to some constraints on the response rate and sample size in predetermined domains of interest.

Although business surveys typically use simple sampling designs, such as stratified simple random or Bernoulli sampling designs, they do possess certain features that can pose collection challenges. A distinctive feature is that business populations are highly skewed, with a small percentage of businesses representing much of the economic activity. Consequently, business surveys usually include a take-all stratum where all units are selected with certainty, and take-some strata where the units are usually selected using simple random sampling without replacement or Bernoulli sampling. The take-all units correspond to large businesses. Failing to obtain a response from these large businesses could significantly bias the estimates. As a result, all take-all units are typically followed up, and efforts are made to ensure their responses are received. The large businesses usually have staff (e.g., accountants) capable of responding to items on the questionnaire. On the other hand, small businesses may have to pay an outside accountant to obtain the requested information; this could be a contributing factor to non-response for such businesses. Another feature of business surveys is that collection is usually conducted in two steps. First, letters are sent to the sample units by postal service or by email, inviting them to complete an online electronic questionnaire. After a certain period of time, a follow-up of the non-responding units is conducted via computer-assisted telephone interviews.

In this article, we focus on the take-some strata and attempt to answer the following questions: (i) For a fixed follow-up budget, how much effort should we dedicate to repeatedly following up non-respondents until a response is received? (ii) Should we follow up all the non-respondents or select a sample of them? (iii) If we select a sample of non-respondents, what sampling designs would lead to more efficient estimators? To the best of our knowledge, determining an appropriate follow-up sample size and sampling design has not been investigated in the literature.

In the remainder of the paper, we present our investigations on non-response follow-up in the business survey context. The proposed follow-up strategy, which consists of a follow-up sampling design, data collection procedure, and estimator, is introduced in Section 2. In Section 3, we provide some theoretical properties of the proposed follow-up strategy. Section 4 describes a simulation study conducted to investigate the properties of the non-response-adjusted Hansen-Hurwitz estimator of a population total under different follow-up sampling designs and response scenarios. Finally, in Section 5, we summarize our main conclusions. Although we focus on business surveys, we believe that most of our conclusions also apply to social surveys.

