4 A static adaptive survey design: Assigning telephone interviewers

Barry Schouten, Melania Calinescu and Annemieke Luiten

In this section, a simulation study is presented where telephone interviewer assignment is the design feature of interest. The response probabilities used in the example are estimated from real telephone survey data.

The Dutch Survey of Consumer Satisfaction (SCS) is a monthly telephone survey about the sentiments of households about their economic situation and expenditure. The survey provides insight into short-term economic development, and early indicators of differences in consumer trends. Each month 1,500 households are sampled. The two most influential causes of nonresponse in the SCS are non-contact and refusal. Of the sample 95% is contacted, and of the contacted 71% of the households participate. The response rate is 67%.
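As a quick check on the figures above, and assuming that the response rate is simply the product of the contact rate and the participation rate among contacted households, the reported numbers are consistent:

$0.95 \times 0.71 = 0.6745 \approx 67\%.$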

One of the most important factors affecting participation is the interviewer. Interviewer performance may vary greatly when it comes to obtaining response. In total, 60 interviewers worked on the SCS during 2005, which means that an interviewer had contact with 280 households on average. Interviewer participation rates ranged from 50% to 79%. The lowest rate of 50% was exceptional, however, as the second-lowest participation rate was 61%. The mean interviewer participation rate was 67%. Households were randomly assigned to interviewers in the CATI management system. Hence, with respect to the interviewer, the data are randomized (or interpenetrated). In the following, the interviewer will be the design feature of interest. The survey strategy set $S$ consists of sixty strategies, $S = \{s_{1}, s_{2}, \dots, s_{60}\} .$

From the available auxiliary variables a vector $X$ was selected containing ethnicity, gender composition of the household core (male, female or mix), average age of the household core in 5-year classes, type of household, degree of urbanization of the neighborhood of residence and average value of houses in the neighborhood. Especially age, average house value and type of household relate to key statistics deduced from the SCS. No paradata were available in this study. Therefore, the adaptive survey design is static. In the optimization, the allocation probabilities $p (s_{k} | x)$ need to be chosen, i.e., it needs to be decided which interviewers are assigned to which subpopulations defined by $X$ (such that $\sum_{k} p (s_{k} | x) = 1$ ).

The coefficient of variation of the response propensities $ρ_{X}$ defined by (2.8) is selected as the target quality function. To estimate the response propensities $ρ (s_{k}, x)$ for interviewers, a multilevel model is used with the identity link function, i.e., a linear regression with two levels. The interviewers form the first level of the model and the households the second level. The multilevel model is used to separate individual response propensities and interviewer response propensities. The rationale is that by separating interviewer and individual, the interviewer effect can be isolated and interviewer assignment can be optimized. We chose a linear model as it allows for easy optimization. Since the propensities are never close to 0 or 1, the linear model produces almost the same estimates as a logit or probit model.

For the interviewer effect, it was first investigated whether it was sufficient to use a fixed slope multilevel model, i.e., a model in which the interviewer is added as a main effect only and there are no interactions with auxiliary variables. All pre-selected covariates gave significant contributions to the multilevel model, but none of the interactions with the interviewer were significant at the 5% level. For this reason, we restricted ourselves to the following main effect model:

$ρ (s_{k}, x_{i}) = β_{0} + β_{X} x_{i} + β_{k}$ (4.1)

where $x_{i}$ is the covariate vector of household $i$, $1 \leq i \leq n$, with $n$ the sample size, $β_{k}$ is the (fixed) interviewer effect for interviewer $k$, $β_{0}$ is the constant term or intercept and $β_{X}$ is the slope parameter. We let $ρ (x_{i}) = \sum_{k} p (s_{k} | x_{i}) ρ (s_{k}, x_{i})$ denote the response propensity of sample unit $i .$
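To make the two definitions above concrete, the following minimal sketch evaluates model (4.1) and the unit-level propensity $ρ (x_{i})$ for a single household. All numerical values (intercept, slopes, three interviewer effects, covariates) are hypothetical and chosen only for illustration; they are not estimates from the SCS.

```python
def propensity(beta0, beta_x, x, beta_k):
    """Model (4.1): rho(s_k, x_i) = beta0 + beta_X x_i + beta_k."""
    return beta0 + sum(b * v for b, v in zip(beta_x, x)) + beta_k


def unit_propensity(beta0, beta_x, x, beta_int, alloc):
    """rho(x_i) = sum_k p(s_k | x_i) * rho(s_k, x_i)."""
    # The allocation probabilities for one unit must sum to one.
    assert abs(sum(alloc) - 1.0) < 1e-9
    return sum(p * propensity(beta0, beta_x, x, bk)
               for p, bk in zip(alloc, beta_int))


# Hypothetical parameter values (assumptions, not SCS estimates).
beta0 = 0.70
beta_x = [0.05, -0.10]
beta_int = [-0.04, 0.00, 0.04]   # three interviewers instead of sixty
x = [1.0, 0.5]

# Deterministic allocation to the third interviewer.
rho_det = unit_propensity(beta0, beta_x, x, beta_int, [0.0, 0.0, 1.0])
# Probabilistic allocation split between the first two interviewers.
rho_mix = unit_propensity(beta0, beta_x, x, beta_int, [0.5, 0.5, 0.0])
print(round(rho_det, 3), round(rho_mix, 3))
```

A 0-1 allocation simply picks out one interviewer's propensity, whereas a probabilistic allocation averages the propensities over interviewers.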

Model (4.1) was fitted to the SCS data set. Next, the estimated interviewer effects $β_{k}$ were used to optimize the coefficient of variation, subject to two cost constraints: both the total interview time and the individual number of calls for each interviewer must be the same as in the original design. Since the telephone management system handles the calls, the interview time is the dominant component in the costs. If we fix the total interview time, then we constrain costs to be the same as for the regular SCS. Since interviewers can handle only a certain number of calls, we must also fix the number of calls allocated to them. The first constraint implies that we fix the response rate, as the total interview time is the product of the average individual interview time and the number of respondents. The SCS questionnaire is simple and does not contain any nested sets of survey items. As a result, the individual interview time shows hardly any variation over population subgroups. The second constraint is

$\sum_{i} p (s_{k} | x_{i}) = n_{k},$ (4.2)

where $n_{k}$ is the pre-specified number of calls for interviewer $k$ and $\sum_{k} n_{k} = n .$

We optimize the coefficient of variation by distributing the interviewer effects $β_{k}$ over the households. Due to the additive nature of the model, it is easy to show that any permutation of the interviewers over the cases leads to the same average response propensity and, hence, to the same interview time and costs. The average response propensity is

$\bar{ρ} = \frac{1}{n} \sum_{i} ρ (x_{i}) = \frac{1}{n} \sum_{i, k} p (s_{k} | x_{i}) (β_{0} + β_{X} x_{i} + β_{k}) = β_{0} + \frac{1}{n} \sum_{i} β_{X} x_{i} + \frac{1}{n} \sum_{k} n_{k} β_{k},$

which does not depend on the set of allocation probabilities $p (s_{k} | x) .$ As a consequence, minimizing the coefficient of variation amounts to minimizing the variance of the response propensities $S^{2} (ρ_{X}) .$
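The invariance of the average propensity can be checked numerically. The sketch below uses simulated values for $β_{0} + β_{X} x_{i}$ and three hypothetical interviewer effects with fixed workloads $n_{k} = 4$; none of these numbers come from the SCS.

```python
import random


def mean_propensity(base, beta_int, assignment):
    """Average of beta0 + beta_X x_i + beta_k over all units,
    where assignment[i] is the interviewer index k for unit i."""
    return sum(b + beta_int[k] for b, k in zip(base, assignment)) / len(base)


rng = random.Random(1)
# Simulated values of beta0 + beta_X x_i (assumption, for illustration only).
base = [rng.uniform(0.5, 0.9) for _ in range(12)]
beta_int = [-0.05, 0.0, 0.05]

# Each interviewer handles n_k = 4 cases, as required by constraint (4.2).
assignment = [0] * 4 + [1] * 4 + [2] * 4
shuffled = assignment[:]
rng.shuffle(shuffled)   # a permutation keeps every workload n_k unchanged

m_original = mean_propensity(base, beta_int, assignment)
m_shuffled = mean_propensity(base, beta_int, shuffled)
print(abs(m_original - m_shuffled) < 1e-12)
```

Because shuffling only changes which unit receives which $β_{k}$, and not how often each $β_{k}$ is used, the two means agree up to floating-point rounding.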

If we restrict ourselves to 0-1 decision variables, i.e., $p (s_{k} | x) \in \{0, 1\}, \forall x, k,$ then it is relatively easy to show that the optimal allocation corresponds to linking the best interviewers to the most difficult sample units and vice versa. In other words, the sample units are sorted by putting the individual response propensities without the interviewer effect, $β_{0} + β_{X} x_{i},$ in an increasing order, and the interviewers are sorted in a decreasing order based on their interviewer effect, $β_{k} .$ If two sample units $i$ and $j$ are allocated to two different interviewers, say $k$ and $l,$ and $β_{X} x_{i} < β_{X} x_{j}, β_{k} < β_{l}$ and $p (s_{k} | x_{i}) = p (s_{l} | x_{j}) = 1,$ then it is optimal to switch the two interviewers, i.e., $p (s_{l} | x_{i}) = p (s_{k} | x_{j}) = 1.$ This can be shown as follows. The difference in variance $S^{2} (ρ_{X})$ is proportional to

$Δ S^{2} (ρ_{X}) = {(β_{0} + β_{X} x_{i} + β_{k} - \bar{ρ})}^{2}$
$+ {(β_{0} + β_{X} x_{j} + β_{l} - \bar{ρ})}^{2} - {(β_{0} + β_{X} x_{i} + β_{l} - \bar{ρ})}^{2} - {(β_{0} + β_{X} x_{j} + β_{k} - \bar{ρ})}^{2}$ (4.3)
$= 2 (β_{l} - β_{k}) (β_{0} + β_{X} x_{j} - \bar{ρ}) - 2 (β_{l} - β_{k}) (β_{0} + β_{X} x_{i} - \bar{ρ})$
$= 2 (β_{l} - β_{k}) (β_{X} x_{j} - β_{X} x_{i}) > 0.$

From (4.3), we can conclude that there is a decrease in variance, and, hence, in the coefficient of variation, if we swap the two interviewers for cases $i$ and $j .$ From this argument, it follows easily that the optimal solution is as suggested. In a similar fashion, but requiring more algebra, it can be shown that the optimal solution for probabilistic allocations, $p (s_{k} | x) \in [0, 1],$ is the same.
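The sorting rule for 0-1 allocations can be sketched in a few lines. The example below is an illustration under stated assumptions: six units, three interviewers with two cases each, hypothetical values for $β_{0} + β_{X} x_{i}$ and $β_{k}$, and the coefficient of variation taken as the unweighted population standard deviation of the propensities divided by their mean (the exact definition is given by (2.8), which is not reproduced here).

```python
from statistics import mean, pstdev


def coefficient_of_variation(rhos):
    """Unweighted standard deviation divided by the mean (assumed form)."""
    return pstdev(rhos) / mean(rhos)


def sorted_assignment(base, beta_int, workloads):
    """Give the interviewers with the largest effects to the units with the
    lowest base propensity, respecting the workloads n_k of (4.2)."""
    assert sum(workloads) == len(base)
    units = sorted(range(len(base)), key=lambda i: base[i])  # hardest first
    interviewers = sorted(range(len(beta_int)),
                          key=lambda k: beta_int[k], reverse=True)  # best first
    assignment = [None] * len(base)
    pos = 0
    for k in interviewers:
        for i in units[pos:pos + workloads[k]]:
            assignment[i] = k
        pos += workloads[k]
    return assignment


def propensities(base, beta_int, assignment):
    return [b + beta_int[k] for b, k in zip(base, assignment)]


# Hypothetical inputs (assumptions, not SCS estimates).
base = [0.55, 0.60, 0.65, 0.70, 0.75, 0.80]   # beta0 + beta_X x_i
beta_int = [-0.05, 0.00, 0.05]
workloads = [2, 2, 2]

regular = [0, 1, 2, 0, 1, 2]                  # an arbitrary starting design
optimal = sorted_assignment(base, beta_int, workloads)

cv_regular = coefficient_of_variation(propensities(base, beta_int, regular))
cv_optimal = coefficient_of_variation(propensities(base, beta_int, optimal))
print(optimal, cv_optimal < cv_regular)
```

In this toy example, the two hardest units go to the interviewer with the largest effect and the two easiest units to the interviewer with the smallest effect, which compresses the spread of the resulting propensities.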

The first two rows of Table 4.1 contain the average response propensity and the coefficient of variation before and after re-assignment of interviewers. The coefficient of variation dropped from 0.117 to 0.035. In order to get an idea of the significance of the change in the quality function, we computed bootstrap standard errors. For each bootstrap sample, the re-assignment of interviewers was performed. The standard errors are given in Table 4.1.

Table 4.1
The average response propensity and coefficient of variation for the regular SCS and for the SCS after re-assignment of interviewers, without and with adjustment for interview time. Bootstrap standard errors are given in parentheses.
SCS	Adjustment for interview time?	$\bar{ρ}$	$Q (p)$
Regular	-	70.8%	0.117 (0.005)
Re-assignment	No	70.8%	0.035 (0.003)
Re-assignment	Yes	70.8%	0.034 (0.003)
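The bootstrap standard errors in Table 4.1 could be obtained along the following lines. This is only a simplified sketch: it resamples simulated values of $β_{0} + β_{X} x_{i}$ with replacement and repeats the sorted re-assignment in each replicate, whereas the study may also have re-estimated the model in each replicate. The data, the number of replicates `n_boot` and the helper names are assumptions made for illustration.

```python
import random
from statistics import mean, pstdev, stdev


def reassigned_cv(base, beta_int, workloads):
    """Coefficient of variation after the sorted re-assignment:
    hardest units receive the largest interviewer effects."""
    assert sum(workloads) == len(base)
    hardest_first = sorted(base)
    effects = []
    for k in sorted(range(len(beta_int)),
                    key=lambda k: beta_int[k], reverse=True):
        effects.extend([beta_int[k]] * workloads[k])
    rhos = [b + e for b, e in zip(hardest_first, effects)]
    return pstdev(rhos) / mean(rhos)


def bootstrap_se(base, beta_int, workloads, n_boot=200, seed=0):
    """Standard deviation of the re-assigned CV over bootstrap replicates."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        resample = rng.choices(base, k=len(base))   # units with replacement
        replicates.append(reassigned_cv(resample, beta_int, workloads))
    return stdev(replicates)


# Simulated inputs (assumptions, not SCS data).
data_rng = random.Random(42)
base = [data_rng.uniform(0.5, 0.9) for _ in range(60)]
beta_int = [-0.05, 0.00, 0.05]
workloads = [20, 20, 20]

cv_hat = reassigned_cv(base, beta_int, workloads)
se_hat = bootstrap_se(base, beta_int, workloads)
print(round(cv_hat, 3), round(se_hat, 3))
```

Repeating the re-assignment inside every replicate, as the text describes, lets the standard error reflect the variability of the optimized design rather than that of a fixed allocation.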

The reader may have noticed that fixed numbers of interviewer cases do not imply fixed numbers of interviews per interviewer. In fact, by rearranging the interviewers, the good interviewers will do fewer interviews as they get the harder cases, while the weaker interviewers will do more interviews. As a result, the good interviewers will work fewer hours than they would in the regular SCS and the weaker interviewers will work more. This would be an undesirable side effect, which can, however, be corrected relatively easily. Starting from the optimal solution, and again sorting the sample units based on their individual response propensities without the interviewer effect, we can shift neighbouring cases from weaker interviewers to better interviewers. This is done in such a way that the total interview time per interviewer does not exceed that of the regular SCS. One can again prove that this procedure leads to a new optimal solution where the constraint on the fixed number of cases in (4.2) is replaced by the constraint on the fixed number of interviews

$\sum_{i} p (s_{k} | x_{i}) ρ (s_{k}, x_{i}) = r_{k},$ (4.4)

where $r_{k}$ is the pre-specified number of interviews. Table 4.1 presents the coefficient of variation for the optimal solution given (4.4). The response rate remains fixed, and the coefficient of variation is marginally smaller.
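The side effect that motivates constraint (4.4) is easy to see in a toy example. The sketch below computes the left-hand side of (4.4), the expected number of interviews per interviewer, for a regular and a re-assigned 0-1 design. The inputs are hypothetical, and the shifting procedure itself is not implemented here.

```python
def expected_interviews(base, beta_int, assignment):
    """Left-hand side of (4.4) for a 0-1 allocation: the sum of
    rho(s_k, x_i) over the units assigned to interviewer k."""
    totals = [0.0] * len(beta_int)
    for b, k in zip(base, assignment):
        totals[k] += b + beta_int[k]
    return totals


# Hypothetical inputs (assumptions, not SCS estimates).
base = [0.55, 0.60, 0.65, 0.70, 0.75, 0.80]   # beta0 + beta_X x_i
beta_int = [-0.05, 0.00, 0.05]                # interviewer 2 is the best

regular = [0, 1, 2, 0, 1, 2]                  # arbitrary starting design
reassigned = [2, 2, 1, 1, 0, 0]               # best interviewer, hardest units

r_regular = expected_interviews(base, beta_int, regular)
r_reassigned = expected_interviews(base, beta_int, reassigned)
print([round(r, 2) for r in r_regular])
print([round(r, 2) for r in r_reassigned])
```

In this example the best interviewer is expected to complete fewer interviews after re-assignment and the weakest interviewer more, while the total over all interviewers is unchanged. Shifting neighbouring cases towards the better interviewers is what restores the pre-specified $r_{k} .$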

In 2009, the SCS was used as an instrument to test a static adaptive survey design. We refer to Luiten and Wetzels (2009) and Luiten and Schouten (2013) for details. Interviewer assignment was one of the main design features that were adapted. Other design features were the survey mode and the contact protocol. In addition to telephone, web was selected as a potential survey mode. Based on historical SCS data, contact and response probabilities were estimated. Sample units with low estimated contact probabilities were assigned to more intensive contact protocols and were prioritized. The pilot succeeded in significantly improving the coefficient of variation, while fixing the response rate and budget.

In this section, we presented a simulation study where good telephone interviewers get more difficult cases. In practice, this may lead to annoyance among these interviewers. When implementing such a design, one should carefully instruct interviewers beforehand. In the 2009 SCS pilot, the re-assignment did not lead to any negative comments from interviewers. In face-to-face surveys, a re-assignment of interviewers cannot be done as easily, since travel costs may change drastically. Still, within densely populated interviewer regions, re-assignment may be an option.


Date modified: 2017-09-20
