# Adaptive survey designs to minimize survey mode effects – a case study on the Dutch Labor Force Survey 4. Case study: The Dutch Labor Force Survey

In this section, we discuss a case study linked to the Dutch Labor Force Survey (LFS) of the years 2010 $-$ 2012. We briefly describe the design of the LFS first. We then proceed to a description of the selected design features and the selected population subgroups. Next, we explain how we have estimated the main input parameters to the optimization problem: response propensities, telephone registration propensities, variable costs and adjusted method effects with respect to two different benchmark designs. Following the estimation, we present the main optimization results. We end with a discussion of the sensitivity of optimal designs to inaccuracy of input parameters. For full details, we refer to Calinescu and Schouten (2013b).

## 4.1 The Dutch LFS design and redesign in 2010 $-$ 2012

The Dutch LFS is a monthly household survey using a rotating panel with five waves at quarterly intervals. The LFS is based on an address sample using a two-stage design in which the first stage consists of municipalities and the second consists of addresses. A stratified simple random sample is drawn based on the household age, ethnicity and registered unemployment composition. All households, to a maximum of eight, that are residents at the address are invited to participate. Within each household, all members of 15 years and older are eligible; they form the potential labor force population. The LFS contains a variety of topics, from employment status, profession and working hours to educational level, but the main survey statistic is the unemployment rate.

Up to 2010, the LFS consisted of a face-to-face first wave and telephone subsequent waves. For various reasons, costs being the most important, the first wave went through a major redesign. The other waves were left unchanged, except for a few relatively small changes to the questionnaires. The redesign consisted of two phases: First, telephone was added as a survey mode, and, second, also Web was added as a survey mode. In the first phase, the face-to-face first wave was replaced by a concurrent mode design where all households with at least one listed/registered phone number were assigned to telephone and all other households to face-to-face. The listed phone numbers consist of both landline and mobile phone numbers that can be bought from commercial vendors. In the second phase, the telephone and face-to-face concurrent design was preceded by a Web invitation, resulting in a mix of a sequential and a concurrent design. All households were sent an invitation to participate through an on-line questionnaire. Nonresponding households were approached by telephone if a listed number was available and otherwise by face-to-face. The first phase was performed during 2010 and the second phase during 2012. In both years large parallel samples were drawn in order to assess method effects between the designs on the unemployment rate. The 2010 parallel run compared the old design to the intermediate concurrent design and the 2012 parallel run compared the intermediate design to the final design with all three modes.
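The mode assignment rules of the three first-wave designs can be summarized in a short sketch. It follows only the rules described in this subsection; the function name and the design labels `"old"`, `"intermediate"` and `"final"` are illustrative and do not refer to Statistics Netherlands production systems.

```python
def first_wave_modes(design: str, has_listed_phone: bool) -> list[str]:
    """Return the ordered list of modes offered to a household in the first wave.

    design: "old" (up to 2010), "intermediate" (2010 phase) or "final" (2012 phase).
    These labels are illustrative, not official design names.
    """
    # Interviewer mode in the concurrent design: telephone if a number is listed,
    # face-to-face otherwise.
    interviewer_mode = "Tel" if has_listed_phone else "F2F"
    if design == "old":
        return ["F2F"]
    if design == "intermediate":
        return [interviewer_mode]
    if design == "final":
        # Web invitation first; nonrespondents go to the interviewer mode.
        return ["Web", interviewer_mode]
    raise ValueError(f"unknown design: {design}")
```

For example, `first_wave_modes("final", has_listed_phone=False)` yields the sequence Web followed by face-to-face.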

The redesign did not change the data collection strategy per mode. In all years, the face-to-face contact strategy for the LFS first wave consists of a maximum of six visits to the address and contacts are varied over days of the week and times during the day. If no contact is made at the sixth visit, then the address is processed as a noncontact. The telephone contact strategy consists of three series of three calls. The three series are termed contact attempts and represent three different interviewer shifts. In each shift the phone number is called three times with a time lag of roughly an hour. The Web strategy is an advance letter with a login code to a website and two reminder letters with time lags of one week.

We use the 2010 $-$ 2012 first wave LFS data to estimate various input parameters for the optimization model. In order to keep the exposition simple, and since the subsequent waves were not redesigned, we restrict ourselves to method effects on unemployment rate estimates based on the first wave only. However, the first wave redesign may clearly have influenced the recruitment and response to waves 2 to 5. In follow-up studies at Statistics Netherlands, recruitment propensities to subsequent waves were included in the optimization problem, but we do not discuss these here. The LFS data were augmented with data from two administrative registers: the POLIS register and the UWV register. The POLIS register contains information about employments, allowances, income from employment and social benefits. The UWV register contains persons that have registered themselves as unemployed and applied for an unemployment allowance. Both registers contain relevant variables for the LFS and will be used to stratify the population.

## 4.2 The strategy set

The parallel runs in the LFS allow us to consider a multi-mode optimization problem with various single mode and sequential mixed-mode strategies. In the following, we abbreviate the telephone and face-to-face modes to $Tel$ and $F2F,$ respectively. Although the sequential strategy $Web\to F2F$ is observed only for large households and for households without a registered phone, we do include this strategy in the optimization.

Since later face-to-face and telephone calls are considerably more expensive than early calls, we also introduce a simple cap on calls. For $Tel$ we set the cap after two calls and for $F2F$ after three calls. These values are motivated by historical survey data: after these numbers of calls, the cost per call increases quickly. We let $Tel2$ and $F2F3$ denote the strategies where a cap is placed on the number of calls. $Tel2\text{\hspace{0.17em}}+$ and $F2F3\text{\hspace{0.17em}}+$ represent strategies where there is no cap and the regular contact strategy is applied. We do realize that placing a cap is not the same as restricting the number of calls in practice. This holds especially for face-to-face. With fewer calls, interviewers or interviewer staff may change behaviour and spread calls differently. At Statistics Netherlands the $Tel2$ and $F2F3$ strategies are viewed as censored strategies with shorter data collection periods, e.g., two weeks instead of four weeks. Hence, cases are removed from the interviewer workloads after the pre-specified data collection period. From this perspective, it is more reasonable to assume that the optimal contact strategy during the first two weeks of a $F2F3\text{\hspace{0.17em}}+$ strategy is not so different from the optimal contact strategy in $F2F3.$ Still, we may expect that realized response propensities and costs in strategies with a cap are different from their simulated propensities and costs. The strategy set now becomes

$\begin{array}{ll}\mathcal{S}=\left\{Web,\text{\hspace{0.17em}}Tel2,\text{\hspace{0.17em}}Tel2+,\text{\hspace{0.17em}}F2F3,\text{\hspace{0.17em}}F2F3+,\text{\hspace{0.17em}}Web\to Tel2,\text{\hspace{0.17em}}Web\to Tel2+,\text{\hspace{0.17em}}Web\to F2F3,\text{\hspace{0.17em}}Web\to F2F3+,\text{\hspace{0.17em}}\Phi \right\},\hfill & \left(4.1\right)\hfill \end{array}$

where $\Phi$ denotes the nonsampling strategy.

The parallel runs for the LFS in 2010 and 2012 were large. In both years the LFS sample was doubled in size for six months. Still, estimated parameters are subject to sampling variation and in case of the $Web\to F2F$ strategies possibly also to bias. We return to this issue in Section 4.6.

## 4.3 Population groups

In order to stratify the population, the regular LFS weighting variables were used as a starting point: unemployment office registration, age, household size, ethnicity and registered employment. Crossing the five variables led to 48 population strata (yes or no registered unemployed in household times three age classes times two household size classes times two ethnicity classes times yes or no registered employment in household). These strata were collapsed to nine disjoint strata based on their response behavior and mode effects:

1. Registered unemployed : Households with at least one person registered to an unemployment office (7.5% of the population).
2. 65+ households without employment : Households with a maximum of three persons of 15 years and older without a registration to an unemployment office, without employment and with at least one person of 65 years or older (19.8% of the population).
3. Young household members and no employment : Households with a maximum of three persons of 15 years and older without a registration to an unemployment office, without employment, with all persons younger than 65 years, and with at least one person between 15 and 26 years of age (2.4% of population).
4. Non-western without employment : Households with a maximum of three persons of 15 years and older without a registration to an unemployment office, without employment, with all persons younger than 65 years and older than 26 years of age, and at least one person of non-western ethnicity (1.5% of population).
5. Western without employment : Households with a maximum of three persons of 15 years and older without a registration to an unemployment office, without employment, with all persons younger than 65 years and older than 26 years of age and all persons of western ethnicity (11.0% of population).
6. Young household member and employment : Households with a maximum of three persons of 15 years and older without a registration to an unemployment office, with at least one employment, with all persons younger than 65 years, and with at least one person between 15 and 26 years of age (15.6% of population).
7. Non-western and employment : Households with a maximum of three persons of 15 years and older without a registration to an unemployment office, with at least one employment, with all persons older than 26 years of age, and at least one person of non-western ethnicity (3.9% of population).
8. Western and employment : Households with a maximum of three persons of 15 years and older without a registration to an unemployment office, with at least one employment, with all persons older than 26 years of age and all persons of western ethnicity (33.5% of population).
9. Large households : Households with more than three persons of 15 years and older without a registration to an unemployment office (4.9% of the population).

The nine population strata were given informal labels in order to aid interpretation. Note, however, that strata 7, 8 and 9 may contain household members that are 65 years or older. Furthermore, some subgroups follow from collapsing certain strata. For instance, households with at least one employment are found by combining strata 6, 7 and 8, and households with no more than three members of 15 years and older by combining all strata from 1 to 8.
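The stratum definitions can be read as a decision rule. The sketch below is one literal reading of the list above, under three assumptions that the text does not settle: the age class "between 15 and 26" includes age 26, the household contains at least one person of 15 years or older, and households with employment that contain both a young member and a member of 65 years or older, which the labels as written do not cover, return `None`. The class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    age: int
    non_western: bool


def stratum(persons: list, registered_unemployed: bool,
            has_employment: bool) -> Optional[int]:
    """Assign a household to one of the nine strata (illustrative reading)."""
    adults = [p for p in persons if p.age >= 15]  # potential labor force
    if registered_unemployed:
        return 1
    if len(adults) > 3:
        return 9
    any_65_plus = any(p.age >= 65 for p in adults)
    any_young = any(p.age <= 26 for p in adults)  # assumption: 26 counts as young
    any_non_western = any(p.non_western for p in adults)
    if not has_employment:
        if any_65_plus:
            return 2
        if any_young:
            return 3
        return 4 if any_non_western else 5
    if any_young and not any_65_plus:
        return 6
    if not any_young:
        return 7 if any_non_western else 8
    return None  # combination not covered by the definitions as written
```

Under this reading, a two-person household with employment, both members older than 26 and one of non-western ethnicity, falls in stratum 7.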

In the optimization model, each of the nine strata could be assigned its own strategies, with its own strategy allocation probabilities. In addition, following the regular LFS, we added precision constraints defined on another stratification: minimum numbers of respondents were required for groups based on age, ethnicity and registered unemployment. We refer again to Calinescu and Schouten (2013b) for details about these strata and corresponding precision thresholds.

## 4.4 Estimation of input parameters

The input parameters to the multi-mode optimization problem are subpopulation response propensities per strategy, subgroup telephone registration propensities, subgroup costs per sample unit per strategy, and subgroup adjusted method effects per strategy. We sketch the estimation of each set of parameters in the following subsections. More details can be found in Appendix A.

There are three settings that may occur when estimating input parameters: 1) The strategy is directly observed in historical survey data, 2) the strategy is only partially observed in historical survey data, i.e., only for a subset of the sample, and 3) the strategy is not observed at all.

For the LFS case study, the first setting applies to strategies $Web,$ $Tel2\text{\hspace{0.17em}}+,$ $F2F3\text{\hspace{0.17em}}+,$ $Web\to Tel2\text{\hspace{0.17em}}+\text{\hspace{0.17em}}.$ The second setting applies to $Web\to F2F3\text{\hspace{0.17em}}+$ and the third setting applies to $Tel2,$ $F2F3,$ $Web\to Tel2$ and $Web\to F2F3.$ Sequential mixed-mode designs with face-to-face as the follow-up mode are only observed for households without a listed phone number and fall under settings 2 or 3 depending on whether a cap is placed on the number of calls. We attempted to deal with setting 2 by modeling the input parameters based on the observed differences in parameters between $Tel2\text{\hspace{0.17em}}+$ and $F2F3\text{\hspace{0.17em}}+\text{\hspace{0.17em}}.$ We assumed that the ratio in response propensity between $F2F3\left(+\right)$ and $Tel2\text{\hspace{0.17em}}+$ for households with a listed phone number can be applied to $Web\to F2F3\left(+\right)$ and $Web\to Tel2\text{\hspace{0.17em}}+\text{\hspace{0.17em}}.$ Furthermore, in the estimation, we assumed that strategies involving caps on the number of calls are similar to simulated strategies, i.e., by artificially restricting strategies with the full number of calls to the specified cap. Hence, we attempted to deal with setting 3 by censoring strategies. Calinescu and Schouten (2013b) elaborate these modeling steps.
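The two modeling steps for settings 2 and 3 can be illustrated as follows. This is a minimal sketch of one reading of the assumptions above, not the estimation procedure of Calinescu and Schouten (2013b): the function names, the input layout and the truncation of the extrapolated propensity at one are our own additions.

```python
def extrapolate_web_f2f(rho_web_tel: float, rho_f2f: float, rho_tel: float) -> float:
    """Setting 2: scale the observed Web->Tel2+ propensity by the F2F/Tel ratio.

    rho_f2f and rho_tel are the single-mode propensities for households
    with a listed phone number. Truncating at 1.0 is an added safeguard.
    """
    return min(1.0, rho_web_tel * rho_f2f / rho_tel)


def censored_propensity(response_call: list, cap: int) -> float:
    """Setting 3: response propensity if calls had stopped at `cap`.

    response_call[i] is the call number at which unit i responded under the
    full contact strategy, or None if the unit did not respond.
    """
    responded_in_time = sum(1 for c in response_call if c is not None and c <= cap)
    return responded_in_time / len(response_call)


# Hypothetical call records for six sample units (not LFS data).
calls = [1, 2, 3, None, 5, None]
capped = censored_propensity(calls, cap=2)  # two of six units respond by call 2
```

The censoring step treats the first calls of the full strategy as if they were the capped strategy, which is exactly the assumption that interviewers would not have behaved differently under a cap.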

For the method effect $D\left(s,g\right),$ two benchmarks were selected ${\text{BM}}_{1}={\overline{y}}_{F\text{2}F\text{3}+}$ and ${\text{BM}}_{2}=1/3*\left({\overline{y}}_{Web}+{\overline{y}}_{Tel\text{2}+}+{\overline{y}}_{F\text{2}F\text{3}+}\right),$ where ${\overline{y}}_{\text{mode}}$ represents the average unemployment rate estimated via the indicated survey mode. The first benchmark assumes that the average unemployment rate that is estimated via a single mode face-to-face design represents the target unemployment rate. The second benchmark assumes there is no preferred mode, hence, it assigns an equal weight to each of the three modes. The $F2F3\text{\hspace{0.17em}}+$ benchmark is chosen because it is the traditional mode for the LFS first wave and, hence, determines the LFS time series up to 2010. Furthermore, we believe it is the mode that provides the smallest nonresponse bias for many surveys, see, e.g., Klausch et al. (2013a). It is, however, unclear whether $F2F3\text{\hspace{0.17em}}+$ should also be considered the mode with the smallest measurement bias. Hence, we also introduced the second benchmark to investigate the importance of the benchmark choice.
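As a small numerical illustration of the two benchmarks, the sketch below computes ${\text{BM}}_{1}$ and ${\text{BM}}_{2}$ from three single-mode estimates. The input values are hypothetical and are not LFS results; the function name is illustrative.

```python
def benchmarks(y_web: float, y_tel: float, y_f2f: float) -> tuple:
    """Return (BM1, BM2) from the Web, Tel2+ and F2F3+ unemployment rate estimates."""
    bm1 = y_f2f                        # face-to-face as the reference mode
    bm2 = (y_web + y_tel + y_f2f) / 3  # equal weight for each of the three modes
    return bm1, bm2


# Hypothetical single-mode estimates of the unemployment rate.
bm1, bm2 = benchmarks(y_web=0.06, y_tel=0.05, y_f2f=0.07)
```

With these inputs the two benchmarks differ by one percentage point, which shows how the choice of benchmark can shift every method effect measured against it.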

Standard errors for the estimated input parameters were approximated using bootstrap resampling per sampling stratum, following the stratified sampling design.
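A bootstrap that resamples within sampling strata can be sketched as below. This is a simplified illustration: it resamples units with replacement within each stratum and ignores the two-stage structure of the LFS sample. The function names, the data layout and the example estimator are assumptions made for the sketch.

```python
import random
import statistics


def stratified_bootstrap_se(strata: dict, estimator, n_boot: int = 500,
                            seed: int = 0) -> float:
    """Bootstrap standard error of `estimator`, resampling within each stratum.

    strata: maps a stratum label to the list of observations in that stratum.
    estimator: function that takes such a dict and returns a float.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        # Draw with replacement, keeping each stratum at its original size.
        resample = {h: rng.choices(obs, k=len(obs)) for h, obs in strata.items()}
        draws.append(estimator(resample))
    return statistics.stdev(draws)


def pooled_response_rate(sample: dict) -> float:
    """Example estimator: unweighted response rate over all strata."""
    n = sum(len(obs) for obs in sample.values())
    return sum(sum(obs) for obs in sample.values()) / n


# Hypothetical response indicators (1 = response) in two sampling strata.
data = {"a": [1, 0, 1, 1, 0, 1, 0, 1], "b": [0, 0, 1, 0, 1, 0]}
se = stratified_bootstrap_se(data, pooled_response_rate, n_boot=200, seed=1)
```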

## 4.5 Optimization results

In this section, we explore the optimal allocation and minimal method effect for various levels of the budget $B,$ the maximal between-stratum method effect $M$ and the maximal sample size ${S}_{\text{max}}$:

$\begin{array}{lll}B\hfill & \in \hfill & \left\{\text{160,000};\text{\hspace{0.17em}}\text{170,000};\text{\hspace{0.17em}}\text{180,000}\right\}\hfill \\ M\hfill & \in \hfill & \left\{1\%;\text{\hspace{0.17em}}0.5\%;\text{\hspace{0.17em}}0.25\%\right\}\hfill \\ {S}_{\text{max}}\hfill & \in \hfill & \left\{\text{9,500};\text{\hspace{0.17em}}\text{12,000};\text{\hspace{0.17em}}\text{15,000}\right\}.\hfill \end{array}$
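The threshold levels can be enumerated as a grid of scenarios. The sketch assumes that the three sets of levels are fully crossed, which gives 27 combinations; the variable names are illustrative.

```python
from itertools import product

budgets = [160_000, 170_000, 180_000]             # B
max_between_stratum = [0.01, 0.005, 0.0025]       # M, as proportions
max_sample_sizes = [9_500, 12_000, 15_000]        # S_max

# One dictionary per combination of threshold levels.
scenarios = [
    {"B": b, "M": m, "S_max": s}
    for b, m, s in product(budgets, max_between_stratum, max_sample_sizes)
]
```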

Appendix B presents the minimal method effects for the various levels and for the two benchmark designs, ${\text{BM}}_{1}$ and ${\text{BM}}_{2}.$ For the sake of brevity, here, we highlight mostly the results for ${\text{BM}}_{1},$ which is the former LFS design. The actual values for the non-adaptive regular three mode LFS design are

$\begin{array}{llllll}B\hfill & =\hfill & \text{170,000}\hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}M\hfill & =\hfill & 3.00\%\hfill \\ {S}_{\mathrm{max}}\hfill & =\hfill & \text{11,000}\hfill & \text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\overline{D}}^{{\text{BM}}_{1}}\hfill & =\hfill & -0.15\%.\hfill \end{array}$

Two main conclusions can be drawn from the results. First, the adaptive design is able to decrease the absolute overall method effect with respect to both benchmarks while respecting a strict constraint on the maximal between-stratum method effect and keeping the budget at the current level. The only constraint that needs to be relaxed in order to reduce the overall method effect is the maximal sample size. Second, for benchmark ${\text{BM}}_{2},$ smaller minimal overall method effects are obtained than for ${\text{BM}}_{1},$ with the exception of ${S}_{\text{max}}=\text{9,500}.$ This difference is the result of the generally smaller and more similar values of the stratum method effects $D\left(s,g\right).$

We can explore the impact of the sample size constraint by comparing the optimal allocations for ${S}_{\text{max}}=\text{9,500}$ and ${S}_{\text{max}}=\text{15,000}\text{.}$ Assume thresholds are set at $B=\text{170,000},$ $M=1\%$ and ${\text{BM}}_{1}.$ Figures 4.1 and 4.2 present the optimal allocation probabilities per stratum and strategy given that a unit is sampled. Each figure can be seen as a matrix where each row represents one of the strategies in ${\mathcal{S}}^{R}$ and each column one of the nine strata described in Section 4.3, e.g., ${g}_{1}$ is the registered unemployed stratum. Each cell in the matrix, i.e., intersection of a row with a column, shows the probability of assigning the corresponding strategy to the corresponding stratum. The probabilities are depicted as bars; the larger a bar, the larger the proportion of the stratum that is allocated to the strategy. Within each stratum, the probabilities sum up to one over the strategies, i.e., over the rows. The exact values are given in the bars in case they are 20% or larger. Figures 4.1 and 4.2 show a clear shift in allocation probabilities when the sample size is allowed to increase, e.g., stratum 6 (young household member and employment) is almost fully allocated to $Web$ and strata 8 (western and employment) and 9 (large households) change from sequential to face-to-face only strategies.

Description for Figure 4.1

Figure that can be seen as a matrix where each row represents one of the strategies in ${\mathcal{S}}^{R}$ and each column one of the nine strata, ${g}_{1}$ to ${g}_{9},$ described in Section 4.3. See Section 4.2 and formula (4.1) for a list of the strategies and their descriptions. For ${g}_{1},$ 39% will be attributed to $F\text{2}F\text{3}$ and 61% to $Web\to F\text{2}F\text{3+}.$ For ${g}_{2},$ 96% will be attributed to $Web\to Tel\text{2+}$ and 4% to $Web\to Tel\text{2}.$ For ${g}_{3},$ 96% will be attributed to $F\text{2}F\text{3+}$ and the rest to the other strategies. For ${g}_{4},$ 71% will be attributed to $F\text{2}F\text{3+,}$ 22% to $Tel2$ and the rest to the other strategies. For ${g}_{5},$ 45% will be attributed to $Web\to Tel\text{2+,}$ 31% to $Tel2$ and the rest to the other strategies. For ${g}_{6},$ 43% will be attributed to $Web\to Tel\text{2+,}$ 29% to $Web\to F\text{2}F\text{3}$ and 29% to $Web\to F\text{2}F\text{3+}.$ For ${g}_{7},$ 65% will be attributed to $F2F3\text{+}$ and 35% to $Web\to F\text{2}F\text{3+}.$ For ${g}_{8},$ 100% will be attributed to $Web\to Tel2\text{+}.$ For ${g}_{9},$ 100% will be attributed to $Web\to F\text{2}F\text{3}\text{.}$

Description for Figure 4.2

Figure that can be seen as a matrix where each row represents one of the strategies in ${\mathcal{S}}^{R}$ and each column one of the nine strata, ${g}_{1}$ to ${g}_{9},$ described in Section 4.3. See Section 4.2 and formula (4.1) for a list of the strategies and their descriptions. For ${g}_{1},$ 60% will be attributed to $Web\to Tel\text{2+,}$ 22% to $F\text{2}F\text{3}$ and the rest to the other strategies. For ${g}_{2},$ 39% will be attributed to $Web\to Tel\text{2+,}$ 20% to $Web\to F\text{2}F\text{3+}$ and the rest to the other strategies. For ${g}_{3},$ 81% will be attributed to $F\text{2}F\text{3+}$ and the rest to the other strategies. For ${g}_{4},$ 77% will be attributed to $F\text{2}F\text{3,}$ 20% to $Tel2$ and the rest to the other strategies. For ${g}_{5},$ 98% will be attributed to $Web\to Tel\text{2}$ and 2% to $Web\to F\text{2}F\text{3+}.$ For ${g}_{6},$ 99% will be attributed to $Web$ and 1% to $Web\to F\text{2}F\text{3}\text{.}$ For ${g}_{7},$ 45% will be attributed to $F2F3\text{+,}$ 41% to $Tel2$ and the rest to the other strategies. For ${g}_{8},$ 56% will be attributed to $F2F3\text{+}$ and the rest to the other strategies. For ${g}_{9},$ 85% will be attributed to $F\text{2}F\text{3+}$ and the rest to the other strategies.

The impact of the available budget can be seen very clearly for ${S}_{\text{max}}=\text{12,000}$ and ${\text{BM}}_{1},$ where the minimal overall method effect drops from 0.10% for $B=\text{160,000}$ to 0.01% for $B=\text{180,000}.$ The optimal allocation probabilities are shown in Figures 4.3 and 4.4. When increasing the budget, a shift takes place from telephone only strategies to a mix of face-to-face only strategies and, somewhat surprisingly, Web only strategies.

Description for Figure 4.3

Figure that can be seen as a matrix where each row represents one of the strategies in ${\mathcal{S}}^{R}$ and each column one of the nine strata, ${g}_{1}$ to ${g}_{9},$ described in Section 4.3. See Section 4.2 and formula (4.1) for a list of the strategies and their descriptions. For ${g}_{1},$ 45% will be attributed to $Tel2\text{+,}$ 42% to $F2F3$ and the rest to the other strategies. For ${g}_{2},$ 94% will be attributed to $Tel\text{2+}$ and the rest to the other strategies. For ${g}_{3},$ 40% will be attributed to $F\text{2}F\text{3,}$ 28% to $Tel2$ and the rest to the other strategies. For ${g}_{4},$ 88% will be attributed to $F\text{2}F\text{3+}$ and the rest to the other strategies. For ${g}_{5},$ 62% will be attributed to $Tel2+,$ 36% to $Web\to Tel2\text{+}$ and 2% to $Tel2.$ For ${g}_{6},$ 79% will be attributed to $Web\to Tel2\text{+}$ and the rest to the other strategies. For ${g}_{7},$ 80% will be attributed to $Tel2\text{+}$ and the rest to the other strategies. For ${g}_{8},$ 47% will be attributed to $Web\to Tel2+,$ 44% to $Tel2\text{+}$ and the rest to the other strategies. For ${g}_{9},$ 59% will be attributed to $Web\to Tel2\text{,}$ 21% to $Tel2\text{+}$ and the rest to the other strategies.

Description for Figure 4.4

Figure that can be seen as a matrix where each row represents one of the strategies in ${\mathcal{S}}^{R}$ and each column one of the nine strata, ${g}_{1}$ to ${g}_{9},$ described in Section 4.3. See Section 4.2 and formula (4.1) for a list of the strategies and their descriptions. For ${g}_{1},$ 58% will be attributed to $F\text{2}F\text{3+}$ and 42% to $Web.$ For ${g}_{2},$ 100% will be attributed to $Web\to Tel\text{2+}\text{.}$ For ${g}_{3},$ 67% will be attributed to $F\text{2}F\text{3+,}$ 28% to $Web\to F\text{2}F\text{3}$ and 5% to $F\text{2}F\text{3}\text{.}$ For ${g}_{4},$ 73% will be attributed to $F\text{2}F\text{3,}$ 22% to $Web\to F\text{2}F\text{3+}$ and the rest to the other strategies. For ${g}_{5},$ 57% will be attributed to $F2F3,$ 32% to $Tel2\text{+}$ and the rest to the other strategies. For ${g}_{6},$ 100% will be attributed to $Web.$ For ${g}_{7},$ 55% will be attributed to $F2F3\text{+}$ and 45% to $Web\to F2F3.$ For ${g}_{8},$ 63% will be attributed to $Web,$ 31% to $F2F3\text{+}$ and 6% to $Web\to Tel2\text{+}.$ For ${g}_{9},$ 25% will be attributed to $Web\to F2F3\text{+,}$ 21% to $Tel2,$ 20% to $Web,$ 20% to $F\text{2}F\text{3+}$ and the rest to any of the other strategies.

A range of scenarios can be investigated using a wide range of threshold values, which we leave to other papers. We conclude by mentioning that optimal allocations with many small allocation probabilities lead to very intractable data collection processes. Lower thresholds to the allocation probabilities may be added to avoid strategies that get only small numbers of cases.
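A lower threshold on allocation probabilities can be checked with a few lines of code. The sketch below lists the cells of an allocation that are positive but fall below a chosen minimum. The two example columns are the values reported for strata ${g}_{5}$ and ${g}_{6}$ in Figure 4.2; the threshold of 5% and the function name are hypothetical.

```python
def small_cells(allocation: dict, min_prob: float) -> list:
    """Return (stratum, strategy, probability) for cells with 0 < p < min_prob."""
    return [
        (g, s, p)
        for g, row in allocation.items()
        for s, p in row.items()
        if 0 < p < min_prob
    ]


# Allocation probabilities for two strata, taken from Figure 4.2.
allocation = {
    "g5": {"Web->Tel2": 0.98, "Web->F2F3+": 0.02},
    "g6": {"Web": 0.99, "Web->F2F3": 0.01},
}

flagged = small_cells(allocation, min_prob=0.05)  # hypothetical 5% threshold
```

In an optimization, the same condition would enter as a constraint that forces each allocation probability to be either zero or at least the threshold, rather than as a check after the fact.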

## 4.6 Robustness of optimal designs

In this section, we briefly discuss the robustness of the optimal designs. Sensitivity analyses are beyond the scope of this paper and are part of current research.

In the estimation of the response propensities, telephone registration propensities, costs per sample unit and adjusted method effects, we make four main assumptions, apart from assumptions about the logistic link function between response $-$ nonresponse, telephone registration $-$ no registration and auxiliary variables. These are:

1. Model for $Web\to F2F3$ and $Web\to F2F3\text{\hspace{0.17em}}+\text{\hspace{0.17em}}:$ these two strategies have only been employed for households without a listed phone number.
2. Strategies with cap on calls estimated using censoring: The strategies with a cap on calls have not been conducted and we assume that their response propensities and costs can be approximated by censoring strategies with the full contact strategy.
3. Costs linear in size allocated to strategies: We assume that costs per sample unit do not depend on the size of the sample allocated to a strategy.
4. Time stability of method effects during 2010 $-$ 2012: Since the parallel runs were performed in two steps, the method effects for some strategies were estimated in two steps. We implicitly assume that the method effects for these designs have not changed over 2010 $-$ 2012.
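The third assumption implies that expected total costs are a simple weighted sum over strata and strategies. A minimal sketch, with hypothetical sample sizes and unit costs and illustrative names:

```python
def expected_cost(n_sampled: dict, allocation: dict, unit_cost: dict) -> float:
    """Expected total cost when costs per sample unit are constant.

    n_sampled[g]: number of sampled units in stratum g.
    allocation[g][s]: probability that a sampled unit in g gets strategy s.
    unit_cost[g][s]: cost per sample unit, independent of the allocated size.
    """
    return sum(
        n_sampled[g] * p * unit_cost[g][s]
        for g, row in allocation.items()
        for s, p in row.items()
    )


# Hypothetical inputs for a single stratum (not LFS cost figures).
cost = expected_cost(
    n_sampled={"g1": 100},
    allocation={"g1": {"Web": 0.5, "F2F3+": 0.5}},
    unit_cost={"g1": {"Web": 2.0, "F2F3+": 30.0}},
)
```

If unit costs did depend on the allocated sample size, for example through interviewer travel, the sum would no longer be linear in the allocation probabilities.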

Furthermore, all estimated input parameters are subject to sampling variation. Consequently, we expect that certain variations in the optimal designs might occur due to inaccuracy of parameters. In order to assess robustness of optimal designs we propose two types of sensitivity analysis:

$•$ Repeated optimization for input parameters obtained from resampled data. In other words, all historical data are resampled multiple times and for each draw an optimization is performed. The resulting optimal values for quality and costs as well as the strategy composition of the optimal designs can thus be compared across the various draws.

$•$ Performance evaluation of the optimal design on resampled data. In other words, given observed historical data, an optimization is performed. All historical data are then resampled and for each draw the optimization input parameters are recomputed. The optimal design is applied to each set of input parameters and the corresponding quality and cost values are computed. Finally, the statistical properties of quality and cost values are assessed across all draws of input parameters.
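Both types of sensitivity analysis share the same resampling loop and differ only in whether the design is re-optimized per draw. The sketch below makes this explicit. The functions `estimate_params`, `optimize` and `evaluate` are placeholders supplied by the user, and the toy problem at the end (pick the strategy with the smallest absolute mean effect) only serves to make the sketch runnable; it is not the optimization problem of this paper.

```python
import random
import statistics


def sensitivity(data: dict, estimate_params, optimize, evaluate,
                n_draws: int = 200, seed: int = 0) -> tuple:
    """Return the spread of the objective under both sensitivity analyses."""
    rng = random.Random(seed)
    fixed_design = optimize(estimate_params(data))  # optimum on observed data
    reoptimized, fixed = [], []
    for _ in range(n_draws):
        draw = {k: rng.choices(v, k=len(v)) for k, v in data.items()}
        params = estimate_params(draw)
        # Type 1: repeat the optimization for each resampled parameter set.
        reoptimized.append(evaluate(optimize(params), params))
        # Type 2: evaluate the fixed optimal design on the resampled parameters.
        fixed.append(evaluate(fixed_design, params))
    return statistics.stdev(reoptimized), statistics.stdev(fixed)


# Toy problem with hypothetical method effects per strategy.
toy = {"Web": [0.4, 0.6, 0.5, 0.7], "F2F": [0.1, -0.1, 0.2, 0.0]}
est = lambda d: {s: statistics.mean(v) for s, v in d.items()}
opt = lambda p: min(p, key=lambda s: abs(p[s]))
ev = lambda design, p: abs(p[design])

sd_reoptimized, sd_fixed = sensitivity(toy, est, opt, ev)
```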

Exploratory sensitivity analyses show that there is relatively large variation in the strategy composition of the optimal designs, but that optimal method effects ${\overline{D}}^{\text{BM}}$ are very stable. This implies that the method effect, as objective function, is a relatively smooth function.
