Considering interviewer and design effects when planning sample sizes
Section 4. Conclusions
Using a design
effect to select a sample size is a commonly used method to account for the
loss of efficiency that a complex sampling design might entail. However, the
design effect can be inflated by an interviewer effect in face-to-face surveys.
This inflation can lead to erroneous conclusions about the effect that complex
sampling has on the efficiency of a sampling strategy and, consequently, to a
misallocation of resources: a planned sample size based on an overestimated
design effect will be too high.
Therefore, we propose to consider both the design and the interviewer effect
simultaneously when planning a sample size. The survey effect, which we develop
in Section 2, accounts for both interviewer and PSU variance to assess the
efficiency of a survey design. Based on the survey effect, we introduce a
corrected design effect, which uses as its reference design a simple random
sample with an interviewer effect. As a result, the corrected design effect is
no longer conflated with the interviewer effect, so the decision on the sample
size can be based more directly on the effect the sampling design has on the
precision of survey estimates.
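As an illustration of this factorization, the sketch below uses Kish-type inflation factors; the group sizes, intraclass correlations, and the multiplicative form are illustrative stand-ins, not the exact quantities derived in Section 2.

```python
def kish_factor(avg_group_size, rho):
    """Kish-type inflation factor 1 + (b - 1) * rho for groups of
    average size b with intraclass correlation rho."""
    return 1.0 + (avg_group_size - 1.0) * rho

def survey_effect(b_psu, rho_psu, m_int, rho_int):
    """Sketch of a combined 'survey effect': the PSU clustering factor
    times an analogous interviewer factor (illustrative only)."""
    return kish_factor(b_psu, rho_psu) * kish_factor(m_int, rho_int)

def corrected_deff(b_psu, rho_psu, m_int, rho_int):
    """Corrected design effect: survey effect relative to an SRS
    reference that retains the interviewer effect, so the interviewer
    factor cancels and only the sampling-design part remains."""
    return survey_effect(b_psu, rho_psu, m_int, rho_int) / kish_factor(m_int, rho_int)

# 10 respondents per PSU (rho_psu = 0.02), workloads of 20 (rho_int = 0.05):
se = survey_effect(10, 0.02, 20, 0.05)   # 1.18 * 1.95 = 2.301
cd = corrected_deff(10, 0.02, 20, 0.05)  # 1.18: the clustering part alone
n_eff = 1000 / se                        # effective size of a sample of 1,000
```

Under this sketch, basing the sample size on the uncorrected survey effect (2.301) rather than the corrected design effect (1.18) would attribute most of the variance inflation to the sampling design when it is in fact driven by the interviewers.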
For ESS6, our
empirical findings in Section 3.2 show that high design effects are
related to high interviewer effects. The average corrected design effects that
we observe suggest that, for many countries in the ESS6, the sampling design
influences the variance of an estimator to a lesser degree than the
interviewers do.
The ability to estimate the corrected design effect, e.g., from historical data
as a guide for the survey planner, depends mainly on the PSU-interviewer
structure and the allocation of interviewer workloads and cluster sizes. We
find that a partially interpenetrated survey design, i.e., interpenetration at
a regional level, can be sufficient to disentangle PSU and interviewer
variance. In our simulation study, an average number of 1.5 PSUs per
interviewer, or interviewers per PSU, was enough to estimate the variance
components of the measurement model.
For actual survey data that are categorical,
this level of interpenetration might not be high enough, but a high number of
PSUs and interviewers and a large sample size might offset a low
interpenetration. For practical applications, we recommend testing via
simulation if the assumed measurement model can be estimated with the given
PSU-interviewer structure, as we did in Section 3.1.
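A lightweight pre-check of the PSU-interviewer structure can precede such a simulation. The sketch below (the function name is our own) computes the average numbers of PSUs per interviewer and interviewers per PSU from respondent-level assignments, and flags a fully nested design, in which PSU and interviewer variance cannot be disentangled; the full recommendation remains to simulate from the assumed measurement model and refit it, e.g., with lme4 in R.

```python
from collections import defaultdict

def interpenetration_summary(assignments):
    """assignments: iterable of (psu_id, interviewer_id) pairs, one per
    respondent.  Returns (avg. PSUs per interviewer, avg. interviewers
    per PSU, fully_nested), where fully_nested means every interviewer
    worked in exactly one PSU, so variance components are confounded."""
    psus_of_int = defaultdict(set)
    ints_of_psu = defaultdict(set)
    for psu, interviewer in assignments:
        psus_of_int[interviewer].add(psu)
        ints_of_psu[psu].add(interviewer)
    avg_psus = sum(map(len, psus_of_int.values())) / len(psus_of_int)
    avg_ints = sum(map(len, ints_of_psu.values())) / len(ints_of_psu)
    fully_nested = all(len(s) == 1 for s in psus_of_int.values())
    return avg_psus, avg_ints, fully_nested

# Three PSUs, three interviewers, partial interpenetration:
design = [(1, "a"), (1, "b"), (2, "b"), (2, "c"), (3, "c")]
avg_psus, avg_ints, nested = interpenetration_summary(design)  # 5/3, 5/3, False
```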
When using the survey effect and the corrected design effect for the
planning of a sample size, it can be helpful to work with the upper and lower
bounds of these statistics. In Section 2, we derive such bounds, but under
somewhat unrealistic assumptions regarding the distribution of survey weights,
interviewer workloads and PSU sizes. However, if realistic assumptions about
the concentration of survey weights, interviewer workloads and PSU sizes can be
made, then we propose to use a linear optimization, as shown in the Appendix,
to derive bounds that are of much higher practical relevance and can serve as
valuable guidance for survey planners. Generally, we recommend distributions of
interviewer workloads and PSU cluster sizes with low concentration in order to
increase the precision of survey estimates. Thus, interviewer workloads and PSU
cluster sizes should be as equal as possible for any given number of
interviewers and PSUs.
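The effect of such concentration can be illustrated with the effective cluster size b* = Σb_i²/Σb_i, known from the design-effect literature the paper draws on (e.g., Gabler, Häder and Lahiri, 1999; Lynn and Gabler, 2004); the intraclass correlation used below is illustrative.

```python
def effective_cluster_size(sizes):
    """b* = sum(b_i^2) / sum(b_i); equals the common size for equal
    clusters and grows as the sizes become more concentrated."""
    return sum(b * b for b in sizes) / sum(sizes)

def clustering_factor(sizes, rho):
    """Kish-type factor 1 + (b* - 1) * rho for unequal cluster sizes."""
    return 1.0 + (effective_cluster_size(sizes) - 1.0) * rho

# Same total workload (40 interviews, 4 interviewers), different spread:
equal  = clustering_factor([10, 10, 10, 10], 0.05)  # b* = 10   -> 1.45
skewed = clustering_factor([25, 5, 5, 5], 0.05)     # b* = 17.5 -> 1.825
```

With the same total sample size and number of clusters, the skewed allocation inflates the variance factor from 1.45 to 1.825, which is why equal workloads and cluster sizes are preferable.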
The measurement models we introduce in Section 2 are deliberately
simple, which makes them applicable to most survey designs. The only
information, besides the survey data, used to compute the estimates in Table 3.3
was the set of PSU and interviewer indicators. However, there are certain aspects of
survey measurements that could be incorporated into a practical measurement
model, such as stratification, which, in general, increases the efficiency of
an estimation strategy (Särndal et al., 1992, Section 3.7). This was
neglected in our analysis, despite the fact that many ESS6 countries used a
stratified design for their PSU sample. Gabler, Häder and Lynn (2006) develop a
design effect for estimation strategies that combine different sampling designs
for sampling domains. This approach could possibly be adapted to add a
stratification effect to the PSU variance. Furthermore, it might be plausible
to assume that interviewers differ with regard to the degree of homogeneity
that they add to their measurements. This interviewer heterogeneity could be
incorporated into a measurement model by allowing groups of interviewers to
have different interviewer variance components
(West and Elliott, 2014). However, a procedure
to classify interviewers would be needed, preferably one that relies mainly
on the survey data and not so much on information available about the
interviewers, which might differ from survey to survey.
A future application for the presented framework of the survey effect
would be to find an optimal budget allocation with respect to the number of
PSUs and interviewers, for a given effective sample size. Such an optimization
requires a cost model for the deployment of interviewers to a possible set of
PSUs. Fieldwork institutes could possibly provide the information necessary to
specify such a cost model for a particular country. Such a method could help
survey planners to conduct face-to-face surveys more effectively, which is of
increasing importance as surveys based on probability samples are under
pressure from the comparably cheap alternative of recruiting respondents from
online-access panels.
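To make the proposed optimization concrete, the sketch below grid-searches over numbers of PSUs and interviewers under a purely hypothetical linear cost model; all unit costs, intraclass correlations, and the Kish-type survey-effect approximation are our own illustrative assumptions, not estimates from any fieldwork institute.

```python
def min_sample_for_target(a, k, target, rho_psu, rho_int, n_max=100_000):
    """Smallest n (in steps of 10) with n / deff(n) >= target, where
    deff(n) = (1 + (n/a - 1) rho_psu) * (1 + (n/k - 1) rho_int);
    returns None if the target effective sample size is unreachable
    with a PSUs and k interviewers."""
    for n in range(int(target), n_max, 10):
        deff = (1 + (n / a - 1) * rho_psu) * (1 + (n / k - 1) * rho_int)
        if n / deff >= target:
            return n
    return None

def cheapest_design(target, rho_psu, rho_int,
                    c_psu=200.0, c_int=500.0, c_resp=30.0):
    """Grid search minimizing cost = c_psu * a + c_int * k + c_resp * n
    (hypothetical unit costs) subject to the effective-sample-size target."""
    best = None
    for a in range(20, 201, 20):        # number of PSUs
        for k in range(20, 201, 20):    # number of interviewers
            n = min_sample_for_target(a, k, target, rho_psu, rho_int)
            if n is None:
                continue  # target unreachable with this design
            cost = c_psu * a + c_int * k + c_resp * n
            if best is None or cost < best[0]:
                best = (cost, a, k, n)
    return best

cost, a, k, n = cheapest_design(500, 0.02, 0.05)
```

A realistic version would replace the linear cost function with one calibrated to actual fieldwork costs and the approximation with the survey effect of Section 2.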
Further research could also focus on the development of survey effects
for estimators other than the weighted sample mean. For estimators that can be
described as functions of estimated totals, which includes the ordinary least
squares estimator for regression coefficients (Särndal et al., 1992,
Section 5.10), it should be possible to derive survey effects, under the
framework shown in Section 2, that allow for a similar factorization as
the survey effect presented in this work.
Appendix
For the Appendix we introduce a shorthand notation for multiple sums,
where, for example,
will be shorthand for
Result 1
Proof: We need to show
that
and
hold, if
and
for all
and
As shown in Gabler et al. (1999), if
for all
using the Cauchy-Schwarz inequality, we know
that
If we have
for all
then it follows that
The proof for inequality (A.2) is analogous to the one above, which
completes the proof of Result 1.
Upper bounds for
and
For given
and
and
with
for all
and
we can construct an upper bound for
and
We
know that
Now we need to find a sufficiently high value for
For this we define
and
Thus we have to solve the following problem:
where
where
means rounded down to the nearest integer (the floor function).
The problem formulated in equation (A.4) can be solved using a solver for
linear programs, e.g., with the solveLP function from the R package
linprog (Henningsen, 2012). Function
gives a maximum of
given the upper and lower bounds of the
weights
and
and the fact that the weights are scaled to
i.e.,
The sum of squares is maximized by giving as
many weights their highest possible value
under the condition that each weight must have
at least a value of
and that
The problem can then be solved using a simplex
algorithm. An upper bound for
can be determined in the same fashion.
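The vertex structure described above can also be written down directly. The sketch below (our own illustration in Python, rather than the solveLP route used in the Appendix) constructs the maximizing weight vector for given bounds and a fixed weight total: as many weights as possible at the upper bound, one weight absorbing the remainder, and the rest at the lower bound.

```python
import math

def max_sum_of_squares(n_weights, total, w_lo, w_hi):
    """Maximize sum(w_i^2) subject to sum(w_i) = total and
    w_lo <= w_i <= w_hi: set k = floor((total - n*w_lo) / (w_hi - w_lo))
    weights to w_hi, one weight to the remainder, the rest to w_lo."""
    assert n_weights * w_lo <= total <= n_weights * w_hi, "infeasible bounds"
    k = min(math.floor((total - n_weights * w_lo) / (w_hi - w_lo)), n_weights)
    n_low = max(n_weights - k - 1, 0)
    weights = [w_hi] * k + [w_lo] * n_low
    if k < n_weights:
        # the one remaining weight absorbs the rest of the total
        weights.insert(k, total - k * w_hi - n_low * w_lo)
    return weights, sum(w * w for w in weights)

# Five weights scaled to sum to 5, each bounded between 0.5 and 2:
ws, ss = max_sum_of_squares(5, 5.0, 0.5, 2.0)  # [2.0, 1.5, 0.5, 0.5, 0.5]
```

Since the objective is convex, its maximum over the box-and-sum polytope is attained at such a vertex, which is the same solution a simplex-based solver would return.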
By changing the problem to a minimization, a lower bound for
can be found. However, it is not guaranteed
that separate optimization of
and
will yield values of
that allow for a value of
that jointly maximizes (or minimizes)
and
However, if
and
are the vectors that optimize
and
respectively, it should be possible to find a value for
e.g., using iterative proportional fitting.
For
we have under the same assumptions as made
above
Result 2
Proof: The upper bound
in Result 2 can be shown by using the Cauchy-Schwarz inequality, which
gives us
With some algebra we can formulate the upper bound of
To prove the lower bound in Result 2 we solve the following
problem:
A solution to the problem formulated in (A.7) can be found by
considering that if we have
and
it follows that
Thus for
we can increase
if we reduce any
by one and add one to
Hence, if
for all
and
then
is at its maximum, with
Result 4
Proof: Given Result 2,
to prove the right-hand side of Result 4 we need to show that
To prove inequality (A.8) we only need to show that
The rest follows from the proofs of inequalities (A.1) and (A.2). Thus
it is sufficient to show that
if
for all
which also follows from the Cauchy-Schwarz inequality.
Inequality (A.8) then follows if
for
and
The left-hand side of Result 4 follows from the proof of Result 6
in Gabler and Lahiri (2009) and Result 2.
ESS6 variables used for empirical evaluation
Table A.1
ESS6 variables used for empirical evaluation

pplfair   trstprt   stfdem    imueclt   iorgact
pplhlp    trstep    stfedu    imwbcnt   agea
polintr   trstun    stfhlth   happy     gndr
trstprl   lrscale   gincdif   aesfdrk
trstlgl   stflife   freehms   health
trstplc   stfeco    euftf     rlgdgr
trstplt   stfgov    imbgeco   wkdcorga
The definition of these variables including question text can be found
in ESS (2013).
References
Bates,
D.M., Mächler, M., Bolker, B.M. and Walker, S.C. (2015). Fitting linear
mixed-effects models using lme4. Journal
of Statistical Software, 67, 1, 1-48. https://doi.org/10.18637/jss.v067.i01.
Bates, D.M.,
Mächler, M., Bolker, B.M. and Walker, S.C. (2019). lme4: Linear Mixed-Effects Models Using ‘Eigen’ and S4. https://CRAN.R-project.org/package=lme4.
Beullens,
K., and Loosveldt, G. (2016). Interviewer effects in the European social survey. Survey Research Methods, 10, 2, 103-118.
Biemer, P.P.
(2010). Total survey error: Design, implementation, and evaluation. Public Opinion Quarterly, Oxford
University Press, 74, 5, 817-848.
Chambers,
R.L., and Skinner, C.J. (2003). Analysis
of Survey Data. New York: John Wiley & Sons, Inc.
Chaudhuri,
A., and Stenger, H. (2005). Survey
Sampling: Theory and Methods. CRC Press.
Davis,
P., and Scott, A. (1995). The effect of interviewer variance on domain comparisons. Survey Methodology, 21, 2, 99-106. Paper
available at https://www150.statcan.gc.ca/n1/pub/12-001-x/1995002/article/14405-eng.pdf.
Ellis,
P.D. (2010). The Essential Guide to
Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of
Research Results. Cambridge University Press.
European
Social Survey (ESS) (2013). ESS6 Data Protocol. Edition 1.4.
London: ESS ERIC. http://www.europeansocialsurvey.org/data/download.html?r=6.
European
Social Survey (ESS) (2014a). European
Social Survey Round 6 Interviewer Questionnaire. Dataset edition: 2.1.
London: ESS ERIC.
European
Social Survey (ESS) (2014b). Weighting
European Social Survey Data. London: ESS ERIC. www.europeansocialsurvey.org/docs/methodology/ESS_weighting_data_1.pdf.
European Social Survey (ESS) (2014c). ESS6
- 2012 Documentation
Report. Edition: 2.3. London: ESS ERIC. http://www.europeansocialsurvey.org/docs/round6/survey/ESS6_data_documentation_report_e02_3.pdf.
European
Social Survey (ESS) (2016). European
Social Survey Round 6 Data. Dataset edition: 2.2. London: ESS ERIC.
European
Social Survey (ESS) (2018a). Countries by
Round (Year). London: ESS ERIC. http://www.europeansocialsurvey.org/data/country_index.html.
European
Social Survey (ESS) (2018b). Data and
Documentation by Round European Social Survey (ESS). London: ESS ERIC. http://www.europeansocialsurvey.org/data/download.html?r=6.
Fahrmeir, L., Heumann, C., Künstler, R., Pigeot, I. and
Tutz, G. (1997). Statistik: Der Weg Zur
Datenanalyse. 1st ed. Berlin: Springer-Verlag.
Fischer, M., West, B.T., Elliott, M.R. and
Kreuter, F. (2018). The impact of interviewer effects
on regression coefficients. Journal of
Survey Statistics and Methodology, May. https://doi.org/10.1093/jssam/smy007.
Gabler,
S., Häder, S. and Lahiri, P. (1999). A model based justification of Kish’s formula
for design effects for weighting and clustering. Survey Methodology, 25, 1, 105-106. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/1999001/article/4718-eng.pdf.
Gabler,
S., Häder, S. and Lynn, P. (2006). Design effects for multiple design samples. Survey Methodology, 32, 1, 115-120. Paper
available at https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2006001/article/9256-eng.pdf.
Gabler,
S., and Lahiri, P. (2009). On the definition and interpretation of interviewer variability
for a complex sampling design. Survey
Methodology, 35, 1, 85-99. Paper available at https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2009001/article/10886-eng.pdf.
Ganninger,
M. (2010). Design Effects: Model-Based
Versus Design-Based Approach. Edited by
GESIS - Leibniz-Institut für Sozialwissenschaften, 3.
Genz, A.,
Bretz, F., Miwa, T., Mi, X. and Hothorn, T. (2019). mvtnorm: Multivariate Normal and t Distributions. https://CRAN.R-project.org/package=mvtnorm.
Groves,
R.M. (2009). Survey Methodology. 2nd ed. Wiley Series in Survey Methodology. Hoboken, New Jersey: John Wiley &
Sons, Inc.
Groves, R.M.,
and Lyberg, L. (2010). Total survey error: Past, present, and future. Public Opinion Quarterly, Oxford
University Press, 74, 5, 849-879.
Henningsen, A. (2012). linprog: Linear Programming/Optimization.
https://CRAN.R-project.org/package=linprog.
Kish,
L. (1962). Studies of interviewer variance for attitudinal variables. Journal of the American Statistical
Association, 57, 297, 92-115. https://doi.org/10.1080/01621459.1962.10482153.
Kish,
L. (1965). Survey Sampling. New York: John
Wiley & Sons, Inc.
Lohr,
S.L. (2014). Design effects for a regression slope in a cluster sample. Journal of Survey Statistics and Methodology,
2, 2, 97-125. https://doi.org/10.1093/jssam/smu003.
Lynn,
P., and Gabler, S. (2004). Approximations
to B* in the Prediction of Design Effects Due to Clustering. ISER Working
Paper Series.
Lynn,
P., Häder, S., Gabler, S. and Laaksonen, S. (2007). Methods for achieving equivalence
of samples in cross-national surveys: The European social survey experience. Journal of Official Statistics, 23, 1, 107.
O’Muircheartaigh,
C., and Campanelli, P. (1998). The relative impact of interviewer effects and sample
design effects on survey precision. Journal
of the Royal Statistical Society: Series A (Statistics in Society), 161, 1,
63-77.
Raudenbush,
S.W. (1993). A crossed random effects model for unbalanced data with applications
in cross-sectional and longitudinal research. Journal of Educational Statistics, 18, 4, 321-349. https://doi.org/10.2307/1165158.
R Core Team (2019). R: A Language and Environment for Statistical Computing. Vienna,
Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Särndal,
C.-E., Swensson, B. and Wretman, J. (1992). Model
Assisted Survey Sampling. New York: Springer-Verlag.
Scheipl, F., Greven, S. and Kuechenhoff,
H. (2008). Size and power of tests for a zero random effect variance or polynomial
regression in additive and linear mixed models. Computational Statistics & Data Analysis, 52, 7, 3283-3299.
Schnell,
R., and Kreuter, F. (2005). Separating interviewer and sampling-point effects. Journal of Official Statistics, 21, 3, 389-410.
The
ESS Sampling Expert Panel (2016). Sampling
Guidelines: Principles and Implementation for the European Social Survey.
London: ESS ERIC Headquarters. http://www.europeansocialsurvey.org/docs/round8/methods/ESS8_sampling_guidelines.pdf.
Vassallo,
R., Durrant, G. and Smith, P. (2017). Separating interviewer and area effects by
using a cross-classified multilevel logistic model: Simulation findings and implications
for survey designs. Journal of the Royal Statistical
Society: Series A (Statistics in Society), 180, 2, 531-550.
Von
Sanden, N.D. (2004). Interviewer Effects
in Household Surveys: Estimation and Design. Ph.D. Thesis, Wollongong:
University of Wollongong. http://ro.uow.edu.au/theses/312.
West,
B.T., and Blom, A.G. (2017). Explaining interviewer effects: A research synthesis. Journal of Survey Statistics and
Methodology, 5, 2, 175-211. https://doi.org/10.1093/jssam/smw024.
West,
B.T., and Elliott, M.R. (2014). Frequentist and Bayesian approaches for comparing
interviewer variance components in two groups of survey interviewers. Survey Methodology, 40, 2, 163-188. Paper available at https://www150.statcan.gc.ca/n1/en/pub/12-001-x/2014002/article/14092-eng.pdf.