The anchoring method: Estimation of interviewer effects in the absence of interpenetrated sample assignment
2. Background
2.1 Interviewer variance
Between-interviewer variance affects survey estimates in
a manner similar to the design effects introduced by cluster sampling. One can
estimate the multiplicative increase in the total variance of an estimated mean
as $1 + (\bar{m} - 1)\rho_{\text{int}}$, where $\bar{m}$ is the average number of interviews conducted
by individual interviewers and $\rho_{\text{int}}$ is the within-interviewer correlation in
answers elicited to a particular survey question (Kish, 1965). Typical values
of 35 respondents per interviewer and 0.03 for $\rho_{\text{int}}$ would therefore double the estimated variance of the mean, relative to the variance
with $\rho_{\text{int}}$ equal to zero. Failure to account for the
within-interviewer correlation introduced by interviewer effects leads to misspecification effects (Skinner, Holt and Smith, 1989),
resulting in anti-conservative inference due to underestimation of standard
errors.
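As a minimal numerical check of this inflation factor, the short sketch below applies the Kish formula to the values cited above; the function name and code are illustrative and not from the paper.

```python
# Kish (1965) multiplicative variance inflation for an estimated mean:
# deff = 1 + (m_bar - 1) * rho_int.
def deff(m_bar: float, rho_int: float) -> float:
    """Design effect given average interviewer workload and within-interviewer correlation."""
    return 1.0 + (m_bar - 1.0) * rho_int

# The values cited in the text: 35 interviews per interviewer, rho_int = 0.03.
print(deff(35, 0.03))   # 2.02, i.e., the variance of the mean roughly doubles
```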
2.2 Estimation of interviewer variance
Researchers may wish to estimate interviewer variance
for correct statistical inference (Elliott and West, 2015), to identify
interviewers having unusual effects on data collection outcomes for purposes of
responsive survey design, or as the focus of a methodological study designed to
reduce its impact by understanding its causes (e.g., Brunton-Smith, Sturgis and
Williams, 2012; Sakshaug, Tutz and Kreuter, 2013). Interpenetrated designs,
which assign sampled cases to interviewers at random, allow for interviewer
variance to be accounted for using standard methods that account for clustering
in the observed data: generalized estimating equations (Liang and Zeger, 1986)
or mixed-effects models (Laird and Ware, 1982; Stiratelli, Laird and Ware,
1984). Temporarily ignoring sampling weights, a simple model for a
normally-distributed variable of interest that accounts for interviewer
variance is
$$y_{hij} = \mu + a_h + b_{hi} + e_{hij}, \qquad a_h \sim N(0, \sigma^2_a), \quad b_{hi} \sim N(0, \sigma^2_b), \quad e_{hij} \sim N(0, \sigma^2_e), \tag{2.1}$$
where $h$ indexes a primary sampling unit (PSU), $i$ indexes the interviewer within the PSU, and $j$ the respondent associated with the interviewer in the PSU. Assuming that all of the error terms are
independent, that there are an average of $\bar{k}$ interviewers in each of the PSUs, and that there are an average of $\bar{m}$ interviews per interviewer, the variance of
the mean estimator is approximately inflated by a factor of $1 + (\bar{m} - 1)\rho_{\text{int}} + (\bar{k}\bar{m} - 1)\rho_{\text{PSU}}$, where $\rho_{\text{int}} = \sigma^2_b / (\sigma^2_a + \sigma^2_b + \sigma^2_e)$ and $\rho_{\text{PSU}} = \sigma^2_a / (\sigma^2_a + \sigma^2_b + \sigma^2_e)$. As a practical matter, when the variance of the overall mean is the only quantity of interest, the second
stage of clustering due to an interviewer can be ignored, as in an “ultimate cluster”
design (Kalton, 1983). Treating the random effect of the PSU as $a^*_h$ with variance $\sigma^2_{a^*} = \sigma^2_a + \sigma^2_b$, the variance of the mean estimator is inflated by a factor of $1 + (\bar{k}\bar{m} - 1)\rho^*$, where $\rho^* = (\sigma^2_a + \sigma^2_b) / (\sigma^2_a + \sigma^2_b + \sigma^2_e)$.
If multiple interviewers are nested within a single PSU
as assumed in (2.1), interviewer variances can still be estimated for
methodological purposes using multistage hierarchical linear models. However,
for reasons of cost efficiency, many area probability samples require a given
interviewer to restrict their efforts to a single sampling area (e.g., the U.S.
National Survey of Family Growth; see Lepkowski, Mosher, Groves, West, Wagner
and Gu, 2013), which completely aliases the components of variance due to
interviewers and areas. Such designs preclude any type of direct estimation of interviewer
variance, although from a purely analytic perspective, accounting for
clustering using the PSU IDs in analysis will account for the additional
interviewer variance introduced.
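When multiple interviewers do work within each PSU, fitting model (2.1) is straightforward with standard mixed-model software. The sketch below is an illustrative example using simulated data and statsmodels; the package choice, variable names, and variance values are assumptions, not part of the paper.

```python
# Illustrative sketch: fitting the nested model (2.1) with a random PSU intercept
# and a variance component for interviewers nested within PSUs.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_psu, k_int, m_resp = 30, 3, 10            # PSUs, interviewers per PSU, interviews per interviewer
sd_psu, sd_int, sd_err = 0.3, 0.2, 1.0      # assumed standard deviations

rows = []
for h in range(n_psu):
    a_h = rng.normal(0, sd_psu)
    for i in range(k_int):
        b_hi = rng.normal(0, sd_int)
        for _ in range(m_resp):
            rows.append({
                "y": 10 + a_h + b_hi + rng.normal(0, sd_err),
                "psu": h,
                "interviewer": f"{h}-{i}",   # interviewer labels unique within PSU
            })
df = pd.DataFrame(rows)

# groups= gives the PSU random intercept; vc_formula adds the interviewer
# variance component within each PSU.
model = sm.MixedLM.from_formula(
    "y ~ 1",
    groups=df["psu"],
    re_formula="1",
    vc_formula={"interviewer": "0 + C(interviewer)"},
    data=df,
)
result = model.fit()
print(result.summary())   # reports the PSU and interviewer variance components
```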
For other types of surveys, and in particular telephone surveys, this “automatic” accommodation of interviewer
effects at the variance estimation stage afforded by “ultimate cluster”
approaches does not occur. A spectacular example of this is the Behavioral Risk
Factor Surveillance System (BRFSS; Centers for Disease Control, 2013), a
massive annual telephone survey sponsored by the Centers for Disease Control
that is the only Federal health survey designed to provide state-level
estimates of key health factors such as smoking rates, obesity measures, and
cancer screening. Elliott and West (2015) found no evidence that any
substantial proportion of the 1,000+ manuscripts published using BRFSS
data accounted for interviewer effects when conducting variance estimation
based on these data, despite variance inflation factors of 10 or more at the
state level for estimates such as mean self-rated health. These authors found
evidence of substantial interviewer effects for selected survey items, and
variability in the variance of these effects themselves across states, when
applying both model-based and design-based approaches to estimate the variance
(although this analysis used naïve estimators in contrast to either the
standard regression or the anchoring methods discussed here, and so may have
overestimated this variance).
Importantly, secondary analysts still cannot determine whether these components of
variance arise from sampling variability,
true measurement error introduced by the interviewers, or differential non-response
among the interviewers. Because of the design effect definition noted above,
their impact on inference can still be large even if the intra-class
correlation (ICC) is small or moderate, since interviewers typically conduct
many interviews. Thus when Groves and Magilavy (1986) found mean ICCs between
0.002 and 0.02 among 25 to 55 variables across each of nine telephone surveys
of political, health, and economic issues, the design effect would range
between 1.04 and 1.38 for studies in which interviewers average 20 interviews
each, and between 1.10 and 1.98 if interviewers average 50 interviews each.
Some outcomes can have much higher ICCs: Cernat and Sakshaug (2021) found ICCs on the
order of 0.30 for biometric measures, which would yield design effects on the
order of 15 if 50 interviews were conducted per interviewer. Although
interviewer variance studies for face-to-face data collections tend to be rare
because interpenetrated sample designs are more difficult to implement in such
settings, Schnell and Kreuter
(2005) found a median overall design effect of 2.0 in a multi-stage sample
survey on fear of crime, which was mostly attributable to interviewer effects
rather than spatial clustering. Thus the need for analysts to accommodate
interviewer effects is clear.
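The back-of-envelope design effects cited in this paragraph follow from the same formula; the short sketch below (values taken from the text, code ours) spells out the arithmetic.

```python
# Reproducing the design effects cited in the text via deff = 1 + (m - 1) * rho.
def deff(m, rho):
    return 1 + (m - 1) * rho

# Groves and Magilavy (1986): mean ICCs between 0.002 and 0.02.
for m in (20, 50):
    print(m, round(deff(m, 0.002), 2), "to", round(deff(m, 0.02), 2))
# 20 interviews per interviewer: 1.04 to 1.38; 50 interviews: 1.10 to 1.98.

# Cernat and Sakshaug (2021): ICCs near 0.30 for biometric measures.
print(round(deff(50, 0.30), 1))   # 15.7, i.e., design effects on the order of 15
```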
2.3 Accounting for interviewer variance in inference in the absence of interpenetration
As noted in Section 2.2, when interviewers are nested within PSUs, standard methods of variance estimation based on “ultimate clusters” (Kalton, 1983) that account for the dependence of observations within a PSU will “automatically” absorb measurement error due to interviewers into the within-PSU correlation. However, whenever interviewers are not nested within PSUs, as can occur in some area probability samples where interviewers cross sampling unit segments (e.g., O’Muircheartaigh and Campanelli, 1998; Vassallo, Durrant and Smith, 2017), clustering induced by interviewer effects must be accounted for directly. In such situations, cross-classified random effects models (Rasbash and Goldstein, 1994) of the form
$$y_{hij} = \mu + a_h + b_i + e_{hij}, \qquad a_h \sim N(0, \sigma^2_a), \quad b_i \sim N(0, \sigma^2_b), \quad e_{hij} \sim N(0, \sigma^2_e), \tag{2.2}$$
may be employed, where $h$ indexes PSUs, $i$ indexes interviewers, and $j$ indexes interviews conducted by the $i^{\text{th}}$ interviewer (e.g., O’Muircheartaigh and
Campanelli, 1998; Schnell and Kreuter, 2005; Biemer, 2010; Durrant, Groves,
Staetsky and Steele, 2010). Extensions of these models are also possible for
non-linear link functions using generalized linear mixed models (e.g., Vassallo
et al., 2017).
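A cross-classified model of the form (2.2) can also be fit with standard mixed-model software by treating both PSU and interviewer as variance components within a single all-encompassing group. The sketch below is an illustrative example with simulated data and statsmodels; all names and values are assumptions for the example, not the authors' code.

```python
# Illustrative sketch: fitting a cross-classified random effects model (2.2),
# with PSU and interviewer effects crossed rather than nested.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n_psu, n_int = 12, 8
psu_eff = rng.normal(0, 0.3, n_psu)          # a_h
int_eff = rng.normal(0, 0.25, n_int)         # b_i

rows = []
for h in range(n_psu):
    # Each PSU is worked by several interviewers, and interviewers cross PSUs.
    for i in rng.choice(n_int, size=3, replace=False):
        for _ in range(10):
            rows.append({
                "y": 10 + psu_eff[h] + int_eff[i] + rng.normal(0, 1.0),
                "psu": h,
                "interviewer": i,
            })
df = pd.DataFrame(rows)
df["one"] = 1   # a single group, so both factors enter as crossed variance components

model = sm.MixedLM.from_formula(
    "y ~ 1",
    groups=df["one"],
    re_formula="0",   # suppress the (unidentified) random intercept for the single group
    vc_formula={"psu": "0 + C(psu)", "interviewer": "0 + C(interviewer)"},
    data=df,
)
result = model.fit()
print(result.summary())   # variance components for PSU and interviewer
```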
Unfortunately, interpenetration can fail, whether due to
differential non-response error among interviewers (West and Olson, 2010; West,
Kreuter and Jaenichen, 2013), non-random shift assignment (e.g., with daytime
interviewers more likely to interview non-working respondents), or other common
practices used to increase response rates, such as assigning experienced
interviewers to more difficult respondents (Brunton-Smith et al., 2012).
In the absence of interpenetration, standard methods for estimating interviewer
variance may capture “spurious” correlations within interviewers that have
nothing to do with interviewer-induced measurement error.
The literature is not completely devoid of approaches
for estimating (and accommodating) interviewer variance in non-interpenetrated
sample designs. Fellegi (1974), Biemer and Stokes (1985), Kleffe, Prasad and
Rao (1991), and Gao and Smith (1998) developed statistical methods for area probability
samples that assumed interpenetration for a random subset of PSUs, and a single
interviewer in each of the remaining PSUs. More recent work has considered
methods for estimation of interviewer variance in binary survey variables in
related settings of partial
interpenetration (von Sanden and Steel, 2008). Rohm, Carstensen, Fischer and Gnambs (2021)
used a two-parameter item response theory model to separate area and
interviewer effects in this partial-interpenetration setting, which de-confounds interviewer and
area effects to the extent that each interviewer recruits in multiple areas and
vice versa (although lack of random assignment within an area can still yield
some degree of variance component bias). These methods are useful for obtaining
estimates of interviewer variance separate from area homogeneity for purposes
of assessing the independent impact of such variance. However, they are not
relevant for our more general setting of interest, where interviewers may not
cross PSUs and are not working random subsamples of the full sample (i.e., no
interpenetration).
Another common method found in the literature for
grappling with the problem of non-interpenetrated sample designs when
estimating interviewer variance is adjustment for the effects of respondent-
and area- or interviewer-level covariates in multilevel models (Hox, 1994;
Schaeffer, Dykema and Maynard, 2010; West, Kreuter and Jaenichen, 2013). These
methods are largely ad hoc and rely on the assumption that the included
covariates adequately account for all sources of variability that arise from
the areas (and would thus be attributed to the interviewers if the covariates
were not accounted for). This approach suffers from two major shortcomings.
First, many studies, and especially those relying on publicly available data,
may not contain sufficient area- or interviewer-level covariate information to
adequately account for the lack of randomization in interviewer assignment.
Second, the resulting estimators condition on these covariates, and such
conditional estimators are typically not of interest; the focus is usually on
marginal estimates of descriptive parameters, such as means or totals,
or on parameters of models that do not condition on (or include) these covariates.