Multiple-frame surveys for a multiple-data-source world
Section 3. Estimation in classical multiple-frame surveys
The main problem for inference in a classical multiple-frame survey, that is, one designed to satisfy Assumptions (A1) to (A6), is how to account for potential overlap among
the samples. In the NSAF, telephone households were screened out of the area
sample, but in many applications screening is infeasible or it is more
cost-effective to obtain data from the full sample selected from each frame.
When separate surveys or data sources are not designed with data combination in
mind, the overlap depends on the coverage of the individual data sources.
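With an overlap design, units that are contained in more than one frame have multiple chances of being selected in the sample. Let $\mathcal{S}_q$ denote the sample from Frame $q$ and $w_i^{(q)}$ the survey weight of unit $i$ in that sample. An estimator constructed by summing the weighted observations from each of the $Q$ samples, $\sum_{q=1}^{Q} \sum_{i \in \mathcal{S}_q} w_i^{(q)} y_i$, will be a biased estimator of the population total $Y$ because the individual sample weights do not reflect the multiple chances of selection for units in overlap domains. Methods for estimating population totals thus typically multiply the survey weights $w_i^{(q)}$ by a multiplicity adjustment $m_i^{(q)}$ that satisfies $\sum_{q:\, i \in \text{Frame } q} m_i^{(q)} = 1$ for each unit $i$, resulting in the estimator
$$\hat{Y} = \sum_{q=1}^{Q} \sum_{i \in \mathcal{S}_q} \tilde{w}_i^{(q)} y_i, \tag{3.1}$$
where $\tilde{w}_i^{(q)} = m_i^{(q)} w_i^{(q)}$ is the multiplicity-adjusted weight.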
3.1 Hartley’s composite estimator
Hartley (1962) was the first author to present a rigorous theory of
estimation in dual-frame surveys where units in the overlap domain {1, 2}
might be sampled from both frames. This four-page paper made several important
contributions. First, Hartley defined the problem in statistical terms. Second,
he proposed an optimal estimator for combining the estimates from the two
surveys. And third, he studied the design problem of allocating the resources
to the different samples, with a joint consideration of the allocation and the
estimator that minimize the variance of the estimated population total subject
to a fixed cost.
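Hartley (1962) estimated the population total $Y$ by
$$\hat{Y}(\theta) = \hat{Y}_{\{1\}}^{(1)} + \theta\, \hat{Y}_{\{1,2\}}^{(1)} + (1-\theta)\, \hat{Y}_{\{1,2\}}^{(2)} + \hat{Y}_{\{2\}}^{(2)}, \tag{3.2}$$
where $\hat{Y}_{d}^{(q)}$ denotes the estimated total for domain $d$ from the Frame $q$ sample. He proposed choosing $\theta$ to minimize $V[\hat{Y}(\theta)]$. This resulted in the value
$$\theta_{\text{opt}} = \frac{V(\hat{Y}_{\{1,2\}}^{(2)}) + \text{Cov}(\hat{Y}_{\{2\}}^{(2)}, \hat{Y}_{\{1,2\}}^{(2)}) - \text{Cov}(\hat{Y}_{\{1\}}^{(1)}, \hat{Y}_{\{1,2\}}^{(1)})}{V(\hat{Y}_{\{1,2\}}^{(1)}) + V(\hat{Y}_{\{1,2\}}^{(2)})}. \tag{3.3}$$
The estimator in (3.2) is of the form in (3.1) with multiplicity weight adjustments $m_i^{(1)} = \theta$ and $m_i^{(2)} = 1 - \theta$ for units in the overlap domain $\{1,2\}$, and $m_i^{(q)} = 1$ for units in domains $\{1\}$ and $\{2\}$.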
If it is desired to use the optimal compositing factor $\theta_{\text{opt}}$, estimators may be substituted for the unknown covariances in (3.3). Because $\theta_{\text{opt}}$ depends on covariances involving $y$, however, the optimal multiplicity adjustment may differ for different variables, giving a different set of weights for each. In addition, $\theta_{\text{opt}}$ can be less than 0 or greater than 1, possibly resulting in negative weights for some observations. These features carry over to the $Q$-frame generalization of Hartley’s optimal estimator studied by Lohr and Rao (2006).
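As a rough numerical illustration of Hartley's compositing and of the variance-minimizing factor, the following Python sketch works from domain-level inputs. The function and argument names are hypothetical, the variances and covariances are assumed to be supplied by the analyst (in practice they would be estimates), and the formula coded for the optimal factor is the minimizer of the variance of the composite under independent samples (Assumption (A3)); it is a sketch, not a reference implementation.

```python
def hartley_theta_opt(var_12_s1, var_12_s2, cov_1_12_s1, cov_2_12_s2):
    """Compositing factor minimizing the variance of the dual-frame composite.

    Inputs (all assumed known or estimated elsewhere):
      var_12_s1   : variance of the overlap-domain total estimated from sample 1
      var_12_s2   : variance of the overlap-domain total estimated from sample 2
      cov_1_12_s1 : covariance, within sample 1, of the domain {1} and {1,2} totals
      cov_2_12_s2 : covariance, within sample 2, of the domain {2} and {1,2} totals
    Independence of the two samples is assumed, so no cross-sample covariances appear.
    """
    return (var_12_s2 + cov_2_12_s2 - cov_1_12_s1) / (var_12_s1 + var_12_s2)


def hartley_estimate(theta, y_1, y_12_s1, y_12_s2, y_2):
    """Composite total: overlap-domain estimates are blended with weights theta, 1 - theta."""
    return y_1 + theta * y_12_s1 + (1.0 - theta) * y_12_s2 + y_2


# Made-up inputs: sample 1 estimates the overlap domain more precisely (variance 4 vs 12),
# so it receives the larger share of the compositing weight.
theta = hartley_theta_opt(4.0, 12.0, 0.0, 0.0)
total = hartley_estimate(theta, 100.0, 50.0, 58.0, 30.0)

# A positive within-sample covariance can push the factor below 0.
theta_negative = hartley_theta_opt(4.0, 0.0, 2.0, 0.0)
print(theta, total, theta_negative)
```

With these made-up inputs the factor is 0.75 in the first case and negative in the second, which illustrates how the optimal factor can fall outside the interval from 0 to 1.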
The estimator in (3.2), with fixed value of $\theta$, is approximately unbiased for $Y$ under Assumption (A5). If the estimated domain totals and the estimates of the covariances in (3.3) are consistent, then the estimator with the estimated compositing factor $\hat{\theta}_{\text{opt}}$ is consistent for $Y$. Saegusa (2019) studied Hartley’s estimator from the perspective of empirical process theory, establishing a law of large numbers and a central limit theorem when $\mathcal{S}_1$ and $\mathcal{S}_2$ are both simple random samples.
Hartley’s application was
in agriculture, and many of the early applications of dual-frame surveys were
for agriculture or business surveys (Kott and Vogel, 1995), where list
frames existed that contained the larger business or agricultural operations. A
dual-frame survey with a disproportionately larger sample from the list frame
reduced costs because (1) obtaining data from an operation in the list frame
was often less expensive than obtaining data from an operation in the area
frame and (2) oversampling the list frame was analogous to oversampling
high-variance strata in stratified sampling and thus resulted in greater
efficiency.
Later, as cellular
telephones became more prevalent, concern about bias from using landline
telephone samples alone led to use of dual-frame telephone surveys, with one
sample from a landline frame and a second sample from a cellular telephone
frame. Here, both frames are incomplete but together cover the population of
persons with telephones. For these surveys, an important consideration is how
to deal with persons having both kinds of telephones. The next section reviews
choices for the compositing.
3.2 Multiplicity weighting adjustments
Hartley’s optimal estimator, with the estimated compositing factor $\hat{\theta}_{\text{opt}}$, uses a different set of weights for each response variable, which can lead to internal inconsistencies among estimators. Various authors have proposed estimators that use a single set of weights for all analyses. Here, I briefly list some of the multiplicity adjustment factors $m_i^{(q)}$ that result in one set of weights for the general estimator of the population total in (3.1).
The resulting estimators are approximately unbiased for the population total
under
Assumptions (A1), (A4), and (A5). These and additional estimators are reviewed
in detail by Lohr (2011), Lu, Peng and Sahr (2013), Ferraz and Vogel (2015), Arcos,
Rueda, Trujillo and Molina (2015),
and Baffour, Haynes, Western, Pennay,
Misson and Martinez (2016).
- Screening estimator, with $m_i^{(q)} = 1$ if unit $i$ is in none of Frames $1, \ldots, q-1$ and $m_i^{(q)} = 0$ otherwise. A unit sampled from Frame $q$ is discarded if it is in any of Frames $1, \ldots, q-1$. This estimator is automatically used with a screening design such as the NSAF; with an overlap design, its use means that some data observations are thrown away.
- Multiplicity estimator, with $m_i^{(q)} = 1/(\text{number of frames containing unit } i)$. In a dual-frame survey, this gives the estimator in (3.2) with $\theta = 1/2$. Mecatti (2007) noted that with the multiplicity estimator, Assumption (A4) can be replaced by the slightly less restrictive assumption that the number of frames containing the unit is known for each sampled unit $i$. The multiplicity estimator can also be viewed as a special case of the generalized weight share method (Deville and Lavallée, 2006) using the standardized link matrix, since the number of links to population unit $i$ is the number of frames containing that unit.
- Single-frame estimator (Bankier, 1986; Kalton and Anderson, 1986), which considers the observations as if they had been sampled from a single frame. If inverse probability weights are used, with $w_i^{(q)} = 1/\pi_i^{(q)}$, where $\pi_i^{(q)}$ is the inclusion probability of unit $i$ in the Frame $q$ sample, then $m_i^{(q)} = \pi_i^{(q)} \big/ \sum_{k:\, i \in \text{Frame } k} \pi_i^{(k)}$. This estimator requires that the inclusion probability for unit $i$ be known for all $Q$ frames, including frames from which the unit was not sampled. The multiplicity adjustments consider the inclusion probabilities for the designs but not the relative variances, which are affected by clustering and stratification in the individual samples.
- Effective sample size (ESS) estimator (Chu, Brick and Kalton, 1999; O’Muircheartaigh and Pedlow, 2002), where the domain estimator from each frame is weighted by the relative effective sample size from that frame. Let $n_q$ be the sample size from Frame $q$ and let $\text{deff}_q$ denote the design effect for a key variable or a smoothed design effect for multiple variables. The effective sample size for $\mathcal{S}_q$, the sample from Frame $q$, is $\tilde{n}_q = n_q / \text{deff}_q$, and the multiplicity adjustment for unit $i$ is $m_i^{(q)} = \tilde{n}_q \big/ \sum_{k:\, i \in \text{Frame } k} \tilde{n}_k$. This estimator considers the relative variances of estimators from different samples and is often more efficient than the screening, multiplicity, and single-frame estimators.
  The pseudo-maximum-likelihood (PML) estimator of Skinner and Rao (1996) is of this type when the frame sizes $N_1$, $N_2$ and the domain sizes $N_{\{1\}}$, $N_{\{1,2\}}$, $N_{\{2\}}$ are unknown; Skinner and Rao (1996) recommended using the design effect for estimating $N_{\{1,2\}}$ to establish the effective sample size for the dual-frame case. The PML estimator is asymptotically equivalent to an ESS estimator that poststratifies to the domain sizes when those are known; when the frame sizes $N_1$ and $N_2$ are known but not $N_{\{1,2\}}$, the PML estimator is asymptotically equivalent to calibrating the ESS estimator to estimated domain sizes calculated from the pseudo-likelihood function.
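The adjustment factors listed above can be sketched for a single unit as follows. This is a minimal Python illustration rather than an implementation from any of the cited papers: the function names are hypothetical, frames are assumed to be labeled by integers, and frame membership, inclusion probabilities, sample sizes and design effects are assumed to be known inputs.

```python
def screening_adjustment(q, frames_with_unit):
    """1 if Frame q is the first (lowest-numbered) frame containing the unit, else 0.

    q is assumed to be one of the frames in frames_with_unit.
    """
    return 1.0 if q == min(frames_with_unit) else 0.0


def multiplicity_adjustment(q, frames_with_unit):
    """Reciprocal of the number of frames containing the unit (same for every q)."""
    return 1.0 / len(frames_with_unit)


def single_frame_adjustment(q, incl_prob):
    """incl_prob maps each frame containing the unit to its inclusion probability."""
    return incl_prob[q] / sum(incl_prob.values())


def ess_adjustment(q, frames_with_unit, n, deff):
    """Share of the effective sample size n/deff among the frames containing the unit."""
    eff = {k: n[k] / deff[k] for k in frames_with_unit}
    return eff[q] / sum(eff.values())


# Made-up unit in the overlap domain of a dual-frame survey.
frames = {1, 2}
pi = {1: 0.01, 2: 0.03}
n = {1: 1000, 2: 600}
deff = {1: 2.0, 2: 1.5}

adjustments = {
    "screening": [screening_adjustment(q, frames) for q in (1, 2)],
    "multiplicity": [multiplicity_adjustment(q, frames) for q in (1, 2)],
    "single_frame": [single_frame_adjustment(q, pi) for q in (1, 2)],
    "ess": [ess_adjustment(q, frames, n, deff) for q in (1, 2)],
}
for name, m in adjustments.items():
    print(name, m, sum(m))
```

In each case the adjustments for the unit sum to 1 across the frames that contain it, which is the condition that keeps the combined estimator approximately unbiased.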
Approximately unbiased
estimates of the variances for all estimators considered in this section can be
derived under Assumptions (A1) to (A6) and additional regularity conditions
that ensure consistency of estimated totals and variance estimators from the
samples. Skinner and Rao (1996) studied linearization variance estimators; Chauvet (2016) derived linearization variance estimators for the French housing survey
that accounted for the variance reduction due to high sampling fractions from
some of the frames. Lohr and
Rao (2000) developed theory for using the
jackknife with multiple frames, and Lohr (2007) and Aidara (2019)
considered bootstrap variance estimators. These methods rely on Assumption (A3)
of independent samples; Chauvet
and de Marsac (2014) considered the
situation in which the samples share primary sampling units but independent
samples are taken at the second stage of the design.
Calculating linearization variance estimates requires special software that implements the partial derivative calculations for the multiple frames. Replication variance estimation methods such as the jackknife and bootstrap, however, can be calculated in standard survey software by creating a single data set that contains all the concatenated observations and multiplicity-adjusted weights $\tilde{w}_i^{(q)}$ from the $Q$ samples and creating replicate weights using standard methods for stratified multistage samples (Metcalf and Scott, 2009). The concatenated data set has $\sum_{q=1}^{Q} H_q$ strata, where $H_q$ is the number of strata for $\mathcal{S}_q$, the sample from Frame $q$; observations from different samples are in different strata. The replicate weight methods also can include effects of calibration (see Section 3.3) on the variance.
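The concatenation idea can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the record layout and function names are hypothetical, the weights are taken to be already multiplicity-adjusted, every stratum is assumed to contain at least two primary sampling units (PSUs), and a delete-one-PSU stratified jackknife is used as one standard replication method among several.

```python
from collections import defaultdict


def concatenate_samples(samples):
    """Stack the samples, relabeling strata and PSUs so that samples never share a stratum."""
    rows = []
    for q, sample_rows in samples.items():
        for r in sample_rows:
            rows.append({
                "stratum": (q, r["stratum"]),
                "psu": (q, r["stratum"], r["psu"]),
                "w": r["w"],
                "y": r["y"],
            })
    return rows


def jackknife_total(rows):
    """Estimated total, delete-one-PSU jackknife variance, and number of strata."""
    psus = defaultdict(set)
    for r in rows:
        psus[r["stratum"]].add(r["psu"])
    full = sum(r["w"] * r["y"] for r in rows)
    variance = 0.0
    for h, psu_ids in psus.items():
        n_h = len(psu_ids)
        for dropped in psu_ids:
            replicate = 0.0
            for r in rows:
                if r["psu"] == dropped:
                    continue  # replicate weight is 0 for the deleted PSU
                # remaining PSUs in the same stratum are scaled up; other strata unchanged
                factor = n_h / (n_h - 1) if r["stratum"] == h else 1.0
                replicate += factor * r["w"] * r["y"]
            variance += (n_h - 1) / n_h * (replicate - full) ** 2
    return full, variance, len(psus)


# Two made-up samples that happen to reuse the stratum label "A".
samples = {
    1: [{"stratum": "A", "psu": 1, "w": 10.0, "y": 1.0},
        {"stratum": "A", "psu": 2, "w": 10.0, "y": 3.0}],
    2: [{"stratum": "A", "psu": 1, "w": 5.0, "y": 2.0},
        {"stratum": "A", "psu": 2, "w": 5.0, "y": 4.0}],
}
rows = concatenate_samples(samples)
total, variance, n_strata = jackknife_total(rows)
print(total, variance, n_strata)
```

Tagging each stratum with its sample label keeps the two samples in separate strata even though they reuse the label "A", so the concatenated data set has one stratum per sample-by-stratum combination.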
Of course, many
applications call for estimates of quantities other than population totals, and
the multiple-frame theory applies to parameters that are smooth functions of
domain totals. A different compositing factor may be desired when quantities
other than population totals are of primary interest, however, and there may be
special considerations for other types of analyses. Other types of statistical
analyses that have been studied in the multiple-frame setting include linear (Lu, 2014b) and nonparametric (Lu, Fu and Zhang, 2021) regression, logistic
regression with ordinal data (Rueda, Arcos, Molina and Ranalli, 2018),
empirical distribution functions (Arcos, Martínez, Rueda and Martínez, 2017),
gross flow estimation with missing data (Lu and Lohr, 2010), and chi-squared tests (Lu, 2014a).
Lu (2014b) noted that linear regression parameters estimated using the multiplicity-adjusted weights estimate the finite population regression coefficients $\mathbf{B}$ that minimize the sum of squares $\sum_{i=1}^{N} (y_i - \mathbf{x}_i^{T}\mathbf{B})^2$. However, one of
the reasons for taking a multiple-frame survey, rather than using an incomplete
frame, is a concern that population characteristics may differ across domains. Lu (2014b) suggested examining the residuals separately by domain and also fitting
separate regression models by domain to assess the appropriateness of the
regression model.
3.3 Calibration
The PML estimator is calibrated to population counts that are known for the frames and domains. In a dual-frame survey where the frame sizes $N_1$ and $N_2$ are known, the PML estimate of the Frame $q$ population size equals $N_q$ for $q = 1, 2$. If the overlap domain size $N_{\{1,2\}}$ is also known, the PML estimator is calibrated to all three domain sizes. Skinner (1991) used calibration with the single-frame estimator, raking the estimator to the population frame counts.
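Ranalli, Arcos, Rueda and Teodoro (2016) studied general calibration theory for dual-frame surveys. They assumed that a vector of auxiliary information $\mathbf{x}_i$ is available with known population totals $\mathbf{X} = \sum_{i=1}^{N} \mathbf{x}_i$ and calculated multiple-frame generalized regression weights as
$$\tilde{w}_i^{\text{GREG}} = \tilde{w}_i \left[ 1 + c_i\, (\mathbf{X} - \hat{\mathbf{X}})^{T} \left( \sum_{j \in \mathcal{S}} \tilde{w}_j c_j \mathbf{x}_j \mathbf{x}_j^{T} \right)^{-1} \mathbf{x}_i \right], \tag{3.4}$$
where $\mathcal{S}$ is the combined sample, $\tilde{w}_i$ is the multiplicity-adjusted weight of unit $i$, $c_i$ is an arbitrary constant and $\hat{\mathbf{X}} = \sum_{j \in \mathcal{S}} \tilde{w}_j \mathbf{x}_j$ estimates $\mathbf{X}$ using the multiplicity-adjusted weights. Under regularity conditions, they showed that for the dual-frame estimator in (3.2) with fixed $\theta$, the variance of the generalized regression estimator $\hat{Y}_{\text{GREG}} = \sum_{i \in \mathcal{S}} \tilde{w}_i^{\text{GREG}} y_i$ is approximated by
$$V(\hat{Y}_{\text{GREG}}) \approx V\left( \sum_{i \in \mathcal{S}} \tilde{w}_i e_i \right), \tag{3.5}$$
where $e_i = y_i - \mathbf{x}_i^{T}\mathbf{B}$ is the residual from the population regression of $y$ on $\mathbf{x}$. The variance of the estimator depends on the residuals from the regression model just as in the single-frame case.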
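A small numerical sketch may make the calibration step concrete. The Python code below applies a standard linear (generalized regression) calibration adjustment to weights that are assumed to be already multiplicity-adjusted and concatenated across samples. It is restricted to a single auxiliary variable to stay short, the function name and inputs are hypothetical, and the particular form of the adjustment is an assumption of this sketch rather than a transcription of the formula used by Ranalli et al. (2016).

```python
def greg_weights(w, x, x_total, c=None):
    """Linear calibration of weights w so that the weighted total of x equals x_total.

    w       : multiplicity-adjusted weights for the combined sample (assumed given)
    x       : one auxiliary value per sampled unit
    x_total : known population total of x
    c       : optional positive constants, one per unit (defaults to 1)
    """
    if c is None:
        c = [1.0] * len(w)
    x_hat = sum(wi * xi for wi, xi in zip(w, x))
    denom = sum(wi * ci * xi * xi for wi, ci, xi in zip(w, c, x))
    slope = (x_total - x_hat) / denom
    return [wi * (1.0 + ci * xi * slope) for wi, ci, xi in zip(w, c, x)]


# Made-up combined sample: the weighted x total is 90, the known total is 99.
w = [10.0, 10.0, 20.0]
x = [1.0, 2.0, 3.0]
calibrated = greg_weights(w, x, 99.0)
calibrated_total = sum(wi * xi for wi, xi in zip(calibrated, x))
unchanged = greg_weights(w, x, 90.0)
print(calibrated, calibrated_total)
```

When the weighted total already matches the control total, the adjustment factor is 1 and the weights are returned unchanged; otherwise each weight moves in proportion to its auxiliary value.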
Särndal and Lundström (2005) distinguished among types of auxiliary information
that can be used in calibration. InfoU is information available at the
population level. A vector $\mathbf{x}_i$ can be considered as InfoU if the population total $\sum_{i=1}^{N} \mathbf{x}_i$ is known and $\mathbf{x}_i$ is observed for every respondent in the sample. InfoS is information available at the level of the sample, but not at the population level. Vector $\mathbf{x}_i$ is InfoS if it is known for every member of the sample, both respondents and nonrespondents, but $\sum_{i=1}^{N} \mathbf{x}_i$ is unknown.
In a multiple-frame survey, the variables available
for InfoU and InfoS may differ across frames. For the NSAF, little auxiliary
information was known for nonrespondents in the RDD sample but address-related
information (for example, characteristics of the block group) was known for all
members of the area-frame sample. The reverse may be true for a dual-frame
survey in which Frame 1 is an area frame and Frame 2 is a list frame.
The list frame may have rich information that can be used for weighting class
adjustments or calibration, while the auxiliary information for the area frame
may be restricted to information measured in the survey for which population totals
are known from an external source such as a census or population register.
Ranalli et al. (2016) allowed for differing InfoU information across the frames; some of the auxiliary variables may be known for units from all samples and for the full population, while other variables may be of the form $\delta_i^{(q)} x_i$, where $\delta_i^{(q)} = 1$ if unit $i$ is in Frame $q$ and 0 otherwise, with total $X^{(q)}$, the total of variable $x$ in Frame $q$. Calibration to frame counts $N_q$ is thus a special case of the general calibration theory.
But the differing amounts of information for the
frames may also have a bearing on the multiplicity adjustments. Suppose that
Frame 2 has rich auxiliary information for calibration while Frame 1
has little information. Calibrating the weights for the Frame 2 sample $\mathcal{S}_2$ before compositing may increase the relative effective sample size from $\mathcal{S}_2$ and thus increase the value of the multiplicity adjustment $m_i^{(2)}$ that would be used for the ESS estimator.
Haziza and Lesage (2016) argued that a two-step weighting procedure offers
several advantages for single-frame surveys with nonresponse. The first step
divides the design weight for unit $i$ by its
estimated response propensity (often calculated from InfoS information) and the
second step calibrates the nonresponse-adjusted weights to population control
totals (available from InfoU information). When there is substantial
nonresponse, weighting adjustment factors from step 1 are often much higher
than those from step 2; if the response propensity model is correct, the
weighting adjustments in step 2 converge to 1 as the sample size increases. The two-step
procedure is thus more robust toward misspecification of the calibration model.
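The two steps can be sketched for a single sample as follows. This Python example is an illustration only: the function name is hypothetical, response propensities are estimated by weighted response rates within weighting classes (one simple choice of propensity model), the calibration in step 2 is a ratio adjustment to one known total, and every weighting class is assumed to contain at least one respondent.

```python
def two_step_weights(design_w, responded, wclass, x, x_total):
    """Return (step-1 weights, final weights, step-2 calibration factor)."""
    # Step 1: estimate each class's response propensity by its weighted response rate
    # and divide respondents' design weights by it; nonrespondents get weight 0.
    w_all, w_resp = {}, {}
    for w, r, c in zip(design_w, responded, wclass):
        w_all[c] = w_all.get(c, 0.0) + w
        if r:
            w_resp[c] = w_resp.get(c, 0.0) + w
    step1 = [w * w_all[c] / w_resp[c] if r else 0.0
             for w, r, c in zip(design_w, responded, wclass)]

    # Step 2: ratio-calibrate the nonresponse-adjusted weights to the known total of x.
    estimate = sum(w * xi for w, xi in zip(step1, x))
    factor = x_total / estimate
    return step1, [w * factor for w in step1], factor


# Made-up sample of four units; class "a" has a 50 percent response rate.
design_w = [10.0, 10.0, 10.0, 10.0]
responded = [True, False, True, True]
wclass = ["a", "a", "b", "b"]
x = [1.0, 1.0, 1.0, 1.0]

step1, final, factor = two_step_weights(design_w, responded, wclass, x, 44.0)
print(step1, final, factor)
```

In this made-up example the step-1 factor for the responding unit in class "a" is 2, while the step-2 calibration factor is about 1.1, mirroring the pattern in which nonresponse adjustments are larger than calibration adjustments.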
The same considerations apply for multiple-frame
surveys. A two-step procedure, where step 1 adjusts the samples separately for
nonresponse and step 2 calibrates the combined samples, provides robustness to
the calibration model. Suppose that $\mathcal{S}_1$ has full response; $\mathcal{S}_2$ has nonresponse but the response propensities can be predicted perfectly from variable $x$. Then, performing a separate nonresponse adjustment for each sample in step 1 removes the bias for $\mathcal{S}_2$ so that Assumption (A5) is satisfied. If the data are combined first and then calibrated using (3.4), however, the calibration may change the weights for units in $\mathcal{S}_1$ in order to meet the calibration constraints, introducing bias for the estimates from $\mathcal{S}_1$ while not removing it for estimates from $\mathcal{S}_2$. More research
is needed on the ordering of steps for weight adjustments. It may be better to
perform two steps of nonresponse adjustments and calibration on each sample
separately, then adjust the weights for multiplicity, and then calibrate to
population totals (including re-calibrating on the individual frame variables).
One consequence of using an overlap estimator for a
multiple-frame survey is that the multiplicity adjustments may introduce more
weight variation, with observations belonging to one frame having much larger
weights than observations belonging to more than one frame. If, for example, a
list frame (Frame 2 in Figure 2.2(a, b)) is disproportionately
oversampled, then the sampling weights for observations in domain $\{1\}$, which are sampled only from Frame 1, may be large relative to the weights for the
other domains. Wolter, Ganesh, Copeland, Singleton and Khare (2019) suggested using a shrinkage estimator of $Y$, but the shrinkage may introduce bias; after all, the reason for using a more complicated multiple-frame design instead of just sampling from Frame 2 is to avoid potential bias from omitting domain $\{1\}$. A better
solution, if feasible, is to address the weight variation when designing the
survey, as discussed in Section 5.
3.4 Probability sample combined with census of a population subset
Lohr (2014) and Kim and Tam (2021) noted that the situation in Figure 2.2(a) includes the special case in which a probability sample $\mathcal{S}_1$ is taken from Frame 1 having full coverage, and the sample $\mathcal{S}_2$ from Frame 2 is a census of domain {1, 2}. The overlap domain is thus defined to be the units in $\mathcal{S}_2$, which may be from administrative records or a convenience sample. Although $\mathcal{S}_2$, considered by itself, may have undercoverage bias, in the multiple-frame setting the bias is eliminated by the presence of a sample from Frame 1. The units in $\mathcal{S}_2$ have $w_i^{(2)} = 1$ and represent themselves alone; they do not represent any units in other parts of the
population. When $\mathcal{S}_2$ is small, say from a small convenience sample, it will have little effect on dual-frame estimators because almost all of the population is in domain $\{1\}$. But when $\mathcal{S}_2$ is large, as may occur when Frame 2 consists of administrative records, the availability of those records may improve the precision of $\hat{Y}$ if Assumptions (A1) to (A6) are met.
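When $\mathcal{S}_2$ is a census with no measurement error, $\hat{Y}_{\{1,2\}}^{(2)} = Y_{\{1,2\}}$, the population total for domain $\{1,2\}$. The estimator in (3.2) is $\hat{Y}(\theta) = \hat{Y}_{\{1\}}^{(1)} + \theta\, \hat{Y}_{\{1,2\}}^{(1)} + (1-\theta)\, Y_{\{1,2\}}$; taking $\theta = 0$ gives
$$\hat{Y}(0) = \hat{Y}_{\{1\}}^{(1)} + Y_{\{1,2\}}, \tag{3.6}$$
which uses the known population total from Frame 2 and relies on Frame 1 only for estimation of the part of the population not in Frame 2.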
Kim and Tam (2021) noted that since $Y_{\{1,2\}}$ is known, it can be used as an InfoU calibration total. They proposed two calibration estimators: a ratio estimator and a generalized regression calibration estimator. For many designs, however, the ratio estimator will be less efficient than $\hat{Y}(0)$ from (3.6) because the ratio adjustment can introduce extra variability from $\hat{Y}_{\{1,2\}}^{(1)}$ that is excluded from $\hat{Y}(0)$. Calibrating to $Y_{\{1,2\}}$ with $x_i = \delta_i^{(2)} y_i$, where $\delta_i^{(2)} = 1$ if unit $i$ is in Frame 2 and 0 otherwise, and applying the generalized regression weights in (3.4) results in $\hat{Y}(0)$
from (3.6). Similarly, calibrating on the
vector
results in
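For some designs, the variance can be reduced even further. Montanari (1987, 1998) proposed using the regression coefficient $\mathbf{B}_{\text{opt}} = [V(\hat{\mathbf{X}})]^{-1} \text{Cov}(\hat{\mathbf{X}}, \hat{Y})$ for calibration, resulting in the estimator
$$\hat{Y}_{\text{opt}} = \hat{Y} + (\mathbf{X} - \hat{\mathbf{X}})^{T} \mathbf{B}_{\text{opt}}. \tag{3.8}$$
Rao (1994) called (3.8) the optimal regression estimator and showed that it is asymptotically at least as efficient as the generalized regression estimator. For the dual-frame situation considered in this section, with $x_i = \delta_i^{(2)} y_i$ and $\hat{X} = \hat{Y}_{\{1,2\}}^{(1)}$, $\hat{Y}_{\text{opt}} = \hat{Y}(\theta_{\text{opt}})$, where $\theta_{\text{opt}}$ is Hartley’s optimal value for $\theta$ from (3.3).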
Although we usually think of the compositing factor $\theta$ as being between 0 and 1, $\theta_{\text{opt}}$ can be outside of this range. For a conceptual example, suppose that Frame 2 is a list of children receiving food assistance at school and the sample from Frame 1 is a cluster sample of households. Then households in which one or more children are receiving food assistance have some household members in domain $\{1,2\}$ and other members in domain $\{1\}$. If $y$ exhibits high intra-household correlation, then we would expect $\hat{Y}_{\{1\}}^{(1)}$ and $\hat{Y}_{\{1,2\}}^{(1)}$ to be positively correlated. In this case, Hartley’s optimal estimator results in negative weights for units in domain $\{1,2\}$ from the probability sample.
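A short derivation may help show where the negative weights come from. It is a sketch under the assumptions of this subsection: the Frame 2 data are a census of the overlap domain measured without error, so their contribution to the composite estimator is a known constant with zero variance. The notation $\hat{Y}_{d}^{(1)}$ for the Frame 1 estimate of the domain-$d$ total and $Y_{\{1,2\}}$ for the known overlap-domain total is assumed here.

```latex
\begin{align*}
\hat{Y}(\theta) &= \hat{Y}_{\{1\}}^{(1)} + \theta\,\hat{Y}_{\{1,2\}}^{(1)} + (1-\theta)\,Y_{\{1,2\}}, \\
V[\hat{Y}(\theta)] &= V\bigl(\hat{Y}_{\{1\}}^{(1)}\bigr)
  + 2\theta\,\mathrm{Cov}\bigl(\hat{Y}_{\{1\}}^{(1)}, \hat{Y}_{\{1,2\}}^{(1)}\bigr)
  + \theta^{2}\,V\bigl(\hat{Y}_{\{1,2\}}^{(1)}\bigr), \\
\frac{d}{d\theta} V[\hat{Y}(\theta)] = 0
  \quad &\Longrightarrow \quad
  \theta_{\text{opt}} = -\,\frac{\mathrm{Cov}\bigl(\hat{Y}_{\{1\}}^{(1)}, \hat{Y}_{\{1,2\}}^{(1)}\bigr)}
                              {V\bigl(\hat{Y}_{\{1,2\}}^{(1)}\bigr)}.
\end{align*}
```

A positive covariance between the two Frame 1 domain estimates therefore gives $\theta_{\text{opt}} < 0$, and the Frame 1 observations in the overlap domain, whose weights are multiplied by $\theta_{\text{opt}}$, receive negative weights.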
Even though $\hat{Y}(\theta_{\text{opt}})$ is more efficient for special situations such as the cluster sample described above, it depends in practice on an estimate of the covariance, is optimal only for this particular $y$ variable, and
may have negative weights. Negative weights can also occur if one does optimal
calibration with auxiliary variable
in fact, that
calibration results in the estimator proposed by Fuller and Burmeister (1972).
These optimal regression estimators are sensitive to the model assumptions, and
in general I do not recommend their use.
When the Frame-2 sample is a census and Assumptions (A1) to (A6) are met, the precision of population estimates depends entirely on the design of $\mathcal{S}_1$, the probability sample from Frame 1. When the samples are not designed to be part of a multiple-frame survey (and sometimes even when they are), it is likely that one or more of the assumptions is violated. Assumptions (A4) and (A6) are particularly suspect when it is desired to combine data from surveys that were not designed with combination in mind. Even if both surveys measure unemployment, they may use different questions so that the unemployment statistics from $\mathcal{S}_1$ measure a different concept than the statistics from $\mathcal{S}_2$. Domain misclassification may also occur. A unit in the census $\mathcal{S}_2$ is known to also be in complete Frame 1, but it may be difficult to tell whether a unit in $\mathcal{S}_1$ is also in the administrative records or convenience sample that serves as $\mathcal{S}_2$. These problems are discussed in the next section.