A note on Wilson coverage intervals for proportions estimated from complex samples
Section 2. The extension
It is not hard to generalize Wilson coverage intervals (also called “score intervals”) to complex survey data. See, for example, Kott and Carr (1997). As with the Wilson itself, one simply solves this equation for the true proportion $P$:
$$\frac{p - P}{\sqrt{P(1 - P)/n^*}} = \pm z_{1-\alpha/2}, \tag{2.1}$$
where $p$ is a consistent estimator for $P$ under probability-sampling theory, and $z_{1-\alpha/2}$ is the Normal score for $1 - \alpha/2$ given the goal is to produce a $100(1 - \alpha)\%$ coverage interval ($\alpha$ is often set at 0.05). The missing piece to equation (2.1) is $n^*$, the so-called “effective sample size”, which in the standard Wilson formulation is the sample size $n$. In our more general context, $n^* = p(1 - p)/v$, where $v$ is a consistent estimator for the variance of $p$.
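As a minimal sketch of the effective-sample-size calculation, assuming the definition $n^* = p(1 - p)/v$ (the function name and the numeric inputs are illustrative and do not come from the note):

```python
def effective_sample_size(p: float, v: float) -> float:
    """Effective sample size n* = p(1 - p) / v.

    p: estimated proportion; v: estimated variance of p.
    The ratio is only usable when p(1 - p) and v are both positive.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if v <= 0.0:
        raise ValueError("v must be positive")
    return p * (1.0 - p) / v


# Hypothetical inputs: an estimated proportion of 0.20 with variance 0.0004.
n_star = effective_sample_size(0.20, 0.0004)

# With v = p(1 - p) / n, as in the standard Wilson setting, n* equals n.
n_star_srs = effective_sample_size(0.30, 0.30 * 0.70 / 50)
```

In the second call the variance is built from a sample size of 50, so the function simply returns that sample size (up to floating-point rounding).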
In order to calculate $n^*$, we need both […] and […] to be positive. In addition, let us assume that […] for some positive […] and […] is […]. Note that the last three are always true under simple random sampling with replacement so long as […].
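Dropping […] terms, but allowing […] to be small (effectively […]), one can derive this Wilson-like interval for $P$ from equation (2.1):
$$p + \frac{z_{1-\alpha/2}^2 (1 - 2p)}{2 n^*} \pm z_{1-\alpha/2} \sqrt{v + \frac{z_{1-\alpha/2}^2}{4 n^{*2}}}. \tag{2.2}$$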
We can call this the “complex-sampling Wilson coverage interval”. WesVar (2007) computes a variant of this interval that does not drop […] terms. They are dropped here because other terms of that size will be dropped later in this note.
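For comparison, equation (2.1) can also be solved exactly, without dropping any terms. The sketch below assumes the score form $(p - P)^2 = z_{1-\alpha/2}^2 P(1 - P)/n^*$ with $n^* = p(1 - p)/v$ and solves the resulting quadratic in $P$. The function name and inputs are hypothetical, and the code is not claimed to reproduce what WesVar computes.

```python
from math import sqrt
from statistics import NormalDist


def exact_score_interval(p, v, alpha=0.05):
    """Solve (p - P)^2 = z^2 * P(1 - P) / n* exactly for P, with n* = p(1 - p) / v."""
    if not 0.0 < p < 1.0 or v <= 0.0:
        raise ValueError("need 0 < p < 1 and v > 0")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Normal score for 1 - alpha/2
    n_star = p * (1.0 - p) / v                   # effective sample size
    scale = 1.0 + z * z / n_star                 # leading coefficient of the quadratic
    center = (p + z * z / (2.0 * n_star)) / scale
    half_width = z * sqrt(v + z * z / (4.0 * n_star ** 2)) / scale
    return center - half_width, center + half_width


# Hypothetical inputs: p = 0.10 with an effective sample size of 100.
lower, upper = exact_score_interval(0.10, 0.10 * 0.90 / 100)
```

Both endpoints are roots of the quadratic, so substituting either one back into the score equation should balance the two sides up to rounding error.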
If it is reasonable to drop […] terms in deriving equation (2.2), one can also safely ignore the difference between $n$ and $n - 1$. Under simple random sampling without replacement, $n^* \approx n/(1 - f)$, where $f$ is the sampling fraction. When $f$ is very small, the distinction between with and without replacement sampling can be ignored.
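A small numerical sketch of this point follows. It assumes the usual unbiased variance estimator under simple random sampling without replacement, $v = (1 - f)\,p(1 - p)/(n - 1)$, which is an assumption made here for illustration; the sample and population sizes are hypothetical.

```python
def srswor_effective_sample_size(p, n, N):
    """n* = p(1 - p) / v when v = (1 - f) * p(1 - p) / (n - 1) and f = n / N."""
    f = n / N                                # sampling fraction
    v = (1.0 - f) * p * (1.0 - p) / (n - 1)  # assumed SRSWOR variance estimator
    return p * (1.0 - p) / v                 # algebraically (n - 1) / (1 - f)


# Hypothetical sizes: the same sample of 200 drawn from a small and a large population.
n_star_small_pop = srswor_effective_sample_size(0.3, 200, 1_000)      # f = 0.2
n_star_large_pop = srswor_effective_sample_size(0.3, 200, 1_000_000)  # f = 0.0002
```

With the large population the result is within a small fraction of a unit of $n - 1$, the value the same estimator gives under with-replacement sampling; with the small population the finite population correction raises $n^*$ noticeably.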
Observe that under simple random sampling with replacement, the denominator of the pivotal appearing on the left-hand side of equation (2.1) has no variance at all. By contrast, the denominator in the traditional Wald pivotal, $(p - P)/\sqrt{v}$, can have considerable variance, especially when $n$ or $P$ is small. That is why Wilson intervals have superior performance under simple random sampling, whether with or without replacement.
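The contrast can be illustrated with a small simulation. The sketch assumes simple random sampling with replacement, the variance estimator $v = p(1 - p)/n$, and hypothetical settings ($n = 30$, $P = 0.1$, 2,000 replicates); it is an illustration only, not a result from the note.

```python
import random
from math import sqrt
from statistics import pvariance

random.seed(12345)          # fixed seed so the run is repeatable
n, P, reps = 30, 0.1, 2000  # hypothetical settings

wald_denominators = []
for _ in range(reps):
    successes = sum(random.random() < P for _ in range(n))  # one with-replacement sample
    p = successes / n
    v = p * (1.0 - p) / n              # estimated variance of p (assumed form)
    wald_denominators.append(sqrt(v))  # Wald denominator changes from sample to sample

# The score denominator depends only on P and n, so it is the same in every replicate.
wilson_denominator = sqrt(P * (1.0 - P) / n)

wald_variance = pvariance(wald_denominators)  # variance across replicates
```

Across replicates the Wald denominator takes many different values (including zero whenever a sample contains no successes), while the score denominator is a single constant.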
That superiority carries over to complex sampling (see, for example, Kott, Andersson and Nerman, 2001), where the pivotal’s denominator is $\sqrt{P(1 - P)/n^*}$, which is likely to have less variance than $\sqrt{v}$ in most applications. For an intuition into why this is so, observe that a putative variance estimator of the form […] is minimized when […]. Under simple random sampling, whether with or without replacement, […] is exactly […]. Although the minimizing […] is not exactly equal to […] under more complex sampling designs, the optimal […] is likely to be closer to […] than to 0. It is thus not surprising that the variance of […] will usually be less than the variance of […]. Nevertheless, a slight improvement on the complex-sampling Wilson coverage interval can be made by replacing […] in equation (2.2) by […] when a consistent estimator for […] exists (see Kott et al., 2001).
As with the standard Wilson, the center of the complex-sample Wilson interval in equation (2.2) is slightly different from $p$ when $p$ is not $1/2$. Its length appears longer than the Wald’s: […]. When […], however, […]
ISSN : 1492-0921
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Minister of Industry, 2017
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: semi-annual
Ottawa