Comparison of the conditional bias and Kokic and Bell methods for Poisson and stratified sampling
Section 3. Review of methods based on conditional bias
3.1 Definition
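The conditional bias of an estimator $\hat{\theta}$ for the parameter $\theta$ for a unit $i$ was defined in the framework of sampling theory by Moreno-Rebollo et al. (1999) as follows:
$$B_{1i}^{\hat{\theta}}(I_i = 1) = E_p(\hat{\theta} - \theta \mid I_i = 1), \qquad (3.1)$$
$$B_{0i}^{\hat{\theta}}(I_i = 0) = E_p(\hat{\theta} - \theta \mid I_i = 0), \qquad (3.2)$$
where $I_i$ is the sample membership indicator of unit $i$ and $E_p$ denotes expectation under the sampling design. The conditional bias of a sampled unit is equal to the average of the difference between $\hat{\theta}$ and $\theta$ over the set of samples containing that unit. Similarly, the conditional bias of an unsampled unit is equal to the average of the sampling error over all samples not containing that unit.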
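In the case of a one-phase sampling design, the conditional bias of the Horvitz-Thompson estimator $\hat{t}_{HT} = \sum_{j \in S} y_j / \pi_j$ of the total $t_y = \sum_{j \in U} y_j$, associated with a sampled unit $i$, is defined by
$$B_{1i}^{HT}(I_i = 1) = E_p(\hat{t}_{HT} - t_y \mid I_i = 1) = \sum_{j \in U} \left( \frac{\pi_{ij}}{\pi_i \pi_j} - 1 \right) y_j, \qquad (3.3)$$
where $\pi_{ij}$ designates the joint inclusion probability of units $i$ and $j$ in the sample. Conditional bias (3.3) is, in general, unknown since the values of the variable of interest are only observed for the units in the sample. In practice, it is possible to estimate it without bias, or in a robust way, from the sample. We consider the conditionally unbiased estimator (see, for example, Beaumont et al., 2013):
$$\hat{B}_{1i}^{HT}(I_i = 1) = \sum_{j \in S} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_j \pi_{ij}} \, y_j. \qquad (3.4)$$
This estimator is conditionally unbiased in the sense that $E_p\{\hat{B}_{1i}^{HT}(I_i = 1) \mid I_i = 1\} = B_{1i}^{HT}(I_i = 1)$, but only if the $\pi_{ij}$ are strictly positive. Moreover, conditional bias (3.3) and its estimator (3.4) depend on the inclusion probabilities $\pi_i$ and the joint inclusion probabilities $\pi_{ij}$. In other words, conditional bias is a measure that takes the sampling design into account.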
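To make estimator (3.4) concrete, here is a minimal Python sketch. It assumes that (3.4) has the form $\sum_{j \in S} (\pi_{ij} - \pi_i \pi_j) y_j / (\pi_j \pi_{ij})$, as in Beaumont et al. (2013); the function names and the small simple random sampling example are illustrative and do not come from the article.

```python
def estimated_conditional_bias(y, pi, pi_joint):
    """Estimate the conditional bias of each sampled unit, assuming form (3.4).

    y        : values of the variable of interest for the sampled units
    pi       : first-order inclusion probabilities of the sampled units
    pi_joint : pi_joint[i][j] is the joint inclusion probability of
               sampled units i and j, with pi_joint[i][i] == pi[i]
    """
    n = len(y)
    biases = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            pij = pi_joint[i][j]
            if pij <= 0:
                # Conditional unbiasedness requires strictly positive pi_ij.
                raise ValueError("joint inclusion probabilities must be > 0")
            total += (pij - pi[i] * pi[j]) / (pi[j] * pij) * y[j]
        biases.append(total)
    return biases


def srswor_probabilities(N, n):
    """Inclusion probabilities for simple random sampling without replacement
    (hypothetical helper used only for this illustration)."""
    pi = [n / N] * n
    pij = n * (n - 1) / (N * (N - 1))
    joint = [[pi[i] if i == j else pij for j in range(n)] for i in range(n)]
    return pi, joint


# Illustrative data: a sample of n = 4 units from N = 10, one large value.
y_sample = [1.0, 2.0, 3.0, 10.0]
pi, joint = srswor_probabilities(N=10, n=4)
bias_hat = estimated_conditional_bias(y_sample, pi, joint)
```

Under simple random sampling without replacement, this general formula reduces to $(N - n)/(n - 1)$ times the deviation of $y_i$ from the sample mean, so the unit with value 10 receives by far the largest estimated conditional bias.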
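For a Poisson design, units are selected independently, so that $\pi_{ij} = \pi_i \pi_j$ for $j \neq i$ and only the term $j = i$ remains in (3.3). The conditional bias of the sampled unit $i$ is therefore given by
$$B_{1i}^{HT}(I_i = 1) = \left( \frac{1}{\pi_i} - 1 \right) y_i. \qquad (3.5)$$
Unlike the case of other sampling designs, such as simple random sampling without replacement, conditional bias (3.5) is known directly for all sample units and does not require estimation from the sample because it does not depend on any unknown parameter of the finite population.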
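As a small illustration of the Poisson case, the sketch below evaluates the conditional bias of each sampled unit, assuming that (3.5) is $(1/\pi_i - 1) y_i$. The data and the function name are made up for the example.

```python
def poisson_conditional_bias(y, pi):
    """Conditional bias of each sampled unit under Poisson sampling,
    assuming form (3.5): (1 / pi_i - 1) * y_i."""
    return [(1.0 / p - 1.0) * yi for yi, p in zip(y, pi)]


# Illustrative sample: the third unit combines a large value
# with a small inclusion probability (that is, a large weight).
y_poisson = [10.0, 20.0, 500.0]
pi_poisson = [0.5, 0.5, 0.1]
cb_poisson = poisson_conditional_bias(y_poisson, pi_poisson)
```

Only the observed value and the inclusion probability of the unit itself are needed, which is why no estimation step is required here. A unit is influential when both its value and its weight are large.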
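Conditional bias, as demonstrated by Beaumont et al. (2013), is a direct measure of the influence of each unit on the variance and on the estimation error, the second relation being verified for maximum entropy sampling designs:
$$V_p(\hat{t}_{HT}) = \sum_{i \in U} B_{1i}^{HT}(I_i = 1) \, y_i, \qquad (3.6)$$
$$\hat{t}_{HT} - t_y \simeq \sum_{i \in S} B_{1i}^{HT}(I_i = 1) + \sum_{i \in U \setminus S} B_{0i}^{HT}(I_i = 0). \qquad (3.7)$$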
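Both relations can be checked numerically under Poisson sampling. The sketch below assumes that they take the form $V_p(\hat{t}_{HT}) = \sum_{i \in U} B_{1i} y_i$ and $\hat{t}_{HT} - t_y = \sum_{i \in S} B_{1i} + \sum_{i \in U \setminus S} B_{0i}$, with $B_{1i} = (1/\pi_i - 1) y_i$ and $B_{0i} = -y_i$ for a Poisson design. The population values and the function names are hypothetical.

```python
import random


def poisson_sample(pi, rng):
    """Draw a Poisson sample: each unit is selected independently."""
    return [rng.random() < p for p in pi]


def check_relations(y, pi, in_sample):
    """Return the gaps in the two assumed relations (both should be ~0)."""
    t_y = sum(y)
    t_ht = sum(yi / p for yi, p, s in zip(y, pi, in_sample) if s)

    # Conditional biases under Poisson sampling (assumed forms).
    b1_sum = sum((1.0 / p - 1.0) * yi
                 for yi, p, s in zip(y, pi, in_sample) if s)
    b0_sum = sum(-yi for yi, s in zip(y, in_sample) if not s)
    error_gap = (t_ht - t_y) - (b1_sum + b0_sum)

    # Design variance of the Horvitz-Thompson estimator under Poisson sampling.
    variance = sum(p * (1.0 - p) * (yi / p) ** 2 for yi, p in zip(y, pi))
    variance_from_bias = sum((1.0 / p - 1.0) * yi * yi for yi, p in zip(y, pi))
    return error_gap, variance - variance_from_bias


# Hypothetical population of five units.
y_pop = [12.0, 7.0, 150.0, 30.0, 4.0]
pi_pop = [0.2, 0.5, 0.9, 0.4, 0.1]
rng = random.Random(2018)
in_sample = poisson_sample(pi_pop, rng)
error_gap, variance_gap = check_relations(y_pop, pi_pop, in_sample)
```

Under these assumptions both gaps vanish up to rounding error, whichever sample is drawn, since the identities hold sample by sample for a Poisson design.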
3.2 A robust estimator based on conditional bias
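As shown by formulas (3.6) and (3.7), the conditional bias (CB) measures the effect of each unit on the estimation error and the estimation variance. A robust estimator should therefore be defined in such a way that the conditional bias of each sample observation remains bounded. Based on this idea, Beaumont et al. (2013) suggested using an estimator of the form:
$$\hat{t}_{RHT}(c) = \hat{t}_{HT} - \sum_{i \in S} \hat{B}_{1i}^{HT}(I_i = 1) + \sum_{i \in S} \psi_c\{\hat{B}_{1i}^{HT}(I_i = 1)\},$$
with $\psi_c$ the Huber function defined by
$$\psi_c(x) = \begin{cases} c & \text{if } x > c, \\ x & \text{if } |x| \le c, \\ -c & \text{if } x < -c, \end{cases}$$
and $\hat{B}_{1i}^{HT}(I_i = 1)$ the estimator defined in (3.4).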
The Huber function is used to limit the influence of the most influential units by truncating their conditional bias at $\pm c$. The parameter $c$ can be chosen according to various optimization criteria for the robust estimator. For example, $c$ can be chosen to obtain the estimator having, under the sampling design, the smallest mean square error. However, it is relatively complex, and sometimes impossible, to obtain an analytical expression of this value of $c$ for a given sampling design.
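The truncation step can be sketched in Python as follows. This is only an illustration: it assumes the robust estimator has the form $\hat{t}_{HT} - \sum_{i \in S} \hat{B}_i + \sum_{i \in S} \psi_c(\hat{B}_i)$, where $\hat{B}_i$ is the estimated conditional bias of unit $i$ and $\psi_c$ truncates its argument to $[-c, c]$. The data are a made-up Poisson sample, for which the conditional biases are $(1/\pi_i - 1) y_i$.

```python
def huber(x, c):
    """Huber function: leaves x unchanged on [-c, c], truncates it outside."""
    return max(-c, min(c, x))


def robust_total(y, pi, bias_hat, c):
    """Robust estimator of the total for a given tuning constant c
    (assumed form: HT estimator with truncated conditional biases)."""
    t_ht = sum(yi / p for yi, p in zip(y, pi))
    return t_ht - sum(bias_hat) + sum(huber(b, c) for b in bias_hat)


# Made-up Poisson sample with one influential unit.
y_rob = [10.0, 20.0, 500.0]
pi_rob = [0.5, 0.5, 0.1]
bias_rob = [(1.0 / p - 1.0) * yi for yi, p in zip(y_rob, pi_rob)]

t_ht_rob = sum(yi / p for yi, p in zip(y_rob, pi_rob))
t_robust_100 = robust_total(y_rob, pi_rob, bias_rob, c=100.0)
t_robust_large = robust_total(y_rob, pi_rob, bias_rob, c=1e9)
```

With a very large $c$ nothing is truncated and the Horvitz-Thompson estimate is returned unchanged; as $c$ decreases, the contribution of the influential unit is reduced, which trades variance for bias.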
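Beaumont et al. (2013) suggest choosing
$$c_{\min\max} = \arg\min_{c > 0} \, \max_{i \in S} \left| \hat{B}_{1i}^{RHT}(I_i = 1; c) \right|,$$
i.e., the value of the constant $c$ for which the largest absolute value of the estimated conditional bias for the sample observations on the robust estimator is the lowest. In this case, the robust estimator is equal to:
$$\hat{t}_{RHT} = \hat{t}_{HT} - \frac{1}{2} \left( \hat{B}_{\min} + \hat{B}_{\max} \right),$$
where $\hat{B}_{\min}$ and $\hat{B}_{\max}$ are the smallest and largest values of the estimated conditional biases $\hat{B}_{1i}^{HT}(I_i = 1)$ in the sample.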
The Beaumont, Haziza and Ruiz-Gazen estimator is thus simple to implement. Compared to the Kokic and Bell method, it is more general because it is valid for all sampling designs and does not require any information from outside the sample. In addition, it does not rely on any assumptions about the variable of interest. The resulting estimator is robust with respect to the sampling design alone, whereas the Kokic and Bell estimator relies on both the sampling design and the distribution of the variable of interest. However, it is not designed to have the smallest mean square error, but to limit the influence of each unit by minimizing the influence of the most influential unit.
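The simplicity of the estimator can be seen in a few lines of Python. The sketch assumes the closed form $\hat{t}_{HT} - (\hat{B}_{\min} + \hat{B}_{\max})/2$ given by Beaumont et al. (2013) for the minimax choice of $c$, with $\hat{B}_{\min}$ and $\hat{B}_{\max}$ the smallest and largest estimated conditional biases in the sample. The function name and the data are illustrative.

```python
def minimax_robust_total(y, pi, bias_hat):
    """Robust estimator under the minimax choice of c (assumed closed form):
    HT estimate minus the midrange of the estimated conditional biases."""
    t_ht = sum(yi / p for yi, p in zip(y, pi))
    return t_ht - 0.5 * (min(bias_hat) + max(bias_hat))


# Made-up Poisson sample; conditional biases are (1 / pi_i - 1) * y_i.
y_mm = [10.0, 20.0, 500.0]
pi_mm = [0.5, 0.5, 0.1]
bias_mm = [(1.0 / p - 1.0) * yi for yi, p in zip(y_mm, pi_mm)]

t_ht_mm = sum(yi / p for yi, p in zip(y_mm, pi_mm))
t_minimax = minimax_robust_total(y_mm, pi_mm, bias_mm)
```

Only the sample values, the inclusion probabilities and the estimated conditional biases are used: no auxiliary information or model for the variable of interest is needed, and no numerical search for $c$ is required.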
The method has been extended to integrate more elements of the sampling design and to adapt to certain situations. Favre-Martinoz et al. (2016) extended the method to two-phase sampling designs, which makes it possible to take non-response into account when it is treated as a second phase of Poisson sampling; Favre-Martinoz et al. (2015) proposed a method for ensuring the consistency of the robust estimators obtained when the parameters of interest are the totals of a variable in nested domains.
ISSN : 1492-0921
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, 2018
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa