A method to find an efficient and robust sampling strategy under model uncertainty
Section 2. Optimal strategy under the superpopulation model
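Let $U$ be a finite population of size $N$ with elements labeled $\{1, 2, \ldots, N\}$. Let $\mathbf{x}_k = (x_{1k}, x_{2k}, \ldots, x_{Jk})$ be a known vector of values of $J$ auxiliary variables and $y_k$ the unknown value of a study variable associated with unit $k \in U$. We are interested in the estimation of the total of $y$, $t_y = \sum_{k \in U} y_k$. Let $\mathcal{S}$ be the power set of $U$. A sample is any subset $s \in \mathcal{S}$, and a sampling design is a probability distribution on $\mathcal{S}$, denoted by $p(\cdot)$ or simply $p$. Let $\pi_k = P(k \in s)$ be the inclusion probability of $k$ and $\pi_{kl} = P(k \in s, l \in s)$ the joint inclusion probability of $k$ and $l$. A probability sampling design is a sampling design such that $\pi_k > 0$ for all $k \in U$. An estimator is a real-valued function of the sample, $\hat{t} = \hat{t}(s)$. By strategy we refer to the pair of sampling design and estimator, $(p, \hat{t})$.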
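We consider only probability sampling designs with fixed sample size. As a convenient stepping stone we begin by considering unbiased linear estimators of the form

$$\hat{t}_{\mathrm{dif}} = \sum_{k \in U} z_k + \sum_{k \in s} \frac{e_k}{\pi_k} \qquad (2.1)$$

with $z_k$ arbitrary known constants and $e_k = y_k - z_k$. This estimator is called the difference estimator. The estimator defined in this way is said to be calibrated on $z$ as it satisfies $\hat{t}_{\mathrm{dif}} = \sum_{k \in U} z_k$ for every sample $s$ whenever $y_k = z_k$ for all $k \in U$. Note that if $z_k = 0$ for all $k \in U$ the estimator reduces to $\hat{t}_{\pi} = \sum_{k \in s} y_k / \pi_k$, that is, the Horvitz-Thompson estimator (Horvitz and Thompson, 1952). In later sections we focus on the generalized regression estimator (GREG).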
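The difference estimator and its Horvitz-Thompson special case can be sketched in a few lines of Python. This is only an illustration: the function names, the toy population and the equal inclusion probabilities are assumptions made for the example and do not come from the paper.

```python
def difference_estimator(sample, y, z, pi):
    """Sum of z over the population plus the pi-weighted sample sum of y_k - z_k.

    sample: indices of the sampled units (hypothetical toy input)
    y, z, pi: lists indexed by unit; only y[k] for k in sample is used
    """
    total_z = sum(z)
    correction = sum((y[k] - z[k]) / pi[k] for k in sample)
    return total_z + correction


def horvitz_thompson(sample, y, pi):
    """Horvitz-Thompson estimator: pi-weighted sample sum of y."""
    return sum(y[k] / pi[k] for k in sample)


# Toy population of N = 5 units (illustrative values only).
y = [3.0, 5.0, 8.0, 13.0, 21.0]
z = [2.0, 6.0, 7.0, 14.0, 20.0]
pi = [0.4] * 5          # e.g. n = 2 out of N = 5 with equal probabilities
sample = [1, 3]

t_dif = difference_estimator(sample, y, z, pi)
t_ht = horvitz_thompson(sample, y, pi)
# With z_k = 0 the difference estimator coincides with Horvitz-Thompson.
t_dif_zero = difference_estimator(sample, y, [0.0] * 5, pi)
# Calibration on z: feeding z in place of y returns the total of z.
t_cal = difference_estimator(sample, z, z, pi)
```

The last two lines mirror the two properties stated in the text: setting every $z_k$ to zero recovers the Horvitz-Thompson estimator, and applying the estimator to $z$ itself returns the population total of $z$ whatever the sample.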
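The design Mean Squared Error (MSE) of the difference estimator is

$$\mathrm{MSE}_p(\hat{t}_{\mathrm{dif}}) = \sum_{k \in U} \sum_{l \in U} \Delta_{kl} \frac{e_k}{\pi_k} \frac{e_l}{\pi_l} \qquad (2.2)$$

with $\Delta_{kl} = \pi_{kl} - \pi_k \pi_l$ and $\pi_{kk} = \pi_k$. As mentioned in the introduction, due to the non-existence of an optimal strategy under the design-based approach, often a superpopulation model, $\xi$, is proposed and we search for an optimal strategy with respect to the anticipated mean-squared error, $E_{\xi} \mathrm{MSE}_p(\hat{t})$. We may assume that the $y$-values are realizations of the following model, denoted $\xi_0$,

$$Y_k = f(\mathbf{x}_k \mid \boldsymbol{\delta}_1) + \epsilon_k \qquad (2.3)$$

with $E_{\xi_0}(\epsilon_k) = 0$, $V_{\xi_0}(\epsilon_k) = \sigma_0^2 g^2(\mathbf{x}_k \mid \boldsymbol{\delta}_2)$ and $E_{\xi_0}(\epsilon_k \epsilon_l) = 0$ for $k \neq l$, where $\boldsymbol{\delta}_1$ and $\boldsymbol{\delta}_2$ are vectors of parameters. The random sample $s$ and the errors $\epsilon_k$ are assumed to be independent. Following Rosén (2000), the terms $f(\mathbf{x}_k \mid \boldsymbol{\delta}_1)$ and $g(\mathbf{x}_k \mid \boldsymbol{\delta}_2)$ will be called trend and spread, respectively. The term trend should not in general be understood in a temporal sense; rather, it refers to the development of the $y$-values with $\mathbf{x}$.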
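The double-sum expression for the design MSE of the difference estimator, $\sum_k \sum_l (\pi_{kl} - \pi_k \pi_l)\, e_k e_l / (\pi_k \pi_l)$ with $e_k = y_k - z_k$, can be checked numerically on a population small enough to enumerate every sample. The sketch below assumes simple random sampling without replacement, chosen only because its inclusion probabilities are known in closed form. The function names and data are hypothetical.

```python
from itertools import combinations


def design_mse_formula(e, pi, pi_joint):
    """Double sum over (pi_kl - pi_k pi_l) e_k e_l / (pi_k pi_l), with pi_kk = pi_k."""
    N = len(e)
    total = 0.0
    for k in range(N):
        for l in range(N):
            p_kl = pi[k] if k == l else pi_joint[k][l]
            delta_kl = p_kl - pi[k] * pi[l]
            total += delta_kl * (e[k] / pi[k]) * (e[l] / pi[l])
    return total


def design_mse_enumeration(y, z, n):
    """Exact design MSE under simple random sampling, by listing all samples."""
    N = len(y)
    samples = list(combinations(range(N), n))
    prob = 1.0 / len(samples)      # every sample of size n is equally likely
    pi_k = n / N
    t_y = sum(y)
    mse = 0.0
    for s in samples:
        t_hat = sum(z) + sum((y[k] - z[k]) / pi_k for k in s)
        mse += prob * (t_hat - t_y) ** 2
    return mse


# Toy data (illustrative only): N = 5, n = 2.
y = [3.0, 5.0, 8.0, 13.0, 21.0]
z = [2.0, 6.0, 7.0, 14.0, 20.0]
N, n = len(y), 2
e = [yk - zk for yk, zk in zip(y, z)]
pi = [n / N] * N
joint = n * (n - 1) / (N * (N - 1))          # pi_kl under simple random sampling
pi_joint = [[joint] * N for _ in range(N)]

mse_formula = design_mse_formula(e, pi, pi_joint)
mse_exact = design_mse_enumeration(y, z, n)
```

Both routes give the same value for this toy population. Enumeration is only feasible for very small $N$, which is why the closed-form double sum is the practical tool.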
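Note that under $\xi_0$, $e_k$ in the difference estimator (2.1) is a random variable that represents the distance between the value of the study variable and $z_k$, i.e., $e_k = Y_k - z_k = f(\mathbf{x}_k \mid \boldsymbol{\delta}_1) - z_k + \epsilon_k$. Therefore $E_{\xi_0}(e_k) = f(\mathbf{x}_k \mid \boldsymbol{\delta}_1) - z_k$ and $V_{\xi_0}(e_k) = \sigma_0^2 g^2(\mathbf{x}_k \mid \boldsymbol{\delta}_2)$. With some algebra, it can be seen from (2.2) and (2.3) that the anticipated MSE of the difference estimator becomes

$$E_{\xi_0} \mathrm{MSE}_p(\hat{t}_{\mathrm{dif}}) = \sum_{k \in U} \sum_{l \in U} \Delta_{kl} \frac{f(\mathbf{x}_k \mid \boldsymbol{\delta}_1) - z_k}{\pi_k} \frac{f(\mathbf{x}_l \mid \boldsymbol{\delta}_1) - z_l}{\pi_l} + \sigma_0^2 \sum_{k \in U} \left( \frac{1}{\pi_k} - 1 \right) g^2(\mathbf{x}_k \mid \boldsymbol{\delta}_2) \qquad (2.5)$$

where $\Delta_{kl} = \pi_{kl} - \pi_k \pi_l$. Nedyalkova and Tillé (2008) derive the anticipated MSE in a more general case. Tillé and Wilhelm (2017) give the anticipated MSE of the Horvitz-Thompson estimator. The second term in (2.5) is the Godambe-Joshi lower bound (e.g., Särndal, Swensson and Wretman, 1992, page 453).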
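The algebra behind the anticipated MSE can be spelled out as follows. This is a sketch, not the authors' own derivation. It assumes model errors that are uncorrelated across units, have mean zero and variance $\sigma_0^2 g_k^2$, and are independent of the sample. The shorthand $\mu_k$ and $g_k$ is introduced here for readability.

```latex
% Shorthand (introduced for this sketch):
%   \mu_k = f(\mathbf{x}_k \mid \boldsymbol{\delta}_1) - z_k,
%   g_k   = g(\mathbf{x}_k \mid \boldsymbol{\delta}_2),
%   \Delta_{kl} = \pi_{kl} - \pi_k \pi_l, \quad \pi_{kk} = \pi_k .
\begin{align*}
E_{\xi_0}(e_k e_l)
  &= \mu_k \mu_l + \mathbb{1}\{k = l\}\, \sigma_0^2 g_k^2, \\
E_{\xi_0} \mathrm{MSE}_p(\hat{t}_{\mathrm{dif}})
  &= \sum_{k \in U} \sum_{l \in U} \Delta_{kl}\,
     \frac{E_{\xi_0}(e_k e_l)}{\pi_k \pi_l} \\
  &= \sum_{k \in U} \sum_{l \in U} \Delta_{kl}\,
     \frac{\mu_k}{\pi_k} \frac{\mu_l}{\pi_l}
     + \sigma_0^2 \sum_{k \in U} \frac{\Delta_{kk}}{\pi_k^2}\, g_k^2 \\
  &= \sum_{k \in U} \sum_{l \in U} \Delta_{kl}\,
     \frac{\mu_k}{\pi_k} \frac{\mu_l}{\pi_l}
     + \sigma_0^2 \sum_{k \in U} \left( \frac{1}{\pi_k} - 1 \right) g_k^2 ,
\end{align*}
% where the last step uses \Delta_{kk} = \pi_k (1 - \pi_k).
```

The first term is the design variance of the Horvitz-Thompson estimator applied to $\mu_k$, so it is non-negative and depends on the estimator only through $z_k$. The second term depends on the design only through the first-order inclusion probabilities.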
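The anticipated MSE in (2.5) is the sum of two non-negative terms. It is easy to see that if

1. the estimator is calibrated on the trend, that is, $z_k = f(\mathbf{x}_k \mid \boldsymbol{\delta}_1)$ for all $k \in U$, the first term vanishes and the anticipated MSE equals the Godambe-Joshi lower bound.

Furthermore, after imposing the fixed sample size restriction $\sum_{k \in U} \pi_k = n$, if

2. the design is such that $\pi_k = n\, g(\mathbf{x}_k \mid \boldsymbol{\delta}_2) / \sum_{l \in U} g(\mathbf{x}_l \mid \boldsymbol{\delta}_2)$, that is, the inclusion probabilities are proportional to the spread, the second term is minimized and we obtain

$$E_{\xi_0} \mathrm{MSE}_p = \sigma_0^2 \left[ \frac{1}{n} \left( \sum_{k \in U} g(\mathbf{x}_k \mid \boldsymbol{\delta}_2) \right)^2 - \sum_{k \in U} g^2(\mathbf{x}_k \mid \boldsymbol{\delta}_2) \right].$$
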
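Condition 2 can be illustrated with a short Python sketch that computes inclusion probabilities proportional to the spread and evaluates the Godambe-Joshi term $\sigma_0^2 \sum_k (1/\pi_k - 1) g_k^2$. The function names, the spread values and the comparison with equal probabilities are assumptions made for the example. The sketch also assumes that no unit needs an inclusion probability above one.

```python
def spread_proportional_probabilities(g, n):
    """Inclusion probabilities pi_k = n * g_k / sum(g) for a fixed sample size n."""
    total = sum(g)
    pi = [n * gk / total for gk in g]
    if max(pi) > 1.0:
        # Assumption of this sketch: no unit requires pi_k > 1.
        raise ValueError("some pi_k exceed 1; n is too large for these spread values")
    return pi


def godambe_joshi_term(g, pi, sigma2=1.0):
    """Second term of the anticipated MSE: sigma2 * sum((1/pi_k - 1) * g_k^2)."""
    return sigma2 * sum((1.0 / p - 1.0) * gk ** 2 for gk, p in zip(g, pi))


# Hypothetical spread values for N = 4 units and a sample of size n = 2.
g = [1.0, 2.0, 3.0, 4.0]
n = 2

pi_opt = spread_proportional_probabilities(g, n)
bound_opt = godambe_joshi_term(g, pi_opt)

# Closed form at the optimum: sigma2 * ((sum g)^2 / n - sum g^2).
closed_form = sum(g) ** 2 / n - sum(gk ** 2 for gk in g)

# Equal inclusion probabilities with the same expected sample size, for comparison.
pi_equal = [n / len(g)] * len(g)
bound_equal = godambe_joshi_term(g, pi_equal)
```

For these toy values the spread-proportional design attains the closed-form minimum, and the equal-probability design gives a larger second term, as the minimization argument predicts.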
Conditions 1 and 2 suggest the specific roles of the design and the
estimator in the sampling strategy. The estimator should “explain” the trend in
the calibration sense of condition 1. The design should “explain” the spread. A
strategy that satisfies conditions 1 and 2 simultaneously will be called
optimal. In the same sense, any estimator and any design satisfying,
respectively, condition 1 and 2, will be called optimal. As this strategy plays
an important role in this paper, we will denote it by
ISSN : 1492-0921
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, 2021
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa