A method to find an efficient and robust sampling strategy under model uncertainty
Section 4. Guiding the choice of sampling design with the help of a risk measure
We have seen in Section 3 that even a simple misspecification of the working
model might result in the strategy not being optimal. It is therefore risky to
accept a given model as correct without any type of assessment. While most of
the information needed for an “objective” evaluation of the model is not
available at the design stage, it is possible to reach some degree of confidence
about the parameters in the working model that allows for comparing a set of
designs and deciding which one to implement. In this section we propose a method
to assist in the choice of the sampling design.
The model expected MSE
(3.2) in Result 1 can be viewed as a function of
and
as everything
else is available at the design stage. To begin with, let us assume that
is also known.
Then we can write
For any design,
this function can be evaluated at any
and it indicates the loss incurred by assuming
that
is the true parameter when it is, in fact,
We can assume a prior distribution on
and calculate the risk under
where
is the sample space of
The design that yields the smallest risk
should be chosen. Note that numerical integration methods (e.g., Monte Carlo
simulation methods) may be needed to evaluate the risk (4.1).
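As a rough illustration of the Monte Carlo evaluation mentioned above, the following Python sketch approximates the risk of each candidate design by averaging a loss function over draws from the prior, and then picks the design with the smallest estimated risk. Everything specific in it is an assumption made for illustration only: the names monte_carlo_risk, normal_prior and make_placeholder_loss, the working value delta_0, the prior spread, and above all the quadratic loss, which is a stand-in and not the model expected MSE (3.2).

```python
import numpy as np


def monte_carlo_risk(loss, prior_sampler, n_draws=10_000, seed=0):
    """Approximate the risk, i.e. the prior expectation of the loss,
    by the average loss over n_draws values drawn from the prior."""
    rng = np.random.default_rng(seed)
    draws = prior_sampler(rng, n_draws)
    return float(np.mean(loss(draws)))


# Hypothetical working-model parameter value and prior spread (assumptions).
delta_0 = 1.0
prior_sd = 0.25


def normal_prior(rng, size):
    """Assumed prior: normal, centered at the working value delta_0."""
    return rng.normal(loc=delta_0, scale=prior_sd, size=size)


def make_placeholder_loss(tuned_for, base):
    """Placeholder loss for one design. It is NOT expression (3.2): it only
    mimics a loss that is smallest when the true parameter equals the value
    the design was tuned for."""
    def loss(delta):
        return base + (delta - tuned_for) ** 2
    return loss


# Two hypothetical candidate designs, each represented by its loss function.
designs = {
    "design_A": make_placeholder_loss(tuned_for=1.0, base=0.10),
    "design_B": make_placeholder_loss(tuned_for=0.5, base=0.05),
}

# Estimated risk of each design, and the design with the smallest risk.
risks = {name: monte_carlo_risk(loss, normal_prior) for name, loss in designs.items()}
best_design = min(risks, key=risks.get)

for name, risk in risks.items():
    print(f"{name}: estimated risk = {risk:.4f}")
print("chosen design:", best_design)
```

In an actual application, each placeholder loss would be replaced by the model expected MSE of the corresponding design evaluated at the drawn parameter value. The prior sampler would encode the statistician's knowledge about the parameter.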
In practice,
is unknown. We
propose two ways of dealing with it. The first is to now regard the loss as a
function of
and
and calculate
the risk as above, assuming a prior on the pair
and
The second one
is to provide some “guess” about its value. This approach can use the fact that
(Proof in the Appendix)
where
and
is the correlation between
and
(In Example 3 below, we give a more
convenient expression in a special case.) Although
is unknown, for repeated surveys we do have
some previous knowledge about it. In other cases it is often possible to have
some reasonable “guess” about it.
It remains to comment on
the choice of the prior distribution
The choice of
the distribution and its parameters is subjective and is left to the
statistician. Nevertheless, it should reflect the available knowledge about the
model parameter
In particular,
should be
centered around
Its variance
should reflect how confident we are about the working model. Note that full
confidence in the working model would correspond to a density with all its mass at
in which case
the risk (4.1) would be minimized by the
design given by
condition 2 in Section 2.
It might be argued that
by introducing
an additional
source of subjectivity has been added to the choice of the sampling design. The
prior may add a certain Bayesian flavor to the process, but note that
is only needed
for choosing the design. Hence, the inference is still design-based.
Furthermore, relying on an assumed model is also subjective, since the
assumptions must be chosen, and it does involve a risk. The risk measure in
(4.1) allows that risk to be quantified.
ISSN : 1492-0921
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, 2021
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa