A method to find an efficient and robust sampling strategy under model uncertainty
Section 4. Guiding the choice of sampling design with the help of a risk measure


We have seen in Section 3 that even a simple misspecification of the working model might result in the strategy $π\mathrm{ps}(δ_{2})$-$\mathrm{diff}(δ_{1})$ not being optimal. It is therefore risky to accept a given model as correct without any type of assessment. While most of the information needed for an “objective” evaluation of the model is not available at the design stage, it is possible to reach a degree of confidence about the parameters of the working model that allows a set of designs to be compared and a decision to be made about which one to implement. In this section we propose a method to assist in the choice of the sampling design.

The model expected MSE (3.2) in Result 1 can be viewed as a function of $β$ and $σ^{2},$ as everything else is available at the design stage. To begin with, let us assume that $σ^{2}$ is also known. Then we can write

$L_{p}(β) = \mathrm{MSE}_{ξ p}(β \mid x, δ, σ) = \mathrm{MSE}_{p}\left(\sum_{s} \frac{f(x_{k} \mid β_{1}) - f(x_{k} \mid δ_{1})}{π_{k}}\right) + σ^{2} \sum_{U} \left(\frac{1}{π_{k}} - 1\right) g(x_{k} \mid β_{2})^{2}.$

For any design, $p (\cdot),$ this function can be evaluated at any $β$ and it indicates the loss incurred by assuming that $δ$ is the true parameter when it is, in fact, $β .$ We can assume a prior distribution on $β,$ $h (β),$ and calculate the risk under $h,$

$R(p) = E_{h}\left(\mathrm{MSE}_{ξ p}(β \mid x, δ, σ)\right) = \int_{Θ} h(β) \, \mathrm{MSE}_{ξ p}(β \mid x, δ, σ) \, dβ, \qquad (4.1)$

where $Θ$ is the parameter space of $β$. The design that yields the smallest risk should be chosen. Note that numerical integration (e.g., Monte Carlo simulation) may be needed to evaluate the risk (4.1).
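The comparison of designs through (4.1) can be sketched numerically. The example below is only an illustration under stated assumptions, not the authors' implementation: it assumes Poisson sampling, so that the design MSE of $\sum_{s} z_{k} / π_{k}$ has the closed form $\sum_{U} (1 / π_{k} - 1) z_{k}^{2}$; it assumes the hypothetical forms $f(x \mid β_{1}) = x^{β_{1}}$ and $g(x \mid β_{2}) = x^{β_{2}}$; and it assumes independent normal priors centered at $δ$. The function names (`pips`, `loss`, `risk`), the population and all numerical values are invented for the example.

```python
import random


def f(x, b1):
    # Hypothetical mean function f(x | b1) = x**b1 (an assumption for illustration).
    return x ** b1


def g(x, b2):
    # Hypothetical spread function g(x | b2) = x**b2 (also an assumption).
    return x ** b2


def pips(x, n, power):
    """Inclusion probabilities proportional to x**power, expected sample size n."""
    size = [xk ** power for xk in x]
    total = sum(size)
    pi = [n * s / total for s in size]
    if max(pi) > 1.0:
        raise ValueError("some inclusion probability exceeds 1")
    return pi


def loss(pi, x, beta, delta, sigma2):
    """L_p(beta), assuming Poisson sampling: design MSE = sum (1/pi - 1) * z**2."""
    total = 0.0
    for xk, pk in zip(x, pi):
        w = 1.0 / pk - 1.0
        z = f(xk, beta[0]) - f(xk, delta[0])
        total += w * z * z + sigma2 * w * g(xk, beta[1]) ** 2
    return total


def risk(pi, x, delta, sigma2, prior_sd, draws=2000, seed=1):
    """Monte Carlo estimate of R(p) with independent normal priors centered at delta."""
    rng = random.Random(seed)  # same seed for every design: common random numbers
    acc = 0.0
    for _ in range(draws):
        beta = (rng.gauss(delta[0], prior_sd), rng.gauss(delta[1], prior_sd))
        acc += loss(pi, x, beta, delta, sigma2)
    return acc / draws


# Artificial population and candidate designs (all values are made up).
x = [1.0 + 9.0 * k / 199 for k in range(200)]
delta = (1.0, 1.0)
designs = {power: pips(x, 20, power) for power in (0.0, 0.5, 1.0, 1.5)}
risks = {power: risk(pi, x, delta, sigma2=1.0, prior_sd=0.1)
         for power, pi in designs.items()}
best = min(risks, key=risks.get)  # design with the smallest estimated risk
print(risks, best)
```

Using the same seed for every design means that all designs are evaluated at the same draws of $β$, which removes Monte Carlo noise from the comparison between them. For a fixed-size design the first term of the loss would need the second-order inclusion probabilities or a simulation over samples instead of the Poisson closed form.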

In practice, $σ^{2}$ is unknown. We propose two ways of dealing with this. The first is to view the loss as a function of both $β$ and $σ$ and to calculate the risk as above, assuming a joint prior on the pair $(β, σ)$. The second is to provide a “guess” about the value of $σ^{2}$. This approach can use the fact that (see the proof in the Appendix)

$σ^{2} \approx \frac{S_{f, f}}{{\bar{g}}^{2}} \left(\frac{1}{R_{f, y}^{2}} - 1\right), \qquad (4.2)$

where $S_{f, f} = \sum_{U} {(f (x_{k} | β_{1}) - \bar{f})}^{2} / N,$ $\bar{f} = \sum_{U} f (x_{k} | β_{1}) / N,$ ${\bar{g}}^{2} = \sum_{U} g {(x_{k} | β_{2})}^{2} / N$ and $R_{f, y}$ is the correlation between $f (x | β_{1})$ and $y .$ (In Example 3 below, we give a more convenient expression in a special case.) Although $R_{f, y}$ is unknown, for repeated surveys we do have some previous knowledge about it. In other cases it is often possible to have some reasonable “guess” about it.
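As a rough illustration of how (4.2) turns a guessed correlation into a value for $σ^{2}$, the sketch below evaluates the right-hand side for a given auxiliary variable. It again assumes the hypothetical forms $f(x \mid β_{1}) = x^{β_{1}}$ and $g(x \mid β_{2}) = x^{β_{2}}$, which are not taken from this section; the function name `sigma2_guess` and the input values are invented for the example.

```python
def sigma2_guess(x, b1, b2, r_fy):
    """Evaluate the right-hand side of (4.2) for a guessed correlation r_fy.

    Assumes f(x | b1) = x**b1 and g(x | b2) = x**b2 (illustrative forms only).
    """
    if not 0.0 < abs(r_fy) <= 1.0:
        raise ValueError("r_fy must be a non-zero correlation in [-1, 1]")
    N = len(x)
    f_vals = [xk ** b1 for xk in x]
    f_bar = sum(f_vals) / N
    s_ff = sum((fk - f_bar) ** 2 for fk in f_vals) / N   # S_{f,f}
    g2_bar = sum(xk ** (2 * b2) for xk in x) / N         # mean of g(x_k | b2)**2
    return s_ff / g2_bar * (1.0 / r_fy ** 2 - 1.0)


# A weaker guessed correlation implies a larger residual variance.
x = [1.0, 2.0, 3.0, 4.0]
for r in (0.9, 0.7, 0.5):
    print(r, sigma2_guess(x, b1=1.0, b2=0.0, r_fy=r))
```

Because the guess for $R_{f, y}$ is itself uncertain, it can be worth evaluating the risk at a few plausible values to check whether the ranking of the designs is sensitive to it.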

It remains to comment on the choice of the prior distribution $h (β)$. The choice of the distribution and its parameters is subjective and is made by the statistician. Nevertheless, it should reflect the available knowledge about the model parameter $β$. In particular, $h (β)$ should be centered around $β = δ$, and its variance should reflect how confident we are about the working model. Note that full confidence in the working model corresponds to a prior with all its mass at $β = δ$, in which case the risk (4.1) is minimized by the $π\mathrm{ps}$ design given by condition 2 in Section 2.
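The degenerate case can be checked with a short derivation. It is a sketch under assumptions that are not stated in this section: the design has a fixed expected sample size $n$, so that $\sum_{U} π_{k} = n$; $g(x_{k} \mid δ_{2}) > 0$ for all $k$; and no inclusion probability is forced above one. With all prior mass at $β = δ$, the difference $f(x_{k} \mid β_{1}) - f(x_{k} \mid δ_{1})$ is zero for every $k$, so the first term of the loss vanishes and

$R(p) = L_{p}(δ) = σ^{2} \sum_{U} \left(\frac{1}{π_{k}} - 1\right) g(x_{k} \mid δ_{2})^{2}.$

Only $\sum_{U} g(x_{k} \mid δ_{2})^{2} / π_{k}$ depends on the design. By the Cauchy-Schwarz inequality,

$\left(\sum_{U} g(x_{k} \mid δ_{2})\right)^{2} = \left(\sum_{U} \frac{g(x_{k} \mid δ_{2})}{\sqrt{π_{k}}} \sqrt{π_{k}}\right)^{2} \le \left(\sum_{U} \frac{g(x_{k} \mid δ_{2})^{2}}{π_{k}}\right) \left(\sum_{U} π_{k}\right) = n \sum_{U} \frac{g(x_{k} \mid δ_{2})^{2}}{π_{k}},$

with equality if and only if $π_{k}$ is proportional to $g(x_{k} \mid δ_{2})$. Under these assumptions the risk is therefore minimized by

$π_{k} = n \, \frac{g(x_{k} \mid δ_{2})}{\sum_{U} g(x_{k} \mid δ_{2})},$

that is, by inclusion probabilities proportional to $g(x_{k} \mid δ_{2})$.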

It might be argued that introducing $h (β)$ adds a further source of subjectivity to the choice of the sampling design. The prior may add a certain Bayesian flavor to the process, but note that $h (β)$ is only needed for choosing the design; the inference is still design-based. Furthermore, relying on an assumed model is itself a subjective choice, and one that involves a risk. The risk measure in (4.1) allows that risk to be quantified.

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word, to the Editor (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued co-operation and goodwill.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner. To this end, the Agency has developed standards of service which its employees observe in serving its clients.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: Semi-annual

Ottawa

Date modified: 2021-06-24
