“Optimal” calibration weights under unit nonresponse in survey sampling
Section 3. A simulation study
Properties of the estimators were studied by means of a Monte Carlo
simulation. We used an authentic population called KYBOK, which consists of
clerical municipalities in Sweden in 1992.
(This population was also used for simulation purposes in Särndal and Lundström
(2005) and Andersson and Särndal (2016).)
The study variable is “Expenditure on administration and maintenance”.
The population is divided into four groups
with respect to size, from the smallest to the largest. The group sizes are
and
The moon vector is made up of group indicators, each equal to 1 if the unit
belongs to the corresponding population group and 0 otherwise.
The quantitative star variable is the square root of “Revenue advances”,
which is highly positively correlated with the study variable.
The sample size (under Poisson sampling, the expected sample size) was 300, and
we used the exponential response probability (3.1), with its parameter chosen
according to the desired average response probability; in this study the
average varies between 0.60 and 0.86 (the latter being the response probability
chosen in, e.g., Särndal and Lundström (2005)). Two sampling designs have been
considered separately: simple random
sampling and Poisson sampling. In the latter case
For each combination of design, sample size/expected sample size and average
response probability, 10,000 samples were generated. For each such sample, a
response set was created by performing independent Bernoulli trials, one per
sampled unit, with the unit's response probability as the probability of
success.
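The sampling and response steps described above can be sketched roughly as follows. The formula for the exponential response probability is not reproduced in this text, so the sketch assumes the common form theta_k = 1 - exp(-c * x_k), with c found by bisection so that the population average matches a target. The toy population, the function names, and the choice of Poisson inclusion probabilities proportional to x are illustrative assumptions, not the study's actual setup.

```python
import math
import random


def response_probs(x, c):
    """Assumed exponential form: theta_k = 1 - exp(-c * x_k)."""
    return [1.0 - math.exp(-c * xk) for xk in x]


def solve_c(x, target_mean, tol=1e-10):
    """Bisection for c so that the mean response probability equals target_mean."""
    def mean_theta(c):
        return sum(response_probs(x, c)) / len(x)

    lo, hi = 0.0, 1.0
    while mean_theta(hi) < target_mean:  # widen the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_theta(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_response_set(sample, theta, rng):
    """Independent Bernoulli trials: unit k responds with probability theta[k]."""
    return [k for k in sample if rng.random() < theta[k]]


rng = random.Random(1)
x = [rng.uniform(1.0, 50.0) for _ in range(1000)]  # hypothetical positive auxiliary values

c = solve_c(x, target_mean=0.86)
theta = response_probs(x, c)

# Simple random sampling without replacement, n = 300.
srs_sample = rng.sample(range(len(x)), 300)
respondents = draw_response_set(srs_sample, theta, rng)

# Poisson sampling with expected size 300; inclusion probabilities proportional
# to x are an assumption here, and all of them stay below 1 for this toy population.
total_x = sum(x)
pi = [300 * xk / total_x for xk in x]
poisson_sample = [k for k in range(len(x)) if rng.random() < pi[k]]
poisson_respondents = draw_response_set(poisson_sample, theta, rng)
```

Under Poisson sampling the realized sample size is random, which is why only its expected value (the sum of the inclusion probabilities) is fixed at 300.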
The
estimators of main interest are
and
but in this simulation study we will also
include an example of parametric propensity modelling based on
A simple choice is the logistic (logit) model
where we let
For each sample with its observed nonresponse, maximum likelihood estimation
was used to fit the model, yielding estimated response probabilities. To obtain
the Cal logit estimator, the design weights are then replaced by
propensity-adjusted weights based on these estimates before calibration. The
logit model (3.2) is misspecified since the true response probability is
determined by (3.1).
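As a rough illustration of the propensity step, the sketch below fits a logit model with an intercept and one covariate by Newton-Raphson and then adjusts the design weights. Both the single-covariate form of the model and the adjustment (design weight divided by the estimated response probability, the usual inverse-propensity choice) are assumptions, since the corresponding formulas are not reproduced in this text. The data and all names are hypothetical.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(x, r, iters=25):
    """ML fit of P(respond) = logistic(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xk, rk in zip(x, r):
            p = logistic(b0 + b1 * xk)
            w = p * (1.0 - p)
            g0 += rk - p           # score for the intercept
            g1 += (rk - p) * xk    # score for the slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * xk
            h11 += w * xk * xk
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (information matrix)^{-1} * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical sample: covariate values and response indicators (1 = responded).
x_s = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
r_s = [0, 1, 0, 1, 1, 0, 1, 1]
d_s = [10.0] * len(x_s)  # hypothetical design weights

b0, b1 = fit_logit(x_s, r_s)
theta_hat = [logistic(b0 + b1 * xk) for xk in x_s]

# Assumed adjustment: respondents' design weights divided by estimated propensities.
adjusted = [dk / tk for dk, tk, rk in zip(d_s, theta_hat, r_s) if rk == 1]
```

The adjusted weights would then be fed into the calibration step in place of the design weights. Because the fitted model need not match the true response mechanism, the adjustment can introduce bias of its own, which is the point of including this estimator in the study.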
An arbitrary estimator is assessed by its empirical (simulation-estimated)
bias, variance and mean squared error. Observe that expressions such as “the
bias has increased” should be interpreted in the following as an increase of
the bias in absolute value.
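A minimal sketch of how such simulation summaries are commonly computed is given below. The exact formulas used in the study are not reproduced in this text, so the divisors (K - 1 for the variance, K for the mean squared error) and any scaling applied in the tables are assumptions; the function name and the numbers are hypothetical.

```python
def simulation_summary(estimates, true_total):
    """Empirical bias, variance and MSE of an estimator over K simulated samples."""
    k = len(estimates)
    mean = sum(estimates) / k
    bias = mean - true_total
    variance = sum((e - mean) ** 2 for e in estimates) / (k - 1)
    mse = sum((e - true_total) ** 2 for e in estimates) / k
    return bias, variance, mse


# Hypothetical estimates of a total whose true value is 100.
estimates = [98.0, 103.0, 101.0, 97.0, 106.0]
bias, variance, mse = simulation_summary(estimates, true_total=100.0)
```

With these divisors the mean squared error equals (K - 1)/K times the variance plus the squared bias, so for a large number of replications it is essentially the variance plus the squared bias.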
3.1 Results
As
a benchmark for the study where auxiliary information is not used at the design
stage, let us first consider the results for simple random sampling in Table
3.1. This is a case where the Cal and Opt estimators coincide.
(Actually, to get equality the “star”
information is
for
As will hold throughout this study, the bias of the Cal logit estimator is
considerably larger than the bias of the Cal estimator, which is a natural
effect of constructing the former from a misspecified nonresponse model.
Furthermore, of these two estimators, Cal logit always has the larger variance.
Looking
instead at the results in Table 3.2 for Poisson sampling, we can first observe
that for
both the bias and the variance are larger than
under simple random sampling. The Opt estimator, on the other hand, has highly
reduced bias under Poisson sampling compared with simple random sampling,
whereas there is a slight increase in the variance. Then, turning to the
proposed modified estimator Optm, we observe a further reduction in bias,
except for the average response probability 0.86. Actually, the bias has a
monotonic behaviour and changes sign from positive to negative at 0.60.
However, compared with Opt, the variance of Optm is increased due to the
additional quantity included in (2.4), thus leading to a trade-off between the
bias and the variance. We also note that of these two estimators Optm displays
the larger MSE values, since the variance is the dominating part of the MSE at
these low levels of bias.
Table 3.1
Empirical bias, variance and mean squared error for (Cal), (Cal logit) and (Opt) under simple random sampling with average response probabilities 0.86, 0.70, 0.65 and 0.60

| Simple random sampling (Cal = Opt) | 0.86 | 0.70 | 0.65 | 0.60 |
| --- | --- | --- | --- | --- |
| Bias | | | | |
| Cal | -2.44 | -4.00 | -4.47 | -4.89 |
| Cal logit | 4.81 | 19.4 | 26.4 | 35.5 |
| Variance | | | | |
| Cal | 8.40 | 9.59 | 10.2 | 11.3 |
| Cal logit | 10.7 | 10.9 | 13.2 | 16.1 |
| Mean squared error | | | | |
| Cal | 1.44 | 2.57 | 3.01 | 3.52 |
| Cal logit | 3.38 | 38.9 | 71.9 | 127 |
Table 3.2
Empirical bias, variance and mean squared error for (Cal), (Cal logit), (Opt) and (Optm) under Poisson sampling with average response probabilities 0.86, 0.70, 0.65 and 0.60

| Poisson sampling | 0.86 | 0.70 | 0.65 | 0.60 |
| --- | --- | --- | --- | --- |
| Bias | | | | |
| Cal | -2.88 | -4.71 | -5.17 | -5.69 |
| Cal logit | -12.1 | -27.5 | -32.9 | -38.8 |
| Opt | -0.0732 | -0.329 | -0.516 | -0.810 |
| Optm | 0.690 | 0.274 | 0.0536 | -0.277 |
| Variance | | | | |
| Cal | 4.46 | 5.25 | 5.56 | 5.81 |
| Cal logit | 5.17 | 6.60 | 7.17 | 7.57 |
| Opt | 1.39 | 1.63 | 1.75 | 1.84 |
| Optm | 2.05 | 2.89 | 3.22 | 3.51 |
| Mean squared error | | | | |
| Cal | 5.29 | 7.47 | 8.23 | 9.05 |
| Cal logit | 19.8 | 82.2 | 115 | 127 |
| Opt | 1.39 | 1.64 | 1.78 | 1.91 |
| Optm | 2.10 | 2.90 | 3.22 | 3.52 |
ISSN : 1492-0921
Editorial policy
Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, 2019
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa