Variance estimation under monotone non-response for a panel survey
Section 5. A simulation study
In this section, several artificial populations are
generated according to the model described in Section 5.1. In Section 5.2,
we consider several estimators for a change between totals, which illustrates
the heuristic reasoning in Section 4. A Monte Carlo experiment is
presented in Section 5.3, and several variance estimators for estimating a
total, a ratio or a parameter change are compared. The results from Tables 5.1
and 5.2 are readily reproducible using the R code provided in the Supplementary
Material.
5.1 Simulation set-up
We consider seven populations of size 10,000, each containing three variables of interest observed at times 1, 2 and 3, respectively. The variables of interest are generated according to the superpopulation model (5.1).
Two auxiliary variables are independently generated from a Gamma distribution with shape and scale parameters 2 and 1. Two further auxiliary variables, not related to the variables of interest, are generated similarly. The remaining random variables entering the model are independently generated according to a standard normal distribution. The model parameters are chosen so that the coefficient of determination in model (5.1) is approximately equal to 0.50.
The parameter
is set to 0, 0.2, 0.4, 0.6, 0.8, 1.0 and 1.2
for populations 1 to 7, respectively.
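The random inputs described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the R code from the Supplementary Material. It draws four Gamma(2, 1) auxiliary variables and standard normal variables for a population of size 10,000. The function name, the seed and the number of standard normal variables (three, one per time) are assumptions, and model (5.1) itself is deliberately not reproduced.

```python
import random
import statistics


def generate_population_inputs(pop_size=10_000, n_normal=3, seed=1):
    """Draw the random inputs of the simulated population.

    n_normal (number of standard normal variables) is an assumption:
    it is set to 3 here, one per time.
    """
    rng = random.Random(seed)
    # random.gammavariate(alpha, beta): alpha is the shape, beta is the scale.
    # Four auxiliary variables: two related to the variables of interest,
    # two unrelated, all drawn from Gamma(shape=2, scale=1).
    aux = [[rng.gammavariate(2.0, 1.0) for _ in range(pop_size)]
           for _ in range(4)]
    # Independent standard normal variables.
    normal = [[rng.gauss(0.0, 1.0) for _ in range(pop_size)]
              for _ in range(n_normal)]
    return aux, normal


aux, normal = generate_population_inputs()
# A Gamma(2, 1) variable has mean 2 and variance 2.
print(round(statistics.fmean(aux[0]), 2), round(statistics.pvariance(aux[0]), 2))
```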
For each population, a simple random sample of size 1,000 is selected. Three non-response phases are then successively simulated. At each phase, the sub-sample of respondents is obtained by Poisson sampling, each unit having its own response probability.
We use
at each phase
For
we use
0.60, which corresponds to an average
response rate of 0.75. For
we use
0.75, which corresponds to an average
response rate of 0.81. Inside each sub-sample
the estimated response probabilities
are obtained by means of an unweighted
logistic regression.
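The successive non-response phases can be sketched as repeated Poisson sampling within the current set of respondents, which yields nested (monotone) sub-samples. The sketch below is illustrative only. The logistic form of the response probability, its coefficients (0.5 and 0.3) and the covariate are hypothetical stand-ins, and the estimation of the response probabilities by logistic regression is not shown.

```python
import math
import random


def poisson_subsample(units, response_prob, rng):
    """Poisson sampling: keep each unit independently with its own probability."""
    return [k for k in units if rng.random() < response_prob(k)]


def simulate_monotone_nonresponse(pop_size=10_000, n=1_000, phases=3, seed=1):
    rng = random.Random(seed)
    # Hypothetical covariate driving the response mechanism.
    x = [rng.gammavariate(2.0, 1.0) for _ in range(pop_size)]

    def response_prob(k):
        # Hypothetical logistic response model (coefficients are assumptions).
        return 1.0 / (1.0 + math.exp(-(0.5 + 0.3 * x[k])))

    # Simple random sample without replacement of size n.
    respondents = rng.sample(range(pop_size), n)
    nested = []
    for _ in range(phases):
        # Each phase sub-samples the respondents of the previous phase,
        # so the respondent sets are nested.
        respondents = poisson_subsample(respondents, response_prob, rng)
        nested.append(respondents)
    return nested


nested = simulate_monotone_nonresponse()
print([len(s) for s in nested])
```

Because each phase draws only from the previous respondents, a unit that drops out never re-enters, which is the monotone pattern assumed throughout the paper.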
5.2 Comparison of estimators for a difference of totals
In this section, we compare the accuracy of two estimators for a difference of totals, for three pairs of totals. The first estimator makes use of the whole appropriate sub-sample for each of the two variables, and the second estimator makes use of the common sub-sample only. These two estimators are compared through the relative difference (RD) of their variances.
The true variances are replaced by their Monte Carlo approximation,
obtained by repeating
100,000 times the sample selection and
the non-response phases.
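The Monte Carlo approximation can be sketched as follows: the variance of each estimator is taken as the empirical variance of its values over the replications, and the RD is computed from the two approximated variances. The normalization is an assumption. A positive RD is stated to favour the common sub-sample, and Table 5.1 reports values above 100, which (if the RD is expressed in percent) is only possible when the difference is divided by the variance of the common sub-sample estimator. The function names are hypothetical.

```python
import statistics


def monte_carlo_variance(estimates):
    """Empirical variance of an estimator over the Monte Carlo replications."""
    return statistics.pvariance(estimates)


def relative_difference(full_estimates, common_estimates):
    """RD in percent between the variances of two estimators.

    Assumed convention: positive when the common sub-sample estimator has
    the smaller variance, normalized by that estimator's variance.
    """
    v_full = monte_carlo_variance(full_estimates)
    v_common = monte_carlo_variance(common_estimates)
    return 100.0 * (v_full - v_common) / v_common


# Toy replications: the first estimator is four times as variable.
print(relative_difference([0.0, 2.0, 0.0, 2.0], [0.0, 1.0, 0.0, 1.0]))
```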
The results are presented in Table 5.1. A positive RD indicates that using the common sub-sample only leads to a more accurate estimator. As could be expected, the RD increases in all cases with the parameter value, that is, as the correlation between the two variables involved increases. For two of the three differences, the estimator based on the common sub-sample is more accurate for parameter values greater than 0.6. For the remaining difference, it is more accurate for parameter values greater than 0.8.
Table 5.1
Relative Difference (RD) between two estimators for a difference of totals
Parameter value | First difference | Second difference | Third difference
0.0 | -12 | -27 | -13
0.2 | -09 | -25 | -11
0.4 | -04 | -20 | -03
0.6 | -05 | -09 | 11
0.8 | 17 | 11 | 39
1.0 | 30 | 33 | 83
1.2 | 40 | 46 | 127
5.3 Performances of the variance estimators
In this section, we consider the artificial population 5 generated as described in Section 5.1. The sample selection, by simple random sampling of size 1,000, and the three non-response phases are repeated 5,000 times. We evaluate the proposed variance estimators and the simplified variance estimators when estimating a total, a ratio or a change in totals.
For the total, we consider three estimators at each time. The first estimator makes use of the non-calibrated weights. The second estimator makes use of weights calibrated on the population size and on the totals of the two auxiliary variables related to the variables of interest. The third estimator makes use of weights calibrated on the population size and on the totals of the two auxiliary variables not related to the variables of interest. The working model is therefore well-specified for the second estimator but not for the third. The proposed variance estimator for the first estimator is obtained from equation (2.16), and the simplified variance estimator is obtained by plugging into (2.16) the simplified variance estimator for non-response given in (2.17). The proposed variance estimators for the two calibrated estimators are obtained from equation (3.8), and the simplified variance estimators are obtained by plugging into (3.8) the simplified variance estimator for non-response given in (3.9).
We are also interested in estimating a ratio. At each time considered, we consider three estimators. The first estimator makes use of the non-calibrated weights. The proposed variance estimator is obtained from equation (3.14), by using the estimated linearized variable. The simplified variance estimator is obtained by plugging into (3.14) the simplified variance estimator for non-response given in (3.15). The other two estimators make use of the two sets of calibrated weights. The proposed variance estimators are obtained from equation (3.21). The simplified variance estimators are obtained by plugging into (3.21) the simplified variance estimator for non-response given in (3.22).
Finally, we are interested in estimating the change in totals. At each time considered, we consider three estimators. The first estimator makes use of the non-calibrated weights. The proposed variance estimator is obtained from equation (4.8), and the simplified variance estimator is obtained by plugging into (4.8) the simplified variance estimator for non-response given in (4.9). The other two estimators make use of the two sets of calibrated weights. The proposed variance estimators are obtained from equation (4.8), in which the variable is replaced by the estimated residual from the weighted regression on the calibration variables. The simplified variance estimators are obtained by plugging into (4.8) the simplified variance estimator for non-response given in (4.9).
For each proposed variance estimator, we computed the Monte Carlo Percent Relative Bias, where the global variance was approximated through an independent set of 100,000 simulations. To evaluate the contribution of each component to the variance estimator, we computed its contribution in percent. To evaluate the simplified variance estimator for the non-response, we computed its Monte Carlo Percent Relative Bias, where the variance due to non-response was approximated through an independent set of 100,000 simulations.
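These summary measures can be sketched as follows. The percent relative bias is written in its usual form: the Monte Carlo mean of the variance estimator minus the reference variance, divided by the reference variance, times 100. The contribution of a component is assumed here to be the ratio of Monte Carlo means of the component and of the full variance estimator, which is one plausible reading rather than a confirmed definition. The function names are hypothetical.

```python
import statistics


def percent_relative_bias(variance_estimates, reference_variance):
    """Monte Carlo percent relative bias of a variance estimator.

    reference_variance is the 'true' variance, approximated from an
    independent and larger set of simulations.
    """
    mean_estimate = statistics.fmean(variance_estimates)
    return 100.0 * (mean_estimate - reference_variance) / reference_variance


def percent_contribution(component_estimates, total_estimates):
    """Contribution (in percent) of one component to the variance estimator.

    Assumption: ratio of the Monte Carlo means of the component and of the
    full variance estimator.
    """
    return (100.0 * statistics.fmean(component_estimates)
            / statistics.fmean(total_estimates))


# Toy values: estimator centred on 12 for a reference variance of 10.
print(percent_relative_bias([11.0, 13.0, 12.0], 10.0))
print(percent_contribution([2.0, 4.0], [10.0, 10.0]))
```

With this convention, the contributions of the sampling component and of the non-response components at a given time add up to about 100, up to rounding.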
The simulation results are presented in Table 5.2. The proposed variance estimator is almost unbiased in all cases. As could be expected, the contribution of the variance due to the sampling design decreases with time, as the number of respondents decreases and the variance due to non-response becomes larger. The simplified variance estimator is highly biased for the variance due to non-response in the case of the non-calibrated estimator of the total. The bias decreases quickly with time, but remains large at time 3. The simplified variance estimator is almost unbiased for a calibrated estimator when the working model is adequately specified, but is severely biased otherwise. This is consistent with our reasoning in Section 3.1. The simplified variance estimator is almost unbiased for the three estimators of the ratio, and for the calibrated estimators of the change in totals. In the case of the non-calibrated estimator for the change in totals, the bias can be as high as 30%.
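Table 5.2
Relative bias of a global variance estimator, relative contribution to the estimators of variance components and relative bias of a simplified variance estimator for the variance due to non-response for the estimation of a total, a ratio or a change in totals with three sets of weights
All entries are in percent. Within each set of weights, the three columns correspond to times 1, 2 and 3.

Estimation of a total
Quantity | Non-calibrated weights | | | First calibrated weights | | | Second calibrated weights | |
Time | 1 | 2 | 3 | 1 | 2 | 3 | 1 | 2 | 3
Relative bias of the proposed variance estimator | 0 | -1 | -2 | -1 | -1 | -2 | -1 | -1 | -3
Contribution of the sampling variance | 81 | 57 | 35 | 69 | 49 | 32 | 80 | 56 | 35
Contribution of the non-response variance, phase 1 | 19 | 19 | 13 | 31 | 22 | 15 | 20 | 18 | 13
Contribution of the non-response variance, phase 2 | - | 25 | 18 | - | 28 | 19 | - | 25 | 17
Contribution of the non-response variance, phase 3 | - | - | 34 | - | - | 34 | - | - | 34
Relative bias of the simplified variance estimator for non-response | 559 | 188 | 80 | 0 | -1 | -2 | 83 | 34 | 15

Estimation of a ratio
Quantity | Non-calibrated weights | | | First calibrated weights | | | Second calibrated weights | |
Time | 1 | 2 | 3 | 1 | 2 | 3 | 1 | 2 | 3
Relative bias of the proposed variance estimator | - | 0 | -2 | - | -1 | -2 | - | -1 | -2
Contribution of the sampling variance | - | 49 | 32 | - | 49 | 32 | - | 50 | 33
Contribution of the non-response variance, phase 1 | - | 22 | 15 | - | 22 | 15 | - | 22 | 15
Contribution of the non-response variance, phase 2 | - | 28 | 19 | - | 28 | 19 | - | 28 | 19
Contribution of the non-response variance, phase 3 | - | - | 34 | - | - | 34 | - | - | 34
Relative bias of the simplified variance estimator for non-response | - | 0 | 0 | - | -1 | -2 | - | -1 | -1

Estimation of a change in totals
Quantity | Non-calibrated weights | | | First calibrated weights | | | Second calibrated weights | |
Time | 1 | 2 | 3 | 1 | 2 | 3 | 1 | 2 | 3
Relative bias of the proposed variance estimator | - | 0 | -2 | - | 0 | -2 | - | -1 | -3
Contribution of the sampling variance | - | 50 | 33 | - | 49 | 32 | - | 50 | 33
Contribution of the non-response variance, phase 1 | - | 22 | 14 | - | 22 | 15 | - | 22 | 14
Contribution of the non-response variance, phase 2 | - | 28 | 18 | - | 28 | 19 | - | 28 | 18
Contribution of the non-response variance, phase 3 | - | - | 34 | - | - | 34 | - | - | 34
Relative bias of the simplified variance estimator for non-response | - | 19 | 30 | - | -1 | -2 | - | 3 | 5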
ISSN : 1492-0921
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© Her Majesty the Queen in Right of Canada as represented by the Minister of Industry, 2018
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa