6. A simulation study
Takis Merkouris
We have conducted a simulation to study the relative performance of the various composite estimators for the nested version of the basic design (c). Values of correlated scalar variables x and y were generated from a bivariate log-normal distribution with given mean and variance parameters. With the means held fixed, four combinations of the variances of x and y (5 and 10) and three values of the correlation (0.5, 0.7, 0.9) were considered. Variances of 5 and 10 imply skewness 2.65 and 4.33, respectively, for x, and skewness 1.43 and 2.15 for y. For each of these twelve settings, a population of size N was created.
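The skewness values quoted above follow from the closed-form skewness of a log-normal variable once its mean and variance are fixed. The sketch below illustrates this. The means of 3 and 5 are an assumption: they are not stated in the text and were back-solved so that the four quoted skewness values are reproduced.

```python
import math

def lognormal_skewness(mean, variance):
    """Skewness of a log-normal variable with the given mean and variance."""
    # Squared coefficient of variation; for a log-normal it equals exp(sigma^2) - 1.
    cv2 = variance / mean ** 2
    # Standard formula (exp(sigma^2) + 2) * sqrt(exp(sigma^2) - 1), rewritten in terms of cv2.
    return (cv2 + 3.0) * math.sqrt(cv2)

# ASSUMPTION: means 3 and 5 are hypothetical values chosen to match the text.
for mean in (3.0, 5.0):
    for variance in (5.0, 10.0):
        print(mean, variance, round(lognormal_skewness(mean, variance), 2))
```

Under these assumed means, the larger skewness values (2.65 and 4.33) belong to the variable with the smaller mean.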
From each of the twelve populations a simple random sample of size n = 5,000 was drawn without replacement, and split into three simple random subsamples of sizes (n1, n2, n3) with two different allocations, namely, (2,000, 2,000, 1,000) and (1,500, 1,500, 2,000), the second allocation giving larger combined samples (3,500 rather than 3,000 units) for each variable. Thus, a total of 24 simulation settings were created. For each such setting, we computed the HT estimators of the totals of x and y using the full sample, as well as the HT estimator of each total using only the two subsamples that supply the corresponding variable. For the HT estimators based on two subsamples, we employed the simple method for combining two subsamples (Gonzales and Eltinge 2008) by a weighting adjustment involving the probability of selection of a population unit in either of the two subsamples. In addition, for both totals we computed the CGR and COR estimators. Each simulation sampling setting was repeated 10,000 times.
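One replicate of the sampling scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the author's code: the population size (20,000), the means (3 and 5), the reading of the correlation as the correlation of x and y on their original scale, and the choice of the first and third subsamples as the ones carrying x are all hypothetical. Under SRS, the pooled pair of subsamples is itself a simple random sample of size n1 + n3, so the selection probability used in the weighting adjustment is taken here to be (n1 + n3)/N.

```python
import math
import random

def lognormal_params(mean, variance):
    """Log-scale (mu, sigma^2) of a log-normal with the given mean and variance."""
    s2 = math.log(1.0 + variance / mean ** 2)
    return math.log(mean) - s2 / 2.0, s2

def make_population(N, mean_x, var_x, mean_y, var_y, rho, rng):
    """Bivariate log-normal population with original-scale correlation rho (assumption)."""
    mu_x, s2_x = lognormal_params(mean_x, var_x)
    mu_y, s2_y = lognormal_params(mean_y, var_y)
    # Log-scale covariance that yields correlation rho between x and y themselves.
    cov_log = math.log(1.0 + rho * math.sqrt(var_x * var_y) / (mean_x * mean_y))
    r = cov_log / math.sqrt(s2_x * s2_y)  # correlation of log x and log y
    pop = []
    for _ in range(N):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        log_x = mu_x + math.sqrt(s2_x) * z1
        log_y = mu_y + math.sqrt(s2_y) * (r * z1 + math.sqrt(1.0 - r * r) * z2)
        pop.append((math.exp(log_x), math.exp(log_y)))
    return pop

def ht_totals_for_x(pop, n, alloc, rng):
    """Draw an SRS without replacement, split it, and return two HT estimates of the x total."""
    N = len(pop)
    s = rng.sample(pop, n)  # selection order is random, so slices are random subsamples
    n1, n2, n3 = alloc
    s1, s3 = s[:n1], s[n1 + n2:]
    ht_full = N / n * sum(x for x, _ in s)
    # ASSUMPTION: x is carried by subsamples 1 and 3; pooled inclusion probability (n1 + n3) / N.
    ht_two = N / (n1 + n3) * sum(x for x, _ in s1 + s3)
    return ht_full, ht_two

rng = random.Random(2024)
pop = make_population(20000, 3.0, 5.0, 5.0, 5.0, 0.5, rng)  # hypothetical setting
true_total = sum(x for x, _ in pop)
ht_full, ht_two = ht_totals_for_x(pop, 5000, (2000, 2000, 1000), rng)
print(round(ht_full / true_total, 3), round(ht_two / true_total, 3))
```

Repeating the last step many times and taking the variance of each estimator across replicates gives the simulated design variances used in the comparisons.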
The simulated bias (in percent) of all estimators was smaller than 0.05%, with the exception of two settings involving a variance of 10 with associated population skewness of 4.33, where the largest observed values, 0.14% and 0.17%, correspond to CGR and COR, respectively, in the sample allocation (2,000, 2,000, 1,000), dropping to 0.10% and 0.13% in the more favorable allocation (1,500, 1,500, 2,000). With bias this small, the relative efficiencies of the estimators are evaluated using their simulated design variances.
Table 6.1 shows the efficiency of the composite estimators CGR and COR relative to the HT estimators that use only two of the three subsamples. The measure of this relative efficiency is the percent relative difference of variances, [V(CGR)-V(HT)]/V(HT) and [V(COR)-V(HT)]/V(HT). A negative value of this measure indicates an efficiency gain achieved by the composite estimator. Not shown in Table 6.1, the simulated loss of efficiency of the HT estimators of both totals due to not using the full sample is very close to the nominal loss for SRS, that is, 66.8% for the allocation (2,000, 2,000, 1,000), and 43.1% for the allocation (1,500, 1,500, 2,000).
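The efficiency measure defined above can be computed directly from the replicate estimates. The following is a minimal sketch; the function name and the toy inputs are illustrative only and are not taken from the study.

```python
from statistics import pvariance

def percent_relative_difference(composite_estimates, ht_estimates):
    """100 * [V(composite) - V(HT)] / V(HT), with variances taken over replicates."""
    v_composite = pvariance(composite_estimates)
    v_ht = pvariance(ht_estimates)
    return 100.0 * (v_composite - v_ht) / v_ht

# Toy replicate values (hypothetical); a negative result means the composite estimator is more efficient.
ht_reps = [1.0, 2.0, 3.0, 4.0, 5.0]
composite_reps = [2.0, 3.0, 3.0, 3.0, 4.0]
print(percent_relative_difference(composite_reps, ht_reps))
```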
Table 6.1
Relative differences (in percent) of variances of CGR and COR to HT for x and y, based on 10,000 simulated samples with two different sample allocations
(n1, n2, n3) | (2,000; 2,000; 1,000) | (2,000; 2,000; 1,000) | (2,000; 2,000; 1,000) | (2,000; 2,000; 1,000) | (1,500; 1,500; 2,000) | (1,500; 1,500; 2,000) | (1,500; 1,500; 2,000) | (1,500; 1,500; 2,000) |
 | x | x | y | y | x | x | y | y |
 | CGR | COR | CGR | COR | CGR | COR | CGR | COR |
 |
correlation 0.5 | -2.24 | -6.86 | 26.39 | -6.23 | -5.19 | -6.29 | 12.59 | -6.52 |
correlation 0.7 | -11.90 | -14.75 | 10.21 | -13.96 | -12.78 | -13.24 | 0.25 | -13.13 |
correlation 0.9 | -24.89 | -28.57 | -12.49 | -28.10 | -21.55 | -23.37 | -14.55 | -23.03 |
 |
correlation 0.5 | -0.27 | -6.75 | 6.50 | -6.26 | -3.94 | -6.60 | 0.50 | -6.44 |
correlation 0.7 | -11.47 | -14.56 | -6.29 | -14.04 | -12.87 | -13.51 | -9.51 | -13.10 |
correlation 0.9 | -28.14 | -28.42 | -25.74 | -28.23 | -23.70 | -23.54 | -22.07 | -23.09 |
 |
correlation 0.5 | -4.57 | -6.51 | 28.64 | -6.17 | -5.90 | -5.98 | 17.57 | -6.44 |
correlation 0.7 | -11.29 | -14.37 | 16.08 | -13.92 | -11.66 | -12.90 | 6.69 | -13.00 |
correlation 0.9 | -20.32 | -28.09 | -2.46 | -28.19 | -18.46 | -22.97 | -6.97 | -22.91 |
 |
correlation 0.5 | -4.79 | -6.49 | 8.54 | -6.13 | -6.06 | -6.22 | 3.41 | -6.34 |
correlation 0.7 | -13.27 | -14.28 | -2.57 | -13.95 | -13.27 | -13.15 | -6.00 | -12.93 |
correlation 0.9 | -26.01 | -28.06 | -20.37 | -28.21 | -22.18 | -23.17 | -18.48 | -22.89 |
For the variable x, using the CGR estimator at low correlation (0.5) and with allocation (2,000, 2,000, 1,000) leads to an efficiency gain that ranges from 0.27% to 4.79% at the four different variance settings; this gain reflects the amount of lost information recovered by the CGR estimator. Substantial gain is achieved at correlation 0.7, ranging from 11.29% to 13.27%, and more so at correlation 0.9, ranging from 20.32% to 28.14%. With sample allocation (1,500, 1,500, 2,000) the CGR estimator performs better at correlations 0.5 and 0.7 but not at 0.9. Additional gain is achieved by the COR estimator, which is more efficient than the CGR estimator in all but two settings (where the estimators are equally efficient, see column 7). The efficiency of the COR estimator relative to the HT estimator is close to the nominal efficiency for SRS, which is 6.25, 13.92 and 28.12 at correlations 0.5, 0.7 and 0.9, respectively, for allocation (2,000, 2,000, 1,000), and 6.417, 13.186 and 23.30 for allocation (1,500, 1,500, 2,000); see quantity E in Section 2, third last paragraph. As expected, the CGR estimator competes better with the COR estimator with increasing correlation and sample size.
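The nominal SRS efficiencies quoted above can be reproduced numerically. The sketch below is not the paper's expression for E, which is not reproduced here. It assumes that finite population corrections are ignored, that x is observed on subsamples of sizes n1 and n3 and y on subsamples of sizes n2 and n3, and that the optimal estimator attains the variance of the best linear unbiased estimator of the mean of x built from the three subsamples. Variances are standardized to 1, which does not affect the ratio.

```python
def nominal_cor_gain(n1, n2, n3, rho):
    """Percent variance reduction of the optimal estimator relative to the two-subsample mean of x."""
    # Information on (mean_x, mean_y): the third subsample observes both variables,
    # the first observes x only, the second observes y only (ASSUMED design).
    a = n3 / (1.0 - rho ** 2)
    i_xx = n1 + a
    i_yy = n2 + a
    i_xy = -rho * a
    # Variance of the optimal estimator of mean_x: first diagonal entry of the inverse information matrix.
    var_opt = i_yy / (i_xx * i_yy - i_xy ** 2)
    # Variance of the plain mean of x over the n1 + n3 units that carry x.
    var_ht = 1.0 / (n1 + n3)
    return 100.0 * (1.0 - var_opt / var_ht)

for alloc in ((2000, 2000, 1000), (1500, 1500, 2000)):
    print(alloc, [round(nominal_cor_gain(*alloc, rho), 3) for rho in (0.5, 0.7, 0.9)])
```

Under these assumptions the computed gains agree with the six quoted values to within rounding.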
For the variable y, the CGR estimator is inferior to the HT estimator at correlation level 0.5 and in half of the simulated settings at correlation 0.7; see positive values in columns 4 and 8. This inefficiency of the CGR estimator ranges from 6.50% to 28.64% (both at correlation 0.5) in the sample allocation (2,000, 2,000, 1,000), and reduces to a range from 0.25% (at correlation 0.7) to 17.57% (at correlation 0.5) in the sample allocation (1,500, 1,500, 2,000). This is explained by the larger skewness of x (the variable being used as auxiliary to y in the regression procedure); the lower levels of inefficiency are observed when the differential in skewness between x and y is the smallest.
On the other hand, at correlation 0.9 and with allocation (2,000, 2,000, 1,000), the efficiency gain of the CGR estimator relative to the HT estimator ranges from 2.46% (when the skewness differential is the largest) to 25.74% (when the skewness differential is the smallest), with similar efficiency levels displayed for allocation (1,500, 1,500, 2,000). The COR estimator is more efficient than the CGR estimator in all settings, the relative efficiency being close to the nominal one for SRS (same efficiency as with x). For y too, the CGR estimator competes better with the COR estimator with increasing correlation and sample size.
This
limited empirical study, which essentially simulates the SRS version of Theorem
confirms the
theory on the efficiency of the optimal estimator COR, even for modest sample size, and shows the usefulness of the two composite estimators CGR and COR in partially recovering the information loss due to splitting the full questionnaire. It also shows that the practical CGR estimator is not always a good substitute for the COR estimator for small samples and low correlation between x and y.