Comparison of some positive variance estimators for the Fay-Herriot small area model
6. Simulation results and analysis
6.1 Monte Carlo distribution of the variance estimators
Table 6.1 shows that the REML variance estimator has the lowest bias and the highest variance. The lower efficiency of REML may be due to its split definition (3.1), which keeps it from being a smooth function of the data. The MIX estimator inherits some of this low efficiency. The other variance estimators have lower variability and, except for AM.YL, which is negatively biased in Table 6.1, higher positive bias; the conditional expectations of AM.YL and AR.YL given a zero REML estimate are close to zero. The unconditional bias of AM.LL is higher than the unconditional bias of the MIX. By definition of the MIX estimator, the conditional biases of the MIX and AM.LL estimators coincide. The MIX estimator also converges faster than the other estimators. For example, from the Monte Carlo distribution of the 10,000 variance estimates for a fixed number of areas, we calculated the probability of estimates lying within an interval containing the true variance. The probability that the MIX estimates lie between 0.6 and 1.4 is 0.47, whereas the probability that AM.YL estimates lie between 0.6 and 1.4 is 0.16. Furthermore, the probability that MIX estimates are smaller than 0.2 is 0.05, whereas the probability that AM.YL estimates are smaller than 0.2 is 0.53.
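The interval probabilities quoted above are simple empirical proportions over the simulated variance estimates. A minimal sketch of that computation follows; the function names and the toy list are hypothetical and are not taken from the study.

```python
def interval_probability(estimates, lower, upper):
    """Share of Monte Carlo estimates that fall in the closed interval [lower, upper]."""
    if not estimates:
        raise ValueError("need at least one estimate")
    inside = sum(1 for e in estimates if lower <= e <= upper)
    return inside / len(estimates)


def lower_tail_probability(estimates, threshold):
    """Share of Monte Carlo estimates strictly below the threshold."""
    if not estimates:
        raise ValueError("need at least one estimate")
    return sum(1 for e in estimates if e < threshold) / len(estimates)


# Toy values standing in for simulated variance estimates (not the study's data).
toy_estimates = [0.0, 0.1, 0.7, 1.0, 1.3, 2.5]
print(interval_probability(toy_estimates, 0.6, 1.4))  # 3 of 6 values -> 0.5
print(lower_tail_probability(toy_estimates, 0.2))     # 2 of 6 values
```

Applied to the 10,000 simulated estimates of each method, the first function would give figures such as the 0.47 and 0.16 reported for the interval from 0.6 to 1.4.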
Table 6.1
Expectation, variance, and conditional expectation and variance (given a zero REML estimate) of the variance estimators

Method  m    Expectation  Variance  % zero REML  Cond. expectation  Cond. variance
REML    15   1.48         3.38      43%          N/A                N/A
        45   1.21         1.67      29%          N/A                N/A
        100  1.07         0.81      16%          N/A                N/A
AM.LL   15   2.80         1.37      43%          1.80               0.11
        45   1.88         1.01      29%          0.94               0.03
        100  1.49         0.51      16%          0.63               0.01
MIX     15   2.28         1.87      43%          1.80               0.11
        45   1.48         1.31      29%          0.94               0.03
        100  1.17         0.66      16%          0.63               0.01
AR.YL   15   1.66         2.99      43%          0.27               0.01
        45   1.24         1.72      29%          0.06               0.00
        100  1.08         0.80      16%          0.02               0.00
AM.YL   15   0.52         0.84      43%          0.10               0.00
        45   0.65         0.85      29%          0.03               0.00
        100  0.76         0.59      16%          0.01               0.00
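Because the MIX estimator equals the REML estimate when that estimate is positive and the AM.LL estimate otherwise, its expectation can be recovered from the other entries of Table 6.1 through the law of total expectation. The sketch below checks this; it assumes that the REML estimate is exactly zero in the "zero REML" populations, and the variable names are illustrative.

```python
# Values copied from Table 6.1:
# (m, E[REML], share of zero REML estimates, E[AM.LL | REML = 0], reported E[MIX])
TABLE_6_1 = [
    (15, 1.48, 0.43, 1.80, 2.28),
    (45, 1.21, 0.29, 0.94, 1.48),
    (100, 1.07, 0.16, 0.63, 1.17),
]


def implied_mix_mean(e_reml, p_zero, e_amll_given_zero):
    """E[MIX] = E[REML] + P(REML = 0) * E[AM.LL | REML = 0].

    The first term needs no indicator because REML contributes nothing
    to the mean in the populations where it is zero.
    """
    return e_reml + p_zero * e_amll_given_zero


for m, e_reml, p_zero, e_cond, e_mix in TABLE_6_1:
    implied = implied_mix_mean(e_reml, p_zero, e_cond)
    print(f"m={m}: implied E[MIX]={implied:.2f}, reported E[MIX]={e_mix:.2f}")
```

The implied values agree with the reported MIX expectations up to the rounding of the table entries.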
6.2 True MSE of the EBLUP, average relative bias and average root relative MSE of the MSE estimators
All variance estimators are consistent and asymptotically normal, with variances converging at the same rate. They differ in their bias: REML, AR.YL and MIX have bias of the order of o(1/m), whereas AM.LL and AM.YL have bias of the order O(1/m). The bias inherent in the latter methods impacts the estimation of the MSE of the EBLUP even for a moderate number of areas.
For m = 100, Tables 6.2a and 6.2b show that as the signal-to-noise ratio increases, the MSE of the EBLUP decreases, and this relationship holds irrespective of the number of areas. The MSEs of the EBLUP under the REML and the MIX variance estimators are slightly higher than the rest of the MSEs, due to the higher variability inherent in these variance estimators. Table 6.2a presents results for the Taylor linearization MSE estimator and the two parametric bootstrap (PB) MSE estimators under REML, AM.LL, AR.YL and AM.YL variance estimation. Table 6.2b presents results for the following MSE estimators under the MIX variance estimation: RB_Y1 defined in (4.3), RB_Y2 defined in (4.2), M_et_al defined in (4.5), and the PB and Naive PB MSE estimators. Among the Taylor MSE estimators, RB_Y1 and M_et_al under MIX exhibit the lowest bias. Among the bootstrap MSE estimators, PB under MIX and Naive PB under AR.YL exhibit the lowest bias. The RRMSE of the MSE estimators decreases as the signal-to-noise ratio increases. Differences between the RB_Y2 MSE estimator under the MIX and the Taylor MSE estimator under the AM.YL seem small but consistent. The ARB is lower for the RB_Y1, the M_et_al and the Naive PB MSE estimators under the MIX method than for the RB_Y2 under the MIX, and also lower for the Taylor and the Naive PB under the AR.YL method than for the RB_Y2 under the MIX; the opposite happens in terms of RRMSE. This can be explained in part by the extreme negative conditional bias exhibited by these MSE estimators (i.e., the RB_Y1 and the M_et_al under the MIX and the Taylor and the Naive PB under the AR.YL method), as shown in Table 6.3. Even for m = 100 there is a relatively high proportion (16%) of populations that yield zero REML estimates, and in these populations, estimates from most variance methods and most MSE estimators are farthest below the true value. That is, these MSE estimators do not fare well conditionally. The PB MSE estimator seems to adjust well for bias, but it is more variable than the Naive PB MSE estimator. When the ARB, the RRMSE and the conditional bias are all included in the evaluation, the RB_Y2 under the MIX method, followed closely by the Naive PB under the MIX, seems to perform the best. This may suggest the superiority of RB_Y2 and Naive PB under MIX for m = 100, which is a moderate number of areas for these data.
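The performance measures discussed above can be sketched in code. The formal definitions are given earlier in the paper and are not reproduced in this section, so the formulas below are assumptions: the relative bias of an MSE estimator for one area is taken as the Monte Carlo mean of the estimates minus the true MSE, divided by the true MSE, and the root relative MSE as the root mean squared deviation from the true MSE, divided by the true MSE. ARB and RRMSE are then averages over areas. All function names are hypothetical.

```python
import math


def relative_bias_pct(mse_estimates, true_mse):
    """Relative bias (in %) of MSE estimates for one area across replicates."""
    mean_est = sum(mse_estimates) / len(mse_estimates)
    return 100.0 * (mean_est - true_mse) / true_mse


def root_relative_mse_pct(mse_estimates, true_mse):
    """Root relative MSE (in %) of MSE estimates for one area across replicates."""
    mean_sq = sum((e - true_mse) ** 2 for e in mse_estimates) / len(mse_estimates)
    return 100.0 * math.sqrt(mean_sq) / true_mse


def average_over_areas(values):
    """Average a per-area measure over the areas (giving ARB or RRMSE)."""
    return sum(values) / len(values)


def keep_zero_reml(mse_estimates, reml_is_zero):
    """Keep only the replicates whose REML estimate was zero (conditional measures)."""
    return [e for e, z in zip(mse_estimates, reml_is_zero) if z]


# Toy example with two areas and two replicates each (not the study's data).
area_a = relative_bias_pct([0.9, 1.1], true_mse=1.0)
area_b = relative_bias_pct([1.8, 2.6], true_mse=2.0)
print(average_over_areas([area_a, area_b]))  # roughly 5 (percent)
```

Under these assumed definitions, the conditional bias reported in Table 6.3 would correspond to applying `relative_bias_pct` to the output of `keep_zero_reml`, with the true MSE also computed over the zero-REML populations.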
Table 6.2a
MSE, ARB & RRMSE (percentage) of MSE estimators, m = 100
(Ratio = signal-to-noise ratio.)

                         Taylor MSE estimator   PB estimator     Naive PB estimator
Method  Ratio  MSE       ARB     RRMSE          ARB     RRMSE    ARB     RRMSE
REML    0.06   135.4     5.1     71.1           -4.4    80.7     1.6     69.9
        0.1    132.1     5.3     64.7           -4.7    74.0     -0.2    63.0
        0.14   119.5     6.0     61.9           -5.5    71.3     -1.8    59.9
        0.2    119.2     6.5     53.6           -5.8    62.4     -3.4    51.7
        0.3    106.6     8.2     46.7           -6.8    55.0     -5.6    44.8
AM.LL   0.06   134.9     6.1     75.4           8.2     66.9     31.3    63.8
        0.1    131.2     6.8     68.1           7.8     59.5     27.5    55.7
        0.14   118.3     8.1     64.6           7.8     55.6     26.5    51.2
        0.2    117.6     8.4     55.4           6.5     46.7     21.6    42.1
        0.3    104.5     10.2    46.7           5.5     38.8     18.2    34.0
AR.YL   0.06   135.4     6.6     69.3           -4.3    80.2     2.1     69.4
        0.1    132.0     7.4     61.9           -4.5    73.4     0.3     62.5
        0.14   119.4     9.0     58.0           -5.3    70.6     -1.2    59.3
        0.2    119.0     10.6    48.2           -5.6    61.8     -2.9    51.1
        0.3    106.4     14.7    38.5           -6.6    54.3     -5.1    44.1
AM.YL   0.06   134.7     10.0    63.2           -12.3   81.0     -19.6   65.9
        0.1    131.3     12.0    56.6           -12.5   75.2     -19.7   61.2
        0.14   118.8     15.0    53.1           -13.7   73.3     -21.4   59.8
        0.2    118.6     18.1    44.8           -13.4   65.2     -20.7   53.5
        0.3    106.4     25.2    38.4           -14.4   58.8     -21.7   48.6
Table 6.2b
MSE, ARB & RRMSE (percentage) of MSE estimators
(Ratio = signal-to-noise ratio.)

                         RB_Y1           RB_Y2           M_et_al         PB estimator    Naive PB estimator
Method  Ratio  MSE       ARB    RRMSE    ARB    RRMSE    ARB    RRMSE    ARB    RRMSE    ARB    RRMSE
MIX     0.06   135.4     2.7    75.7     13.6   63.0     5.2    71.1     -3.0   75.3     8.8    62.4
        0.1    132.1     3.6    68.3     14.9   56.1     5.3    64.7     -3.2   68.3     6.6    55.4
        0.14   119.5     4.9    64.7     16.0   52.4     6.0    61.9     -3.9   65.1     5.3    51.8
        0.2    119.1     6.3    55.2     16.7   43.8     6.5    53.6     -4.4   56.3     2.9    43.7
        0.3    106.5     9.4    46.2     19.9   36.0     8.3    46.7     -5.4   48.6     0.6    36.7
Table 6.3
Conditional MSE and conditional relative bias (percentage) of MSE estimators, given a zero REML estimate, m = 100
(Ratio = signal-to-noise ratio.)

                                 Conditional relative bias
Method  Ratio  Conditional MSE   Taylor MSE estimator   PB estimator   Naive PB estimator
REML    0.06   135.6             -76.5                  -98.6          -74.8
        0.1    133.0             -74.5                  -94.4          -71.8
        0.14   121.5             -78.6                  -98.0          -74.9
        0.2    120.4             -73.1                  -89.8          -68.6
        0.3    108.0             -73.6                  -88.2          -67.3
AM.LL   0.06   135.0             -92.0                  -67.6          -26.1
        0.1    132.2             -85.2                  -62.0          -24.6
        0.14   120.2             -85.4                  -62.3          -25.4
        0.2    118.8             -74.4                  -54.1          -22.1
        0.3    105.9             -65.9                  -49.7          -20.6
AR.YL   0.06   135.5             -68.6                  -96.9          -73.0
        0.1    132.9             -62.4                  -92.6          -70.0
        0.14   121.4             -61.1                  -96.1          -73.0
        0.2    120.2             -48.9                  -87.9          -66.7
        0.3    107.8             -34.5                  -86.1          -65.4
AM.YL   0.06   134.9             -45.9                  -88.6          -74.7
        0.1    132.1             -39.4                  -85.4          -72.2
        0.14   120.4             -36.0                  -89.3          -75.7
        0.2    119.6             -23.6                  -82.3          -69.7
        0.3    107.6             -6.5                   -81.7          -69.3

Method  Ratio  Conditional MSE   RB_Y1    RB_Y2    M_et_al   PB estimator   Naive PB estimator
MIX     0.06   135.0             -92.0    -22.0    -76.4     -46.0          -27.0
        0.1    132.2             -85.2    -17.7    -74.3     -42.7          -25.9
        0.14   120.2             -85.4    -15.0    -78.3     -43.3          -27.0
        0.2    118.8             -74.4    -7.6     -72.8     -37.6          -23.9
        0.3    105.9             -65.9    1.5      -73.1     -34.6          -22.6
Tables 6.4a and 6.4b below display results for m = 45, with 9 areas per sampling variance group. The AM.YL yields MSEs smaller than the MIX, with differences in MSEs of at most 2%. As the number of areas decreases, the bias of the variance estimators increases and the MSE estimators are affected by this. Indeed, the ARB of all MSE estimators has increased. In particular, the ARB of the Taylor MSE estimators under YL and LL variance estimation and the ARB of RB_Y2 have increased by 100% over the ARB with 100 areas. In terms of RRMSE, the Taylor MSE estimator under the AM.YL has slightly lower RRMSE than the RB_Y2 under the MIX method for very small signal-to-noise ratios. In general, the variability (in RRMSE) of the RB_Y2 is lower than that of the Taylor estimators under LL and YL estimation and than that of the RB_Y1 and the M_et_al. This may be due in part to the underestimation of the MSEs for the populations with zero REML estimates, which, for m = 45, make up around 30% of all populations. Table 6.5 illustrates this better: given a zero REML estimate, there is serious underestimation in RB_Y1 and M_et_al.
Table 6.4a
MSE, ARB & RRMSE (percentage) of MSE estimators, 45 areas
(Ratio = signal-to-noise ratio.)

                         Taylor MSE estimator   PB estimator     Naive PB estimator
Method  Ratio  MSE       ARB     RRMSE          ARB     RRMSE    ARB     RRMSE
REML    0.06   171.4     11.8    94.7           -4.7    107.0    6.2     89.2
        0.1    174.1     11.9    83.9           -5.3    93.8     3.0     76.2
        0.14   171.3     12.6    74.5           -5.4    81.9     1.1     65.3
        0.2    166.6     13.9    63.4           -5.8    66.7     -1.2    52.0
        0.3    128.9     20.1    63.0           -7.0    61.4     -3.1    46.7
AM.LL   0.06   171.1     15.5    100.0          16.0    84.9     43.5    83.3
        0.1    173.4     16.8    87.0           14.4    71.1     36.7    68.5
        0.14   170.4     17.7    75.7           12.6    59.7     30.7    56.7
        0.2    165.3     18.2    61.7           9.9     46.2     23.5    43.2
        0.3    127.5     25.6    55.0           10.0    39.7     22.6    36.6
AR.YL   0.06   171.1     17.2    89.9           -3.7    105.0    8.0     87.6
        0.1    173.6     19.6    76.9           -4.3    91.8     4.8     74.6
        0.14   170.8     22.6    65.8           -4.4    79.9     2.7     63.7
        0.2    166.0     27.3    53.7           -4.8    64.8     0.3     50.5
        0.3    128.3     43.8    54.8           -5.7    59.3     -1.3    45.0
AM.YL   0.06   167.5     30.2    78.4           -18.0   97.3     -23.8   73.3
        0.1    169.6     36.5    72.2           -18.0   87.7     -23.6   66.7
        0.14   167.0     42.7    69.3           -17.2   78.0     -22.3   59.7
        0.2    162.8     52.1    70.8           -15.8   65.4     -20.3   50.6
        0.3    126.0     81.3    91.1           -18.0   62.3     -22.9   48.4
Table 6.4b
MSE, ARB & RRMSE (percentage) of MSE estimators, 45 areas
(Ratio = signal-to-noise ratio.)

                         RB_Y1           RB_Y2           M_et_al         PB estimator    Naive PB estimator
Method  Ratio  MSE       ARB    RRMSE    ARB    RRMSE    ARB    RRMSE    ARB    RRMSE    ARB    RRMSE
MIX     0.06   171.4     9.8    99.4     31.9   84.0     11.8   94.7     3.5    93.8     21.9   78.5
        0.1    174.0     12.1   86.2     33.2   73.1     11.9   83.9     2.6    80.4     17.5   65.1
        0.14   171.2     14.5   74.9     34.4   64.6     12.6   74.5     2.0    68.7     14.0   54.4
        0.2    166.5     17.7   61.7     36.0   55.8     13.9   63.4     0.7    54.5     9.8    41.8
        0.3    128.9     28.8   57.6     48.8   58.2     20.2   63.1     0.3    48.6     8.7    35.9
Taking into account the ARB, the RRMSE and the conditional bias of the MSE estimators, the Naive PB MSE estimator under the MIX performs the best for larger signal-to-noise ratios. Table 6.6 displays performance measures, averaged over the five sampling variance groups, for the three Taylor MSE estimators under the MIX, with data from the same model described in 5.1 but with three different values of the model variance. The RB_Y2 performs better when the model variance equals 1, but as the model variance becomes smaller, the M_et_al MSE estimator has an advantage, precisely because it was constructed under the premise that the model variance is approximately zero.
Table 6.5
Conditional MSE and conditional relative bias (percentage) of MSE estimators, given a zero REML estimate, 45 areas
(Ratio = signal-to-noise ratio.)

                                 Conditional relative bias
Method  Ratio  Conditional MSE   Taylor MSE estimator   PB estimator   Naive PB estimator
REML    0.06   170.2             -64.3                  -89.7          -60.7
        0.1    173.0             -62.4                  -83.7          -57.1
        0.14   170.2             -58.1                  -75.5          -51.8
        0.2    165.8             -51.9                  -65.1          -44.8
        0.3    131.1             -59.0                  -70.5          -49.2
AM.LL   0.06   170.0             -71.5                  -49.0          -3.1
        0.1    172.3             -61.5                  -42.1          -2.3
        0.14   169.1             -51.1                  -35.7          -2.1
        0.2    164.7             -38.3                  -28.3          -1.6
        0.3    129.9             -28.8                  -29.1          -3.7
AR.YL   0.06   169.9             -48.3                  -86.2          -56.7
        0.1    172.6             -38.0                  -80.2          -53.2
        0.14   169.7             -25.9                  -72.2          -48.2
        0.2    165.3             -7.4                   -61.9          -41.5
        0.3    130.5             19.3                   -66.8          -45.5
AM.YL   0.06   166.6             -8.2                   -73.5          -60.7
        0.1    168.8             3.8                    -70.1          -58.1
        0.14   166.1             16.1                   -64.1          -53.3
        0.2    162.2             35.9                   -56.1          -46.8
        0.3    128.1             72.8                   -62.5          -52.5

Method  Ratio  Conditional MSE   RB_Y1    RB_Y2    M_et_al   PB estimator   Naive PB estimator
MIX     0.06   170.0             -71.5    6.2      -64.3     -28.1          -4.0
        0.1    172.3             -61.5    13.2     -62.3     -23.8          -3.5
        0.14   169.1             -51.1    18.9     -57.8     -20.0          -3.3
        0.2    164.7             -38.3    26.8     -51.6     -15.7          -2.9
        0.3    129.9             -28.8    40.4     -58.7     -16.7          -5.1
Table 6.6
MSE, ARB, conditional relative bias (CB) and RRMSE (percentage), 45 areas

                                    RB_Y1               RB_Y2               M_et_al
% zero REML  Model variance  MSE    ARB   CB    RRMSE   ARB   CB    RRMSE   ARB   CB    RRMSE
29           1               108    16    -50   75      36    21    66      14    -59   75
48           0.2             99     48    -36   101     113   88    114     47    -38   94
51           0.1             91     58    -33   108     137   107   127     58    -32   100
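Since Table 6.6 reports measures averaged over the five sampling variance groups, its first row (model variance equal to 1) can be cross-checked against the MIX rows of Table 6.4b. The sketch below does this for the ARB and RRMSE columns; it assumes that the average is a plain arithmetic mean over the five groups, which the text does not state explicitly, and the dictionary names are illustrative.

```python
# ARB and RRMSE of the three Taylor-type estimators under MIX, copied from
# Table 6.4b (45 areas), one value per sampling variance group.
TABLE_6_4B = {
    "RB_Y1": {"ARB": [9.8, 12.1, 14.5, 17.7, 28.8],
              "RRMSE": [99.4, 86.2, 74.9, 61.7, 57.6]},
    "RB_Y2": {"ARB": [31.9, 33.2, 34.4, 36.0, 48.8],
              "RRMSE": [84.0, 73.1, 64.6, 55.8, 58.2]},
    "M_et_al": {"ARB": [11.8, 11.9, 12.6, 13.9, 20.2],
                "RRMSE": [94.7, 83.9, 74.5, 63.4, 63.1]},
}

# First row of Table 6.6 (model variance equal to 1).
TABLE_6_6_ROW = {
    "RB_Y1": {"ARB": 16, "RRMSE": 75},
    "RB_Y2": {"ARB": 36, "RRMSE": 66},
    "M_et_al": {"ARB": 14, "RRMSE": 75},
}


def group_average(values):
    """Plain arithmetic mean over the sampling variance groups."""
    return sum(values) / len(values)


for name, measures in TABLE_6_4B.items():
    for measure, values in measures.items():
        avg = group_average(values)
        reported = TABLE_6_6_ROW[name][measure]
        print(f"{name} {measure}: average {avg:.1f}, Table 6.6 reports {reported}")
```

The averages land within about one percentage point of the Table 6.6 entries; the small gaps may reflect rounding of the published values or a different averaging convention.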
Tables 6.7a and 6.7b below show the outcomes for m = 15 areas, with 3 areas per sampling variance group. Differences in MSEs per variance estimation method are at most 5%. There is no monotone relationship between ARB or RRMSE and the signal-to-noise ratio, which could be an indication that the second order approximation to estimating the MSE is poor under every method of variance estimation. The ARB of all Taylor MSE estimators under the LL and the YL methods of variance estimation is unacceptably high, and the same is true for the RRMSE. The RB_Y2 under the MIX does not fare well either. The reason for this last outcome is clear: with such a high percentage of zero REML estimates (43%), the MIX coincides with AM.LL in a large share of the populations. Thus, the MIX has a positive bias for m = 15, and the RB_Y2 does not account for this bias. The RB_Y1 accounts for the bias in the MIX, but the bias estimator is not very precise for so few areas. The ARB and RRMSE of the M_et_al MSE estimator almost coincide with those of the Taylor MSE estimator under the REML variance estimation, because by definition the two estimators are equal when the REML estimate is positive. The conditional bias of the three Taylor MSE estimators under the MIX is poor. Taking into account all performance measures, the bootstrap MSE estimators perform better than the Taylor MSE estimators. For m = 15 areas with 3 areas per sampling variance group, PB under MIX performs the best, followed by the Naive PB under AR.YL and AM.YL.
Table 6.7a
MSE, ARB & RRMSE (percentage) of MSE estimators, 15 areas
(Ratio = signal-to-noise ratio.)

                         Taylor MSE estimator   PB estimator     Naive PB estimator
Method  Ratio  MSE       ARB     RRMSE          ARB     RRMSE    ARB     RRMSE
REML    0.06   584.8     12.6    87.9           1.2     85.9     6.9     64.5
        0.1    376.7     26.5    106.3          2.3     85.6     9.6     62.8
        0.14   352.5     25.2    90.1           0.7     54.1     4.3     39.3
        0.2    209.4     43.0    123.0          0.4     74.0     6.3     51.1
        0.3    198.7     50.6    124.7          -1.0    46.3     2.6     31.5
AM.LL   0.06   589.3     24.1    89.3           13.7    61.2     24.1    65.8
        0.1    380.7     48.3    107.1          19.4    58.6     32.5    62.9
        0.14   355.7     40.2    88.6           10.0    36.2     16.8    38.1
        0.2    212.5     76.3    117.9          17.8    45.1     28.7    47.3
        0.3    200.7     76.5    105.1          10.7    26.9     17.2    27.6
AR.YL   0.06   583.3     23.8    83.3           3.2     79.5     3.2     61.6
        0.1    375.1     53.3    106.7          5.4     78.6     5.4     59.7
        0.14   351.3     53.3    102.7          2.4     49.4     2.4     37.1
        0.2    207.7     107.3   153.1          4.1     66.2     4.1     47.2
        0.3    197.5     142.0   199.4          1.9     41.1     1.9     28.9
AM.YL   0.06   571.4     41.6    103.5          -8.0    61.2     -9.2    43.3
        0.1    363.3     95.0    161.4          -11.3   62.9     -13.2   44.1
        0.14   342.0     97.2    179.7          -6.7    40.4     -7.8    29.3
        0.2    197.0     198.4   274.6          -14.5   58.2     -16.7   41.7
        0.3    191.4     270.2   362.4          -11.5   38.4     -13.1   28.7
Table 6.7b
MSE, ARB & RRMSE (percentage) of MSE estimators, 15 areas
(Ratio = signal-to-noise ratio.)

                         RB_Y1           RB_Y2           M_et_al         PB estimator    Naive PB estimator
Method  Ratio  MSE       ARB    RRMSE    ARB    RRMSE    ARB    RRMSE    ARB    RRMSE    ARB    RRMSE
MIX     0.06   584.9     21.0   84.7     35.4   93.7     12.6   87.9     10.0   53.8     19.3   62.1
        0.1    377.1     46.0   103.9    68.4   122.6    26.4   106.1    14.8   52.7     26.6   59.9
        0.14   353.0     41.9   91.5     59.4   112.7    25.0   89.9     7.6    33.2     13.7   36.7
        0.2    209.7     83.2   127.8    108.9  155.8    42.8   122.8    14.0   42.5     23.7   46.0
        0.3    198.9     94.8   136.7    117.1  162.2    50.4   124.6    8.7    26.6     14.5   27.7
Summarizing, under the Fay-Herriot model with positive model variance and among the positive variance
estimators under study, the MIX and the AR.YL variance estimators are the only
ones with negligible asymptotic bias. The AM.YL and the LL variance estimators
have a larger asymptotic bias. On the other hand, our simulation showed that
for a moderate number of areas and for populations that yield zero REML
estimates, both YL variance estimators were negatively biased, and produced
EBLUPs that were close to the synthetic estimator of the mean. In contrast, the
MIX, built as the combination of the AM.LL and the REML, was only mildly
negatively biased in these populations. Moreover, the unconditional
distribution of the MIX approached normality much faster than those of the other variance
estimators.
Table 6.8
Conditional MSE and conditional relative bias (percentage) of MSE estimators, given a zero REML estimate, 15 areas
(Ratio = signal-to-noise ratio.)

                                 Conditional relative bias
Method  Ratio  Conditional MSE   Taylor MSE estimator   PB estimator   Naive PB estimator
REML    0.06   594.2             -22.6                  -31.7          -16.5
        0.1    381.2             -32.9                  -43.2          -22.5
        0.14   345.1             -17.7                  -22.7          -10.7
        0.2    212.7             -41.1                  -47.3          -25.5
        0.3    197.9             -30.4                  -32.7          -17.6
AM.LL   0.06   595.6             -4.1                   -5.7           12.1
        0.1    385.7             8.6                    -7.0           15.6
        0.14   351.2             18.9                   -2.0           10.4
        0.2    216.0             46.4                   -5.8           14.4
        0.3    199.5             67.0                   -2.9           9.8
AR.YL   0.06   592.2             -0.8                   -27.1          -11.0
        0.1    379.7             21.0                   -36.5          -14.8
        0.14   344.5             44.0                   -18.6          -6.3
        0.2    210.9             98.2                   -38.6          -16.4
        0.3    196.6             177.3                  -26.1          -11.0
AM.YL   0.06   581.7             30.7                   -21.9          -18.0
        0.1    368.6             79.8                   -31.5          -25.8
        0.14   333.9             98.3                   -15.2          -11.9
        0.2    198.9             198.0                  -36.4          -30.0
        0.3    190.0             296.3                  -26.2          -21.5

Method  Ratio  Conditional MSE   RB_Y1    RB_Y2    M_et_al   PB estimator   Naive PB estimator
MIX     0.06   595.6             -4.1     27.9     -22.9     3.4            17.8
        0.1    385.7             8.6      57.1     -33.7     5.1            22.8
        0.14   351.2             18.9     58.5     -19.1     4.9            14.3
        0.2    216.0             46.4     102.4    -42.0     5.9            20.4
        0.3    199.5             67.0     116.3    -30.9     4.8            13.4
In terms of MSE of the EBLUP, there were considerable gains in precision over the direct estimator under all methods of variance estimation considered here, even for a small number of areas. The AM.LL and both the AM.YL and the AR.YL variance estimators carried lower variability than the REML and the MIX. This had only a minimal impact on the MSE of the EBLUP: differences among MSEs for the same signal-to-noise ratio were small. These differences widened as either the number of areas or the signal-to-noise ratio decreased. Thus, it may be possible that for an extremely low signal-to-noise ratio, the MSE under MIX would be somewhat larger than under the AM.YL variance estimator.
Under the MIX method of variance estimation, we
compared three different Taylor-type MSE estimators and two bootstrap MSE
estimators. All three Taylor estimators of the MSE under MIX (RB_Y1, RB_Y2 and
M_et_al) are unbiased up to the second order. Also the Taylor-type estimators
of the MSE under the LL and the YL are unbiased up to the second order. RB_Y1,
AM.LL and AM.YL may yield negative MSE estimates.
The Taylor MSE estimator under the REML method of variance estimation and the M_et_al under the MIX coincide by definition whenever the REML estimate is positive, hence their performance measures have negligible differences (their true MSEs are different; however, in our study, for m = 100 the MIX coincided with the REML 84% of the time). For a moderate number of areas, which for these data could be m = 100, and for populations that yield zero REML estimates, neither the Taylor MSE estimator under the REML nor the M_et_al MSE estimator accounts for the variation due to the estimation of the model variance, and this is reflected in their very negative conditional bias, which is below -60% for the smaller signal-to-noise ratios. On the other hand, the RB_Y1 does account for the variation due to the estimation of the model variance, but its conditional bias is also very negative: the RB_Y1 is a split MSE estimator that, for populations with zero REML estimates, subtracts a factor of the unconditional bias of the AM.LL, which is always positive, whereas a better formula for a split MSE estimator would use an estimator of the conditional bias. Indeed, even for a moderate number of areas (m = 100), Table 6.1 shows that the unconditional bias of the AM.LL is 49% whereas the conditional bias of the MIX, which coincides with that of the AM.LL, is -37%.
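The two percentages can be traced back to Table 6.1. The worked computation below is a sketch that assumes a true model variance of 1 in the simulation (consistent with the quoted figures) and uses the m = 100 rows: the unconditional expectation 1.49 of the AM.LL, and the conditional expectation 0.63 shared by the AM.LL and the MIX. The symbols are introduced here for illustration only.

```latex
\[
\frac{E\bigl(\hat{\sigma}^2_{v,\mathrm{AM.LL}}\bigr) - \sigma_v^2}{\sigma_v^2}
  = \frac{1.49 - 1}{1} = 49\%,
\qquad
\frac{E\bigl(\hat{\sigma}^2_{v,\mathrm{MIX}} \mid \hat{\sigma}^2_{v,\mathrm{REML}} = 0\bigr) - \sigma_v^2}{\sigma_v^2}
  = \frac{0.63 - 1}{1} = -37\%.
\]
```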
The PB MSE estimator under the AR.YL and the MIX methods adjusted well for the bias but paid in terms of variance. Among all the MSE estimators, the Naive PB MSE estimator appears to have performed best, and better still under the MIX variance estimation, when the three measures ARB, conditional bias and RRMSE are taken into account together. We found that for a moderate number of areas, the RB_Y2 had the lowest RRMSE among the Taylor estimators under the MIX method. On the other hand, M_et_al is most reliable when the true underlying model variance is very small: in this case M_et_al is effectively the MSE estimator of the synthetic estimator of the small area mean. We do not recommend relying on the second order approximation to the MSE when m is small: the approximation (2.6) to the MSE does not necessarily hold, and the performance measures obtained from our study are very unstable and may vary from data set to data set.
In conclusion, under the hypothesis of a positive model variance, the relative performances of competing positive variance estimators depend on the size of the signal-to-noise ratio, the number of areas and the objective function. For a moderate number of areas, the MIX variance estimator appeared to perform better than the LL and the YL estimators in this study; under the MIX method, the Naive PB MSE estimator had the lowest conditional bias and RRMSE combined; the M_et_al MSE estimator under the MIX variance estimator performed marginally better than the RB_Y1 when the underlying model variance was very small. However, the percentage of REML zeros yielded under the simulation model shows that a zero REML estimate and/or a negative hypothesis test does not necessarily mean that the model variance is sufficiently small to rely on M_et_al. In the absence of other information, the Naive PB estimator under the MIX appears to perform better.
Acknowledgements
The authors would like to thank Professor J.N.K. Rao of Carleton University for his useful comments and Victor Estevao of Statistics Canada for developing the grid maximization especially for this project. We would also like to thank the reviewers for their careful appraisal of our paper and for their suggestions to improve it.
Appendix A
Proof of Theorem 4.1
The asymptotic variance of
is
given by:
We show that
Indeed, by the Hölder and Minkowski
inequalities, with any
and
setting
and
the indicator
of
populations with
we
have:
since
is uniformly bounded and
Note that the AM.LL and REML estimators of
are
uniformly bounded as a consequence of their almost sure convergence to
(see,
for example, Yuan and Jennrich 1998).
Proof of Theorem 4.2
We denote by
the
maximum likelihood variance estimator.
We show first that
Let
be the
estimating equation that yields the variance estimator *. Equation (3.4)
implies:
With
and
equation (A.3) implies:
Now, using equation (A.4), the
consistency of the ML and AM.LL estimators of
the
two-term Taylor expansion of
at
and
as
the
left-hand side in (A.3) is equal to:
The last equality
above implies
Similarly, we establish a relationship between
and
given
that
follows
from conditions 1 through 3 in Section 3 and equation (3.1), we have:
Equation
(A.6) and the same argument as with the AM.LL estimator, imply:
Equations (A.5)
and (A.7) combined, yield:
Now we express the
bias of the MIX estimator by:
We add and subtract
from
the right-hand side of the equation above to obtain:
Now, since
is
uniformly bounded, we apply the Hölder and Minkowski inequalities with
and
equation (A.8) to the last term in (A.9) to obtain:
Proof of Remark 4.2:
is unbiased up to the second order
since
in
and
cancels
out in (A.11). But
and is uniformly bounded under
the regularity conditions given in Section 2, hence the last term in (A.11) is
also an
which renders
unbiased up to the second order.
Appendix B
B.1 Comparison between
REML and AR.YL using the scoring algorithm
The scoring algorithm could sometimes yield
zero estimates for the likelihood of the AR.YL. Indeed, for data sets simulated
under the model given in Section 5, with
and
the REML and AR.YL scoring algorithms yielded 28%
and 26% zeros respectively.
Figures B.1 to B.3 illustrate why: the likelihoods correspond to a single
population generated under the model with
for
which
Figure B.1
REML likelihood (y-axis) plotted against the variance parameter (x-axis). After reaching its maximum value, the likelihood quickly decreases to 0.
Figure B.2
AR.YL likelihood (y-axis) plotted against the variance parameter (x-axis). The likelihood increases up to its maximum value, reached very close to 0, and then quickly decreases to 0.
Figure B.3
AM.LL likelihood (y-axis) plotted against the variance parameter (x-axis). The likelihood is about 0 at the origin, quickly increases to its maximum value, reached close to 0, and finally decreases slowly to 0, which is reached between 3 and 4.
Figure B.2 shows that the maximum of the AR.YL likelihood lies very near the boundary. The scoring algorithm may often miss this maximum and yield a zero value. Figure B.3 shows that the AM.LL likelihood has a maximum that is better separated from the boundary.
B.2 Treatment of zeros
in the parametric bootstrap
For each estimate
and each
method of variance estimation:
Generate a large number B of random area
effects
and generate, independently of
sampling errors
Generate bootstrap data
If
then generate
from the synthetic model (see also Rao and
Molina 2015).
Fit the model to the bootstrap data and
obtain
for the MIX estimator calculate
is positive and
otherwise.
Now obtain
the corresponding EBLUP
the bootstrap components
and
The Naive MSE bootstrap estimator is
The PB MSE estimator, which is adjusted for bias (Pfeffermann and Glickman 2004), is:
To calculate
average
over the populations with
and do similarly with
of
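The bootstrap steps above, including the treatment of zero variance estimates, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses an intercept-only Fay-Herriot model, a truncated moment estimator of the model variance as a stand-in for the estimators compared in the paper, and the common form of the naive bootstrap MSE estimator (the bootstrap average of squared prediction errors). The bias-adjusted PB estimator is not reproduced. All function names are hypothetical, and the sampling variances `psi` are assumed to be strictly positive.

```python
import random


def fit_fh_intercept(y, psi):
    """Fit an intercept-only Fay-Herriot model.

    Returns (sigma2, beta, eblup). The variance estimator is a simple
    truncated moment estimator, used here only as a stand-in.
    """
    m = len(y)
    ybar = sum(y) / m
    s2 = sum((yi - ybar) ** 2 for yi in y) / (m - 1)
    sigma2 = max(0.0, s2 - sum(psi) / m)            # can be exactly zero
    weights = [1.0 / (sigma2 + p) for p in psi]     # GLS weights
    beta = sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    gamma = [sigma2 / (sigma2 + p) for p in psi]    # shrinkage factors
    eblup = [g * yi + (1.0 - g) * beta for g, yi in zip(gamma, y)]
    return sigma2, beta, eblup


def naive_pb_mse(y, psi, n_boot=500, seed=0):
    """Naive parametric bootstrap MSE estimate for each area's EBLUP."""
    rng = random.Random(seed)
    sigma2, beta, _ = fit_fh_intercept(y, psi)
    m = len(y)
    sq_err = [0.0] * m
    for _ in range(n_boot):
        # Zero variance estimate: no area effects, i.e. the synthetic model.
        if sigma2 > 0.0:
            effects = [rng.gauss(0.0, sigma2 ** 0.5) for _i in range(m)]
        else:
            effects = [0.0] * m
        theta_star = [beta + v for v in effects]
        y_star = [t + rng.gauss(0.0, p ** 0.5) for t, p in zip(theta_star, psi)]
        _, _, eblup_star = fit_fh_intercept(y_star, psi)   # refit on bootstrap data
        for i in range(m):
            sq_err[i] += (eblup_star[i] - theta_star[i]) ** 2
    return [s / n_boot for s in sq_err]


# Toy data (hypothetical): five areas with unit sampling variances.
print(naive_pb_mse([0.0, 4.0, -4.0, 8.0, -8.0], [1.0] * 5, n_boot=200))
```

For a MIX-type estimator, the refit inside the loop would return the REML estimate when it is positive and the AM.LL estimate otherwise, as described in the steps above.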
References
Chen, S., and Lahiri, P. (2008). On mean squared
prediction error estimation in small area estimation problems. Communications
in Statistics-Theory and Methods, 37, 1792-1798.
Chen, S., and Lahiri, P. (2011). On the estimation of
Mean Squared Prediction Error in small area estimation. Calcutta Statistical
Association Bulletin, 63, (Special 7th Triennial Proceedings
Volume), Nos. 249-252.
Cressie, N. (1992). REML estimation in empirical Bayes
smoothing of census undercount. Survey Methodology, 18, 1, 75-94.
Das, K., Jiang, J. and Rao, J.N.K. (2004). Mean squared error
of empirical predictor. The Annals of Statistics, 32, 2, 818-840.
Datta, G., and Lahiri, P. (2000). A unified measure of
uncertainty of estimated best linear unbiased predictors in small area
estimation problems. Statistica Sinica, 10, 613-627.
Estevao, V. (2014). Grid optimization algorithm for
maximum likelihood. Internal report, Statistical Research and Innovation Division
(SRID), Statistics Canada.
Fay, R.E., and Herriot, R.A. (1979). Estimation of income
from small places: An application of James-Stein Procedures to census data.
Journal of the American Statistical Association, 74, 269-277.
Lahiri, P., and Pramanik, S. (2011). Discussion of
“Estimating random effects via adjustment for density maximization” by C. Morris
and R. Tang. Statistical Science, 26, 2, 291-295.
Li, H., and Lahiri, P. (2011).
An adjusted maximum likelihood method for solving small area estimation
problems. Journal of Multivariate Analysis, 101, 882-892.
Molina, I., Rao, J.N.K. and Datta, G.S. (2015). Small
area estimation under a Fay-Herriot model with preliminary testing for the
presence of random effects. Survey Methodology, 41, 1, 1-19.
Morris, C.N. (2006). Mixed model prediction and small
area estimation (with discussions). Test, 15, 72-76.
Pfeffermann, D., and Glickman, H. (2004). Mean squared
error approximation in small area estimation by use of parametric and
non-parametric bootstrap. Proceedings of the American Statistical
Association, Section on Survey Research Methods, Alexandria, VA. 4167-78.
Rao, J.N.K. (2003). Small Area Estimation. New
York: John Wiley & Sons, Inc.
Rao, J.N.K., and Molina, I. (2015). Small Area
Estimation, second edition. New York: John Wiley & Sons, Inc.
Rubin-Bleuer,
S., and Schiopu-Kratina, I. (2005). On the two-phase framework for joint
model and design-based inference. The Annals of Statistics, 33, 6,
2789‑2810.
Rubin-Bleuer, S., and You, Y. (2012). A positive
variance estimator for the Fay-Herriot small area model. SRID-2012-009E, Statistical
Research and Innovation Division (SRID), Statistics Canada.
Rubin-Bleuer,
S., Yung, W. and Landry, S. (2010). Adjusted maximum likelihood method
for a small area model accounting for time and area effects. SRID-2010-006E, Statistical
Research and Innovation Division (SRID), Statistics Canada.
Rubin-Bleuer,
S., Yung, W. and Landry, S. (2011). Adjusted maximum likelihood method
for a small area model accounting for time and area effects. Long abstract, Small Area Estimation (SAE 20122) in Trier, Germany, International
Statistical Institute Satellite Conference.
Rubin-Bleuer,
S., Yung, W. and Landry, S. (2012). Variance Component Estimation through the
Adjusted Maximum Likelihood Approach. Presentation at the Conference in Honour
of the 75th birthday of J.N.K. Rao Carleton University, May
2012, Ottawa.
Yoshimori, M., and Lahiri, P. (2014). A new adjusted
maximum likelihood method for the Fay-Herriot small area model. Journal of
Multivariate Analysis, 124, 281‑294.
Yuan, P. (2009). Comparison of SAE methods of variance
estimation. Internal document, Statistical Research and Innovation Division
(SRID), Statistics Canada.
Yuan, K.H., and Jennrich, R. (1998). Asymptotics of
estimating equations under natural conditions. Journal of Multivariate Analysis, 65, 2, 245-260.