Bayesian benchmarking of the Fay-Herriot model using random deletion
Section 4. Empirical studies
The purposes of these empirical studies are twofold.
First, we demonstrate that the BFH model can be fit as stated in Section 2
and that the deleting the last one benchmarking and random deletion benchmarking
methods can be carried out. Second, the benchmarking methods are compared in a
simulation study based on a dataset that is widely used in the small area literature.
In the data generation process, we use the data on corn
and soybean acres in Battese, Harter and Fuller (1988), available for 12
counties (areas) in Iowa.
The resulting county-level corn and soybean acreages are constructed using a
number of segments sampled from the population (known number of segments).
Landsat satellite data on the number of pixels of corn and soybean in the
sampled segments (i.e., two covariates) are also available. The finite
population means of the number of pixels classified as corn and soybean for
each county are also reported. Starting with this dataset, we construct new
datasets with any number of areas.
The data generation process has two steps. In the first
step, the unit-level model
where
is fit to the data available for the
counties in Iowa. The area sample sizes are
and
Using least squares, we estimate
and
by
and
respectively. For the areas with sample size
greater than one, we set
equal to the estimated variance of the sample
mean
and we let
be their geometric mean. For the areas with sample
size equal to one, we set
equal to
The vector of covariates
has three elements, the integer one (for the
intercept), followed by the population means of pixels classified as corn and soybean.
In the second step, the data generation process for any
desired number
of small areas is illustrated. The covariates
are sampled with replacement from
Then, the area-level means are drawn using
where
and
are the least
squares estimates defined above. The sample variances
are generated
in two steps. First, the sample sizes are drawn from a uniform distribution,
Second, let
where
and
defined above.
Finally, the small area survey estimates are drawn using
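The benchmarking target, denoted a, is set equal to the sum of the survey
estimates (a = 1,435 for the 12-area scenario in Table 4.6); we also consider
variants of this value, 1.5a and 0.5a, scaled up or
down by 50%. In NASS’s practice, for crop county estimates, this target is an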
already set state value. To evaluate the benchmarking methods in extreme cases,
we consider additional simulation scenarios, where an area sample size is set
to 2 or 50, or where the factor
is multiplied
by ten.
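The second step of the data generation process can be sketched in Python as follows. This is only an illustration under stated assumptions: the county covariates, the regression coefficients, the two variance values and the form of the standard errors (unit-level variance divided by the sample size) are hypothetical placeholders, since the formulas and estimates are not reproduced in this text. The sample size range 5-25 and the target as the sum of the survey estimates follow the description in this section.

```python
import numpy as np

rng = np.random.default_rng(2019)

# Hypothetical stand-ins for first-step quantities; none of these numbers
# come from the paper.
X_counties = np.column_stack([
    np.ones(12),                      # intercept
    rng.uniform(150.0, 400.0, 12),    # placeholder mean corn pixels
    rng.uniform(150.0, 400.0, 12),    # placeholder mean soybean pixels
])
beta_hat = np.array([50.0, 0.3, -0.1])  # placeholder regression coefficients
delta2_hat = 100.0                      # placeholder between-area variance
sigma2_hat = 300.0                      # placeholder unit-level variance


def generate_areas(m, rng):
    """Generate covariates, sample sizes, standard errors and survey estimates for m areas."""
    rows = rng.integers(0, X_counties.shape[0], size=m)    # sample with replacement
    x = X_counties[rows]
    theta = rng.normal(x @ beta_hat, np.sqrt(delta2_hat))  # area-level means
    n = rng.integers(5, 26, size=m)                        # sample sizes in 5..25
    s = np.sqrt(sigma2_hat / n)                            # assumed form of the standard errors
    theta_hat = rng.normal(theta, s)                       # survey (direct) estimates
    target = theta_hat.sum()                               # benchmarking target a
    return x, n, s, theta_hat, target


x, n, s, theta_hat, a = generate_areas(12, rng)
targets = {"a": a, "1.5a": 1.5 * a, "0.5a": 0.5 * a}
```

Calling `generate_areas(99, rng)` would produce a dataset of the size used for the 99-area scenario, again under the same placeholder assumptions.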
In what follows, we report empirical results mostly for a simulation scenario
using 12 areas. Examples using a larger number of areas are briefly discussed.
For example, Iowa has 99 counties, and one of NASS’s interests is in
benchmarking county estimates for planted acres, harvested acres and production
(bushels) to the predefined state-level total. For such small numbers of areas,
no adjustment is needed to the benchmarking procedures (deleting the last one or
random deletion) introduced in the previous sections. However, the computation
may become prohibitive for an extremely large number of areas (say, one
million), and some adjustments to the current procedures would then be needed.
It is pertinent to discuss the computations for the simulation scenario with 12 areas.
For posterior inference under the BFH model, we have used 1,000 random draws,
and this runs in just a few seconds. On the other hand, it is more difficult to
run a Gibbs sampler for deleting one at a time or random deletion benchmarking.
Nevertheless, we have obtained an efficient Gibbs sampler as follows. We used a
long run of 20,000 iterations, with a “burn in” of the first 10,000 iterations,
and kept every tenth iterate thereafter, which leaves 1,000 iterates. These
settings were chosen by trial and error, guided by the autocorrelations, the
Geweke test for stationarity and the effective sample sizes. For the 1,000
selected iterates, the autocorrelations are all negligible. For random deletion
benchmarking, the p-values of the Geweke test for the three regression
coefficients and the variance are, respectively, 0.651, 0.087, 0.828 and 0.699
(i.e., stationarity is not rejected), and the effective sample sizes are all
1,000. Also, the trace plots show no evidence of nonstationarity. Therefore,
the Gibbs sampler is efficient, taking a few seconds despite the large number
of iterations.
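The thinning scheme and a stationarity check of the Geweke type can be illustrated with a short Python sketch. It is not the authors' code: the chain below is simulated white noise standing in for one parameter's Gibbs output, and the p-value uses plain sample variances for the early and late segments rather than the spectral variance estimates of the full Geweke diagnostic. The segment fractions (first 10%, last 50%) are the conventional defaults and are an assumption here.

```python
import math
import numpy as np


def thin_chain(chain, burn_in=10_000, step=10):
    """Drop the burn-in and keep every `step`-th iterate thereafter."""
    return chain[burn_in::step]


def geweke_p_value(draws, first=0.1, last=0.5):
    """Two-sided p-value comparing the means of the early and late parts of a chain.

    Simplification: variances are plain sample variances (no spectral estimate).
    """
    draws = np.asarray(draws, dtype=float)
    early = draws[: int(first * len(draws))]
    late = draws[int((1.0 - last) * len(draws)):]
    se = math.sqrt(early.var(ddof=1) / len(early) + late.var(ddof=1) / len(late))
    z = (early.mean() - late.mean()) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


rng = np.random.default_rng(0)
chain = rng.normal(size=20_000)   # stand-in for one parameter's Gibbs output
kept = thin_chain(chain)
p_value = geweke_p_value(kept)
print(len(kept), p_value)
```

With 20,000 iterations, a burn-in of 10,000 and a thinning step of 10, exactly 1,000 iterates are kept, which matches the number of selected iterations reported above.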
The
performance of benchmarking methods is assessed using a set of metrics that
include posterior means (PM) and posterior standard deviations (PSD), and when
it is convenient, posterior coefficients of variation (PCV), numerical standard
errors (NSE) of the estimates and 95% highest posterior density intervals (95%
HPD). Numerical results are presented in Tables 4.1-4.8.
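For readers who want to reproduce these summaries from a set of posterior draws, a minimal Python sketch is given here. The function names are hypothetical, the HPD interval is computed as the shortest interval containing 95% of the sorted draws, and the NSE formula (PSD divided by the square root of the number of draws) assumes roughly independent draws; the paper's exact NSE computation is not stated in this section.

```python
import numpy as np


def hpd_interval(draws, prob=0.95):
    """Shortest interval containing a fraction `prob` of the sorted draws."""
    x = np.sort(np.asarray(draws, dtype=float))
    k = int(np.ceil(prob * len(x)))            # number of draws inside the interval
    widths = x[k - 1:] - x[: len(x) - k + 1]   # width of every candidate interval
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]


def summarize(draws):
    """Posterior mean, standard deviation, CV, numerical standard error and 95% HPD."""
    draws = np.asarray(draws, dtype=float)
    pm = draws.mean()
    psd = draws.std(ddof=1)
    return {
        "PM": pm,
        "PSD": psd,
        "PCV": psd / pm,
        "NSE": psd / np.sqrt(len(draws)),  # assumes roughly independent draws
        "HPD": hpd_interval(draws),
    }


rng = np.random.default_rng(1)
draws = rng.normal(100.0, 5.0, size=1000)  # simulated draws, not the paper's
summary = summarize(draws)
```

As a check of the PCV definition against the reported results, area 1 in Table 4.2 has PSD/PM = 5.617/133.985, which is approximately 0.042, the PCV listed there.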
A summarized version of the basic results is presented in Table 4.1, which
compares the average, standard error and coefficient of variation of the
observed data (OB) with the PMs, PSDs and PCVs from the BFH model, the deleting
the last one (LO) benchmarking model and the random deletion (RD) benchmarking
model. The results in Table 4.1 cover two simulation scenarios: panel a, with
small variation in the observed data, and panel b, with relatively larger
variation in the observed data. In panel a, there is very little difference
between the observed data and the posterior quantities from the BFH, LO and RD
models. Given the small coefficients of variation of the survey estimates, it is
difficult for any model to further reduce variability. Hence, the PCVs are
comparable to the CVs of the survey estimates. On the other hand, three
interesting points can be made for panel b. First, the PMs under the BFH model
can be very different from those under the LO and RD models, while the latter
two are very close to each other. Second, the PSDs are much smaller than the
standard errors of the observed data, so there are substantial gains in
precision. Under the BFH model, the PSDs are about four to five times smaller
than the standard errors of the observed data, whereas the PSDs under the LO
and RD models are about twice those of the BFH model. The PCVs follow the same
pattern. Third, LO and RD are very close in all three measures (PMs, PSDs,
PCVs), with the RD model generally having slightly smaller PSDs. As expected,
there is little difference between the LO model and the RD model. One must also
observe that benchmarking the BFH model matters, because it can give answers
that differ from those of the BFH model, at least in terms of posterior
standard deviations and coefficients of variation. Benchmarking is a jittering
procedure, which helps to protect the model from misspecification, and
therefore it must lead to increased variability in the small area estimates.
Table 4.1
Comparison of BFH model with no benchmarking, deleting the last one benchmarking and random benchmarking via posterior mean (PM), posterior standard deviation (PSD) and posterior coefficient of variation (PCV) for two values of

| Panel | A | PM (OB) | PM (BFH) | PM (LO) | PM (RD) | PSD (OB) | PSD (BFH) | PSD (LO) | PSD (RD) | PCV (OB) | PCV (BFH) | PCV (LO) | PCV (RD) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| a. | 1 | 135.6 | 134.0 | 133.8 | 133.5 | 6.03 | 5.62 | 5.47 | 5.41 | 0.044 | 0.042 | 0.041 | 0.041 |
| a. | 2 | 102.0 | 103.5 | 103.1 | 103.0 | 7.10 | 6.50 | 6.11 | 5.82 | 0.070 | 0.063 | 0.059 | 0.057 |
| a. | 3 | 117.7 | 121.0 | 120.7 | 120.5 | 7.31 | 6.72 | 6.55 | 6.25 | 0.062 | 0.056 | 0.054 | 0.052 |
| a. | 4 | 77.0 | 81.5 | 81.4 | 81.0 | 5.88 | 6.00 | 5.46 | 5.53 | 0.076 | 0.074 | 0.067 | 0.068 |
| a. | 5 | 126.9 | 127.8 | 127.5 | 127.5 | 5.63 | 5.25 | 5.25 | 5.06 | 0.044 | 0.041 | 0.041 | 0.040 |
| a. | 6 | 113.1 | 113.4 | 112.9 | 113.1 | 8.06 | 7.15 | 6.82 | 6.74 | 0.071 | 0.063 | 0.060 | 0.060 |
| a. | 7 | 137.2 | 133.7 | 133.5 | 133.9 | 6.74 | 6.38 | 5.93 | 6.02 | 0.049 | 0.048 | 0.044 | 0.045 |
| a. | 8 | 124.8 | 124.7 | 124.7 | 124.7 | 4.03 | 3.91 | 3.83 | 3.76 | 0.032 | 0.031 | 0.031 | 0.030 |
| a. | 9 | 118.3 | 116.5 | 115.8 | 116.6 | 7.54 | 6.79 | 6.29 | 6.65 | 0.064 | 0.058 | 0.054 | 0.057 |
| a. | 10 | 156.5 | 153.4 | 153.3 | 153.3 | 4.37 | 4.45 | 4.12 | 4.18 | 0.028 | 0.029 | 0.027 | 0.027 |
| a. | 11 | 109.5 | 110.3 | 110.3 | 110.2 | 4.88 | 4.64 | 4.70 | 4.70 | 0.045 | 0.042 | 0.043 | 0.043 |
| a. | 12 | 116.3 | 118.1 | 117.9 | 117.7 | 7.23 | 6.62 | 6.26 | 6.00 | 0.062 | 0.056 | 0.053 | 0.051 |
| b. | 1 | 129.1 | 129.8 | 127.2 | 126.5 | 19.07 | 4.64 | 10.71 | 10.45 | 0.148 | 0.036 | 0.084 | 0.083 |
| b. | 2 | 117.3 | 126.3 | 122.1 | 122.1 | 22.46 | 5.08 | 12.73 | 12.51 | 0.191 | 0.040 | 0.104 | 0.102 |
| b. | 3 | 120.0 | 145.5 | 137.3 | 136.9 | 23.11 | 5.93 | 12.91 | 12.68 | 0.193 | 0.041 | 0.094 | 0.093 |
| b. | 4 | 68.8 | 107.3 | 94.0 | 93.6 | 18.60 | 7.47 | 12.04 | 11.86 | 0.270 | 0.070 | 0.128 | 0.127 |
| b. | 5 | 142.4 | 146.4 | 142.3 | 142.2 | 17.80 | 4.52 | 11.98 | 11.15 | 0.125 | 0.031 | 0.084 | 0.078 |
| b. | 6 | 108.8 | 120.2 | 115.2 | 115.4 | 25.49 | 5.43 | 11.75 | 11.66 | 0.234 | 0.045 | 0.102 | 0.101 |
| b. | 7 | 136.8 | 116.2 | 118.2 | 119.0 | 21.31 | 5.37 | 11.32 | 11.90 | 0.156 | 0.046 | 0.096 | 0.100 |
| b. | 8 | 124.5 | 132.5 | 127.3 | 127.3 | 12.76 | 4.39 | 9.00 | 8.91 | 0.102 | 0.033 | 0.071 | 0.070 |
| b. | 9 | 144.2 | 127.5 | 128.0 | 129.5 | 23.86 | 5.33 | 12.74 | 14.00 | 0.165 | 0.042 | 0.100 | 0.108 |
| b. | 10 | 172.9 | 129.2 | 145.5 | 145.3 | 13.81 | 9.23 | 10.28 | 10.37 | 0.080 | 0.071 | 0.071 | 0.071 |
| b. | 11 | 109.1 | 114.7 | 110.6 | 110.2 | 15.42 | 4.31 | 10.53 | 10.43 | 0.141 | 0.038 | 0.095 | 0.095 |
| b. | 12 | 108.4 | 120.3 | 114.6 | 114.2 | 22.87 | 5.10 | 12.42 | 12.01 | 0.211 | 0.042 | 0.108 | 0.105 |
Under the basic simulation scenario, we compare the
deletion benchmarking methods to one of the methods in DGSM that provides
benchmarked posterior estimates without deletion. To match the notation in
DGSM, the benchmarking equation must be rewritten as
where
Let
denote the posterior means from the BFH model.
Now, define
and
Note that among
the several specifications in DGSM, we have selected
at random (no
preference). Then, the benchmarked Bayes estimators of DGSM are
Empirical results
using the estimator
are presented
in the note to Table 4.1. The largest difference between the benchmarked
estimates under the different benchmarking methods is for area 10 (OB: 172.9;
BFH: 129.2; LO: 145.5; RD: 145.3; DGSM: 126.4). In general, the PMs from LO and
RD are closer to OB (observed data). Otherwise, the DGSM estimates compare
reasonably well with those from LO and RD benchmarking, although there are some
small differences; DGSM does not provide posterior standard deviations and
credible intervals.
More detailed results for
are presented in Tables 4.2-4.8 and in
Figures 4.1-4.4. Our interest is mainly to compare deletion of a single
area (e.g., LO) and RD.
Using the results in Table 4.2, we conclude that
the PMs from the BFH model (without benchmarking) are slightly different from
the direct estimates and, as expected, larger than the smaller direct estimates
and smaller than the larger ones. As expected, the PSDs are smaller than the
direct standard deviations, except for two areas (areas 4 and 10). For example,
the smallest direct estimate (76.997, area 4) has the largest shrinkage and a
PSD slightly larger than its direct standard deviation (5.995 vs. 5.881); the
results are consistent with the standard shrinkage that occurs in small area
estimation. We note that the PCVs are all small and the NSEs are reasonably
small, too.
Table 4.2
Comparison of the direct estimator with posterior inference from the Bayesian Fay-Herriot model for the area parameters

| Area | Sample size | Direct estimate | Standard error | PM | PSD | PCV | NSE | 95% HPD |
|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 133.985 | 5.617 | 0.042 | 0.057 | (123.422, 145.402) |
| 2 | 7 | 101.980 | 7.101 | 103.461 | 6.498 | 0.063 | 0.065 | (90.598, 116.134) |
| 3 | 24 | 117.655 | 7.309 | 121.006 | 6.716 | 0.056 | 0.066 | (107.730, 134.124) |
| 4 | 23 | 76.997 | 5.881 | 81.473 | 5.995 | 0.074 | 0.058 | (69.046, 92.578) |
| 5 | 21 | 126.917 | 5.629 | 127.832 | 5.248 | 0.041 | 0.052 | (117.850, 138.406) |
| 6 | 9 | 113.132 | 8.061 | 113.393 | 7.147 | 0.063 | 0.068 | (99.441, 127.451) |
| 7 | 5 | 137.236 | 6.739 | 133.661 | 6.378 | 0.048 | 0.064 | (121.771, 146.662) |
| 8 | 20 | 124.839 | 4.034 | 124.732 | 3.906 | 0.031 | 0.039 | (117.233, 132.309) |
| 9 | 16 | 118.306 | 7.544 | 116.479 | 6.785 | 0.058 | 0.071 | (103.225, 130.003) |
| 10 | 9 | 156.503 | 4.368 | 153.355 | 4.449 | 0.029 | 0.045 | (144.785, 162.031) |
| 11 | 23 | 109.546 | 4.877 | 110.348 | 4.637 | 0.042 | 0.047 | (101.179, 119.294) |
| 12 | 9 | 116.314 | 7.232 | 118.098 | 6.623 | 0.056 | 0.068 | (105.135, 131.186) |
The
estimates from the BFH model with deleting the last area and with random
deletion under a uniform prior (equal weights) are presented in Tables 4.3
and 4.4. The posterior weights barely differ from 0.083 with the largest one (0.097)
of the last area and smallest one (0.056) of the
area. Both random deletion and deleting the last one provide improved
precision: the PSDs of the benchmarked estimates are smaller than the observed
standard errors for all areas under random deletion and for all areas but one
(area 4, 5.906 vs. 5.881) under deleting the last one. The NSEs are larger than
for no benchmarking, but this barely matters, as these are errors of the PMs
(the integer part of most PMs has three digits).
Table 4.3
Comparison of the direct estimator with posterior inference from the Bayesian Fay-Herriot model for the area parameters under random deletion benchmarking

| Area | Sample size | Direct estimate | Standard error | PM | PSD | PCV | NSE | 95% HPD |
|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 133.516 | 5.431 | 0.041 | 0.171 | (123.414, 143.541) |
| 2 | 7 | 101.980 | 7.101 | 102.903 | 5.793 | 0.056 | 0.199 | (92.378, 114.250) |
| 3 | 24 | 117.655 | 7.309 | 120.671 | 6.237 | 0.052 | 0.194 | (107.744, 132.190) |
| 4 | 23 | 76.997 | 5.881 | 81.170 | 5.597 | 0.069 | 0.202 | (69.781, 91.177) |
| 5 | 21 | 126.917 | 5.629 | 127.652 | 5.036 | 0.039 | 0.170 | (118.293, 137.228) |
| 6 | 9 | 113.132 | 8.061 | 112.805 | 6.707 | 0.059 | 0.223 | (100.926, 126.074) |
| 7 | 5 | 137.236 | 6.739 | 133.908 | 6.007 | 0.045 | 0.177 | (122.135, 145.344) |
| 8 | 20 | 124.839 | 4.034 | 124.703 | 3.757 | 0.030 | 0.120 | (117.962, 132.304) |
| 9 | 16 | 118.306 | 7.544 | 116.451 | 6.650 | 0.057 | 0.249 | (103.400, 129.316) |
| 10 | 9 | 156.503 | 4.368 | 153.222 | 4.216 | 0.028 | 0.134 | (144.392, 160.854) |
| 11 | 23 | 109.546 | 4.877 | 110.221 | 4.694 | 0.043 | 0.150 | (101.038, 119.570) |
| 12 | 9 | 116.314 | 7.232 | 117.780 | 5.997 | 0.051 | 0.208 | (104.619, 128.158) |
Table 4.4
Comparison of the direct estimator with posterior inference from the Bayesian Fay-Herriot model for the area parameters under deleting the last area

| Area | Sample size | Direct estimate | Standard error | PM | PSD | PCV | NSE | 95% HPD |
|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 133.772 | 5.519 | 0.041 | 0.151 | (122.213, 143.991) |
| 2 | 7 | 101.980 | 7.101 | 103.026 | 6.319 | 0.061 | 0.171 | (89.424, 113.857) |
| 3 | 24 | 117.655 | 7.309 | 120.470 | 6.458 | 0.054 | 0.209 | (108.783, 134.261) |
| 4 | 23 | 76.997 | 5.881 | 81.391 | 5.906 | 0.073 | 0.171 | (69.636, 92.634) |
| 5 | 21 | 126.917 | 5.629 | 127.883 | 5.158 | 0.040 | 0.142 | (117.282, 137.305) |
| 6 | 9 | 113.132 | 8.061 | 112.895 | 6.270 | 0.056 | 0.216 | (100.664, 124.320) |
| 7 | 5 | 137.236 | 6.739 | 133.298 | 5.948 | 0.045 | 0.178 | (121.831, 144.727) |
| 8 | 20 | 124.839 | 4.034 | 124.664 | 3.810 | 0.031 | 0.124 | (117.321, 131.941) |
| 9 | 16 | 118.306 | 7.544 | 116.542 | 6.531 | 0.056 | 0.203 | (104.238, 129.622) |
| 10 | 9 | 156.503 | 4.368 | 153.229 | 4.353 | 0.028 | 0.132 | (144.443, 161.593) |
| 11 | 23 | 109.546 | 4.877 | 109.997 | 4.563 | 0.041 | 0.168 | (101.428, 118.953) |
| 12 | 9 | 116.314 | 7.232 | 117.835 | 6.344 | 0.054 | 0.215 | (106.421, 131.483) |
The
three methods (BFH, RD, LO) are compared using the results in Table 4.5.
The PMs are comparable, so that benchmarking (RD, LO) does not distort (shrink)
the estimates much beyond the shrinkage under the BFH model. Also, the PSDs under LO and RD are almost always smaller than those under the BFH model. For
eight of the twelve areas, RD has smaller PSDs than LO; in these areas, RD
shows roughly 1% decrease in PSD over LO and roughly 4% over the PSDs from BFH.
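The count of areas in which RD has the smaller PSD can be checked directly from the PSD columns of Table 4.5. The short Python sketch below copies those columns (variable names are illustrative) and lists the areas where RD beats LO, as well as any areas where a benchmarked PSD exceeds the BFH one.

```python
# PSD columns of Table 4.5, areas 1 to 12
bfh_psd = [5.617, 6.498, 6.716, 5.995, 5.248, 7.147, 6.378, 3.906, 6.785, 4.449, 4.637, 6.623]
rd_psd = [5.431, 5.793, 6.237, 5.597, 5.036, 6.707, 6.007, 3.757, 6.650, 4.216, 4.694, 5.997]
lo_psd = [5.519, 6.319, 6.458, 5.906, 5.158, 6.270, 5.948, 3.810, 6.531, 4.353, 4.563, 6.344]

# Areas (1-based) where RD has a smaller PSD than LO
rd_beats_lo = [i + 1 for i, (r, l) in enumerate(zip(rd_psd, lo_psd)) if r < l]

# Areas where a benchmarked PSD is larger than the BFH PSD
rd_above_bfh = [i + 1 for i, (r, b) in enumerate(zip(rd_psd, bfh_psd)) if r > b]
lo_above_bfh = [i + 1 for i, (l, b) in enumerate(zip(lo_psd, bfh_psd)) if l > b]

print(len(rd_beats_lo), rd_beats_lo)   # 8 [1, 2, 3, 4, 5, 8, 10, 12]
print(rd_above_bfh, lo_above_bfh)      # [11] []
```

The only exception to the benchmarked PSDs being smaller than the BFH PSDs is area 11 under RD, which is why the statement reads "almost always".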
To investigate how sensitive the PSDs are to different benchmarking targets, we
present results using three choices of targets in Table 4.6. The PSDs change
only slightly across targets and, with a few exceptions (e.g., area 10 under
0.5a), remain smaller than the standard errors of the direct estimates.
As part of designing a complex set of simulations, we consider using unequal
probabilities (weights) in the random deletion benchmarking, and present
results in Table 4.7. Uniform weights (EW) are compared to weights inversely
proportional to the sample sizes (IW) and to weights directly proportional to
the sample sizes (DW). Again, only small differences are present among the
three PMs and among the three PSDs. The PSDs are still smaller than those of
the direct estimates.
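The three weighting schemes can be made concrete with the sample sizes listed in Table 4.7. The sketch below assumes that the weights are normalized to sum to one so that they can serve as deletion probabilities; the paper's exact normalization is not stated in this section, and the function name is illustrative.

```python
# Sample sizes of the 12 areas, taken from Table 4.7
n = [5, 7, 24, 23, 21, 9, 5, 20, 16, 9, 23, 9]


def normalize(weights):
    """Scale nonnegative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


ew = normalize([1.0] * len(n))         # equal weights
iw = normalize([1.0 / v for v in n])   # inversely proportional to sample size
dw = normalize([float(v) for v in n])  # directly proportional to sample size

print(round(ew[0], 3))  # 0.083
```

Under equal weights each area has probability 1/12, about 0.083, which is the value the posterior weights are compared with earlier in this section. Under IW the areas with five sampled segments are the most likely to be deleted, and under DW the area with 24 segments is.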
Using the results in Table 4.8, we study how an extreme sample size in the last
county (the one to be deleted) affects posterior inference. For this, we set
the sample size of the last county outside the simulation range (5-25), at 2
and at 50. First, consider the case in which the sample size of the last county
is 2. Consistent with previous findings, there are only minor differences in
the PMs among no benchmarking, deleting the last one and random deletion for
all counties. The PSDs for LO and RD are generally smaller than those of BFH,
with nine of these PSDs for RD smaller than for LO. However, for the last
county, we observe relatively large posterior standard deviations (BFH: 10.00;
LO: 8.771; RD: 8.525), with a roughly 15% decrease in the PSD of RD over no
benchmarking. Next, consider the case in which the sample size of the last
county is 50. The patterns are similar, except that the PSDs for the last
county are comparable to the others under BFH, LO and RD; again, there is an
approximately 10% decrease (BFH: 6.282; LO: 5.958; RD: 5.702) in the PSD of RD
over no benchmarking. It appears that deliberately placing the county with the
most extreme sample size (small or large) last can affect the benchmarking
procedure. In contrast, only minor changes are observed when the area with the
extreme sample size is not systematically deleted. When the sample size is 2,
the new PMs and PSDs are as follows. BFH: 124.307, 9.993; LO: 123.371, 9.000;
RD: 123.540, 8.887. When the sample size is 50, the new PMs and PSDs are as
follows. BFH: 118.167, 6.284; LO: 117.802, 6.094; RD: 117.716, 5.948.
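The percentage decreases quoted for the last county follow from simple arithmetic on the PSDs in Table 4.8. A small Python check (the helper name is illustrative):

```python
def pct_decrease(baseline, value):
    """Percentage decrease of `value` relative to `baseline`."""
    return 100.0 * (baseline - value) / baseline


# Last county with sample size 2: BFH PSD 10.00, RD PSD 8.525
small_n = pct_decrease(10.00, 8.525)

# Last county with sample size 50: BFH PSD 6.282, RD PSD 5.702
large_n = pct_decrease(6.282, 5.702)

print(round(small_n, 2), round(large_n, 2))
```

The two values are about 14.75% and 9.23%, in line with the "roughly 15%" and "approximately 10%" decreases stated in the text.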
Table 4.5
A summary of the comparison of inference from the direct estimator, the Bayesian Fay-Herriot (BFH) model, random deletion (RD) benchmarking and deleting the last one (LO)

| Area | Sample size | Direct estimate | Standard error | BFH PM | BFH PSD | RD PM | RD PSD | LO PM | LO PSD |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 133.985 | 5.617 | 133.516 | 5.431 | 133.772 | 5.519 |
| 2 | 7 | 101.980 | 7.101 | 103.461 | 6.498 | 102.903 | 5.793 | 103.026 | 6.319 |
| 3 | 24 | 117.655 | 7.309 | 121.006 | 6.716 | 120.671 | 6.237 | 120.470 | 6.458 |
| 4 | 23 | 76.997 | 5.881 | 81.473 | 5.995 | 81.170 | 5.597 | 81.391 | 5.906 |
| 5 | 21 | 126.917 | 5.629 | 127.832 | 5.248 | 127.652 | 5.036 | 127.883 | 5.158 |
| 6 | 9 | 113.132 | 8.061 | 113.393 | 7.147 | 112.805 | 6.707 | 112.895 | 6.270 |
| 7 | 5 | 137.236 | 6.739 | 133.661 | 6.378 | 133.908 | 6.007 | 133.298 | 5.948 |
| 8 | 20 | 124.839 | 4.034 | 124.732 | 3.906 | 124.703 | 3.757 | 124.664 | 3.810 |
| 9 | 16 | 118.306 | 7.544 | 116.479 | 6.785 | 116.451 | 6.650 | 116.542 | 6.531 |
| 10 | 9 | 156.503 | 4.368 | 153.355 | 4.449 | 153.222 | 4.216 | 153.229 | 4.353 |
| 11 | 23 | 109.546 | 4.877 | 110.348 | 4.637 | 110.221 | 4.694 | 109.997 | 4.563 |
| 12 | 9 | 116.314 | 7.232 | 118.098 | 6.623 | 117.780 | 5.997 | 117.835 | 6.344 |
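The benchmarking constraint can be verified numerically from the PM columns of Table 4.5: the benchmarked posterior means should add up to the target, while the unconstrained BFH means need not. The Python sketch below copies the relevant columns; the variable names are illustrative.

```python
# Columns of Table 4.5, areas 1 to 12
direct = [135.575, 101.980, 117.655, 76.997, 126.917, 113.132,
          137.236, 124.839, 118.306, 156.503, 109.546, 116.314]
bfh_pm = [133.985, 103.461, 121.006, 81.473, 127.832, 113.393,
          133.661, 124.732, 116.479, 153.355, 110.348, 118.098]
rd_pm = [133.516, 102.903, 120.671, 81.170, 127.652, 112.805,
         133.908, 124.703, 116.451, 153.222, 110.221, 117.780]
lo_pm = [133.772, 103.026, 120.470, 81.391, 127.883, 112.895,
         133.298, 124.664, 116.542, 153.229, 109.997, 117.835]

target = sum(direct)  # benchmarking target a
for name, pm in [("BFH", bfh_pm), ("RD", rd_pm), ("LO", lo_pm)]:
    print(name, round(sum(pm), 3), round(sum(pm) - target, 3))
```

The direct estimates sum to 1,435.000, the target a. The RD and LO posterior means each sum to about 1,435.002, that is, to the target up to rounding of the tabulated values, whereas the BFH posterior means sum to about 1,437.823.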
Table 4.6
Comparison of posterior inference of the area parameters under random deletion benchmarking with different targets (a = 1,435)

| Area | Sample size | Direct estimate | Standard error | PM (a) | PSD (a) | PM (1.5a) | PSD (1.5a) | PM (0.5a) | PSD (0.5a) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 133.516 | 5.431 | 189.249 | 5.385 | 77.769 | 5.561 |
| 2 | 7 | 101.980 | 7.101 | 102.903 | 5.793 | 175.963 | 5.794 | 29.847 | 5.899 |
| 3 | 24 | 117.655 | 7.309 | 120.671 | 6.237 | 197.219 | 6.099 | 44.145 | 6.461 |
| 4 | 23 | 76.997 | 5.881 | 81.170 | 5.597 | 134.628 | 5.871 | 27.771 | 5.460 |
| 5 | 21 | 126.917 | 5.629 | 127.652 | 5.036 | 177.209 | 5.165 | 78.125 | 5.053 |
| 6 | 9 | 113.132 | 8.061 | 112.805 | 6.707 | 201.949 | 7.145 | 23.614 | 6.995 |
| 7 | 5 | 137.236 | 6.739 | 133.908 | 6.007 | 200.989 | 6.018 | 66.781 | 6.024 |
| 8 | 20 | 124.839 | 4.034 | 124.703 | 3.757 | 151.951 | 3.952 | 97.484 | 3.924 |
| 9 | 16 | 118.306 | 7.544 | 116.451 | 6.650 | 196.849 | 6.990 | 35.990 | 6.607 |
| 10 | 9 | 156.503 | 4.368 | 153.222 | 4.216 | 184.720 | 4.019 | 121.708 | 4.706 |
| 11 | 23 | 109.546 | 4.877 | 110.221 | 4.694 | 148.724 | 4.966 | 71.752 | 4.760 |
| 12 | 9 | 116.314 | 7.232 | 117.780 | 5.997 | 193.050 | 5.954 | 42.514 | 6.081 |
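The same kind of check applies to the scaled targets in Table 4.6: with a = 1,435, the posterior means under the targets 1.5a and 0.5a should sum to 2,152.5 and 717.5, respectively. A short Python verification using the tabulated PMs (variable names are illustrative):

```python
a = 1435.0  # target stated in the caption of Table 4.6

# PM columns of Table 4.6 for the scaled targets, areas 1 to 12
pm_15a = [189.249, 175.963, 197.219, 134.628, 177.209, 201.949,
          200.989, 151.951, 196.849, 184.720, 148.724, 193.050]
pm_05a = [77.769, 29.847, 44.145, 27.771, 78.125, 23.614,
          66.781, 97.484, 35.990, 121.708, 71.752, 42.514]

print(round(sum(pm_15a), 3), 1.5 * a)
print(round(sum(pm_05a), 3), 0.5 * a)
```

Both sums agree with the scaled targets to the three decimals reported, which shows that the constraint is met for each choice of target even though the individual PMs move far from the direct estimates.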
Table 4.7
Comparison of posterior inference of the area parameters under random deletion benchmarking with equal weights (EW), weights inversely proportional to sample sizes (IW) and weights directly proportional to sample sizes (DW)

| Area | Sample size | Direct estimate | Standard error | EW PM | EW PSD | IW PM | IW PSD | DW PM | DW PSD |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 133.516 | 5.431 | 133.508 | 5.518 | 133.436 | 5.404 |
| 2 | 7 | 101.980 | 7.101 | 102.903 | 5.793 | 103.042 | 5.737 | 103.049 | 5.809 |
| 3 | 24 | 117.655 | 7.309 | 120.671 | 6.237 | 120.529 | 6.176 | 120.634 | 6.247 |
| 4 | 23 | 76.997 | 5.881 | 81.170 | 5.597 | 81.167 | 5.571 | 81.111 | 5.567 |
| 5 | 21 | 126.917 | 5.629 | 127.652 | 5.036 | 127.669 | 5.079 | 127.541 | 5.055 |
| 6 | 9 | 113.132 | 8.061 | 112.805 | 6.707 | 112.762 | 6.704 | 113.074 | 6.716 |
| 7 | 5 | 137.236 | 6.739 | 133.908 | 6.007 | 133.965 | 5.968 | 133.798 | 6.027 |
| 8 | 20 | 124.839 | 4.034 | 124.703 | 3.757 | 124.829 | 3.734 | 124.719 | 3.757 |
| 9 | 16 | 118.306 | 7.544 | 116.451 | 6.650 | 116.300 | 6.707 | 116.502 | 6.640 |
| 10 | 9 | 156.503 | 4.368 | 153.222 | 4.216 | 153.238 | 4.198 | 153.204 | 4.220 |
| 11 | 23 | 109.546 | 4.877 | 110.221 | 4.694 | 110.190 | 4.697 | 110.208 | 4.690 |
| 12 | 9 | 116.314 | 7.232 | 117.780 | 5.997 | 117.802 | 6.010 | 117.726 | 5.989 |
For comparison, different posterior densities are presented
in Figures 4.1-4.4. In Figures 4.1 and 4.2, we present posterior
densities of all twelve area parameters when each area, in turn, is deleted. We
observe that the posterior densities are slightly different around the modes,
but nothing remarkable. In Figures 4.3 and 4.4, we present posterior
densities of all twelve area parameters under the FH model (unconstrained),
random deletion benchmarking and deleting the last one. There are some
differences among the three densities, but again these are not alarmingly
different.
Finally, empirical results are presented for a simulation scenario with 99
areas, reflecting the 99 counties in Iowa. The data are generated as previously
described, and the BFH model without benchmarking, with random deletion
benchmarking, and with deleting the last one benchmarking is fit using 20,000
iterations of the Gibbs sampler. For each model fit, the first 10,000
iterations are used as a burn-in and every tenth iteration is kept thereafter.
The BFH model fitting takes 15 seconds, while the deletion benchmarking models
take slightly less than three minutes each. For the parameters of the random
deletion benchmarking model (the regression coefficients and the variance), the
p-values of the Geweke test are, respectively, 0.822, 0.128, 0.752 and 0.219,
and the effective sample sizes are all 1,000 for the 1,000 selected iterations
(i.e., an efficient Gibbs sampler). Note that the target is 12,162.93 and the
sum of the PMs from the BFH model is 12,168.49, a difference of 5.56. In
Figure 4.5, we present a plot of the coefficients of variation under random
deletion benchmarking, deleting the last one benchmarking and the BFH model
versus those of the direct estimates, by area. The differences among these
models are not remarkable. Most of the points with direct CVs larger than about
0.04 fall below the 45° straight line. However, some points (diamonds) under
the BFH model are above the 45° line; four of them are noticeable, possibly
reflecting too much shrinkage. We conclude that it is sensible to perform the
random deletion benchmarking.
Table 4.8
A summary of the comparison of inference from the direct estimator, the Bayesian Fay-Herriot (BFH) model, deleting the last one (LO) and random deletion (RD) benchmarking when the last county is extreme

a. The last county size is 2.

| Area | Sample size | Direct estimate | Standard error | BFH PM | BFH PSD | LO PM | LO PSD | RD PM | RD PSD |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 134.116 | 5.607 | 133.772 | 5.473 | 133.510 | 5.409 |
| 2 | 7 | 101.980 | 7.101 | 103.205 | 6.482 | 102.818 | 6.118 | 102.745 | 5.837 |
| 3 | 24 | 117.655 | 7.309 | 121.110 | 6.730 | 120.911 | 6.577 | 120.666 | 6.260 |
| 4 | 23 | 76.997 | 5.881 | 81.586 | 6.021 | 81.741 | 5.544 | 81.196 | 5.631 |
| 5 | 21 | 126.917 | 5.629 | 127.901 | 5.252 | 127.552 | 5.264 | 127.619 | 5.041 |
| 6 | 9 | 113.132 | 8.061 | 113.454 | 7.147 | 112.889 | 6.818 | 113.074 | 6.815 |
| 7 | 5 | 137.236 | 6.739 | 133.938 | 6.339 | 133.479 | 5.968 | 133.947 | 5.994 |
| 8 | 20 | 124.839 | 4.034 | 124.753 | 3.906 | 124.699 | 3.824 | 124.738 | 3.735 |
| 9 | 16 | 118.306 | 7.544 | 116.199 | 6.806 | 115.329 | 6.327 | 116.065 | 6.785 |
| 10 | 9 | 156.503 | 4.368 | 153.419 | 4.434 | 153.148 | 4.174 | 153.240 | 4.213 |
| 11 | 23 | 109.546 | 4.877 | 110.512 | 4.645 | 110.473 | 4.696 | 110.324 | 4.686 |
| 12 | 2 | 121.881 | 12.75 | 124.243 | 10.00 | 123.755 | 8.771 | 123.444 | 8.525 |

b. The last county size is 50.

| Area | Sample size | Direct estimate | Standard error | BFH PM | BFH PSD | LO PM | LO PSD | RD PM | RD PSD |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 135.575 | 6.031 | 133.984 | 5.618 | 133.745 | 5.461 | 133.452 | 5.385 |
| 2 | 7 | 101.980 | 7.101 | 103.462 | 6.499 | 103.136 | 6.086 | 103.044 | 5.780 |
| 3 | 24 | 117.655 | 7.309 | 121.006 | 6.716 | 120.832 | 6.536 | 120.698 | 6.232 |
| 4 | 23 | 76.997 | 5.881 | 81.473 | 5.995 | 81.596 | 5.512 | 81.162 | 5.728 |
| 5 | 21 | 126.917 | 5.629 | 127.832 | 5.248 | 127.519 | 5.238 | 127.661 | 5.001 |
| 6 | 9 | 113.132 | 8.061 | 113.393 | 7.146 | 112.929 | 6.777 | 112.899 | 6.675 |
| 7 | 5 | 137.236 | 6.739 | 133.659 | 6.380 | 133.351 | 5.947 | 133.851 | 5.941 |
| 8 | 20 | 124.839 | 4.034 | 124.732 | 3.906 | 124.713 | 3.821 | 124.726 | 3.825 |
| 9 | 16 | 118.306 | 7.544 | 116.480 | 6.785 | 115.766 | 6.269 | 116.319 | 6.601 |
| 10 | 9 | 156.503 | 4.368 | 153.355 | 4.449 | 153.225 | 4.173 | 153.306 | 4.230 |
| 11 | 23 | 109.546 | 4.877 | 110.347 | 4.637 | 110.378 | 4.692 | 110.155 | 4.689 |
| 12 | 50 | 116.538 | 6.791 | 118.117 | 6.282 | 118.035 | 5.958 | 117.952 | 5.702 |
Description for Figure 4.1
Figure presenting the posterior densities for theta_1 to theta_6 when each
area is deleted one at a time (e.g., the first area is deleted in the first
panel, etc.). There are six graphs, one for each theta, overlapping the density
curves of the twelve areas. The posterior density is on the y-axis, ranging
from 0.0 to 0.12. Theta is on the x-axis, ranging from 60 to 180. The posterior
densities are similar in width, but the modes differ. The mode is around
theta = 130 for theta_1 and theta_5; around theta = 105 for theta_2; around
theta = 120 for theta_3; around theta = 80 for theta_4 and around theta = 110
for theta_6.
Description for Figure 4.2
Figure presenting the posterior densities for theta_7 to theta_12 when each
area is deleted one at a time (e.g., the first area is deleted in the first
panel, etc.). There are six graphs, one for each theta, overlapping the density
curves of the twelve areas. The posterior density is on the y-axis, ranging
from 0.0 to 0.12. Theta is on the x-axis, ranging from 60 to 180. The posterior
densities are similar, but there are slight differences. The mode is around
theta = 130 for theta_7; around theta = 120 for theta_8, theta_9 and theta_12;
around theta = 150 for theta_10 and around theta = 110 for theta_11. Densities
are narrower and higher for theta_8, theta_10 and theta_11.
Description for Figure 4.3
Figure presenting the posterior densities for theta_1 to theta_6 under the
Fay-Herriot model, under random deletion benchmarking and for area-12 deletion.
There are six graphs, one for each theta, overlapping the density curves of the
three methods. The posterior density is on the y-axis, ranging from 0.0 to
0.10. Theta is on the x-axis, ranging from 60 to 180. The posterior densities
are similar in width, but the modes differ. The mode is around theta = 130 for
theta_1 and theta_5; around theta = 105 for theta_2; around theta = 120 for
theta_3; around theta = 80 for theta_4 and around theta = 110 for theta_6.
Description for Figure 4.4
Figure presenting the posterior densities for theta_7 to theta_12 under the
Fay-Herriot model, under random deletion benchmarking and for area-12 deletion.
There are six graphs, one for each theta, overlapping the density curves of the
three methods. The posterior density is on the y-axis, ranging from 0.0 to
0.10. Theta is on the x-axis, ranging from 60 to 180. The posterior densities
are similar, but there are slight differences. The mode is around theta = 130
for theta_7; around theta = 120 for theta_8, theta_9 and theta_12; around
theta = 150 for theta_10 and around theta = 110 for theta_11. Densities are
narrower and higher for theta_8, theta_10 and theta_11.
Description for Figure 4.5
Figure
presenting a scatter plot of the coefficients of variation under the random
deletion benchmarking, deleting the last one and the Bayesian Fay-Herriot model
for 99 areas. The posterior CV is on the y-axis, ranging from 0.0 to 0.10. The
direct CV is on the x-axis, ranging from 0.0 to 0.10. A 45° straight line is
added to the graph. The differences among these models are not remarkable. Most
of the points with direct CVs larger than about 0.04 fall below the 45°
straight line. However, some points under the BFH model are above the 45° line,
four of them are noticeable, possibly shrinking too much.