Two local diagnostics to evaluate the efficiency of the empirical best predictor under the Fay-Herriot model
Section 4. Two diagnostics to evaluate the local performance of the B estimator
4.1 An approach conditional on $\hat{\theta}_i$
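From expression (2.5) in Section 2, we obtain the conditional distribution of $\theta_i$ given $\hat{\theta}_i$:
$$\theta_i \mid \hat{\theta}_i \sim N\left(\hat{\theta}_i^{B},\ \gamma_i \psi_i\right).$$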
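Conditioning on $\hat{\theta}_i$ gives a better idea of the possible values $\theta_i$ can take. In particular, when the value of $\gamma_i$ is strictly greater than 0, the conditional distribution of $\theta_i$ may deviate significantly from its unconditional distribution:
$$\theta_i \sim N\left(\mathbf{z}_i^{T}\boldsymbol{\beta},\ b_i^2\sigma_v^2\right).$$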
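The first diagnostic is defined as the conditional probability:
$$D_{1i} = \Pr\left\{\mathrm{MSE}_p\left(\hat{\theta}_i^{B}\right) \le \psi_i \,\middle|\, \hat{\theta}_i\right\}.$$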
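This diagnostic can be written as a function of $\gamma_i$ and the standardized error $\varepsilon_i$ (2.4):
$$D_{1i} = \Phi\left(\frac{\sqrt{1+\gamma_i} - \gamma_i|\varepsilon_i|}{\sqrt{\gamma_i(1-\gamma_i)}}\right) - \Phi\left(\frac{-\sqrt{1+\gamma_i} - \gamma_i|\varepsilon_i|}{\sqrt{\gamma_i(1-\gamma_i)}}\right), \tag{4.2}$$
where $\Phi(\cdot)$ is the distribution function of the standard normal distribution. The proof of result (4.2) is given in Appendix A.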
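The closed form of the first diagnostic can be evaluated directly. The following is a minimal sketch, assuming that (4.2) reads $\Phi\{(\sqrt{1+\gamma} - \gamma|\varepsilon|)/\sqrt{\gamma(1-\gamma)}\} - \Phi\{(-\sqrt{1+\gamma} - \gamma|\varepsilon|)/\sqrt{\gamma(1-\gamma)}\}$; the function name `diag1` and the example inputs are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution function (Phi in the text)
PHI = NormalDist().cdf


def diag1(gamma: float, eps: float) -> float:
    """Diagnostic 1 for 0 < gamma < 1 and standardized error eps (assumed closed form)."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    scale = sqrt(gamma * (1.0 - gamma))  # conditional standard deviation, rescaled
    bound = sqrt(1.0 + gamma)            # half-width of the region where B is preferable
    shift = gamma * abs(eps)             # shift induced by the observed standardized error
    return PHI((bound - shift) / scale) - PHI((-bound - shift) / scale)


# Illustrative values only: the diagnostic decreases as |eps| grows
for eps in (0.0, 3.0, 8.0):
    print(eps, round(diag1(0.2, eps), 4))
```

With a threshold of 0.5, a domain would be assigned the B estimator when `diag1` returns a value above 0.5 and the direct estimator otherwise.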
When this diagnostic takes values close to 0, we may conclude that $\mathrm{MSE}_p(\hat{\theta}_i^{B})$ is most likely larger than $\psi_i$ and that the direct estimator is preferable to the B estimator. To obtain a decision rule associated with this diagnostic, a threshold must be chosen: the direct estimator is selected below the threshold and the B estimator above it. A 50% threshold seems quite natural. Another idea is to apply an empirical approach and identify a break in the distribution of the values of diagnostic $D_{1i}$ for the $m$ domains.
This diagnostic is not entirely design-based because it involves the conditional distribution of $\theta_i$ given $\hat{\theta}_i$. It is therefore necessary to validate carefully the Fay-Herriot model before using it. Unfortunately, it is not possible to validate the assumptions on both $e_i$ and $v_i$ because the values of the parameters $\theta_i$ are not observed. However, the combined
Fay-Herriot model (2.3) can be validated using model residuals (see, for
example, Hidiroglou, Beaumont and Yung, 2019). These residuals are obtained by
replacing the unknown quantities in the standardized error (2.4) with their
estimates (see Section 5). A graph of residuals versus model predicted
values is often suggested to validate the linearity assumption of the model.
The normality assumption of the error $b_i v_i + e_i$ can be verified by a Q-Q plot of the residuals or by normality tests such as the Shapiro-Wilk test. If the model is not completely satisfactory, a conservative threshold of 75% may be appropriate.
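The residual check described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure of Section 5: the data are synthetic, $b_i = 1$ is assumed, the true model parameters stand in for their estimates, and the helper names (`standardized_residuals`, `qq_points`, `qq_correlation`) are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def standardized_residuals(direct, predicted, sigma2_v, psi):
    """Residuals (direct - predicted) / sqrt(sigma2_v + psi_i), assuming b_i = 1."""
    return [(d - p) / sqrt(sigma2_v + s) for d, p, s in zip(direct, predicted, psi)]


def qq_points(residuals):
    """Pair each ordered residual with the matching standard normal quantile."""
    m = len(residuals)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((k + 0.5) / m) for k in range(m)]
    return list(zip(theoretical, sorted(residuals)))


def qq_correlation(points):
    """Pearson correlation of the Q-Q points; values near 1 support normality."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Synthetic illustration only: data generated under the combined model
rng = random.Random(2021)
m = 200
sigma2_v = 4.0
psi = [rng.uniform(1.0, 9.0) for _ in range(m)]
predicted = [rng.uniform(50.0, 100.0) for _ in range(m)]
direct = [p + rng.gauss(0.0, sqrt(sigma2_v + s)) for p, s in zip(predicted, psi)]

residuals = standardized_residuals(direct, predicted, sigma2_v, psi)
points = qq_points(residuals)
print(round(qq_correlation(points), 3))
```

The pairs in `points` are what a Q-Q plot would display. A formal Shapiro-Wilk test is available, for instance, as `scipy.stats.shapiro`, which is not used here so that the sketch relies on the standard library only.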
The diagnostic in the following section is entirely
design-based. It is therefore not dependent on the validity of the linking model.
In this sense, it is considered more robust than the diagnostic (4.2). However,
it relies on assumptions about the sampling errors $e_i$ discussed in Section 2, including the normality assumption of $e_i$.
4.2 Use of a design-based hypothesis test on the parameter $\theta_i$
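In the design-based approach to inference, $\theta_i$ is fixed and the standardized error (2.4) follows the distribution:
$$\varepsilon_i \sim N\left(\frac{\theta_i - \mathbf{z}_i^{T}\boldsymbol{\beta}}{\sqrt{b_i^2\sigma_v^2 + \psi_i}},\ 1-\gamma_i\right).$$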
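We have a single observation of this random variable. We use it to test whether $\mathrm{MSE}_p(\hat{\theta}_i^{B})$ is larger than $\psi_i$. We consider the test:
$$H_0: \mathrm{MSE}_p\left(\hat{\theta}_i^{B}\right) \le \psi_i \quad \text{versus} \quad H_1: \mathrm{MSE}_p\left(\hat{\theta}_i^{B}\right) > \psi_i.$$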
We use $|\varepsilon_i|$ as our test statistic. We expect that $|\varepsilon_i|$ will have smaller values under $H_0$ than under $H_1$.
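Let $|\varepsilon_i^{\mathrm{obs}}|$ be the observed value of the statistic $|\varepsilon_i|$. The $p$-value of the test is defined as the probability that the statistic $|\varepsilon_i|$ is greater than the observed value $|\varepsilon_i^{\mathrm{obs}}|$ under the null hypothesis. Appendix B shows that the $p$-value is:
$$p\text{-value} = \Phi\left(\frac{\sqrt{1+\gamma_i} - |\varepsilon_i^{\mathrm{obs}}|}{\sqrt{1-\gamma_i}}\right) + \Phi\left(\frac{-\sqrt{1+\gamma_i} - |\varepsilon_i^{\mathrm{obs}}|}{\sqrt{1-\gamma_i}}\right).$$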
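Since the second term is often negligible compared to the first term, especially when $|\varepsilon_i^{\mathrm{obs}}|$ or $\gamma_i$ is large, our second diagnostic is:
$$D_{2i} = \Phi\left(\frac{\sqrt{1+\gamma_i} - |\varepsilon_i^{\mathrm{obs}}|}{\sqrt{1-\gamma_i}}\right). \tag{4.4}$$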
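To see how small the neglected term typically is, the $p$-value and the second diagnostic can be computed side by side. The sketch below assumes that the $p$-value is the sum $\Phi\{(\sqrt{1+\gamma} - |\varepsilon^{\mathrm{obs}}|)/\sqrt{1-\gamma}\} + \Phi\{(-\sqrt{1+\gamma} - |\varepsilon^{\mathrm{obs}}|)/\sqrt{1-\gamma}\}$ and that (4.4) keeps only its first term; the function names `p_value` and `diag2` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution function (Phi in the text)
PHI = NormalDist().cdf


def p_value(gamma: float, eps_obs: float) -> float:
    """Both terms of the assumed p-value, for 0 <= gamma < 1."""
    scale = sqrt(1.0 - gamma)
    bound = sqrt(1.0 + gamma)
    t = abs(eps_obs)
    return PHI((bound - t) / scale) + PHI((-bound - t) / scale)


def diag2(gamma: float, eps_obs: float) -> float:
    """Diagnostic 2: the first term of the assumed p-value only."""
    return PHI((sqrt(1.0 + gamma) - abs(eps_obs)) / sqrt(1.0 - gamma))


# Illustrative values only: size of the neglected second term
for gamma, eps_obs in [(0.1, 0.5), (0.5, 1.0), (0.5, 3.0), (0.9, 1.0)]:
    gap = p_value(gamma, eps_obs) - diag2(gamma, eps_obs)
    print(gamma, eps_obs, round(diag2(gamma, eps_obs), 4), f"{gap:.2e}")
```

Because the neglected term is non-negative, `diag2` never exceeds `p_value`, so using the diagnostic in place of the $p$-value can only move a borderline domain toward the direct estimator.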
This second diagnostic can be interpreted as follows: When $D_{2i}$ is small, we can assume that $\mathrm{MSE}_p(\hat{\theta}_i^{B})$ is likely to be larger than $\psi_i$ and the direct estimator is then preferred to the B estimator. For the choice of a decision threshold, values typically used as levels for hypothesis testing (e.g., 5% or 10%) can be used as a guide. With these small values, the B estimator is favoured. As with the previous diagnostic, the threshold can be determined by locating a break in the distribution of the values of diagnostic $D_{2i}$ for the $m$ domains.
4.3 Some properties of diagnostics 1 and 2
In this section, we study the behaviour of the functions $D_1(\gamma_i, \varepsilon_i)$ and $D_2(\gamma_i, \varepsilon_i)$ for limiting cases of $\gamma_i$ and $\varepsilon_i$ and note their similarities and differences.
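Case 1: $\gamma_i$ is fixed and $|\varepsilon_i| \to \infty$
From equations (4.2) and (4.4) it can be shown that, for $0 < \gamma_i < 1$, the two functions $D_1(\gamma_i, \varepsilon_i)$ and $D_2(\gamma_i, \varepsilon_i)$ decrease as $|\varepsilon_i|$ increases. In other words, the derivative of these functions with respect to $|\varepsilon_i|$ is negative. In addition, both functions tend toward 0 as $|\varepsilon_i| \to \infty$. For a sufficiently large value of $|\varepsilon_i|$, the two diagnostics will therefore favour the direct estimator.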
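Case 2: $\gamma_i$ is fixed and $\varepsilon_i = 0$
From equation (4.2), we observe that
$$D_1(\gamma_i, 0) = 2\,\Phi\left(\sqrt{\frac{1+\gamma_i}{\gamma_i(1-\gamma_i)}}\right) - 1.$$
We can show that $D_1(\gamma_i, 0)$ is minimized when $\gamma_i = \sqrt{2} - 1 \approx 0.41$. Therefore, $D_1(\gamma_i, 0) \ge 0.98$. Since this value is close to 1, Diagnostic 1 leads to choosing the B estimator in this case if a threshold of 0.50 or even 0.75 is chosen.
From equation (4.4) we obtain:
$$D_2(\gamma_i, 0) = \Phi\left(\sqrt{\frac{1+\gamma_i}{1-\gamma_i}}\right).$$
We can show that, for $0 \le \gamma_i < 1$, the function $D_2(\gamma_i, 0)$ is minimized when $\gamma_i = 0$. Hence, $D_2(\gamma_i, 0) \ge \Phi(1) \approx 0.84$. With a threshold smaller than 0.50, Diagnostic 2 leads to the same decision as Diagnostic 1 in this case, i.e. to choose the B estimator.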
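The two lower bounds quoted in Case 2 (0.98 and 0.84) can be checked numerically with a simple grid search. The sketch below assumes that, at a zero standardized error, the diagnostics reduce to $2\Phi\{\sqrt{(1+\gamma)/(\gamma(1-\gamma))}\} - 1$ and $\Phi\{\sqrt{(1+\gamma)/(1-\gamma)}\}$; the function names and the grid step are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution function (Phi in the text)
PHI = NormalDist().cdf


def diag1_at_zero(gamma: float) -> float:
    """Assumed form of Diagnostic 1 when the standardized error is 0 (0 < gamma < 1)."""
    return 2.0 * PHI(sqrt((1.0 + gamma) / (gamma * (1.0 - gamma)))) - 1.0


def diag2_at_zero(gamma: float) -> float:
    """Assumed form of Diagnostic 2 when the standardized error is 0 (0 <= gamma < 1)."""
    return PHI(sqrt((1.0 + gamma) / (1.0 - gamma)))


# Grid search over gamma with step 0.001
grid1 = [k / 1000 for k in range(1, 1000)]  # gamma = 0 excluded for Diagnostic 1
grid2 = [k / 1000 for k in range(0, 1000)]  # gamma = 0 allowed for Diagnostic 2

gamma_min1 = min(grid1, key=diag1_at_zero)
gamma_min2 = min(grid2, key=diag2_at_zero)

print(gamma_min1, round(diag1_at_zero(gamma_min1), 4))
print(gamma_min2, round(diag2_at_zero(gamma_min2), 4))
```

Under these assumed forms, the first minimizer falls next to $\sqrt{2} - 1$ and the second at the boundary $\gamma = 0$, which is consistent with the bounds stated in the text.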
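Case 3: $|\varepsilon_i| < \sqrt{2}$ is fixed and $\gamma_i \to 1$
The two functions $D_1(\gamma_i, \varepsilon_i)$ and $D_2(\gamma_i, \varepsilon_i)$ tend toward 1 in this case. Therefore, Diagnostics 1 and 2 lead to choosing the B estimator.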
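Case 4: $|\varepsilon_i| > \sqrt{2}$ is fixed and $\gamma_i \to 1$
The two functions $D_1(\gamma_i, \varepsilon_i)$ and $D_2(\gamma_i, \varepsilon_i)$ tend toward 0 in this case. Diagnostics 1 and 2 lead here to choosing the direct estimator.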
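The behaviour in Cases 3 and 4 can be illustrated by evaluating both functions for values of $\gamma$ approaching 1. This is a sketch under the assumption that (4.2) and (4.4) take the forms coded below, in which case the two regimes separate at $|\varepsilon| = \sqrt{2}$; the function names and the chosen values of $\varepsilon$ (1.0 and 2.0) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal distribution function (Phi in the text)
PHI = NormalDist().cdf


def diag1(gamma: float, eps: float) -> float:
    """Assumed closed form of Diagnostic 1, for 0 < gamma < 1."""
    scale = sqrt(gamma * (1.0 - gamma))
    bound = sqrt(1.0 + gamma)
    shift = gamma * abs(eps)
    return PHI((bound - shift) / scale) - PHI((-bound - shift) / scale)


def diag2(gamma: float, eps: float) -> float:
    """Assumed closed form of Diagnostic 2, for 0 <= gamma < 1."""
    return PHI((sqrt(1.0 + gamma) - abs(eps)) / sqrt(1.0 - gamma))


# One fixed |eps| below sqrt(2) and one above it, with gamma moving toward 1
for eps in (1.0, 2.0):
    for gamma in (0.9, 0.99, 0.999999):
        print(eps, gamma, round(diag1(gamma, eps), 4), round(diag2(gamma, eps), 4))
```

For the smaller standardized error both functions approach 1 and for the larger one both approach 0, so the two diagnostics agree in these limits.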
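Case 5: $\varepsilon_i$ is fixed and $\gamma_i \to 0$
The function $D_1(\gamma_i, \varepsilon_i)$ tends toward 1 for any fixed value of $\varepsilon_i$. Therefore, Diagnostic 1 favours the B estimator for small values of $\gamma_i$. We note that
$$\lim_{\gamma_i \to 0} D_2(\gamma_i, \varepsilon_i) = \Phi\left(1 - |\varepsilon_i|\right).$$
Therefore, contrary to Diagnostic 1, Diagnostic 2 will lead to choosing the direct estimator if $|\varepsilon_i|$ is sufficiently large even when $\gamma_i$ is infinitely close to 0. For example, with a decision threshold at 0.05 and $\gamma_i \to 0$, Diagnostic 2 favours the direct estimator when $|\varepsilon_i| >$ 2.64.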
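The cut-off of 2.64 quoted in Case 5 can be reproduced from the limiting form of Diagnostic 2. The sketch below assumes that this limit is $\Phi(1 - |\varepsilon|)$ and that (4.2) and (4.4) take the forms coded here; the names `diag1`, `diag2`, `diag2_limit` and `cutoff` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def diag1(gamma: float, eps: float) -> float:
    """Assumed closed form of Diagnostic 1, for 0 < gamma < 1."""
    scale = sqrt(gamma * (1.0 - gamma))
    bound = sqrt(1.0 + gamma)
    shift = gamma * abs(eps)
    return STD.cdf((bound - shift) / scale) - STD.cdf((-bound - shift) / scale)


def diag2(gamma: float, eps: float) -> float:
    """Assumed closed form of Diagnostic 2, for 0 <= gamma < 1."""
    return STD.cdf((sqrt(1.0 + gamma) - abs(eps)) / sqrt(1.0 - gamma))


def diag2_limit(eps: float) -> float:
    """Assumed limit of Diagnostic 2 as gamma tends to 0."""
    return STD.cdf(1.0 - abs(eps))


# Smallest |eps| at which the limiting Diagnostic 2 drops to the 0.05 threshold
cutoff = 1.0 + STD.inv_cdf(0.95)
print(round(cutoff, 3))

# Diagnostic 1 stays near 1 while Diagnostic 2 settles at its limit
for gamma in (0.1, 0.01, 0.001):
    print(gamma, round(diag1(gamma, 3.0), 4), round(diag2(gamma, 3.0), 4),
          round(diag2_limit(3.0), 4))
```

The cut-off is one plus the 95th percentile of the standard normal distribution, which is where the figure quoted in the text comes from under these assumptions.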
In the first four cases above, both diagnostics lead to the same decision. There is a difference only in Case 5 where $\gamma_i \to 0$. We therefore expect that Diagnostic 2 will choose the direct estimator more often than Diagnostic 1 for small values of $\gamma_i$. Consider, for example, a threshold of 0.5 for Diagnostic 1 and of 0.05 for Diagnostic 2. For a threshold of 0.5, we can show that Diagnostic 1 leads to choosing the direct estimator as soon as $\gamma_i|\varepsilon_i|$ is larger than a value approximately equal to $\sqrt{1+\gamma_i}$, i.e. as soon as $|\varepsilon_i| > \sqrt{1+\gamma_i}/\gamma_i$. As for Diagnostic 2, for a threshold of 0.05, it leads to choosing the direct estimator as soon as $|\varepsilon_i| > \sqrt{1+\gamma_i} + 1.645\sqrt{1-\gamma_i}$. For $\gamma_i =$ 0.01, Diagnostic 1 thus leads to choosing the direct estimator when $|\varepsilon_i| >$ 100.5, while Diagnostic 2 leads to choosing the direct estimator when $|\varepsilon_i| >$ 2.64. The gap narrows as $\gamma_i$ increases. For example, for $\gamma_i =$ 0.2, Diagnostic 1 chooses the direct estimator when $|\varepsilon_i| >$ 5.48 and Diagnostic 2 chooses the direct estimator when $|\varepsilon_i| >$ 2.57. The above discussion seems to suggest that Diagnostic 2 leads to choosing the direct estimator more often than Diagnostic 1. However, there are cases where Diagnostic 1 chooses the direct estimator contrary to Diagnostic 2. These cases generally occur for fairly large values of $\gamma_i$. For example, for $\gamma_i =$ 0.8, Diagnostic 1 chooses the direct estimator when $|\varepsilon_i| >$ 1.68, while Diagnostic 2 chooses the direct estimator only when $|\varepsilon_i| >$ 2.08.
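The cut-offs quoted above can be recomputed for any value of $\gamma$. The sketch below assumes the closed form of Diagnostic 1 coded in `diag1`, the approximate cut-off $\sqrt{1+\gamma}/\gamma$ for a threshold of 0.5, and the cut-off $\sqrt{1+\gamma} + z_{0.95}\sqrt{1-\gamma}$ for Diagnostic 2 at a threshold of 0.05; all function names are hypothetical, and the bisection is only one possible way to obtain the exact cut-off.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def diag1(gamma: float, eps: float) -> float:
    """Assumed closed form of Diagnostic 1, for 0 < gamma < 1."""
    scale = sqrt(gamma * (1.0 - gamma))
    bound = sqrt(1.0 + gamma)
    shift = gamma * abs(eps)
    return STD.cdf((bound - shift) / scale) - STD.cdf((-bound - shift) / scale)


def cutoff_diag1_approx(gamma: float) -> float:
    """Approximate |eps| above which Diagnostic 1 falls below 0.5."""
    return sqrt(1.0 + gamma) / gamma


def cutoff_diag1_exact(gamma: float, threshold: float = 0.5) -> float:
    """Solve diag1(gamma, eps) = threshold for eps >= 0 by bisection."""
    lo, hi = 0.0, (sqrt(1.0 + gamma) + 10.0) / gamma  # diag1 is close to 0 at hi
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if diag1(gamma, mid) > threshold:
            lo = mid  # still above the threshold: the root lies further right
        else:
            hi = mid
    return (lo + hi) / 2.0


def cutoff_diag2(gamma: float, level: float = 0.05) -> float:
    """|eps| above which Diagnostic 2 falls below the given level."""
    return sqrt(1.0 + gamma) + STD.inv_cdf(1.0 - level) * sqrt(1.0 - gamma)


for gamma in (0.01, 0.2, 0.8):
    print(gamma,
          round(cutoff_diag1_approx(gamma), 2),
          round(cutoff_diag1_exact(gamma), 2),
          round(cutoff_diag2(gamma), 2))
```

Comparing the two Diagnostic 1 columns shows how little is lost by the approximation, and comparing them with the Diagnostic 2 column shows the reversal between small and large values of $\gamma$ described in the text.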