# Development of a small area estimation system at Statistics Canada

Section 3. Area level model

The area level small area estimator first appeared in the seminal paper of Fay and Herriot (1979). Following that paper, let the parameter of interest be ${\theta}_{i};$ common examples are totals, ${Y}_{i}={\displaystyle {\sum}_{j\text{\hspace{0.17em}}\in \text{\hspace{0.17em}}{U}_{i}}{y}_{j}},$ or means, ${\overline{Y}}_{i}={Y}_{i}/{N}_{i}.$ As noted above, the vector of auxiliary variables may differ from the one used in direct estimation and is denoted as $z.$ The area level model can be expressed as two equations.

The first equation, commonly known as the *sampling model*, is given by

$${\widehat{\theta}}_{i}={\theta}_{i}+{e}_{i}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(3.1)$$

and expresses the direct estimate ${\widehat{\theta}}_{i}$ in terms of the unknown parameter ${\theta}_{i}$ plus a random error ${e}_{i}$ due to sampling. The sampling errors ${e}_{i}$ are independently distributed with mean 0 and variance ${\psi}_{i};$ that is, ${E}_{p}\left({e}_{i}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{\theta}_{i}\right)=0$ and ${V}_{p}\left({e}_{i}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{\theta}_{i}\right)={\psi}_{i},$ where the subscript $p$ denotes expectation and variance with respect to the sample design. Note that ${\psi}_{i}$ is also the design variance of ${\widehat{\theta}}_{i}$ and is typically unknown.

The second equation, known as the *linking model*, is given by

$${\theta}_{i}={z}_{i}^{T}\beta +{b}_{i}{v}_{i}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(3.2)$$

and expresses the parameter ${\theta}_{i}$ as a fixed effect ${z}_{i}^{T}\beta $ plus a random effect ${v}_{i}$ multiplied by ${b}_{i}.$ In the production system, the ${b}_{i}$ term has a default value of one but can be specified by the user to control heteroscedastic errors or the impact of influential observations. The random effects ${v}_{i}$ are independently and identically distributed with mean 0 and unknown model variance ${\sigma}_{v}^{2};$ that is, ${E}_{m}\left({v}_{i}\right)=0$ and ${V}_{m}\left({v}_{i}\right)={\sigma}_{v}^{2},$ where ${E}_{m}$ denotes the model expectation and ${V}_{m}$ the model variance. The random errors ${e}_{i}$ are independent of the random effects ${v}_{i}.$ The combination of the *sampling model* and *linking model* results in a single generalized linear mixed model (GLMM) given by

$${\widehat{\theta}}_{i}={z}_{i}^{T}\beta +{b}_{i}{v}_{i}+{e}_{i}.\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(3.3)$$

From the Fay-Herriot model (3.3), we observe that ${E}_{mp}\left({\widehat{\theta}}_{i}\right)={z}_{i}^{T}\beta $ and ${V}_{mp}\left({\widehat{\theta}}_{i}\right)={b}_{i}^{2}{\sigma}_{v}^{2}+{\tilde{\psi}}_{i},$ where ${\tilde{\psi}}_{i}={E}_{m}\left({\psi}_{i}\right)$ is the smoothed design variance of ${\widehat{\theta}}_{i}.$ In general, we cannot treat ${\psi}_{i}$ as fixed, as it is not strictly a function of auxiliary data. If ${\sigma}_{v}^{2}$ and the ${\tilde{\psi}}_{i}$ are known, the solution to the GLMM yields the Best Linear Unbiased Predictor (BLUP), ${\tilde{\theta}}_{i}^{\text{BLUP}},$ given by

$${\tilde{\theta}}_{i}^{\text{BLUP}}=\{\begin{array}{ll}{\gamma}_{i}{\widehat{\theta}}_{i}+\left(1-{\gamma}_{i}\right){z}_{i}^{T}\tilde{\beta}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{T}\tilde{\beta}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(3.4)$$

where ${\gamma}_{i}=\left({b}_{i}^{2}{\sigma}_{v}^{2}\right)/\left({\tilde{\psi}}_{i}+{b}_{i}^{2}{\sigma}_{v}^{2}\right)$ and $\tilde{\beta}={\left({\displaystyle {\sum}_{i\in A}{z}_{i}{z}_{i}^{T}}/\left({\tilde{\psi}}_{i}+{b}_{i}^{2}{\sigma}_{v}^{2}\right)\right)}^{-1}{\displaystyle {\sum}_{i\in A}{z}_{i}{\widehat{\theta}}_{i}}/\left({\tilde{\psi}}_{i}+{b}_{i}^{2}{\sigma}_{v}^{2}\right).$
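As a rough illustration of equation (3.4), the following Python sketch computes ${\gamma}_{i},$ $\tilde{\beta}$ and the BLUP for sampled areas, treating ${\sigma}_{v}^{2}$ and the ${\tilde{\psi}}_{i}$ as known. The function name `blup`, its arguments and the toy numbers are assumptions made for this example; they are not taken from the production system.

```python
import numpy as np

def blup(theta_hat, Z, psi, sigma2_v, b=None):
    """Sketch of equation (3.4) for sampled areas (i in A).

    theta_hat: direct estimates, shape (m,)
    Z:         auxiliary vectors z_i stacked as rows, shape (m, p)
    psi:       smoothed design variances, shape (m,)
    sigma2_v:  model variance, treated as known here
    b:         optional b_i constants (default 1, as in the text)
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    Z = np.asarray(Z, dtype=float)
    psi = np.asarray(psi, dtype=float)
    b = np.ones_like(theta_hat) if b is None else np.asarray(b, dtype=float)

    total_var = psi + b**2 * sigma2_v            # psi_i + b_i^2 sigma_v^2
    ZtW = (Z / total_var[:, None]).T             # z_i / total_var_i, transposed
    beta = np.linalg.solve(ZtW @ Z, ZtW @ theta_hat)   # weighted least squares
    gamma = b**2 * sigma2_v / total_var          # shrinkage weights gamma_i
    return gamma * theta_hat + (1.0 - gamma) * (Z @ beta), gamma, beta

# Toy data (hypothetical): intercept plus one auxiliary variable.
Z = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 5.0], [1.0, 7.0]])
theta_hat = np.array([10.0, 14.0, 19.0, 30.0])
psi = np.array([4.0, 1.0, 9.0, 2.0])

est, gamma, beta = blup(theta_hat, Z, psi, sigma2_v=3.0)

# For an area with no sample (i in A-bar) only the synthetic part is available.
z_new = np.array([1.0, 4.0])
synthetic_new = z_new @ beta
```

Areas with a small ${\tilde{\psi}}_{i}$ relative to ${b}_{i}^{2}{\sigma}_{v}^{2}$ receive a ${\gamma}_{i}$ close to one, so their BLUP stays close to the direct estimate.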

There are four recursive procedures for estimating ${\sigma}_{v}^{2}$ and $\beta $ in the production system. The first three assume that ${\tilde{\psi}}_{i}$ is known, or that a smoothed version of it is available (see the following section for details). Under this assumption, the variance components can be computed via the Fay-Herriot procedure (FH) as outlined in Fay and Herriot (1979), the restricted maximum likelihood (REML), or the Adjusted Density Maximization (ADM) due to Li and Lahiri (2010). The fourth procedure, WF, due to Wang and Fuller (2003), assumes that ${\psi}_{i}$ is estimated by ${\widehat{\psi}}_{i}$ given that ${n}_{i}\ge 2.$ The WF procedure does not require any smoothing of the estimated ${\widehat{\psi}}_{i}$ values before estimating ${\sigma}_{v}^{2}.$ Wang and Fuller (2003) carried out simulations with ${n}_{i}$ ranging from 9 to 36 and found that their procedure yielded reasonable estimates of ${\theta}_{i}$ and its estimated mean squared error.
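To make the FH procedure named above more concrete, here is a minimal Python sketch of a moment-based iteration of the kind commonly attributed to Fay and Herriot (1979): it looks for the ${\sigma}_{v}^{2}$ at which the weighted residual sum of squares equals $m-p,$ where $m$ is the number of sampled areas and $p$ the dimension of ${z}_{i}.$ The exact updating rule, starting value and stopping rule used in the production system are not given here, so the ones below (and the names `wls_beta` and `fh_sigma2`) are assumptions.

```python
import numpy as np

def wls_beta(theta_hat, Z, psi, sigma2_v, b):
    """Weighted least squares beta for a given sigma2_v."""
    w = 1.0 / (psi + b**2 * sigma2_v)
    ZtW = (Z * w[:, None]).T
    return np.linalg.solve(ZtW @ Z, ZtW @ theta_hat)

def fh_sigma2(theta_hat, Z, psi, b=None, tol=1e-8, max_iter=100):
    """Moment-based estimate of sigma2_v, truncated at zero (sketch)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    Z = np.asarray(Z, dtype=float)
    psi = np.asarray(psi, dtype=float)
    m, p = Z.shape
    b = np.ones(m) if b is None else np.asarray(b, dtype=float)

    sigma2 = 0.0                                  # assumed starting value
    for _ in range(max_iter):
        beta = wls_beta(theta_hat, Z, psi, sigma2, b)
        resid2 = (theta_hat - Z @ beta) ** 2
        v = psi + b**2 * sigma2
        h = np.sum(resid2 / v)                    # weighted residual sum of squares
        slope = np.sum(b**2 * resid2 / v**2)      # minus the derivative of h
        new = max(sigma2 + (h - (m - p)) / slope, 0.0)   # truncate at zero
        converged = abs(new - sigma2) < tol
        sigma2 = new
        if converged:
            break
    return sigma2, wls_beta(theta_hat, Z, psi, sigma2, b)

# Simulated toy data (hypothetical values).
rng = np.random.default_rng(0)
m = 40
Z = np.column_stack([np.ones(m), rng.normal(size=m)])
psi = rng.uniform(0.5, 2.0, size=m)
theta_hat = (Z @ np.array([1.0, 2.0])
             + rng.normal(0.0, 2.0, size=m)       # area effects v_i
             + rng.normal(0.0, np.sqrt(psi)))     # sampling errors e_i
sigma2_hat, beta_hat = fh_sigma2(theta_hat, Z, psi)
```

When the weighted residual sum of squares at ${\sigma}_{v}^{2}=0$ is already below $m-p,$ the update is negative and the sketch returns zero, which corresponds to the truncation behaviour of the FH procedure.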

The main difference between these four procedures is how ${\sigma}_{v}^{2}$ is computed. They are all based on an iterative scoring algorithm that obtains ${\widehat{\sigma}}_{v}^{2}$ as an estimate of the model variance ${\sigma}_{v}^{2}.$ The FH, REML, and WF procedures may yield a ${\widehat{\sigma}}_{v}^{2}$ that is smaller than zero. If this occurs, ${\widehat{\sigma}}_{v}^{2}$ is set to zero for both the FH and REML procedures. A drawback of truncating the estimated ${\sigma}_{v}^{2}$ to zero is that the resulting small area estimator will be synthetic for all areas. Li and Lahiri (2010) suggested the ADM as a way to address the problem of obtaining a negative ${\widehat{\sigma}}_{v}^{2}$ by maximizing an adjusted likelihood defined as the product of the model variance and a standard likelihood. Although the ADM method always gives a positive solution for ${\sigma}_{v}^{2},$ it should be used cautiously because it overestimates the model variance. The REML, FH and ADM procedures use smoothed values of the ${\widehat{\psi}}_{i}$ estimated from the sample, or estimates provided by the user. For the WF procedure, if ${\widehat{\sigma}}_{v}^{2}<0,$ Wang and Fuller (2003) suggested setting ${\widehat{\sigma}}_{v}^{2}$ to $0.5\text{\hspace{0.17em}}\sqrt{\widehat{V}\left({\widehat{\sigma}}_{v}^{2}\right)},$ where

$$\widehat{V}\left({\widehat{\sigma}}_{v}^{2}\right)={\displaystyle \sum _{i\in A}2{\kappa}_{i}^{2}\left[{\left({\widehat{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\right)}^{2}+\frac{{\left({\widehat{\psi}}_{i}\right)}^{2}}{\left({n}_{i}-1\right)}\right]}$$

and

$${\kappa}_{i}=\frac{{\left[{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}+\frac{\left({n}_{i}+1\right)}{\left({n}_{i}-1\right)}{\widehat{\psi}}_{i}\right]}^{-1}}{{\displaystyle {\sum}_{i\in A}{\left[{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}+\frac{\left({n}_{i}+1\right)}{\left({n}_{i}-1\right)}{\widehat{\psi}}_{i}\right]}^{-1}}}.$$
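The two formulas above can be evaluated directly. The Python sketch below computes ${\kappa}_{i},$ $\widehat{V}\left({\widehat{\sigma}}_{v}^{2}\right)$ and the replacement value $0.5\sqrt{\widehat{V}\left({\widehat{\sigma}}_{v}^{2}\right)}.$ The text does not say which value of ${\widehat{\sigma}}_{v}^{2}$ is plugged into $\widehat{V}$ when the estimate is negative; evaluating it at zero (`sigma2_plug=0.0`) is an assumption of this sketch, as are the function names.

```python
import numpy as np

def wf_variance_of_sigma2(psi_hat, n, sigma2_hat, b=None):
    """Evaluate kappa_i and V-hat(sigma2_hat) from the displayed formulas."""
    psi_hat = np.asarray(psi_hat, dtype=float)
    n = np.asarray(n, dtype=float)               # requires n_i >= 2
    b = np.ones_like(psi_hat) if b is None else np.asarray(b, dtype=float)

    inv = 1.0 / (b**2 * sigma2_hat + (n + 1.0) / (n - 1.0) * psi_hat)
    kappa = inv / inv.sum()                      # weights sum to one
    v_hat = np.sum(2.0 * kappa**2 * ((psi_hat + b**2 * sigma2_hat) ** 2
                                     + psi_hat**2 / (n - 1.0)))
    return v_hat, kappa

def wf_floor(sigma2_hat, psi_hat, n, b=None, sigma2_plug=0.0):
    """Return sigma2_hat if non-negative, else 0.5 * sqrt(V-hat).

    sigma2_plug is the value at which V-hat is evaluated when sigma2_hat
    is negative; zero is an assumption, not a documented choice.
    """
    if sigma2_hat >= 0:
        return sigma2_hat
    v_hat, _ = wf_variance_of_sigma2(psi_hat, n, sigma2_plug, b)
    return 0.5 * np.sqrt(v_hat)

# Two areas with psi_hat = 1 and n_i = 3 (hypothetical values).
v_hat, kappa = wf_variance_of_sigma2([1.0, 1.0], [3, 3], 0.0)
replacement = wf_floor(-0.2, [1.0, 1.0], [3, 3])
```

For this toy input each ${\kappa}_{i}$ is 0.5 and $\widehat{V}$ is 1.5, so a negative estimate would be replaced by $0.5\sqrt{1.5}.$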

Plugging ${\widehat{\sigma}}_{v}^{2}$ and estimates of the ${\tilde{\psi}}_{i}$ into ${\tilde{\theta}}_{i}^{\text{BLUP}},$ defined by equation (3.4), yields the Empirical Best Linear Unbiased Predictor (EBLUP), ${\widehat{\theta}}_{i}^{\text{EBLUP}}.$ It is given by

$${\widehat{\theta}}_{i}^{\text{EBLUP}}=\{\begin{array}{ll}{\widehat{\gamma}}_{i}{\widehat{\theta}}_{i}+\left(1-{\widehat{\gamma}}_{i}\right){z}_{i}^{T}\widehat{\beta}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{T}\widehat{\beta}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}$$

where ${\widehat{\gamma}}_{i}=\left({b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\right)/\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\right),$ $\widehat{\beta}={\left({\displaystyle {\sum}_{i\in A}{z}_{i}{z}_{i}^{T}}/\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\right)\right)}^{-1}{\displaystyle {\sum}_{i\in A}{z}_{i}{\widehat{\theta}}_{i}}/\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\right),$ and ${\ddot{\psi}}_{i}$ is chosen according to the procedure used. For the REML, FH and ADM procedures, the ${\ddot{\psi}}_{i}$ are the smoothed values of the ${\widehat{\psi}}_{i}$ estimated from the sample, or estimates provided by the user. For the WF procedure, ${\ddot{\psi}}_{i}={\widehat{\psi}}_{i}.$ If the estimated model variance ${b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}$ is relatively small compared with ${\ddot{\psi}}_{i},$ then ${\widehat{\gamma}}_{i}$ will be small and more weight will be attached to the synthetic estimator ${z}_{i}^{T}\widehat{\beta}.$ Similarly, more weight is attached to the direct estimator, ${\widehat{\theta}}_{i},$ if the design variance ${\ddot{\psi}}_{i}$ is relatively small.

Details of the required computations can be found in the methodology specifications for the production system in Estevao et al. (2015).

## 3.1 Estimation of the smooth design variance

The design variance, ${\psi}_{i},$ could be used as an estimator of the smooth design variance ${\tilde{\psi}}_{i}={E}_{m}\left({\psi}_{i}\right)$ if it were known. In most cases, it is unknown. To get around this difficulty, a design-unbiased variance estimator ${\widehat{\psi}}_{i}$ of ${\psi}_{i}$ is assumed to be available; i.e., ${E}_{p}\left({\widehat{\psi}}_{i}\right)={\psi}_{i}.$ Under this assumption, we have that

$${E}_{mp}\left({\widehat{\psi}}_{i}\right)={E}_{m}\left({\psi}_{i}\right)={\tilde{\psi}}_{i}.$$

A simple unbiased estimator of the smooth design variance ${\tilde{\psi}}_{i}$ is ${\widehat{\psi}}_{i}.$ However, ${\widehat{\psi}}_{i}$ may be quite unstable when the sample size in domain $i$ is small. A more efficient estimator is obtained by modelling ${\widehat{\psi}}_{i}$ given ${z}_{i}.$ Dick (1995) and Rivest and Belmonte (2000) considered smoothing models given by

$$\mathrm{log}\left({\widehat{\psi}}_{i}\right)={x}_{i}^{T}\alpha +{\epsilon}_{i},$$

where ${x}_{i}$ is a vector of explanatory variables that are functions of ${z}_{i},$ $\alpha $ is a vector of unknown model parameters to be estimated, and ${\epsilon}_{i}$ is a random error with ${E}_{mp}\left({\epsilon}_{i}\right)=0$ and constant variance ${\sigma}_{\epsilon}^{2}={V}_{mp}\left({\epsilon}_{i}\right).$ We also assume that the errors ${\epsilon}_{i}$ are identically distributed conditionally on ${z}_{i},$ $i=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}m.$ From the above model, we observe that

$${\tilde{\psi}}_{i}={E}_{mp}\left({\widehat{\psi}}_{i}\right)=\mathrm{exp}\left({x}_{i}^{T}\alpha \right)\Delta ,$$

where $\Delta ={E}_{mp}\left(\mathrm{exp}\left({\epsilon}_{i}\right)\right).$ Dick (1995) estimated ${\tilde{\psi}}_{i}$ by omitting the factor $\Delta .$ Rivest and Belmonte (2000) estimated $\Delta $ by assuming that the errors ${\epsilon}_{i}$ are normally distributed. However, we observed empirically that the resulting estimator of $\Delta $ is sensitive to deviations from the normality assumption. This assumption is avoided by using a method of moments estimator:

$$\widehat{\Delta}\left(\alpha \right)=\frac{{\displaystyle {\sum}_{i=1}^{m}{\widehat{\psi}}_{i}}}{{\displaystyle {\sum}_{i=1}^{m}\mathrm{exp}\left({x}_{i}^{T}\alpha \right)}}.$$

An estimator $\widehat{\alpha}$ of the vector of unknown model parameters $\alpha $ is necessary to estimate ${\tilde{\psi}}_{i}.$ It is obtained using the ordinary least squares method as

$$\widehat{\alpha}={\left({\displaystyle \sum _{i=1}^{m}{x}_{i}{x}_{i}^{T}}\right)}^{-1}{\displaystyle \sum _{i=1}^{m}{x}_{i}\mathrm{log}\left({\widehat{\psi}}_{i}\right)}.$$

The estimator ${\widehat{\tilde{\psi}}}_{i}$ of ${\tilde{\psi}}_{i}$ is then given by

$${\widehat{\tilde{\psi}}}_{i}=\mathrm{exp}\left({x}_{i}^{T}\widehat{\alpha}\right)\widehat{\Delta}\left(\widehat{\alpha}\right).$$

A useful property of ${\widehat{\tilde{\psi}}}_{i}$ is that its average over the $m$ areas is equal to the average of the direct variance estimator, ${\widehat{\psi}}_{i};$ i.e.,

$$\frac{{\displaystyle {\sum}_{i=1}^{m}{\widehat{\tilde{\psi}}}_{i}}}{m}=\frac{{\displaystyle {\sum}_{i=1}^{m}{\widehat{\psi}}_{i}}}{m}.$$

This ensures that ${\widehat{\tilde{\psi}}}_{i}$ does not systematically overestimate or underestimate ${\tilde{\psi}}_{i}={E}_{mp}\left({\widehat{\psi}}_{i}\right).$
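The smoothing steps of this subsection (ordinary least squares on the log scale, the method of moments factor $\widehat{\Delta},$ and the back-transformed estimate) can be sketched in a few lines of Python. The function name, the choice of explanatory variable (the log of a hypothetical size measure) and the simulated values are assumptions for illustration only.

```python
import numpy as np

def smooth_design_variance(psi_hat, X):
    """Log-linear smoothing of direct variance estimates (sketch).

    psi_hat: direct variance estimates, shape (m,), all positive
    X:       explanatory variables x_i stacked as rows, shape (m, q)
    """
    psi_hat = np.asarray(psi_hat, dtype=float)
    X = np.asarray(X, dtype=float)

    # OLS of log(psi_hat) on x_i gives alpha-hat.
    alpha_hat, *_ = np.linalg.lstsq(X, np.log(psi_hat), rcond=None)
    fitted = np.exp(X @ alpha_hat)

    # Method of moments estimate of Delta: ratio of the two sums.
    delta_hat = psi_hat.sum() / fitted.sum()
    return fitted * delta_hat, alpha_hat, delta_hat

# Simulated toy data: variances that decrease with a size measure.
rng = np.random.default_rng(0)
m = 25
size = rng.uniform(10.0, 200.0, size=m)          # hypothetical auxiliary variable
X = np.column_stack([np.ones(m), np.log(size)])
psi_hat = np.exp(3.0 - 1.0 * np.log(size) + rng.normal(0.0, 0.4, size=m))

psi_smooth, alpha_hat, delta_hat = smooth_design_variance(psi_hat, X)
```

Because $\widehat{\Delta}$ is a ratio of sums, the averages of `psi_smooth` and `psi_hat` coincide by construction, which is the property stated above.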

## 3.2 Benchmarking

If the parameter of interest ${\theta}_{i}$ is a total $\left({\theta}_{i}={Y}_{i}\right),$ the user may wish to have the sum of the small area estimates, $\widehat{\theta}={\displaystyle {\sum}_{i\in A\cup \overline{A}}{\widehat{\theta}}_{i}^{\text{EBLUP}}},$ agree with the estimated totals $\widehat{Y}={\displaystyle {\sum}_{i\in A}{\widehat{Y}}_{i}}$ at the overall sample level $s;$ i.e., $\widehat{\theta}=\widehat{Y}.$ In the case of a mean, ${\theta}_{i}={\overline{Y}}_{i},$ this benchmarking condition becomes ${\sum}_{i\in A\cup \overline{A}}{N}_{i}{\widehat{\theta}}_{i}^{\text{EBLUP}}={\displaystyle {\sum}_{i\in A}{N}_{i}{\widehat{\theta}}_{i}},$ where ${\widehat{\theta}}_{i}={\widehat{\overline{Y}}}_{i}.$

Two methods are available in the production system to ensure benchmarking for area level small area estimates. The first is based on a difference adjustment and the second on an augmented vector. Both are valid for any method used to compute ${\widehat{\theta}}_{i}^{\text{EBLUP}},$ whether or not the variance estimate ${\ddot{\psi}}_{i}$ has been smoothed. The benchmarking based on a difference adjustment is an adaptation of the benchmarking given in Battese et al. (1988). The benchmarking based on an augmented vector is due to Wang, Fuller and Qu (2008).

*Difference adjustment*: For this method, the ${\widehat{\theta}}_{i}^{\text{EBLUP}}$ estimator is adjusted only for those areas where the realized sample size ${n}_{i}\ge 1,\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A,$ and the synthetic estimates ${z}_{i}^{T}\widehat{\beta}$ for $i\in \overline{A}$ are left as is. The resulting benchmarked estimator is given by ${\widehat{\theta}}_{i}^{\text{EBLUP},\text{\hspace{0.17em}}b}$ and is defined as follows

$${\widehat{\theta}}_{i}^{\text{EBLUP},\text{\hspace{0.17em}}b}=\{\begin{array}{ll}{\widehat{\theta}}_{i}^{\text{EBLUP}}+{\alpha}_{i}\left({\widehat{\theta}}^{*}-{\displaystyle \sum _{d\in A}{\omega}_{d}{\widehat{\theta}}_{d}^{\text{EBLUP}}}\right)\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{T}\widehat{\beta}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}$$

where ${\alpha}_{i}={\left\{{\displaystyle {\sum}_{i\in {U}_{A}}{\omega}_{i}^{2}\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\right)}\right\}}^{-1}{\omega}_{i}\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\right)$ for $i\in A,$ ${\omega}_{i}=1$ if the benchmarking is to a total, and ${\omega}_{i}={N}_{i}/N$ if the benchmarking is for the mean. The estimator ${\widehat{\theta}}^{*}$ is a value provided by the user that represents the total or mean of the $y$-values of population $U.$ The benchmarking ensures that ${\sum}_{i\in A\cup \overline{A}}{\omega}_{i}{\widehat{\theta}}_{i}^{\text{EBLUP},\text{\hspace{0.17em}}b}={\widehat{\theta}}^{*}.$
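A minimal Python sketch of the difference adjustment for the sampled areas follows. It assumes that the sum in ${\alpha}_{i}$ runs over the sampled areas and that there are no unsampled areas, so that the benchmarking identity can be checked over $A$ alone; the function name and the toy numbers are hypothetical.

```python
import numpy as np

def difference_adjust(eblup, psi, sigma2_v, omega, target, b=None):
    """Benchmark EBLUPs of sampled areas to a user-supplied target (sketch).

    eblup:    EBLUP estimates for sampled areas, shape (m,)
    psi:      variance estimates used in the EBLUP (psi-double-dot), shape (m,)
    sigma2_v: estimated model variance
    omega:    1 for totals, N_i / N for means, shape (m,)
    target:   user-supplied benchmark value (theta-hat-star)
    """
    eblup = np.asarray(eblup, dtype=float)
    psi = np.asarray(psi, dtype=float)
    omega = np.asarray(omega, dtype=float)
    b = np.ones_like(eblup) if b is None else np.asarray(b, dtype=float)

    v = psi + b**2 * sigma2_v
    alpha = omega * v / np.sum(omega**2 * v)     # sum over sampled areas (assumed)
    gap = target - np.sum(omega * eblup)         # amount still to be distributed
    return eblup + alpha * gap

# Benchmarking three area totals to a target of 66 (toy values).
eblup = np.array([10.0, 20.0, 30.0])
psi = np.array([1.0, 2.0, 3.0])
adjusted = difference_adjust(eblup, psi, sigma2_v=1.0, omega=np.ones(3), target=66.0)
```

Areas with a larger ${\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}$ absorb a larger share of the gap between the target and the weighted sum of the EBLUPs.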

*Augmented vector*: The vector ${z}_{i}^{T}$ is augmented with ${\omega}_{i}{\ddot{\psi}}_{i}$ to form ${z}_{i}^{*T}=\left({z}_{i}^{T},\text{\hspace{0.17em}}{\omega}_{i}{\ddot{\psi}}_{i}\right),$ with ${\omega}_{i}$ and ${\ddot{\psi}}_{i}$ as previously defined. The resulting augmented generalized linear mixed model (GLMM) equation is given by

$${\widehat{\theta}}_{i}={z}_{i}^{*T}{\beta}^{\text{*}}+{b}_{i}{v}_{i}^{*}+{e}_{i}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(3.5)$$

where ${E}_{m}\left({v}_{i}^{*}\right)=0$ and ${V}_{m}\left({v}_{i}^{*}\right)={\sigma}_{v}^{*2}.$ The estimates of ${\beta}^{*}$ and ${\sigma}_{v}^{*2}$ are once more obtained recursively with any of the four procedures, and the resulting EBLUP is denoted ${\widehat{\theta}}_{i}^{\text{EBLUP*}}.$

The resulting benchmarked estimator ${\widehat{\theta}}_{i}^{\text{EBLUP}*,\text{\hspace{0.17em}}b}$ is given by

$${\widehat{\theta}}_{i}^{{\text{EBLUP}}^{*}\text{},\text{\hspace{0.17em}}b}=\{\begin{array}{ll}{\widehat{\gamma}}_{i}^{*}{\widehat{\theta}}_{i}^{{\text{EBLUP}}^{*}}+\left(1-{\widehat{\gamma}}_{i}^{*}\right){z}_{i}^{*T}{\widehat{\beta}}^{\text{*}}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{*T}{\widehat{\beta}}^{\text{*}}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}$$

where ${\widehat{\gamma}}_{i}^{*}=\left({b}_{i}^{2}{\widehat{\sigma}}_{v}^{*2}\right)/\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{*2}\right),$ and ${\widehat{\beta}}^{*}={\left({\displaystyle {\sum}_{i\in A}{z}_{i}^{*}{z}_{i}^{*T}}/\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{*2}\right)\right)}^{-1}{\displaystyle {\sum}_{i\in A}{z}_{i}^{*}{\widehat{\theta}}_{i}}/\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{*2}\right).$

All the components of ${\widehat{\theta}}_{i}^{{\text{EBLUP}}^{*}\text{},\text{\hspace{0.17em}}b}$ are computed using the augmented model given by (3.5). It can be shown that ${\sum}_{i\in A\cup \overline{A}}{\omega}_{i}{\widehat{\theta}}_{i}^{{\text{EBLUP}}^{*}\text{},\text{\hspace{0.17em}}b}={\displaystyle {\sum}_{i\in A}{\omega}_{i}{\widehat{\theta}}_{i}},$ and hence the benchmarking holds.
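The benchmarking property of the augmented vector can be checked numerically. The Python sketch below rests on several simplifying assumptions: it covers sampled areas only, it takes the composite form ${\widehat{\gamma}}_{i}^{*}{\widehat{\theta}}_{i}+\left(1-{\widehat{\gamma}}_{i}^{*}\right){z}_{i}^{*T}{\widehat{\beta}}^{*}$ with the direct estimate in the first term, and it holds the model variance fixed instead of re-estimating it under (3.5). The function name and the simulated values are hypothetical.

```python
import numpy as np

def augmented_eblup(theta_hat, Z, psi, omega, sigma2_v, b=None):
    """Composite estimator under the augmented model (sketch, sampled areas)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    Z = np.asarray(Z, dtype=float)
    psi = np.asarray(psi, dtype=float)
    omega = np.asarray(omega, dtype=float)
    b = np.ones_like(theta_hat) if b is None else np.asarray(b, dtype=float)

    # Augment z_i with omega_i * psi_i. The new column must not be collinear
    # with the existing ones (e.g. constant psi with an intercept would fail).
    Z_star = np.column_stack([Z, omega * psi])

    total_var = psi + b**2 * sigma2_v
    ZtW = (Z_star / total_var[:, None]).T
    beta_star = np.linalg.solve(ZtW @ Z_star, ZtW @ theta_hat)
    gamma_star = b**2 * sigma2_v / total_var
    return gamma_star * theta_hat + (1.0 - gamma_star) * (Z_star @ beta_star)

# Simulated toy data (hypothetical values), benchmarking to a total (omega = 1).
rng = np.random.default_rng(0)
m = 12
Z = np.column_stack([np.ones(m), rng.normal(size=m)])
psi = rng.uniform(0.5, 3.0, size=m)
theta_hat = Z @ np.array([5.0, 1.5]) + rng.normal(0.0, 1.5, size=m)
omega = np.ones(m)

est_star = augmented_eblup(theta_hat, Z, psi, omega, sigma2_v=1.0)
gap = np.sum(omega * est_star) - np.sum(omega * theta_hat)   # close to zero
```

The identity follows from the weighted least squares normal equation for the added column: the residuals weighted by ${\omega}_{i}{\ddot{\psi}}_{i}/\left({\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{*2}\right)$ sum to zero, which is exactly the weighted difference between the composite estimates and the direct estimates.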

The difference adjustment and augmented vector methods are two ways that benchmarking can be satisfied. Wang et al. (2008) suggested other procedures that can be used. Specifically, they adapted the self-calibrated estimator You and Rao (2002) developed in the context of the unit level model to the area level model. You, Rao and Hidiroglou (2013) obtained an estimator of the mean squared prediction error and its bias under a misspecified model.

## 3.3 Mean squared error estimation

The reliability of the EBLUP estimators is obtained as $\text{MSE}\left({\widehat{\theta}}_{i}^{\text{EBLUP}}\right)=E{\left({\widehat{\theta}}_{i}^{\text{EBLUP}}-{\theta}_{i}\right)}^{2}.$ The expectation is with respect to models (3.3) for the non-benchmarked estimator, and (3.5) for the benchmarked estimator.

The estimated Mean Squared Errors (MSEs) of the area level estimators are given in Table 3.1. The specific form of the $g$ terms and the estimated variances can be found in Rao and Molina (2015) or in Estevao et al. (2015). For the benchmarked estimators, the estimated MSE for the difference adjustment approach uses the non-benchmarked MSE formulas. For the case of the augmented vector approach, the MSE is based on augmenting the vector ${z}_{i}^{T}$ with ${\omega}_{i}{\ddot{\psi}}_{i}.$

Table 3.1. Estimated mean squared errors (mse) of the area level estimators

Estimator | mse |
---|---|
Fay-Herriot | $\text{mse}\left({\widehat{\theta}}_{i}^{\text{FH}}\right)=\{\begin{array}{ll}{g}_{0i}+{g}_{1i}+{g}_{2i}+2{g}_{3i}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{T}\text{var}\left(\widehat{\beta}\right){z}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}$ |
ADM | $\text{mse}\left({\widehat{\theta}}_{i}^{\text{ADM}}\right)=\{\begin{array}{ll}{g}_{0i}+{g}_{1i}+{g}_{2i}+2{g}_{3i}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{T}\text{var}\left(\widehat{\beta}\right){z}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}$ |
REML | $\text{mse}\left({\widehat{\theta}}_{i}^{\text{REML}}\right)=\{\begin{array}{ll}{g}_{1i}+{g}_{2i}+2{g}_{3i}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{T}\text{var}\left(\widehat{\beta}\right){z}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}$ |
WF | $\text{mse}\left({\widehat{\theta}}_{i}^{\text{WF}}\right)=\{\begin{array}{ll}{g}_{1i}+{g}_{2i}+2{g}_{3i}+{g}_{4i}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in A\hfill \\ {z}_{i}^{T}\text{var}\left(\widehat{\beta}\right){z}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}\hfill & \text{for}\text{\hspace{0.17em}}\text{\hspace{0.17em}}i\in \overline{A}\hfill \end{array}$ |

The various $g$ terms in Table 3.1 can be interpreted as follows. The ${g}_{0i}$ term is a bias correction for FH and ADM. The ${g}_{1i}$ term, given by ${g}_{1i}={\widehat{\gamma}}_{i}{\ddot{\psi}}_{i},$ accounts for most of the MSE if the number of areas is large. The ${g}_{2i}$ term accounts for the estimation of $\beta ,$ and $2{g}_{3i}$ accounts for the estimation of ${\sigma}_{v}^{2}.$ The ${g}_{4i}$ term in the WF procedure reflects the use of the estimate ${\widehat{\psi}}_{i}$ in place of ${\psi}_{i}.$ The estimated variance of $\widehat{\beta},$ given by $\text{var}\left(\widehat{\beta}\right)={\left({\displaystyle {\sum}_{i\in A}{\scriptscriptstyle \frac{{z}_{i}{z}_{i}^{T}}{{\ddot{\psi}}_{i}+{b}_{i}^{2}{\widehat{\sigma}}_{v}^{2}}}}\right)}^{-1},$ depends on the particular procedure used to estimate ${\sigma}_{v}^{2}.$
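Only some of the quantities in Table 3.1 are written out in this section: the leading term ${g}_{1i},$ the estimated variance of $\widehat{\beta},$ and the mse for areas without sample. The Python sketch below evaluates just those three; the remaining $g$ terms are left to the references cited above. Function names and toy values are assumptions.

```python
import numpy as np

def var_beta(Z, psi, sigma2_v, b=None):
    """Estimated variance of beta-hat: inverse of sum z z^T / (psi + b^2 sigma2)."""
    Z = np.asarray(Z, dtype=float)
    psi = np.asarray(psi, dtype=float)
    b = np.ones(len(psi)) if b is None else np.asarray(b, dtype=float)
    w = 1.0 / (psi + b**2 * sigma2_v)
    return np.linalg.inv((Z * w[:, None]).T @ Z)

def g1(psi, sigma2_v, b=None):
    """Leading mse term g_1i = gamma_i * psi_i for sampled areas."""
    psi = np.asarray(psi, dtype=float)
    b = np.ones(len(psi)) if b is None else np.asarray(b, dtype=float)
    gamma = b**2 * sigma2_v / (psi + b**2 * sigma2_v)
    return gamma * psi

def mse_unsampled(Z_new, V_beta, sigma2_v, b_new=None):
    """mse for areas in A-bar: z^T var(beta-hat) z + b^2 sigma2_v."""
    Z_new = np.atleast_2d(np.asarray(Z_new, dtype=float))
    b_new = np.ones(Z_new.shape[0]) if b_new is None else np.asarray(b_new, dtype=float)
    quad = np.einsum("ij,jk,ik->i", Z_new, V_beta, Z_new)   # row-wise z^T V z
    return quad + b_new**2 * sigma2_v

# Four sampled areas, intercept-only model, psi = 1 and sigma2_v = 1 (toy values).
Z = np.ones((4, 1))
psi = np.ones(4)
V = var_beta(Z, psi, sigma2_v=1.0)
g1_terms = g1(psi, sigma2_v=1.0)
mse_new = mse_unsampled([[1.0]], V, sigma2_v=1.0)
```

In this toy case `V` is 0.5, each `g1_terms` entry is 0.5 and `mse_new` is 1.5, which illustrates that a purely synthetic estimate carries the full model variance in its mse.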
