2. Basic theory
Jae-kwang Kim, Seunghwan Park and Seo-young Kim
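In this section, we introduce the basic theory for combining information for small area estimation. We begin with the simple case of combining two surveys. Assume that there are two surveys, survey A and survey B, obtained from separate probability sampling designs. The two surveys are not necessarily independent. From survey A, we obtain a design unbiased estimator $\hat{X}_h$ of the area total $X_h$ and its variance estimator $\hat{V}(\hat{X}_h)$. From survey B, we obtain a design unbiased estimator $\hat{Y}_h$ of $Y_h$. The sampling error of $(\hat{X}_h, \hat{Y}_h)$ can be expressed by the sampling error model
$$
\begin{pmatrix} \hat{X}_h \\ \hat{Y}_h \end{pmatrix}
= \begin{pmatrix} X_h \\ Y_h \end{pmatrix}
+ \begin{pmatrix} a_h \\ b_h \end{pmatrix}
\tag{2.1}
$$
and $a_h$ and $b_h$ represent the sampling errors associated with $\hat{X}_h$ and $\hat{Y}_h$ such that
$$
\begin{pmatrix} a_h \\ b_h \end{pmatrix}
\sim \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} V(a_h) & \mathrm{Cov}(a_h, b_h) \\ \mathrm{Cov}(a_h, b_h) & V(b_h) \end{pmatrix} \right].
$$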
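Our parameter of interest is the population total $Y_h$ of $y$ in area $h$. From (1.1), we obtain the following area level model:
$$
Y_h = N_h \beta_0 + \beta_1 X_h + \tilde{e}_h,
\tag{2.2}
$$
where $X_h$ is the population total of $x$ in area $h$, $N_h$ is the population size of area $h$, and $\tilde{e}_h = \sum_{i=1}^{N_h} e_{hi}$. We can express (2.2) in terms of the population mean:
$$
\bar{Y}_h = \beta_0 + \beta_1 \bar{X}_h + \bar{e}_h,
\tag{2.3}
$$
where $(\bar{X}_h, \bar{Y}_h, \bar{e}_h) = N_h^{-1} (X_h, Y_h, \tilde{e}_h)$. If we use a nested error model
$$
e_{hi} = \epsilon_h + u_{hi},
$$
where $\epsilon_h \sim (0, \sigma_\epsilon^2)$ and $u_{hi} \sim (0, \sigma_u^2)$, then $\bar{e}_h \sim (0, \sigma_\epsilon^2 + \sigma_u^2 / N_h)$. The nested error model is quite popular in small area estimation (e.g., Battese, Harter and Fuller 1988) and it assumes that $\mathrm{Cov}(e_{hi}, e_{hj}) = \sigma_\epsilon^2$ for $i \neq j$. Because $N_h$ is often quite large, we can safely assume that $\bar{e}_h \sim (0, \sigma_e^2)$ with $\sigma_e^2 = \sigma_\epsilon^2$.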
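As a small numerical illustration of why the unit-level part of the error can be ignored for large areas, the sketch below assumes the usual nested error form, in which each unit-level error is an area effect plus an uncorrelated unit effect. The function name and the variance values are chosen only for this example and do not come from the paper.

```python
def mean_error_variance(sigma_eps2, sigma_u2, n_units):
    """Variance of the area mean of e_hi = eps_h + u_hi over n_units units.

    Assumes eps_h (area effect, variance sigma_eps2) and u_hi (unit effect,
    variance sigma_u2) are mutually uncorrelated, so the unit effects
    average out at rate 1 / n_units while the area effect does not.
    """
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    return sigma_eps2 + sigma_u2 / n_units


# Hypothetical variance components; only the trend in n_units matters here.
for n_units in (10, 1_000, 100_000):
    print(n_units, mean_error_variance(0.5, 4.0, n_units))
```

With these made-up values the variance of the mean error approaches the area-effect variance as the area population grows, which is the approximation used in the text.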
The model (2.2) is called a structural error model because it describes the structural relationship between the two latent variables $Y_h$ and $X_h$. The two models, (2.1) and (2.2), are often encountered in the measurement error model literature (Fuller 1987). Thus, the model for small area estimation can be viewed as a measurement error model, as suggested by Fuller (1991), who originally used the measurement error model approach in unit-level modeling for small area estimation.
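Now, if we define $(\hat{\bar{X}}_h, \hat{\bar{Y}}_h) = N_h^{-1} (\hat{X}_h, \hat{Y}_h)$ and $(\bar{a}_h, \bar{b}_h) = N_h^{-1} (a_h, b_h)$, combining (2.1) and (2.3), we have
$$
\begin{pmatrix} \hat{\bar{X}}_h \\ \hat{\bar{Y}}_h \end{pmatrix}
= \begin{pmatrix} 0 \\ \beta_0 \end{pmatrix}
+ \begin{pmatrix} 1 \\ \beta_1 \end{pmatrix} \bar{X}_h
+ \begin{pmatrix} \bar{a}_h \\ \bar{b}_h + \bar{e}_h \end{pmatrix},
$$
which can also be written as
$$
\begin{pmatrix} \beta_0 + \beta_1 \hat{\bar{X}}_h \\ \hat{\bar{Y}}_h \end{pmatrix}
= \begin{pmatrix} 1 \\ 1 \end{pmatrix} \bar{Y}_h
+ \begin{pmatrix} \beta_1 \bar{a}_h - \bar{e}_h \\ \bar{b}_h \end{pmatrix}.
\tag{2.5}
$$
Thus, when all the model parameters in (2.5) are known, the best estimator of $\bar{Y}_h$ can be computed by
$$
\hat{\bar{Y}}_h^{*} = \left( \mathbf{1}_2' V^{-1} \mathbf{1}_2 \right)^{-1} \mathbf{1}_2' V^{-1}
\begin{pmatrix} \beta_0 + \beta_1 \hat{\bar{X}}_h \\ \hat{\bar{Y}}_h \end{pmatrix},
\tag{2.6}
$$
where $\mathbf{1}_2 = (1, 1)'$ and $V$ is the variance-covariance matrix of $(\beta_1 \bar{a}_h - \bar{e}_h, \bar{b}_h)'$. The variance of $\hat{\bar{Y}}_h^{*}$ is given by
$$
V\left( \hat{\bar{Y}}_h^{*} - \bar{Y}_h \right) = \left( \mathbf{1}_2' V^{-1} \mathbf{1}_2 \right)^{-1}.
$$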
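The GLS combination of two estimators of the same target can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: `gls_combine` and its argument names are hypothetical, both inputs are assumed to be unbiased for the same quantity (for example, a regression-based prediction and a direct estimate), and the error covariance matrix is treated as known.

```python
def gls_combine(z1, z2, v11, v22, v12=0.0):
    """GLS combination of two unbiased estimates z1 and z2 of one target.

    The errors of (z1, z2) are assumed to have the known covariance matrix
    V = [[v11, v12], [v12, v22]]. Returns the estimate
    (1' V^-1 1)^-1 1' V^-1 z and its variance (1' V^-1 1)^-1.
    """
    det = v11 * v22 - v12 ** 2
    if v11 <= 0 or det <= 0:
        raise ValueError("V must be positive definite")
    # Elements of the vector V^-1 1, using the closed-form 2x2 inverse.
    r1 = (v22 - v12) / det
    r2 = (v11 - v12) / det
    total = r1 + r2  # the scalar 1' V^-1 1
    estimate = (r1 * z1 + r2 * z2) / total
    variance = 1.0 / total
    return estimate, variance


# Hypothetical inputs: a regression-based value with error variance 1.0
# and a direct estimate with error variance 4.0, first uncorrelated,
# then with error covariance 0.5.
est_indep, var_indep = gls_combine(10.0, 12.0, v11=1.0, v22=4.0)
est_corr, var_corr = gls_combine(10.0, 12.0, v11=1.0, v22=4.0, v12=0.5)
print(est_indep, var_indep)
print(est_corr, var_corr)
```

In both cases the combined variance is below the smaller of the two input variances, and the more precise input receives the larger weight.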
The estimator in (2.6) can be called the generalized least squares (GLS) estimator because it applies the generalized least squares method from linear model theory. The GLS method is useful because it is optimal, in the sense of minimum variance among linear unbiased estimators, and it can incorporate additional sources of information naturally. For example, if another estimator
for
is also
available and satisfies
and
then the extended GLS model is written as
and the GLS estimator can be obtained by
where
is the
variance-covariance matrix of
The GLS
estimator has variance
If
is independent
of
the efficiency
gain by incorporating
into GLS in
terms of relative variance can be expressed as
where
The gain is high
if both the sampling variance of
and the model
variance
are small. If
then there is no
gain.
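The size of the efficiency gain can be explored numerically. The sketch below is a simplified illustration under stated assumptions, not the paper's formula: it assumes the additional estimator's error is uncorrelated with the errors of the estimators already combined, and it summarizes that error by a single total variance `v_extra` (sampling variance plus model variance, on the scale of the target). Under that assumption the GLS precisions simply add. The function and argument names are hypothetical.

```python
def relative_gain(v_current, v_extra):
    """Relative variance reduction from adding one more independent source.

    v_current: variance of the GLS estimator before adding the source.
    v_extra:   total error variance (sampling plus model) of the new source.
    With uncorrelated errors the precisions add, so the new variance is
    1 / (1 / v_current + 1 / v_extra).
    """
    if v_current <= 0 or v_extra <= 0:
        raise ValueError("variances must be positive")
    v_new = 1.0 / (1.0 / v_current + 1.0 / v_extra)
    return (v_current - v_new) / v_current


# Hypothetical values: a precise extra source, a noisy one, and a nearly
# uninformative one.
for v_extra in (0.2, 0.8, 80.0, 1e12):
    print(v_extra, relative_gain(0.8, v_extra))
```

The gain is large only when the total error variance of the added source is small relative to the current GLS variance, and it vanishes as that variance grows without bound.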
Remark 1 Note that model (2.5) can also
be written as
The GLS estimator obtained
from (2.8), which is the same as the GLS estimator obtained from (2.5), can be
expressed as
where
and
The estimator
when computed with estimated parameter
is called the synthetic estimator and the
optimal estimator in (2.9) is often called the composite estimator. It can be
shown that, ignoring the effect of estimating
the variance of the composite estimator is
equal to
and, as
the composite estimator is more efficient than
the direct estimator.
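The composite form can be illustrated with a short sketch. It assumes, for simplicity, that the errors of the synthetic and direct estimators are uncorrelated and that the model parameters are known; with correlated errors the weight and the variance take a more general form. The names `composite`, `v_syn` and `v_dir` are hypothetical.

```python
def composite(direct, synthetic, v_dir, v_syn):
    """Composite estimator as a weighted average of direct and synthetic parts.

    Assumes uncorrelated errors with known variances v_dir (direct) and
    v_syn (synthetic, model error included). The weight on the direct
    estimator is alpha = v_syn / (v_syn + v_dir), and the resulting
    variance equals alpha * v_dir.
    """
    if v_dir <= 0 or v_syn <= 0:
        raise ValueError("variances must be positive")
    alpha = v_syn / (v_syn + v_dir)
    estimate = alpha * direct + (1.0 - alpha) * synthetic
    variance = alpha * v_dir
    return estimate, variance, alpha


# Hypothetical inputs: a noisy direct estimate and a more stable synthetic one.
est, var, alpha = composite(direct=12.0, synthetic=10.0, v_dir=4.0, v_syn=1.0)
print(est, var, alpha)
```

Because the weight on the direct estimator lies strictly between 0 and 1 whenever both variances are finite and positive, the composite variance is smaller than the variance of the direct estimator in this setting.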