4 Weighted log composite likelihood: a unified approach
J.N.K. Rao, F. Verret and M.A. Hidiroglou
In this section we propose a unified approach
applicable to both linear and generalized linear multi-level models. This
approach is based on the concept of composite likelihood which has become
popular in the non-survey literature to handle clustered or spatial data (see
e.g., Lindsay 1988, Lele and Taper 2002 and Varin, Reid and Firth 2011). A pairwise
marginal composite likelihood is obtained by multiplying the likelihood
contributions from all the distinct pairs within clusters. Note that the
composite likelihood is obtained by pretending the sub-models are independent.
When the super-population model holds for the sample, then we can obtain
parameter estimators by maximizing the pairwise composite likelihood. Here we
extend this approach to handle informative designs by obtaining weighted
estimating equations that require only the marginal weights $w_i$ and $w_{j|i}$
and the pairwise weights $w_{jk|i}$, as in Section 3.
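The census log pairwise composite likelihood is given by
$$l_C(\theta) = \sum_{i=1}^{N} \sum_{j<k=1}^{M_i} \log f(y_{ij}, y_{ik} \mid \theta), \qquad (4.1)$$
where $f(y_{ij}, y_{ik} \mid \theta)$ is the marginal joint density of $y_{ij}$
and $y_{ik}$. We estimate (4.1) by the design-weighted log
pairwise composite likelihood
$$l_{wC}(\theta) = \sum_{i \in s} w_i \sum_{j<k \in s(i)} w_{jk|i} \log f(y_{ij}, y_{ik} \mid \theta), \qquad (4.2)$$
which depends only on the first order level 1 and level 2 inclusion
probabilities and the second order level 1 probabilities. We then solve the
weighted composite score equations
$$\hat{U}_{wC}(\theta) = \frac{\partial l_{wC}(\theta)}{\partial \theta} = \sum_{i \in s} w_i \sum_{j<k \in s(i)} w_{jk|i} \frac{\partial \log f(y_{ij}, y_{ik} \mid \theta)}{\partial \theta} = 0, \qquad (4.3)$$
obtained from (4.2) to get a weighted composite likelihood estimator,
$\hat{\theta}_{wC}$, of $\theta$. The proposed method is
applicable to linear and generalized linear two-level models.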
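As an illustration, the design-weighted log pairwise composite likelihood can be evaluated with a few lines of code. The sketch below is only a minimal example under stated assumptions: the data layout (a list of dictionaries with keys `w`, `y` and `pair_w`) and the function names are hypothetical, and the pairwise density used is the bivariate normal density implied by a random-intercept mean model, chosen purely for concreteness.

```python
import math
from itertools import combinations


def log_f_pair(y1, y2, mu, sigma2_v, sigma2_e):
    """Log bivariate normal density of a within-cluster pair under an
    assumed random-intercept mean model: common mean mu, common variance
    sigma2_v + sigma2_e, and covariance sigma2_v."""
    s2 = sigma2_v + sigma2_e
    det = s2 * s2 - sigma2_v * sigma2_v
    a, b = y1 - mu, y2 - mu
    quad = (s2 * (a * a + b * b) - 2.0 * sigma2_v * a * b) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def weighted_log_pairwise_cl(clusters, log_f, *theta):
    """Design-weighted log pairwise composite likelihood.

    clusters: list of dicts (hypothetical layout) with keys
      "w"      -> level 2 weight of the sampled cluster,
      "y"      -> list of sampled responses in the cluster,
      "pair_w" -> dict {(j, k): pairwise level 1 weight}, with j < k.
    """
    total = 0.0
    for c in clusters:
        inner = 0.0
        # All distinct sampled pairs (j, k), j < k, within the cluster.
        for j, k in combinations(range(len(c["y"])), 2):
            inner += c["pair_w"][(j, k)] * log_f(c["y"][j], c["y"][k], *theta)
        total += c["w"] * inner
    return total


# Made-up data: two sampled clusters with illustrative weights.
clusters = [
    {"w": 10.0, "y": [1.2, 0.7, 1.9],
     "pair_w": {(0, 1): 4.0, (0, 2): 4.0, (1, 2): 4.0}},
    {"w": 5.0, "y": [3.1, 2.4],
     "pair_w": {(0, 1): 2.5}},
]
value = weighted_log_pairwise_cl(clusters, log_f_pair, 2.0, 1.0, 0.5)
```

Passing the pairwise log density as an argument keeps the weighting logic separate from the model, so the same function could in principle be reused with a pairwise density from a generalized linear two-level model.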
We note that $\hat{U}_{wC}(\theta)$, given by (4.3), is a vector of
estimating functions with zero expectation with respect to the design and the
model, i.e., $E_m E_p\{\hat{U}_{wC}(\theta)\} = 0$. Using this result, it can be shown that the
weighted composite likelihood (WCL) estimator $\hat{\theta}_{wC}$
of $\theta$ is design-model consistent as the number of
level 2 units in the sample, $n$, increases, even when the within cluster sample
sizes, $m_i$, are small. Details of the proof are given in
Yi, Rao and Li (2012). In the non-survey context, we have limited theoretical
and empirical evidence that the composite likelihood approach leads to
efficient estimators (e.g., Bellio and Varin 2005, Lindsay et al. 2011). Our
simulation study (Section 5) indicates that the weighted composite likelihood
approach performs well in terms of efficiency, even for small within-cluster
sample sizes.
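The zero-expectation property can be sketched in two steps. The notation here is assumed for illustration and is not taken from the original: $I_i$ and $I_{jk|i}$ denote sample inclusion indicators for cluster $i$ and for the pair $(j,k)$ within cluster $i$, the weights are taken to be inverse inclusion probabilities, and $u_{i,jk}(\theta)$ denotes the score of the pairwise density.

```latex
% Assumed notation: I_i and I_{jk|i} are sample inclusion indicators,
% w_i = 1/\pi_i, w_{jk|i} = 1/\pi_{jk|i}, and
% u_{i,jk}(\theta) = \partial \log f(y_{ij}, y_{ik} \mid \theta) / \partial\theta.
\begin{align*}
E_p\{\hat{U}_{wC}(\theta)\}
  &= E_p\Big\{ \sum_{i=1}^{N} w_i I_i \sum_{j<k=1}^{M_i} w_{jk|i} I_{jk|i}\, u_{i,jk}(\theta) \Big\} \\
  &= \sum_{i=1}^{N} \sum_{j<k=1}^{M_i} u_{i,jk}(\theta), \\
E_m E_p\{\hat{U}_{wC}(\theta)\}
  &= \sum_{i=1}^{N} \sum_{j<k=1}^{M_i} E_m\{u_{i,jk}(\theta)\} = 0 .
\end{align*}
```

The first step uses $E_p(I_i) = \pi_i$ and $E_p(I_{jk|i} \mid I_i = 1) = \pi_{jk|i}$, so the weights cancel the inclusion probabilities. The second step uses the fact that each $u_{i,jk}(\theta)$ is the score of a correctly specified bivariate density and therefore has zero model expectation. Only first order level 2 and second order level 1 inclusion probabilities are needed for this argument.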
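In the case of the nested error model (3.13), following
Lele and Taper (2002) we can simplify the pairwise composite likelihood
approach by replacing the bivariate density function $f(y_{ij}, y_{ik} \mid \theta)$
by the univariate density functions of $y_{ij}$
and the difference $z_{ijk} = y_{ij} - y_{ik}$.
For the mean model (2.2), we have $y_{ij} \sim N(\mu, \sigma_v^2 + \sigma_e^2)$
and $z_{ijk} \sim N(0, 2\sigma_e^2)$. By reparametrizing
$\theta = (\mu, \sigma_v^2, \sigma_e^2)^T$ as $(\mu, \sigma^2, \sigma_e^2)^T$,
where $\sigma^2 = \sigma_v^2 + \sigma_e^2$, we see that the parameters of the two
univariate density functions are distinct and the design-weighted log composite likelihoods
corresponding to $y_{ij}$ and $z_{ijk}$ are given by
$$l_{wC1}(\mu, \sigma^2) = \sum_{i \in s} w_i \sum_{j \in s(i)} w_{j|i} \log f(y_{ij} \mid \mu, \sigma^2)$$
and
$$l_{wC2}(\sigma_e^2) = \sum_{i \in s} w_i \sum_{j<k \in s(i)} w_{jk|i} \log f(z_{ijk} \mid \sigma_e^2).$$
We then solve the resulting weighted composite score equations
to get the weighted composite likelihood (WCL) estimators
of $(\mu, \sigma^2)$ and $\sigma_e^2$. The WCL estimators are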
identical to (3.9)-(3.11) obtained by the weighted estimating
equations approach of Section 3.
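For the mean model, the weighted composite score equations based on normal univariate densities can be solved in closed form. The sketch below is a minimal illustration derived directly from those normal densities; equations (3.9)-(3.11) are not reproduced in this section, so the formulas in the code should be checked against Section 3 rather than taken as the authors' exact expressions. The function name `wcl_mean_model` and the data layout (keys `w`, `unit_w`, `y`, `pair_w`) are hypothetical.

```python
from itertools import combinations


def wcl_mean_model(clusters):
    """Closed-form solution of the weighted composite score equations for an
    assumed mean model y_ij = mu + v_i + e_ij with normal components.

    clusters: list of dicts (hypothetical layout) with keys
      "w"      -> level 2 weight,
      "unit_w" -> list of level 1 weights, aligned with "y",
      "y"      -> list of sampled responses,
      "pair_w" -> dict {(j, k): pairwise level 1 weight}, with j < k.
    Assumes at least one sampled cluster contains two or more units.
    """
    # Score for mu from the y_ij density: weighted mean of the responses.
    sw = sum(c["w"] * wj for c in clusters for wj in c["unit_w"])
    mu = sum(c["w"] * wj * y
             for c in clusters
             for wj, y in zip(c["unit_w"], c["y"])) / sw

    # Score for sigma^2 = sigma_v^2 + sigma_e^2: weighted mean squared deviation.
    sigma2 = sum(c["w"] * wj * (y - mu) ** 2
                 for c in clusters
                 for wj, y in zip(c["unit_w"], c["y"])) / sw

    # Score for sigma_e^2 from z_ijk = y_ij - y_ik ~ N(0, 2 sigma_e^2).
    num = den = 0.0
    for c in clusters:
        for j, k in combinations(range(len(c["y"])), 2):
            wt = c["w"] * c["pair_w"][(j, k)]
            num += wt * (c["y"][j] - c["y"][k]) ** 2
            den += wt
    sigma2_e = num / (2.0 * den)

    # Back-transform; this difference is not truncated and can be negative.
    sigma2_v = sigma2 - sigma2_e
    return mu, sigma2, sigma2_e, sigma2_v


# Made-up data with all weights equal to one.
demo = [
    {"w": 1.0, "unit_w": [1.0, 1.0], "y": [0.0, 2.0], "pair_w": {(0, 1): 1.0}},
    {"w": 1.0, "unit_w": [1.0, 1.0], "y": [4.0, 6.0], "pair_w": {(0, 1): 1.0}},
]
estimates = wcl_mean_model(demo)
```

Because the parameters of the two univariate densities are distinct after reparametrization, the three score equations decouple and no iteration is needed in this special case.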
We now turn to the nested error linear regression
model (3.13). We first note that
where
and
It follows that the weighted composite score
equations are given by
and
The resulting WCL estimators of
,
and
are given by
and
The estimator of
is given by
Again, the WCL estimators
,
and
are identical to (3.17)-(3.19) obtained from
the weighted estimating equations approach of Section 3.
The above composite likelihood approach, based on the univariate densities of
$y_{ij}$ and the differences $z_{ijk} = y_{ij} - y_{ik}$, is not applicable to the linear
two-level model given by (2.4) because the parameter vector,
$\theta$, is not identifiable under the
composite likelihood obtained from the $y_{ij}$
and $z_{ijk}$. We need the pairwise method to
handle model (2.4).
Marginally,
is bivariate normal with means
and
and
covariance matrix
It now follows
from (4.3) that the weighted composite score equations are given by
and
where
is the
matrix with rows
and
,
and
is the P-vector
with elements
and the
distinct elements of
denoted by
. We can solve the weighted
composite score equations (4.4) and (4.5) iteratively using the Newton-Raphson
method or some other iterative method to obtain the WCL estimators
and
.
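The iterative solution mentioned above can be sketched generically. Newton-Raphson updates the current value by $\theta^{(t+1)} = \theta^{(t)} - J(\theta^{(t)})^{-1} U(\theta^{(t)})$, where $U$ is the vector of score functions and $J$ its Jacobian. The code below is a minimal sketch, not the authors' implementation: the function names are hypothetical, the Jacobian is approximated by forward differences rather than derived analytically, and the toy score function (a weighted normal score for a mean and a variance) merely stands in for the weighted composite score equations (4.4) and (4.5), which are not reproduced here.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-14:
            raise ValueError("singular Jacobian")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def newton_raphson(score, theta0, tol=1e-10, max_iter=50, h=1e-6):
    """Solve score(theta) = 0 using a forward-difference Jacobian."""
    theta = list(theta0)
    for _ in range(max_iter):
        u = score(theta)
        if max(abs(v) for v in u) < tol:
            return theta
        # jac[r][c] approximates d u_r / d theta_c.
        jac = [[0.0] * len(theta) for _ in u]
        for c in range(len(theta)):
            shifted = theta[:]
            shifted[c] += h
            u_shift = score(shifted)
            for r in range(len(u)):
                jac[r][c] = (u_shift[r] - u[r]) / h
        step = solve_linear(jac, [-v for v in u])
        theta = [t + s for t, s in zip(theta, step)]
    raise RuntimeError("Newton-Raphson did not converge")


# Toy stand-in for a weighted score: theta = (mean, variance).
Y = [1.0, 2.0, 4.0]
W = [1.0, 2.0, 1.0]


def toy_score(theta):
    mu, s2 = theta
    u_mu = sum(w * (y - mu) for w, y in zip(W, Y))
    u_s2 = sum(w * ((y - mu) ** 2 - s2) for w, y in zip(W, Y))
    return [u_mu, u_s2]


root = newton_raphson(toy_score, [0.0, 1.0])
```

In practice an analytic Jacobian, or a Fisher-scoring variant, would usually be preferred over finite differences, and variance parameters may need to be constrained to remain in the parameter space during the iterations.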
In the special case of the nested error linear regression model (3.13),
the census score equations, based on the full census log-likelihood
given by (2.5), can be written in a closed
form. The corresponding sample weighted score equations depend only on the
level 1 weights $w_{j|i}$ and $w_{jk|i}$
and the level 2 weights $w_i$, similar to the weighted composite score
equations (see the Appendix). The resulting estimators are design-model
consistent for $\theta$, unlike the estimators based on
the weighted pseudo log-likelihood
given by (2.7) and (2.8). However, for more
complex models, such as two level models with random slopes, the sample
weighted score equations will depend on third order and fourth order level 1
inclusion probabilities, unlike the weighted composite score equations (4.3)
that depend only on the first order and second order level 1 inclusion
probabilities, even for complex multi-level models. We have therefore not
included the weighted score equations approach, based on the full census log-likelihood,
in the simulation study.