4 Weighted log composite likelihood: a unified approach

J.N.K. Rao, F. Verret and M.A. Hidiroglou

In this section we propose a unified approach applicable to both linear and generalized linear multi-level models. This approach is based on the concept of composite likelihood, which has become popular in the non-survey literature for handling clustered or spatial data (see, e.g., Lindsay 1988, Lele and Taper 2002 and Varin, Reid and Firth 2011). A pairwise marginal composite likelihood is obtained by multiplying the likelihood contributions from all the distinct pairs within clusters. Note that the composite likelihood is obtained by pretending the sub-models are independent. When the super-population model holds for the sample, we can obtain parameter estimators by maximizing the pairwise composite likelihood. Here we extend this approach to handle informative designs by obtaining weighted estimating equations that require only the marginal weights $w_{i}$ and $w_{j | i}$ and the pairwise weights $w_{j k | i},$ as in Section 3.

The census log pairwise composite likelihood is given by

$l_{C} (θ) = \sum_{i = 1}^{N} \sum_{j < k = 1}^{M_{i}} \log f (y_{i j}, y_{i k} | θ), \qquad (4.1)$

where $f (y_{i j}, y_{i k} | θ)$ is the marginal joint density of $y_{i j}$ and $y_{i k} .$ We estimate (4.1) by the design-weighted log pairwise composite likelihood

$l_{w C} (θ) = \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} \log f (y_{i j}, y_{i k} | θ), \qquad (4.2)$

which depends only on the first-order level 1 and level 2 inclusion probabilities and the second-order level 1 inclusion probabilities. We then solve the weighted composite score equations

${\hat{U}}_{w C} (θ) = \partial l_{w C} (θ) / \partial θ = 0, \qquad (4.3)$

obtained from (4.2) to get a weighted composite likelihood estimator, ${\hat{θ}}_{w C},$ of $θ$ . The proposed method is applicable to linear and generalized linear two-level models.
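As a rough illustration of (4.2), the following Python sketch evaluates the design-weighted log pairwise composite likelihood for a simple mean model with a random cluster effect, under which each within-cluster pair is taken to be bivariate normal with common mean $μ$, variance $σ_{v}^{2} + σ_{e}^{2}$ and covariance $σ_{v}^{2}$. The data layout (a list of dictionaries with keys `w`, `y` and `pair_w`) and the function names are assumptions made for this example only; they are not part of the paper.

```python
import math

def log_bvn_pair(y1, y2, mu, sigma_v2, sigma_e2):
    """Log bivariate normal density of one pair (y1, y2) with common mean mu,
    variance sigma_v2 + sigma_e2 and covariance sigma_v2 (requires sigma_e2 > 0)."""
    s = sigma_v2 + sigma_e2            # marginal variance of each observation
    c = sigma_v2                       # covariance within a cluster
    det = s * s - c * c                # determinant of the 2 x 2 covariance matrix
    a, b = y1 - mu, y2 - mu
    quad = (s * (a * a + b * b) - 2.0 * c * a * b) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

def weighted_log_pairwise_cl(clusters, mu, sigma_v2, sigma_e2):
    """Evaluate sum_i w_i sum_{j<k} w_{jk|i} log f(y_ij, y_ik | theta), as in (4.2).

    Each cluster is a dict (hypothetical layout):
      "w"      : level 2 weight w_i
      "y"      : list of sampled responses in the cluster
      "pair_w" : dict mapping index pairs (j, k), j < k, to pairwise weights w_{jk|i}
    """
    total = 0.0
    for cl in clusters:
        y = cl["y"]
        inner = 0.0
        for (j, k), w_jk in cl["pair_w"].items():
            inner += w_jk * log_bvn_pair(y[j], y[k], mu, sigma_v2, sigma_e2)
        total += cl["w"] * inner
    return total
```

A numerical optimizer could be applied to this function to approximate ${\hat{θ}}_{w C}$; that step is left out here because the choice of optimizer is not specified in the text.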

We note that ${\hat{U}}_{w C} (θ)$ , given by (4.3), is a vector of estimating functions with zero expectation with respect to the design and the model, i.e., $E_{m} E_{p} \{ {\hat{U}}_{w C} (θ) \} = 0 .$ Using this result, it can be shown that the weighted composite likelihood (WCL) estimator ${\hat{θ}}_{w C}$ of $θ$ is design-model consistent as the number of level 2 units in the sample, $n,$ increases, even when the within cluster sample sizes, $m_{i},$ are small. Details of the proof are given in Yi, Rao and Li (2012). In the non-survey context, we have limited theoretical and empirical evidence that the composite likelihood approach leads to efficient estimators (e.g., Bellio and Varin 2005, Lindsay et al. 2011). Our simulation study (Section 5) indicates that the weighted composite likelihood approach performs well in terms of efficiency, even for small within-cluster sample sizes.
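The zero-expectation property can be sketched in two steps. The sketch assumes that $w_{i}$ and $w_{j k | i}$ are the inverses of the corresponding inclusion probabilities, so that weighted sample sums are design-unbiased for census sums, and it writes $u_{i j k} (θ) = \partial \log f (y_{i j}, y_{i k} | θ) / \partial θ$ for the score of a single pair (a notation introduced here for illustration only). Taking the design expectation first gives the census composite score,

$E_{p} \{ {\hat{U}}_{w C} (θ) \} = \sum_{i = 1}^{N} \sum_{j < k = 1}^{M_{i}} u_{i j k} (θ) = \partial l_{C} (θ) / \partial θ ,$

and, provided the bivariate marginal density of each pair is correctly specified, each pairwise score has model expectation zero, so that

$E_{m} E_{p} \{ {\hat{U}}_{w C} (θ) \} = \sum_{i = 1}^{N} \sum_{j < k = 1}^{M_{i}} E_{m} \{ u_{i j k} (θ) \} = 0 .$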

In the case of the nested error model (3.13), following Lele and Taper (2002) we can simplify the pairwise composite likelihood approach by replacing the bivariate density function $f (y_{i j}, y_{i k} | θ)$ by the univariate density functions of $y_{i j}$ and the difference $z_{i j k} = y_{i j} - y_{i k} .$ For the mean model (2.2), we have $y_{i j} \sim N (μ, σ_{v}^{2} + σ_{e}^{2})$ and $z_{i j k} \sim N (0, 2 σ_{e}^{2})$ . By reparametrizing $θ = {(μ, σ_{v}^{2}, σ_{e}^{2})}^{T}$ as $ϕ = {(μ, σ^{2}, σ_{e}^{2})}^{T}$ where $σ^{2} = σ_{v}^{2} + σ_{e}^{2},$ we see that the parameters of the two univariate density functions are distinct and the log composite likelihoods corresponding to $y_{i j}$ and $z_{i j k}$ are given by

$l_{w C y} (μ, σ^{2}) = \sum_{i \in s} w_{i} \sum_{j \in s (i)} w_{j | i} \log f (y_{i j} | μ, σ^{2})$

and

$l_{w C z} (σ_{e}^{2}) = \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} \log f (z_{i j k} | σ_{e}^{2}) .$

We then solve the resulting weighted composite score equations

${\hat{U}}_{w C y 1} (μ, σ^{2}) = \partial l_{w C y} (μ, σ^{2}) / \partial μ = \sum_{i \in s} w_{i} \sum_{j \in s (i)} w_{j | i} (y_{i j} - μ) / σ^{2} = 0,$

${\hat{U}}_{w C y 2} (μ, σ^{2}) = \partial l_{w C y} (μ, σ^{2}) / \partial σ^{2} = \frac{1}{2} \sum_{i \in s} w_{i} \sum_{j \in s (i)} w_{j | i} [- \frac{1}{σ^{2}} + \frac{{(y_{i j} - μ)}^{2}}{σ^{4}}] = 0,$

${\hat{U}}_{w C z} (σ_{e}^{2}) = \partial l_{w C z} (σ_{e}^{2}) / \partial σ_{e}^{2} = \frac{1}{2} \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} (- \frac{1}{σ_{e}^{2}} + \frac{z_{i j k}^{2}}{2 σ_{e}^{4}}) = 0$

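to get the weighted composite likelihood (WCL) estimators ${\hat{μ}}_{w C},$ ${\hat{σ}}_{w C}^{2}$ and ${\hat{σ}}_{e w C}^{2}$ , and hence ${\hat{σ}}_{v w C}^{2} = {\hat{σ}}_{w C}^{2} - {\hat{σ}}_{e w C}^{2} .$ The WCL estimators ${\hat{μ}}_{w C},$ ${\hat{σ}}_{v w C}^{2}$ and ${\hat{σ}}_{e w C}^{2}$ are identical to (3.9)-(3.11) obtained by the weighted estimating equations approach of Section 3.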
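Solving the three score equations above by hand gives weighted means of $y_{i j}$ , of ${(y_{i j} - μ)}^{2}$ and of $z_{i j k}^{2} / 2$ . The Python sketch below computes these closed forms directly from the score equations in this section; it does not reproduce (3.9)-(3.11), which are not shown here. The data layout (keys `w`, `y`, `unit_w`, `pair_w`) and the function name are assumptions made for this example.

```python
def wcl_mean_model(clusters):
    """Closed-form solutions of the weighted composite score equations
    for the mean model, returned as (mu, sigma_v2, sigma_e2).

    Each cluster is a dict (hypothetical layout):
      "w"      : level 2 weight w_i
      "y"      : list of sampled responses
      "unit_w" : list of level 1 weights w_{j|i}, aligned with "y"
      "pair_w" : dict mapping (j, k), j < k, to pairwise weights w_{jk|i}
    """
    # Score equation for mu: weighted mean of y_ij with weights w_i * w_{j|i}.
    sum_w = sum_wy = 0.0
    for cl in clusters:
        for y, w_j in zip(cl["y"], cl["unit_w"]):
            w = cl["w"] * w_j
            sum_w += w
            sum_wy += w * y
    mu = sum_wy / sum_w

    # Score equation for sigma^2: weighted mean of squared deviations.
    ss = 0.0
    for cl in clusters:
        for y, w_j in zip(cl["y"], cl["unit_w"]):
            ss += cl["w"] * w_j * (y - mu) ** 2
    sigma2 = ss / sum_w

    # Score equation for sigma_e^2: weighted mean of z_ijk^2 / 2 over pairs.
    num = den = 0.0
    for cl in clusters:
        for (j, k), w_jk in cl["pair_w"].items():
            z = cl["y"][j] - cl["y"][k]
            num += cl["w"] * w_jk * z * z
            den += cl["w"] * w_jk
    sigma_e2 = num / (2.0 * den)

    # sigma_v^2 is recovered by difference; it is not truncated at zero here.
    return mu, sigma2 - sigma_e2, sigma_e2
```

Because ${\hat{σ}}_{v w C}^{2}$ is obtained by subtraction, it can be negative in small samples; how to handle that case is not addressed in this sketch.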

We now turn to the nested error linear regression model (3.13). We first note that $y_{i j} \sim N (x_{i j}^{T} β, σ^{2})$ where $σ^{2} = σ_{v}^{2} + σ_{e}^{2},$ and $z_{i j k} = y_{i j} - y_{i k} \sim N \{ {(x_{i j} - x_{i k})}^{T} β, 2 σ_{e}^{2} \} .$ It follows that the weighted composite score equations are given by

$\begin{matrix} {\hat{U}}_{w C y 1} (β, σ^{2}) = \partial l_{w C y} (β, σ^{2}) / \partial β \\ = \sum_{i \in s} w_{i} \sum_{j \in s (i)} w_{j | i} x_{i j} (y_{i j} - x_{i j}^{T} β) / σ^{2} = 0, \end{matrix}$

$\begin{matrix} {\hat{U}}_{w C y 2} (β, σ^{2}) = \partial l_{w C y} (β, σ^{2}) / \partial σ^{2} \\ = - \frac{1}{2} \sum_{i \in s} w_{i} \sum_{j \in s (i)} w_{j | i} [\frac{1}{σ^{2}} - \frac{{(y_{i j} - x_{i j}^{T} β)}^{2}}{σ^{4}}] = 0 \end{matrix}$

and

$\begin{matrix} {\hat{U}}_{w C z} (σ_{e}^{2}) = \partial l_{w C z} (σ_{e}^{2}) / \partial σ_{e}^{2} \\ = - \frac{1}{2} \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} \{ \frac{1}{σ_{e}^{2}} - \frac{{[z_{i j k} - {(x_{i j} - x_{i k})}^{T} β]}^{2}}{2 σ_{e}^{4}} \} = 0. \end{matrix}$

The resulting WCL estimators of $β$ , $σ_{v}^{2}$ and $σ_{e}^{2}$ are given by

${\hat{β}}_{w C} = {(\sum_{i \in s} \sum_{j \in s (i)} w_{i j} x_{i j} x_{i j}^{T})}^{- 1} (\sum_{i \in s} \sum_{j \in s (i)} w_{i j} x_{i j} y_{i j}),$

${\hat{σ}}_{w C}^{2} = \sum_{i \in s} \sum_{j \in s (i)} w_{i j} {(y_{i j} - x_{i j}^{T} {\hat{β}}_{w C})}^{2} / \sum_{i \in s} \sum_{j \in s (i)} w_{i j},$

and

${\hat{σ}}_{e w C}^{2} = \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} {[z_{i j k} - {(x_{i j} - x_{i k})}^{T} {\hat{β}}_{w C}]}^{2} / (2 \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i}) .$

The estimator of $σ_{v}^{2}$ is given by ${\hat{σ}}_{v w C}^{2} = {\hat{σ}}_{w C}^{2} - {\hat{σ}}_{e w C}^{2} .$ Again, the WCL estimators ${\hat{β}}_{w C}$ , ${\hat{σ}}_{v w C}^{2}$ and ${\hat{σ}}_{e w C}^{2}$ are identical to (3.17)-(3.19) obtained from the weighted estimating equations approach of Section 3.
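The closed-form regression estimators above can be sketched in plain Python as follows. The sketch takes $w_{i j} = w_{i} w_{j | i}$ , which is inferred by comparing the score equations with the estimator formulas rather than quoted from Section 3, and it uses the identity $z_{i j k} - {(x_{i j} - x_{i k})}^{T} β = (y_{i j} - x_{i j}^{T} β) - (y_{i k} - x_{i k}^{T} β)$ , so the pairwise term is a difference of residuals. The data layout and the helper `solve_linear` are assumptions made for this example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]    # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def wcl_nested_error_regression(clusters):
    """Return (beta, sigma_v2, sigma_e2) from the closed-form WCL estimators.

    Each cluster is a dict (hypothetical layout) with keys "w" (w_i),
    "x" (list of covariate vectors), "y", "unit_w" (w_{j|i}) and
    "pair_w" (dict mapping (j, k), j < k, to w_{jk|i}).
    """
    p = len(clusters[0]["x"][0])
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    sum_w = 0.0
    for cl in clusters:
        for x, y, w_j in zip(cl["x"], cl["y"], cl["unit_w"]):
            w = cl["w"] * w_j                           # assumed w_ij = w_i * w_{j|i}
            sum_w += w
            for a in range(p):
                xty[a] += w * x[a] * y
                for b in range(p):
                    xtx[a][b] += w * x[a] * x[b]
    beta = solve_linear(xtx, xty)                       # weighted least squares

    ss = num = den = 0.0
    for cl in clusters:
        res = [y - sum(xa * ba for xa, ba in zip(x, beta))
               for x, y in zip(cl["x"], cl["y"])]
        for r, w_j in zip(res, cl["unit_w"]):
            ss += cl["w"] * w_j * r * r
        for (j, k), w_jk in cl["pair_w"].items():
            d = res[j] - res[k]                         # difference of residuals
            num += cl["w"] * w_jk * d * d
            den += cl["w"] * w_jk
    sigma2 = ss / sum_w
    sigma_e2 = num / (2.0 * den)
    return beta, sigma2 - sigma_e2, sigma_e2
```

With an intercept-only covariate vector, this reduces to the estimators for the mean model.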

The above composite likelihood approach, based on $y_{i j}$ and $z_{i j k} = y_{i j} - y_{i k}$ , is not applicable to the linear two-level model given by (2.4) because the parameter vector, $θ$ , is not identifiable under the composite likelihood obtained from the $y_{i j}$ and $z_{i j k}$ . We need the pairwise method to handle model (2.4).

Marginally, ${(y_{i j}, y_{i k})}^{T}$ is bivariate normal with means $x_{i j}^{T} β$ and $x_{i k}^{T} β$ and $2 \times 2$ covariance matrix

$Σ_{i (j k)} = [\begin{matrix} σ_{e}^{2} + x_{i j}^{T} Σ_{v} x_{i j} & x_{i j}^{T} Σ_{v} x_{i k} \\ x_{i k}^{T} Σ_{v} x_{i j} & σ_{e}^{2} + x_{i k}^{T} Σ_{v} x_{i k} \end{matrix}] .$

It now follows from (4.3) that the weighted composite score equations are given by

$β : {\hat{U}}_{w C β} = \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} X_{i (j k)}^{T} Σ_{i (j k)}^{- 1} (y_{i (j k)} - X_{i (j k)} β) = 0 \qquad (4.4)$

and

$\begin{array}{l} τ : {\hat{U}}_{w C l} = \frac{1}{2} \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} [{(y_{i (j k)} - X_{i (j k)} β)}^{T} Σ_{i (j k)}^{- 1} \frac{\partial Σ_{i (j k)}}{\partial τ_{l}} Σ_{i (j k)}^{- 1} (y_{i (j k)} - X_{i (j k)} β) \\ \qquad - \mathrm{tr} (Σ_{i (j k)}^{- 1} \frac{\partial Σ_{i (j k)}}{\partial τ_{l}})] = 0, \quad l = 1, \ldots, p (p + 1) / 2 + 1 = P \qquad (4.5) \end{array}$

where $X_{i (j k)}$ is the $2 \times p$ matrix with rows $x_{i j}^{T}$ and $x_{i k}^{T}$ , $y_{i (j k)} = {(y_{i j}, y_{i k})}^{T}$ and $τ$ is the $P$-vector with elements $τ_{1} = σ_{e}^{2}$ and the $p (p + 1) / 2$ distinct elements of $Σ_{v}$ denoted by $τ_{2}, \ldots, τ_{P}$ . We can solve the weighted composite score equations (4.4) and (4.5) iteratively using the Newton-Raphson method or some other iterative method to obtain the WCL estimators ${\hat{β}}_{w C}$ and ${\hat{τ}}_{w C}$ .
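One way to organize such an iteration, offered here as a sketch and not as the authors' algorithm, uses the fact that (4.4) is linear in $β$ once $τ$ is held fixed. With $X_{i (j k)}$ the $2 \times p$ matrix defined above, the residual of a pair is $y_{i (j k)} - X_{i (j k)} β$ , and for a given $τ$ the solution of (4.4) is the weighted generalized least squares form

$\hat{β} (τ) = {(\sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} X_{i (j k)}^{T} Σ_{i (j k)}^{- 1} X_{i (j k)})}^{- 1} \sum_{i \in s} w_{i} \sum_{j < k \in s (i)} w_{j k | i} X_{i (j k)}^{T} Σ_{i (j k)}^{- 1} y_{i (j k)} .$

For the first variance component the derivative needed in (4.5) is simply the identity matrix, since $σ_{e}^{2}$ enters only the diagonal of $Σ_{i (j k)}$ :

$\partial Σ_{i (j k)} / \partial τ_{1} = I_{2} .$

An alternating scheme could therefore update $β$ by the closed form above and then update $τ$ by a Newton-Raphson step on (4.5), repeating until the estimates stabilize.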

In the special case of the nested error linear regression model (3.13), the census score equations, based on the full census log-likelihood $l (θ)$ given by (2.5), can be written in a closed form. The corresponding sample weighted score equations depend only on the level 1 weights $w_{j | i}$ and $w_{j k | i}$ and the level 2 weights $w_{i},$ similar to the weighted composite score equations (see the Appendix). The resulting estimators are design-model consistent for $θ$ , unlike the estimators based on the weighted pseudo log-likelihood $l_{w} (θ)$ given by (2.7) and (2.8). However, for more complex models, such as two-level models with random slopes, the sample weighted score equations will depend on third-order and fourth-order level 1 inclusion probabilities, unlike the weighted composite score equations (4.3), which depend only on the first-order and second-order level 1 inclusion probabilities, even for complex multi-level models. We have therefore not included the weighted score equations approach, based on the full census log-likelihood, in the simulation study.


Date modified: 2017-09-20


Survey Methodology
