2 A general procedure for constructing fully efficient replication weights

Jae Kwang Kim and Changbao Wu

In principle, we can construct replication weights for any measurable sampling design, using the method outlined in Fay (1984) and Fay and Dippo (1989), such that the resulting replication variance estimators are algebraically equivalent to the standard linearization variance estimators.

Let $U = \{1, 2, \dots, N\}$ be the set of $N$ units in the finite population and $S = \{1, 2, \dots, n\}$ be the set of $n$ units in the sample, selected according to a probability sampling design. Let $w_{i} = 1 / π_{i}$ be the basic design weight, where $π_{i} = P (i \in S)$ is the first-order inclusion probability for unit $i$.

Let $y_{i}$ be the value of the study variable $y$ for unit $i$ and $t_{y} = \sum_{i = 1}^{N} y_{i}$ be the population total of interest. The Horvitz-Thompson estimator of $t_{y}$ is given by

${\hat{t}}_{y} = \sum_{i \in S} w_{i} y_{i} .$ (2.1)

The estimator ${\hat{t}}_{y}$ given in (2.1) is also called the expansion estimator, with the basic design weight $w_{i}$ denoting the number of units in the population represented by unit $i$ in the sample. The standard variance estimator of ${\hat{t}}_{y}$ can be written as

$v = \sum_{i \in S} \sum_{j \in S} Ω_{i j} y_{i} y_{j},$ (2.2)

where $Ω_{i j} = (π_{i j} - π_{i} π_{j}) / (π_{i j} π_{i} π_{j})$ and $π_{i j} = P (i, j \in S)$ is the second-order joint inclusion probability for the pair $(i, j)$. It is assumed that $π_{i j} > 0$ for all $(i, j)$. Note that $π_{i i} = π_{i}$. The standard variance estimator $v$ is often viewed as fully efficient since it is the Horvitz-Thompson estimator of the design-based variance $V ({\hat{t}}_{y})$.
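As a minimal sketch of (2.1) and (2.2), the Python code below evaluates the Horvitz-Thompson estimate and the variance estimate by the double sum over the sample. The function names `ht_total` and `ht_variance` and the toy data (simple random sampling without replacement with $N = 20$ and $n = 5$) are illustrative assumptions and do not come from the paper.

```python
def ht_total(y, pi):
    """Horvitz-Thompson estimate (2.1): sum of y_i / pi_i over the sample."""
    return sum(yi / p for yi, p in zip(y, pi))


def ht_variance(y, pi, pij):
    """Variance estimate (2.2).

    pij[i][j] is the joint inclusion probability of units i and j,
    with pij[i][i] equal to pi[i].
    """
    n = len(y)
    v = 0.0
    for i in range(n):
        for j in range(n):
            omega = (pij[i][j] - pi[i] * pi[j]) / (pij[i][j] * pi[i] * pi[j])
            v += omega * y[i] * y[j]
    return v


# Toy data (assumed): simple random sampling without replacement.
N, n = 20, 5
y = [3.0, 5.0, 2.0, 8.0, 7.0]
pi = [n / N] * n
joint = n * (n - 1) / (N * (N - 1))
pij = [[pi[i] if i == j else joint for j in range(n)] for i in range(n)]

t_hat = ht_total(y, pi)
v = ht_variance(y, pi, pij)
print(t_hat, v)
```

For this toy sample the estimate is $N \bar{y} = 100$ and the variance estimate agrees, up to rounding, with the familiar expression $N^{2} (1 - n / N) s^{2} / n = 390$, where $s^{2} = 6.5$ is the sample variance.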

Let $Δ = (Ω_{i j})$ be an $n \times n$ matrix. We can rewrite (2.2) as $v = y^{'} Δ y$, where $y = (y_{1}, y_{2}, \dots, y_{n})^{'}$ is the vector of sampled values $y_{i}$. The matrix $Δ$ is nonnegative definite and can be decomposed as

$Δ = \sum_{k = 1}^{p} λ_{k} δ_{k} {δ^{'}}_{k}$ (2.3)

for some $λ_{k} > 0$ and some $n$-dimensional vectors $δ_{k}$, $k = 1, 2, \dots, p$. The best-known decomposition of the form (2.3) is the spectral decomposition, where $δ_{k}$ is the eigenvector associated with the eigenvalue $λ_{k}$. In practice, very small eigenvalues are often ignored for computational reasons. For stratified sampling, the matrix $Δ$ is block-diagonal, so the computational burden may be alleviated. However, we do not restrict (2.3) to the spectral decomposition. Any decomposition satisfying (2.3) can be used.

Suppose that we want to express the fully efficient variance estimator $v$ given by (2.2) as a replication variance estimator in the form of

$v_{R} = \sum_{k = 1}^{L} c_{k} {({\hat{t}}_{y}^{(k)} - {\hat{t}}_{y})}^{2},$ (2.4)

where ${\hat{t}}_{y}^{(k)} = \sum_{i \in S} w_{i}^{(k)} y_{i}, w^{(k)} = (w_{1}^{(k)}, \dots, w_{n}^{(k)})^{'}$ is the $k^{th}$ set of replication weights, $c_{k} > 0$ is the factor associated with the $k^{th}$ set of replication weights and $L$ is the total number of replications; see Kim, Navarro and Fuller (2006) for further discussion.

The form given by (2.4) does not include all replication variance estimators. For instance, Campbell (1980) provided a jackknife variance estimator where the pseudovalues are derived based on the von Mises approximation to the parameter of interest. Nevertheless, most replication variance estimators can be put in this form.

We have the following result on the construction of $w^{(k)}$ for $v_{R}$ based on the decomposition (2.3).

Theorem 1. The fully efficient variance estimator $v$ and the replication variance estimator $v_{R}$ are algebraically identical if we let $L = p$ and $w^{(k)} = w + {(λ_{k} / c_{k})}^{1 / 2} δ_{k},$ where $w = (w_{1}, \dots, w_{n})^{'}$ is the set of original basic design weights.

Proof. The proof follows directly from the fact that $v = y^{'} Δ y = \sum_{k = 1}^{p} λ_{k} {({δ^{'}}_{k} y)}^{2}$ and that ${\hat{t}}_{y}^{(k)} - {\hat{t}}_{y} = (w^{(k)} - w)^{'} y = {(λ_{k} / c_{k})}^{1 / 2} {δ^{'}}_{k} y .$
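The construction in Theorem 1 can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a pivot-free Cholesky-type (LDL') elimination instead of the spectral decomposition, which is permitted because any decomposition satisfying (2.3) can be used; the helper names `rank_one_terms`, `replicate_weights` and `replication_variance`, the toy data, and the choice $c_{k} = 1$ are all hypothetical.

```python
import math


def rank_one_terms(delta, rel_tol=1e-10):
    """Write a symmetric nonnegative definite matrix as sum_k lam_k d_k d_k'.

    Pivot-free LDL'-type elimination; numerically zero pivots are skipped,
    so the number of terms returned is the numerical rank.
    """
    n = len(delta)
    a = [row[:] for row in delta]  # working copy
    tol = rel_tol * max(delta[i][i] for i in range(n))
    lams, vecs = [], []
    for k in range(n):
        pivot = a[k][k]
        if pivot <= tol:
            continue
        d = [a[i][k] / pivot for i in range(n)]
        lams.append(pivot)
        vecs.append(d)
        for i in range(n):
            for j in range(n):
                a[i][j] -= pivot * d[i] * d[j]
    return lams, vecs


def replicate_weights(w, lams, vecs, c):
    """w^(k) = w + sqrt(lam_k / c_k) * d_k, one list per replicate."""
    return [[wi + math.sqrt(lam / ck) * di for wi, di in zip(w, d)]
            for lam, d, ck in zip(lams, vecs, c)]


def replication_variance(y, w, reps, c):
    """v_R in (2.4)."""
    t_hat = sum(wi * yi for wi, yi in zip(w, y))
    return sum(ck * (sum(rw * yi for rw, yi in zip(rep, y)) - t_hat) ** 2
               for rep, ck in zip(reps, c))


# Toy data (assumed): simple random sampling without replacement.
N, n = 20, 5
y = [3.0, 5.0, 2.0, 8.0, 7.0]
pi = n / N
joint = n * (n - 1) / (N * (N - 1))
w = [1.0 / pi] * n
delta = [[((pi if i == j else joint) - pi * pi) / ((pi if i == j else joint) * pi * pi)
          for j in range(n)] for i in range(n)]

lams, vecs = rank_one_terms(delta)
c = [1.0] * len(lams)
reps = replicate_weights(w, lams, vecs, c)

v = sum(delta[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
v_rep = replication_variance(y, w, reps, c)
print(len(lams), v, v_rep)
```

In this example the elimination returns $n - 1 = 4$ terms, and the replication variance estimate matches $y^{'} Δ y$ up to rounding error.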

The choice of the factors $c_{k}$ is arbitrary and has no impact on the validity or efficiency of the replication variance estimator. However, certain choices of $c_{k}$ will result in replication weights with negative values, which is undesirable because it may produce negative replicate estimates for parameters that are always positive. In practical situations one can always choose a relatively large $c_{k}$ to avoid negative replication weights. In our simulation study (Case I) reported in Section 5, the problem of negative replication weights can be eliminated with the choice $c_{k} = 1$.
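To make the role of $c_{k}$ concrete: requiring $w_{i} + {(λ_{k} / c_{k})}^{1 / 2} δ_{i k} \geq 0$ for every unit $i$, where $δ_{i k}$ denotes the $i$-th entry of $δ_{k}$, gives the lower bound $c_{k} \geq λ_{k} \max_{i : δ_{i k} < 0} {(δ_{i k} / w_{i})}^{2}$. This bound follows directly from the weight formula in Theorem 1 and is not a rule stated by the authors. The sketch below computes it; the function name `smallest_factor` and the numbers are hypothetical.

```python
import math


def smallest_factor(w, lam, delta):
    """Smallest c_k for which w + sqrt(lam / c_k) * delta has no negative entry.

    Returns 0.0 when delta has no negative entry, meaning any c_k > 0 works.
    """
    ratios = [(d / wi) ** 2 for wi, d in zip(w, delta) if d < 0]
    return lam * max(ratios) if ratios else 0.0


def replicate(w, lam, delta, c):
    """One set of replication weights, w + sqrt(lam / c) * delta."""
    step = math.sqrt(lam / c)
    return [wi + step * d for wi, d in zip(w, delta)]


# Hypothetical inputs for a single replicate k.
w = [4.0, 4.0, 4.0]
lam = 100.0
delta = [1.0, -0.5, -0.5]

c_min = smallest_factor(w, lam, delta)
too_small = replicate(w, lam, delta, 1.0)  # c_k below the bound
large_enough = replicate(w, lam, delta, 4.0)  # c_k above the bound
print(c_min, min(too_small), min(large_enough))
```

Since $c_{k} {\{{(λ_{k} / c_{k})}^{1 / 2} {δ^{'}}_{k} y\}}^{2} = λ_{k} {({δ^{'}}_{k} y)}^{2}$, raising $c_{k}$ in this way leaves $v_{R}$ unchanged.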

The replication variance estimator $v_{R} = \sum_{k = 1}^{L} c_{k} {({\hat{t}}_{y}^{(k)} - {\hat{t}}_{y})}^{2}$ with $L = p$ and replication weights $w^{(k)} = w + {(λ_{k} / c_{k})}^{1 / 2} δ_{k}$ is fully efficient for an arbitrary variable $y$. It also provides a fully efficient variance estimator for $\hat{θ}$ when $θ$ is a smooth function of population means or totals. Practical implementation of the method depends crucially on two related issues: (i) the feasibility of the decomposition of the $n \times n$ matrix $Δ$ specified in (2.3); and (ii) the number of sets of replication weights required to achieve full efficiency, which is determined by $p = \mathrm{rank} (Δ)$.

As for the first issue, modern advances in computational power and the improved performance of available software packages make it possible to carry out the spectral decomposition for relatively large $n$. For instance, on a 12-CPU Unix machine with 96 gigabytes of memory, the R function eigen() can handle matrices of sizes at least as large as $n = 4{,}000$. Note that this computational task falls on survey runners at the data preparation stage and not on users of the data files. As for the second issue, the value of $p$ is related to the given sampling design. For simple random sampling without replacement, we have

$Δ = N^{2} (1 - n / N) {(n (n - 1))}^{- 1} (I_{n} - 1_{n} {1^{'}}_{n} / n),$

where $I_{n}$ is the $n \times n$ identity matrix and $1_{n} = (1, 1, \dots, 1)^{'}$ is the $n \times 1$ vector of ones. It follows that $p = \mathrm{rank} (Δ) = \mathrm{trace} (I_{n} - 1_{n} {1^{'}}_{n} / n) = n - 1$. This is typically the case for single-stage unequal probability sampling designs. For stratified simple random sampling, we have $p = n - H$, where $H$ is the total number of strata.
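The rank statement rests on the fact that $I_{n} - 1_{n} {1^{'}}_{n} / n$ is a symmetric idempotent (projection) matrix, whose rank equals its trace. A small numerical check, using assumed toy values $N = 20$, $n = 5$ and an arbitrary data vector, might look as follows.

```python
# Toy values (assumed) for simple random sampling without replacement.
N, n = 20, 5
y = [3.0, 5.0, 2.0, 8.0, 7.0]

# Centering matrix I_n - 1_n 1_n' / n and the closed-form Delta.
proj = [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)] for i in range(n)]
scale = N ** 2 * (1 - n / N) / (n * (n - 1))
delta = [[scale * proj[i][j] for j in range(n)] for i in range(n)]

# A projection matrix satisfies P P = P, so its rank equals its trace.
proj_sq = [[sum(proj[i][k] * proj[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
max_gap = max(abs(proj_sq[i][j] - proj[i][j]) for i in range(n) for j in range(n))
trace = sum(proj[i][i] for i in range(n))

# The quadratic form y' Delta y against N^2 (1 - n/N) s^2 / n.
v = sum(delta[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
mean = sum(y) / n
s2 = sum((yi - mean) ** 2 for yi in y) / (n - 1)
v_textbook = N ** 2 * (1 - n / N) * s2 / n
print(max_gap, trace, v, v_textbook)
```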

It should be noted that $p \leq n$ for any sampling design and that the exact value of $p$ is not required for the proposed procedure to be implemented. However, since $p$ and $n$ have the same order of magnitude, the proposed method requires a large number of replicates whenever $n$ is large. Under current practices in sample surveys, the fully efficient replication weights described above become immediately implementable if $p \leq 500$ and the second-order inclusion probabilities $π_{i j}$ are available. When $p$ is large, a two-stage procedure to be described in Section 3 can be used to produce a small number $L_{0}$ of sets of replication weights for public-use data files.

In some cases, the spectral decomposition (2.3) can be avoided. For example, Deville (1999) argued that the variance estimator of ${\hat{t}}_{y}$ under unequal probability sampling designs with fixed sample size can be approximated by

$v ≐ c \sum_{i \in S} (1 - π_{i}) {\left( \frac{y_{i}}{π_{i}} - {\tilde{t}}_{y} \right)}^{2},$ (2.5)

where $c = {(1 - \sum_{i \in S} b_{i}^{2})}^{- 1}$, $b_{i} = (1 - π_{i}) / \sum_{k \in S} (1 - π_{k})$ and ${\tilde{t}}_{y} = \sum_{i \in S} b_{i} (y_{i} / π_{i})$. More generally, we consider variance estimators of the form $v = y^{'} Δ y$ with

$Δ = Δ_{0} - Δ_{0} X {(X^{'} Δ_{0} X)}^{- 1} X^{'} Δ_{0}$ (2.6)

where $Δ_{0} = \mathrm{diag} \{λ_{1}, \dots, λ_{n}\}$ with $λ_{i} > 0$ for all $i = 1, 2, \dots, n$, $X^{'} = (x_{1}, \dots, x_{n})$ and $x_{i}$ is a vector of design and auxiliary variables. Many elementary single-stage sampling designs take the form (2.6) for variance estimation. In particular, Deville's formula in (2.5) can be expressed as $v ≐ y^{'} Δ y$ with $Δ$ given by (2.6), where $λ_{i} = c π_{i}^{- 2} (1 - π_{i})$ in $Δ_{0}$ and $x_{i} = π_{i}$. The conditional Poisson sampling design to be discussed in Section 5 also takes the form (2.6), where the $x_{i}$ are the design variables in the design constraint $\sum_{i \in S} π_{i}^{- 1} x_{i} = \sum_{i = 1}^{N} x_{i}$.
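The equivalence between Deville's formula (2.5) and the matrix form (2.6) can be checked numerically. The sketch below assumes a scalar $x_{i} = π_{i}$, so that $X^{'} Δ_{0} X$ is a single number; the function names and the inclusion probabilities are hypothetical and serve only as an illustration.

```python
def deville_constants(pi):
    """Return (c, b) as defined after (2.5)."""
    total = sum(1 - p for p in pi)
    b = [(1 - p) / total for p in pi]
    c = 1.0 / (1.0 - sum(bi ** 2 for bi in b))
    return c, b


def deville_direct(y, pi):
    """Deville's approximation (2.5) evaluated term by term."""
    c, b = deville_constants(pi)
    t_tilde = sum(bi * yi / p for bi, yi, p in zip(b, y, pi))
    return c * sum((1 - p) * (yi / p - t_tilde) ** 2 for yi, p in zip(y, pi))


def deville_matrix_form(y, pi):
    """y' Delta y with Delta from (2.6), lambda_i = c (1 - pi_i) / pi_i^2, x_i = pi_i."""
    n = len(y)
    c, _ = deville_constants(pi)
    lam = [c * (1 - p) / p ** 2 for p in pi]
    x = pi
    sxx = sum(l * xi * xi for l, xi in zip(lam, x))  # X' Delta_0 X (scalar here)
    delta = [[(lam[i] if i == j else 0.0) - lam[i] * x[i] * x[j] * lam[j] / sxx
              for j in range(n)] for i in range(n)]
    return sum(delta[i][j] * y[i] * y[j] for i in range(n) for j in range(n))


# Hypothetical sample values and first-order inclusion probabilities.
y = [3.0, 5.0, 2.0, 8.0, 7.0]
pi = [0.2, 0.4, 0.5, 0.3, 0.6]

v_direct = deville_direct(y, pi)
v_matrix = deville_matrix_form(y, pi)
print(v_direct, v_matrix)
```

The two values should agree up to rounding error.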

For the matrix given by (2.6), it can be shown that

$y^{'} Δ y = (y - X \hat{β})^{'} Δ_{0} (y - X \hat{β}),$

where $\hat{β} = {(X^{'} Δ_{0} X)}^{- 1} X^{'} Δ_{0} y .$ Thus, we have

$y^{'} Δ y = \sum_{k = 1}^{n} λ_{k} {(y_{k} - {x^{'}}_{k} \hat{β})}^{2},$ (2.7)

which is useful in deriving an expression for replication variance estimator in the form given by (2.4). The fully efficient variance estimator $v$ in (2.7) and the replication variance estimator $v_{R}$ in (2.4) are algebraically identical if we let $L = n$ and $w^{(k)} = w + {(λ_{k} / c_{k})}^{1 / 2} δ_{k},$ where $w = (w_{1}, \dots, w_{n})^{'}$ is the set of original basic design weights and $δ_{k} = (δ_{1 k}, \dots, δ_{n k})^{'}$ with

$δ_{i k} = \begin{cases} - 1 + {x^{'}}_{k} {(X^{'} Δ_{0} X)}^{- 1} x_{i} λ_{i} & \text{if } i = k, \\ {x^{'}}_{k} {(X^{'} Δ_{0} X)}^{- 1} x_{i} λ_{i} & \text{otherwise.} \end{cases}$

The proof follows directly from the fact that ${δ^{'}}_{k} y = - y_{k} + {x^{'}}_{k} \hat{β}, y^{'} Δ y = \sum_{k = 1}^{n} λ_{k} {({δ^{'}}_{k} y)}^{2}$ and that ${\hat{t}}_{y}^{(k)} - {\hat{t}}_{y} = (w^{(k)} - w)^{'} y = {(λ_{k} / c_{k})}^{1 / 2} {δ^{'}}_{k} y .$
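A numerical check of this construction, under simplifying assumptions, is sketched below. It takes $x_{i}$ to be scalar (so no matrix inversion is needed), uses Deville's choice $λ_{i} = c π_{i}^{- 2} (1 - π_{i})$ and $x_{i} = π_{i}$, and sets $c_{k} = 1$ for the replication factors; the function name and the data are hypothetical.

```python
import math


def residual_form_replicates(w, lam, x, c):
    """Replicate weights w^(k) = w + sqrt(lam_k / c_k) * delta_k for scalar x_i,
    with delta_ik = x_k x_i lam_i / (X' Delta_0 X) - 1{i = k}."""
    n = len(w)
    sxx = sum(l * xi * xi for l, xi in zip(lam, x))  # X' Delta_0 X (scalar here)
    reps = []
    for k in range(n):
        delta_k = [x[k] * x[i] * lam[i] / sxx - (1.0 if i == k else 0.0)
                   for i in range(n)]
        step = math.sqrt(lam[k] / c[k])
        reps.append([w[i] + step * delta_k[i] for i in range(n)])
    return reps


# Hypothetical sample values and first-order inclusion probabilities.
y = [3.0, 5.0, 2.0, 8.0, 7.0]
pi = [0.2, 0.4, 0.5, 0.3, 0.6]
n = len(y)
w = [1.0 / p for p in pi]

# Deville's lambda_i and x_i; dev_c is the constant c in (2.5).
total = sum(1 - p for p in pi)
dev_c = 1.0 / (1.0 - sum(((1 - p) / total) ** 2 for p in pi))
lam = [dev_c * (1 - p) / p ** 2 for p in pi]
x = pi

# Right-hand side of (2.7).
beta = (sum(l * xi * yi for l, xi, yi in zip(lam, x, y))
        / sum(l * xi * xi for l, xi in zip(lam, x)))
v_27 = sum(l * (yi - xi * beta) ** 2 for l, xi, yi in zip(lam, x, y))

# Replication variance estimator (2.4) with L = n and c_k = 1.
c = [1.0] * n
reps = residual_form_replicates(w, lam, x, c)
t_hat = sum(wi * yi for wi, yi in zip(w, y))
v_rep = sum(ck * (sum(rw * yi for rw, yi in zip(rep, y)) - t_hat) ** 2
            for rep, ck in zip(reps, c))
print(v_27, v_rep)
```

With $c_{k} = 1$ some replication weights may be negative in this toy example; that does not affect the algebraic identity, and a larger $c_{k}$ can be used if nonnegative weights are wanted.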


Date modified: 2017-09-20
