Survey Methodology

3 Methodology

Iván A. Carrillo and Alan F. Karr

3.1 Motivation

Assume that (in a non-survey context) interest lies in the $p \times 1$ vector parameter $β$ in the following model:

$ξ : \left\{ \begin{array}{ll} E [Y_{i j} | X_{i j}] = μ_{i j} = g^{- 1} ({X^{'}}_{i j} β), & j = 1,2, \dots, J, \; i = 1,2, \dots \\ Var [Y_{i j} | X_{i j}] = ϕ ν (μ_{i j}), & j = 1,2, \dots, J, \; i = 1,2, \dots \\ Cov [Y_{i} | X_{i}] = Σ_{i}, & i = 1,2, \dots \\ Y_{k} ⊥ Y_{l} | X_{k}, X_{l}, & k \neq l = 1,2, \dots; \end{array} \right.$ (3.1)

where $Y_{i j}$ is the response variable for subject $i$ at wave $j,$ $X_{i j}$ is a $p \times 1$ vector of covariates, $Y_{i} = {(Y_{i 1}, Y_{i 2}, \dots, Y_{i J})}^{'},$ $X_{i} = (X_{i 1}, X_{i 2}, \dots, X_{i J})$ is a $p \times J$ matrix; $g (\cdot)$ is a monotonic one-to-one differentiable "link function"; $ν (\cdot)$ is the "variance function" with known form; and $ϕ > 0$ is the "dispersion parameter." Since, in general, the $J \times J$ covariance matrix $Σ_{i}$ is hard to specify, we model it as $Cov [Y_{i} | X_{i}] = V_{i} = A_{i}^{1 / 2} R (α) A_{i}^{1 / 2},$ a "working" covariance matrix; where $A_{i} = diag [ϕ ν (μ_{i 1}), ϕ ν (μ_{i 2}), \dots, ϕ ν (μ_{i J})]$ and $R (α)$ is a "working" correlation matrix, both of dimension $J \times J,$ and $α$ is a vector that fully characterizes $R (α)$ (see Liang and Zeger 1986).
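As an illustration of the working covariance $V_{i} = A_{i}^{1 / 2} R (α) A_{i}^{1 / 2},$ the following minimal sketch builds $V_{i}$ for one subject. It assumes an exchangeable working correlation (a single scalar $α$ off the diagonal) and, by default, the Bernoulli variance function $ν (μ) = μ (1 - μ)$; neither choice comes from the text, and the function name `working_covariance` is hypothetical.

```python
import numpy as np

def working_covariance(mu, phi, alpha, variance_fn=lambda m: m * (1.0 - m)):
    """Return V_i = A_i^{1/2} R(alpha) A_i^{1/2} for one subject.

    Assumptions (not from the source): exchangeable R(alpha) and, by
    default, the Bernoulli variance function nu(mu) = mu * (1 - mu).
    """
    mu = np.asarray(mu, dtype=float)
    J = mu.size
    # Exchangeable working correlation: 1 on the diagonal, alpha elsewhere.
    R = np.full((J, J), alpha)
    np.fill_diagonal(R, 1.0)
    # Square roots of the diagonal of A_i = diag[phi * nu(mu_ij)].
    a_half = np.sqrt(phi * variance_fn(mu))
    # Equivalent to diag(a_half) @ R @ diag(a_half).
    return a_half[:, None] * R * a_half[None, :]

V = working_covariance(mu=[0.2, 0.5, 0.8], phi=1.0, alpha=0.3)
```

The diagonal of `V` reproduces $ϕ ν (μ_{i j}),$ and the off-diagonal entries scale $α$ by the two corresponding standard deviations.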

To estimate $β$ we select a (single-cohort) sample of $n$ elements from model $ξ$ and we (intend to) measure each of them at $J$ occasions. If all the elements in the sample respond at every single occasion $j,$ the task can be completed with the usual generalized estimating equation (GEE) methodology of Liang and Zeger (1986). However, in any study it is rarely the case that all subjects do respond at all waves. It is more common to have some elements in the sample who drop out of the study.

Under this situation, and assuming that the missing responses can be regarded as missing at random or MAR (see Rubin 1976), in particular that the dropout at a given wave does not depend on the current (unobserved) value, Robins, Rotnitzky and Zhao (1995) proposed to estimate $β$ by solving the estimating equations: $\sum_{i = 1}^{n} (\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1} {\hat{Δ}}_{i} (y_{i} - μ_{i}) = 0,$ where $μ_{i} = {(μ_{i 1}, μ_{i 2}, \dots, μ_{i J})}^{'},$ ${\hat{Δ}}_{i} = diag [R_{i 1} {\hat{q}}_{i 1}^{- 1}, R_{i 2} {\hat{q}}_{i 2}^{- 1}, \dots, R_{i J} {\hat{q}}_{i J}^{- 1}], R_{i j}$ is the response indicator for subject $i$ at wave $j,$ and ${\hat{q}}_{i j}$ is an estimate of the probability that subject $i$ is observed through wave $j .$

For survey applications, one would use the estimating equation $\sum_{i \in s} [w_{i} (\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1} {\hat{Δ}}_{i} (y_{i} - μ_{i})] = 0,$ where $w_{i}$ is the survey weight for subject $i .$ Another way of writing this equation is $\sum_{i \in s} (\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1} {\hat{Δ}}_{w i} (y_{i} - μ_{i}) = 0,$ with ${\hat{Δ}}_{w i} = diag [w_{i} R_{i 1} {\hat{q}}_{i 1}^{- 1}, w_{i} R_{i 2} {\hat{q}}_{i 2}^{- 1}, \dots, w_{i} R_{i J} {\hat{q}}_{i J}^{- 1}] .$

We notice that the diagonal elements of ${\hat{Δ}}_{w i}$ are simply wave-specific nonresponse-adjusted survey weights whenever the subject is observed, and are equal to zero whenever the subject is missing. This feature in and of itself suggests a solution to the multi-cohort problem, which will be presented in the next section.
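To make this observation concrete, the short sketch below computes the diagonal of ${\hat{Δ}}_{w i}$ for a single subject who drops out after wave 2. The function name and the numerical values are illustrative assumptions, not taken from any survey.

```python
import numpy as np

def adjusted_weight_diagonal(w, responded, q_hat):
    """Diagonal of Delta_hat_{wi}: w_i * R_ij / q_hat_ij for j = 1..J.

    w         : survey weight of subject i (scalar)
    responded : 0/1 response indicators R_ij
    q_hat     : estimated probabilities of being observed through wave j
    """
    responded = np.asarray(responded, dtype=float)
    q_hat = np.asarray(q_hat, dtype=float)
    return w * responded / q_hat

# Hypothetical subject: observed at waves 1 and 2, missing at wave 3.
diag = adjusted_weight_diagonal(w=50.0, responded=[1, 1, 0], q_hat=[1.0, 0.8, 0.5])
```

Here the entries are 50, 62.5, and 0: a nonresponse-adjusted weight at each observed wave and zero at the missing wave.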

3.2 A novel approach to combining cohorts in longitudinal surveys

Based on the discussion in the previous section, if we have a fixed-panel, fixed-panel-plus-'births', repeated-panel, rotating-panel, split-panel, or refreshment sample survey, we propose to estimate the superpopulation parameter $β$ in model $ξ$ by the solution to the estimating equations:

$Ψ_{s} (β) = \sum_{i \in s} \frac{\partial {μ^{'}}_{i}}{\partial β} V_{i}^{- 1} W_{i} (y_{i} - μ_{i}) = 0;$ (3.2)

where the sum is over the sample $s,$ i.e., over all the elements selected (for the first time) in any of the samples $s_{1 (1)}, s_{2 (2)}, \dots, s_{J (J)} .$ The diagonal matrix $W_{i}$ is $W_{i} = diag [I_{i} (U_{1}) w_{i 1}, I_{i} (U_{2}) w_{i 2}, \dots, I_{i} (U_{J}) w_{i J}],$ with $w_{i j}$ being the (nonresponse-adjusted) cross-sectional weight for subject $i$ at wave $j$ (as long as subject $i$ is part of sample $s_{j}$ ) and $I_{i} (U_{j})$ is the indicator of whether subject $i$ belongs to finite population $U_{j}$ or not. In Section 3.2.1 we argue why this is a reasonable estimation procedure, and in Section 3.2.2 we discuss the missing value issue.
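For intuition, equation (3.2) has a closed-form solution in the special case of an identity link, where $\partial {μ^{'}}_{i} / \partial β = X_{i} .$ The sketch below solves it under that assumption, and additionally assumes a common $V_{i}^{- 1}$ for all subjects; the function name, the array layout, and the simulated data are hypothetical. A zero on the diagonal of $W_{i}$ simply removes that wave's contribution for that subject.

```python
import numpy as np

def solve_weighted_gee_identity(X, y, W_diag, V_inv):
    """Solve sum_i X_i V^{-1} W_i (y_i - X_i' beta) = 0 for beta.

    X      : (n, p, J) array, X[i] is the p x J covariate matrix X_i
    y      : (n, J) responses
    W_diag : (n, J) diagonals of W_i (zero where subject i is not in s_j)
    V_inv  : (J, J) inverse working covariance, assumed common to all i
    """
    p = X.shape[1]
    lhs = np.zeros((p, p))
    rhs = np.zeros(p)
    for X_i, y_i, w_i in zip(X, y, W_diag):
        B_W = X_i @ V_inv @ np.diag(w_i)   # p x J matrix B_i W_i
        lhs += B_W @ X_i.T
        rhs += B_W @ y_i
    return np.linalg.solve(lhs, rhs)

rng = np.random.default_rng(0)
n, p, J = 40, 2, 3
X = rng.normal(size=(n, p, J))
beta_true = np.array([1.0, -2.0])
y = np.einsum("ipj,p->ij", X, beta_true)   # noise-free responses X_i' beta
W_diag = rng.uniform(1.0, 5.0, size=(n, J))
W_diag[::4, 0] = 0.0                       # some subjects absent at wave 1
beta_hat = solve_weighted_gee_identity(X, y, W_diag, np.eye(J))
```

Because the simulated responses are noise-free, `beta_hat` recovers `beta_true` up to rounding. For a non-identity link the equation would instead be solved iteratively, which this sketch does not attempt.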

The cross-sectional weights $w_{i j},$ in $W_{i},$ are such that the sample $s_{j}$ represents $U_{j},$ when used in conjunction with said weights. This means that, for each observation $i$ in sample $s_{j},$ there has to be a survey weight $w_{i j},$ which could be regarded as the number of units that such observation represents in $U_{j} .$ However, remember that the sample $s_{j}$ is composed of different sets of subjects, or different subsamples (the different cohorts), and the integration of these subsamples into a single cross-sectional weight variable $w_{i j}$ may not be a straightforward task.

For the SDR, the construction of the cross-sectional weight for wave $j$ is not too complicated as the different cohorts are selected independently, from non-overlapping populations. The base weight in that case is easy to compute, and all that remains is the adjustment for things like attrition and calibration to known totals in the population $U_{j} .$

On the other hand, in other situations, for example, when a frame of new members does not exist, the new cohort may need to be selected from the overall population at the given wave, or from a frame containing new members plus some old members, or from multiple frames. In such cases, the building of the cross-sectional weights may not be as straightforward, and the theory of multiple frames may need to be used. We refer the reader to the works of Lohr (2007) and Rao and Wu (2010), and references therein, for cases like that.

Expression (3.2) is a generalization of equation (2.25) in Vieira (2009). The latter is applicable only when all the subjects have the same number of observations or any missing responses can be regarded as missing completely at random or MCAR (see Rubin 1976). As discussed in Robins, et al. (1995), using such an equation when the missing responses are not MCAR produces inconsistent estimators; therefore, with a rotation scheme like that of the SDR, where not all subjects are dropped (or kept) with the same probabilities, its usage would not be appropriate. The adequacy of equation (3.2) in that case and when there are missing responses is addressed in sections 3.2.1 and 3.2.2, respectively. If all subjects have cross-sectional weights that do not vary over time (or have a single longitudinal weight) equation (3.2) reduces to equation (2.25) in Vieira (2009).

3.2.1 Unbiasedness

The unbiasedness property of the estimating function is important because, as Song (2007, Section 5.4) argues, it is the most crucial assumption in order to obtain a consistent estimator.

Let us define $β_{N},$ the so-called "census estimator," to be the solution to the following finite population estimating equation:

$Ψ_{U} (β_{N}) = \sum_{i \in U} \frac{\partial {μ^{'}}_{i}}{\partial β_{N}} V_{i}^{- 1} I_{i} (U) (y_{i} - μ_{i} (β_{N})) = 0,$ (3.3)

where the sum is over $U,$ i.e., over all the elements who became members of the target population in any of $U_{1 (1)}, U_{2 (2)}, \dots, U_{J (J)},$ and $I_{i} (U) = diag [I_{i} (U_{1}), I_{i} (U_{2}), \dots, I_{i} (U_{J})] .$ In order to show design-unbiasedness of the estimating function $Ψ_{s} (β),$ we need to show that its design expectation is $Ψ_{U} (β)$ for any $β .$

The sampling design characteristics of a longitudinal survey can be thought of as those of a multiphase sample, as can be seen in Särndal, Swensson and Wretman (1992, Section 9.9). We therefore use the methodology of multiphase sampling for the derivations. We assume, without loss of generality, that there are only three waves; the derivations with just three waves show the patterns for general $J,$ with respect to unbiasedness and variance.

As we mentioned earlier, we assume that $w_{i j}$ is the cross-sectional weight for subject $i$ at wave $j,$ if that subject belongs to $s_{j},$ and zero otherwise. From the theory of multiphase sampling we have that for $i \in s_{1 (1)}, w_{i 1} = π_{i 1}^{- 1}, w_{i 2} = π_{i 1}^{- 1} π_{i 2 | s_{1 (1)}}^{- 1},$ and $w_{i 3} = π_{i 1}^{- 1} π_{i 2 | s_{1 (1)}}^{- 1} π_{i 3 | s_{2 (1)}}^{- 1};$ for $i \in s_{2 (2)}, w_{i 2} = π_{i 2}^{- 1}$ and $w_{i 3} = π_{i 2}^{- 1} π_{i 3 | s_{2 (2)}}^{- 1};$ and for $i \in s_{3 (3)}, w_{i 3} = π_{i 3}^{- 1};$ where $π_{i j}$ is the inclusion probability of subject $i$ in sample $s_{j (j)}$ and $π_{i j | s_{j - 1 (j^{'})}}$ is the conditional inclusion probability of subject $i$ in sample $s_{j (j^{'})}$ given $s_{j - 1 (j^{'})} .$
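The weight recursion above can be sketched in a few lines. The function below is a hypothetical helper that assumes the subject is retained at every wave after entry (otherwise the weight is zero from the wave of exclusion onward, which the sketch does not handle); the probabilities in the examples are made up.

```python
import numpy as np

def cross_sectional_weights(entry_wave, pi_entry, pi_cond, J=3):
    """Weights w_i1..w_iJ for a subject first selected at `entry_wave`.

    entry_wave : 1-based wave at which the subject's cohort is selected
    pi_entry   : inclusion probability pi_ij in the entry sample s_{j(j)}
    pi_cond    : conditional inclusion probabilities for the waves after
                 entry, in order (length J - entry_wave)
    Assumes the subject is retained at every wave after entry.
    """
    w = np.zeros(J)
    w[entry_wave - 1] = 1.0 / pi_entry
    # Each later weight divides the previous one by the conditional probability.
    for k, pi in enumerate(pi_cond, start=entry_wave):
        w[k] = w[k - 1] / pi
    return w

w_cohort1 = cross_sectional_weights(1, 0.10, [0.5, 0.8])   # i in s_1(1)
w_cohort2 = cross_sectional_weights(2, 0.20, [0.5])        # i in s_2(2)
w_cohort3 = cross_sectional_weights(3, 0.25, [])           # i in s_3(3)
```

With these inputs the three cohorts get weights (10, 20, 25), (0, 5, 10), and (0, 0, 4), matching the products of inverse probabilities given in the text.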

Using $E_{p} (\cdot)$ to denote the expectation with respect to the sampling design, we have:

$E_{p} [\sum_{i \in s} \frac{\partial {μ^{'}}_{i}}{\partial β} V_{i}^{- 1} W_{i} (y_{i} - μ_{i})] = E_{p} [\sum_{j = 1}^{3} \sum_{i \in s_{j (j)}} B_{i} W_{i} e_{i}];$ (3.4)

where $B_{i} = (\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1}$ and $e_{i} = y_{i} - μ_{i} .$ For example, for $\sum_{i \in s_{2 (2)}} B_{i} W_{i} e_{i}$ we obtain:

$\begin{matrix} E_{p} [\sum_{i \in s_{2 (2)}} B_{i} W_{i} e_{i}] = E \{ E [\sum_{i \in U_{2 (2)}} B_{i} D_{i} e_{i} | s_{2 (2)}] \} = E \{ \sum_{i \in U_{2 (2)}} B_{i} D_{i}^{*} e_{i} \} \\ = \sum_{i \in U_{2 (2)}} B_{i} D_{i}^{* *} e_{i} \overset{def}{=} \sum_{i \in U_{2 (2)}} B_{i} I_{i} (U) e_{i}, \end{matrix}$ (3.5)

where $D_{i} = diag [0, I_{i} (U_{2}) w_{i 2} I_{i} (s_{2 (2)}), I_{i} (U_{3}) w_{i 3} I_{i} (s_{3 (2)}) I_{i} (s_{2 (2)})],$
$D_{i}^{*} = diag [0, (I_{i} (U_{2}) w_{i 2} \times I_{i} (s_{2 (2)})), (I_{i} (U_{3}) π_{i 3 | s_{2 (2)}} I_{i} (s_{2 (2)})) / (π_{i 2} π_{i 3 | s_{2 (2)}})],$ and $D_{i}^{* *} = diag [0, (I_{i} (U_{2}) π_{i 2}) / π_{i 2}, (I_{i} (U_{3}) π_{i 2}) / π_{i 2}];$ similarly we can show that $E_{p} [\sum_{i \in s_{1 (1)}} B_{i} W_{i} e_{i}] = \sum_{i \in U_{1 (1)}} B_{i} I_{i} (U) e_{i}$ and $E_{p} [\sum_{i \in s_{3 (3)}} B_{i} W_{i} e_{i}] = \sum_{i \in U_{3 (3)}} B_{i} I_{i} (U) e_{i} .$ From these expressions and equation (3.4) we conclude that $E_{p} [Ψ_{s} (β)] = Ψ_{U} (β)$ for any $β,$ which means that the estimating function $Ψ_{s} (β)$ is design-unbiased for the finite population estimating function.
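The design-unbiasedness argument for the second cohort can be checked numerically. The following Monte Carlo sketch assumes Poisson sampling at both phases (independent Bernoulli draws), which is only one possible design and is not specified in the text; the scalar contributions `z2` and `z3` stand in for the wave 2 and wave 3 components of $B_{i} e_{i}$ and are simulated.

```python
import numpy as np

rng = np.random.default_rng(12345)
N, reps = 50, 20000

# Simulated wave-2 and wave-3 contributions for units in U_2(2).
z2 = rng.uniform(1.0, 2.0, size=N)
z3 = rng.uniform(1.0, 2.0, size=N)

# Assumed inclusion probabilities: pi_i2 and pi_{i3 | s_2(2)}.
pi2 = rng.uniform(0.3, 0.7, size=N)
pi3_given_2 = rng.uniform(0.6, 0.9, size=N)

# Phase 1 selects s_2(2); phase 2 retains a subsample s_3(2) of it.
in_s2 = rng.random((reps, N)) < pi2
in_s3 = in_s2 & (rng.random((reps, N)) < pi3_given_2)

# Multiphase weights w_i2 and w_i3 for cohort 2.
w2 = 1.0 / pi2
w3 = 1.0 / (pi2 * pi3_given_2)

# Average of the weighted sample sums over repeated samples.
mean_est2 = (in_s2 * w2 * z2).sum(axis=1).mean()
mean_est3 = (in_s3 * w3 * z3).sum(axis=1).mean()
```

Under these assumptions `mean_est2` and `mean_est3` should be close to the population totals `z2.sum()` and `z3.sum()`, in line with (3.5).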

Furthermore, as the target of inference is the superpopulation parameter, we need to guarantee that the model for $μ_{i j}$ is such that $E_{ξ} (Y_{i j} - μ_{i j}) = 0$ is satisfied, where $E_{ξ} (\cdot)$ represents the expectation with respect to model $ξ .$ For if this is the case, we have:

$E_{ξ p} [Ψ_{s} (β)] \overset{def}{=} E_{ξ} E_{p} [Ψ_{s} (β)] = E_{ξ} [Ψ_{U} (β)] = \sum_{i \in U} \frac{\partial {μ^{'}}_{i}}{\partial β} V_{i}^{- 1} I_{i} (U) E_{ξ} (y_{i} - μ_{i}) = 0;$

so that the estimating function $Ψ_{s} (β)$ is model-design unbiased. The requirement $E_{ξ} (Y_{i j} - μ_{i j}) = 0$ means that the mean model needs to be correctly specified; consequently, one needs to pay attention to residual diagnostics for the particular model being fitted.

3.2.2 A note on nonresponse

In the SDR, as in any other (longitudinal) survey, there is nonresponse. Some sampled individuals choose not to participate at all, whereas some subjects participate in some waves but not in others. The SDR remedies this situation by making a nonresponse adjustment to the cross-sectional survey weights.

Assume that the nonresponse adjustment at wave $j$ is a multiplication by the inverse of the estimated wave $j$ response probability ${\hat{π}}_{r i j} .$ For example, the nonresponse-adjusted weight for a person who did respond at wave 3 (and was first selected at wave 2), i.e., for $i \in r_{3 (2)},$ would be $w_{r i 3} = π_{i 2}^{- 1} π_{i 3 | s_{2 (2)}}^{- 1} {\hat{π}}_{r i 3}^{- 1} .$
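One simple way to obtain ${\hat{π}}_{r i j}$ is a weighting-class adjustment, in which the response probability is estimated by the weighted response rate within a cell. The sketch below illustrates that approach only as an assumption: the text does not state which response model the SDR uses, and the function name and data are hypothetical. It also assumes every cell contains at least one respondent.

```python
import numpy as np

def weighting_class_propensity(responded, cell, base_weight):
    """Estimate pi_hat_{rij} as the weighted response rate within each cell.

    Assumes every cell has at least one respondent (otherwise the
    estimated propensity is zero and the adjusted weight is undefined).
    """
    responded = np.asarray(responded, dtype=float)
    cell = np.asarray(cell)
    base_weight = np.asarray(base_weight, dtype=float)
    pi_hat = np.empty_like(base_weight)
    for c in np.unique(cell):
        m = cell == c
        pi_hat[m] = (base_weight[m] * responded[m]).sum() / base_weight[m].sum()
    return pi_hat

base_weight = np.array([10.0, 10.0, 10.0, 10.0])   # design weights at wave j
responded = np.array([1, 0, 1, 1])
cell = np.array([0, 0, 1, 1])

pi_hat = weighting_class_propensity(responded, cell, base_weight)
w_r = base_weight * responded / pi_hat             # nonresponse-adjusted weights
```

In this toy example the adjusted weights are (20, 0, 10, 10), so respondents carry the weight of the nonrespondent in their cell and the total weight is preserved.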

We need to redefine the estimating equation, to include only the respondents, as $Ψ_{r} (β) = \sum_{i \in r} (\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1} W_{r i} (y_{i} - μ_{i}) = 0,$ where the sum is over the respondent set $r,$ i.e., over all the elements who belonged for the first time in any of the respondent sets $r_{1 (1)}, r_{2 (2)}, \dots, r_{J (J)},$ and the matrix $W_{r i}$ is $W_{r i} = diag [I_{i} (U_{1}) w_{r i 1}, I_{i} (U_{2}) w_{r i 2}, \dots, I_{i} (U_{J}) w_{r i J}] .$ Also, denote by $r_{j (j^{'})}$ the set of cohort $j^{'}$ respondents at wave $j .$ Obviously, $w_{r i j} = 0$ if $i \notin r_{j} = \cup_{j^{'} = 1}^{j} r_{j (j^{'})} .$

If additionally, the response mechanism $(R)$ can be assumed to be MAR, we then have, for example for $\sum_{i \in r_{2 (2)}} B_{i} W_{r i} e_{i} :$

$E_{R} \{ \sum_{i \in r_{2 (2)}} B_{i} W_{r i} e_{i} \} = E_{R} \{ \sum_{i \in s_{2 (2)}} B_{i} D_{i} e_{i} \} = \sum_{i \in s_{2 (2)}} B_{i} D_{i}^{*} e_{i} = \sum_{i \in s_{2 (2)}} B_{i} D_{i}^{* *} e_{i} \overset{def}{=} \sum_{i \in s_{2 (2)}} B_{i} W_{i} e_{i},$ (3.6)

where $D_{i} = diag [0, I_{i} (U_{2}) w_{r i 2} I_{i} (r_{2 (2)}), I_{i} (U_{3}) w_{r i 3} I_{i} (r_{3 (2)})],$
$D_{i}^{*} = diag [0, (I_{i} (U_{2}) π_{r i 2}) / (π_{i 2} \times {\hat{π}}_{r i 2}), (I_{i} (U_{3}) π_{r i 3}) / (π_{i 2} π_{i 3 | s_{2 (2)}} {\hat{π}}_{r i 3})],$ and $D_{i}^{* *} = diag [0, I_{i} (U_{2}) w_{i 2}, I_{i} (U_{3}) w_{i 3}] .$ The third equality in (3.6) requires that the nonresponse model used for ${\hat{π}}_{r i j}$ satisfies $E_{R} [I_{i} (r_{j (j^{'})})] \overset{def}{=} π_{r i j} = {\hat{π}}_{r i j} .$ This means that in the model for ${\hat{π}}_{r i j}$ we have to include as much available information, thought to influence the nonresponse propensity, as possible, in order for this assumption (i.e., the MAR assumption) to be tenable. For example, if the nonresponse is thought to be independent across waves, one should include, in the model for ${\hat{π}}_{r i j},$ as many variables from the corresponding wave as possible. If, on the other hand, it is reasonable to assume that the response propensity at a given wave depends on previous responses (and possibly response history), then those responses should be included in the response model, and so on.

The design as well as the model-design unbiasedness follow immediately from (3.6) together with the previous section. Hereafter we therefore ignore the issue of nonresponse for notational simplicity.

3.3 Variance and variance estimation

We now develop a (Taylor Series) linearization for the variance of the proposed estimator. The basic technique is due to Binder (1983). For simplicity in the derivations and notation we divide through by $N;$ we redefine

$Ψ_{s} (β) = N^{- 1} \sum_{i \in s} \frac{\partial {μ^{'}}_{i}}{\partial β} V_{i}^{- 1} W_{i} (y_{i} - μ_{i})$ and $Ψ_{U} (β) = N^{- 1} \sum_{i \in U} \frac{\partial {μ^{'}}_{i}}{\partial β} V_{i}^{- 1} I_{i} (U) (y_{i} - μ_{i}),$

where $N = \sum_{j = 1}^{J} N_{j} .$ Let $\hat{β}$ be our estimator, which satisfies $Ψ_{s} (\hat{β}) = 0,$ and let $β_{N}$ be the "census estimator," which satisfies $Ψ_{U} (β_{N}) = 0 .$ Assume $β_{N} - β = O_{P} (1 / \sqrt{N_{m}})$ and $\hat{β} - β_{N} = O_{P} (1 / \sqrt{n_{m}}),$ with $N_{m} = \min \{ N_{1}, N_{2}, \dots, N_{J} \}$ and $n_{m} = \min \{ n_{1}, n_{2}, \dots, n_{J} \} .$ We can write the total error of $\hat{β}$ as $\hat{β} - β = (\hat{β} - β_{N}) + (β_{N} - β) =$ Sampling Error + Model Error. After some straightforward calculations, the total variance, or more precisely the total MSE, can be decomposed as:

$V_{Tot} = E_{ξ p} (\hat{β} - β) {(\hat{β} - β)}^{'} = V_{Sam} + 2 \otimes C_{Sam - Mod} + o (1 / n_{m}),$ (3.7)

where $2 \otimes A = A + A^{'}$ for any matrix $A,$ $V_{Sam} = E_{ξ} V_{p}$ is the "sampling variance" component, $2 \otimes C_{Sam - Mod}$ is the cross "sampling-model variance" component, $V_{p} = E_{p} [(\hat{β} - β_{N}) {(\hat{β} - β_{N})}^{'}],$ $C_{Sam - Mod} = E_{p} C_{ξ},$ and $C_{ξ} = E_{ξ} (\hat{β} - β) {(β_{N} - β)}^{'} .$ Furthermore, by Taylor series expansions we can obtain the following approximations: $\hat{β} - β_{N} = {[H (β_{N})]}^{- 1} Ψ_{s} (β_{N}) + o_{P} (1 / \sqrt{n_{m}}),$ $\hat{β} - β = {[\hat{H} (β)]}^{- 1} Ψ_{s} (β) + o_{P} (1 / \sqrt{n_{m}}),$ and $β_{N} - β = {[H (β)]}^{- 1} Ψ_{U} (β) + o_{P} (1 / \sqrt{N_{m}}),$ where we define $H (β) = N^{- 1} \sum_{i \in U} (\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1} I_{i} (U) (\partial μ_{i} / \partial β)$ and $\hat{H} (β) = N^{- 1} \sum_{i \in s} (\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1} W_{i} (\partial μ_{i} / \partial β) .$

We then get, for $V_{p}$ and $C_{ξ}$ in (3.7),

$V_{p} = {[H (β_{N})]}^{- 1} {Var}_{p} [Ψ_{s} (β_{N})] {[H (β_{N})]}^{- 1} + o_{P} (1 / n_{m}),$ (3.8)

$\begin{matrix} C_{ξ} = {[\hat{H} (β)]}^{- 1} E_{ξ} [Ψ_{s} (β) {Ψ^{'}}_{U} (β)] {[H (β)]}^{- 1} + o_{P} (1 / n_{m}) \\ = N^{- 1} {[\hat{H} (β)]}^{- 1} {\hat{H}}_{Σ V} (β) {[H (β)]}^{- 1} + o_{P} (1 / n_{m}), \end{matrix}$ (3.9)

where ${Var}_{p} [Ψ_{s} (β_{N})] = E_{p} [Ψ_{s} (β_{N}) {Ψ^{'}}_{s} (β_{N})]$ and ${\hat{H}}_{Σ V} (β) = N^{- 1} \sum_{i \in s} [(\partial {μ^{'}}_{i} / \partial β) V_{i}^{- 1} W_{i} Σ_{i} \times V_{i}^{- 1} (\partial μ_{i} / \partial β)];$ the derivation of (3.9) can be found in the Appendix.

In conclusion, so far we have found that:

$\begin{array}{rl} V_{Tot} & = E_{ξ} V_{p} + 2 \otimes E_{p} C_{ξ} + o (1 / n_{m}) \\ & = E_{ξ} \{ {[H (β_{N})]}^{- 1} {Var}_{p} [Ψ_{s} (β_{N})] {[H (β_{N})]}^{- 1} \} \\ & \quad + 2 \otimes N^{- 1} E_{p} \{ {[\hat{H} (β)]}^{- 1} {\hat{H}}_{Σ V} (β) {[H (β)]}^{- 1} \} + o (1 / n_{m}) . \end{array}$ (3.10)

In (3.10) all the terms can be estimated by "plugging in" the estimate $\hat{β},$ except for the term ${Var}_{p} [Ψ_{s} (β_{N})];$ this is the subject of the next section.

If the sampling fraction is small, i.e., $n ≪ N,$ the first term in expression (3.10) is a good approximation for the total variance; i.e., the expression for $V_{Tot}$ is simply $E_{ξ} V_{p}$ (and lower order terms). If, on the other hand, the sampling fraction is large, both terms in (3.10) are required.
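When the first term dominates, the computation reduces to a "sandwich" of the form in (3.8). The sketch below shows that matrix product for given estimates of $H$ and of ${Var}_{p} [Ψ_{s} (β_{N})];$ the function name and the toy inputs are hypothetical. It transposes the second inverse, which coincides with the expression in (3.8) when the matrix is symmetric.

```python
import numpy as np

def sandwich_variance(H_hat, meat):
    """Return H^{-1} * meat * (H^{-1})', the linearization variance in (3.8).

    H_hat : (p, p) estimate of H(beta_N)
    meat  : (p, p) estimate of Var_p[Psi_s(beta_N)]
    """
    H_inv = np.linalg.inv(np.asarray(H_hat, dtype=float))
    return H_inv @ np.asarray(meat, dtype=float) @ H_inv.T

# Toy inputs for illustration only.
V_p_hat = sandwich_variance(H_hat=2.0 * np.eye(2), meat=np.diag([4.0, 8.0]))
```

With these toy inputs the result is the diagonal matrix with entries 1 and 2.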

3.3.1 Design variance of the estimating function

In order to derive an expression for ${Var}_{p} [Ψ_{s} (β_{N})],$ we assume $J = 3,$ as before. The methodology is that of two-phase sampling (more precisely, multiphase sampling), as discussed in chapter 9 of Särndal, et al. (1992). After some derivations (see Appendix), and defining $B_{i} = {(\partial {μ^{'}}_{i} / \partial β) |}_{β = β_{N}} V_{i}^{- 1}, e_{i} = y_{i} - μ_{i} (β_{N}), e_{i (1 \dots 3)} = e_{i}, e_{i (2 \dots 3)} = {(0, e_{i 2}, e_{i 3})}^{'},$ and $e_{i (3 \dots 3)} = {(0,0, e_{i 3})}^{'},$ we obtain:

${Var}_{p} [Ψ_{s} (β_{N})] = \sum_{j = 1}^{3} D_{(j)} = \sum_{j = 1}^{3} \sum_{k = j}^{3} D_{(j) k},$ (3.11)

where $D_{(j)} \overset{def}{=} N^{- 2} {Var}_{p} (\sum_{i \in s_{j (j)}} B_{i} W_{i} e_{i}) = \sum_{k = j}^{3} D_{(j) k},$ for $j = 1,2,3,$

$N^{2} D_{(j) j} \overset{def}{=} Var [\sum_{i \in s_{j (j)}} w_{i j} B_{i} I_{i} (U) e_{i (j \dots 3)}],$ for $j = 1,2,3,$

$N^{2} D_{(j - 1) j} \overset{def}{=} E \{ Var [\sum_{i \in s_{j (j - 1)}} w_{i j} B_{i} I_{i} (U) e_{i (j \dots 3)} | s_{j - 1 (j - 1)}] \},$ for $j = 2,3,$

$N^{2} D_{(1) 3} \overset{def}{=} E \{ E [Var (\sum_{i \in s_{3 (1)}} w_{i 3} B_{i} I_{i} (U) e_{i (3 \dots 3)} | s_{2 (1)}, s_{1 (1)}) | s_{1 (1)}] \},$

and in the Appendix we show that:

$N^{2} D_{(j) k} = Var [\sum_{i \in s_{k (j)}} w_{i k} B_{i} I_{i} (U) e_{i (k \dots 3)}] - Var [\sum_{i \in s_{k - 1 (j)}} w_{i, k - 1} B_{i} I_{i} (U) e_{i (k \dots 3)}],$

for $j = 1,2,3,$ and $3 \geq k > j .$ In general, we have proved the following

Property 3.1 The (design) variance of $Ψ_{s} (β_{N})$ can be decomposed as:

$\begin{array}{l} {Var}_{p} [Ψ_{s} (β_{N})] \\ = \frac{1}{N^{2}} \sum_{j^{'} = 1}^{J} \sum_{j = j^{'}}^{J} \{ {Var}_{p} [\sum_{i \in s_{j (j^{'})}} w_{i j} B_{i} I_{i} (U) e_{i (j \dots J)}] - {Var}_{p} [\sum_{i \in s_{j - 1 (j^{'})}} w_{i, j - 1} B_{i} I_{i} (U) e_{i (j \dots J)}] \} \end{array}$ (3.12)

$= \frac{1}{N^{2}} \sum_{j = 1}^{J} \{ {Var}_{p} [\sum_{i \in s_{j}} w_{i j} B_{i} I_{i} (U) e_{i (j \dots J)}] - {Var}_{p} [\sum_{i \in s_{j - 1}} w_{i, j - 1} B_{i} I_{i} (U) e_{i (j \dots J)}] \},$ (3.13)

where we let $w_{i, j - 1} = 0$ whenever $j = j^{'}, w_{i 0} = 0,$ and to get (3.13) we have changed variables and used the independence among cohorts.

In (3.11), (3.12), and (3.13) we have assumed that the cohorts are design-independent. However, in some cases this assumption may not be tenable; an example of such a case is the multiple frame situation discussed in the first part of Section 3.2. Another instance in which it may not be appropriate to assume cohort independence is when weight adjustments cross cohorts, which is the case of the SDR; we discuss this issue in Section 5. Calculations for the case of three cohorts, in the Appendix, show that (3.13) holds for the variance terms even without independence. The Appendix also identifies conditions under which it is a good approximation for the covariance terms.

3.3.2 Estimation

The estimation of $V_{Tot}$ in (3.10) can be achieved as follows. $H (β_{N}), \hat{H} (β),$ and $H (β)$ can be estimated by $\hat{H} (\hat{β}) .$ ${\hat{H}}_{Σ V} (β)$ can be estimated by ${\hat{H}}_{Σ V} (\hat{β}),$ where $Σ_{i} = Cov [Y_{i} | X_{i}]$ can be estimated by ${\hat{e}}_{i} {\hat{e}}^{'}_{i} .$
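A minimal sketch of these plug-in computations follows. It assumes that the derivative matrices $\partial μ_{i} / \partial β$ and the residuals have already been evaluated at $\hat{β},$ and that a single $V^{- 1}$ is shared by all subjects; the function name and array layout are hypothetical.

```python
import numpy as np

def plug_in_matrices(D, V_inv, W_diag, resid, N):
    """Plug-in estimates of H_hat(beta_hat) and H_hat_{Sigma V}(beta_hat).

    D      : (n, J, p) array, D[i] = d mu_i / d beta' evaluated at beta_hat
    V_inv  : (J, J) inverse working covariance, assumed common to all i
    W_diag : (n, J) diagonals of W_i
    resid  : (n, J) residuals e_hat_i = y_i - mu_i(beta_hat)
    N      : population size used for scaling
    """
    p = D.shape[2]
    H_hat = np.zeros((p, p))
    H_sigma_v = np.zeros((p, p))
    for D_i, w_i, e_i in zip(D, W_diag, resid):
        B_W = D_i.T @ V_inv @ np.diag(w_i)      # (d mu_i'/d beta) V^{-1} W_i
        H_hat += B_W @ D_i
        Sigma_hat = np.outer(e_i, e_i)          # Sigma_i estimated by e_i e_i'
        H_sigma_v += B_W @ Sigma_hat @ V_inv @ D_i
    return H_hat / N, H_sigma_v / N

# Toy example: one subject, two waves, one parameter.
D = np.array([[[1.0], [1.0]]])
H_hat, H_sigma_v = plug_in_matrices(
    D, np.eye(2), W_diag=np.array([[2.0, 3.0]]),
    resid=np.array([[1.0, 1.0]]), N=5.0,
)
```

In the toy example `H_hat` is 1 and `H_sigma_v` is 2, which can be verified by hand from the definitions.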

We use (3.13) in Property 3.1 to estimate ${Var}_{p} [Ψ_{s} (β_{N})] .$ As long as there is a method to estimate the variance of (cross-sectional) Horvitz-Thompson (H-T) estimators, expression (3.13) can be used. If we define $Z_{i j} = B_{i} I_{i} (U) e_{i (j \dots J)},$ we notice that each of the terms involved in the computation of (3.13), terms like ${Var}_{p} [\sum_{i \in s_{j}} w_{i j} Z_{i j}],$ is simply the variance of a wave $j$ H-T estimator. Obviously, the variance estimation method needs to account for the sampling design as well as for any nonresponse and calibration adjustments performed, but this does not present any additional complications beyond what is found in any cross-sectional problem, as everything is implemented cross-sectionally. The SDR uses replication to estimate variances of cross-sectional estimators, but any method of design variance estimation can be used.

We use the cross-sectional replicate weights that SDR provides, but we do not re-estimate the parameter of interest at each replicate. First, note that we require replication only for the estimation of the "meat" $({Var}_{p} [Ψ_{s} (β_{N})])$ of the design variance $(E_{ξ} V_{p}) .$ Secondly, although $\hat{β}$ does appear in the expression for the H-T estimator whose variance needs to be calculated (and re-calculated at each replicate), the work of Roberts, Binder, Kovačević, Pantel and Phillips (2003), who apply the "estimating function bootstrap" (Hu and Kalbfleisch 2000) to survey data, shows that, in a setting like ours, it is not necessary to re-compute the estimator at each replicate; the full-sample estimator suffices. This simplification speeds up the computation of the replicate estimates.

As a way of illustration, say we currently are at wave $j,$ i.e., we are estimating the $j^{th}$ term in (3.13). The $r^{th}$ replicate of the first term is $\sum_{i \in s_{j}} w_{i j}^{(r)} B_{i} (\hat{β}) I_{i} (U) e_{i (j \dots J)} (\hat{β}),$ where $w_{i j}^{(r)}$ is the $r^{th}$ replicate weight for subject $i$ at wave $j,$ and the $r^{th}$ replicate of the second term is $\sum_{i \in s_{j - 1}} w_{i, j - 1}^{(r)} B_{i} (\hat{β}) I_{i} (U) e_{i (j \dots J)} (\hat{β}),$ where $w_{i, j - 1}^{(r)}$ is the $r^{th}$ replicate weight for subject $i$ at wave $j - 1.$
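The replicate computation for one wave can be sketched as follows. The replication variance formula used here, a constant `scale` times the sum of outer products of deviations of the replicate totals from the full-sample total, is an assumption: the appropriate constant and form depend on the replication method behind the replicate weights, which the text does not specify. Function names and the toy numbers are hypothetical, and the factor $1 / N^{2}$ in (3.13) is left out.

```python
import numpy as np

def replicate_variance(full_w, rep_w, Z, scale):
    """Replication variance of the H-T total sum_i w_i Z_i.

    full_w : (n,) full-sample weights for the wave
    rep_w  : (R, n) replicate weights for the wave
    Z      : (n, p) rows Z_ij, computed once at the full-sample beta_hat
    scale  : method-specific constant (an assumption of this sketch)
    """
    full_total = full_w @ Z            # (p,) full-sample H-T total
    rep_totals = rep_w @ Z             # (R, p) one total per replicate
    dev = rep_totals - full_total
    return scale * dev.T @ dev

def wave_term(w_j, rep_w_j, w_prev, rep_w_prev, Z, scale):
    """j-th term of (3.13), without 1/N^2: the same Z_ij with wave-j
    weights minus the same Z_ij with wave-(j-1) weights."""
    return (replicate_variance(w_j, rep_w_j, Z, scale)
            - replicate_variance(w_prev, rep_w_prev, Z, scale))

# Toy example with two units and two replicates; wave j - 1 weights are zero.
Z = np.array([[1.0], [3.0]])
term = wave_term(
    w_j=np.array([1.0, 1.0]), rep_w_j=np.array([[2.0, 0.0], [0.0, 2.0]]),
    w_prev=np.zeros(2), rep_w_prev=np.zeros((2, 2)), Z=Z, scale=0.5,
)
```

Note that `Z` is never recomputed inside the replicate loop, which reflects the use of the full-sample estimator at every replicate.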

Date modified: 2017-09-20
