# Variance estimation under monotone non-response for a panel survey

## Section 2. Correction of non-response and attrition

### 2.1  Notation and main assumptions

We are interested in a finite population $U.$ A sample ${s}_{0}$ is first selected according to some sampling design $p\left(\cdot \right),$ and we assume that the first-order inclusion probabilities ${\pi }_{i}$ are strictly positive for any $i\in U.$ This first sampling phase corresponds to the original inclusion of units in the sample.

We consider a panel survey in which only the units in the original sample ${s}_{0}$ are followed over time, with no re-entry and no late-entry units added at subsequent times to represent possible newborns. We are therefore interested in estimating some parameter defined over the population $U,$ for some study variable ${y}_{t}$ taking the value ${y}_{it}$ for unit $i$ at time $t.$ The units in ${s}_{0}$ are followed at subsequent times $\delta =1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}t,$ and the sample is prone to unit non-response at each time. We denote by ${r}_{i}^{\delta }$ the response indicator of unit $i$ at time $\delta ,$ and by ${s}_{\delta }$ the subset of respondents at time $\delta .$

We assume monotone non-response, resulting in the nested sequence ${s}_{0}\supset {s}_{1}\supset \dots \supset {s}_{t}.$ For $\delta =1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}t,$ we denote by ${p}_{i}^{\delta }=\text{Pr}\left(i\in {s}_{\delta }|\text{\hspace{0.17em}}{s}_{\delta -1}\right)$ the probability that unit $i$ responds at time $\delta ,$ given ${s}_{\delta -1}.$ We assume that the data are missing at random, i.e., the response probability ${p}_{i}^{\delta }$ at time $\delta$ can be explained by the variables observed at times $0,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}\delta -1,$ including the variables of interest; see for example Zhou and Kim (2012). We also assume that at any time $\delta$ the units respond independently of one another, and we denote by ${p}_{ij}^{\delta }={p}_{i}^{\delta }{p}_{j}^{\delta }$ the probability that two distinct units $i$ and $j$ both respond at time $\delta .$
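The monotone pattern can be illustrated with a minimal sketch. The function name `nested_respondent_sets` and the dictionary layout of the response indicators are hypothetical choices made for illustration; the only thing taken from the text is that a unit can respond at time $\delta$ only if it belongs to ${s}_{\delta -1}.$

```python
def nested_respondent_sets(s0, response):
    """Build s_0, s_1, ..., s_t under monotone non-response.

    s0       : iterable of unit labels in the original sample.
    response : list of dicts; response[d - 1][i] is the indicator r_i^d,
               only looked up for units still present in s_{d-1}.
    """
    samples = [set(s0)]
    for r_d in response:
        previous = samples[-1]
        # A unit stays in the panel only if it was a respondent at the
        # previous time and responds again at the current time.
        samples.append({i for i in previous if r_d.get(i, 0) == 1})
    return samples


samples = nested_respondent_sets(
    s0=[1, 2, 3, 4],
    response=[{1: 1, 2: 1, 3: 0, 4: 1}, {1: 1, 2: 0, 4: 1}],
)
```

In this toy example unit 3 drops out at time 1 and unit 2 at time 2, so each set is contained in the previous one.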

### 2.2  Reweighted estimator

We are interested in estimating the total $Y\left(t\right)={\sum }_{i\in U}\text{\hspace{0.17em}}{y}_{it}$ at time $t.$ In practice, the response probabilities at each time are unknown and need to be estimated. We assume that at each time $\delta$ the probability of response is parametrically modeled as

${p}_{i}^{\delta }={f}^{\delta }\left({z}_{i}^{\delta }\text{​},\text{\hspace{0.17em}}{\alpha }^{\delta }\right)\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.1\right)$

for some known function ${f}^{\delta }\left(\cdot ,\cdot \right),$ where ${z}_{i}^{\delta }$ is a vector of variables observed for all the units in ${s}_{\delta -1},$ and ${\alpha }^{\delta }$ denotes some unknown parameter. Here and elsewhere, the superscript $\delta$ is used when we account for non-response at time $\delta ,$ as for the probability ${p}_{i}^{\delta }$ that unit $i$ is a respondent at time $\delta .$ Following the approach in Kim and Kim (2007), we assume that the true parameter is estimated by ${\stackrel{^}{\alpha }}^{\delta },$ the solution of the estimating equation

$\frac{\partial }{\partial \alpha }\sum _{i\in {s}_{\delta -1}}\text{\hspace{0.17em}}{k}_{i}^{\delta }\left\{\text{\hspace{0.17em}}{r}_{i}^{\delta }\mathrm{ln}\left({p}_{i}^{\delta }\right)+\left(1-{r}_{i}^{\delta }\right)\mathrm{ln}\left(1-{p}_{i}^{\delta }\right)\right\}=0,\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.2\right)$

with ${k}_{i}^{\delta }$ some weight of unit $i$ in the estimating equation. Customary choices for these weights include ${k}_{i}^{\delta }=1$ and ${k}_{i}^{\delta }={\pi }_{i}^{-1}\text{​},$ see Fuller and An (1998), Beaumont (2005) and Kim and Kim (2007).
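As an illustration, the estimating equation (2.2) can be solved numerically once a form is chosen for ${f}^{\delta }.$ The sketch below assumes the logistic form $\text{logit}\left({p}_{i}^{\delta }\right)={\left({z}_{i}^{\delta }\right)}^{\top }{\alpha }^{\delta },$ for which (2.2) reduces to ${\sum }_{i\in {s}_{\delta -1}}{k}_{i}^{\delta }\left({r}_{i}^{\delta }-{p}_{i}^{\delta }\right){z}_{i}^{\delta }=0,$ and uses Newton-Raphson iterations. The function names and the choice of solver are assumptions made for illustration, not part of the source.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def logistic(z_i, alpha):
    return 1.0 / (1.0 + math.exp(-sum(zj * aj for zj, aj in zip(z_i, alpha))))


def fit_response_model(z, r, k, n_iter=50, tol=1e-10):
    """Newton-Raphson for sum_i k_i (r_i - p_i) z_i = 0 (logistic case).

    z : covariate vectors z_i for the units in s_{delta-1}
    r : response indicators r_i at time delta
    k : weights k_i in the estimating equation
    """
    n, q = len(z), len(z[0])
    alpha = [0.0] * q
    for _ in range(n_iter):
        p = [logistic(z_i, alpha) for z_i in z]
        score = [sum(k[i] * (r[i] - p[i]) * z[i][a] for i in range(n))
                 for a in range(q)]
        info = [[sum(k[i] * p[i] * (1.0 - p[i]) * z[i][a] * z[i][b]
                     for i in range(n)) for b in range(q)] for a in range(q)]
        step = solve_linear(info, score)
        alpha = [a + s for a, s in zip(alpha, step)]
        if max(abs(s) for s in step) < tol:
            break
    return alpha, [logistic(z_i, alpha) for z_i in z]


# Intercept-only toy example with k_i = 1: three respondents out of four.
alpha_hat, p_hat = fit_response_model([[1.0]] * 4, [1, 1, 1, 0], [1.0] * 4)
```

With an intercept only and ${k}_{i}^{\delta }=1,$ the fitted probability is the raw response rate, which gives a quick sanity check of the solver.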

The estimated response probability at time $\delta$ is ${\stackrel{^}{p}}_{i}^{\delta }={f}^{\delta }\left({z}_{i}^{\delta }\text{​},\text{\hspace{0.17em}}{\stackrel{^}{\alpha }}^{\delta }\right).$ The propensity score adjusted estimator at time $t,$ which will be simply called the reweighted estimator in what follows, is defined as

${\stackrel{^}{Y}}_{t}\left(t\right)=\sum _{i\in {s}_{t}}\text{\hspace{0.17em}}\frac{{y}_{it}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}\text{ }\text{ }\text{with}\text{ }\text{ }{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}=\prod _{\delta =1}^{t}\text{\hspace{0.17em}}{\stackrel{^}{p}}_{i}^{\delta }.\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.3\right)$

Here and elsewhere, the subscript $t$ is used when the sample observed at time $t$ is used for estimation, as for ${\stackrel{^}{Y}}_{t}\left(\cdot \right),$ which makes use of the sample ${s}_{t}.$ We simplify the notation as ${\stackrel{^}{Y}}_{t}\left(t\right)\text{\hspace{0.17em}}\text{\hspace{0.17em}}\equiv \text{\hspace{0.17em}}\text{\hspace{0.17em}}{\stackrel{^}{Y}}_{t}$ when the total at time $t$ is estimated by using the sample observed at time $t.$
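A minimal sketch of the reweighted estimator (2.3) is given below. The function name and the data layout (one list of estimated probabilities per respondent, ordered by time) are hypothetical; the computation itself is the sum in (2.3).

```python
from math import prod


def reweighted_total(y, pi, p_hat_by_time):
    """Reweighted estimator (2.3), computed over the respondents in s_t.

    y             : values y_it for i in s_t
    pi            : first-order inclusion probabilities pi_i
    p_hat_by_time : for each unit, [p_hat_i^1, ..., p_hat_i^t]
    """
    total = 0.0
    for y_i, pi_i, probs in zip(y, pi, p_hat_by_time):
        p_cum = prod(probs)  # estimated probability of responding at times 1..t
        total += y_i / (pi_i * p_cum)
    return total


# Toy example with two respondents followed over t = 2 times.
y_hat = reweighted_total(
    y=[10.0, 20.0],
    pi=[0.5, 0.25],
    p_hat_by_time=[[0.8, 0.5], [1.0, 0.5]],
)
```

Here the two units carry final weights $1/\left(0.5\times 0.4\right)=5$ and $1/\left(0.25\times 0.5\right)=8,$ so the estimate is $10\times 5+20\times 8=210.$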

### 2.3  Variance computation

Under some regularity assumptions on the response mechanisms and some regularity conditions on the functions ${f}^{\delta }\left(\cdot ,\cdot \right),$ we obtain from Theorem 1 in Kim and Kim (2007) that we can write

${\stackrel{^}{Y}}_{t}={\stackrel{^}{Y}}_{\text{lin}\text{​},\text{\hspace{0.17em}}t}\left(t\right)+{O}_{p}\left(N{n}^{-1}\right),\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.4\right)$

where

${\stackrel{^}{Y}}_{\text{lin}\text{​},\text{\hspace{0.17em}}t}\left(t\right)=\sum _{i\in {s}_{t-1}}\text{\hspace{0.17em}}\frac{1}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t\text{\hspace{0.17em}}-\text{\hspace{0.17em}}1}}\text{\hspace{0.17em}}\left\{{k}_{i}^{t}{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t\text{\hspace{0.17em}}-\text{\hspace{0.17em}}1}{p}_{i}^{t}{\left({h}_{i}^{t}\right)}^{\top }{\gamma }^{t}+\frac{{r}_{i}^{t}}{{p}_{i}^{t}}\text{\hspace{0.17em}}\left({y}_{it}-{k}_{i}^{t}{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t\text{\hspace{0.17em}}-\text{\hspace{0.17em}}1}{p}_{i}^{t}{\left({h}_{i}^{t}\right)}^{\top }{\gamma }^{t}\right)\right\},\text{ }\text{ }\text{ }\left(2.5\right)$

and where for any $\delta =1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}t$ we denote by ${h}_{i}^{\delta }$ the value of ${h}_{i}^{\delta }\left(\alpha \right)=\partial \text{logit}\left({p}_{i}^{\delta }\right)/\partial \alpha$ evaluated at $\alpha ={\alpha }^{\delta },$ and

${\gamma }^{\delta }={\left\{\sum _{i\in {s}_{\delta -1}}{k}_{i}^{\delta }{p}_{i}^{\delta }\left(1-{p}_{i}^{\delta }\right){h}_{i}^{\delta }{\left({h}_{i}^{\delta }\right)}^{\top }\right\}}^{-1}\sum _{i\in {s}_{\delta -1}}\text{\hspace{0.17em}}\frac{1-{p}_{i}^{\delta }}{{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}\delta \text{\hspace{0.17em}}-\text{\hspace{0.17em}}1}}\text{\hspace{0.17em}}{h}_{i}^{\delta }\text{\hspace{0.17em}}\frac{{y}_{it}}{{\pi }_{i}}.\text{ }\text{ }\text{ }\left(2.6\right)$

From (2.5), we obtain that

$E\left\{{\stackrel{^}{Y}}_{\text{lin}\text{​},\text{\hspace{0.17em}}t}\left(t\right)|\text{\hspace{0.17em}}{s}_{t-1}\right\}={\stackrel{^}{Y}}_{t-1}\left(t\right),\text{ }\text{ }\text{ }\text{ }\left(2.7\right)$

with ${\stackrel{^}{Y}}_{t-1}\left(t\right)$ the estimator of $Y\left(t\right)$ computed on ${s}_{t-1}.$ Using a proof by induction, it follows from (2.4) and (2.7) that ${\stackrel{^}{Y}}_{t}$ is approximately unbiased for $Y\left(t\right).$ Also, the variance of ${\stackrel{^}{Y}}_{t}$ may be asymptotically approximated by

${V}_{\text{app}}\left({\stackrel{^}{Y}}_{t}\right)=V\left(\sum _{i\in {s}_{0}}\frac{{y}_{it}}{{\pi }_{i}}\right)+E\left[\sum _{\delta =1}^{t}\text{\hspace{0.17em}}V\left\{{\stackrel{^}{Y}}_{\text{lin}\text{​},\text{\hspace{0.17em}}\delta }\left(t\right)|\text{\hspace{0.17em}}{s}_{\delta -1}\right\}\right].\text{ }\text{ }\text{ }\text{ }\left(2.8\right)$

The first term on the right-hand side of (2.8) is the variance due to the sampling design, which we denote by ${V}^{p}\left({\stackrel{^}{Y}}_{t}\right).$ The second term on the right-hand side of (2.8) is the variance due to non-response, which we denote by ${V}^{\text{nr}}\left({\stackrel{^}{Y}}_{t}\right).$ From (2.5), this asymptotic variance is given by

${V}^{\text{nr}}\left({\stackrel{^}{Y}}_{t}\right)=E\left(\sum _{\delta =1}^{t}\text{\hspace{0.17em}}{V}^{\text{nr}\delta }\left({\stackrel{^}{Y}}_{t}\right)\right),\text{ }\text{ }\text{ }\text{ }\left(2.9\right)$

where

${V}^{\text{nr}\delta }\left({\stackrel{^}{Y}}_{t}\right)=\sum _{i\in {s}_{\delta -1}}{p}_{i}^{\delta }\left(1-{p}_{i}^{\delta }\right)\text{\hspace{0.17em}}{\left(\frac{{y}_{it}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}\delta \text{\hspace{0.17em}}-\text{\hspace{0.17em}}1}{p}_{i}^{\delta }}-{k}_{i}^{\delta }{\left({h}_{i}^{\delta }\right)}^{\top }{\gamma }^{\delta }\right)}^{2}\text{​}.\text{ }\text{ }\text{ }\text{ }\left(2.10\right)$

We note that for each component $\delta =1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}t,$ the term ${V}^{\text{nr}\delta }\left({\stackrel{^}{Y}}_{t}\right)$ in (2.10) includes a centering term ${k}_{i}^{\delta }{\left({h}_{i}^{\delta }\right)}^{\top }{\gamma }^{\delta },$ which is essentially a prediction of ${\left({\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}\delta \text{\hspace{0.17em}}-\text{\hspace{0.17em}}1}{p}_{i}^{\delta }\right)}^{-1}{y}_{it}$ by means of the regressors ${h}_{i}^{\delta }.$ This centering is due to the estimation of the response probabilities. Without these centering terms, equations (2.9) and (2.10) would give the variance of the estimator of $Y\left(t\right)$ obtained by replacing the estimated probabilities in (2.3) by their true values. The variance of this estimator is usually larger than that of the reweighted estimator in (2.3); see also Beaumont (2005), equation (5.7) and Kim and Kim (2007), equation (17), for the case $t=1.$

### 2.4  Variance estimation

At time $t,$ an approximately unbiased estimator for the variance due to the sampling design ${V}^{p}\left({\stackrel{^}{Y}}_{t}\right)$ is

${\stackrel{^}{V}}_{t}^{p}\left({\stackrel{^}{Y}}_{t}\right)=\sum _{i,\text{\hspace{0.17em}}j\in {s}_{t}}\text{\hspace{0.17em}}\frac{{\Delta }_{ij}}{{\pi }_{ij}}\text{\hspace{0.17em}}\frac{1}{{\stackrel{^}{p}}_{ij}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}\text{\hspace{0.17em}}\frac{{y}_{it}}{{\pi }_{i}}\text{\hspace{0.17em}}\frac{{y}_{jt}}{{\pi }_{j}},\text{ }\text{ }\text{ }\text{ }\left(2.11\right)$

where ${\stackrel{^}{p}}_{ij}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}\equiv {\prod }_{\delta =1}^{t}\text{\hspace{0.17em}}{\stackrel{^}{p}}_{ij}^{\delta }\text{​},$ and where ${\stackrel{^}{p}}_{ij}^{\delta }={\stackrel{^}{p}}_{i}^{\delta }$ if $i=j,$ and ${\stackrel{^}{p}}_{ij}^{\delta }={\stackrel{^}{p}}_{i}^{\delta }{\stackrel{^}{p}}_{j}^{\delta }$ otherwise. Following equation (25) in Kim and Kim (2007), ${V}^{\text{nr}}\left({\stackrel{^}{Y}}_{t}\right)$ may be approximately unbiasedly estimated at time $t$ by
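The double sum in (2.11) can be sketched as follows. The sketch assumes the usual definition ${\Delta }_{ij}={\pi }_{ij}-{\pi }_{i}{\pi }_{j}$ with ${\pi }_{ii}={\pi }_{i},$ which is not restated in this section; the function name and the matrix layout of the joint inclusion probabilities are hypothetical.

```python
def design_variance_estimate(y, pi, pi_joint, p_cum):
    """Estimator (2.11) of the sampling-design variance, over s_t.

    y        : values y_it for i in s_t
    pi       : first-order inclusion probabilities pi_i
    pi_joint : matrix of joint inclusion probabilities pi_ij (pi_ii = pi_i)
    p_cum    : estimated cumulative response probabilities p_hat_i^{1->t}
    Assumes Delta_ij = pi_ij - pi_i * pi_j.
    """
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            delta_ij = pi_joint[i][j] - pi[i] * pi[j]
            # Units respond independently, so the joint response
            # probability is a product unless i == j.
            p_ij = p_cum[i] if i == j else p_cum[i] * p_cum[j]
            total += (delta_ij / pi_joint[i][j] / p_ij
                      * (y[i] / pi[i]) * (y[j] / pi[j]))
    return total


# Toy example under Poisson sampling, where pi_ij = pi_i * pi_j for i != j,
# so only the diagonal terms contribute.
v_p = design_variance_estimate(
    y=[10.0, 20.0],
    pi=[0.5, 0.25],
    pi_joint=[[0.5, 0.125], [0.125, 0.25]],
    p_cum=[0.4, 0.5],
)
```

Under Poisson sampling the estimate reduces to ${\sum }_{i\in {s}_{t}}\left(1-{\pi }_{i}\right){\left({y}_{it}/{\pi }_{i}\right)}^{2}/{\stackrel{^}{p}}_{i}^{1\to t},$ which is $500+9600=10100$ in this example.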

${\stackrel{^}{V}}_{t}^{\text{nr}}\left({\stackrel{^}{Y}}_{t}\right)=\sum _{\delta =1}^{t}\text{\hspace{0.17em}}{\stackrel{^}{V}}_{t}^{\text{nr}\delta }\left({\stackrel{^}{Y}}_{t}\right)\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.12\right)$

where

${\stackrel{^}{V}}_{t}^{\text{nr}\delta }\left({\stackrel{^}{Y}}_{t}\right)=\sum _{i\in {s}_{t}}\frac{{\stackrel{^}{p}}_{i}^{\delta }\left(1-{\stackrel{^}{p}}_{i}^{\delta }\right)}{{\stackrel{^}{p}}_{i}^{\delta \text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}{\left(\frac{{y}_{it}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}\delta }}-{k}_{i}^{\delta }{\left({\stackrel{^}{h}}_{i}^{\delta }\right)}^{\top }{\stackrel{^}{\gamma }}_{t}^{\delta }\right)}^{2}\text{​},\text{ }\text{ }\text{ }\text{ }\left(2.13\right)$

${\stackrel{^}{h}}_{i}^{\delta }=h\left({z}_{i}\text{​},\text{\hspace{0.17em}}{\stackrel{^}{\alpha }}^{\delta }\right),\text{ }\text{ }\text{ }\text{ }\left(2.14\right)$

${\stackrel{^}{\gamma }}_{t}^{\delta }={\left\{\sum _{i\in {s}_{t}}\text{\hspace{0.17em}}{k}_{i}^{\delta }\frac{{\stackrel{^}{p}}_{i}^{\delta }\left(1-{\stackrel{^}{p}}_{i}^{\delta }\right)}{{\stackrel{^}{p}}_{i}^{\delta \text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}{\stackrel{^}{h}}_{i}^{\delta }{\left({\stackrel{^}{h}}_{i}^{\delta }\right)}^{\top }\right\}}^{-1}\sum _{i\in {s}_{t}}\text{\hspace{0.17em}}\frac{1-{\stackrel{^}{p}}_{i}^{\delta }}{{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}\text{\hspace{0.17em}}{\stackrel{^}{h}}_{i}^{\delta }\text{\hspace{0.17em}}\frac{{y}_{it}}{{\pi }_{i}}.\text{ }\text{ }\text{ }\text{ }\left(2.15\right)$

This leads to the global variance estimator at time $t$

${\stackrel{^}{V}}_{t}\left({\stackrel{^}{Y}}_{t}\right)={\stackrel{^}{V}}_{t}^{p}\left({\stackrel{^}{Y}}_{t}\right)+{\stackrel{^}{V}}_{t}^{\text{nr}}\left({\stackrel{^}{Y}}_{t}\right).\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.16\right)$

A simplified estimator of the variance due to non-response is obtained by ignoring the prediction terms ${k}_{i}^{\delta }{\left({\stackrel{^}{h}}_{i}^{\delta }\right)}^{\top }{\stackrel{^}{\gamma }}_{t}^{\delta }$ for each of the $\delta =1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}t$ variance components. After some algebra, this leads to the simplified variance estimator

${\stackrel{^}{V}}_{t\text{​},\text{\hspace{0.17em}}\text{simp}}^{\text{nr}}\left\{{\stackrel{^}{Y}}_{t}\left(t\right)\right\}=\sum _{i\in {s}_{t}}\text{\hspace{0.17em}}\frac{1-{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}{{\left({\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}\right)}^{2}}{\left(\frac{{y}_{it}}{{\pi }_{i}}\right)}^{2}\text{​}.\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.17\right)$
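One way to check the algebra behind (2.17) is sketched here. It reads ${\stackrel{^}{p}}_{i}^{\delta \to t}$ as ${\prod }_{{\delta }^{\prime }=\delta }^{t}{\stackrel{^}{p}}_{i}^{{\delta }^{\prime }}$ and uses the convention ${\stackrel{^}{p}}_{i}^{1\to 0}=1;$ both are assumptions about the notation, which the text does not spell out. Since ${\stackrel{^}{p}}_{i}^{1\to \delta }{\stackrel{^}{p}}_{i}^{\delta \to t}={\stackrel{^}{p}}_{i}^{\delta }{\stackrel{^}{p}}_{i}^{1\to t},$ the summand of (2.13) without its prediction term can be written as

$\frac{{\stackrel{^}{p}}_{i}^{\delta }\left(1-{\stackrel{^}{p}}_{i}^{\delta }\right)}{{\stackrel{^}{p}}_{i}^{\delta \to t}}{\left(\frac{{y}_{it}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\to \delta }}\right)}^{2}=\frac{1-{\stackrel{^}{p}}_{i}^{\delta }}{{\stackrel{^}{p}}_{i}^{1\to t}{\stackrel{^}{p}}_{i}^{1\to \delta }}{\left(\frac{{y}_{it}}{{\pi }_{i}}\right)}^{2}=\frac{1}{{\stackrel{^}{p}}_{i}^{1\to t}}\left(\frac{1}{{\stackrel{^}{p}}_{i}^{1\to \delta }}-\frac{1}{{\stackrel{^}{p}}_{i}^{1\to \delta -1}}\right){\left(\frac{{y}_{it}}{{\pi }_{i}}\right)}^{2}.$

Summing over $\delta =1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}t,$ the bracket telescopes:

$\sum _{\delta =1}^{t}\left(\frac{1}{{\stackrel{^}{p}}_{i}^{1\to \delta }}-\frac{1}{{\stackrel{^}{p}}_{i}^{1\to \delta -1}}\right)=\frac{1}{{\stackrel{^}{p}}_{i}^{1\to t}}-1=\frac{1-{\stackrel{^}{p}}_{i}^{1\to t}}{{\stackrel{^}{p}}_{i}^{1\to t}},$

and multiplying by ${\left({\stackrel{^}{p}}_{i}^{1\to t}\right)}^{-1}{\left({y}_{it}/{\pi }_{i}\right)}^{2}$ gives the summand of (2.17).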

The main advantage of this simplified variance estimator is that it only requires the knowledge of the estimated response probabilities. On the other hand, the computation of the variance estimator in (2.12) requires the knowledge of the response models used at all times. The simplified variance estimator is therefore of particular interest for secondary users of the survey data, for whom the estimated response probabilities may be the only available information related to the response modeling. This simplified variance estimator will tend to overestimate the variance due to non-response of ${\stackrel{^}{Y}}_{t}$ if the prediction term ${k}_{i}^{\delta }{\left({h}_{i}^{\delta }\right)}^{\top }{\gamma }^{\delta }$ partly explains ${\left({\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}\delta \text{\hspace{0.17em}}-\text{\hspace{0.17em}}1}{p}_{i}^{\delta }\right)}^{-1}{y}_{it}.$
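For a secondary user, the simplified estimator (2.17) could be computed along the following lines. This is only a sketch: the function name and the data layout are hypothetical, and the inputs are the quantities that (2.17) requires, namely ${y}_{it},$ ${\pi }_{i}$ and the estimated response probabilities.

```python
from math import prod


def simplified_nr_variance(y, pi, p_hat_by_time):
    """Simplified estimator (2.17) of the non-response variance, over s_t.

    y             : values y_it for i in s_t
    pi            : first-order inclusion probabilities pi_i
    p_hat_by_time : for each unit, [p_hat_i^1, ..., p_hat_i^t]
    """
    total = 0.0
    for y_i, pi_i, probs in zip(y, pi, p_hat_by_time):
        p_cum = prod(probs)  # p_hat_i^{1->t}
        total += (1.0 - p_cum) / p_cum ** 2 * (y_i / pi_i) ** 2
    return total


v_simp = simplified_nr_variance(
    y=[10.0, 20.0],
    pi=[0.5, 0.25],
    p_hat_by_time=[[0.8, 0.5], [1.0, 0.5]],
)
```

In this toy example the two units contribute $\left(0.6/0.16\right)\times 400=1500$ and $\left(0.5/0.25\right)\times 6400=12800.$ No covariate ${z}_{i}^{\delta }$ or model form is needed, only the product of the estimated probabilities.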

### 2.5  Application to the logistic regression model

In the particular case when a logistic regression model is used at each time $\delta ,$ the model (2.1) may be rewritten as

$\text{logit}\left({p}_{i}^{\delta }\right)={\left({z}_{i}^{\delta }\right)}^{\top }{\alpha }^{\delta }.\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.18\right)$

We obtain ${\stackrel{^}{h}}_{i}^{\delta }={z}_{i}^{\delta }\text{​},$ and the estimator for the variance due to non-response is given by (2.12), with

${\stackrel{^}{V}}_{t}^{\text{nr}\delta }\left({\stackrel{^}{Y}}_{t}\right)=\sum _{i\in {s}_{t}}\text{\hspace{0.17em}}\frac{{\stackrel{^}{p}}_{i}^{\delta }\left(1-{\stackrel{^}{p}}_{i}^{\delta }\right)}{{\stackrel{^}{p}}_{i}^{\delta \text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}\text{\hspace{0.17em}}{\left(\frac{{y}_{it}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}\delta }}-{k}_{i}^{\delta }{\left({z}_{i}^{\delta }\right)}^{\top }{\stackrel{^}{\gamma }}_{t}^{\delta }\right)}^{2}\text{​},\text{ }\text{ }\text{ }\text{ }\left(2.19\right)$

${\stackrel{^}{\gamma }}_{t}^{\delta }={\left\{\sum _{i\in {s}_{t}}\text{\hspace{0.17em}}{k}_{i}^{\delta }\frac{{\stackrel{^}{p}}_{i}^{\delta }\left(1-{\stackrel{^}{p}}_{i}^{\delta }\right)}{{\stackrel{^}{p}}_{i}^{\delta \text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}{z}_{i}^{\delta }{\left({z}_{i}^{\delta }\right)}^{\top }\right\}}^{-1}\sum _{i\in {s}_{t}}\text{\hspace{0.17em}}\frac{1-{\stackrel{^}{p}}_{i}^{\delta }}{{\stackrel{^}{p}}_{i}^{1\text{\hspace{0.17em}}\to \text{\hspace{0.17em}}t}}\text{\hspace{0.17em}}{z}_{i}^{\delta }\text{\hspace{0.17em}}\frac{{y}_{it}}{{\pi }_{i}}.\text{ }\text{ }\text{ }\text{ }\left(2.20\right)$
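The pair (2.19)-(2.20) can be sketched in code for a single time $\delta .$ The names `solve_linear` and `nr_variance_component` are hypothetical, ${\stackrel{^}{p}}_{i}^{\delta \to t}$ is read as the product of the estimated probabilities from time $\delta$ to time $t$ (an assumption about the notation), and the matrix in (2.20) is assumed to be invertible.

```python
from math import prod


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def nr_variance_component(delta, y, pi, k, z, p_hat_by_time):
    """Component (2.19) and coefficient (2.20) for time delta (1-based).

    All inputs are indexed by the respondents in s_t; k and z hold
    k_i^delta and z_i^delta, p_hat_by_time[i] = [p_hat_i^1, ..., p_hat_i^t].
    """
    n, q = len(y), len(z[0])
    p_d = [probs[delta - 1] for probs in p_hat_by_time]
    p_1_to_d = [prod(probs[:delta]) for probs in p_hat_by_time]
    p_d_to_t = [prod(probs[delta - 1:]) for probs in p_hat_by_time]
    p_1_to_t = [prod(probs) for probs in p_hat_by_time]
    w = [p_d[i] * (1.0 - p_d[i]) / p_d_to_t[i] for i in range(n)]
    # Equation (2.20): gamma_hat = A^{-1} b.
    a = [[sum(k[i] * w[i] * z[i][r] * z[i][c] for i in range(n))
          for c in range(q)] for r in range(q)]
    b = [sum((1.0 - p_d[i]) / p_1_to_t[i] * z[i][r] * y[i] / pi[i]
             for i in range(n)) for r in range(q)]
    gamma = solve_linear(a, b)
    # Equation (2.19): weighted sum of squared centered terms.
    v = 0.0
    for i in range(n):
        centering = k[i] * sum(zc * g for zc, g in zip(z[i], gamma))
        v += w[i] * (y[i] / (pi[i] * p_1_to_d[i]) - centering) ** 2
    return v, gamma


# Toy check with t = 1, an intercept only, and constant y_i1 / pi_i.
v_nr, gamma_hat = nr_variance_component(
    delta=1, y=[2.0] * 3, pi=[0.5] * 3, k=[1.0] * 3,
    z=[[1.0]] * 3, p_hat_by_time=[[0.8]] * 3,
)
```

In the toy check the intercept predicts ${y}_{i1}/\left({\pi }_{i}{\stackrel{^}{p}}_{i}^{1}\right)=5$ exactly, so the centered component vanishes while the uncentered one would not.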

If the reweighted estimator is computed at time $t=1,$ the estimator in (2.12) for the variance due to non-response may be rewritten as

${\stackrel{^}{V}}_{1}^{\text{nr}}\left({\stackrel{^}{Y}}_{1}\right)=\sum _{i\in {s}_{1}}\text{\hspace{0.17em}}\left(1-{\stackrel{^}{p}}_{i}^{1}\right){\left(\frac{{y}_{i1}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1}}-{k}_{i}^{1}{\left({z}_{i}^{1}\right)}^{\top }{\stackrel{^}{\gamma }}_{1}^{1}\right)}^{2}.\text{ }\text{ }\text{ }\text{ }\left(2.21\right)$

If the reweighted estimator is computed at time $t=2,$ the estimator in (2.12) for the variance due to non-response may be rewritten as

${\stackrel{^}{V}}_{2}^{\text{nr}}\left({\stackrel{^}{Y}}_{2}\right)=\sum _{i\in {s}_{2}}\text{\hspace{0.17em}}\frac{\left(1-{\stackrel{^}{p}}_{i}^{1}\right)}{{\stackrel{^}{p}}_{i}^{2}}{\left(\frac{{y}_{i2}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1}}-{k}_{i}^{1}{\left({z}_{i}^{1}\right)}^{\top }{\stackrel{^}{\gamma }}_{2}^{1}\right)}^{2}+\sum _{i\in {s}_{2}}\text{\hspace{0.17em}}\left(1-{\stackrel{^}{p}}_{i}^{2}\right){\left(\frac{{y}_{i2}}{{\pi }_{i}{\stackrel{^}{p}}_{i}^{1}{\stackrel{^}{p}}_{i}^{2}}-{k}_{i}^{2}{\left({z}_{i}^{2}\right)}^{\top }{\stackrel{^}{\gamma }}_{2}^{2}\right)}^{2}.\text{ }\text{ }\text{ }\text{ }\left(2.22\right)$

In practice, the model of Response Homogeneity Groups (RHG) is often assumed when correcting for unit non-response. Under this model, it is assumed that at each time $\delta =1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}t,$ the sub-sample ${s}_{\delta -1}$ may be partitioned into $C\left(\delta -1\right)$ groups ${s}_{\delta -1}^{c}\text{​},\text{\hspace{0.17em}}c=1,\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}C\left(\delta -1\right),$ such that the response probability ${p}_{i}^{\delta }$ is constant inside a group. This model is a particular case of the logistic regression model in (2.18), obtained with

${z}_{i}^{\delta }={\left[1\left\{\text{\hspace{0.17em}}i\in {s}_{\delta -1}^{1}\right\},\text{\hspace{0.17em}}\dots ,\text{\hspace{0.17em}}1\left\{\text{\hspace{0.17em}}i\in {s}_{\delta -1}^{C\left(\delta -1\right)}\right\}\right]}^{\top }\text{​},\text{ }\text{ }\text{ }\text{ }\text{ }\left(2.23\right)$

and the variance due to non-response is estimated accordingly. Explicit formulas are given in the Appendix.
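With the indicator vector (2.23), the estimating equation (2.2) separates by group: within each group it reads ${\sum }_{i}{k}_{i}^{\delta }\left({r}_{i}^{\delta }-{p}^{\delta }\right)=0,$ so the estimated response probability is the ${k}_{i}^{\delta }$-weighted response rate of the group. A minimal sketch follows; the function name and the use of group labels in place of indicator vectors are choices made for illustration.

```python
def rhg_response_probabilities(groups, r, k):
    """Estimated response probabilities under the RHG model at one time.

    groups : group label of each unit in s_{delta-1}
    r      : response indicators r_i^delta
    k      : weights k_i^delta from the estimating equation (2.2)
    Returns one estimated probability per unit (constant within a group).
    """
    responded, total = {}, {}
    for g, r_i, k_i in zip(groups, r, k):
        responded[g] = responded.get(g, 0.0) + k_i * r_i
        total[g] = total.get(g, 0.0) + k_i
    rate = {g: responded[g] / total[g] for g in total}
    return [rate[g] for g in groups]


p_rhg = rhg_response_probabilities(
    groups=["a", "a", "a", "a", "b", "b"],
    r=[1, 1, 1, 0, 1, 0],
    k=[1.0] * 6,
)
```

With ${k}_{i}^{\delta }=1$ this is the unweighted response rate per group, and with ${k}_{i}^{\delta }={\pi }_{i}^{-1}$ it is the design-weighted rate.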
