# Variance estimation under monotone non-response for a panel survey

## Section 7. Conclusion

In this paper, we considered variance estimation that accounts for weighting adjustments in panel surveys. We proposed both an approximately unbiased variance estimator and a simplified variance estimator for estimators of totals, complex parameters and measures of change, which together cover most cases encountered in practice. Our simulation results indicate that the proposed variance estimator performs well in all cases considered. The simplified variance estimator tends to overestimate the variance of the expansion estimator for totals, and the variance of calibrated estimators of totals when the calibration variables lack explanatory power for the variable of interest. However, the simplified variance estimator performs well for the estimation of ratios and of change in totals with calibrated weights, even if the calibration model is not appropriate for the study variable.

The assumption of independent response behaviour is usually not tenable for multi-stage surveys, since the response behaviour of units within the same cluster tends to be correlated. For correlated responses, estimation of response probabilities based on conditional logistic regression has been studied by Skinner and D’Arrigo (2011); see also Kim, Kwon and Park (2016). Extending the present work to correlated response behaviour is a challenging problem for further research.

## Acknowledgements

We thank the Editors, an Associate Editor and the referees for useful comments and suggestions which led to an improvement of the paper.

## Appendix

### Estimation of the variance due to non-response for Response Homogeneity Groups

We consider the model of Response Homogeneity Groups (RHGs) introduced in Section 2.5. Recall that this model may be summarized as follows: at each time $\delta = 1, \dots, t$, the sub-sample $s_{\delta-1}$ is partitioned into $C(\delta-1)$ groups $s_{\delta-1}^{c}$, $c = 1, \dots, C(\delta-1)$. The response probabilities are assumed to be constant within the groups.

This model is equivalent to the logistic regression model in (2.18), with

$$z_{i}^{\delta} = \left[ 1\left\{ i \in s_{\delta-1}^{1} \right\}, \dots, 1\left\{ i \in s_{\delta-1}^{C(\delta-1)} \right\} \right]^{\top}. \qquad (\text{A.1})$$

The equation (2.2) leads to the estimated response probabilities

$$\widehat{p}_{i}^{\delta} = \frac{\sum_{j \in s_{\delta-1}^{c}} k_{j}^{\delta} r_{j}^{\delta}}{\sum_{j \in s_{\delta-1}^{c}} k_{j}^{\delta}} \qquad \text{for} \qquad i \in s_{\delta-1}^{c}. \qquad (\text{A.2})$$

Since this value is the same for all units in the group $s_{\delta-1}^{c}$, it is denoted by $\widehat{p}_{c}^{\delta}$ in what follows.
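As an illustration, the weighted response rate in (A.2) can be computed group by group. The following is a minimal sketch rather than the authors' implementation: the function name `rhg_response_probabilities` and the toy data are hypothetical, and the inputs are assumed to be plain lists indexed over the units of $s_{\delta-1}$.

```python
from collections import defaultdict


def rhg_response_probabilities(groups, k, r):
    """Weighted response rate per group, following the form of (A.2).

    groups[i]: label of the response homogeneity group of unit i
    k[i]:      weight k_i^delta used in the response model
    r[i]:      1 if unit i responds at time delta, 0 otherwise
    """
    num = defaultdict(float)  # sum of k_i * r_i within each group
    den = defaultdict(float)  # sum of k_i within each group
    for g, k_i, r_i in zip(groups, k, r):
        num[g] += k_i * r_i
        den[g] += k_i
    return {g: num[g] / den[g] for g in den}


# Hypothetical toy data: two groups, unequal weights.
groups = ["a", "a", "a", "b", "b"]
k = [1.0, 1.0, 2.0, 1.0, 3.0]
r = [1, 0, 1, 1, 0]
p_hat = rhg_response_probabilities(groups, k, r)
```

With equal weights $k_i^{\delta}$ within a group, the estimate reduces to the unweighted response rate of that group.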

We first consider the case when the reweighted estimator is computed at time $t = 1$. In the estimator of the variance due to non-response given in (2.21), the vector $\widehat{\gamma}_{1}^{1}$ simplifies to

$$\widehat{\gamma}_{1}^{1} = \left( \frac{\sum_{i \in s_{1} \cap s_{0}^{1}} \frac{y_{i1}}{\pi_{i}}}{\widehat{p}_{1}^{1} \sum_{i \in s_{1} \cap s_{0}^{1}} k_{i}^{1}}, \dots, \frac{\sum_{i \in s_{1} \cap s_{0}^{C(0)}} \frac{y_{i1}}{\pi_{i}}}{\widehat{p}_{C(0)}^{1} \sum_{i \in s_{1} \cap s_{0}^{C(0)}} k_{i}^{1}} \right)^{\top}. \qquad (\text{A.3})$$

After some algebra, the variance estimator in (2.21) may be rewritten as

$$\widehat{V}_{1}^{\text{nr}}\left( \widehat{Y}_{1} \right) = \sum_{c=1}^{C(0)} \frac{1 - \widehat{p}_{c}^{1}}{\left( \widehat{p}_{c}^{1} \right)^{2}} \sum_{i \in s_{1} \cap s_{0}^{c}} \left( \frac{y_{i1}}{\pi_{i}} - k_{i}^{1} \frac{\sum_{j \in s_{1} \cap s_{0}^{c}} \frac{y_{j1}}{\pi_{j}}}{\sum_{j \in s_{1} \cap s_{0}^{c}} k_{j}^{1}} \right)^{2}. \qquad (\text{A.4})$$
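The expression in (A.4) only involves group-level sums over the respondents, so it is straightforward to evaluate. The sketch below is one possible reading of the formula, not the authors' code: the function name `nr_variance_time1` and the toy data are hypothetical, all lists are assumed to be indexed over the units of $s_0$, and every group is assumed to contain at least one respondent.

```python
def nr_variance_time1(groups, y, pi, k, r):
    """Non-response variance estimate at time 1 under RHGs, as in (A.4).

    groups[i]: group label of unit i in s_0
    y[i]:      study variable y_{i1} (only used for respondents)
    pi[i]:     inclusion probability of unit i
    k[i]:      weight k_i^1
    r[i]:      1 if unit i responds at time 1, 0 otherwise
    """
    total = 0.0
    for c in set(groups):
        members = [i for i, g in enumerate(groups) if g == c]
        resp = [i for i in members if r[i] == 1]
        if not resp:
            raise ValueError(f"group {c!r} has no respondent")
        # Estimated response probability of the group, as in (A.2).
        p_hat = sum(k[i] for i in resp) / sum(k[i] for i in members)
        # Group-level sums over the respondents.
        sum_y = sum(y[i] / pi[i] for i in resp)
        sum_k = sum(k[i] for i in resp)
        # Sum of squared residuals inside the group.
        ss = sum((y[i] / pi[i] - k[i] * sum_y / sum_k) ** 2 for i in resp)
        total += (1.0 - p_hat) / p_hat**2 * ss
    return total


# Hypothetical toy data: one group of four units, two respondents.
# Values of y for non-respondents are placeholders and are never read.
v1 = nr_variance_time1(
    groups=["a", "a", "a", "a"],
    y=[1.0, 3.0, 0.0, 0.0],
    pi=[0.5, 0.5, 0.5, 0.5],
    k=[1.0, 1.0, 1.0, 1.0],
    r=[1, 1, 0, 0],
)
```

In this toy case the estimated response probability is 0.5, the residuals are $-2$ and $2$, and the estimate is $(0.5 / 0.25) \times 8 = 16$.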

We now consider the case when the reweighted estimator is computed at time $t = 2$. We focus on the simpler case when the same system of RHGs is kept over time. In the estimator of the variance due to non-response given in (2.22), the vectors $\widehat{\gamma}_{2}^{1}$ and $\widehat{\gamma}_{2}^{2}$ simplify to

$$\widehat{\gamma}_{2}^{1} = \left( \frac{\sum_{i \in s_{2} \cap s_{1}^{1}} \frac{y_{i2}}{\pi_{i}}}{\widehat{p}_{1}^{1} \sum_{i \in s_{2} \cap s_{1}^{1}} k_{i}^{1}}, \dots, \frac{\sum_{i \in s_{2} \cap s_{1}^{C(0)}} \frac{y_{i2}}{\pi_{i}}}{\widehat{p}_{C(0)}^{1} \sum_{i \in s_{2} \cap s_{1}^{C(0)}} k_{i}^{1}} \right)^{\top}, \qquad (\text{A.5})$$

$$\widehat{\gamma}_{2}^{2} = \left( \frac{\sum_{i \in s_{2} \cap s_{1}^{1}} \frac{y_{i2}}{\pi_{i}}}{\widehat{p}_{1}^{1} \widehat{p}_{1}^{2} \sum_{i \in s_{2} \cap s_{1}^{1}} k_{i}^{2}}, \dots, \frac{\sum_{i \in s_{2} \cap s_{1}^{C(0)}} \frac{y_{i2}}{\pi_{i}}}{\widehat{p}_{C(0)}^{1} \widehat{p}_{C(0)}^{2} \sum_{i \in s_{2} \cap s_{1}^{C(0)}} k_{i}^{2}} \right)^{\top}. \qquad (\text{A.6})$$

After some algebra, the variance estimator in (2.22) may be rewritten as

$$\begin{aligned} \widehat{V}_{2}^{\text{nr}}\left( \widehat{Y}_{2} \right) &= \sum_{c=1}^{C(0)} \frac{1 - \widehat{p}_{c}^{1}}{\widehat{p}_{c}^{2}} \sum_{i \in s_{2} \cap s_{1}^{c}} \left( \frac{y_{i2}}{\pi_{i} \widehat{p}_{c}^{1}} - k_{i}^{1} \frac{\sum_{j \in s_{2} \cap s_{1}^{c}} \frac{y_{j2}}{\pi_{j}}}{\widehat{p}_{c}^{1} \sum_{j \in s_{2} \cap s_{1}^{c}} k_{j}^{1}} \right)^{2} \\ &\quad + \sum_{c=1}^{C(0)} \left( 1 - \widehat{p}_{c}^{2} \right) \sum_{i \in s_{2} \cap s_{1}^{c}} \left( \frac{y_{i2}}{\pi_{i} \widehat{p}_{c}^{1} \widehat{p}_{c}^{2}} - k_{i}^{2} \frac{\sum_{j \in s_{2} \cap s_{1}^{c}} \frac{y_{j2}}{\pi_{j}}}{\widehat{p}_{c}^{1} \widehat{p}_{c}^{2} \sum_{j \in s_{2} \cap s_{1}^{c}} k_{j}^{2}} \right)^{2}. \qquad (\text{A.7}) \end{aligned}$$

If we further assume that $k_{i}^{\delta}$ is constant over times $\delta = 1, 2$, and may thus be rewritten as $k_{i}$, the expression in (A.7) simplifies to

$$\widehat{V}_{2}^{\text{nr}}\left( \widehat{Y}_{2} \right) = \sum_{c=1}^{C(0)} \frac{1 - \widehat{p}_{c}^{1 \to 2}}{\left( \widehat{p}_{c}^{1 \to 2} \right)^{2}} \sum_{i \in s_{2} \cap s_{1}^{c}} \left( \frac{y_{i2}}{\pi_{i}} - k_{i} \frac{\sum_{j \in s_{2} \cap s_{1}^{c}} \frac{y_{j2}}{\pi_{j}}}{\sum_{j \in s_{2} \cap s_{1}^{c}} k_{j}} \right)^{2}, \qquad (\text{A.8})$$

with $\widehat{p}_{c}^{1 \to 2} = \prod_{\delta=1}^{2} \widehat{p}_{c}^{\delta}$ for $c = 1, \dots, C(0)$. This simplification of the variance estimator can be extended to the reweighted estimator at time $t$. Assuming that the RHGs are kept over time, and that $k_{i}^{\delta} = k_{i}$ for any $\delta = 1, \dots, t$, the variance estimator in (2.12) may be written as

$$\widehat{V}_{t}^{\text{nr}}\left( \widehat{Y}_{t} \right) = \sum_{c=1}^{C(0)} \frac{1 - \widehat{p}_{c}^{1 \to t}}{\left( \widehat{p}_{c}^{1 \to t} \right)^{2}} \sum_{i \in s_{t} \cap s_{t-1}^{c}} \left( \frac{y_{it}}{\pi_{i}} - k_{i} \frac{\sum_{j \in s_{t} \cap s_{t-1}^{c}} \frac{y_{jt}}{\pi_{j}}}{\sum_{j \in s_{t} \cap s_{t-1}^{c}} k_{j}} \right)^{2}, \qquad (\text{A.9})$$

with $\widehat{p}_{c}^{1 \to t} = \prod_{\delta=1}^{t} \widehat{p}_{c}^{\delta}$ for $c = 1, \dots, C(0)$.
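The general expression (A.9) has the same structure as the time-1 formula, with the single-wave probability replaced by the product of the wave-specific probabilities. The following sketch illustrates this under stated assumptions: the function name `nr_variance_time_t` and the toy data are hypothetical, the lists are indexed over the respondents in $s_t$ only, and the per-wave estimated probabilities of each group are supplied by the caller.

```python
from math import prod


def nr_variance_time_t(groups, y, pi, k, p_by_wave):
    """Non-response variance estimate at time t under fixed RHGs, as in (A.9).

    groups[i]:    group label of respondent i in s_t
    y[i]:         study variable y_{it}
    pi[i]:        inclusion probability of unit i
    k[i]:         time-constant weight k_i
    p_by_wave[c]: estimated probabilities [p_c^1, ..., p_c^t] for group c
    """
    total = 0.0
    for c in set(groups):
        resp = [i for i, g in enumerate(groups) if g == c]
        # Cumulated response probability of group c over waves 1 to t.
        p_cum = prod(p_by_wave[c])
        sum_y = sum(y[i] / pi[i] for i in resp)
        sum_k = sum(k[i] for i in resp)
        ss = sum((y[i] / pi[i] - k[i] * sum_y / sum_k) ** 2 for i in resp)
        total += (1.0 - p_cum) / p_cum**2 * ss
    return total


# Hypothetical toy data: one group, two respondents at time t = 2.
v_t = nr_variance_time_t(
    groups=["a", "a"],
    y=[1.0, 3.0],
    pi=[0.5, 0.5],
    k=[1.0, 1.0],
    p_by_wave={"a": [0.5, 0.5]},
)

# Factor of (A.9) for t = 2, and its split into a wave-1 and a wave-2 part.
p1, p2 = 0.5, 0.5
factor = (1 - p1 * p2) / (p1 * p2) ** 2
split = (1 - p1) / (p1**2 * p2) + (1 - p2) / (p1 * p2) ** 2
```

For $t = 2$, the factor $(1 - \widehat{p}_{c}^{1}\widehat{p}_{c}^{2}) / (\widehat{p}_{c}^{1}\widehat{p}_{c}^{2})^{2}$ equals the sum of a wave-1 contribution $(1 - \widehat{p}_{c}^{1}) / \{(\widehat{p}_{c}^{1})^{2}\widehat{p}_{c}^{2}\}$ and a wave-2 contribution $(1 - \widehat{p}_{c}^{2}) / (\widehat{p}_{c}^{1}\widehat{p}_{c}^{2})^{2}$, which is why the two terms collapse into one when $k_i$ does not depend on time.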

## References

Beaumont, J.-F. (2005). Calibrated imputation in surveys under a quasimodel-assisted approach. *Journal of the Royal Statistical Society, Series B*, 67, 445-458.

Beaumont, J.-F., and Haziza, D. (2016). A note on the concept of invariance in two-phase sampling designs. *Survey Methodology*, 42, 2, 319-323. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/2016002/article/14662-eng.pdf.

Berger, Y. (2004). Variance estimation for measures of change in probability sampling. *Canadian Journal of Statistics*, 32, 4, 451-467.

Caron, N., and Ravalet, P. (2000). Estimation dans les enquêtes répétées : application à l’enquête emploi en continu. Technical report INSEE, Paris.

Chauvet, G., and Goga, C. (2018). Linearization versus bootstrap for variance estimation of the change between Gini indexes. *Survey Methodology*, 44, 1, 17-42. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/2018001/article/54926-eng.pdf.

Clarke, P., and Tate, P. (2002). An application of non-ignorable non-response models for gross flows estimation in the British labour force survey. *Australian & New Zealand Journal of Statistics*, 4, 413-425.

Deville, J.-C., and Särndal, C.-E. (1992). Calibration estimators in survey sampling. *Journal of the American Statistical Association*, 87, 376-382.

Ekholm, A., and Laaksonen, S. (1991). Weighting via response modeling in the Finnish household budget survey. *Journal of Official Statistics*, 7, 325-327.

Fay, R. (1992). When are inferences from multiple imputation valid? *Proceedings of the Survey Research Methods Section*, American Statistical Association, 81, 1, 227-232.

Fuller, W., and An, A. (1998). Regression adjustment for non-response. *Journal of the Indian Society of Agricultural Statistics*, 51, 331-342.

Fuller, W.A., Loughin, M.M. and Baker, H.D. (1994). Regression weighting in the presence of nonresponse with application to the 1987-1988 Nationwide Food Consumption Survey. *Survey Methodology*, 20, 1, 75-85. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/1994001/article/14429-eng.pdf.

Goga, C., Deville, J.-C. and Ruiz-Gazen, A. (2009). Composite estimation and linearization method for two-sample survey data. *Biometrika*, 96, 691-709.

Hawkes, D., and Plewis, I. (2009). Modelling nonresponse in the national child development study. *Journal of the Royal Statistical Society, Series A*, 169, 479-491.

Juillard, H., Chauvet, G. and Ruiz-Gazen, A. (2017). Estimation under cross-classified sampling with application to a childhood survey. *Journal of the American Statistical Association*, 112, 850-858.

Kalton, G. (2009). Design for surveys over time. *Handbook of Statistics*, 29, 89-108.

Kim, J.K., and Kim, J.J. (2007). Nonresponse weighting adjustment using estimated response probability. *Canadian Journal of Statistics*, 35, 501-514.

Kim, J.K., Kwon, Y. and Park, M. (2016). Calibrated propensity score method for survey nonresponse in cluster sampling. *Biometrika*, 103, 461-473.

Laaksonen, S. (2007). Weighting for two-phase surveyed data. *Survey Methodology*, 33, 2, 121-130. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/2007002/article/10489-eng.pdf.

Laaksonen, S., and Chambers, R.L. (2006). Survey estimation under informative nonresponse with follow-up. *Journal of Official Statistics*, 22, 81-95.

Laniel, N. (1988). Variances for a rotating sample from a changing population. *Proceedings of the Business and Economics Statistics Section*, American Statistical Association, 246-250.

Laurie, H., Smith, R. and Scott, L. (1999). Strategies for reducing nonresponse in a longitudinal panel survey. *Journal of Official Statistics*, 15, 269-282.

Lynn, P. (2009). Methods for longitudinal surveys. *Methodology of Longitudinal Surveys*, 1-19.

Nordberg, L. (2000). On variance estimation for measures of change when samples are coordinated by the use of permanent random numbers. *Journal of Official Statistics*, 16, 363-378.

Pirus, C., Bois, C., Dufourg, M., Lanoë, J., Vandentorren, S., Leridon, H. and the Elfe team (2010). Constructing a cohort: Experience with the French Elfe project. *Population*, 65, 637-670.

Qualité, L., and Tillé, Y. (2008). Variance estimation of changes in repeated surveys and its application to the Swiss survey of value added. *Survey Methodology*, 34, 2, 173-181. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/2008002/article/10758-eng.pdf.

Rendtel, U., and Harms, T. (2009). Weighting and calibration for household panels. *Methodology of Longitudinal Surveys*, 265-286.

Rizzo, L., Kalton, G. and Brick, J.M. (1996). A comparison of some weighting adjustment methods for panel nonresponse. *Survey Methodology*, 22, 1, 43-53. Paper available at https://www150.statcan.gc.ca/n1/pub/12-001-x/1996001/article/14386-eng.pdf.

Silva, P., and Skinner, C. (1997). Variable selection for regression estimation in finite populations. *Survey Methodology*, 23, 23-32.

Skinner, C. (2015). Cross-classified sampling: Some estimation theory. *Statistics & Probability Letters*, 104, 163-168.

Skinner, C., and D’Arrigo, J. (2011). Inverse probability weighting for clustered non-response. *Biometrika*, 98, 953-966.

Skinner, C., and Vieira, M. (2005). Design effects in the analysis of longitudinal survey data. S3RI Methodology Working Papers, M05/13. Southampton, UK: Southampton Statistical Sciences Research Institute.

Slud, E.V., and Bailey, L. (2010). Evaluation and selection of models for attrition nonresponse adjustment. *Journal of Official Statistics*, 26, 1-18.

Tam, S. (1984). On covariance from overlapping samples. *The American Statistician*, 38, 1-18.

Vandecasteele, L., and Debels, A. (2007). Attrition in panel data: The effectiveness of weighting. *European Sociological Review*, 23, 1, 81-97.

Zhou, M., and Kim, J. (2012). An efficient method of estimation for longitudinal surveys with monotone missing data. *Biometrika*, 99, 631-648.
