5 Conclusion and perspectives for research

Hervé Cardot, Alain Dessertaine, Camelia Goga, Étienne Josserand and Pauline Lardin

In this study, we implemented and compared different strategies for using auxiliary information to estimate the mean of data that take the form of curves and to construct confidence bands for it. This information can be taken into account at the sampling stage, by using unequal probability designs, or at the estimation stage, by combining simple random sampling without replacement (SRSWOR) with an estimator assisted by a functional-response regression model. Our example of electricity consumption curves suggests that when total consumption for the previous week is known, the precision of estimators of the mean can be greatly improved compared with plain SRSWOR sampling.
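To make the estimation-stage strategy concrete, here is a minimal sketch in Python of a Horvitz-Thompson estimator of the mean curve and of a model-assisted estimator based on a linear functional-response model fitted separately at each time point. It is an illustration under stated assumptions, not the exact estimators of the study: the function names, the synthetic population and the auxiliary variable `x` (standing in for the previous week's total consumption) are all hypothetical.

```python
import numpy as np

def ht_mean_curve(y_s, pi_s, N):
    """Horvitz-Thompson estimator of the mean curve: (1/N) * sum_s y_k(t) / pi_k."""
    return (y_s / pi_s[:, None]).sum(axis=0) / N

def model_assisted_mean_curve(y_s, x_s, pi_s, x_U):
    """Model-assisted estimator of the mean curve (hypothetical helper).

    y_s  : (n, D) sampled curves observed on a common grid of D time points
    x_s  : (n, p) auxiliary variables of the sampled units (with intercept column)
    pi_s : (n,)   inclusion probabilities of the sampled units
    x_U  : (N, p) auxiliary variables of the whole population
    """
    N = x_U.shape[0]
    w = 1.0 / pi_s
    # Design-weighted least squares: one coefficient vector per time point.
    xtwx = x_s.T @ (w[:, None] * x_s)
    xtwy = x_s.T @ (w[:, None] * y_s)
    beta = np.linalg.solve(xtwx, xtwy)       # shape (p, D)
    fitted_U = x_U @ beta                    # predicted curves for all N units
    resid_s = y_s - x_s @ beta               # residual curves on the sample
    # Mean of predictions plus Horvitz-Thompson correction of the residuals.
    return fitted_U.mean(axis=0) + ht_mean_curve(resid_s, pi_s, N)

# Synthetic population (assumption: purely illustrative, not the study's data).
rng = np.random.default_rng(1)
N, n, D = 2000, 200, 48
t = np.linspace(0.0, 1.0, D)
x = rng.gamma(shape=2.0, scale=50.0, size=N)          # hypothetical past total
Y = x[:, None] * (1.0 + 0.3 * np.sin(2 * np.pi * t)) / D + rng.normal(0.0, 1.0, (N, D))
X = np.column_stack([np.ones(N), x])

# SRSWOR sample of size n: every unit has inclusion probability n / N.
s = rng.choice(N, size=n, replace=False)
pi = np.full(n, n / N)

mu_ht = ht_mean_curve(Y[s], pi, N)
mu_ma = model_assisted_mean_curve(Y[s], X[s], pi, X)
```

Both functions return a vector of length `D`, one estimate per time point. The model-assisted estimator needs the auxiliary variable for every unit of the population, which is what makes a known past total usable at the estimation stage.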

In this context of large samples and high-dimensional data, it also seems possible to construct, for each of these strategies, confidence bands whose empirical coverage rates are close to the nominal rates. The two approaches considered (estimating the covariance function and then simulating Gaussian processes, or using the bootstrap) seem to perform comparably in terms of the width of the confidence bands; the main difference is computation time. The bootstrap seems more general because it does not require a good estimator of the covariance function, but it proves to be much slower in practice.
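The Gaussian-simulation approach can be sketched as follows in Python. Given an estimated mean curve and an estimate of the covariance function of the estimator on a time grid, the sketch simulates centred Gaussian vectors with the estimated correlation structure and uses a quantile of their supremum to scale the band. The function name, the toy mean curve and the exponential covariance used here are assumptions chosen for illustration only.

```python
import numpy as np

def gaussian_sup_band(mu_hat, cov_hat, alpha=0.05, n_sim=5000, seed=0):
    """Simultaneous band mu_hat(t) +/- c * sd(t) from simulated Gaussian paths.

    mu_hat  : (D,)   estimated mean curve on a grid of D time points
    cov_hat : (D, D) estimated covariance function of the estimator on that grid
    """
    rng = np.random.default_rng(seed)
    sd = np.sqrt(np.diag(cov_hat))
    corr = cov_hat / np.outer(sd, sd)
    # Centred Gaussian vectors sharing the estimated correlation structure.
    z = rng.multivariate_normal(np.zeros(mu_hat.size), corr, size=n_sim)
    # Supremum over time of |Z(t)| for each simulated path.
    sup_abs = np.abs(z).max(axis=1)
    # c is the (1 - alpha) quantile of the simulated suprema.
    c = np.quantile(sup_abs, 1.0 - alpha)
    return mu_hat - c * sd, mu_hat + c * sd, c

# Toy inputs (assumption: illustrative values, not estimates from real data).
t = np.linspace(0.0, 1.0, 48)
mu_hat = 10.0 + np.sin(2 * np.pi * t)
cov_hat = 0.04 * np.exp(-np.abs(t[:, None] - t[None, :]) / 0.2)

lower, upper, c = gaussian_sup_band(mu_hat, cov_hat)
```

The coefficient `c` exceeds the pointwise normal quantile because the band has to hold simultaneously over all time points. A bootstrap version would replace the simulated Gaussian vectors with re-estimated mean curves from resampled data, which avoids the covariance estimate but requires recomputing the estimator for every replicate.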

In these large-scale data flows, information is sometimes lost because of signal transmission problems, so that the utility has incomplete records of some trajectories. This partial non-response can probably be handled by adapting classical non-response techniques (Haziza 2009) to the functional context. A fundamental question is then how to construct good estimators of the covariance function.

Acknowledgements

We wish to thank the anonymous referees as well as Guillaume Chauvet and Jean-Claude Deville for their helpful comments, which led to improvements in this study.

Bibliography

Bickel, P., and Krieger, A. (1989). Confidence bands for a distribution function using the bootstrap. Journal of the American Statistical Association, 84, 95-100.

Booth, J., Butler, R. and Hall, P. (1994). Bootstrap methods for finite populations. Journal of the American Statistical Association, 89, 1282-1289.

Canty, A.J., and Davison, A.C. (1999). Resampling-based variance estimation for labour force surveys. The Statistician, 48, 379-391.

Cardot, H., Chaouch, M., Goga, C. and Labruère, C. (2010). Properties of design-based functional principal components analysis. Journal of Statistical Planning and Inference, 140, 75-91.

Cardot, H., Degras, D. and Josserand, E. (2013). Confidence bands for Horvitz-Thompson estimators using sampled noisy functional data. Bernoulli, 19, 2067-2097.

Cardot, H., Goga, C. and Lardin, P. (2013). Uniform convergence and asymptotic confidence bands for model-assisted estimators of the mean of sampled functional data. Electronic Journal of Statistics, 7, 562-596.

Cardot, H., and Josserand, E. (2011). Horvitz-Thompson estimators for functional data: Asymptotic confidence bands and optimal allocation for stratified sampling. Biometrika, 98, 107-118.

Chaouch, M., and Goga, C. (2012). Using complex surveys to estimate the L1-median of a functional variable: Application to electricity load curves. International Statistical Review, 80, 40-59.

Chauvet, G. (2007). Méthodes de bootstrap en population finie. PhD thesis, Université de Rennes II.

Chauvet, G., and Tillé, Y. (2006). A fast algorithm of balanced sampling. Computational Statistics, 21, 53-61.

Cochran, W. (1977). Sampling techniques. New York: John Wiley & Sons, Inc., 3rd Edition.

Cuevas, A., Febrero, M. and Fraiman, R. (2006). On the use of the bootstrap for estimating functions with functional data. Computational Statistics and Data Analysis, 51, 1063-1074.

Dauxois, J., and Pousse, A. (1976). Les analyses factorielles en calcul des probabilités et en statistique : essai d'étude synthétique. PhD thesis, Université Paul Sabatier, Toulouse.

Degras, D. (2011). Simultaneous confidence bands for parametric regression with functional data. Statistica Sinica, 21(4), 1735-1765.

Dessertaine, A. (2008). Estimation de courbes de consommation électrique à partir des mesures synchrones. In Méthodes de Sondages (Eds., P. Guibert, D. Haziza, A. Ruiz-Gazen and Y. Tillé), Dunod, France, 353-357.

Deville, J. (1974). Méthodes statistiques et numériques de l'analyse harmonique. Ann. Insee, 15, 3-104.

Deville, J., and Tillé, Y. (2004). Efficient balanced sampling: The cube algorithm. Biometrika, 91, 893-912.

Deville, J., and Tillé, Y. (2005). Variance approximation under balanced sampling. Journal of Statistical Planning and Inference, 128, 569-591.

Faraway, J. (1997). Regression analysis for a functional response. Technometrics, 39(3), 254-261.

Ferraty, F., and Romain, Y. (Eds.) (2011). Oxford Handbook of Functional Data Analysis. Oxford University Press.

Goga, C., and Ruiz-Gazen, A. (2013). Efficient estimation of nonlinear finite population parameters using nonparametrics. To appear in the Journal of the Royal Statistical Society, Series B. DOI: 10.1111/rssb.12024.

Gross, S. (1980). Median estimation in sample surveys. ASA Proceedings of Survey Research, 181-184.

Hájek, J. (1964). Asymptotic theory of rejective sampling with varying probabilities from a finite population. Annals of Mathematical Statistics, 35, 1491-1523.

Haziza, D. (2009). Imputation and inference in the presence of missing data. In Sample Surveys: Theory Methods and Inference (Eds., C. Rao and D. Pfeffermann), Handbook of Statistics, 29, North-Holland, 215-246.

Madow, W. (1949). On the theory of systematic sampling, II. Annals of Mathematical Statistics, 19, 535-545.

Ramsay, J., and Silverman, B. (2005). Functional data analysis. New York: Springer, 2nd Edition.

Rao, J., and Wu, C. (1988). Resampling inference with complex data. Journal of the American Statistical Association, 83, 231-241.

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model assisted survey sampling. Springer.

Sen, A. (1953). On the estimate of the variance in sampling with varying probabilities. Journal of the Indian Society of Agricultural Statistics, 5, 119-127.

Sitter, R. R. (1992). A resampling procedure for complex survey data. Journal of the American Statistical Association, 87, 755-765.

Tillé, Y. (2011). Ten years of balanced sampling with the cube method: An appraisal. Survey Methodology, 37, 215-226.

Yates, F., and Grundy, P. (1953). Selection without replacement from within strata with probability proportional to size. Journal of the Royal Statistical Society, B, 15, 235-261.
