Survey Methodology
Release date: May 7, 2019
This issue of the journal Survey Methodology is a special collaboration with the International Statistical Review in honour of Prof. J.N.K. Rao's contributions.
Contemporary theory and practice of survey sampling: A celebration of research contributions of J.N.K. Rao
J.N.K. Rao is a Distinguished Research Professor in the School of Mathematics and Statistics at Carleton University, Canada. He is the world’s leading researcher in the area of survey methodology and has profoundly influenced the field of sample surveys as used by government agencies and other organizations and businesses. Professor Rao received an MA from Bombay University in 1956 and a Ph.D. from Iowa State University in 1961. For more than 50 years, he has been a driving force in the development of unequal probability sampling methods, small sample approximations, analysis of complex survey data, empirical likelihood based inferences, variance estimation techniques and re-sampling methods, and missing data solutions with sound design-based properties. His abiding effort in meeting real world needs led to another prolific area of his research on small area estimation, highlighted by his book Small Area Estimation (1st edition in 2003 and 2nd edition with Molina in 2015) published by Wiley.
In addition to his phenomenal research impact, Professor Rao has had a significant influence on official statistics agencies through his participation on advisory boards and panels, and his role as advisor and consultant. He has also inspired several generations of survey statisticians through his teaching, mentoring and research collaboration. In particular, he mentored many Chinese statisticians who have become top researchers in Chinese universities.
During his remarkable and continuing academic career, Professor Rao has been honoured with an array of prestigious academic awards, including the Gold Medal of the Statistical Society of Canada (1993), the Annual Morris Hansen Lecture (1998), the Waksberg Award (2005), the inaugural SAE Award (2017), and Honorary Doctorates from University of Waterloo, Canada (2008) and Catholic University of Sacred Heart, Italy (2013). He is a Fellow of the American Statistical Association (1964), the American Association for the Advancement of Science (1965), and the Institute of Mathematical Statistics (1972). He was elected Fellow of the Royal Society of Canada in 1991.
On the occasion of Professor Rao’s 80th Birthday, the Big Data Institute and the School of Mathematics and Statistics at Yunnan University, China, hosted a conference (May 24-27, 2017) celebrating Professor Rao’s research contributions. Professor Jiahua Chen, the Director of the Big Data Institute and a long-time research collaborator of Professor Rao, was the Chair of the Organizing Committee. The conference brought together a distinguished group of researchers from many countries and presented a world-class scientific program on contemporary theory and practice in survey sampling.
In honour of Professor Rao’s contributions, The International Statistical Review and Survey Methodology have agreed to publish joint special issues of papers presented at the conference. The special issue of the International Statistical Review features 15 papers. The first paper is a specially invited submission from Professor Rao on “My Chancy Life as a Statistician”, which provides a brief account, with amazing anecdotes, of his personal and research journey from India first to the United States and then to Canada. This paper is also reproduced in the Survey Methodology special issue. The remaining 14 papers in the International Statistical Review special issue are from all plenary speakers at the conference, covering diverse topics that reflect the current state of the art in survey sampling research. The Survey Methodology special issue contains 8 papers, a subset of the remaining papers presented at the conference.
The joint special issues would not have been possible without the unconditional support of the Guest co-editors of the International Statistical Review, Drs. Ray Chambers and Nalini Ravishanker, and the Editor of Survey Methodology, Wesley Yung. We would also like to take this opportunity to thank the sponsors of the conference, the Canadian Statistical Sciences Institute (CANSSI), the International Association of Survey Statisticians (IASS) of the International Statistical Institute (ISI), the International Chinese Statistical Association (ICSA), the International India Statistical Association (IISA), the Statistical Society of Canada (SSC), and Yunnan University, for their support.
Jiahua Chen, Yunnan University and University of British Columbia
Changbao Wu, University of Waterloo
Guest co-editors for the International Statistical Review Special Issue
Song Cai, Carleton University
Mahmoud Torabi, University of Manitoba
Guest co-editors for the Survey Methodology Special Issue
Special contribution
My chancy life as a Statistician
by J.N.K. Rao
In this short article, I will attempt to provide some highlights of my chancy life as a Statistician in chronological order spanning over sixty years, 1954 to present.
Invited papers
Bayesian small area demography
by Junni L. Zhang, John Bryant and Kirsten Nissen
Demographers are facing increasing pressure to disaggregate their estimates and forecasts by characteristics such as region, ethnicity, and income. Traditional demographic methods were designed for large samples, and perform poorly with disaggregated data. Methods based on formal Bayesian statistical models offer better performance. We illustrate with examples from a long-term project to develop Bayesian approaches to demographic estimation and forecasting. In our first example, we estimate mortality rates disaggregated by age and sex for a small population. In our second example, we simultaneously estimate and forecast obesity prevalence disaggregated by age. We conclude by addressing two traditional objections to the use of Bayesian methods in statistical agencies.
Small area estimation of survey weighted counts under aggregated level spatial model
by Hukum Chandra, Ray Chambers and Nicola Salvati
The empirical predictor under an area level version of the generalized linear mixed model (GLMM) is extensively used in small area estimation (SAE) for counts. However, this approach does not use the sampling weights or clustering information that are essential for valid inference given the informative samples produced by modern complex survey designs. This paper describes an SAE method that incorporates this sampling information when estimating small area proportions or counts under an area level version of the GLMM. The approach is further extended under a spatial dependent version of the GLMM (SGLMM). The mean squared error (MSE) estimation for this method is also discussed. This SAE method is then applied to estimate the extent of household poverty in different districts of the rural part of the state of Uttar Pradesh in India by linking data from the 2011-12 Household Consumer Expenditure Survey collected by the National Sample Survey Office (NSSO) of India, and the 2011 Indian Population Census. Results from this application indicate a substantial gain in precision for the new methods compared to the direct survey estimates.
Measurement error in small area estimation: Functional versus structural versus naïve models
by William R. Bell, Hee Cheol Chung, Gauri S. Datta and Carolina Franco
Small area estimation using area-level models can sometimes benefit from covariates that are observed subject to random errors, such as covariates that are themselves estimates drawn from another survey. Given estimates of the variances of these measurement (sampling) errors for each small area, one can account for the uncertainty in such covariates using measurement error models (e.g., Ybarra and Lohr, 2008). Two types of area-level measurement error models have been examined in the small area estimation literature. The functional measurement error model assumes that the underlying true values of the covariates with measurement error are fixed but unknown quantities. The structural measurement error model assumes that these true values follow a model, leading to a multivariate model for the covariates observed with error and the original dependent variable. We compare and contrast these two models with the alternative of simply ignoring measurement error when it is present (naïve model), exploring the consequences for prediction mean squared errors of use of an incorrect model under different underlying assumptions about the true model. Comparisons done using analytic formulas for the mean squared errors assuming model parameters are known yield some surprising results. We also illustrate results with a model fitted to data from the U.S. Census Bureau’s Small Area Income and Poverty Estimates (SAIPE) Program.
Small area quantile estimation via spline regression and empirical likelihood
by Zhanshou Chen, Jiahua Chen and Qiong Zhang
This paper studies small area quantile estimation under a unit level non-parametric nested-error regression model. We assume the small area specific error distributions satisfy a semi-parametric density ratio model. We fit the non-parametric model via the penalized spline regression method of Opsomer, Claeskens, Ranalli, Kauermann and Breidt (2008). Empirical likelihood is then applied to estimate the parameters in the density ratio model based on the residuals. This leads to natural area-specific estimates of error distributions. A kernel method is then applied to obtain smoothed error distribution estimates. These estimates are then used for quantile estimation in two situations: one is where we only have knowledge of covariate power means at the population level, the other is where we have covariate values of all sample units in the population. Simulation experiments indicate that the proposed methods for small area quantile estimation work well for quantiles around the median in the first situation, and for a broad range of quantiles in the second situation. A bootstrap mean square error estimator of the proposed estimators is also investigated. An empirical example based on Canadian income data is included.
Development of a small area estimation system at Statistics Canada
by Michel A. Hidiroglou, Jean-François Beaumont and Wesley Yung
The demand for small area estimates by users of Statistics Canada’s data has been steadily increasing over recent years. In this paper, we provide a summary of procedures that have been incorporated into a SAS based production system for producing official small area estimates at Statistics Canada. This system includes: procedures based on unit or area level models; the incorporation of the sampling design; the ability to smooth the design variance for each small area if an area level model is used; the ability to ensure that the small area estimates add up to reliable higher level estimates; and the development of diagnostic tools to test the adequacy of the model. The production system has been used to produce small area estimates on an experimental basis for several surveys at Statistics Canada that include: the estimation of health characteristics, the estimation of under-coverage in the census, the estimation of manufacturing sales and the estimation of unemployment rates and employment counts for the Labour Force Survey. Some of the diagnostics implemented in the system are illustrated using Labour Force Survey data along with administrative auxiliary data.
Weighted censored quantile regression
by Chithran Vasudevan, Asokan Mulayath Variyath and Zhaozhi Fan
In this paper, we make use of auxiliary information to improve the efficiency of the estimates of the censored quantile regression parameters. Utilizing the information available from previous studies, we compute empirical likelihood probabilities as weights and propose weighted censored quantile regression. Theoretical properties of the proposed method are derived. Our simulation studies show that the proposed method has advantages compared to standard censored quantile regression.
Empirical likelihood inference for missing survey data under unequal probability sampling
by Song Cai and J.N.K. Rao
Item nonresponse is frequently encountered in sample surveys. Hot-deck imputation is commonly used to fill in missing item values within homogeneous groups called imputation classes. We propose a fractional hot-deck imputation procedure and an associated empirical likelihood for inference on the population mean of a function of a variable of interest with missing data under probability proportional to size sampling with negligible sampling fractions. We derive the limiting distributions of the maximum empirical likelihood estimator and empirical likelihood ratio, and propose two related asymptotically valid bootstrap procedures to construct confidence intervals for the population mean. Simulation studies show that the proposed bootstrap procedures outperform the customary bootstrap procedures which are shown to be asymptotically incorrect when the number of random draws in the fractional imputation is fixed. Moreover, the proposed bootstrap procedure based on the empirical likelihood ratio is seen to perform significantly better than the method based on the limiting distribution of the maximum empirical likelihood estimator when the inclusion probabilities vary considerably or when the sample size is not large.
Improved Horvitz-Thompson estimator in survey sampling
by Xianpeng Zong, Rong Zhu and Guohua Zou
The Horvitz-Thompson (HT) estimator is widely used in survey sampling. However, the variance of the HT estimator becomes large when the inclusion probabilities are highly heterogeneous. To overcome this shortcoming, in this paper we propose a hard-threshold method for the first-order inclusion probabilities. Specifically, we carefully choose a threshold value, then replace the inclusion probabilities smaller than the threshold by the threshold. Through this shrinkage strategy, we construct a new estimator called the improved Horvitz-Thompson (IHT) estimator to estimate the population total. The IHT estimator substantially improves estimation accuracy, although it introduces a relatively small bias. We derive the IHT estimator’s mean squared error and its unbiased estimator, and theoretically compare the IHT estimator with the HT estimator. We also apply our idea to construct an improved ratio estimator. We numerically analyze simulated and real data sets to illustrate that the proposed estimators are more efficient and robust than the classical estimators.
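The thresholding idea in this abstract can be illustrated with a minimal sketch. The function names `ht_total` and `iht_total` and the example threshold are hypothetical, chosen for illustration only. The abstract does not say how the authors choose the threshold, so here it is simply passed in as a parameter.

```python
from typing import Sequence


def ht_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Classical Horvitz-Thompson estimate of a population total.

    y  : observed values for the sampled units
    pi : first-order inclusion probabilities of those units
    """
    return sum(yi / p for yi, p in zip(y, pi))


def iht_total(y: Sequence[float], pi: Sequence[float], threshold: float) -> float:
    """Hard-threshold variant: probabilities below `threshold` are
    replaced by `threshold` before weighting.

    The threshold is an assumed input here; the paper's selection rule
    is not described in the abstract.
    """
    return sum(yi / max(p, threshold) for yi, p in zip(y, pi))


if __name__ == "__main__":
    # Toy sample with one very small inclusion probability.
    y = [10.0, 20.0, 30.0]
    pi = [0.5, 0.25, 0.01]
    print("HT :", ht_total(y, pi))
    print("IHT:", iht_total(y, pi, threshold=0.1))
```

In the toy sample, the unit with inclusion probability 0.01 dominates the HT estimate through its weight of 100. Capping the weight at 1/threshold reduces that influence, at the cost of the bias the abstract mentions.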