Small area estimation using Fay-Herriot area level model with sampling variance smoothing and modeling
Section 1. Introduction
Small area estimation is popular and important in survey
data analysis. Model-based estimates have been widely used in practice to
provide reliable estimates for small areas. In practice, area level models are
usually used whenever direct survey estimates and area level auxiliary
variables are available. Various area level models have been proposed to
improve the precision of the direct survey estimates; see Rao and Molina
(2015). Among the area level models, the Fay-Herriot model (Fay and Herriot,
1979) is a basic area level model widely used in small area estimation. The
Fay-Herriot model has two components, namely, a sampling model for the direct
survey estimates and a linking model for the small area parameter of interest.
The sampling model assumes that a direct survey estimator $y_i$ is design unbiased for the small area
parameter $\theta_i$ such that
$$y_i = \theta_i + e_i, \quad i = 1, \ldots, m, \tag{1.1}$$
where $e_i$ is the
sampling error associated with the direct estimator $y_i$ and $m$ is the number of small areas. It is customary
to assume that the $e_i$ are
independently normal random variables with mean $E(e_i \mid \theta_i) = 0$ and
sampling variance $\mathrm{Var}(e_i \mid \theta_i) = \sigma_i^2$. The linking model assumes that the small area
parameter $\theta_i$ is
related to auxiliary variables $x_i = (x_{i1}, \ldots, x_{ip})'$ through
a linear regression model given as
$$\theta_i = x_i'\beta + v_i, \quad i = 1, \ldots, m, \tag{1.2}$$
where $\beta = (\beta_1, \ldots, \beta_p)'$ is a $p \times 1$ vector
of regression coefficients, and the $v_i$ are
area-specific random effects assumed to be independent and identically
distributed with $E(v_i) = 0$ and $\mathrm{Var}(v_i) = \sigma_v^2$. The assumption of normality is generally
included. Random effects $v_i$ and
sampling errors $e_i$ are mutually independent. The model variance $\sigma_v^2$ is
unknown and needs to be estimated. Combining models (1.1) and (1.2) leads to a
linear mixed model given as
$$y_i = x_i'\beta + v_i + e_i, \quad i = 1, \ldots, m. \tag{1.3}$$
Model (1.3) involves both design-based random errors $e_i$ and model-based random effects $v_i$. For the Fay-Herriot model, the sampling
variance $\sigma_i^2$ is usually assumed to be known. This is a very
strong assumption. In practice, unbiased direct estimates of the sampling
variances are generally available. To make use of the direct sampling variance
estimates, two approaches are available in practice, namely, smoothing and
modeling. For the smoothing approach, smoothed estimates of the sampling
variances are used in the Fay-Herriot model and then treated as known. The
smoothing approach requires external variables and external models, such as a
generalized variance function (GVF) or design effects. You and
Hidiroglou (2012) particularly studied the GVF and design effects methods for
sampling variance smoothing for proportions. In this paper, we will use a GVF
model proposed in You and Hidiroglou (2012) for the sampling variance
smoothing.
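To make the smoothing idea concrete, the following Python sketch smooths noisy direct variance estimates with a GVF. The exact GVF specification is not stated in this section, so the sketch assumes a log-linear form, $\log(s_i^2) = \beta_0 + \beta_1 \log(n_i) + \varepsilon_i$, where $s_i^2$ denotes a direct variance estimate and $n_i$ the area sample size. The function names `fit_line` and `gvf_smooth`, the toy data, and the lognormal back-transformation factor $\exp(\hat{\tau}^2/2)$ are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1


def gvf_smooth(s2, n):
    """Smooth direct variance estimates s2 with a log-linear GVF on sample size n.

    Assumed GVF form (not taken from the text):
        log(s2_i) = b0 + b1 * log(n_i) + eps_i
    """
    log_n = [math.log(ni) for ni in n]
    log_s2 = [math.log(v) for v in s2]
    b0, b1 = fit_line(log_n, log_s2)
    resid = [ls - (b0 + b1 * ln) for ls, ln in zip(log_s2, log_n)]
    tau2 = sum(r * r for r in resid) / (len(resid) - 2)  # residual variance
    # exp(tau2 / 2) is the usual lognormal back-transformation correction
    # (an assumption of this sketch).
    smoothed = [math.exp(b0 + b1 * ln + tau2 / 2.0) for ln in log_n]
    return smoothed, (b0, b1, tau2)


# Hypothetical toy data: true sampling variance proportional to 1 / n_i,
# with noisy direct estimates drawn around it.
rng = random.Random(2022)
m = 200
n = [rng.randint(5, 60) for _ in range(m)]
sigma2 = [4.0 / ni for ni in n]
s2 = [
    sig * rng.gammavariate((ni - 1) / 2.0, 2.0) / (ni - 1)
    for sig, ni in zip(sigma2, n)
]

smoothed, (b0, b1, tau2) = gvf_smooth(s2, n)
```

In this toy setting the smoothed values depend on the area only through $n_i$, so they vary far less than the direct estimates. That reduced noise is the property that motivates treating the smoothed variances as known in the Fay-Herriot model.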
As an alternative to smoothing, sampling variance
modeling is also commonly used in practice. Let $s_i^2$ denote the direct estimator for the sampling
variance $\sigma_i^2$. We consider a custom model for $s_i^2$ as $d_i s_i^2 \sim \sigma_i^2 \chi^2_{d_i}$, where $d_i = n_i - 1$ and $n_i$ is the sample size for the $i$-th area. Rivest and Vandal (2002) and Wang and
Fuller (2003) used the empirical best linear unbiased prediction (EBLUP) method to
obtain the model-based estimates. You and Chapman (2006) considered a
hierarchical Bayes (HB) approach and combined the sampling variance model with the small area model (1.3) to construct
an integrated model. The integrated model borrows strength for small area estimates and
sampling variance estimates simultaneously. The integrated HB modeling approach
with the sampling variance model has thus been widely used in practice; see, for example, You
(2008, 2016), Dass, Maiti, Ren and Sinha (2012), Sugasawa, Tamae and
Kubokawa (2017), Ghosh, Myung and Moura (2018), and Hidiroglou, Beaumont and
Yung (2019).
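As an illustration of what the integrated model assumes, the Python sketch below draws data from the three pieces together: the linking model (1.2), the sampling model (1.1), and a chi-square model for the direct variance estimates of the form $d_i s_i^2 \sim \sigma_i^2 \chi^2_{d_i}$ with $d_i = n_i - 1$. This is only a data-generating sketch, not an HB fitting procedure. The function name `simulate_integrated_model` and all parameter values are hypothetical choices made for the example.

```python
import random


def simulate_integrated_model(x, beta, sigma2_v, sigma2, n, rng):
    """Draw (theta, y, s2) per area from the linking, sampling and variance models."""
    theta, y, s2 = [], [], []
    for xi, sig2_i, ni in zip(x, sigma2, n):
        v_i = rng.gauss(0.0, sigma2_v ** 0.5)      # random effect in (1.2)
        theta_i = sum(a * b for a, b in zip(xi, beta)) + v_i
        e_i = rng.gauss(0.0, sig2_i ** 0.5)        # sampling error in (1.1)
        d_i = ni - 1
        chi2 = rng.gammavariate(d_i / 2.0, 2.0)    # chi-square draw with d_i df
        theta.append(theta_i)
        y.append(theta_i + e_i)                    # direct estimate, as in (1.3)
        s2.append(sig2_i * chi2 / d_i)             # d_i * s2_i ~ sigma2_i * chi2
    return theta, y, s2


# Hypothetical settings: equal sample sizes and sampling variances, so that
# simple moment checks across areas are meaningful.
rng = random.Random(12345)
m = 5000
x = [(1.0, rng.uniform(0.0, 1.0)) for _ in range(m)]
beta = (2.0, 1.0)
sigma2_v = 0.5
n = [11] * m
sigma2 = [1.0] * m

theta, y, s2 = simulate_integrated_model(x, beta, sigma2_v, sigma2, n, rng)

# Under the chi-square model, E(s2_i) = sigma2_i and Var(s2_i) = 2 * sigma2_i^2 / d_i.
mean_s2 = sum(s2) / m
var_s2 = sum((v - mean_s2) ** 2 for v in s2) / (m - 1)

# Under (1.3), y_i - x_i' beta = v_i + e_i has variance sigma2_v + sigma2_i.
resid = [yi - sum(a * b for a, b in zip(xi, beta)) for yi, xi in zip(y, x)]
var_resid = sum(r * r for r in resid) / m
```

With these settings `mean_s2` should be close to 1, `var_s2` close to $2/10 = 0.2$, and `var_resid` close to $0.5 + 1 = 1.5$. The spread of the $s_i^2$ around $\sigma_i^2$ is sizeable when $d_i$ is small, which is why the direct variance estimates are modeled rather than plugged in as known.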
In this paper, we consider both the smoothing and
modeling approaches for the sampling variances. In Section 2, we present
the EBLUP method based on both the smoothed and direct estimates of the
sampling variances. In Section 3, we present the Fay-Herriot HB model and three
other HB models based on sampling variance modeling. We compare the effects of
sampling variance smoothing and modeling in Section 4 through a real data
analysis, and we offer some suggestions in Section 5.
ISSN : 1492-0921
Editorial policy
Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.
Copyright
Published by authority of the Minister responsible for Statistics Canada.
© His Majesty the King in Right of Canada as represented by the Minister of Industry, 2022
Use of this publication is governed by the Statistics Canada Open Licence Agreement.
Catalogue No. 12-001-X
Frequency: Semi-annual
Ottawa