Small area estimation using Fay-Herriot area level model with sampling variance smoothing and modeling
Section 4. Application

In this section, we apply the models in Sections 2 and 3 to the Canadian Labour Force Survey (LFS) data and compare the EBLUP and HB estimates. The LFS releases monthly unemployment rate estimates for large areas, such as the nation and the provinces, as well as for local areas such as Census Metropolitan Areas (CMAs) and Census Agglomerations (CAs) across Canada. The direct LFS estimates for some local areas are not reliable: they exhibit very large coefficients of variation (CVs) due to small sample sizes. Model-based estimators are therefore considered to improve the direct LFS estimates. As an illustration, we apply the Fay-Herriot model to the May 2016 unemployment rate estimates at the CMA/CA level, and compare the model-based estimates and the direct estimates with the census estimates to assess the effects of sampling variance smoothing and modeling. Hidiroglou et al. (2019) also compared the model-based LFS estimates with the census estimates. For the unemployment rate estimation, the local area employment insurance monthly beneficiary rate is used as an auxiliary variable in the model. For comparison of point estimates, we compute the absolute relative error (ARE) of the direct and model-based estimates with respect to the census estimates for each CMA/CA as follows:

$\text{ARE}_{i} = \left| \frac{θ_{i}^{Census} - θ_{i}^{Est}}{θ_{i}^{Census}} \right|,$

where $θ_{i}^{Est}$ is the direct or the EBLUP/HB estimate and $θ_{i}^{Census}$ is the corresponding census value of the unemployment rate. We then average the AREs over the CMA/CAs. For CV, we likewise compute the average CV of the direct and model-based estimates over the CMA/CAs. We prefer a model with smaller average ARE and smaller average CV.
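The ARE computation and its average over areas can be sketched as follows. This is a minimal illustration only: the function name `absolute_relative_error` and the three-area data are hypothetical and are not taken from the LFS or the census.

```python
from statistics import mean

def absolute_relative_error(census, estimate):
    """ARE_i = |(census - estimate) / census| for a single area."""
    if census == 0:
        raise ValueError("census value must be non-zero")
    return abs((census - estimate) / census)

# Hypothetical unemployment rates (in percent) for three areas; not LFS data.
census = [6.0, 8.0, 5.0]
direct = [7.5, 6.0, 5.5]
model = [6.6, 7.2, 5.2]

# Per-area AREs of each estimator with respect to the census values.
are_direct = [absolute_relative_error(c, e) for c, e in zip(census, direct)]
are_model = [absolute_relative_error(c, e) for c, e in zip(census, model)]

# Average ARE over the areas, the summary used for comparison.
avg_are_direct = mean(are_direct)
avg_are_model = mean(are_model)
print(f"average ARE, direct: {avg_are_direct:.3f}")
print(f"average ARE, model:  {avg_are_model:.3f}")
```

In this toy example the model-based estimates sit closer to the census values, so their average ARE is smaller, which is the pattern the comparison criterion rewards.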

We first apply the models to all 117 CMA/CAs with sample size $\geq 2$, then to the 92 CMA/CAs with sample size $\geq 5$, and finally to the 79 CMA/CAs with sample size $\geq 7$. Table 4.1 presents the average ARE and the corresponding average CV (in parentheses). In Table 4.1, Smoothed sv indicates that a smoothed sampling variance is used in the model, and Direct sv indicates that a direct sampling variance estimate is used.
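The nested grouping by minimum sample size can be sketched as below. The records, the area labels and the helper `summarize` are hypothetical; only the thresholds 2, 5 and 7 come from the text.

```python
from statistics import mean

# Hypothetical records: (area label, sample size, ARE, CV); not LFS data.
areas = [
    ("A", 2, 0.40, 0.50),
    ("B", 4, 0.30, 0.35),
    ("C", 5, 0.20, 0.25),
    ("D", 6, 0.15, 0.20),
    ("E", 9, 0.10, 0.15),
]

def summarize(records, min_size):
    """Return (number of areas, average ARE, average CV) for sample size >= min_size."""
    kept = [r for r in records if r[1] >= min_size]
    return len(kept), mean(r[2] for r in kept), mean(r[3] for r in kept)

# The three nested groups used in the comparison.
summary = {m: summarize(areas, m) for m in (2, 5, 7)}
for m, (count, avg_are, avg_cv) in summary.items():
    print(f"sample size >= {m}: {count} areas, ARE {avg_are:.3f} (CV {avg_cv:.3f})")
```

Because the groups are nested, each higher threshold drops the areas with the smallest samples, which is why the averages for the direct estimator typically fall as the threshold rises.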

With Smoothed sv, both FH-EBLUP and FH-HB substantially improve on the direct survey estimates, with much smaller ARE and CV. In particular, FH-HB has the smallest ARE and FH-EBLUP has the smallest CV. For example, over the 117 areas, the direct LFS estimator has ARE 0.263 with average CV 0.329, FH-EBLUP with Smoothed sv has ARE 0.124 with average CV 0.087, and FH-HB with Smoothed sv has ARE 0.118 with average CV 0.116. The good performance of FH-EBLUP and FH-HB with Smoothed sv indicates that the smoothing GVF (2.2) is effective in improving the model-based estimates.
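To put the size of the improvement in perspective, the figures quoted above for the 117 areas can be turned into relative reductions. The relative reduction is a derived summary introduced here for illustration, not a quantity reported by the authors, and the helper `relative_reduction` is a hypothetical name.

```python
# Values reported in Table 4.1 for the 117 CMA/CAs: (average ARE, average CV).
direct = (0.263, 0.329)
smoothed = {
    "FH-EBLUP": (0.124, 0.087),
    "FH-HB": (0.118, 0.116),
}

def relative_reduction(baseline, value):
    """Share of the baseline (direct) value removed by the model-based estimator."""
    return (baseline - value) / baseline

# For each Smoothed sv model: (reduction in ARE, reduction in CV) versus direct.
reductions = {
    name: (relative_reduction(direct[0], are), relative_reduction(direct[1], cv))
    for name, (are, cv) in smoothed.items()
}
for name, (are_red, cv_red) in reductions.items():
    print(f"{name}: ARE reduced by {are_red:.1%}, CV reduced by {cv_red:.1%}")
```

Both Smoothed sv models cut the average ARE of the direct estimator by more than half, and the reduction in average CV is larger still.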

With Direct sv, FH-EBLUP and FH-HB perform the worst among the model-based estimators, and the two give almost identical results. The other three HB models perform better than FH-EBLUP and FH-HB using Direct sv. YLLM and STKM perform better than YCM, with smaller ARE and smaller CV. YLLM and STKM perform very similarly for all the CMA/CA groups: YLLM consistently has slightly smaller ARE than STKM but slightly larger CV. For example, over the 117 areas, YLLM has ARE 0.135 with average CV 0.123, and STKM has ARE 0.137 with average CV 0.122. YCM has ARE 0.148 with average CV 0.136, and FH-HB has ARE 0.171 with average CV 0.221.

Table 4.1
Comparison of average absolute relative error (ARE), with average CV in parentheses
CMA/CAs	Direct	FH-EBLUP	FH-HB	FH-EBLUP	FH-HB	YCM	YLLM	STKM
	LFS	Smoothed sv	Smoothed sv	Direct sv	Direct sv	Direct sv	Direct sv	Direct sv
Average over 117 CMA/CAs	0.263	0.124	0.118	0.170	0.171	0.148	0.135	0.137
(sample size $\geq 2$ )	(0.329)	(0.087)	(0.116)	(0.238)	(0.221)	(0.136)	(0.123)	(0.122)
Average over 92 CMA/CAs	0.216	0.124	0.116	0.133	0.132	0.132	0.125	0.127
(sample size $\geq 5$ )	(0.262)	(0.076)	(0.103)	(0.123)	(0.123)	(0.121)	(0.117)	(0.116)
Average over 79 CMA/CAs	0.181	0.122	0.113	0.126	0.122	0.122	0.118	0.120
(sample size $\geq 7$ )	(0.232)	(0.057)	(0.094)	(0.115)	(0.115)	(0.115)	(0.114)	(0.113)

We now present a Bayesian model comparison using the conditional predictive ordinate (CPO) for the four HB models with Direct sv. CPOs are the observed likelihoods based on the cross-validation predictive distribution $f (y_{i} | y_{obs (i)})$. We compute the CPO value for each observed data point $y_{i, obs}$; a larger CPO indicates that $y_{i, obs}$ supports the model, and hence a better model fit. For model choice, we can compute the CPO ratio of model A against model B. If this ratio is greater than 1, then $y_{i, obs}$ supports model A. We compute the CPO ratios for YCM/FH-HB, YLLM/FH-HB and STKM/FH-HB, and count the number of CPO ratios that are larger than 1. We can also plot the CPO values or summarize them by taking the average of the estimated CPOs. For more detail on CPO, see, for example, Gilks, Richardson and Spiegelhalter (1996), page 153, You and Rao (2000), and Molina, Nandram and Rao (2014). Table 4.2 presents the CPO mean and median values over the 117 CMA/CAs and the number of CPO ratios larger than 1.
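This section does not state how the CPO values were computed. One commonly used Monte Carlo estimator takes the harmonic mean of the likelihood $f (y_{i} | \cdot)$ evaluated at the posterior draws, and the sketch below assumes that estimator together with a normal sampling density. The function names, the posterior draws and the observed value are all hypothetical and do not reproduce the LFS analysis.

```python
import math
import random

def normal_pdf(y, mean, variance):
    """Density of N(mean, variance) evaluated at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / variance) / math.sqrt(2 * math.pi * variance)

def cpo_harmonic_mean(y, theta_draws, variance_draws):
    """Estimate CPO_i as the harmonic mean of f(y_i | draw) over posterior draws."""
    inverse_likelihoods = [
        1.0 / normal_pdf(y, t, v) for t, v in zip(theta_draws, variance_draws)
    ]
    return len(inverse_likelihoods) / sum(inverse_likelihoods)

def count_ratios_above_one(cpo_a, cpo_b):
    """Number of areas whose CPO ratio (model A over model B) exceeds 1."""
    return sum(1 for a, b in zip(cpo_a, cpo_b) if a / b > 1)

# Hypothetical posterior draws for one area under two models; not LFS output.
rng = random.Random(0)
y_obs = 7.0
draws_a = [rng.gauss(7.1, 0.5) for _ in range(2000)]  # model A: centred near y_obs
draws_b = [rng.gauss(9.0, 0.5) for _ in range(2000)]  # model B: centred further away
variances = [4.0] * 2000  # sampling variance held fixed, as when it is treated as known

cpo_a = cpo_harmonic_mean(y_obs, draws_a, variances)
cpo_b = cpo_harmonic_mean(y_obs, draws_b, variances)
print(f"CPO model A: {cpo_a:.4f}, CPO model B: {cpo_b:.4f}, ratio: {cpo_a / cpo_b:.2f}")
print("areas supporting model A:", count_ratios_above_one([cpo_a], [cpo_b]))
```

The `variance_draws` argument is kept separate so that the same function would also cover models in which the sampling variance is itself sampled; in that case each draw of the mean is paired with its own draw of the variance.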

Table 4.2
Summary of CPO values and CPO ratios over 117 CMA/CAs
	Direct sv	Direct sv	Direct sv	Direct sv
	FH-HB	YCM	YLLM	STKM
CPO Mean	0.1053	0.1222	0.1242	0.1238
CPO Median	0.0976	0.1004	0.1045	0.1051
Number of CPO ratios > 1	-	72	78	76

It is clear from Table 4.2 that YCM, YLLM and STKM have larger CPO values than FH-HB, which indicates that an HB model with sampling variance modeling is preferred when the direct sampling variance estimates are used, and that YLLM and STKM are better than YCM. For the CPO ratios, among the 117 areas, 72 areas/observations support YCM, 78 support YLLM and 76 support STKM over FH-HB. Therefore a majority of the observations support each of YCM, YLLM and STKM over FH-HB, and YLLM has the largest number of CPO ratios larger than 1. The CPO comparison is consistent with the results reported in Table 4.1. For other model checking and evaluation methods, see Hidiroglou et al. (2019).

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word to the Editor, (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued co-operation and goodwill.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner. To this end, the Agency has developed standards of service which its employees observe in serving its clients.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: Semi-annual

Ottawa

Date modified: 2022-01-06
