A short note on quantile and expectile estimation in unequal probability samples

5. Simulations

We run a small simulation study to assess the performance of the expectile-based estimates. We use the Midzuno sampling method (see Midzuno 1952) and define the inclusion probabilities $π_{j}$ proportional to a measure of size $x$; see the R package “sampling” by Tillé and Matei (2015). We examine two data sets that were also used in Kuk (1988). The first data set (Dwellings) contains two variables, the number of dwelling units $(X)$ and the number of rented units $(Y)$, which are highly correlated (correlation 0.97); see also Kish (1965). The second data set (Villages) includes information on the population $(X)$ and on the number of workers in household industry $(Y)$ for 128 villages in India; see Murthy (1967). In the second data set the correlation between $Y$ and $X$ is 0.54. To compare our simulation results with those of Kuk (1988), we choose the same sample size of $n = 30$ (from a total population of $N = 270$ for the Dwellings data and $N = 128$ for the Villages data).
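
To make the sampling step concrete, the following is a minimal sketch of the classical Midzuno scheme in plain Python: the first unit is drawn with probability proportional to its size and the remaining $n - 1$ units by simple random sampling without replacement. This is an illustration under stated assumptions, not the authors' code. The function names and the artificial size measure are hypothetical, and the variant implemented in the R package “sampling” may differ (for instance, it may work from prescribed inclusion probabilities, whereas the classical scheme below yields inclusion probabilities that are affine in $x$ rather than strictly proportional to it).

```python
import random


def midzuno_sample(x, n, rng):
    """Draw one sample of size n with the classical Midzuno scheme."""
    N = len(x)
    # First unit: drawn with probability proportional to its size x_j.
    first = rng.choices(range(N), weights=x, k=1)[0]
    # Remaining n - 1 units: simple random sampling without replacement.
    rest = [j for j in range(N) if j != first]
    return sorted([first] + rng.sample(rest, n - 1))


def midzuno_inclusion_probs(x, n):
    """First-order inclusion probabilities implied by the scheme above:
    pi_j = (N - n)/(N - 1) * x_j / sum(x) + (n - 1)/(N - 1)."""
    N = len(x)
    total = sum(x)
    return [(N - n) / (N - 1) * xj / total + (n - 1) / (N - 1) for xj in x]


rng = random.Random(1)
# Hypothetical size measure for a population of N = 270 units.
x = [rng.uniform(1, 50) for _ in range(270)]
sample = midzuno_sample(x, 30, rng)
pi = midzuno_inclusion_probs(x, 30)
```

As a sanity check, the inclusion probabilities of any fixed-size design sum to the sample size, here $n = 30$.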

We compare quantiles defined by inversion of ${\hat{F}}_{R}$ with quantiles defined by inversion of ${\hat{F}}_{R}^{M}$. Table 5.1 gives the root mean squared error (RMSE) and the relative efficiency for specified quantiles. We note that quantiles derived from expectiles yield increased efficiency for the median in the Villages data and for the median and the upper quantiles in the Dwellings data. The efficiency gain does not hold uniformly, however, as we observe a loss of efficiency for lower quantiles.
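
The two ingredients of this comparison can be sketched in a few lines of Python. The sketch assumes that ${\hat{F}}_{R}$ is the ratio-type (Hájek) estimator of the distribution function with weights $1 / π_{j}$, which this section does not restate, and it uses the standard asymmetric least squares definition of an expectile with the same weights. The function names are hypothetical. The mapping from fitted expectiles to ${\hat{F}}_{R}^{M}$ is defined earlier in the paper and is not reproduced here.

```python
def weighted_quantile(y, pi, alpha):
    """Smallest y whose weighted distribution function value is >= alpha.

    Assumes the ratio-type estimator F(t) = sum_j 1{y_j <= t}/pi_j / sum_j 1/pi_j.
    """
    pairs = sorted(zip(y, pi))
    total = sum(1.0 / p for _, p in pairs)
    cum = 0.0
    for yj, pj in pairs:
        cum += 1.0 / pj
        if cum / total >= alpha:
            return yj
    return pairs[-1][0]


def weighted_expectile(y, pi, alpha, tol=1e-10, max_iter=100):
    """Design-weighted expectile via iteratively reweighted least squares.

    Solves sum_j (1/pi_j) * |alpha - 1{y_j <= m}| * (y_j - m) = 0 for m.
    """
    w = [1.0 / p for p in pi]
    m = sum(wj * yj for wj, yj in zip(w, y)) / sum(w)  # start at weighted mean
    for _ in range(max_iter):
        # Observations above m get weight alpha, the others 1 - alpha.
        a = [wj * (alpha if yj > m else 1.0 - alpha) for wj, yj in zip(w, y)]
        m_new = sum(aj * yj for aj, yj in zip(a, y)) / sum(a)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


# Toy data: three units with unequal inclusion probabilities.
q_median = weighted_quantile([10, 20, 30], [0.5, 0.25, 0.25], 0.5)
e_mean = weighted_expectile([1, 2, 3, 10], [0.5] * 4, 0.5)
```

For $α = 0.5$ the expectile reduces to the weighted mean, which gives a simple check of the iteration.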

Table 5.1
Comparison of root mean squared error based on 500 replications
	$α$	quantiles $\sqrt{MSE ({\hat{Q}}_{R} (α))}$	quantiles from expectiles $\sqrt{MSE ({\hat{Q}}_{R}^{M} (α))}$	relative efficiency $\frac{\sqrt{MSE ({\hat{Q}}_{R}^{M} (α))}}{\sqrt{MSE ({\hat{Q}}_{R} (α))}}$
Dwellings	0.1	2.57	2.76	1.07
	0.25	1.77	1.97	1.11
	0.5	2.45	2.35	0.96
	0.75	3.15	2.91	0.92
	0.9	4.20	3.43	0.82
Villages	0.1	5.52	6.65	1.21
	0.25	11.41	10.31	0.90
	0.5	12.29	11.69	0.95
	0.75	16.24	15.41	0.95
	0.9	13.31	18.34	1.38
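
The last column of Table 5.1 is the ratio of the two RMSE columns, so values below 1 favour the expectile-based quantiles. The short Python sketch below recomputes that ratio from the rounded table entries. The helper `rmse` is a hypothetical name showing how an RMSE would be obtained from replicate estimates; the replicate estimates themselves are not available here, so only the ratios are recomputed. Because the inputs are already rounded to two decimals, a recomputed ratio can differ from the published one in the last digit.

```python
import math


def rmse(estimates, truth):
    """Root mean squared error of replicate estimates around the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


# RMSE values copied from Table 5.1, ordered by alpha = 0.1, 0.25, 0.5, 0.75, 0.9.
rmse_q = {
    "Dwellings": [2.57, 1.77, 2.45, 3.15, 4.20],
    "Villages": [5.52, 11.41, 12.29, 16.24, 13.31],
}
rmse_qm = {
    "Dwellings": [2.76, 1.97, 2.35, 2.91, 3.43],
    "Villages": [6.65, 10.31, 11.69, 15.41, 18.34],
}
reported = {
    "Dwellings": [1.07, 1.11, 0.96, 0.92, 0.82],
    "Villages": [1.21, 0.90, 0.95, 0.95, 1.38],
}

# Relative efficiency: RMSE of expectile-based quantiles over RMSE of quantiles.
recomputed = {
    name: [m / q for q, m in zip(rmse_q[name], rmse_qm[name])]
    for name in rmse_q
}
```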

To obtain more insight we run a simulation scenario with a larger sample size of $n = 100$ selected from populations of sizes $N = 1,000$ and $N = 10,000$. We draw $Y$ and $X$ from a bivariate standard log-normal distribution with $μ = 0$ and $σ = 1$, such that the correlation between the two variables is equal to 0.9. We again calculate the root mean squared error for a range of $α$ values and show the relative efficiency of the expectile-based approach in Figure 5.1. For better visual presentation we show a smoothed version of the relative efficiency. We notice a reduction in the root mean squared error in both cases, $N = 1,000$ and $N = 10,000$. We may conclude that expectiles can easily be fitted in unequal probability sampling and that the relation between expectiles and the distribution function can be used numerically to calculate quantiles with increased efficiency. This efficiency gain holds for upper quantiles only, that is, for $α$ bounded away from zero. Note, however, that under this sampling scheme large values of $Y$ are sampled with higher probability, which reflects that the scheme aims at more reliable estimates for the right-hand side of the distribution function, i.e., for large quantiles. If we are interested in small quantiles we should use a different sampling scheme that gives individuals with small values of $Y$ an increased inclusion probability. In this case the behavior shown in Figure 5.1 would be mirrored with respect to $α$.
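
A population of this kind can be generated as in the following Python sketch. It rests on an assumption that the text leaves open: the target correlation of 0.9 is taken to hold between $X$ and $Y$ themselves, not between their logarithms. For two standard log-normal variables whose logarithms have correlation $ρ$, the correlation is $(e^{ρ} - 1) / (e - 1)$, so the log-scale correlation is set to $ρ = \log (1 + 0.9 (e - 1))$. The function name is hypothetical, and if the authors meant the correlation on the log scale one would set $ρ = 0.9$ directly.

```python
import math
import random


def lognormal_population(N, target_corr, rng):
    """Generate N pairs (x, y) of standard log-normal variables.

    Assumption: target_corr is the correlation between x and y themselves,
    so the log-scale correlation rho solves
    (exp(rho) - 1) / (e - 1) = target_corr.
    """
    rho = math.log(1.0 + target_corr * (math.e - 1.0))
    x, y = [], []
    for _ in range(N):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x.append(math.exp(z1))
        # rho * z1 + sqrt(1 - rho^2) * z2 is standard normal with
        # correlation rho to z1.
        y.append(math.exp(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return x, y, rho


rng = random.Random(2016)
x, y, rho = lognormal_population(1000, 0.9, rng)
```

The empirical correlation of heavy-tailed log-normal draws fluctuates noticeably between populations, so a single generated population will usually not show a correlation of exactly 0.9.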

Figure 5.1 of article 14545

Description of Figure 5.1

The figure consists of two graphs showing the relative root mean squared error of quantiles and of quantiles from expectiles under the probability proportional to size (PPS) design, calculated from 500 repetitions, for $N = 1,000$ and $N = 10,000$. In both graphs, the y-axis shows the ratio of the RMSE of quantiles from $F_{R}^{M}$ to that of quantiles from $F_{R}$, ranging from 0.90 to 1.15, and the x-axis shows $α$, ranging from 0.01 to 0.99. For $N = 1,000$, the ratio is close to 1.15 for small values of $α$ and then decreases to between 0.90 and 0.95 at an $α$ of about 0.25; thereafter it increases slowly overall toward 1.00 as $α$ increases. For $N = 10,000$, the ratio is close to 1.10 for small values of $α$ and then decreases to about 0.95 for $α$ between 0.20 and 0.25; thereafter it increases more quickly overall toward 1.00 as $α$ increases.

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: semi-annual

Ottawa

Date modified: 2016-06-22
