# A short note on quantile and expectile estimation in unequal probability samples

## 1. Introduction

Quantile estimation and quantile regression have seen a number of new developments in recent years, with Koenker (2005) as a central reference. The principal idea is to estimate an inverted cumulative distribution function, generally called the quantile function $Q(\alpha) = F^{-1}(\alpha)$ for $\alpha \in (0,1)$, where the 0.5 quantile $Q(0.5)$, the median, plays a central role. For survey data arising from an unequal probability sample with known inclusion probabilities, Kuk (1988) shows how to estimate quantiles taking the inclusion probabilities into account. The central idea is to estimate the distribution function of the variable of interest and to invert it in a second step to obtain the quantile function. Chambers and Dunstan (1986) propose a model-based estimator of the distribution function. Rao, Kovar and Mantel (1990) propose a design-based estimator of the cumulative distribution function using auxiliary information. Bayesian approaches in this direction have recently been proposed in Chen, Elliott and Little (2010) and Chen, Elliott and Little (2012).
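The two-step idea described above can be sketched in a few lines of code: first estimate the cumulative distribution function with design weights (inverse inclusion probabilities, a Hájek-type estimator), then invert it at level $\alpha$. The function name `weighted_quantile` and the implementation details are illustrative assumptions, not the estimator of any of the cited papers.

```python
import numpy as np

def weighted_quantile(y, pi, alpha):
    """Illustrative Hajek-type quantile estimate: build the
    design-weighted empirical CDF from inclusion probabilities
    pi and invert it at level alpha."""
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(pi, dtype=float)   # design weights 1/pi_i
    order = np.argsort(y)
    y_sorted = y[order]
    cdf = np.cumsum(w[order]) / w.sum()     # estimated F(y) at sample points
    # smallest observed y with estimated F(y) >= alpha
    idx = np.searchsorted(cdf, alpha)
    return y_sorted[min(idx, len(y_sorted) - 1)]

# with equal inclusion probabilities this reduces to an ordinary
# sample quantile: the median of 1..5 is 3
print(weighted_quantile([1.0, 2.0, 3.0, 4.0, 5.0], [0.2] * 5, 0.5))  # -> 3.0
```

Units over-represented in the sample (large $\pi_i$) receive small weights, so the estimated distribution function corrects the selection bias before inversion.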

Quantile estimation results from minimizing an $L_1$ loss function, as demonstrated in Koenker (2005). If the $L_1$ loss is replaced by the $L_2$ loss function, one obtains so-called expectiles, as introduced in Aigner, Amemiya and Poirier (1976) and Newey and Powell (1987). For $\alpha \in (0,1)$, this leads to the expectile function $M(\alpha)$ which, like the quantile function $Q(\alpha)$, uniquely defines the cumulative distribution function $F(y)$. Expectiles are relatively easy to estimate and have recently gained some interest, see e.g., Schnabel and Eilers (2009), Pratesi, Ranalli and Salvati (2009), Sobotka and Kneib (2012) and Guo and Härdle (2013). However, since expectiles lack a simple interpretation, their acceptance and usage in statistics is less developed than that of quantiles, see Kneib (2013). Quantiles and expectiles are connected in that a unique and invertible transformation function $h: [0,1] \to [0,1]$ exists such that $M(h(\alpha)) = Q(\alpha)$, see Yao and Tong (1996) and De Rossi and Harvey (2009). This connection can be used to estimate quantiles from a set of fitted expectiles. The idea has been used in Schulze Waltrup, Sobotka, Kneib and Kauermann (2014), and the authors show empirically that the resulting quantiles can be more efficient than empirical quantiles, even if a smoothing step is applied to the latter (see Jones 1992). An intuitive explanation for this is that expectiles account for all the data, while quantiles based on the empirical distribution function only take the left (or right) hand side of the data into account. That is, the median is defined by the 50% left (or 50% right) part of the data, while the mean (as the 50% expectile) is a function of all data points.
In this note we extend these findings and demonstrate how expectiles can be estimated for unequal probability samples and how to obtain a fitted distribution function from fitted expectiles.
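Since the $\alpha$-expectile minimizes an asymmetrically weighted $L_2$ loss, it can be computed by iteratively reweighted least squares: observations below the current fit receive weight $1-\alpha$ and those above receive weight $\alpha$, and the weighted mean is recomputed until convergence. The following is a minimal sketch of this standard scheme for a plain sample; the function name and tolerances are assumptions, and design weights for unequal probability samples (the topic of Section 3) are not included here.

```python
import numpy as np

def expectile(y, alpha, tol=1e-8, max_iter=100):
    """Illustrative alpha-expectile via iteratively reweighted least
    squares: m solves sum_i |alpha - 1{y_i <= m}| (y_i - m) = 0."""
    y = np.asarray(y, dtype=float)
    m = y.mean()  # the 0.5-expectile is the ordinary mean
    for _ in range(max_iter):
        # asymmetric L2 weights: 1 - alpha below the fit, alpha above
        w = np.where(y <= m, 1.0 - alpha, alpha)
        m_new = np.sum(w * y) / np.sum(w)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

print(expectile([1.0, 2.0, 3.0, 4.0, 5.0], 0.5))  # -> 3.0 (the mean)
```

For $\alpha = 0.5$ all weights are equal and the scheme returns the mean in one step, which mirrors the remark above that the mean is the 50% expectile depending on all data points.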

The paper is organized as follows. In Section 2 we give the necessary notation and discuss quantile regression in unequal probability sampling. This is extended in Section 3 towards expectile estimation. Section 4 utilizes the connection between expectiles and quantiles and demonstrates how to derive quantiles from fitted expectiles. Section 5 demonstrates in simulations the efficiency gain of quantiles derived from expectiles, and a discussion in Section 6 concludes the paper.
