# A short note on quantile and expectile estimation in unequal probability samples

## 4. From expectiles to the distribution function

Both the quantile function $Q(\alpha)$ and the expectile function $M(\alpha)$ uniquely determine a distribution function $F(\cdot)$. While $Q(\alpha)$ is simply the inverse of $F(\cdot)$, the relation between $M(\alpha)$ and $F(\cdot)$ is more involved. Following Schnabel and Eilers (2009) and Yao and Tong (1996), we have the relation

$$M(\alpha) = \frac{(1-\alpha)\,G\left(M(\alpha)\right) + \alpha\left\{M(0.5) - G\left(M(\alpha)\right)\right\}}{(1-\alpha)\,F\left(M(\alpha)\right) + \alpha\left\{1 - F\left(M(\alpha)\right)\right\}}, \qquad (4.1)$$

where $G(m)$ is the partial moment function defined through $G(m) = \sum_{i=1}^{N} Y_i \, 1\{Y_i \le m\}/N$. Expression (4.1) uniquely relates the expectile function $M(\alpha)$ to the distribution function $F(\cdot)$. The idea is now to solve (4.1) for $F(\cdot)$, that is, to express the distribution $F(\cdot)$ in terms of the expectile function $M(\cdot)$. This is apparently not possible in closed form, but it can be done numerically. To do so, we evaluate the fitted function $\widehat{M}(\alpha)$ on a dense grid of values $0 < \alpha_1 < \alpha_2 < \dots < \alpha_L < 1$ and denote the fitted values by $\widehat{m}_l = \widehat{M}(\alpha_l)$. We also define left and right bounds through $\widehat{m}_0 = \widehat{m}_1 - c_0$ and $\widehat{m}_{L+1} = \widehat{m}_L + c_{L+1}$, where $c_0$ and $c_{L+1}$ are constants to be chosen by the user. For instance, one may set $c_0 = \widehat{m}_2 - \widehat{m}_1$ and $c_{L+1} = \widehat{m}_L - \widehat{m}_{L-1}$. We then derive fitted values of the cumulative distribution function $F(\cdot)$ at $\widehat{m}_l$, written as $\widehat{F}_l := \widehat{F}(\widehat{m}_l) = \sum_{j=1}^{l} \widehat{\delta}_j$ for non-negative steps $\widehat{\delta}_j \ge 0$, $j = 1, \dots, L$, with $\sum_{j=1}^{L} \widehat{\delta}_j \le 1$. We define $\widehat{\delta}_{L+1} = 1 - \sum_{l=1}^{L} \widehat{\delta}_l$ to make $\widehat{F}(\cdot)$ a distribution function.
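Relation (4.1) follows from the first-order condition of the asymmetric least-squares problem defining expectiles, and it can be checked numerically. The following sketch (variable names are illustrative, not from the paper) computes an expectile of a small artificial population by direct minimisation and confirms that it satisfies (4.1) with the empirical $F$ and $G$:

```python
# Numerical check of relation (4.1) on an artificial finite population.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
Y = rng.gamma(shape=2.0, scale=3.0, size=1000)  # illustrative population
N = Y.size

def expectile(alpha):
    """Expectile as the minimiser of the asymmetric squared loss."""
    def loss(m):
        w = np.where(Y <= m, 1 - alpha, alpha)
        return np.sum(w * (Y - m) ** 2)
    return minimize_scalar(loss, bounds=(Y.min(), Y.max()),
                           method="bounded").x

def F(m):  # empirical distribution function
    return np.mean(Y <= m)

def G(m):  # partial moment function G(m) = sum Y_i 1{Y_i <= m} / N
    return np.sum(Y[Y <= m]) / N

alpha = 0.8
m = expectile(alpha)
rhs = ((1 - alpha) * G(m) + alpha * (Y.mean() - G(m))) / \
      ((1 - alpha) * F(m) + alpha * (1 - F(m)))
# m and rhs agree up to the optimisation tolerance
```

The agreement of `m` and `rhs` reflects that the first-order condition $\sum_i w_i (Y_i - m) = 0$, with weights $1-\alpha$ below $m$ and $\alpha$ above, rearranges exactly into (4.1).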
Assuming a uniform distribution between the dense supporting points $\widehat{m}_l$, we may express the moment function $G(\cdot)$ by simple stepwise integration as

$$\widehat{G}_l := \widehat{G}(\widehat{m}_l) = \int_{-\infty}^{\widehat{m}_l} x \, d\widehat{F}(x) = \sum_{j=1}^{l} \widehat{d}_j \widehat{\delta}_j,$$

where $\widehat{d}_j = \left(\widehat{m}_{j-1} + \widehat{m}_j\right)/2$ is the midpoint of the $j$-th interval, with the constraint that $\widehat{G}_{L+1} = \widehat{M}(0.5)$, where $\widehat{M}(0.5) = \sum_{j=1}^{n} \left(y_j/\pi_j\right) / \sum_{j=1}^{n} \left(1/\pi_j\right)$. With the steps $\widehat{\delta}_l$, $l = 1, \dots, L$, we can now re-express (4.1) as
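The stepwise integration amounts to accumulating midpoint-weighted steps. A minimal sketch (illustrative names; the midpoint $\widehat{d}_j = (\widehat{m}_{j-1}+\widehat{m}_j)/2$ is exactly what the uniform-within-interval assumption yields for $\int x \, d\widehat{F}$):

```python
# Stepwise integration of the moment function G_hat from a step function.
import numpy as np

def moment_function(m_grid, delta):
    """G_hat_l = sum_{j<=l} d_j * delta_j, d_j the interval midpoint."""
    d = (m_grid[:-1] + m_grid[1:]) / 2.0   # midpoints (m_{j-1} + m_j) / 2
    return np.cumsum(d * delta)

# toy check: U(0, 1) split into 4 equal probability steps
m_grid = np.linspace(0.0, 1.0, 5)          # m_0, ..., m_4
delta = np.full(4, 0.25)
G_hat = moment_function(m_grid, delta)
# G_hat[-1] equals 0.5, the mean of U(0, 1), matching G_{L+1} = M(0.5)
```

The toy check verifies the constraint from the text: integrating $x$ over the whole support recovers the mean.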

$$\widehat{m}_l = \frac{(1-\alpha_l)\sum_{j=1}^{l} \widehat{d}_j \widehat{\delta}_j + \alpha_l\left(\widehat{M}(0.5) - \sum_{j=1}^{l} \widehat{d}_j \widehat{\delta}_j\right)}{(1-\alpha_l)\sum_{j=1}^{l} \widehat{\delta}_j + \alpha_l\left(1 - \sum_{j=1}^{l} \widehat{\delta}_j\right)}, \qquad l = 1, \dots, L,$$

which is then solved for $\widehat{\delta}_1, \dots, \widehat{\delta}_L$. This is a numerical exercise that is conceptually straightforward; details can be found in Schulze Waltrup et al. (2014). Once we have calculated $\widehat{\delta}_1, \dots, \widehat{\delta}_L$, we have an estimate of the cumulative distribution function, denoted by $\widehat{F}_N^M(y) = \sum_{l: \widehat{m}_l < y} \widehat{\delta}_l$. We may also invert $\widehat{F}_N^M(\cdot)$, which leads to a fitted quantile function denoted by $\widehat{Q}_N^M(\alpha)$.
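One way to see that the system is tractable: the sums run only up to $l$, so equation $l$ is linear in $\widehat{\delta}_l$ once $\widehat{\delta}_1, \dots, \widehat{\delta}_{l-1}$ are known. The sketch below exploits this with a simple forward solve, clipping each step at zero; it is only an illustration under that scheme, not necessarily the solver used by Schulze Waltrup et al. (2014), and it assumes the grid avoids $\alpha_l = 0.5$, where the equation is degenerate:

```python
# Forward solve of the system above for the steps delta_1, ..., delta_L.
import numpy as np

def solve_steps(alphas, m_hat, m0, mean_hat):
    """Equation l is linear in delta_l given delta_1, ..., delta_{l-1}."""
    grid = np.concatenate(([m0], m_hat))        # m_0, m_1, ..., m_L
    d = (grid[:-1] + grid[1:]) / 2.0            # interval midpoints d_l
    delta = np.zeros(len(m_hat))
    F, G = 0.0, 0.0                             # running F_{l-1}, G_{l-1}
    for l, (a, m) in enumerate(zip(alphas, m_hat)):
        # rearrange equation l for delta_l (degenerate at a = 0.5)
        num = (1 - 2 * a) * (G - m * F) + a * (mean_hat - m)
        den = (1 - 2 * a) * (m - d[l])
        delta[l] = max(num / den, 0.0)          # enforce delta_l >= 0
        F += delta[l]
        G += d[l] * delta[l]
    return delta

# toy check with U(0, 1), whose expectiles have the closed form
# M(alpha) = sqrt(alpha) / (sqrt(alpha) + sqrt(1 - alpha))
alphas = np.linspace(0.05, 0.95, 10)            # grid avoiding alpha = 0.5
m_hat = np.sqrt(alphas) / (np.sqrt(alphas) + np.sqrt(1 - alphas))
delta = solve_steps(alphas, m_hat, m0=0.0, mean_hat=0.5)
# np.cumsum(delta) recovers F(m_l) = m_l of U(0, 1)
```

For this toy example the forward solve is exact: the recovered steps satisfy $\sum_{j \le l} \widehat{\delta}_j = \widehat{m}_l$, which is the uniform distribution function evaluated at the grid points.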

As Kuk (1988) shows, both theoretically and empirically, $\widehat{F}_R(\cdot)$ is more efficient than $\widehat{F}_N(\cdot)$. We make use of this relationship and apply it to $\widehat{F}_N^M(\cdot)$, which yields the estimator

$$\widehat{F}_R^M := 1 - \frac{1}{N}\sum_{j=1}^{n} \frac{1}{\pi_j} + \frac{\sum_{j=1}^{n} 1/\pi_j}{N}\, \widehat{F}_N^M.$$
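Writing $\widehat{N} = \sum_{j=1}^{n} 1/\pi_j$ for the estimated population size, the adjustment is just a pointwise shift and rescaling, $1 - \widehat{N}/N + (\widehat{N}/N)\,\widehat{F}_N^M$. A minimal sketch (function name illustrative):

```python
# Kuk-style adjustment of F_N^M using the estimated population size.
import numpy as np

def transform_F(F_N_M_values, pi, N):
    """F_R^M = 1 - N_hat/N + (N_hat/N) * F_N^M, N_hat = sum_j 1/pi_j."""
    N_hat = np.sum(1.0 / np.asarray(pi))
    return 1.0 - N_hat / N + (N_hat / N) * np.asarray(F_N_M_values)
```

Note that when $\widehat{N} = N$, as under exact calibration, the transformation leaves $\widehat{F}_N^M$ unchanged.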

In the next section we compare the quantiles calculated from the expectile-based estimator $\widehat{F}_R^M$ with quantiles calculated from $\widehat{F}_R$. Note that neither $\widehat{F}_R^M$ nor $\widehat{F}_R$ is a proper distribution function, since they are not normed to take values between 0 and 1.
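Computing quantiles from a step-function estimate requires the generalised inverse. A minimal sketch under the convention $Q(\alpha) = \inf\{\widehat{m}_l : \widehat{F}(\widehat{m}_l) \ge \alpha\}$ (names illustrative):

```python
# Generalised inverse of the fitted step distribution function.
import numpy as np

def quantile_from_steps(alpha, m_hat, delta):
    """Return the smallest grid point m_l with cumulative mass >= alpha."""
    F = np.cumsum(delta)
    idx = np.searchsorted(F, alpha, side="left")
    return m_hat[min(idx, len(m_hat) - 1)]
```

For example, with equal steps of $0.25$ at the grid points $1, 2, 3, 4$, the level $\alpha = 0.5$ returns the second grid point.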
