# Unequal probability inverse sampling

## 5. Unequal probability sampling with replacement

Unequal probability sampling is not much more difficult to handle when the draws are with replacement. Now let $p_{ik}$ denote the probability that occupation $k$ is selected at each draw, with

$$\sum_{k \in L} p_{ik} = 1.$$

Let ${P}_{i}$ be the sum of ${p}_{ik}$ limited to the occupations in enterprise $i:$

$${P}_{i}\mathrm{=}{\displaystyle \sum _{k\in {F}_{i}}}\text{\hspace{0.17em}}{p}_{ik}\mathrm{.}$$

In this case, ${X}_{i}$ has a negative binomial distribution with parameters $r$ and ${P}_{i}.$ Therefore,

$$\text{E}\left(X_i\right) = \frac{r\left(1-P_i\right)}{P_i} \quad \text{and} \quad \text{var}\left(X_i\right) = \frac{r\left(1-P_i\right)}{P_i^2}.$$
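As a quick numerical check of these moments (all parameter values below are hypothetical), one can simulate the procedure directly: $X_i$ counts the draws falling outside $F_i$ before the $r$-th draw inside it, which is exactly a negative binomial count of failures.

```python
import numpy as np

rng = np.random.default_rng(42)
r, P_i = 5, 0.3          # hypothetical: required hits in F_i, probability mass P_i
n_sim = 200_000

# X_i = number of draws outside F_i before the r-th draw inside F_i,
# i.e., a negative binomial count of failures with parameters r and P_i.
X = rng.negative_binomial(r, P_i, size=n_sim)

print(X.mean(), r * (1 - P_i) / P_i)        # simulated vs. theoretical mean
print(X.var(), r * (1 - P_i) / P_i**2)      # simulated vs. theoretical variance
```

With these values the simulated mean and variance land close to $r(1-P_i)/P_i \approx 11.67$ and $r(1-P_i)/P_i^2 \approx 38.9$.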

Let ${A}_{ik}\mathrm{,}k\in L$ be the number of times that unit $k$ is selected in the sample. In an unequal probability design with replacement of size $n,$ the values of ${A}_{ik}$ have a multinomial distribution. Therefore,

$$\mathrm{Pr}\left({A}_{ik}\mathrm{=}{a}_{ik}\mathrm{,}k\in L\right)\mathrm{=}n\mathrm{!}{\displaystyle \prod _{k\in L}}\frac{{p}_{ik}^{{a}_{ik}}}{{a}_{ik}\mathrm{!}}\mathrm{,}$$

where $a_{ik} = 0, \dots, n,$ and

$$\sum_{k \in L} a_{ik} = n.$$

If this multinomial vector is conditioned on a fixed size in one part of the population, then

$$\begin{array}{ll}\mathrm{Pr}\left({A}_{ik}\mathrm{=}{a}_{ik}\mathrm{,}k\in {F}_{i}|\text{\hspace{0.17em}}{\displaystyle \sum _{k\in {F}_{i}}}{A}_{ik}\mathrm{=}r\right)\hfill & \mathrm{=}\frac{\mathrm{Pr}\left({A}_{ik}\mathrm{=}{a}_{ik}\mathrm{,}k\in {F}_{i}\text{\hspace{0.17em}}\text{\hspace{0.17em}}\text{and}\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\displaystyle \sum _{k\in {F}_{i}}}\text{\hspace{0.17em}}{A}_{ik}\mathrm{=}r\right)}{\mathrm{Pr}\left({\displaystyle \sum _{k\in {F}_{i}}}\text{\hspace{0.17em}}{A}_{ik}\mathrm{=}r\right)}\hfill \\ \hfill & \mathrm{=}\frac{\frac{n\mathrm{!}{\left(1-{P}_{i}\right)}^{\left(n-r\right)}}{\left(n-r\right)\mathrm{!}}{\displaystyle \prod _{k\in {F}_{i}}}\frac{{p}_{ik}^{{a}_{ik}}}{{a}_{ik}\mathrm{!}}}{\frac{n\mathrm{!}{P}_{i}^{r}{\left(1-{P}_{i}\right)}^{n-r}}{r\mathrm{!}\left(n-r\right)\mathrm{!}}}\hfill \\ \hfill & \mathrm{=}r\mathrm{!}{\displaystyle \prod _{k\in {F}_{i}}}{\left(\frac{{p}_{ik}}{{P}_{i}}\right)}^{{a}_{ik}}\frac{1}{{a}_{ik}\mathrm{!}}\mathrm{,}\hfill \end{array}$$

with

$$\sum_{k \in F_i} a_{ik} = r.$$

This shows that, when the $A_{ik}$ are conditioned on their total over one part of the population, the distribution remains multinomial, so conditionally we still have an unequal probability design with replacement.
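This conditioning property can be checked empirically: draw many multinomial vectors, keep those whose total over the cells playing the role of $F_i$ equals $r$, and compare them with direct multinomial draws using the renormalized probabilities $p_{ik}/P_i$. All numerical values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
p = np.array([0.1, 0.2, 0.3, 0.4])    # hypothetical; the first two cells form F_i
r = 6
P = p[:2].sum()                       # P_i = 0.3

# Draw many multinomial vectors and keep those whose total over F_i equals r.
A = rng.multinomial(n, p, size=400_000)
keep = A[A[:, :2].sum(axis=1) == r][:, :2]

# Conditionally, the F_i cells should be multinomial with size r and
# probabilities p_k / P_i.
B = rng.multinomial(r, p[:2] / P, size=len(keep))
print(keep.mean(axis=0), B.mean(axis=0))   # both close to r * p_k / P_i = [2, 4]
```

The conditioned draws and the direct renormalized draws have matching cell means, in line with the derivation above.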

With the procedure in which we draw with replacement until we obtain $r$ occupations in enterprise $i,$ we have

$$\text{E}\left({A}_{ik}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{X}_{i}\right)\mathrm{=}\{\begin{array}{ll}\frac{r{p}_{ik}}{{P}_{i}}\hfill & \text{if}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k\in {F}_{i}\hfill \\ \frac{{X}_{i}{p}_{ik}}{1-{P}_{i}}\hfill & \text{if}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k\in {D}_{i}\mathrm{.}\hfill \end{array}$$

The expected value of ${A}_{ik}$ is

$${\pi}_{k\text{\hspace{0.17em}}|\text{\hspace{0.17em}}i}\mathrm{=}\text{EE}\left({A}_{ik}\text{\hspace{0.17em}}|\text{\hspace{0.17em}}{X}_{i}\right)\mathrm{=}\frac{r{p}_{ik}}{{P}_{i}}\mathrm{,}$$

$k \in L.$ For $k \in D_i,$ this follows from $\text{E}\left(X_i\right) = r\left(1-P_i\right)/P_i.$ The problem is that we know $p_{ik}, r$ and $X_i,$ but not $P_i.$ We can estimate $P_i$ using the method of moments by solving $\text{E}\left(X_i\right) = X_i,$ which gives

$${X}_{i}\mathrm{=}\frac{r\left(1-{\widehat{P}}_{i}\right)}{{\widehat{P}}_{i}}$$

and therefore

$${\widehat{P}}_{i1}\mathrm{=}\frac{r}{{X}_{i}+r}\mathrm{.}$$

The maximum likelihood method provides the same estimator as the method of moments, but this estimator is biased (Mikulski and Smith 1976; Johnson et al. 2005, page 222). In fact, the minimum variance unbiased estimator is

$${\widehat{P}}_{i2}\mathrm{=}\frac{r-1}{{X}_{i}+r-1}\mathrm{.}$$

However, $1/\widehat{P}_{i1}$ is unbiased for $1/P_i.$
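These bias properties are easy to verify by simulation (with hypothetical $r$ and $P_i$): $\widehat{P}_{i2}$ averages to $P_i$, $1/\widehat{P}_{i1}$ averages to $1/P_i$, while $\widehat{P}_{i1}$ itself overshoots $P_i$ (by Jensen's inequality, since $r/(X_i+r)$ is convex in $X_i$).

```python
import numpy as np

rng = np.random.default_rng(0)
r, P_i = 5, 0.3                     # hypothetical parameters
X = rng.negative_binomial(r, P_i, size=500_000)

P1 = r / (X + r)                    # method of moments / maximum likelihood
P2 = (r - 1) / (X + r - 1)          # minimum variance unbiased estimator

print(P1.mean())        # biased upward relative to P_i = 0.3
print(P2.mean())        # close to P_i = 0.3
print((1 / P1).mean())  # close to 1/P_i = 3.333...
```

The unbiasedness of $1/\widehat{P}_{i1}$ is immediate since $(X_i + r)/r$ is linear in $X_i$ and $\text{E}(X_i) = r(1-P_i)/P_i$.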

Again, we use weights that are the inverses of $\pi_{k\,|\,i}.$ These inverses are estimated as follows:

$$\widehat{1/{\pi}_{k\text{\hspace{0.17em}}|\text{\hspace{0.17em}}i}}\mathrm{=}\{\begin{array}{lll}\frac{{\widehat{P}}_{i2}}{r{p}_{ik}}\hfill & \mathrm{=}\frac{r-1}{\left({X}_{i}+r-1\right)r{p}_{ik}}\hfill & \text{if}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k\in {F}_{i}\hfill \\ \frac{1-{\widehat{P}}_{i2}}{{X}_{i}{p}_{ik}}\hfill & \mathrm{=}\frac{1}{\left({X}_{i}+r-1\right){p}_{ik}}\hfill & \text{if}\text{\hspace{0.17em}}\text{\hspace{0.17em}}k\in {D}_{i}\mathrm{.}\hfill \end{array}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}\text{\hspace{1em}}(5.1)$$
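A minimal sketch of applying (5.1), with hypothetical draw probabilities and counts: the two closed forms of each case in the equation coincide algebraically, which the code confirms numerically.

```python
import numpy as np

r, X_i = 4, 10                       # hypothetical: 4 hits in F_i, 10 draws in D_i
p_F = np.array([0.02, 0.05, 0.03])   # hypothetical draw probabilities, k in F_i
p_D = np.array([0.10, 0.20])         # hypothetical draw probabilities, k in D_i

P2 = (r - 1) / (X_i + r - 1)         # minimum variance unbiased estimator of P_i

# Equation (5.1): estimated inverse inclusion expectations, both closed forms.
w_F = P2 / (r * p_F)
w_F_direct = (r - 1) / ((X_i + r - 1) * r * p_F)
w_D = (1 - P2) / (X_i * p_D)
w_D_direct = 1 / ((X_i + r - 1) * p_D)

print(np.allclose(w_F, w_F_direct), np.allclose(w_D, w_D_direct))  # True True
```

These estimated weights then replace the exact $1/\pi_{k\,|\,i}$ in the usual expansion estimator.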
