Unequal probability inverse sampling

Section 7. Discussion

The selection problem can therefore be resolved for all cases, with or without replacement and with equal or unequal probabilities. The proposed solution based on the elimination method respects the inclusion probabilities exactly, which is not true for Ohlsson’s sequential sampling. The implementation is especially simple, since the program provides an ordered sequence of occupations to propose until the objective has been met.
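As a minimal sketch of how such an ordered sequence might be used, the R snippet below walks through a proposal order until a target number of usable units has been found. The function name `inverse_select`, the indicator vector `usable` and the target `n_target` are hypothetical names introduced for illustration only; `order_seq` stands in for the output of an ordering routine such as the one given in the Appendix.

```r
# Hypothetical helper: propose units in the given order until
# n_target usable units have been found (inverse selection).
inverse_select <- function(order_seq, usable, n_target) {
  found <- 0
  for (k in seq_along(order_seq)) {
    if (usable[order_seq[k]]) found <- found + 1
    if (found == n_target) {
      # Return every unit proposed so far and the usable ones among them.
      proposed <- order_seq[1:k]
      return(list(proposed = proposed,
                  selected = proposed[usable[proposed]]))
    }
  }
  # The list was exhausted before the target was reached.
  list(proposed = order_seq, selected = order_seq[usable[order_seq]])
}

# Toy example (assumed data): 6 occupations, units 2, 3 and 6 are usable.
order_seq <- c(4, 2, 5, 6, 1, 3)
usable <- c(FALSE, TRUE, TRUE, FALSE, FALSE, TRUE)
out <- inverse_select(order_seq, usable, n_target = 2)
out$proposed   # 4 2 5 6
out$selected   # 2 6
```

In this toy run the fourth proposal yields the second usable unit, so the procedure stops there and the remaining units are never proposed.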

The estimation problem is slightly more difficult. For unequal probability sampling without replacement, we must make do with a heuristic solution. In addition, second-stage inclusion probabilities tend to be lower in enterprises that have many occupations. This suggests selecting with greater probabilities the enterprises likely to have a larger number of occupations, so that occupations are not selected with probabilities that are too unequal.
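The effect can be illustrated with a deliberately simplified toy calculation in R. It assumes, purely for illustration, that a fixed number `m` of occupations is drawn with equal probabilities inside each selected enterprise, which is simpler than the inverse design studied in this article. The numbers of occupations `N_occ` and the first-stage sample size `n_ent` are invented.

```r
# Assumed toy setting: 4 enterprises, m = 2 occupations drawn with equal
# probabilities inside each selected enterprise.
N_occ <- c(4, 6, 8, 12)   # hypothetical numbers of occupations
m <- 2                    # occupations drawn per selected enterprise
n_ent <- 2                # enterprises selected at the first stage

# Second-stage inclusion probabilities decrease as N_occ grows.
pi2 <- pmin(1, m / N_occ)

# First stage: equal probabilities versus probabilities proportional to N_occ.
pi1_equal <- rep(n_ent / length(N_occ), length(N_occ))
pi1_size <- n_ent * N_occ / sum(N_occ)

# Overall inclusion probability of an occupation in each enterprise.
overall_equal <- pi1_equal * pi2
overall_size <- pi1_size * pi2
range(overall_equal)
range(overall_size)
```

Under these assumptions, equal first-stage probabilities leave the overall probabilities differing by a factor of three between the smallest and largest enterprise, whereas first-stage probabilities proportional to the number of occupations make them constant.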

Acknowledgements

The author wishes to thank Pierre Lavallée for submitting this interesting problem and providing thoughtful comments on an earlier version of this article. The author also thanks Audrey-Anne Vallée for her meticulous proofreading, and a referee and writer of Survey Methodology for their pertinent remarks, which made it possible to improve this article.

Appendix

#
# Load sampling package, which contains the function inclusionprobabilities().
#
library(sampling)
#
# The function returns a vector with the sequence numbers of the eliminations.
# The last (resp. first) unit eliminated is the first (resp. last)
# component of the vector.
# The function therefore provides the numbers of the units to be presented
# successively for the inverse selection.
# The argument x is the vector of values of the auxiliary variable used to calculate
# the inclusion probabilities.
#
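elimination <- function(x)
{
  pikb <- x / sum(x)
  M <- length(pikb)
  sb <- rep(1, M)    # 1 if the unit has not been eliminated yet, 0 otherwise
  b <- rep(1, M)     # inclusion probabilities for the current size (M at start)
  res <- rep(0, M)
  for (i in 1:M) {
    # Inclusion probabilities for the next smaller sample size.
    a <- inclusionprobabilities(pikb, M - i)
    # Elimination probabilities, set to zero for units already eliminated.
    v <- 1 - a / b
    b <- a
    p <- cumsum(v * sb)
    # Draw the unit to eliminate by inverting the cumulative distribution.
    u <- runif(1)
    for (j in 1:length(p)) if (u < p[j]) break
    sb[j] <- 0
    res[i] <- j
  }
  # Reverse the order so that the last unit eliminated comes first.
  res[M:1]
}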

#
# 50,000 simulations (see SIM below) on a list of size M=20.
# By taking the first m components of vector v, we obtain a sample
# of size m.
#
M=20
x=runif(M)
Pik=array(0,c(M,M))
#
# Calculate the inclusion probabilities for all sample sizes from 1 to 20.
#
for(i in 1:M) Pik[i,]=inclusionprobabilities(x, i)
rowSums(Pik)

SIM=50000
SS=array(0,c(M,M))
for(s in 1:SIM)
{
  S=array(0,c(M,M))
  v=elimination(x)
  # Row m of S flags the sample of size m, i.e. the first m components of v.
  for(m in 1:M) S[m,v[1:m]]=1
  SS=SS+S
}
SS=SS/SIM
#
# Compare theoretical and empirical inclusion probabilities.
#
Pik
SS
SS-Pik
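
The matrix of differences is large to inspect by eye. One possible way to summarize it, offered here as a sketch rather than as part of the original program, is to report the largest absolute deviation together with the largest deviation expressed in binomial standard errors. The helper name `summarize_deviation` is hypothetical, and the toy data below use independent Bernoulli draws in place of the simulation.

```r
# Hypothetical helper, not part of the original program.
summarize_deviation <- function(emp, theo, n_sim) {
  dev <- emp - theo
  # Binomial standard error of each empirical proportion.
  se <- sqrt(pmax(theo * (1 - theo), 0) / n_sim)
  ok <- se > 0   # skip probabilities equal to 0 or 1
  c(max_abs_dev = max(abs(dev)),
    max_abs_z = max(abs(dev[ok] / se[ok])))
}

# Toy check with independent Bernoulli draws standing in for the simulation.
set.seed(1)
theo <- matrix(c(0.2, 0.5, 0.8, 1), 2, 2)
n_sim <- 10000
emp <- matrix(rbinom(4, n_sim, theo) / n_sim, 2, 2)
res <- summarize_deviation(emp, theo, n_sim)
res
```

With the objects created above, the corresponding call would be `summarize_deviation(SS, Pik, SIM)`; standardized deviations of a few units at most would be consistent with Monte Carlo error alone.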

References

Chikkagoudar, M.S. (1966). A note on inverse sampling with equal probabilities. Sankhyā, A28, 93-96.

Chikkagoudar, M.S. (1969). Inverse sampling without replacement. Australian Journal of Statistics, 11, 155-165.

Hájek, J. (1971). Discussion of an essay on the logical foundations of survey sampling, part one, by D. Basu. In Foundations of Statistical Inference, (Eds., Godambe, V.P. and Sprott, D.A.), page 326, Toronto, Canada. Holt, Rinehart, Winston.

Johnson, N.L., Kemp, A.W. and Kotz, S. (2005). Univariate Discrete Distributions. New York: John Wiley & Sons, Inc.

Mikulski, P.W., and Smith, P.J. (1976). A variance bound for unbiased estimation in inverse sampling. Biometrika, 63(1), 216-217.

Miller, G.K., and Fridell, S.L. (2007). A forgotten discrete distribution? Reviving the negative hypergeometric model. The American Statistician, 61(4), 347-350.

Murthy, M.N. (1957). Ordered and unordered estimators in sampling without replacement. Sankhyā, 18, 379-390.

Ohlsson, E. (1995). Sequential Poisson sampling. Research report 182, Stockholm University, Sweden.

Ohlsson, E. (1998). Sequential Poisson sampling. Journal of Official Statistics, 14, 149-162.

Pathak, P.K. (1964). On inverse sampling with unequal probabilities. Biometrika, 51, 185-193.

Rosén, B. (1997). On sampling with probability proportional to size. Journal of Statistical Planning and Inference, 62, 159-191.

Salehi, M.M., and Seber, G.A.F. (2001). A new proof of Murthy’s estimator which applies to sequential sampling. The Australian and New Zealand Journal of Statistics, 43, 281-286.

Sampford, M.R. (1962). Methods of cluster sampling with and without replacement for clusters of unequal sizes. Biometrika, 49(1/2), 27-40.

Tillé, Y. (1996). An elimination procedure of unequal probability sampling without replacement. Biometrika, 83, 238-241.

Tillé, Y. (2006). Sampling Algorithms. New York: Springer.
