Unequal probability inverse sampling

Section 1. Problem

The problem arose as part of a question on Statistics Canada’s new Job Vacancy and Wage Survey (JVWS). The JVWS comprises a wage component and a job vacancy component. The wage component looks at average wages, minimum wages, maximum wages and starting wages for various occupations.

The objective is to provide wage statistics by economic region (economic regions are subdivisions of provinces). In the first stage, a sample of 100,000 business locations (also known as local units of enterprises) is selected using a Poisson design stratified by industry and economic region.

For simplicity, the term “enterprise” will be used in the rest of the document instead of “location,” keeping in mind that Statistics Canada defines a location as “a production unit located at a single geographical location at or from which economic activity is conducted and for which a minimum of employment data are available.”

For purposes of managing response burden, it is not possible to identify every occupation in each enterprise. Therefore, proposing a list of occupations and asking whether the listed occupations exist in an enterprise has been considered. Occupations can then be randomly drawn from the list and proposed successively to the head of the enterprise until $r$ occupations have been reached. Since the most common occupations are of specific interest, it is useful to consider cases in which occupations are selected with unequal probabilities from the list in proportion to their prevalence in the total population. Note that this method was not implemented for Statistics Canada’s Job Vacancy and Wage Survey. The survey decided to present a list, of fixed length, of occupations to the surveyed enterprises. Nevertheless, the theoretical properties of the proposed method remain of interest.
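The procedure described above can be illustrated with a minimal sketch. It rests on assumptions that are not fixed by the text at this point: draws are made with replacement, with probability proportional to prevalence, and the count toward $r$ includes only distinct occupations that exist in the enterprise. The function name `propose_until_r` and the example occupations and prevalence values are hypothetical.

```python
import random


def propose_until_r(prevalence, present, r, rng):
    """Propose occupations one at a time until r distinct present ones are found.

    prevalence: dict mapping occupation -> prevalence in the population
                (used as unnormalized drawing weights; an assumption).
    present:    set of occupations that actually exist in the enterprise.
    r:          predetermined number of present occupations to reach.
    Returns (found, n_draws): the occupations found, in order, and the
    total number of proposals made, which is a random quantity.
    """
    occupations = list(prevalence)
    weights = [prevalence[o] for o in occupations]

    # Guard against an infinite loop: r must be reachable.
    reachable = [o for o in occupations if o in present and prevalence[o] > 0]
    if r > len(reachable):
        raise ValueError("r exceeds the number of present occupations on the list")

    found = []
    n_draws = 0
    while len(found) < r:
        # One draw with replacement, proportional to prevalence.
        occ = rng.choices(occupations, weights=weights, k=1)[0]
        n_draws += 1
        if occ in present and occ not in found:
            found.append(occ)
    return found, n_draws


# Hypothetical list of occupations with made-up prevalence values.
prevalence = {"cashier": 50, "cook": 30, "welder": 15, "actuary": 5}
present = {"cashier", "cook"}
found, n_draws = propose_until_r(prevalence, present, 2, random.Random(1))
print(found, n_draws)
```

In this sketch the number of proposals is random while the number of occupations found is fixed at $r$, which is the defining feature of the scheme.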

“Inverse sampling” refers to a scheme in which units are selected successively until a predetermined number of units with a certain characteristic is obtained. Inverse sampling must not be confused with rejective sampling. In rejective sampling, a sample is selected according to a design, and the sample is rejected if it does not have the desired characteristic (e.g., a specific sample size or an average equal to that of the population). The selection of samples is repeated until a sample with the desired property is obtained.
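The distinction between the two schemes can be made concrete with a short sketch. It assumes, purely for illustration, equal-probability draws without replacement for the inverse scheme and a Poisson design with rejection on the sample size for the rejective scheme. The function names, the `has_trait` predicate and the inclusion probabilities are hypothetical.

```python
import random


def inverse_sample(units, has_trait, r, rng):
    """Select units one at a time, without replacement and with equal
    probabilities, stopping as soon as r units with the trait are selected."""
    order = list(units)
    rng.shuffle(order)  # a random permutation gives successive equal-probability draws
    selected, hits = [], 0
    for u in order:
        selected.append(u)
        if has_trait(u):
            hits += 1
            if hits == r:
                return selected
    raise ValueError("fewer than r units have the trait")


def rejective_sample(units, incl_prob, n, rng, max_tries=100_000):
    """Draw complete Poisson samples and discard each one until a sample
    of exactly n units is obtained."""
    for _ in range(max_tries):
        # Poisson design: each unit is included independently.
        sample = [u for u in units if rng.random() < incl_prob[u]]
        if len(sample) == n:
            return sample
    raise RuntimeError("no sample of size n obtained within max_tries")


rng = random.Random(0)
units = list(range(20))

# Inverse sampling: stop once 3 even-numbered units have been selected.
inv = inverse_sample(units, lambda u: u % 2 == 0, 3, rng)

# Rejective sampling: repeat the whole Poisson selection until the size is 5.
rej = rejective_sample(units, {u: 0.25 for u in units}, 5, rng)
print(len(inv), len(rej))
```

In the inverse scheme the stopping rule acts within a single sequence of draws, so the total sample size is random and the last selected unit always has the characteristic. In the rejective scheme each attempt is a complete sample that is either kept or discarded as a whole.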

Inverse sampling raises several theoretical questions. How can such a design be implemented with equal or unequal inclusion probabilities? What is the probability of inclusion of an occupation within each enterprise? How can a variable of interest be estimated using a sample consisting of a few enterprises and a few occupations within them? How can the number of occupations in the enterprise be estimated? More generally, how can this survey be implemented and how can estimation be done?

The key issue is the way in which the occupations are selected. They may be selected using a simple design with or without replacement, or with unequal probabilities. One option would be to select the units with unequal probabilities using the sequential Poisson sampling method proposed by Ohlsson (1998) or the Pareto sampling method proposed by Rosén (1997). The inverse sampling problem has already been discussed by Murthy (1957), Sampford (1962), Pathak (1964), Chikkagoudar (1966, 1969), and Salehi and Seber (2001). However, the parameter to be estimated here is unique, since estimates of average revenue among all enterprises having a specific occupation are desired. We also propose a new unequal-probability inverse design without replacement.

This article is organized as follows: In Section 2, the problem is stated and the notation is defined. The equal probability case with replacement is discussed in Section 3, and the equal probability case without replacement is discussed in Section 4. The unequal probability case with replacement is developed in Section 5. A new selection method for the unequal probability case without replacement is presented in Section 6. Finally, Section 7 contains a short discussion.
