A note on the concept of invariance in two-phase sampling designs

Section 1. Introduction
Two-phase sampling designs are often used in surveys when the sampling frame contains little or no auxiliary information. Two-phase sampling consists of first selecting a large sample from the population (typically using a rudimentary sampling design) in order to collect data on variables that are inexpensive to obtain and that are related to the characteristics of interest. The idea is to create a pseudo-sampling frame that is richer in auxiliary information than the original sampling frame. Then, using the variables observed in the first phase, an efficient sampling procedure can be used to select a (typically small) subsample from the first-phase sample in order to collect the characteristics of interest. Two-phase sampling may also be helpful in the context of nonresponse, since the set of respondents is often viewed as a second-phase sample.
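The two steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the first phase is taken to be simple random sampling without replacement, and the "efficient" second phase is taken to be stratified sampling with two strata formed at the median of an auxiliary variable `x` observed in the first phase. The function name `two_phase_sample`, the lognormal population, and the sample sizes are all hypothetical choices, not part of the paper.

```python
import random


def two_phase_sample(x, n1, n2, rng):
    """Return (s1, s2): first-phase and second-phase samples as sorted index lists.

    Assumed designs (illustrative only):
      phase 1: simple random sampling without replacement of n1 units;
      phase 2: two strata of s1 split at the median of x, n2 units in total.
    """
    N = len(x)

    # Phase 1: rudimentary design applied to the whole population.
    s1 = sorted(rng.sample(range(N), n1))

    # Phase 2: x is now known for every unit of s1, so it can drive the design.
    ranked = sorted(s1, key=lambda i: x[i])
    half = len(ranked) // 2
    strata = [ranked[:half], ranked[half:]]
    allocation = (n2 // 2, n2 - n2 // 2)

    s2 = []
    for stratum, take in zip(strata, allocation):
        s2.extend(rng.sample(stratum, min(take, len(stratum))))
    return s1, sorted(s2)


rng = random.Random(2024)
# Hypothetical population: x is the inexpensive auxiliary variable.
x = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]
s1, s2 = two_phase_sample(x, n1=200, n2=40, rng=rng)
```

Note that the strata boundaries are computed from the realized first-phase sample, so the second-phase design here depends on which units were selected in the first phase.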
We adopt the following notation. Consider a population $U$ of size $N$. A vector $I_1 = (I_{11}, \ldots, I_{1N})^{\top}$ is generated according to the sampling design $p_1(I_1)$, where $I_1$ denotes a vector of indicators such that $I_{1i}$ is equal to either 0 or 1. The first-phase sample, denoted by $s_1$, is the set of population units for which $I_{1i} = 1$, and $n_1$ is the size of $s_1$. Then, a vector $I_2 = (I_{21}, \ldots, I_{2N})^{\top}$ is generated according to the sampling design $p_2(I_2 \mid I_1)$, where $I_2$ denotes the vector of indicators such that $I_{2i}$ is equal to either 0 or 1. The second-phase sample, denoted by $s_2$, is the set of population units for which both $I_{1i} = 1$ and $I_{2i} = 1$, and $n_2$ is the size of $s_2$. In practice, the indicators $I_{2i}$ are not generated for the population units belonging to the set $U - s_1$. However, at least conceptually, nothing precludes defining these indicators for the units outside the first-phase sample.
Let $\pi_{1i} = P(I_{1i} = 1)$ and $\pi_{1ij} = P(I_{1i} = 1, I_{1j} = 1)$ be the first-order and second-order selection probabilities at the first phase. Similarly, let $\pi_{2i}(I_1) = P(I_{2i} = 1 \mid I_1)$ and $\pi_{2ij}(I_1) = P(I_{2i} = 1, I_{2j} = 1 \mid I_1)$ be the first-order and second-order selection probabilities at the second phase. Note that the (first-order and second-order) selection probabilities at the second phase may depend on the realized first-phase sample $s_1$.
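To make the selection probabilities concrete, the following sketch computes them exactly for a toy population. The designs are assumptions chosen for illustration: simple random sampling without replacement at the first phase, and Poisson sampling at the second phase with probabilities proportional to a size variable `x` within the realized first-phase sample. The function names and the five-unit population are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def first_phase_probs_by_enumeration(N, n1, i, j):
    """First- and second-order selection probabilities of units i and j
    under simple random sampling without replacement, obtained by listing
    every possible first-phase sample of size n1."""
    samples = list(combinations(range(N), n1))
    first = Fraction(sum(i in s for s in samples), len(samples))
    second = Fraction(sum(i in s and j in s for s in samples), len(samples))
    return first, second


def second_phase_probs(x, I1, n2):
    """First-order second-phase probabilities, given the first-phase
    indicator vector I1, under an assumed Poisson design with expected
    size n2 and probabilities proportional to x within the first-phase sample."""
    total = sum(xi for xi, ind in zip(x, I1) if ind == 1)
    return {
        i: min(Fraction(1), Fraction(n2 * xi, total))
        for i, (xi, ind) in enumerate(zip(x, I1))
        if ind == 1
    }


x = [1, 2, 3, 4, 10]  # hypothetical size variable for N = 5 units

# First phase: n1 = 3 out of N = 5.
pi1, pi1_pair = first_phase_probs_by_enumeration(N=5, n1=3, i=0, j=1)

# Second phase: the same unit (index 0) under two different first-phase samples.
probs_a = second_phase_probs(x, I1=(1, 1, 1, 0, 0), n2=1)
probs_b = second_phase_probs(x, I1=(1, 0, 0, 1, 1), n2=1)
print(pi1, pi1_pair, probs_a[0], probs_b[0])  # 3/5 3/10 1/6 1/15
```

Unit 0 has second-phase probability 1/6 under one first-phase sample and 1/15 under the other, which illustrates how second-phase selection probabilities can depend on the realized first-phase sample. Under Poisson sampling the second-order probabilities are simply products of the first-order ones, so they inherit the same dependence.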
The paper is organized as follows. In Section 2, we define the concepts of weak and strong invariance and provide some examples. In Section 3, we discuss the implications of weak and strong invariance from an inferential point of view. In particular, we discuss the reverse decomposition of the variance in the case of a strongly invariant two-phase sampling design.