# A note on the concept of invariance in two-phase sampling designs

## Section 2. The concept of invariance

We distinguish strong invariance, which may also be called distribution invariance, from weak invariance, which may also be called first-two-moment invariance.

**Definition 1.** *A two-phase sampling design is said to be strongly (or distribution) invariant provided that*

$$F\left(I_{2} \mid I_{1}\right) = F\left(I_{2}\right). \qquad (2.1)$$

A consequence of Definition 1 is that $F\left(I_{1}, I_{2}\right) = F\left(I_{1}\right)F\left(I_{2}\right)$ and therefore, with a strongly invariant two-phase sampling design, the vector $I_{2}$ can be generated prior to the vector $I_{1}$. In practice, strong invariance is satisfied by only a few two-phase sampling designs. A first example is Poisson sampling at the second phase. This covers the case of nonresponse, which is often viewed as a Poisson sampling design at the second phase. Another example is two-stage sampling. Both are described in greater detail below.
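The factorization follows in one step from the usual decomposition of a joint distribution into a marginal and a conditional, combined with (2.1). Written out, reading $F(\cdot)$ as a joint probability distribution, as the notation suggests:

$$F\left(I_{1}, I_{2}\right) = F\left(I_{1}\right)F\left(I_{2} \mid I_{1}\right) = F\left(I_{1}\right)F\left(I_{2}\right).$$

Because the joint distribution is the product of the two marginals, $I_{1}$ and $I_{2}$ are independent, so the order in which they are generated is immaterial.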

**Example 1.** *At the first phase, a sample $s_{1}$ is selected according to an arbitrary sampling design, followed by Poisson sampling at the second phase, where the unit selection probabilities $\pi_{2i}\left(I_{1}\right)$ are set prior to sampling, which means that $\pi_{2i}\left(I_{1}\right) = \pi_{2i}$ for $i \in U$. Since Poisson sampling is completely characterized by its first-order selection probabilities, we have $F\left(I_{2} \mid I_{1}\right) = F\left(I_{2}\right)$. As a result, this sampling design is strongly invariant. It can be implemented as follows: first, generate the vector $I_{2}$ according to the Poisson sampling design $F\left(I_{2}\right)$ and, independently, generate the vector $I_{1}$ according to the design $F\left(I_{1}\right)$.*
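The implementation described in Example 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population size `N`, the probabilities in `pi2`, and the choice of simple random sampling without replacement for the first phase are all hypothetical (the example allows an arbitrary first-phase design), and the final sample is taken to be the units selected at both phases.

```python
import random

def poisson_sample(probs, rng):
    """Poisson sampling: unit i is kept independently with probability probs[i]."""
    return [1 if rng.random() < p else 0 for p in probs]

def srswor_indicators(N, n, rng):
    """Indicator vector of a simple random sample without replacement of size n."""
    chosen = set(rng.sample(range(N), n))
    return [1 if i in chosen else 0 for i in range(N)]

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
N = 10
# Hypothetical second-phase probabilities, fixed before any sampling takes place.
pi2 = [0.2, 0.5, 0.8, 0.5, 0.3, 0.9, 0.4, 0.6, 0.7, 0.1]

# Second-phase indicators are generated first, over the whole population U.
I2 = poisson_sample(pi2, rng)
# First-phase indicators are generated afterwards, independently of I2.
I1 = srswor_indicators(N, 4, rng)

# Units retained at both phases form the second-phase sample.
s2 = [i for i in range(N) if I1[i] == 1 and I2[i] == 1]
```

Swapping the order of the two draws would not change the distribution of `s2`, since neither draw uses the outcome of the other.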

**Example 2.** *Two-stage cluster sampling can be described as follows: at the first stage, a sample of clusters is selected randomly from the population of clusters. Then, at the second stage, within each cluster selected at the first stage, a sample of elements is randomly selected. Note that, even in this case, the vector $I_{1}$ is still defined at the element level, with its size $N$ corresponding to the number of elements in the population. Under this set-up, the selection indicator for an element $j$ within cluster $i$, $I_{1ij}$, is equal to 1 for all elements $j$ within a selected cluster $i$. Therefore, two-stage sampling is a special case of two-phase sampling as described in Section 1. If the selection within clusters is independent of which clusters have been selected in the first phase, then we are in the presence of a strongly invariant two-stage cluster sampling design. This is satisfied if the selection of elements within clusters is independent of the selection of elements in any other cluster. A strongly invariant two-stage cluster sampling design can be implemented by reversing the actual act of sampling: instead of sampling the clusters first, we begin by selecting the elements in each of the population clusters, and then sample the clusters.*
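The reversed order of sampling described in Example 2 can be sketched as follows. The cluster labels, cluster sizes, the within-cluster sample size `m`, and the use of simple random sampling without replacement at both stages are hypothetical choices made only for illustration.

```python
import random

def select_within_clusters(clusters, m, rng):
    """Draw m elements without replacement in every population cluster."""
    return {c: sorted(rng.sample(elems, m)) for c, elems in clusters.items()}

def select_clusters(clusters, n, rng):
    """Draw n clusters without replacement from the population of clusters."""
    return sorted(rng.sample(sorted(clusters), n))

# Hypothetical population: four clusters of elements labelled 0..14.
clusters = {
    "A": [0, 1, 2, 3],
    "B": [4, 5, 6],
    "C": [7, 8, 9, 10, 11],
    "D": [12, 13, 14],
}
rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Second stage first: elements are selected in all population clusters.
within = select_within_clusters(clusters, 2, rng)
# First stage afterwards: the clusters themselves are sampled.
chosen = select_clusters(clusters, 2, rng)

# The final sample keeps the pre-selected elements of the chosen clusters only.
final_sample = [e for c in chosen for e in within[c]]
```

The element draws for clusters that end up unselected are simply discarded, which is what makes the reversal possible when the within-cluster selection does not depend on the first-stage outcome.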

Note that our definition of strong invariance for two-stage designs is slightly different from the one given in Särndal, Swensson and Wretman (1992, Chapter 4) because the latter is restricted to clusters selected at the first stage. However, for practical purposes, the two definitions are essentially equivalent. We used Definition 1 rather than the standard definition of Särndal et al. (1992) because the latter does not extend easily to the case of two-phase sampling.

**Definition 2.** *A two-phase sampling design is said to be weakly (or first-two-moment) invariant if*

$$\pi_{2i}\left(I_{1}\right) = \pi_{2i} \quad \text{and} \quad \pi_{2ij}\left(I_{1}\right) = \pi_{2ij}, \quad i \in s_{1},\ j \in s_{1}.$$

Clearly, a strongly invariant two-phase sampling design is weakly invariant, but the converse is not true. The next example describes a sampling design that is weakly invariant but not strongly invariant.
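One way to see the first claim, assuming (the definitions being given outside this section) that $\pi_{2i}\left(I_{1}\right)$ and $\pi_{2ij}\left(I_{1}\right)$ denote the conditional expectations of $I_{2i}$ and $I_{2i}I_{2j}$ given $I_{1}$: under (2.1), $I_{2}$ is independent of $I_{1}$, so that

$$\pi_{2i}\left(I_{1}\right) = E\left(I_{2i} \mid I_{1}\right) = E\left(I_{2i}\right) = \pi_{2i}, \qquad \pi_{2ij}\left(I_{1}\right) = E\left(I_{2i}I_{2j} \mid I_{1}\right) = E\left(I_{2i}I_{2j}\right) = \pi_{2ij}.$$

Weak invariance constrains only these first two moments of $I_{2}$, which is why it does not by itself imply (2.1).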

**Example 3.** *At the first phase, we select a sample, $s_{1}$, of size $n_{1}$, according to an arbitrary fixed-size sampling design. From $s_{1}$, we select a simple random sample without replacement, $s_{2}$, of size $n_{2}$, where $n_{2}$ is fixed prior to sampling. This two-phase sampling design is weakly invariant since $\pi_{2i} = n_{2}/n_{1}$ and $\pi_{2ij} = n_{2}\left(n_{2}-1\right)/\left\{n_{1}\left(n_{1}-1\right)\right\}$, which remain the same from one realization of $I_{1}$ to another. However, it is not strongly invariant since it is not possible to generate $I_{2}$ prior to $I_{1}$ and meet the fixed sample size constraint for $n_{2}$. In fact, this would also be true for any fixed-size sampling design at the second phase satisfying $\pi_{2i}\left(I_{1}\right) = \pi_{2i}$ and $\pi_{2ij}\left(I_{1}\right) = \pi_{2ij}$.*
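The weak invariance claimed in Example 3 can be checked by brute-force enumeration on a toy case. The sketch below is illustrative only: the two first-phase samples and the sizes $n_{1} = 4$, $n_{2} = 2$ are hypothetical, and the helper `second_phase_probs` is not part of the original text.

```python
from fractions import Fraction
from itertools import combinations

def second_phase_probs(s1, n2):
    """Exact first- and second-order inclusion probabilities under simple
    random sampling without replacement of n2 units from s1."""
    samples = list(combinations(s1, n2))  # all equally likely second-phase samples
    total = len(samples)
    pi_i = {i: Fraction(sum(i in s for s in samples), total) for i in s1}
    pi_ij = {
        (i, j): Fraction(sum(i in s and j in s for s in samples), total)
        for i, j in combinations(s1, 2)
    }
    return pi_i, pi_ij

n1, n2 = 4, 2
# Closed-form values stated in Example 3.
expected_pi_i = Fraction(n2, n1)
expected_pi_ij = Fraction(n2 * (n2 - 1), n1 * (n1 - 1))

# Two hypothetical realizations of the first-phase sample, both of size n1.
results = {}
for s1 in [(0, 1, 2, 3), (0, 2, 5, 7)]:
    results[s1] = second_phase_probs(s1, n2)
```

For both realizations, every entry of `pi_i` equals `expected_pi_i` and every entry of `pi_ij` equals `expected_pi_ij`, so the first two moments do not depend on which first-phase sample was drawn, even though the set of possible second-phase samples does.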

Finally, we describe a non-invariant two-phase sampling design.

**Example 4.** *At the first phase, we select a simple random sample without replacement, $s_{1}$, of size $n_{1}$, according to an arbitrary fixed-size sampling design. For every $i \in s_{1}$, we record an auxiliary variable $x$. From $s_{1}$, a second-phase sample, $s_{2}$, of fixed size $n_{2}$, is selected using an inclusion probability proportional-to-size procedure. In this case, we have*

$$\pi_{2i}\left(I_{1}\right) = \frac{n_{2}x_{i}}{\sum_{k \in U} x_{k}I_{1k}}.$$

*Clearly, the inclusion probability of unit $i$ in $s_{2}$ varies from one realization of $I_{1}$ to another. Since $\pi_{2i}\left(I_{1}\right)$ is a function of $I_{1}$, it is known only after the first-phase sample $s_{1}$ is actually realized.*
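A small numerical illustration of Example 4 makes the dependence on $I_{1}$ concrete. The values of $x$, the two first-phase indicator vectors, and the helper name `pps_second_phase_probs` are hypothetical; the function simply evaluates the formula above for the units in $s_{1}$.

```python
from fractions import Fraction

def pps_second_phase_probs(x, I1, n2):
    """Evaluate n2 * x_i / sum_k x_k * I1_k for every unit i with I1_i = 1."""
    total = sum(xk * ik for xk, ik in zip(x, I1))
    return {i: Fraction(n2 * x[i], total) for i, ind in enumerate(I1) if ind == 1}

x = [1, 2, 3, 4, 5, 5]  # hypothetical auxiliary values for a population of 6 units
n2 = 2

# Two hypothetical first-phase realizations, each of size 4; both contain unit 0.
I1_a = [1, 1, 1, 1, 0, 0]
I1_b = [1, 0, 0, 1, 1, 1]

probs_a = pps_second_phase_probs(x, I1_a, n2)
probs_b = pps_second_phase_probs(x, I1_b, n2)

print(probs_a[0], probs_b[0])  # 1/5 2/15
```

Unit 0 has second-phase inclusion probability 1/5 under the first realization and 2/15 under the second, so the design is neither strongly nor weakly invariant.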