A note on the concept of invariance in two-phase sampling designs

Section 3. Implications of the invariance property

3.1 Weak invariance

For an arbitrary two-phase sampling design, the inclusion probability of unit $i,$ $π_{i}, i \in s_{1},$ is generally unknown and is defined as

$\begin{array}{rl} π_{i} & = E (I_{1 i} I_{2 i}) \\ & = E \{ I_{1 i} E (I_{2 i} | I_{1}) \} \\ & = \sum_{i_{1} : i_{1 i} =1} π_{2 i} (I_{1}) P (I_{1} = i_{1}), \end{array} \qquad (3.1)$

where $i_{1}$ denotes a realisation of the random vector $I_{1} .$ Therefore, the $π_{i}$'s are generally unknown because they require knowledge not only of $P (I_{1} = i_{1})$ for every possible realisation $i_{1}$ (which is available in many cases) but also of $π_{2 i} (I_{1})$ for every such realisation. The latter are generally unknown because $π_{2 i} (I_{1})$ may depend on the outcome of phase 1. However, if the sampling design is weakly invariant, then $π_{2 i} (I_{1}) = π_{2 i}$ and (3.1) reduces to

$π_{i} = π_{2 i} \sum_{i_{1} : i_{1 i} =1} P (I_{1} = i_{1}) = π_{1 i} π_{2 i} . \qquad (3.2)$
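The reduction from (3.1) to (3.2) can be checked by brute-force enumeration on a small example. The sketch below is illustrative only: the population of four units, the first-phase design (simple random sampling of two units) and the fixed second-phase probabilities in `pi2` are all assumptions chosen for the example, not part of the paper.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy population U = {0, 1, 2, 3}.
U = range(4)

# Phase 1 (assumed): simple random sampling of 2 units without replacement,
# so each of the 6 possible first-phase samples has probability 1/6.
phase1_samples = list(combinations(U, 2))
p_s1 = Fraction(1, len(phase1_samples))

# Phase 2 (assumed): second-phase inclusion probabilities fixed in advance,
# i.e. pi_2i(I_1) = pi_2i whatever the first-phase sample (weak invariance).
pi2 = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(1, 5)]

# First-phase inclusion probabilities: pi_1i is the sum of P(I_1 = i_1)
# over the first-phase samples that contain unit i.
pi1 = [sum(p_s1 for s1 in phase1_samples if i in s1) for i in U]

# Overall inclusion probabilities computed term by term from (3.1).
pi = [sum(pi2[i] * p_s1 for s1 in phase1_samples if i in s1) for i in U]

for i in U:
    print(f"unit {i}: pi_i = {pi[i]}, pi_1i * pi_2i = {pi1[i] * pi2[i]}")
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the two columns can be compared for equality without rounding error.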

Suppose that we are interested in estimating the population total $t_{y} = \sum_{i \in U} y_{i} .$ Since the $π_{i}$'s are generally unknown, the Horvitz-Thompson estimator of $t_{y},$

${\hat{t}}_{H T} = \sum_{i \in s_{2}} π_{i}^{- 1} y_{i},$

cannot be used, in general. Instead, it is common practice to use the double expansion estimator

${\hat{t}}_{D E} = \sum_{i \in s_{2}} π_{1 i}^{- 1} π_{2 i} {(I_{1})}^{- 1} y_{i} .$

In general, ${\hat{t}}_{H T}$ and ${\hat{t}}_{D E}$ differ. However, for weakly invariant two-phase designs, it is clear from (3.2) that the two estimators are identical.
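To see how the two estimators can differ when the design is not invariant, the following sketch enumerates a small design in which the second-phase inclusion probabilities depend on the first-phase sample. Everything in it is an assumption made for illustration: the values of `y` and `x`, the first-phase design (simple random sampling of two units out of three) and the second-phase rule (one unit drawn from $s_{1}$ with probability proportional to `x`).

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy population: study variable y and size measure x.
y = [10, 20, 30]
x = [1, 2, 3]
U = range(len(y))
t_y = sum(y)

# Phase 1 (assumed): simple random sampling of 2 units out of 3.
phase1_samples = list(combinations(U, 2))
p_s1 = Fraction(1, len(phase1_samples))
pi1 = [sum(p_s1 for s1 in phase1_samples if i in s1) for i in U]


def pi2(i, s1):
    """Phase 2 (assumed): draw one unit from s1 with probability
    proportional to x, so pi_2i depends on s1 (no invariance)."""
    return Fraction(x[i], sum(x[j] for j in s1))


# Overall inclusion probabilities from (3.1).
pi = [sum(pi2(i, s1) * p_s1 for s1 in phase1_samples if i in s1) for i in U]


def t_ht(i):
    """Horvitz-Thompson estimate when s2 = {i}."""
    return y[i] / pi[i]


def t_de(i, s1):
    """Double expansion estimate when s2 = {i} was drawn from s1."""
    return y[i] / (pi1[i] * pi2(i, s1))


# All possible outcomes (s1, selected unit) with their probabilities.
outcomes = [(s1, i, p_s1 * pi2(i, s1)) for s1 in phase1_samples for i in s1]

for s1, i, p in outcomes:
    print(f"s1={s1}, s2={{{i}}}: HT={t_ht(i)}, DE={t_de(i, s1)}, prob={p}")

# Expectations over the full two-phase design.
e_ht = sum(p * t_ht(i) for s1, i, p in outcomes)
e_de = sum(p * t_de(i, s1) for s1, i, p in outcomes)
print("E(HT) =", e_ht, " E(DE) =", e_de, " t_y =", t_y)
```

In this toy design both estimators have expectation $t_{y}$ over the two phases, yet they take different values on individual samples. Computing the Horvitz-Thompson weights here requires enumerating every first-phase sample, which is exactly what is not feasible in most practical two-phase designs.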

3.2 Strong invariance

Let $θ$ be a finite population parameter and $\hat{θ}$ be an estimator of $θ .$ The total variance of $\hat{θ}$ can be expressed as

$V (\hat{θ}) = V E (\hat{θ} | I_{1}) + E V (\hat{θ} | I_{1}) . \qquad (3.3)$

Decomposition (3.3) is often called the two-phase decomposition of the variance; e.g., Särndal et al. (1992). If the two-phase sampling design is strongly invariant, the total variance of $\hat{θ}$ can alternatively be decomposed as

$V (\hat{θ}) = E V (\hat{θ} | I_{2}) + V E (\hat{θ} | I_{2}) . \qquad (3.4)$

Decomposition (3.4) is often called the reverse decomposition of the variance because the order of sampling is reversed, which can be justified only if the two-phase design is strongly invariant. It cannot be used for a two-phase design that is only weakly invariant, as the vector $I_{2}$ cannot then be generated prior to the vector $I_{1} .$ The reverse decomposition was studied in the context of nonresponse by Fay (1991), Shao and Steel (1999) and Kim and Rao (2009), among others. In a nonresponse context, assuming that the units respond independently of one another, the set of respondents can be viewed as a second-phase sample selected according to Poisson sampling with unknown inclusion probabilities, called response probabilities. If the latter remain the same from one realisation of the sample to another, we are essentially in the presence of a strongly invariant two-phase sampling design. Decomposition (3.4) can be used to justify simplified variance estimators for two-phase sampling designs; see Beaumont, Béliveau and Haziza (2015).
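The nonresponse setting described above lends itself to a small numerical check of the reverse decomposition. In the sketch below, the response indicators $I_{2}$ are generated for the whole population by Poisson sampling with fixed probabilities, independently of $I_{1}$, and $\hat{θ}$ is the double expansion estimator. The values of `y`, the response probabilities `p` and the first-phase design (simple random sampling of two units out of three) are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical toy population and fixed response probabilities.
y = [10, 20, 30]
p = [Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)]
U = range(len(y))
phase1_samples = list(combinations(U, 2))      # SRS of 2 out of 3
p_s1 = Fraction(1, len(phase1_samples))
pi1 = Fraction(2, 3)                           # pi_1i under this SRS


def mean(dist):
    """Mean of a discrete distribution given as (probability, value) pairs."""
    return sum(q * v for q, v in dist)


def var(dist):
    """Variance of a discrete distribution given as (probability, value) pairs."""
    m = mean(dist)
    return sum(q * (v - m) ** 2 for q, v in dist)


def poisson_prob(flags, units):
    """Probability of a 0/1 response pattern `flags` over `units`."""
    q = Fraction(1)
    for f, i in zip(flags, units):
        q *= p[i] if f else 1 - p[i]
    return q


def t_de(flags, units):
    """Double expansion estimate: responding units weighted by 1/(pi1*p_i)."""
    return sum(f * y[i] / (pi1 * p[i]) for f, i in zip(flags, units))


# Reverse order: generate I_2 on the whole population, then draw s1.
joint, cond_means, cond_vars = [], [], []
for i2 in product((0, 1), repeat=len(y)):
    q = poisson_prob(i2, U)
    cond = [(p_s1, t_de([i2[i] for i in s1], s1)) for s1 in phase1_samples]
    joint += [(q * ps, v) for ps, v in cond]
    cond_means.append((q, mean(cond)))         # E(theta_hat | I_2)
    cond_vars.append((q, var(cond)))           # V(theta_hat | I_2)

reverse_total = var(joint)
e_of_v = mean(cond_vars)                       # E V(theta_hat | I_2)
v_of_e = var(cond_means)                       # V E(theta_hat | I_2)

# Usual order: draw s1, then apply Poisson sampling within s1 only.
forward = [
    (p_s1 * poisson_prob(flags, s1), t_de(flags, s1))
    for s1 in phase1_samples
    for flags in product((0, 1), repeat=len(s1))
]
forward_total = var(forward)

print(forward_total, "=", e_of_v, "+", v_of_e)
```

Because the response probabilities do not depend on the first-phase sample, generating $I_{2}$ before $I_{1}$ yields the same distribution for the estimator as the usual order, so the total variance agrees under both orderings and splits as in (3.4).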

Acknowledgements

The authors are grateful to an Associate Editor and a reviewer for their comments and suggestions, which improved the quality of this paper. David Haziza’s research was funded by a grant from the Natural Sciences and Engineering Research Council of Canada.

References

Beaumont, J.-F., Béliveau, A. and Haziza, D. (2015). Clarifying some aspects of variance estimation in two-phase sampling. Journal of Survey Statistics and Methodology, 3, 524-542.

Fay, R.E. (1991). A design-based perspective on missing data variance. Proceedings of the 1991 Annual Research Conference, US Bureau of the Census, 429-440.

Kim, J.K., and Rao, J.N.K. (2009). A unified approach to linearization variance estimation from survey data after imputation for item nonresponse. Biometrika, 96, 917-932.

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. Springer-Verlag, New York.

Shao, J., and Steel, P. (1999). Variance estimation for survey data with composite imputation and nonnegligible sampling fractions. Journal of the American Statistical Association, 94, 254-265.

ISSN : 1492-0921

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: semi-annual

Ottawa

Date modified: 2016-12-20
