6 Discussion

Jan A. van den Brakel


In factorial designs the levels of two or more treatment factors are varied and all possible treatment combinations are considered simultaneously. These designs are widely used in scientific experimentation for several reasons. The main effects of the factors are averaged over the levels of the other factors. Conclusions about the various effects are therefore based on a wider range of conditions, which increases the validity of the results. Furthermore, interactions between the different treatment factors can be analyzed, although the power of these tests decreases as the number of factors combined in one experiment increases. Finally, factorial designs are more efficient than single-factor experiments, since fewer experimental units are required to estimate the main effects with the same precision.
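The efficiency argument can be made concrete with a small calculation. The sketch below is a simplified, model-based illustration and not the design-based variance expression derived in this paper: it assumes independent observations with a common variance `sigma2`, a 2 x 2 factorial, and a hypothetical cell size `n`. All function and variable names are chosen for illustration only.

```python
from fractions import Fraction


def factorial_units(n_per_cell: int, levels_a: int = 2, levels_b: int = 2) -> int:
    """Total number of units in a full factorial with n_per_cell units per cell."""
    return n_per_cell * levels_a * levels_b


def main_effect_variance(units_per_level: int, sigma2: Fraction = Fraction(1)) -> Fraction:
    """Variance of the difference between two level means,
    each mean based on units_per_level independent observations."""
    return 2 * sigma2 / units_per_level


n = 25  # hypothetical number of units per treatment combination

# Factorial: each level of factor A is averaged over both levels of B,
# so each level mean rests on 2n units.
var_factorial = main_effect_variance(2 * n)
total_factorial = factorial_units(n)  # 4n units in total

# Two separate two-group experiments: to reach the same variance,
# each group must also contain 2n units.
units_per_level_single = 2 * n
var_single = main_effect_variance(units_per_level_single)
total_single = 2 * (2 * units_per_level_single)  # two experiments, two groups each

print(var_factorial == var_single)
print(total_factorial, total_single)
```

Under these assumptions the factorial setup needs 4n units where the two separate experiments need 8n to estimate both main effects with the same precision.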

In this paper a design-based theory is developed for the analysis of factorial designs that are embedded in probability samples. This approach is particularly appropriate to quantify the effects of the different design parameters of a survey process on the parameter estimates of a sample survey. Applications can be found in total survey design, empirical research into survey practice, and quantifying discontinuities in series of repeatedly conducted surveys. Design-based analysis procedures are developed to test hypotheses about population means for factorial designs where the ultimate sampling units are randomized over the different treatment combinations through a CRD or an RBD. Procedures for factorial designs where clusters of sampling units are randomized over the treatment combinations, or for testing hypotheses about ratios of population totals, are obtained analogously to the methods developed in van den Brakel (2008) for single-factor experiments.

The design-based variance estimator that is developed for the various treatment effects requires neither joint inclusion probabilities nor design covariances between the different subsamples. As a result, a design-based analysis procedure for factorial designs embedded in complex probability samples is obtained that has an attractive, relatively simple structure, as if the sampling units were drawn with replacement with unequal selection probabilities. The traditional advantages of factorial designs, summarized in the first paragraph of the discussion, still apply under this design-based approach. As illustrated by variance expression (2.29), fewer experimental units are required to estimate the main effects with the same precision in a factorial setup than in separate single-factor designs.
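To show what a with-replacement structure with unequal selection probabilities looks like in practice, the following minimal sketch implements the generic textbook estimator of a population total and its variance for n independent draws with draw probabilities p_i (often called the Hansen-Hurwitz estimator). It is not the estimator derived in this paper; the function name and the data are hypothetical. The point is only that the variance estimate is a simple sum of squares of first-order quantities, with no joint inclusion probabilities involved.

```python
from typing import Sequence, Tuple


def with_replacement_total(y: Sequence[float], p: Sequence[float]) -> Tuple[float, float]:
    """Estimate a population total and its variance from n draws made
    with replacement, where unit i is drawn with probability p[i]."""
    n = len(y)
    if n < 2 or len(p) != n:
        raise ValueError("need at least two draws and one probability per draw")
    # Expand each observation by its draw probability.
    z = [yi / pi for yi, pi in zip(y, p)]
    total = sum(z) / n
    # Only first-order quantities enter the variance estimate.
    variance = sum((zi - total) ** 2 for zi in z) / (n * (n - 1))
    return total, variance


# Hypothetical sample of three draws.
y_sample = [10.0, 20.0, 30.0]
p_sample = [0.1, 0.25, 0.25]
total_hat, var_hat = with_replacement_total(y_sample, p_sample)
print(total_hat, var_hat)
```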

The advantage of an RBD over a CRD is that the between-block variance is removed from the estimated treatment effects. In the standard model-based theory for the analysis of randomized experiments, an F-test for the blocks as well as the treatment factors is available. Under the restricted randomization of an RBD, however, it is generally argued that an F-test for the block effects is not valid. In these cases alternative measures to evaluate the efficiency of an RBD are available; see for example Montgomery (2001). In the design-based theory developed for RBDs in this paper there is an asymmetry between the block and treatment factors, as in the case of the randomization approach followed by Hinkelmann and Kempthorne (1994). Due to the restricted randomization within the blocks, no meaningful test for the main effect of the block factor is available.
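The gain from blocking can be illustrated with a small randomization experiment. The sketch below is a toy simulation under stated assumptions, not the design-based procedure of this paper: a fixed finite population with strong block effects, a constant additive treatment effect, two treatments, and equal allocation. All names and parameter values are hypothetical. It repeatedly re-randomizes the same population under a CRD and under an RBD and compares the variance of the estimated treatment effect.

```python
import random
import statistics


def randomization_variances(n_blocks=6, block_size=4, block_sd=5.0,
                            noise_sd=1.0, effect=2.0, reps=2000, seed=1):
    """Return the randomization variance of the difference in means
    under a CRD and under an RBD for one fixed population."""
    rng = random.Random(seed)
    # Fixed finite population: (block label, baseline response) per unit.
    units = []
    for b in range(n_blocks):
        block_effect = rng.gauss(0.0, block_sd)
        for _ in range(block_size):
            units.append((b, block_effect + rng.gauss(0.0, noise_sd)))

    def estimate(treated):
        # Difference of means under a constant additive treatment effect.
        t = [y + effect for i, (_, y) in enumerate(units) if i in treated]
        c = [y for i, (_, y) in enumerate(units) if i not in treated]
        return statistics.fmean(t) - statistics.fmean(c)

    n = len(units)
    crd, rbd = [], []
    for _ in range(reps):
        # CRD: half of all units are treated, ignoring the blocks.
        crd.append(estimate(set(rng.sample(range(n), n // 2))))
        # RBD: half of the units within every block are treated.
        treated = set()
        for b in range(n_blocks):
            members = [i for i, (blk, _) in enumerate(units) if blk == b]
            treated.update(rng.sample(members, block_size // 2))
        rbd.append(estimate(treated))
    return statistics.variance(crd), statistics.variance(rbd)


var_crd, var_rbd = randomization_variances()
print(f"CRD: {var_crd:.3f}  RBD: {var_rbd:.3f}")
```

Because every RBD allocation is balanced within blocks, the block effects cancel in the difference of means, which is why the RBD variance is the smaller of the two when the blocks differ substantially.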

Acknowledgement

The author wishes to thank the Associate Editor and the anonymous referees for their constructive comments on an earlier draft of this paper. The views expressed in this paper are those of the author and do not necessarily reflect the policy of Statistics Netherlands.

References

Chambers, R.L., and Skinner, C.J. (2003), Analysis of Survey Data, Chichester: John Wiley.

Chipperfield, J., and Bell, P. (2010), Embedded experiments in repeated and overlapping surveys, Journal of the Royal Statistical Society, Series A, 173, 51-66.

De Leeuw, E., Callegaro, M., Hox, J., Korendijk, E., and Lensvelt-Mulders, G. (2007), The influence of advance letters on response in telephone surveys, Public Opinion Quarterly, 71, 413-443.

Fienberg, S.E., and Tanur, J.M. (1987), Experimental and Sampling Structures: Parallels Diverging and Meeting, International Statistical Review, 55, 75-96.

Fienberg, S.E., and Tanur, J.M. (1988), From the inside out and the outside in: Combining experimental and sampling structures, The Canadian Journal of Statistics, 16, 135-151.

Fienberg, S.E., and Tanur, J.M. (1989), Combining Cognitive and Statistical Approaches to Survey Design, Science, 243, 1017-1022.

Fienberg, S.E., and Tanur, J.M. (1996), Reconsidering the Fundamental Contributions of Fisher and Neyman on Experimentation and Sampling, International Statistical Review, 64, 237-253.

Groves, R.M., Cialdini, R.B., and Couper, M.P. (1992), Understanding the decision to participate in a survey, Public Opinion Quarterly, 56, 475-495.

Groves, R.M., and Couper, M.P. (1998), Nonresponse in household interview surveys, New York: John Wiley.

Hájek, J. (1971), Comment on "An essay on the logical foundations of survey sampling" by Basu, D., in Foundations of Statistical Inference (Eds. Godambe, V.P., and Sprott, D.A.), Toronto: Holt, Rinehart, and Winston.

Hidiroglou, M.A., and Lavallée, P. (2005), Indirect two-phase sampling: Applying it to questionnaire field-testing, Proceedings of Statistics Canada Symposium 2005: Methodological challenges for future information needs.

Hinkelmann, K., and Kempthorne, O. (1994), Design and Analysis of Experiments, Volume 1: Introduction to experimental design, New York: John Wiley.

Horvitz, D.G., and Thompson, D.J. (1952), A generalization of sampling without replacement from a finite universe, Journal of the American Statistical Association, 47, 663-685.

Jäckle, A., Roberts, C., and Lynn, P. (2010), Assessing the effect of data collection mode on measurement, International Statistical Review, 78, 3-20.

Kempthorne, O. (1955), The Randomization Theory of Experimental Inference, Journal of the American Statistical Association, 50, 946-967.

Luiten, A., Campanelli, P., Klaasen, D., and Beukenhorst, D. (2008), Advance letters and the language and behaviour profile, paper presented at the 19th International Workshop on Household Survey Nonresponse.

Mahalanobis, P.C. (1946), Recent experiments in statistical sampling in the Indian Statistical Institute, Journal of the Royal Statistical Society, 109, 325-370.

Montgomery, D.C. (2001), Design and Analysis of Experiments, New York: John Wiley.

Narain, R. (1951), On sampling without replacement with varying probabilities, Journal of the Indian Society of Agricultural Statistics, 3, 169-174.

Särndal, C.E., Swensson, B., and Wretman, J.H. (1992), Model Assisted Survey Sampling, New York: Springer Verlag.

Scheffé, H. (1959), The Analysis of Variance, New York: John Wiley.

Skinner, C.J., Holt, D., and Smith, T.M.F. (1989), Analysis of Complex Surveys, Chichester: John Wiley.

van den Brakel, J.A. (2008), Design-based analysis of embedded experiments with applications in the Dutch Labour Force Survey, Journal of the Royal Statistical Society, Series A, 171, 581-613.

van den Brakel, J.A. (2010), Design-based analysis of factorial designs embedded in probability samples, Discussion paper 201014, Statistics Netherlands, Heerlen.

van den Brakel, J.A., and Binder, D. (2000), Variance estimation for experiments embedded in complex sampling schemes, Proceedings of the Section on Survey Research Methods, American Statistical Association, 805-810.

van den Brakel, J.A., and Van Berkel, C.A.M. (2002), A Design-based Analysis Procedure for Two-treatment Experiments Embedded in Sample Surveys. An Application in the Dutch Labor Force Survey, Journal of Official Statistics, 18, 217-231.

van den Brakel, J.A., and Renssen, R.H. (1998), Design and analysis of experiments embedded in sample surveys, Journal of Official Statistics, 14, 277-295.

van den Brakel, J.A., and Renssen, R.H. (2005), Analysis of experiments embedded in complex sampling designs, Survey Methodology, 31, 23-40.

Wald, A. (1943), Tests of statistical hypotheses concerning several parameters when the number of observations is large, Transactions of the American Mathematical Society, 54, 426-482.
