Conditional calibration and the sage statistician
Section 8. Implementing this idea in practice

To implement this idea in practice would require work, certainly more intellectual effort than is currently expended in many statistical investigations. The implementation would begin at the same place as is standard in current carefully constructed studies. We would begin by considering a set of procedures, each of which is usually conservatively calibrated in the traditional sense, for the problem at hand. Then we would collect opinions from experts about the generally plausible Truths in the specific situation we are facing; this step is executed in some current problems, although typically informally.

If possible, we should gather some information about $W_k$, the prior weights on the possible Truths; these could be useful later when considering the construction of the matching averages $M_k$ (no asterisk yet, because the data, $Y^*$, are not yet observed). We should obtain agreement on how to define $M_k$ and whether to use the prior weights $W_k$. This is the ABC task. Finally, agreement is needed on how to use the CC plot to compare the various procedures being considered.

All of this effort should be conducted before the actual data set $Y^*$ is observed. For this reason alone, the implementation of this idea is more intellectually demanding than standard practice, but it is a component of being a sage statistician.
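The planning steps in this section can be made concrete with a small simulation. The following is a minimal sketch under stated assumptions, not the definitions used earlier in the paper: each plausible Truth is taken to be a normal model with known mean and standard deviation, each procedure is a normal-theory interval with multiplier `z`, a simulated data set "matches" when its sample mean falls within `tol` of the observed mean, and the matching average is taken to be the coverage among matching data sets. The names `matching_coverage`, `weighted_summary`, `tol`, and the way the prior weights enter are all hypothetical choices made for illustration.

```python
import random
import statistics


def interval(sample, z):
    """Normal-theory interval: sample mean +/- z * estimated standard error."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - z * se, m + z * se


def matching_coverage(truth, procedures, y_star_mean, n=20, sims=2000, tol=0.1, seed=0):
    """For one candidate Truth (mu, sigma), simulate data sets, keep those whose
    sample mean is within tol of the observed mean (ABC-style matching), and
    return the number kept plus each procedure's coverage among those kept.
    Assumption: this coverage stands in for the matching average of the text."""
    rng = random.Random(seed)
    mu, sigma = truth
    hits = {name: 0 for name in procedures}
    kept = 0
    for _ in range(sims):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if abs(statistics.fmean(sample) - y_star_mean) > tol:
            continue  # simulated data do not resemble the observed data
        kept += 1
        for name, z in procedures.items():
            low, high = interval(sample, z)
            hits[name] += low <= mu <= high
    coverage = {name: (hits[name] / kept if kept else float("nan")) for name in procedures}
    return kept, coverage


def weighted_summary(truths, weights, procedures, y_star_mean, **kwargs):
    """Combine per-Truth results, weighting each Truth by its prior weight times
    its number of matches (an illustrative assumption about how to use W_k)."""
    total = {name: 0.0 for name in procedures}
    norm = 0.0
    for truth, w in zip(truths, weights):
        kept, cov = matching_coverage(truth, procedures, y_star_mean, **kwargs)
        if kept == 0:
            continue  # this Truth never reproduced data like the observed data
        norm += w * kept
        for name in procedures:
            total[name] += w * kept * cov[name]
    return {name: total[name] / norm for name in procedures} if norm else {}


if __name__ == "__main__":
    truths = [(0.0, 1.0), (0.2, 1.0), (0.0, 2.0)]   # hypothetical plausible Truths
    weights = [0.5, 0.3, 0.2]                        # hypothetical prior weights
    procedures = {"z=1.96": 1.96, "z=2.58": 2.58}    # hypothetical procedures
    print(weighted_summary(truths, weights, procedures, y_star_mean=0.1))
```

Everything except the final argument `y_star_mean` can be fixed before the data are seen, which is the point of this section: the Truths, weights, matching rule, and procedures are agreed on in advance, and the observed data enter only when the comparison is run.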

Acknowledgements

This is the written version of D.B. Rubin’s Waksberg Award address, delivered 8 November 2018 in Ottawa, Canada. The author was a friend of Joseph Waksberg, was always impressed with his wise application of statistics, and hopes that this contribution continues that tradition. Other versions of this talk have been given over the past five years, most recently as the S.N. Roy Invited Lecture in Kolkata, India, on 27 December 2018. The author acknowledges very helpful comments from Roderick Little and Tommy Wright, from Wesley Yung and other members of the Survey Methodology editorial board, and recently from Hal Stern and Yannis Yatracos.

References

Box, G.E.P. (1976). Science and statistics. Journal of the American Statistical Association, 71(356), 791-799.

Box, G.E.P. (1980). Sampling and Bayes’ inference in scientific modelling and robustness. Journal of the Royal Statistical Society, Series A, 143(4), 383-430.

Cochran, W.G. (1963). Sampling Techniques, 2nd Edition. New York: John Wiley & Sons, Inc.

De Finetti, B. (1972). Probability, Induction, and Statistics. New York: John Wiley & Sons, Inc.

Dempster, A.P. (1967). Upper and lower probabilities induced by a multivalued mapping. The Annals of Mathematical Statistics, 38(2), 325-339.

Ferris, T. (2018). Topics in Casual Inference and the Law. Senior Data Scientist, Google.

Fisher, R.A. (1934). Contribution to a discussion of J. Neyman’s paper on the two different aspects of the representative method. Journal of the Royal Statistical Society, 97, 614-619.

Fisher, R.A. (1956). Statistical Methods and Scientific Inference. Edinburgh: Oliver and Boyd.

Kish, L. (1965). Survey Sampling. New York: John Wiley & Sons, Inc.

Lehmann, E.L. (1959). Testing Statistical Hypotheses. New York: John Wiley & Sons, Inc.

Lindley, D.V. (1971). Bayesian Statistics: A Review. SIAM.

Little, R.J. (2008). Weighting and prediction in sample surveys. Calcutta Statistical Association Bulletin, 60(3-4), 147-167.

Martin, R., and Liu, C. (2016). Inferential Models. New York: Chapman and Hall/CRC.

Neyman, J. (1923). On the application of probability theory to agricultural experiments. Essay on principles. Translated into English in Statistical Science, 1990, 5(4), 463-472.

Neyman, J. (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society, 97(4), 558-625.

Rosenbaum, P.R., and Rubin, D.B. (1984). Sensitivity of Bayes inference with data dependent stopping rules. The American Statistician, 38, 106-109.

Rubin, D.B. (1978). Multiple imputations in sample surveys-A phenomenological Bayesian approach to nonresponse. Proceedings of the Survey Research Methods Section, American Statistical Association, 20-34.

Rubin, D.B. (1983). A case study of the robustness of Bayesian methods of inference: Estimating the total in a finite population using transformations to normality. Scientific Inference, Data Analysis and Robustness. New York: Academic Press, Inc., 213-244.

Rubin, D.B. (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. Annals of Statistics, 12, 1151-1172.

Rubin, D.B. (1996). Multiple imputation after 18+ years. Journal of the American Statistical Association, 91(434), 473-489.

Rubin, D.B. (2008). Discussion of “Weighting and prediction in sample surveys” by R.J. Little. Calcutta Statistical Association Bulletin, 60, 185-190.

Rubin, D.B. (2016). Fisher, Neyman, and Bayes at FDA. Journal of Biopharmaceutical Statistics, 26, 1020-1024.

Savage, L.J. (1954). The Foundations of Statistics. Wiley Publications in Statistics.

Tavare, S., Balding, D.J., Griffiths, R.C., and Donnelly, P. (1997). Inferring coalescence times from DNA sequence data. Genetics, 145(2), 505-518.

Von Neumann, J. (1947). The mathematician. In Works of the Mind, (Ed., R.B. Haywood), University of Chicago Press, 180-196.

Wald, A. (1950). Statistical Decision Functions. New York: John Wiley & Sons, Inc.

