Conditional calibration and the sage statistician
Section 7. The conditional calibration plot and its use for sagely selecting procedures to use with observed data
The conditionally calibrated (CC) statistician, faced with estimating an estimand using a procedure applied to an observed data set, cares about the procedure’s local calibration being approximately nominal, i.e., close to 95%, especially for Truths with large local match rates to the observed data, indicating that such Truths could plausibly have generated that data set. In other words, when comparing procedures for estimating the estimand from the observed data, the sage statistician, in addition to conservative unconditional calibration (i.e., confidence coverage), especially cares about accurate calibration for Truths that are plausible, and therefore implicitly ignores the calibration of procedures for Truths that are implausible given the observed data.

Description for Figure 7.1
Figure presenting the conditional calibration plot. It contains hypothetical simulation results with a fixed data set and a fixed set of nine possible Truths for three procedures: conditional calibration (CC), not conditional calibration (Not CC) and confidence interval (CI). The local calibration is on the y-axis, ranging from 0% to 100%, where 0% to 95% corresponds to invalid, 95% is nominal and 95% to 100% corresponds to too inclusive. The y-axis is not linear in the calibration but expanded for calibration values closer to unity. The local match rates for the nine possible Truths are on the x-axis, ranging from 0 to 1.
The CC procedure is labeled “Smile” because it is approximately calibrated (local calibration close to 95%) for Truths with local match rates close to 1, even if its local calibration is well below 95% for Truths with local match rates much lower than 1. A second procedure, Not CC, is labeled “Frown” because it is not CC, i.e., its local calibration is substantially less than 95% even where the local match rate is close to 1. The CI procedure is labeled “Neutral [CI]” because, although it is a valid confidence interval in Neyman’s sense of having its minimum local calibration at least 95%, it is not approximately calibrated for Truths with local match rates close to 1.
Figure 7.1 presents hypothetical simulation results with a fixed data set and a fixed set of nine possible Truths (with nine associated local match rates to the observed data set) for three procedures, indicated by faces. The vertical axis is not linear in the local calibration but expanded for calibration values closer to unity, which is where our interest is focused. One procedure is labeled “Smile” because it is approximately calibrated (local calibration close to 95%) for possible Truths that could have generated the observed data (local match rate close to 1), even though it is poorly calibrated (local calibration well below 95%) for a priori possible Truths that are implausible given the observed data (local match rate much lower than 1). A second procedure is labeled “Frown” because it is not CC, being invalid (meaning its local calibration is substantially less than 95%), including for Truths that are plausible given the observed data. The third procedure is labeled “Neutral [CI]” because, although it is a valid confidence interval in Neyman’s sense of having its minimum local calibration at least 95%, it is not approximately calibrated for Truths that are plausible given the observed data set. This procedure could, for me, be described by a mild frown, but maybe not for Neyman, based on our 1970s conversation.
That is, to repeat, Neymanian (conservative = confidence) calibration for each procedure formally cares only about the procedure’s minimum local calibration across the entire ensemble of a priori possible Truths. The rigid Bayesian, in turn, cares only about the weighted average of the local calibrations across the possible Truths, weighted by the prior (and possibly unreliable) distribution for the Truths. The sage CC statistician cares about approximate local calibration of procedures for those Truths that are plausible; if a confidence-valid 95% procedure displays local calibration values substantially larger than 95% for plausible Truths, this suggests that there exist better CC procedures for this situation with the observed data set, that is, calibrated procedures that are more efficient and so result in shorter intervals. Notice, for example, that the confidence-valid procedure in Figure 7.1 (Neutral face) has worse CC than Smile. Thus, although it is a plausible competitor to Smile at the design stage, it should be seen as inferior to Smile after seeing the data, because it is too conservative for some of the relevant Truths.
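As a rough illustration of how the three viewpoints can rank the same procedures differently, the following Python sketch computes one summary per viewpoint. Everything in it is an assumption made for illustration: the numbers are invented rather than read from Figure 7.1, the function names are hypothetical, the prior is taken to be uniform, and the match-rate cutoff of 0.8 used to call a Truth “plausible” is an arbitrary choice that the text does not specify.

```python
# Hypothetical illustration only: none of these numbers come from Figure 7.1.
NOMINAL = 0.95

# Invented local match rates for nine possible Truths (x-axis of the plot).
match_rate = [0.05, 0.10, 0.20, 0.35, 0.50, 0.65, 0.80, 0.90, 0.98]

# Invented local calibrations (y-axis), one value per Truth and procedure.
calibration = {
    "Smile": [0.40, 0.55, 0.70, 0.85, 0.92, 0.94, 0.95, 0.95, 0.95],
    "Frown": [0.96, 0.95, 0.93, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65],
    "Neutral [CI]": [0.95, 0.96, 0.97, 0.98, 0.99, 0.99, 0.995, 0.995, 0.999],
}

# Assumed uniform prior over the nine Truths.
prior = [1.0 / len(match_rate)] * len(match_rate)


def neyman_minimum(cal):
    """Neymanian summary: minimum local calibration over all Truths."""
    return min(cal)


def bayes_average(cal, weights):
    """Rigid-Bayesian summary: prior-weighted average local calibration."""
    return sum(w * c for w, c in zip(weights, cal)) / sum(weights)


def cc_gap(cal, rates, cutoff=0.8, nominal=NOMINAL):
    """CC-style summary (assumed form): largest absolute deviation from
    nominal among Truths whose match rate is at least `cutoff`."""
    plausible = [c for c, m in zip(cal, rates) if m >= cutoff]
    return max(abs(c - nominal) for c in plausible)


for name, cal in calibration.items():
    print(
        f"{name:13s} min={neyman_minimum(cal):.3f} "
        f"avg={bayes_average(cal, prior):.3f} "
        f"cc_gap={cc_gap(cal, match_rate):.3f}"
    )
```

With these invented values, only the Neutral procedure passes the minimum-calibration check, while Smile has the smallest deviation from 95% among the plausible Truths, mirroring the qualitative ordering described above. A different cutoff, or a smooth weighting by match rate instead of a hard cutoff, would be an equally defensible way to encode “plausible”.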