Calibrated Bayesian Inference: a Comment on The Vindication of Magnitude-Based Inference

Roderick J Little

Sportscience 22, sportsci.org/2018/CommentsOnMBI/rjl.htm, 2018

Summary: Direct probability statements about the sizes of effects require Bayesian methods. The Hopkins and Batterham approach appears to be a special case of “calibrated Bayes” inference, which seeks Bayesian inferences with “dispersed” priors that yield posterior credibility intervals with good frequentist properties. I think that in many settings calibrated Bayes is a good basis for inference. But we should not bury the prior distribution, which should be declared and subject to criticism, along with other aspects of the statistical model.

The ASA Statement on P-Values describes limitations of hypothesis
testing that are broadly acknowledged by many statisticians (Wasserstein and
Lazar, 2016). Confidence intervals are an improvement, since they focus on
estimated sizes of effects with associated estimates of uncertainty. However,
it is not possible to make direct statements about the “chance that the true effect is large” without being Bayesian and therefore invoking a prior probability distribution for parameters. My impression of magnitude-based inference (MBI) is that it essentially computes the posterior distribution under a “dispersed” uniform prior and then computes posterior probabilities that the effect attains various sizes. I think this use of “dispersed priors” that do not inject strong prior information is often a reasonable approach. However, the approach has a long history, and I don't think it requires a new name.
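
To make this concrete, here is a minimal sketch of the calculation, assuming a normal likelihood and a flat prior; the effect estimate, standard error, and smallest-meaningful-effect threshold are illustrative values I have chosen, not quantities taken from MBI itself.

```python
# Posterior effect-size probabilities under a flat ("dispersed" uniform)
# prior. With a normal likelihood, the posterior for the true effect is
# Normal(estimate, SE^2), so "the chance that the true effect is large"
# is just a normal tail area. All numbers are illustrative.
from scipy.stats import norm

estimate = 1.2    # observed effect estimate (e.g. a difference in means)
se = 0.6          # standard error of the estimate
threshold = 0.5   # smallest effect size considered practically meaningful

p_beneficial = 1 - norm.cdf(threshold, loc=estimate, scale=se)
p_harmful = norm.cdf(-threshold, loc=estimate, scale=se)
p_trivial = 1 - p_beneficial - p_harmful

print(f"P(effect > +{threshold}) = {p_beneficial:.3f}")
print(f"P(|effect| <= {threshold}) = {p_trivial:.3f}")
print(f"P(effect < -{threshold}) = {p_harmful:.3f}")
```

In this normal model the flat-prior posterior tail probabilities numerically match familiar frequentist tail areas, which is one reason such dispersed-prior inferences tend to be well calibrated.
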
In statistics, this approach is often called “objective Bayes”. The term is somewhat problematic, since there is no prior distribution that is completely “objective”, a criticism dating back to Fisher (1922). I prefer the term “calibrated Bayes”, an approach that seeks priors that lead to posterior distributions with good frequentist properties – for example, 95% posterior credibility intervals should be well “calibrated”, in the sense of having close to 95% confidence coverage in repeated sampling. Two excellent papers, by Box (1980) and Rubin (1984), capture the essence of this approach. A relatively non-technical discussion is Little (2006).
Concerns about subjectivity have led people to try to finesse the need to formulate a prior distribution. A famous example is Fisher’s mysterious “fiducial” inference, which seems to work only in special cases. However, I think prior distributions play an important role in the inference, as illustrated in the well-known “screening paradox” for a rare disease. Prior distributions need to be out in the open and subject to criticism, like other features of a statistical model.
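
For readers unfamiliar with the paradox, here is a short worked version of the calculation via Bayes’ theorem; the prevalence, sensitivity, and specificity are illustrative numbers, not figures from any particular screening test.

```python
# The screening paradox, worked through Bayes' theorem. The prevalence,
# sensitivity and specificity below are illustrative, not from any
# particular screening test.
prevalence = 0.001   # prior probability of disease
sensitivity = 0.99   # P(positive test | disease)
specificity = 0.95   # P(negative test | no disease)

p_positive = (sensitivity * prevalence
              + (1 - specificity) * (1 - prevalence))
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(disease | positive test) = {p_disease_given_positive:.3f}")
# ~0.019: despite a highly accurate test, fewer than 2% of positives
# actually have the disease, because the prior (prevalence) dominates.
```

Even with a highly accurate test, the low prior probability of disease dominates, so most positives are false positives; that is precisely why the prior cannot be hidden from view.
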
Box GEP (1980). Sampling and Bayes' inference in scientific modelling and robustness. Journal of the Royal Statistical Society Series A 143, 383-430

Fisher RA (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London Series A 222, 309-368

Little RJA (2006). Calibrated Bayes: a Bayes/frequentist roadmap. The American Statistician 60, 213-223

Rubin DB (1984). Bayesianly justifiable and relevant frequency calculations for the applied statistician. Annals of Statistics 12, 1151-1172

Wasserstein RL, Lazar NA (2016). The ASA's statement on p-values: context, process, and purpose. The American Statistician 70, 129-133

First published 3 June 2018.