A paper to read by Gelman and Shalizi: Philosophy and the practice of Bayesian statistics

The great 20th-century physicist Richard Feynman supposedly quipped “Philosophy of science is about as useful to scientists as ornithology is to birds.” As always, Feynman has a point, but in the fields of statistics, machine learning, and data science, understanding at least some of the philosophy behind techniques can prevent an awful lot of silliness and generate better results.

In their paper, Philosophy and the practice of Bayesian statistics (British Journal of Mathematical and Statistical Psychology, 2013, 66, 8-38), Andrew Gelman and Cosma Shalizi offer a thoughtful piece on what is really going on – or what really should be going on – in Bayesian inference. This paper is a short, highly interesting read, and I strongly suggest that all data scientists in the federal government put it on their reading lists.

For the uninitiated, statistical inference falls into two broad schools. The first, often called “classical statistics”, follows Neyman-Pearson hypothesis tests, Neyman’s confidence intervals, and Fisher’s p-values. Inference in this school typically rests on maximizing the likelihood function, which yields parameter estimates with standard errors. It is usually the first school people encounter in introductory courses. The second school – Bayesian statistical inference – starts with a prior distribution over the parameter space and uses data to transform the prior into a posterior distribution. The philosophy behind the classical school is often said to be deductive, and the one behind the Bayesian school inductive. The classical school follows a method that leads to rejection or falsification of a hypothesis, while the Bayesian school follows an inductive “learning” procedure with beliefs that rise and fall with posterior probabilities. Basically, if it’s not in the posterior, the Bayesian says it’s irrelevant.

The Bayesian philosophy has always made me feel a bit uncomfortable. Bayesian methods are not the issue; I use them all the time. It’s the interpretation as pure inductive learning that has always bothered me. I’ve long felt that, in the end, the prior-to-posterior procedure is actually a form of deductive reasoning, but with regularization over the model space.
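The prior-to-posterior step described above can be made concrete with a minimal sketch. It assumes a conjugate Beta prior on a success probability and binomial data; the function names, the prior parameters, and the counts are hypothetical values chosen purely for illustration, not anything taken from the paper.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior parameters after observing binomial data.

    With a Beta(alpha, beta) prior and a binomial likelihood, the posterior
    is Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior: mildly informative and centred on 0.5.
prior = (2.0, 2.0)

# Hypothetical data: 7 successes and 3 failures.
posterior = beta_binomial_update(*prior, successes=7, failures=3)

prior_mean = beta_mean(*prior)
posterior_mean = beta_mean(*posterior)

print(f"prior mean:     {prior_mean:.3f}")
print(f"posterior mean: {posterior_mean:.3f}")
```

The posterior mean lands between the prior mean and the observed success rate, which is the regularizing pull of the prior. Note that nothing in this update asks whether a binomial model was a sensible choice in the first place.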

Gelman and Shalizi go right to the heart of this issue, claiming that “this received view [pure inductive learning] of Bayesian inference is wrong.” In particular, the authors address the question: What if the “true” model is not among the models covered by any prior or collection of priors, which is always the case in the social sciences? In operations research and anything connected to the social sciences, all models are false; we always start with an approximation that we ultimately know is wrong, but useful. Gelman and Shalizi provide a wonderful discussion about what happens with Bayesian inference when the “true” model does not form part of the prior, a situation they label as the “Bayesian principal-agent problem”.
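A toy sketch may help show why this matters. It is my own illustration, not an example from the paper: the prior puts weight on only two candidate coin biases, neither of which matches the hypothetical data-generating bias of 0.4. All names and numbers here are assumptions made for the illustration.

```python
import math


def posterior_over_models(candidates, prior, heads, tails):
    """Posterior probabilities over a finite set of candidate coin biases."""
    log_post = [
        math.log(weight) + heads * math.log(p) + tails * math.log(1.0 - p)
        for p, weight in zip(candidates, prior)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post)
    unnormalized = [math.exp(value - peak) for value in log_post]
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


# Hypothetical model space: the bias is either 0.2 or 0.5, equally likely.
candidates = [0.2, 0.5]
prior = [0.5, 0.5]

# Hypothetical data consistent with a bias of 0.4, which is in neither model.
posterior = posterior_over_models(candidates, prior, heads=400, tails=600)

for p, prob in zip(candidates, posterior):
    print(f"P(bias = {p} | data) = {prob:.6f}")
```

The posterior piles essentially all of its mass on 0.5, the less wrong of the two candidates. Read naively, that looks like near-certainty, yet the posterior itself gives no hint that both candidates are false.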

In the end, Gelman and Shalizi emphasize the need for model testing and checking, through new data or simulations. They demand that practical statisticians interrogate their models, pushing them to the breaking point and discovering what ingredients can make the models stronger. We need to carefully examine how typical or extreme our data are relative to what our models predict. The authors highlight the need for graphical and visual checks in comparisons of the data to simulations. This model-checking step applies equally to Bayesian model building, and in that sense both schools of statistics are hypothetico-deductive in their reasoning. In fact, the real power behind Bayesian inference lies in its deductive ability over lots of inferences. The authors essentially advocate the model-building approach of George Box and hold to a largely Popperian philosophy.
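One way to compare data against simulations is a posterior predictive check. The following is a minimal sketch under stated assumptions: a Poisson model with a conjugate Gamma prior, the sample variance as the test statistic, and a made-up set of overdispersed counts. The helper names and all numbers are hypothetical, and a numerical tail probability is only a stand-in for the graphical checks the authors emphasize.

```python
import math
import random
import statistics


def sample_poisson(rate, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def posterior_predictive_p_value(data, prior_shape, prior_rate, n_sims, rng):
    """Share of replicated data sets whose variance is at least the observed one."""
    n = len(data)
    # Conjugate update: Gamma(shape, rate) prior with a Poisson likelihood.
    post_shape = prior_shape + sum(data)
    post_rate = prior_rate + n
    observed = statistics.variance(data)
    exceed = 0
    for _ in range(n_sims):
        # random.gammavariate takes a scale parameter, hence 1 / rate.
        rate = rng.gammavariate(post_shape, 1.0 / post_rate)
        replicate = [sample_poisson(rate, rng) for _ in range(n)]
        if statistics.variance(replicate) >= observed:
            exceed += 1
    return exceed / n_sims


# Hypothetical counts with far more spread than a Poisson model allows.
data = [0, 0, 1, 0, 12, 0, 2, 0, 15, 1, 0, 9]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
p_value = posterior_predictive_p_value(
    data, prior_shape=1.0, prior_rate=1.0, n_sims=2000, rng=rng
)
print(f"posterior predictive p-value for the variance: {p_value:.3f}")
```

A tail probability close to zero says the fitted model almost never reproduces the spread seen in the data. That is a reason to revise the model, for example by allowing overdispersion, rather than to keep updating within it.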

Finally, Gelman and Shalizi caution us that viewing Bayesian statistics as subjective inductive inference can lead us to complacency in picking and averaging over models rather than trying to break our models and push them to the limit.

While Feynman might have disparaged the philosopher, he was a bit of a philosopher himself from time to time. In an address to the Caltech YMCA Lunch Forum on May 2, 1956, he said:

“That is, if we investigate further, we find that the statements of science are not of what is true and what is not true, but statements of what is known to different degrees of certainty: “It is very much more likely that so and so is true than that it is not true;” or “such and such is almost certain but there is still a little bit of doubt;” or – at the other extreme – “well, we really don’t know.” Every one of the concepts of science is on a scale graduated somewhere between, but at neither end of, absolute falsity or absolute truth.

It is necessary, I believe, to accept this idea, not only for science, but also for other things; it is of great value to acknowledge ignorance. It is a fact that when we make decisions in our life we don’t necessarily know that we are making them correctly; we only think that we are doing the best we can – and that is what we should do.”

I think Feynman would have been very much in favour of Gelman and Shalizi’s approach – how else can we learn from our mistakes?
