Deborah Mayo gave a precis of her error-statistical view of inquiry; I'm not likely to improve on my first attempt to summarize it. (Let me also plug her later paper with D. R. Cox, "Frequentist Statistics as a Theory of Inductive Inference".) As for simplicity, Mayo expressed great skepticism as to whether it has any virtues as such. The key thing, to her mind, is to probe theories in ways which expose potential faults and rule out alternatives. (She had a nice slogan summarizing this, but my hand-writing is too illegible for me to reconstruct it.) "Simplicity", in its various senses, may or may not be conducive to this; it's nice if it is, but if not, not. — It occurs to me that many of the capacity-control measures used in learning theory (Rademacher complexity, VC entropy, etc.) have a severity-ish flavor of "how easily could this model seem to work really well, even if it's rubbish?", and it would be interesting to try to make that more precise, perhaps along the lines of Balduzzi's paper on falsification and generalization error.
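To make that "severity-ish" reading of capacity control a bit more concrete, here is a toy sketch of my own (not from the workshop, and not Balduzzi's construction): estimate how well a model class can fit pure noise, which is the intuition behind empirical Rademacher-style complexity measures. Every function name and parameter here is invented for illustration.

```python
# Toy illustration (assumption-laden, not anyone's actual method):
# measure "how easily could this model seem to work really well,
# even if it's rubbish?" by averaging training accuracy on random labels.
import random

def noise_fitting_ability(fit, n_points, n_features, n_trials=50):
    """Average training accuracy of `fit` on random +/-1 labels.

    `fit(X, y)` must return a function h mapping a feature vector to a
    +/-1 prediction.  Accuracy near 1.0 on random labels means the
    class can look good on noise, so good fits are unsevere tests.
    """
    total = 0.0
    for _ in range(n_trials):
        X = [[random.gauss(0, 1) for _ in range(n_features)]
             for _ in range(n_points)]
        y = [random.choice([-1, 1]) for _ in range(n_points)]
        h = fit(X, y)
        total += sum(h(x) == yi for x, yi in zip(X, y)) / n_points
    return total / n_trials

# A deliberately over-flexible "classifier": memorize the training set.
def memorizer(X, y):
    table = {tuple(x): yi for x, yi in zip(X, y)}
    return lambda x: table.get(tuple(x), 1)

# A deliberately rigid one: always predict +1.
def constant(X, y):
    return lambda x: 1
```

On random labels the memorizer scores essentially 1.0 (it can "pass" any test, so passing tells us nothing), while the constant rule hovers around 0.5; a good fit from a low-noise-fitting class is, in that sense, a more severe probe.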
For the final roundtable, the speakers were supposed to take turns answering various questions posed by Kevin Kelly, beginning with "What is Ockham's Razor?" Dr. Vapnik went first, and felt compelled to stage an intervention, going to the chalk-board to remind the audience that counting parameters has almost nothing to do with over-fitting (offering support vector machines and boosting as examples), re-iterating the importance of regularizing ill-posed inverse problems and of finite-sample risk bounds, bemoaning the tendency of the statistical profession to follow Fisher's lead on parametric maximum likelihood rather than Glivenko and Cantelli's lead on non-parametric uniform convergence, and decrying the neglect of conditional density estimation. Finally, I believe he said that Ockham's Razor is at best a primitive approximation to structural risk minimization, but I am not at all sure I understood him correctly.
I spoke next, and all I can say is that Dr. Vapnik is a tough act to follow. If any consensus emerged on the nature of Ockham's Razor, it eluded me.
Miscellaneous topics: the biology of prions; how close Popper may or may not have come to defining VC dimension; how much ibuprofen can safely be taken every day; where to find a good sazerac near campus; Descartes's theological argument for the Markov property.
Once upon a time, I was involved in a project on modeling the growth of bacterial biofilms on hard surfaces, such as teeth and sewage tanks. (None of my work ended up making it into the paper, nor did it deserve to, so it wouldn't be right for me to say "we", but I know whereof I speak.) The team used what feels like a very simple model, where "food" (don't ask) diffused through the ambient medium, unless it got absorbed by bits of the bacterial film living on a hard surface; bits of film which were well-enough fed could expand onto nearby territory, or buckle up into the medium, but they died off without food. What made this simple, it seems to me, was having so few distinct processes or mechanisms in the model. Each process also had, in the notation available, a very simple representation, but that's not really the point. If the probability of film growth as a function of food intake had followed an arbitrary curve drawn free-hand, it probably would've taken a lot of parameters to write down as a spline, but it would still have been a simple model. (At most it would have led to statistical issues with a quantitative fit.) To make the model more complex, it would have had to incorporate other processes. For instance, the model treats all bits of biofilm as equivalent, regardless of how old they are, but it's entirely conceivable that as a bit of the film ages, there are processes of maturation (or ecological succession) which make it more or less efficient at converting food to biomass. A model which included an age-dependent "yield curve" needn't have any more adjustable parameters (the curve could be taken from a separate experiment and fixed), but it would definitely be more complex.
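For concreteness, the kind of model described might be caricatured like this. This is my own toy reconstruction, not the team's actual code; every rate, threshold, and update rule below is an assumption made up for illustration.

```python
# Toy grid caricature (my invention, not the published model):
# food diffuses on a grid, film cells eat it, well-fed bits spread
# to a neighboring site, and starved bits die.  Note how few distinct
# processes there are: diffusion, eating, spreading, dying.
import random

def step(food, film, size, eat=0.5, spread_if=0.3, diffuse=0.2):
    """One update of the food field and the film occupancy grid."""
    # Process 1: food diffuses, each cell sharing a fraction with
    # its four neighbors (periodic boundaries, for simplicity).
    new_food = [row[:] for row in food]
    for i in range(size):
        for j in range(size):
            share = food[i][j] * diffuse / 4
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                new_food[(i + di) % size][(j + dj) % size] += share
            new_food[i][j] -= food[i][j] * diffuse
    food = new_food
    # Processes 2-4: film eats local food; well-fed bits spread to a
    # random neighbor; bits that get nothing die off.
    new_film = [row[:] for row in film]
    for i in range(size):
        for j in range(size):
            if film[i][j]:
                meal = min(food[i][j], eat)
                food[i][j] -= meal
                if meal >= spread_if:
                    di, dj = random.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
                    new_film[(i + di) % size][(j + dj) % size] = True
                elif meal == 0:
                    new_film[i][j] = False
    return food, new_film
```

Adding an age-dependent yield curve would mean threading a new state variable (cell age) and a new process through this loop, whether or not the curve itself carried any free parameters; that, and not the parameter count, is what would make the model more complex.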
Now, the model didn't seem to need such a mechanism, so it doesn't have one. (In fact, I'm not sure this issue was even considered at the time.) It's this, the leave-out-processes-you-don't-need, which seems to me the core of the Razor for scientific model-building. This is definitely not the same as parameter-counting, and I think it's also different from capacity control and even from description-length-measuring (cf.), though I am open to Peter persuading me otherwise. I am not, however, altogether sure how to formalize it, or what would justify it, beyond an aesthetic preference for tidy models. (And who died and left the tidy-minded in charge?) The best hope for such justification, I think, is something like Kevin's idea that the Razor helps us get to the truth faster, or at least with fewer needless detours. Positing processes and mechanisms which aren't strictly called for to account for the phenomena is asking for trouble needlessly — unless those posits happen to be right. There is also the very subtle issue of what phenomena need to be accounted for. (The model was silent about the color of the biofilms; why disdain that?)
To sum up, I emerge from the workshop (1) not sure that my favorite complexity measure captures the kind of complexity which is involved in the use of Occam's Razor in scientific modeling (though it might), (2) not sure that said use is an aid to finding truth as opposed to a world-building preference of those William James called "tough minded" (though it might), and (3) unsure how to connect the most promising-sounding account of how the Razor might help to actual scientific practice. In other words, I spent three days talking with philosophers, and am now more confused than ever.
There's one more post I want to write about the Razor and related topics, but there's other stuff to do before then. Once again, I'll point you to Deborah Mayo and Larry Wasserman in the meanwhile. In particular, don't miss Larry's post about the excellent talk Grünwald gave after the conference at the statistics department, about some aspects of inconsistent Bayesian inference and their remedies. It might lead to a universal form of posterior predictive checking (maybe), which would be very welcome indeed.
Posted by crshalizi at June 26, 2012 22:40 | permanent link