It would be wrong to say that Judea Pearl knows more about causal inference
than anyone else (I can think of some rivals very close to where I'm writing
this), but he certainly knows *a lot*, and has
worked tirelessly to formulate and spread the modern way of thinking about the
subject, centered around
graphical models and their associated structural equations. I remember
spending many happy hours with his book *Causality* when it came out
in 2000, and look forward to spending more with the new edition, which is
making its way to me through the mail now. In the meantime, however, there
is what he describes as "A new survey paper, gently summarizing everything I know about causation (in only 43 pages)":

- "Causal Inference in Statistics: An Overview", forthcoming in *Statistics Surveys* **3**(2009): 96--146 [Free PDF]

  *Abstract*: This review presents empirical researchers with recent advances in causal inference, and stresses the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underly all causal inferences, the languages used in formulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods that have been developed for the assessment of such claims. These advances are illustrated using a general theory of causation based on the Structural Causal Model (SCM) described in Pearl (2000a), which subsumes and unifies other approaches to causation, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring (from a combination of data and assumptions) answers to three types of causal queries: (1) queries about the effects of potential interventions, (also called "causal effects" or "policy evaluation") (2) queries about probabilities of counterfactuals, (including assessment of "regret," "attribution" or "causes of effects") and (3) queries about direct and indirect effects (also known as "mediation"). Finally, the paper defines the formal and conceptual relationships between the structural and potential-outcome frameworks and presents tools for a symbiotic analysis that uses the strong features of both.

The paper assumes a reader who's reasonably well-grounded in statistics, though not necessarily in the causal-inference literature. (Of such readers, I imagine applied economists might have more unlearning to do than most, because they will keep asking "but when do I start estimating beta?") It's not ideally calibrated for a reader coming from, say, machine learning.

One theme running through the paper is the futility of trying to define
causality in purely probabilistic terms, and the fact that cases where it looks
like one can do so are really cases where causal assumptions have been smuggled
in. Another is that once you realize counterfactual or mechanistic assumptions
are needed, the graphical-models/structural equation framework makes it
immensely easier to reason about them than does the rival "potential outcomes"
framework. In fact, the objects which the potential outcomes framework takes
as its primitives can be *constructed* within the structural framework,
so the correct part of the former is a subset of the latter. And by reasoning
on graphical models it is easy to see that confounding can be introduced by
"controlling for" the wrong variables, something explicitly denied by leading
members of the potential-outcomes school. (Pearl quotes them making this
mistake, and manages to pull off a more-in-sorrow-than-in-glee tone while doing
so.) Mostly, however, the paper is about showing off what can be done within
the new framework, which is really pretty impressive, and ought to be part of
the standard tool-kit of data analysis. If you are not already familiar with
it, this is an excellent place to begin, and if you are you will enjoy the
elegant and comprehensive presentation.
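
To make the point about "controlling for" the wrong variables concrete, here is a minimal simulation sketch. Everything in it is my own illustrative assumption rather than anything taken from Pearl's paper: the linear-Gaussian structural equations, the variable names (`u1`, `u2`, `z`, `x`, `y`), the coefficients, and the helper functions. The graph is the so-called M-structure, in which `z` is a pre-treatment variable that shares one unobserved cause with the treatment `x` and another with the outcome `y`, but is neither a cause nor an effect of either.

```python
import numpy as np


def ols_slopes(y, *regressors):
    """Least-squares slopes of y on the given regressors (intercept dropped)."""
    design = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]


def simulate_m_structure(n=200_000, true_effect=1.0, seed=0):
    """Simulate a hypothetical linear SCM with graph
    u1 -> x, u1 -> z, u2 -> z, u2 -> y, x -> y  (u1, u2 unobserved).
    Returns the slope on x without and with adjustment for z.
    """
    rng = np.random.default_rng(seed)
    u1 = rng.normal(size=n)              # unobserved cause of x and z
    u2 = rng.normal(size=n)              # unobserved cause of z and y
    z = u1 + u2 + rng.normal(size=n)     # collider on the path x <- u1 -> z <- u2 -> y
    x = u1 + rng.normal(size=n)          # treatment
    y = true_effect * x + u2 + rng.normal(size=n)  # outcome

    unadjusted = ols_slopes(y, x)[0]     # regress y on x alone
    adjusted = ols_slopes(y, x, z)[0]    # regress y on x and z
    return unadjusted, adjusted


unadjusted, adjusted = simulate_m_structure()
print(f"true effect of x on y:     1.000")
print(f"slope without adjusting z: {unadjusted:.3f}")
print(f"slope after adjusting z:   {adjusted:.3f}")
```

Under these assumed equations the unadjusted slope should land near the true effect of 1, while adjusting for `z` pulls it toward 0.8 (the population value works out to 4/5), because conditioning on the collider opens the otherwise blocked path through `u1` and `u2`. Nothing here depends on linearity in principle; the linear model just keeps the arithmetic checkable by hand.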

Looking back over what I write in this blog, I feel like, on the one hand,
there's too little of it lately, and on the other hand, it's too tilted towards
negative, critical stuff. While not regretting at all being negative and
critical about stupid ideas that need to be criticized (or, really,
pulverized), I will try to expand and balance my output by posting at least
once a week on some *good* science. We'll see how this goes.

Posted by crshalizi at September 25, 2009 10:12