Probabilistic prediction is about passively selecting a sub-ensemble,
leaving all the mechanisms in place, and seeing what turns up after applying
that filter. Causal prediction is about actively *producing* a new
ensemble, and seeing what would happen if something were to change
("counterfactuals"). Graphical causal models are a way of reasoning about
causal prediction; their algebraic counterparts are structural equation models
(generally nonlinear and non-Gaussian). The causal Markov property.
Faithfulness. Performing causal prediction by "surgery" on causal graphical
models. The d-separation criterion. Path diagram rules for linear models.
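The conditioning-vs.-surgery contrast can be sketched in a few lines of code. This is a toy linear, Gaussian structural equation model, not one from the notes, and it's in Python rather than the course's R; the variables (`u`, `x`, `y`) and coefficients are invented for illustration:

```python
# Passive selection vs. surgery in a toy structural equation model
# with an unobserved confounder u of x and y.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Structural equations: u -> x, u -> y, x -> y.
u = rng.normal(size=n)
x = 2.0 * u + rng.normal(size=n)
y = 3.0 * x + 5.0 * u + rng.normal(size=n)

# Probabilistic prediction: passively select the sub-ensemble with x near 1.
# E[u | x = 1] = 2/5, so E[y | x = 1] = 3 + 5*(2/5) = 5.
sel = np.abs(x - 1.0) < 0.1
print(y[sel].mean())          # close to 5

# Causal prediction: surgery deletes the arrow u -> x and sets x = 1
# everywhere, leaving the other mechanisms alone.
# E[y | do(x = 1)] = 3 + 5*E[u] = 3.
x_do = np.full(n, 1.0)
y_do = 3.0 * x_do + 5.0 * u + rng.normal(size=n)
print(y_do.mean())            # close to 3
```

The gap between the two answers (5 vs. 3) is exactly the confounding that surgery removes.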

*Reading*:
Notes, chapter 22

Posted by crshalizi at April 15, 2012 20:03 | permanent link

In which the analysis of multivariate data is recursively applied.

*Reading*:
Notes, assignment

Posted by crshalizi at April 15, 2012 20:02 | permanent link

Conditional independence and dependence properties in factor models. The generalization to graphical models. Directed acyclic graphs. DAG models. Factor, mixture, and Markov models as DAGs. The graphical Markov property. Reading conditional independence properties from a DAG. Creating conditional dependence properties from a DAG. Statistical aspects of DAGs. Reasoning with DAGs; does asbestos whiten teeth?
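Conditioning on a common effect (a collider) is how a DAG *creates* dependence; the asbestos-and-teeth style of spurious reasoning has this shape. A minimal simulation, with made-up variables and coefficients (Python sketch, not from the notes):

```python
# Collider / "explaining away": x and y are independent causes of z,
# but become dependent once we condition on z.
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + 0.5 * rng.normal(size=n)     # collider: x -> z <- y

print(np.corrcoef(x, y)[0, 1])            # near 0: marginally independent
sel = z > 2.0                             # condition on the collider
print(np.corrcoef(x[sel], y[sel])[0, 1])  # clearly negative
```

Given a large `z`, a large `x` "explains away" the need for a large `y`, hence the induced negative correlation.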

*Reading*:
Notes, chapter 21

Posted by crshalizi at April 15, 2012 20:01 | permanent link

From factor analysis to mixture models by allowing the latent variable to be discrete. From kernel density estimation to mixture models by reducing the number of points with copies of the kernel. Probabilistic formulation of mixture models. Geometry: planes again. Probabilistic clustering. Estimation of mixture models by maximum likelihood, and why it leads to a vicious circle. The expectation-maximization (EM, Baum-Welch) algorithm replaces the vicious circle with iterative approximation. More on the EM algorithm: convexity, Jensen's inequality, optimizing a lower bound, proving that each step of EM increases the likelihood. Mixtures of regressions. Other extensions.
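The E-step/M-step loop is short enough to write out in full. This is a generic two-component univariate sketch in Python, not a translation of the course's R code; the initialization (components started at the data's extremes) is one crude choice among many:

```python
# EM for a two-component univariate Gaussian mixture.
import numpy as np

def em_gmm2(x, n_iter=200):
    """Return (weights, means, sds) for a 2-component Gaussian mixture."""
    mu = np.array([x.min(), x.max()], dtype=float)  # crude initialization
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E step: responsibilities (unnormalized Gaussian densities;
        # the 1/sqrt(2*pi) constant cancels in the ratio).
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M step: weighted maximum likelihood given the responsibilities.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2, 1, 700), rng.normal(3, 0.5, 300)])
pi, mu, sigma = em_gmm2(x)
print(pi, mu, sigma)   # roughly (0.7, 0.3), (-2, 3), (1, 0.5)
```

The vicious circle — needing the cluster assignments to estimate the parameters and vice versa — shows up here as the alternation between `resp` and `(pi, mu, sigma)`.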

Extended example: Precipitation in Snoqualmie Falls revisited. Fitting a two-component Gaussian mixture; examining the fitted distribution; checking calibration. Using cross-validation to select the number of components to use. Examination of the selected mixture model. Suspicious patterns in the parameters of the selected model. Approximating complicated distributions vs. revealing hidden structure. Using bootstrap hypothesis testing to select the number of mixture components.
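The model-selection idea can be sketched compactly: fit mixtures with different numbers of components on a training set and compare held-out log-likelihood. This Python sketch (not the chapter's R code) uses a single train/test split as a stand-in for full cross-validation, with invented function names and quantile-based initialization:

```python
# Choose the number of mixture components by held-out log-likelihood.
import numpy as np

def fit_gmm(x, k, n_iter=200):
    """EM for a k-component univariate Gaussian mixture."""
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)  # spread-out initialization
    sigma = np.full(k, x.std())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
               / (sigma * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        nk = resp.sum(axis=0)
        pi, mu = nk / len(x), (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        sigma = np.maximum(sigma, 1e-3)  # guard against variance collapse
    return pi, mu, sigma

def loglik(x, pi, mu, sigma):
    """Total log-likelihood of x under the fitted mixture."""
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
           / (sigma * np.sqrt(2 * np.pi))
    return np.log(dens.sum(axis=1)).sum()

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(0, 1, 500), rng.normal(4, 1, 500)])
rng.shuffle(x)
train, test = x[:700], x[700:]
scores = {k: loglik(test, *fit_gmm(train, k)) for k in (1, 2, 3)}
print(scores)   # k = 2 should clearly beat k = 1 on this two-cluster sample
```

Held-out likelihood penalizes the extra components automatically: the in-sample likelihood never decreases with `k`, but the test-set score stops improving once the real structure is captured.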

*Reading*:
Notes, chapter 20; `mixture-examples.R`

Posted by crshalizi at April 15, 2012 20:00 | permanent link