From the all-too-small Department of Unambiguously Good Things Happening to People Who Thoroughly Deserve Them, Judea Pearl has won the Turing Award for 2011. As a long-time admirer*, I could not be more pleased, and would like to take this opportunity to recommend his Causality again.
I realize it edges into "I liked Feynman before he joined the Manhattan Project; the Williamsburg Project was edgier" territory, but I have very vivid memories of reading Probabilistic Reasoning in Intelligent Systems in the winter months of early 1999, and being correspondingly excited to hear that the first edition of Causality was coming out...
Posted by crshalizi at March 30, 2012 15:30 | permanent link
Attention conservation notice: Only of interest if you (1) care about high-dimensional statistics and (2) will be in Pittsburgh over the next two weeks.
I am not sure how our distinguished speakers would feel about being called sorcerers, but since one of them is using sparsity to read minds, and the other to infer causation from correlation, it is hard to think of a more appropriate word.
As always, the talks are free and open to the public; hecklers will, however, be turned into newts.
Posted by crshalizi at March 29, 2012 13:10 | permanent link
Attention conservation notice: Only of interest if you (1) care about statistical models of networks or collective information-processing, and (2) will be in Pittsburgh this week.
I am behind in posting my talk announcements:
Posted by crshalizi at March 26, 2012 10:00 | permanent link
You are a theoretical physicist trying to do data analysis, and "such a shande far di goyim!" is all I can think after reading your manuscript. Even if it turns out we are playing out this touching scene (which never fails to bring tears to my eyes) — no.
(SMBC via Lost in Transcription)
Update: Thanks to reader R.K. for correcting my Yiddish.
Posted by crshalizi at March 21, 2012 11:49 | permanent link
Homework 7: A little theory, a little methodology, a little data analysis: a balanced diet for the growing young statistician.
Posted by crshalizi at March 20, 2012 10:31 | permanent link
Simulation: implementing the story encoded in the model, step by step, to produce something data-like. Stochastic models have random components and so require some random steps. Stochastic models specified through conditional distributions are simulated by chaining together random variables. How to generate random variables with specified distributions. Simulation shows us what a model predicts (expectations, higher moments, correlations, regression functions, sampling distributions); analytical probability calculations are short-cuts for exhaustive simulation. Simulation lets us check aspects of the model: does the data look like typical simulation output? if we repeat our exploratory analysis on the simulation output, do we get the same results? Simulation-based estimation: the method of simulated moments.
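The two ideas at the heart of that lecture — generating random variables with a specified distribution, and chaining conditional distributions together — can be illustrated in a few lines. This is a minimal sketch in Python (the course itself works in R), with a made-up toy model: X is Exponential(1), drawn by the inverse-transform method, and Y given X is Gaussian with mean X.

```python
import math
import random

def sample_exponential(rate, rng):
    # Inverse-transform method: if U ~ Uniform(0,1), then
    # -log(1 - U) / rate has the Exponential(rate) distribution.
    u = rng.random()
    return -math.log(1.0 - u) / rate

def simulate_chain(n, rng):
    # Chain conditional distributions: draw X, then draw Y | X.
    # Here X ~ Exp(1) and Y | X = x ~ N(x, 1), a toy two-stage model.
    draws = []
    for _ in range(n):
        x = sample_exponential(1.0, rng)
        y = rng.gauss(x, 1.0)
        draws.append((x, y))
    return draws

rng = random.Random(42)
sims = simulate_chain(10000, rng)
# "Analytical calculations are short-cuts for exhaustive simulation":
# by the law of total expectation, E[Y] = E[X] = 1, and the simulated
# mean should be close to that.
mean_y = sum(y for _, y in sims) / len(sims)
```

Repeating the run with different seeds and watching the simulated mean fluctuate around the analytical answer is exactly the "what does the model predict?" exercise described above, just automated.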
Posted by crshalizi at March 20, 2012 10:30 | permanent link
My paper with Aaron Clauset and Mark Newman on power laws has just passed 1000 citations on Google Scholar, slightly ahead of schedule. (Actually, the accuracy of Aaron's prediction is a little creepy.)
I am spending the day reading over my student Daniel McDonald's dissertation draft. The calendar tells me that I was in the middle of writing up my own dissertation in mid-March 2001. But this is impossible, since I could swear that was just a few months ago at most, not eleven years.
Most significant of all, one of my questions has been answered by Guillaume the adaptationist goat.
Posted by crshalizi at March 15, 2012 11:00 | permanent link
The desirability of estimating not just conditional means, variances, etc., but whole distribution functions. Parametric maximum likelihood is a solution, if the parametric model is right. Histograms and empirical cumulative distribution functions are non-parametric ways of estimating the distribution: do they work? The Glivenko-Cantelli law on the convergence of empirical distribution functions, a.k.a. "the fundamental theorem of statistics". More on histograms: they converge on the right density, if bins keep shrinking but the number of samples per bin keeps growing. Kernel density estimation and its properties: convergence on the true density if the bandwidth shrinks at the right rate; superior performance to histograms; the curse of dimensionality again. An example with cross-country economic data. Kernels for discrete variables. Estimating conditional densities; another example with the OECD data. Some issues with likelihood, maximum likelihood, and non-parametric estimation.
Reading: Notes, chapter 15
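The kernel density estimator itself is simple enough to write out by hand. A minimal Python sketch (the course notes use R, and the toy data here is invented for illustration): the estimate at a point x is the average of Gaussian bumps centered at the data points, scaled by the bandwidth h, with h chosen by the usual Gaussian rule of thumb.

```python
import math

def gaussian_kernel(u):
    # Standard Gaussian density, used as the smoothing kernel.
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(x, data, bandwidth):
    # Kernel density estimate: average of kernels centered at each
    # data point, rescaled so the estimate integrates to 1.
    n = len(data)
    return sum(gaussian_kernel((x - xi) / bandwidth) for xi in data) / (n * bandwidth)

# Invented toy sample; any real analysis would use e.g. the
# cross-country economic data from the lecture.
data = [1.1, 1.9, 2.0, 2.4, 3.1, 3.5, 4.2]
n = len(data)
mean = sum(data) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
# Gaussian rule-of-thumb bandwidth, roughly 1.06 * sd * n^(-1/5);
# shrinking h as n grows is what delivers convergence to the true density.
h = 1.06 * sd * n ** (-0.2)
density_at_mean = kde(mean, data, h)
```

Note the bandwidth plays the role the shrinking-bin-width condition plays for histograms: h must go to zero, but slowly enough that each "effective bin" still accumulates data.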
Posted by crshalizi at March 08, 2012 10:30 | permanent link
Reminders about multivariate distributions. The multivariate Gaussian distribution: definition, relation to the univariate or scalar Gaussian distribution; effect of linear transformations on the parameters; plotting probability density contours in two dimensions; using eigenvalues and eigenvectors to understand the geometry of multivariate Gaussians; conditional distributions in multivariate Gaussians and linear regression; computational aspects, specifically in R. General methods for estimating parametric distributional models in arbitrary dimensions: moment-matching and maximum likelihood; asymptotics of maximum likelihood; bootstrapping; model comparison by cross-validation and by likelihood ratio tests; goodness of fit by the random projection trick.
Reading: Notes, chapter 14
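The "effect of linear transformations" fact is also the standard recipe for simulating from a multivariate Gaussian: if Z is standard normal and Sigma = L L^T is a Cholesky factorization, then mean + L Z has distribution N(mean, Sigma). A small self-contained Python sketch of the two-dimensional case (the course's computational examples are in R):

```python
import math
import random

def chol2x2(sigma):
    # Cholesky factor L of a 2x2 covariance matrix, so sigma = L L^T.
    a = math.sqrt(sigma[0][0])
    b = sigma[1][0] / a
    c = math.sqrt(sigma[1][1] - b * b)
    return [[a, 0.0], [b, c]]

def sample_mvn(mean, sigma, n, rng):
    # Linear transformations of Gaussians are Gaussian: X = mean + L Z
    # with Z ~ N(0, I) has covariance L L^T = sigma.
    L = chol2x2(sigma)
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x1 = mean[0] + L[0][0] * z1
        x2 = mean[1] + L[1][0] * z1 + L[1][1] * z2
        out.append((x1, x2))
    return out

rng = random.Random(1)
# Unit variances with correlation 0.8, so the density contours are
# ellipses stretched along the diagonal (the leading eigenvector).
draws = sample_mvn([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], 50000, rng)
cov12 = sum(x * y for x, y in draws) / len(draws)
```

The sample covariance recovering 0.8 is moment-matching in miniature; maximum likelihood for the Gaussian gives the same answer, since the MLE of the covariance is (up to the 1/n vs. 1/(n-1) convention) the sample covariance.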
Posted by crshalizi at March 06, 2012 09:25 | permanent link
Building a weather forecaster for Snoqualmie Falls, Wash., with logistic regression. Exploratory examination of the data. Predicting wet or dry days from the amount of precipitation the previous day. First logistic regression model. Finding predicted probabilities and confidence intervals for them. Comparison to spline smoothing and a generalized additive model. Model comparison test detects significant mis-specification. Re-specifying the model: dry days are special. The second logistic regression model and its comparison to the data. Checking the calibration of the second model.
Reading: Notes, second half of chapter 13; Faraway, chapters 6 and 7
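The first model's structure — probability of a wet day as a logistic function of the previous day's precipitation — is easy to sketch from scratch. A minimal Python version (the course fits these models with R's glm; the data below is invented stand-in data, not the Snoqualmie Falls record), maximizing the log-likelihood by plain gradient ascent:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(x, y, lr=0.1, steps=5000):
    # Logistic regression: P(wet today | precip yesterday = x)
    # modeled as sigmoid(b0 + b1 * x); fit by gradient ascent on
    # the Bernoulli log-likelihood.
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        g0 = sum(yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)) / n
        g1 = sum((yi - sigmoid(b0 + b1 * xi)) * xi for xi, yi in zip(x, y)) / n
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Hypothetical stand-in data: previous day's precipitation (mm) and
# whether the next day was wet (1) or dry (0); wet tends to follow wet.
prev = [0.0, 0.0, 0.2, 1.5, 3.0, 5.0, 0.0, 4.0, 0.1, 2.5]
wet  = [0,   0,   0,   1,   1,   1,   0,   1,   0,   1]
b0, b1 = fit_logistic(prev, wet)
p_after_heavy = sigmoid(b0 + b1 * 5.0)
p_after_dry = sigmoid(b0 + b1 * 0.0)
```

The re-specification step in the lecture — treating dry days as special — would amount to adding an indicator for "yesterday was completely dry" as a second covariate, which this sketch omits.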
Posted by crshalizi at March 01, 2012 10:30 | permanent link