### Principal Components Analysis (Advanced Data Analysis from an Elementary Point of View)

Principal components analysis (PCA) is the simplest, oldest and most robust of
dimensionality-reduction techniques. It works by finding the line (plane,
hyperplane) which passes closest, in mean squared distance, to all of the data
points. This is equivalent to maximizing the variance of the projection of the
data onto the line/plane/hyperplane. Actually finding those principal components reduces
to finding eigenvalues and eigenvectors of the sample covariance matrix. Why
PCA is a data-analytic technique, and not a form of statistical inference. An
example with cars. PCA with words: "latent semantic analysis"; an example with
real newspaper articles. Visualization with PCA and multidimensional scaling.
Cautions about PCA; the perils of reification; illustration with genetic
maps.
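
To make the eigenvector recipe above concrete, here is a minimal sketch in Python with NumPy. It is not taken from the course's `pca.R`; the function name `pca`, its arguments, and the synthetic two-dimensional data are all assumptions made for illustration. It centers the data, forms the sample covariance matrix, and takes the leading eigenvectors as the principal components.

```python
import numpy as np

def pca(X, q):
    """Hypothetical helper: top-q principal components via the covariance matrix.

    Returns (components, variances, scores), where the columns of
    `components` are unit-length directions.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)               # center each variable
    cov = np.cov(Xc, rowvar=False)        # sample covariance (n - 1 denominator)
    evals, evecs = np.linalg.eigh(cov)    # eigh: for symmetric matrices, ascending order
    order = np.argsort(evals)[::-1][:q]   # indices of the q largest eigenvalues
    components = evecs[:, order]
    scores = Xc @ components              # projections of the data onto the components
    return components, evals[order], scores

# Synthetic data (an assumption, not the cars data): points near the line y = 2x.
rng = np.random.default_rng(0)
t = rng.normal(size=500)
X = np.column_stack([t, 2 * t]) + 0.1 * rng.normal(size=(500, 2))

components, variances, scores = pca(X, q=1)
w = components[:, 0]

# Variance of the projection onto the first component.
projected_var = np.var(scores[:, 0], ddof=1)

# Mean squared distance from the (centered) points to the fitted line.
Xc = X - X.mean(axis=0)
residuals = Xc - scores @ components.T
residual_ms = (residuals ** 2).sum() / (len(X) - 1)

# Total variance is the sum of the variances of the individual variables.
total_var = np.trace(np.cov(Xc, rowvar=False))
```

In this sketch `projected_var` matches the leading eigenvalue, and `projected_var + residual_ms` adds up to `total_var`, which is why maximizing the projected variance and minimizing the mean squared distance pick out the same line. The sign of `w` is arbitrary: `w` and `-w` describe the same component.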

*Reading*: Notes, chapter 18;
`pca.R`, `pca-examples.Rdata`, and `cars-fixed04.dat`


Posted by crshalizi at April 03, 2012 09:10