Books to Read While the Algae Grow in Your Fur
Books (etc.) I've read this month and
feel I can recommend (warning: I have no taste)
 Jiang Rong (i.e., Lu Jiamin), Wolf Totem
 Environmentalist fiction, about the destruction of nomadism, and indeed of
the Inner Mongolian steppe, by Han expansion during the Cultural Revolution.
(It seems to be at least somewhat autobiographical.) On the one side, it's
pretty heavy-handed, and stylistically even a bit awkward. (I could believe
that many subtleties did not survive translation, but unless the translator was
a complete butcher, much of the dialogue is just stilted, as-you-know-Chen
infodumping.) There is also a lot of Noble Savage primitivism, and I at least
find a little of this goes a long way. Against that, there is a real story
here, told with real feeling for its characters and its subject matter, and
with finely-honed observations. (Or at least, it gives every appearance to me
of these virtues; after all, what do I know of Inner Mongolia in the 1960s?)
After the first few chapters, I think absolutely nothing in the
plot surprised me, but I still wanted to see it all unfold.
 I can't remember where I saw this recommended, but I'm glad I followed whomever's advice.
 Scott Lynch, The Republic of Thieves
 Kameron Hurley, God's War
 Chelsea Cain, One Kick
 George Scialabba, The Modern Predicament
 M. Night Shyamalan, I Got Schooled: The Unlikely Story of How a Moonlighting Movie Maker Learned the Five Keys to Closing America's Education Gap
 Tony Cliff, Delilah Dirk and the Turkish Lieutenant
 Unusually delightful comic-book mind candy. The first few chapters are free online. A sequel is promised, and I await it eagerly. (See also Cliff's charming self-parody.)
 Oliver Morton, Eating the Sun: How Plants Power the Planet
 I cannot remember the last time I read a popular science book with such
enjoyment, or learned so much from it.
 The first part is about how photosynthesis works, at the physical and
molecular level. This is relayed by telling the story of how we came to that
understanding, and parts of the lives of its discoverers. This embraces a
surprisingly large range of the 20th century's golden age of science, and a
surprisingly large range of its sciences: biochemistry, the nuclear physics of
isotopes and radioactive decay, the quantum physics of molecular bonding and
the interaction of light and electricity, the biophysics of free energy flow
through cells and through molecules, crystallography, the molecular biology
which let us isolate and manipulate individual enzymes, and so on. (I was
pleased to learn how much of the early work was done at Berkeley.) This is a
story of discovery, rivalry, insights and false paths, human and biological
ingenuity, and ultimately a deep understanding of one of the fundamental
processes of life as we know it.
 The second part is about the evolution of photosynthesis, and the way
organisms carrying it out have interacted with the Earth's climate over the
last three-billion-and-change years. This covers everything from the origin of
life to plate tectonics to the spread of grasses over the last few million
years. Again, much of it is told through stories of discovery and the history
of the science. It is necessarily more conjectural than the very settled
science of how photosynthesis works, but none the less fascinating for all
that.
 The third part is about what Morton calls the "climate/carbon crisis".
Agriculture already had nontrivial impacts on climate, but our real change
began with the Industrial Revolution and the vast growth in consuming fossil
fuels. (The second part had a very nice explanation of where those fossil
fuels came from.) Huge amounts of carbon compounds, charged with free energy
by photosynthesis and then taken out of the biosphere by geological processes
over millions of years, are getting burned to release the energy, and returned
to the biosphere much faster than they can be processed. The result is that
the atmospheric carbon dioxide concentration has already drastically increased
over what it was a few centuries ago, and is pretty much bound to keep rising
for a long time. Since atmospheric carbon dioxide is good at trapping heat
radiated back from the ground, the first-order effect of this is to warm the
Earth. The exact effects depend on incredibly complicated and ill-understood
feedback processes. (For instance: leaves release water vapor, regulating this
through their stomata; what
will a warmer atmosphere with more carbon dioxide do to cloud formation above
tropical forests, or above plankton blooms?) To take these uncertainties as
ground for complacency, though, seems grotesque.
 Our global civilization runs at something like 40 terawatts. There is
enough fossil fuel to keep going for centuries. (It's doubtful there's enough
oil, but there's a lot more coal and natural gas,
and quite
practical ways of turning them into liquid fuels.) Dumping that much more
carbon into the atmosphere, though, is not going to lead to anything good.
Tidal and geothermal energy are too localized and small-scale to be global
solutions. Nuclear fission looks more attractive when one compares long-lived
radioactive waste to long-lived carbon dioxide as a pollutant, but there are
very real practical obstacles. All our other options are ultimately solar
powered: winds, rivers,
photovoltaic devices, biomass. Morton is very hopeful about the last two, and
especially about what real molecular engineering might be able to do in the
space intermediate between photovoltaic plates (high efficiency, but also high
cost) and naturally-occurring leaves (low efficiency, but they grow).
 This is a marvelous book, meaning one filled with wonders: I strongly urge
you to encounter them for yourself.
 Richard R. Nelson, The Moon and the Ghetto: An Essay on Public Policy Analysis
 Nominally, Nelson's starting point here is the then-frequent question of
why, if we can put people on the moon, we can't do anything about the ghetto.
He uses this as a launching point to examine what he sees as the three leading
traditions of public policy analysis then on offer: the cost-benefit school
influenced by (if not in thrall to) economics; the organizational perspectives
coming from sociology and political science; and the research-and-development
tradition that looks at solving problems by focused technological research.
All three are, on quite sensible grounds, found wanting. The cost-benefit
school has a clear normative structure which often seems compelling: who
would want fewer benefits at higher costs? But, outside of very limited areas,
it totally founders on the issue of determining what the costs and benefits
really are, and of who pays the costs and who receives the benefits. (While
Nelson doesn't go far into this, the Kaldor-Hicks idea that one can evade this
by looking at whether the winners could compensate the losers was worth
exploring but ultimately fails badly,
as Steve Randy Waldman has
recently explained at length.) The organizational analysts don't have good
causal models of what consequences will follow from changes in how some area of
policy concern is organized, and lack any sort of definite normative theory to
set up against cost-benefit analysis. (In this, as in much else, conviction
can be more persuasive than sanity.) R&D is great, but there are very few
areas of public policy concern where one can plausibly argue that what we're
really lacking is technological know-how.
 These chapters are followed by two which look now very much like period
pieces: one is about the difficulties of subsidizing childcare, and the other
about public support for developing supersonic passenger jets, and liquid
metal fast breeder reactors. The more enduring lessons here are that there are
lots of ways of organizing economic activity, and shaping it to public ends,
which go beyond the simple "profit-driven markets will take care of it" / "the
government has to do it" alternatives. (Actually, a lot of the issues he
raises about how hard it would be for parents to know whether daycare centers
are doing a good job would seem to be ones which the Internet could help
alleviate...)
 The work ends with a preview of the evolutionary economics Nelson and
Winter put forward in their now-classic book. This is capped by an
exhortation, in thinking about public policy, to think about the sources of
variation, the selective environment, and how to take advantage of novelty and
variation. This all seems sensible, but if I were someone who had to craft or analyze public policy, it's not very clear what I should do.
 I do not think it is an accident that Nelson never gets around to
explaining why we could send people to the Moon, but not do anything about the
ghetto.
 Not totally unrelated: a plea to "put whitey back on the moon"
 Iain M. Banks, Consider Phlebas
 I picked up the Culture series with later books, and never got back to the
beginning. This is almost everything space opera ought to be.
Upcoming Talks
None from now through October

September 15, 2014
Introduction to Statistical Computing
At an intersection of Enigmas of
Chance and
Corrupting the Young.
Class homepage
Fall 2014
Class announcement
Lectures:
 Introduction to the Course; Basic Data Types
 Bigger Data Structures
 Dataframes and Control
 Introduction to Strings
 Regular Expressions
 Writing Functions (gmp.dat file for the example)
Labs:
 Exponentially More Fun
 Things That Go Vroom
 Scrape the Rich! (rich.html file)
Homework:
 Rainfall, Data Structures, Sequences
 Housing, Dataframes, Control
 Super Scalper Scrape (NHLHockeySchedule2.html)
Fall 2013
Class announcement
Lectures:
 Combined lectures 1 and 2: intro to the class, basic data types, basic data structures, structures of structures
 Flow control, iteration, vectorization
 Writing and Calling Functions
 Writing Multiple Functions
 Top-Down Design
 Testing
 Debugging
 Functions as Objects
 Optimization I: Simple Optimization
 Abstraction and Refactoring
 Split, Apply, Combine I: Using Basic R
 Split, Apply, Combine II: Using plyr
 Simulation I: Generating Random Variables
 Simulation II: Markov Chains
 Simulation III: Monte Carlo and Markov Chain Monte Carlo
 Simulation IV: Quantifying uncertainty with simulations
 Optimization II: Deterministic, unconstrained optimization
 Optimization III: Stochastic and constrained optimization
 Basic character/string manipulation
 Regular expressions
 Importing data from web pages
 Review on text processing
 Change of representation; text as vectors
 Databases
 Simulation V: Matching simulation models to data
 Speed, computational complexity, going beyond R
 Computing for statistics
Unnumbered because not actually delivered in class: The Scope of Names
Labs:
 Basic Probability, Basic Data Structures
 Only the Answers Have Changed
 Of Big and Small Hearted Cats
 Like a Jackknife to the Heart
 Testing Our Way to Outliers
 I Can Has Likelihood Surface?
 Bunches of Novels
 How Antibiotics Came to Peoria
 Tremors
 Scrape the Rich
 Baseball Salaries
Homework:
 Rainfall, Data Structures, Obsessive Doodling
 Tweaking Resource-Allocation-by-Tweaking
 Hitting Bottom and Calling for a Shovel
 Standard Errors of the Cat Heart
 Dimensions of Anomaly
 I Made You a Likelihood Function, But I Ate It
 The Intensity of 19th Century Literature
 Antibiotic Diffusion and Outlier Resistance
Canceled
 A Maze of Twisty Little Passages
 Several Hundred Degrees of Separation
Exams:
 Midterm Exam
SelfEvaluation and Lessons Learned
Fall 2012
Class announcement
Lectures with no links haven't been delivered yet, and the order and topics may
change.
Lectures:
 Introduction to the class, basic data types, basic data structures
 More data structures: matrices, data frames, structures of structures
 Flow Control, Looping, Vectorization
 Writing and Calling Functions
 Writing Multiple Functions
 Top-Down Design
 Testing
 Debugging
 The Scope of Names
 Functions as Objects
 Split/Apply/Combine I: Using Basic R
 Split/Apply/Combine II: Using plyr
 Abstraction and Refactoring

 Graphics (canceled)
 Simulation I: Random variable generation
 Simulation II: Monte Carlo, Markov chains, Markov chain Monte Carlo
 Optimization I: Deterministic, Unconstrained
Optimization
 Optimization II: Stochastic, Constrained, and Penalized Optimization
 Basic Text Manipulation
 Regular Expressions I
 Regular Expressions II
 Importing Data from Web Pages
 Reshaping Data
 Relational Databases I
 Relational Databases II
Labs:
 Basic Probability, Basic Data Structures
 Flow Control and the Urban Economy
 Of Big and Small Hearted Cats
 Like a Jackknife to the Heart
 Testing Our Way to Outliers
 I Can Has Likelihood Surface?
 Bunches of Novels, or, Representation and the History of Genre
 How Antibiotics Came to Peoria
 A Maze of Twisty Little Passages
Homework:
 Rainfall, Data Structures, Obsessive Doodling
 Tweaking Resource-Allocation-by-Tweaking
 Hitting Bottom and Calling for a Shovel
 Standard Errors of the Cat Heart
 Dimensions of Anomaly
 I Made You a Likelihood Function, But I Ate It

Canceled
 The Intensity of 19th Century Literature
 Antibiotic Diffusion and Outlier Resistance
 Several Hundred Degrees of Separation
Exams:
 Midterm Exam
 Final Project Options
Fall 2011
Class announcement
Lectures:
 Introduction to the class, basic data types, basic data structures
 More Data Structures: Matrices, Lists, Data Frames, Structures of Structures
 Flow Control, Looping, Vectorization
 Writing and Calling Functions
 Writing Multiple Functions
 Top-Down Design
 The Scope of Names
 Debugging
 Testing
 Functions as Arguments
 Functions as Return Values
 Exam briefing
 Split, Apply, Combine: Using Base R
 Split, Apply, Combine: Using plyr
 Abstraction and Refactoring
 Simulation I: Random Variable Generation
 Exam debriefing
 Simulation II: Monte Carlo and Markov Chains
 Simulation III: Mixing and Markov Chain Monte Carlo
 Basic Character Manipulation
 Regular Expressions I
 Regular Expressions II
 Importing Data from Webpages I
 Importing Data from Webpages II
 Databases I
 Databases II
Homework:
 Rainfall and Data Structures
 Tweaking Resource-Allocation-by-Tweaking
 Improving Estimation by Nonlinear Least Squares
 Standard Errors of the Cat Heart
 Rancorous Testing
 OutlierRobust Linear Regression
 'Tis the Season to Be Unemployed
 Sampling Accidents
 Get (the 400) Rich(est list) Quick
 Baseball Salaries
Labs:
 Basic Probability and Basic Data Structures
 Flow Control and the Urban Economy
 Of Big and Small Hearted Cats
 Further Errors of the Cat Heart
 Testing Our Way to Outliers
 Likelihood
 SplitApplyCombine
 Changing My Shape, I Feel Like an Accident
 Regular Expressions I
Exams:
 Midterm
 Final Project Descriptions
My Work Here Is Done
SelfEvaluation and Lessons Learned
Posted by crshalizi at September 15, 2014 22:38  permanent link
August 29, 2014
Rainfall, Data Structures, Sequences (Introduction to Statistical Computing)
In which we practice working with data frames, grapple
with some of the subtleties of R's system of data types, and think about how to
make sequences.
(Hidden agendas: data cleaning; practice using R Markdown; practice reading
R help files)
Assignment, due at 11:59 pm on Thursday, 4 September 2014
Introduction to Statistical Computing
Posted by crshalizi at August 29, 2014 11:30  permanent link
Lab: Exponentially More Fun (Introduction to Statistical Computing)
In which we play around with basic data structures and convince ourselves that
the laws of probability are, in fact, right. (Or perhaps that R's random
number generator is pretty good.) Also, we learn to use R Markdown.
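The kind of check the lab describes can be sketched like so (in Python rather than the lab's R Markdown; the seed and sample size are my own choices, not the assignment's):

```python
# Sketch (Python, not the lab's R) of checking the laws of probability against
# a random number generator: the empirical frequency of heads over many
# simulated fair-coin flips should land very close to the true probability 0.5.
import random

random.seed(36350)                    # fixed seed for reproducibility
n = 100_000
heads = sum(random.random() < 0.5 for _ in range(n))
freq = heads / n
print(abs(freq - 0.5) < 0.01)         # within sampling error of 0.5
```

If the printed comparison ever came out False with a sound generator, it would be a (roughly six-standard-deviation) surprise; that is the sense in which such simulations "convince ourselves" the probability laws, or at least the generator, are behaving.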
— Getting everyone randomly matched for pair programming with a deck
of cards worked pretty well. It would have worked better if the university's
IT office hadn't broken R on the lab computers.
Lab (and its R Markdown source)
Introduction to Statistical Computing
Posted by crshalizi at August 29, 2014 10:30  permanent link
August 27, 2014
Bigger Data Structures (Introduction to Statistical Computing)
Matrices as a special type of array; functions for matrix arithmetic and
algebra: multiplication, transpose, determinant, inversion, solving linear
systems. Using names to make calculations clearer and safer:
resource-allocation mini-example. Lists for combining multiple types of
values; accessing sublists and individual elements; ways of adding and removing
parts of lists. Lists as key-value pairs. Data frames: the data structure for
classic tabular data, one column per variable, one row per unit; data frames as
hybrids of matrices and lists. Structures of structures: using lists
recursively to create complicated objects; example with eigen.
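For readers who don't use R, a rough Python analogue of the list and data-frame ideas above (the pet and city data, and the dict-of-columns trick, are my own illustrative sketch, not the course's code):

```python
# My own illustrative sketch (Python, not the course's R) of the structures
# described: a "list" as key-value pairs holding mixed types, and a tiny data
# frame as a dict of equal-length columns, one column per variable.
cat = {"name": "Tom", "weight_kg": 4.2, "vaccinated": True}
cat["age"] = 3             # adding a named component
del cat["vaccinated"]      # removing one

# Data frame: one column per variable, one row per unit.
df = {"city": ["Pittsburgh", "Seattle"], "population": [306000, 609000]}
row1 = {col: column[1] for col, column in df.items()}   # extract one row
print(row1)
```

The hybrid character of a data frame shows up here too: columns behave like named list entries, while rows are cross-sections through all the columns at once.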
Slides
Introduction to Statistical Computing
Posted by crshalizi at August 27, 2014 10:30  permanent link
August 25, 2014
Introduction to the Course; Basic Data Types (Introduction to Statistical Computing)
Introduction to the course: statistical programming for autonomy, honesty,
and clarity of thought. The functional programming idea: write code by
building functions to transform input data into desired outputs. Basic data
types: Booleans, integers, characters, floating-point numbers. Operators as
basic functions. Variables and names. Related pieces of data are bundled into
larger objects called data structures. Most basic data structures: vectors.
Some vector manipulations. Functions of vectors. Naming of vectors.
Our first regression. Subtleties of floating point numbers and of integers.
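The vector-and-regression thread can be sketched outside R too; this Python version, with made-up numbers, is mine and not the course's:

```python
# My sketch in Python (the course itself uses R): element-wise work with
# numeric vectors, then a "first regression" via the closed-form ordinary
# least-squares slope and intercept. The data are invented for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]                       # roughly y = 2x
xbar, ybar = sum(x) / len(x), sum(y) / len(y)  # vector reductions
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
slope = sxy / sxx
intercept = ybar - slope * xbar
print(round(slope, 2), round(intercept, 2))    # -> 1.94 0.15
```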
Slides
Introduction to Statistical Computing
Posted by crshalizi at August 25, 2014 11:30  permanent link
Class Announcement: 36-350, Statistical Computing, Fall 2014
Fourth time is charm:
 36-350, Statistical Computing
 Instructors: Yours truly and Andrew Thomas
 Description: Computational data analysis is an essential part of
modern statistics. Competent statisticians must not just be able to run
existing programs, but to understand the principles on which they work. They
must also be able to read, modify and write code, so that they can assemble the
computational tools needed to solve their dataanalysis problems, rather than
distorting problems to fit tools provided by others. This class is an
introduction to programming, targeted at statistics majors with minimal
programming knowledge, which will give them the skills to grasp how statistical
software works, tweak it to suit their needs, recombine existing pieces of
code, and when needed create their own programs.
 Students will learn the core ideas of programming — functions,
objects, data structures, flow control, input and output, debugging, logical
design and abstraction — through writing code to assist in numerical and
graphical statistical analyses. Students will in particular learn how to write
maintainable code, and to test code for correctness. They will then learn how
to set up stochastic simulations, how to parallelize data analyses, how to
employ numerical optimization algorithms and diagnose their limitations, and
how to work with and filter large data sets. Since code is also an important
form of communication among scientists, students will learn how to comment and
organize code.
 The class will be taught in the R
language, use RStudio for labs,
and R Markdown for assignments.
 Prerequisites: This is an introduction to programming for
statistics students. Prior exposure to statistical thinking, to data analysis,
and to basic probability concepts is essential, as is some prior acquaintance
with statistical software. Previous programming experience is not
assumed, but familiarity with the computing system is. Formally, the
prerequisites are "Computing at Carnegie Mellon" (or consent of instructor),
plus one of either 36-202 or 36-208, with 36-225 as either a prerequisite
(preferable) or corequisite (if need be).
 The class may be unbearably redundant for those who already know a
lot about programming. The class will be utterly incomprehensible for
those who do not know statistics and probability.
Further details can be found at
the class website.
Teaching materials (lecture slides, homeworks, labs, etc.), will appear both
there and here.
— The class is much bigger than in any previous year: we currently
have 50 students enrolled in two back-to-back lecture sections, and another
twenty-odd on the waiting list, pending more space for labs. Most of the ideas
tossed out in my last selfevaluation are going to be
at least tried; I'm particularly excited about pair programming for the labs.
Also, I at least am enjoying rewriting the lectures
in R
Markdown's presentation mode.
Manual trackback: Equitablog
Corrupting the Young;
Enigmas of Chance;
Introduction to Statistical Computing
Posted by crshalizi at August 25, 2014 10:30  permanent link
July 31, 2014
Books to Read While the Algae Grow in Your Fur, July 2014
Attention conservation notice: I have no taste.
 Stephen King, Eyes of the Dragon
 Mind candy. I really liked it when I was a boy, and on rereading it's not
been visited by
the Suck Fairy,
but I did come away with two thoughts. (1) I'd have been very interested to
see what a writer with a drier view of political power would have done
with the story elements (the two princes, the evil magician, the exiled nobles)
— Cherryh, say, or
Elizabeth Bear. (2) Speaking of which, it's striking how strongly King's
fantasy books (this one, The Dark Tower) buy into the idea
of rightfully inherited authority, when his horror stories are often
full of healthy distrust of government officials ("the Dallas police"). I
don't think he'd say that being electorally accountable, rather than chosen by
accident of birth, makes those in power less trustworthy...
 Charles Tilly, Why?
 Tilly's brief attempt to look at reasongiving as a social act, shaped by
relations between the giver and receiver of reasons, and often part of
establishing, maintaining, or repairing that relationship. He distinguishes
between reasons which involve cause-and-effect and those which use a logic of
"appropriateness" instead, and between those which require specialized knowledge
and those which don't. "Conventions" are common-knowledge reasons which invoke
appropriateness, not causal accounts. (Think "Sorry I'm late, traffic was
murder".) "Stories" give causal explanations which only invoke common
knowledge. Tilly is (explicitly) pretty Aristotelian about stories: they
involve the deeds of a small number of conscious agents, with unity of time,
place, and action. Codes are about matching circumstances to the right
specialized formulas and formalities: are your papers in order? Is the
evidence admissible? Technical accounts, finally, purport to be full
cause-and-effect explanations drawing on specialized knowledge.
 The scheme has some plausibility, and Tilly has lots of interesting
examples. But of course he has no argument that these two dimensions
(generalist vs. specialist, causation vs. appropriateness) are the only two big
ones, that everything in (e.g.) the "codes" box really does act the same way,
etc. So I'd say it's worth reading to chew over, rather than being
deeply illuminating.
 Elliott Kay, Rich Man's War
 Sequel to Poor Man's
Fight, continuing the same high standard of quality mind candy. (No
Powell's link because currently only available on Amazon.)
 Alexis de Tocqueville, Democracy in America
 Yet another deserved classic read only belatedly. Volume I is actually
about de Tocqueville's observations on, and ideas about, democracy in America.
This is interesting, mostly empirical, and full of intriguing accounts of
social mechanisms. (I see
why Jon
Elster is so into him.) Volume II consists of his dictates about what
democracy and social equality will do to customs and character in every
society. This is speculative and often the only reference to America comes in
the chapter titles. (I see why this would also appeal to Elster.)
 I would dearly like to find a good "de Tocqueville in retrospect" volume.
Some of his repeated themes are the weakness of the Federal government, the
smallness of our military, the absence of serious wars, the relative equality
of economic condition of the (white) population, the lack of big cities among
us. So how have we managed to preserve as much democracy as we have? For that
matter, how do the Civil War and its outcomes even begin to make sense from
his perspective?
 — Rhetorical observation: de Tocqueville was very fond of contrasts
where democracy leads to less dispersion among people than does aristocracy,
but around a higher average level. He either didn't have the vocabulary to say
this concisely, or regarded using statistical terms as bad style. (I suspect
the former, due to the time period.) He was also very fond of paradoxes, where
he either inverted directions of causal arrows, or flipped their signs.
 Maria Semple, Where'd You Go, Bernadette?
 Literary fiction about Seattle, motherhood, marital collapse, aggressively
eccentric architects, and Antarctica. Very funny and more than a bit
touching.
 Thomas Piketty, Capital in the Twenty-First Century [Online technical appendix, including extra notes, figures, and spreadsheets]
 Yes, it's as good and important as everyone says. If by some chance you
haven't read about this yet, I
recommend Robert
Solow, Branko Milanovic
and Kathleen
Geier for overviews; Suresh
Naidu's take
is the best I've seen on the strengths and weaknesses of the book, but doesn't
summarize so much.
 Some minor and scattered notes; I might write a proper review later. (Why
not? Everybody else has.)
 Perhaps it's the translation, but Piketty seems wordy and a bit
repetitive; I think the same things could have been said more briskly. Perhaps
relatedly, I got a little tired of the invocations of Austen, Balzac, and
American television.
 The book has given rise to the most
perfect "I
happen to have Marshall McLuhan right here" moment I ever hope to see.
 Attempts to undermine his data have,
unsurprisingly, blown up in his attackers' faces.
Similarly, claims that Piketty ignores historical contingency, political factors
and institutions are just bizarre.
 Granting that nobody has better point estimates, I wish he'd give margins
of error as well. (A counterargument: maybe he could calculate
purely-statistical standard errors, but a lot of the time they could
be swamped by nearly-impossible-to-estimate systematic errors, due to, e.g.,
tax evasion.)
 His two "laws" of capitalism are an accounting identity (the share of
capital in national income is the rate of return on capital times the ratio of
capital to income, \( \alpha = r \beta \) ), and a longrun equilibrium
condition (the steady-state capital/income ratio is the savings rate divided by
the economy-wide growth rate, \( \beta = s/g \) ), the latter presuming that
two quite variable quantities (\( s \) and \( g \)) stay fixed forever. So the
first can't help but be true, and the second is of limited relevance. (Why
should people keep saving the same fraction of national income as their wealth
and income change?) But I don't think this matters very much, except for the
style. (However, Milanovic has
an interesting
defense of Piketty on this point.)
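To make the two "laws" concrete, here is a minimal numerical sketch, with illustrative parameter values of my own rather than Piketty's data:

```python
# Minimal sketch of the two "laws", with invented (not empirical) numbers:
# wealth accumulates out of savings, K_{t+1} = K_t + s*Y_t, while income grows
# at the fixed rate g; the capital/income ratio beta = K/Y then converges to
# s/g (the "second law"), and alpha = r*beta (the "first law") is just an
# accounting identity on top of it.
s, g = 0.10, 0.02        # assumed savings and growth rates
Y, K = 1.0, 1.0          # arbitrary starting income and wealth
for _ in range(2000):
    K += s * Y           # accumulation out of savings
    Y *= 1 + g           # steady growth of income
beta = K / Y
alpha = 0.05 * beta      # capital's share, taking r = 5%
print(round(beta, 4), round(alpha, 4))   # -> 5.0 0.25
```

Note how slow the convergence is at realistic growth rates (the gap to \( s/g \) shrinks only by a factor of \( 1/(1+g) \) per period), which is one reason treating \( s \) and \( g \) as fixed forever does so much work in the argument.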

He gets
the Cambridge Capital Controversy wrong, and while
that matters
for our understanding of capital as a factor of production, it's
irrelevant
for capital as a store of value, which is what Piketty is all about.
Similarly, Piketty doesn't need to worry about
declining
marginal returns to capital in the economy's aggregate production function,
which is good, because aggregate production
functions make
no sense even within orthodox neoclassical economics. (The fact
that orthodox
neoclassical economists continue to use them is a bit of an intellectual
embarrassment; they should have more selfrespect.)
 The distinction between "income from labor" and "income from capital" is
part of our legal system, and Piketty rests a lot of his work on it. But it
seems to me an analytical mistake to describe the high compensation of a
"supermanager" as income from labor. While it isn't coming
from owning their corporation, it is coming from
(partially) controlling it. In some ways, it's more like the income
of an ancien regime tax
farmer, or an
Ottoman timariot, than the
income of a roofer, nurse's aide, computer programmer, or even an architect.
(Actually, the analogy with the timariot grows on me the more I think about it.
The timariot didn't own his timar, he couldn't sell it or bequeath it, any more
than a supermanager owns his company. Officially, income in both cases is
compensation for services rendered to the actual owner, whether sultan or
stockholder.) It would be very nice to see someone try to separate income
from labor and income from control, but I have no clue how to do it,
statistically. (Though I do have a modest proposal for how to
reduce the control income of supermanagers.)
 p. 654, n. 56: For "Claude Debreu", read "Gerard Debreu". (Speaking of economists' "passion for mathematics and for purely theoretical ... speculation"!)
 ETA: Let me emphasize the point about production functions, marginal
returns on capital, etc. It cannot be emphasized enough that capital, for
Piketty, is the same as wealth, assets, stores of value which can be traded in
the market. He does
not mean nonhuman factors of production, "capital goods".
(Cf.) Capital
goods can work fine as assets, but a much more typical asset is a claim on part
of the product achieved through putting capital goods and labor to use.
Because he is looking at wealth rather than capital goods, the appropriate unit
of measurement, and the one he uses, is monetary rather than physical. One
consequence is that Piketty can legitimately add up monetary amounts to get the
total wealth of a person, a class, or a country. (Whereas adding up capital
goods is deeply problematic at best; I don't think even the dullest Gosplan
functionary would've tried to get the total capital of the USSR by adding up
the weight or volume of its capital goods.)
 This also has implications for the "marginal product of capital" question.
If a capital good is measured in physical units, it's not crazy to imagine
diminishing marginal returns. If some line of work needs tools, equipment, a
proper space, etc., to be carried out, then the first crude tools and the shack
which allow it to get started increase output immensely, then having a bit more
equipment and a decent space helps, and after a certain point one extra spanner
or crucible, with no extra worker, does very little. (Not crazy, but also not
obviously true: see the work of Richard A. Miller
[i, ii],
which I learned of
from Seth
Ackerman's piece on Piketty.) Some critics of Piketty's forecasts point to
this, to argue that his vision of widening inequality will fail on these
grounds. They equate the rate of return on capital, Piketty's \( r \), with the
marginal product of capital, and, believing the latter must decline, think \( r
\) must shrink as well. We thus have the curious spectacle of apostles of
capitalism claiming it will be saved by a falling rate of profit. (I
believe Uncle Karl would have savored the irony.) This intuition, however, is
based on physical units of capital: spanners, crucibles, servers,
square meters of buildings. What about in monetary units?
 Well, what price would you, as a sensible capitalist, pay for a marginal
increase in your supply of some capital good? Its value to you is the present
value of the increased future production that it makes possible. (One buys
a stock of capital and receives a flow of product.) A $1
marginal increase in the capital stock has to produce at least $1 PV in extra
production. If it augmented the NPV of production by more than $1, you'd be
happy to buy it, but the price of that same physical capital good would then
presumably be bid up by others. (Or, alternately, if not bid up, you would
then buy another $1 worth of capital, until the diminishing returns
of physical capital set in.) At equilibrium, a marginal extra dollar
of capital should always, for every enterprise, increase the PV of production
by $1. Under the simplest assumption that the extra product is constant over
time, this means a marginal $1 of capital should increase production in each
time period by \( \delta \) dollars, where \( \delta \) is the discount rate. (Again, we're using monetary
and not physical units for output. Also, I neglect small complications from
depreciation and the like.) In symbols, \( r = \partial Y/\partial K = \delta
\). (\( K \) has units of money, and \( Y \) of money per unit time, so the
partial derivative has units of inverse time, as \( \delta \) should.) It is
surely not obvious that the discount rate should fall as capital accumulates.
 Expressed in other terms, the elasticity of substitution between capital
and labor thus ends up being the elasticity of the marginal product of labor
(\( \partial Y/\partial L \)) with respect to the ratio of capital to labor (\(
K/L \)). Again, this may or may not fall as \( K \) increases, but I don't see
how diminishing returns to physical capital guarantees this.
 However, the fact that the measured real rate of return on capital (which
Piketty puts at roughly 5% over all periods and countries) is so much higher
than any plausible discount rate suggests that the whole enterprise of trying
to relate returns on capital to marginal products is ill-conceived. Indeed,
Piketty rightly says as much, and his claim that \( r > g \) is just an
empirical regularity, true for most but not all of his data.
So it's clearly not immutable, and indeed his policy proposal of a progressive
tax on capital is designed to change it!
 Charles Stross, The Rhesus Chart
 Mind candy. Of course this is what would happen if some City
quants happened to find themselves turning into vampires...
 Susan A. Ambrose, Michael W. Bridges, Michele DiPietro, Marsha C. Lovett and Marie K. Norman, How Learning Works: Seven Research-Based Principles for Smart Teaching
 An excellent guide to what psychological research has to say about making
college-level teaching more effective; that is, helping our students
understand what we want them to learn, retain it, and use it and make it their
own. I'd already been following some of the recommendations, but I am going to
consciously try to do more, especially when it comes to scaffolding and giving
rapid, targeted feedback. Following through on everything here would be a
pretty daunting amount of work...
 Disclaimer: Four of the authors worked at CMU when the book was
published, and one is the spouse of a friend.
Books to Read While the Algae Grow in Your Fur;
Scientifiction and Fantastica;
Commit a Social Science;
Minds, Brains, and Neurons;
The Beloved Republic;
The Dismal Science;
Corrupting the Young;
The Commonwealth of Letters
Posted by crshalizi at July 31, 2014 23:59  permanent link
July 11, 2014
Attention conservation notice:
Leaden academic sarcasm about methodology.
The following statement was adopted unanimously by the editorial board
of the Journal of Evidence-Based Haruspicy, and reproduced here in full:
We wish to endorse, in its entirety and without reservation, the
recent
essay "On
the Emptiness of Failed Replications"
by Jason
Mitchell. In Prof. Mitchell's field, scientists attempt to detect subtle
patterns of association between faint environmental cues and measured
behaviors, or to relate remote proxies for neural activity to differences in
stimuli or psychological constructs. We are entirely persuaded by his
arguments that the experimental procedures needed in these fields are so
delicate and so tacit that failures to replicate published findings must
indicate incompetence on the part of the replicators, rather than the original
report being due to improper experimental technique or statistical
fluctuations. While the specific obstacles to transmitting experimental
procedures for social priming or functional magnetic resonance imaging are not
the same as those for reading the future from the conformation and coloration
of the liver of a sacrificed sheep, goat, or other bovid, we see no reason why
Prof. Mitchell's arguments are not at least as applicable to the latter as to
the former. Instructions to referees for JEBH will accordingly be modified to
enjoin them to treat reports of failures to replicate published findings as
"without scientific value", starting immediately. We hope by these means to
ensure that the field of haruspicy, and perhaps even all of the mantic
sciences, is spared the painful and unprofitable controversies over replication
which have so distracted our colleagues in psychology.
Questions about this policy should be directed to the editors; I'm
just the messenger here.
Manual trackback: Equitablog; Pete Warden
Learned Folly;
Modest Proposals
Posted by crshalizi at July 11, 2014 15:50  permanent link
July 06, 2014
Accumulated Bookchat
Attention conservation notice: I
have no taste, and I am about to recommend a lot of books.
Somehow, I've not posted anything about what I've been reading since
September. So: have
October,
November,
December,
January,
February,
March,
April,
May, and
June.
Posted by crshalizi at July 06, 2014 17:26  permanent link
June 30, 2014
Books to Read While the Algae Grow in Your Fur, June 2014
Attention conservation notice: I have no taste.
 Plato, The Republic
 I had a teacher in junior high who had the good idea, when I was bored, of
making me read philosophers and political writers he thought I'd violently
disagree with, and forcing me to explain why I thought they were wrong. The
ones which stuck with me were Ayn Rand and Plato. I did indeed disagree
furiously with both of them (I'd
already imprinted
on orcs), but they became part of the, as it were, invisible jury in my
head I run things by.
 Reading Drury on Strauss (below) drove me back to
the Republic. (You couldn't pay me enough to revisit Rand.) As a
grown-up, I find it such a deeply strange book as to sympathize with Strauss's
position that it couldn't possibly be taken at face value.
 For instance: the idea that justice is doing good to friends but bad to
enemies is proposed in I 332d, and then rejected with downright sophistry.
But it's then revived as a desideratum for the guardians
(II
375), and argued to be psychologically realizable because purebred dogs
show "love of learning and love of wisdom"
(II
376).
 Or again: the whole point of the book is supposedly to figure out what
justice is. The ideal city was spun out because it's supposed to be easier to
figure out what makes a just city than a just person. (No reason is given for
why the justice of the just city has to resemble the justice of the just person
any more than the beauty of a beautiful sunrise has to resemble the beauty of a
beautiful poem.) Plato's answer is that the justice of the ideal city consists
of the members of each class sticking to their duties and not getting above
their station
(IV
433). Socrates supposedly reaches this by a process of
elimination, all the other features of the city having been identified
with other virtues (IV 428-432). I won't say that this is the worst train of
reasoning ever (I've graded undergraduates), but how did it ever persuade
anyone?
 The whole thing is like that: a tissue of weak analogies, arbitrary
assertions, eugenic
numerology,
and outright
myths. Whatever you think about Plato's conclusions, there's hardly any
rational argument for those conclusions to engage with. And yet
this is the foundation-work of the western (as in, west-of-China)
intellectual tradition which prizes itself on, precisely, devotion to reason!
 Given how much better Plato could argue in works like
Euthyphro and Meno, how moving
the Apology is, how other dialogues show actual dialogue,
etc., I am led to wonder whether our civilization has not managed to canonize
one of the oldest surviving attacks of
the brain
eater.
 ObLinkage: Jo Walton reviewing it as though it were SF.
 Update: John Emerson on Plato.
 Christopher Moore and Ian Corson with Jennyson Rosero, The Griff
 Ted Naifeh, Courtney Crumrin and the Night Things
 Nick Spencer and Joe Eisma, Morning Glories: For a Better Future
 Brian K. Vaughan et al. Runaways, 2: Teenage Wasteland and 3: The Good Die Young
 Comic book mind candy, assorted.
 Shamini Flint, A Bali Conspiracy Most Foul
 Mind candy. The intersection of dissipated expat life with terrorism. (Previously.)
 John Layman and Rob Guillory, Chew (3, 4, 5, 6, 7, 8)
 Comic-book mind candy (forgive the pun). I'm not sure what further
food-related weirdness there is for them to pull, but I look forward to finding
out. (Previously: 1, 2.)
 Shadia B. Drury, The Political Ideas of Leo Strauss
 Convincing portrait of Strauss as someone who was basically Nietzschean,
and who projected his own views back on to admired figures from the past by the
device of claiming they engaged in "esoteric
writing". The esoteric doctrine is that the definition of justice given
and then (to exoteric eyes) rejected at the beginning
of The
Republic, namely helping one's friends and hurting one's enemies, is
in fact right, because there is really no basis for justice or morality beyond
force and fraud. When Plato's Socrates seems to say
that even
bandits must be just to each other in order to prey effectively on others,
what Plato really means is that this is all justice is. (In other
words, Thrasymachus is right.) Hedonism is also true, and the only real good
is pleasure in this world. Despite this, there are higher and lower types of
humanity; the highest types are the philosophers, the tiny elite able to take
pleasure from contemplating the Cosmic Null and/or fabricating new values.
Political society exists for their sake. If most people realized the truth,
political society would fall apart, so they need to be thoroughly soaked in the
illusions of morality, virtue, afterlives, personal divinities, etc.
Philosophers must on no account teach the truth in such a way that the masses
can pick up on it. For these purposes, "the masses" includes most rulers, who
should be just as much ideological dupes as any servant. Basically every
philosopher in the Greek tradition and its descendants, from the British Isles
to Khurasan, had this same esoteric teaching, whatever the differences in their
exoteric teachings. The rot set in when people like Machiavelli and Hobbes
began to give the game away, and look where we are now.
 Drury makes no attempt to evaluate Strauss as a historian of philosophy
(but cf.).
She confines criticism of his ideas to her last chapter, where she suggests
that people who believe this sort of thing are not going to be fun to live
around, or have in your government. Strauss's own
modes of interpretation (heavy on numerology and inversions of meaning) are
left undeployed. Mostly, it's just an attempt to say plainly, based on
Strauss's actual texts, what he says obscurely and circuitously. At that
point, criticism becomes almost superfluous.
 Sidenotes and speculations:
 1. Drury presumes that Strauss gave his story of the Platonic tradition of
political philosophy, and its degeneration via Machiavelli and Hobbes into mere
modernity, as a sincere (if between-the-lines) account of what happened. This
would make it a remarkably influential piece
of psychoceramica,
and Strauss a sort of superior (because genuinely
erudite) Mencius
Moldbug. After reading her, however, I wonder if it wasn't
a deliberate myth, told in indifference to the facts but
with an
eye on its effects on his students, or perhaps their students.
 2. It's interesting to imagine what Strauss or Straussians would've made of
evolutionary game theory. On the one hand, being so explicit that
"prosocial
behavior" means cooperating to prey on others might count as decadent
modernity. On the other hand, math is arguably even better than esoteric
writing for keeping the doctrine from the multitude, so it might be acceptable
as "political philosophy".
 3. It is true that there's a puzzle in interpreting The
Republic: the arguments against Thrasymachus are horribly bad. After
Thrasymachus is given a chance to state his views, Socrates tries to refute
them with a
series
of incredibly weak analogies, which shouldn't have convinced anyone.
(The counteranalogy
of the shepherd is much stronger than any of Socrates's.) Then
Thrasymachus shuts up in a huff, and
Glaucon rephrases
a very similar position in more social-contract or tit-for-tat terms
(recently
illustrated by John Holbo). Socrates's response is to change the
subject to the ideal city. Since Plato could certainly argue much more
logically, why didn't he? (ETA: See above.)
 Europa Report
 I appreciate the effort at making a hard-SF movie. But: how would a
private company make money sending an expedition to Europa? More importantly
(ROT13'd for spoilers), ubj
bsgra qbrf fbzrguvat ynaq ba Rhebcn, gb cebivqr na rpbybtvpny avpur sbe gur
perngher jr frr?
 Tim Seeley and Mike Norton, Revival: 1, You're Among Friends; 2, Live Like You Mean It; 3, A Faraway Place
 Comic book mind candy. It's just a little resurrection of the
dead, barely worth bothering over...
Books to Read While the Algae Grow in Your Fur;
Scientifiction and Fantastica;
Philosophy;
The Running-Dogs of Reaction;
Writing for Antiquity;
Pleasures of Detection, Portraits of Crime;
Learned Folly
Posted by crshalizi at June 30, 2014 23:59  permanent link
June 22, 2014
Notes on "Collective Stability in Structured Prediction: Generalization from One Example" (or: Small Pieces, Loosely Joined)
\[
\newcommand{\Prob}[1]{\mathbb{P}\left( #1 \right)}
\newcommand{\Expect}[1]{\mathbb{E}\left[ #1 \right]}
\newcommand{\zprime}{z^{\prime}}
\newcommand{\Zprime}{Z^{\prime}}
\newcommand{\Eta}{H}
\newcommand{\equdist}{\stackrel{d}{=}}
\newcommand{\indep}{\mathrel{\perp\llap{\perp}}}
\]
Attention conservation notice: 2700+ words, expounding a
mathematical paper on statistical learning theory. Largely written months ago,
posted now in default of actual content.
For the CMU statistical learning theory reading group, I decided to present
this:
 Ben London and Bert Huang and Benjamin Taskar and Lise Getoor, "Collective Stability in Structured Prediction: Generalization from One Example", in Sanjoy Dasgupta and David McAllester (eds.), Proceedings of the 30th International Conference on Machine Learning [ICML-13] (2013): 828-836
 Abstract: Structured predictors enable joint inference over multiple interdependent output variables. These models are often trained on a small number of examples with large internal structure. Existing distribution-free generalization bounds do not guarantee generalization in this setting, though this contradicts a large body of empirical evidence from computer vision, natural language processing, social networks and other fields. In this paper, we identify a set of natural conditions — weak dependence, hypothesis complexity and a new measure, collective stability — that are sufficient for generalization from even a single example, without imposing an explicit generative model of the data. We then demonstrate that the complexity and stability conditions are satisfied by a broad class of models, including marginal inference in templated graphical models. We thus obtain uniform convergence rates that can decrease significantly faster than previous bounds, particularly when each structured example is sufficiently large and the number of training examples is constant, even one.
The question being grappled with here is how we can learn from one
example, really from one realization of a stochastic process. Our usual
approach in statistics and machine learning is to assume we have many,
independent examples from the same source. It seems very odd to say that if we
see a single big, internally-dependent example, we're as much in the dark about
the data source and its patterns as if we'd observed a single one-dimensional
measurement, but that's really all a lot of our theory can do for us. Since we
know that animals and machines often can successfully learn
generalizable patterns from single realizations, there needs to be some
explanation of how the trick is turned...
This paper is thus relevant to my interests
in dependent
learning, time
series and spatiotemporal data,
and networks.
I read it when it first came out, but I wasn't at all convinced that I'd really
understood it, which was why I volunteered to present this. Given this, I
skipped sections 6 and 7, which specialize from pretty general learning theory
to certain kinds
of graphical
models. It's valuable to show that the assumptions of the general
theory can be realized, and by a nontrivial class of models at that,
but they're not really my bag.
At a very high level, the strategy used to prove a generalization-error
bound here is fairly familiar in learning theory. Start by establishing a
deviation inequality for a single well-behaved function. Then prove that the
functions are "stable", in the sense that small changes to their inputs can't
alter their outputs too much. The combination of pointwise deviation bounds
and stability then yields concentration bounds which hold uniformly over all
functions. The innovations are in how this is all made to work when we
see one realization of a dependent process.
Weak Dependence and a Pointwise Deviation Bound
The data here is an $n$-dimensional vector of random variables, $Z = (Z_1,
Z_2, \ldots, Z_n)$. N.B., $n$ here is NOT the number
of samples, but the dimensionality of our one example. (I might have preferred
something like $p$ here personally.) We do not assume that the $Z_i$
are independent, Markov, exchangeable, stationary, etc., just that $Z$ obeys
some stochastic process or other.
We are interested in functions of the whole of $Z$, $g(Z)$. We're going to
assume that they have a "bounded difference" property: that if $z$ and
$\zprime$ are two realizations of $Z$ which differ in only a single
coordinate, then $|g(z) - g(\zprime)| \leq c/n$ for some $c$ which doesn't care
about which coordinate we perturb.
With this assumption, if the $Z_i$ were IID, the ordinary (McDiarmid)
bounded differences inequality would say
\[
\Prob{g(Z) - \Expect{g(Z)} \geq \epsilon} \leq \exp{\left\{ -\frac{2n\epsilon^2}{c^2} \right\} }
\]
This sort of deviation inequality is
the bread-and-butter of
IID learning theory, but now we need to make it work under dependence.
This needs a probabilistic assumption: the bounded-difference property ensures
that changing one coordinate alone can't change $g$ too much, but the process
must also not let a change to one coordinate force changes to many
other coordinates.
The way London et al. quantify this is to use
the $\eta$-dependence coefficients introduced by
Aryeh "Absolutely
Regular" Kontorovich.
Specifically, pick some ordering of the $Z_i$ variables. Then the
$\eta$-dependence between positions $i$ and $j$ is
\[
\eta_{ij} = \sup_{z_{1:i-1}, z_i, \zprime_i}{\left\| P\left(Z_{j:n}\middle| Z_{1:i-1}= z_{1:i-1}, Z_i = z_i\right) - P\left(Z_{j:n}\middle| Z_{1:i-1}= z_{1:i-1}, Z_i = \zprime_i\right) \right\|_{TV}}
\]
I imagine that if you are Aryeh, this is transparent, but the rest of us need to take it apart to see how it works...
Fix $z_{1:i-1}$ for the moment. Then the expression above says how
much changing $Z_i$ can matter for what happens from $j$ onwards; we might call
it how much influence $Z_i$ has, in the context $z_{1:i-1}$. Taking
the supremum over $z_{1:i-1}$ shows how much influence $Z_i$ could have, if we
set things up just right.
Now, for bookkeeping, set $\theta_{ij} = \eta_{ij}$ if $i < j$, $=1$ if $i=j$, and $0$ if $i > j$. This lets us say that $\sum_{j=1}^{n}{\theta_{ij}}$
is (roughly) how much influence $Z_i$ could exert over the whole future.
Since we have no reason to pick out a particular $Z_i$, we ask how
influential the most influential $Z_i$ could get:
\[
\|\Theta_n\|_{\infty} = \max_{i\in 1:n}{\sum_{j=1}^{n}{\theta_{ij}}}
\]
Because this quantity is important and keeps coming up, while the matrix of
$\theta$'s doesn't, I will depart from the paper's notation and give it an
abbreviated name, $\Eta_n$.
Now we have the tools to assert Theorem 1 of London et al., which is (as
they say) essentially Theorem 1.1
of Kontorovich and Ramanan:
Theorem 1: Suppose that $g$ is a real-valued
function which has the bounded-differences property with constant $c/n$. Then
\[
\Prob{g(Z) - \Expect{g(Z)} \geq \epsilon} \leq \exp{\left\{ -\frac{2n\epsilon^2}{c^2 \Eta_n^2} \right\} }
\]
That is, the effective sample size is $n/\Eta_n^2$, rather than $n$, because of
the dependence between observations. (We have seen similar deflations of the
number of effective observations before, when we looked
at mixing, and even in the world's
simplest ergodic theorem.) I emphasize that we are not assuming any Markov
property/conditional independence for the observations, still less that $Z$
breaks up into independent chunks (as in
an $m$-dependent
sequence). We aren't even assuming a bound or a growth rate for $\Eta_n$.
If $\Eta_n = O(1)$, then for each $i$, $\eta_{ij} \rightarrow 0$ as $j
\rightarrow \infty$, and we have what Kontorovich and Ramanan call an
$\eta$-mixing process. It is not clear whether this is stronger than, say,
$\beta$-mixing. (Two nice questions, though tangential here, are whether
$\beta$-mixing would be enough, and, if not, whether
our estimator of $\beta$-mixing could be adapted to estimate the
$\eta_{ij}$ coefficients.)
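To make the $\eta_{ij}$ and $\Eta_n$ concrete, here is a numerical sketch of my own (not from London et al.): for a two-state Markov chain, the Markov property disposes of the supremum over histories, and a maximal-coupling argument reduces the total-variation distance between the two conditional laws of $Z_{j:n}$ to the TV distance between the rows of the $(j-i)$-step transition matrix.

```python
import numpy as np

# Sketch (mine, not the paper's): eta-dependence coefficients for a
# two-state Markov chain. By the Markov property plus maximal coupling,
# eta_{ij} reduces to the TV distance between the (j-i)-step transition
# distributions started from the two possible values of Z_i.

def theta_matrix(P, n):
    """theta_{ij}: 1 on the diagonal, eta_{ij} above it, 0 below."""
    powers, Pk = [], np.eye(P.shape[0])
    for _ in range(n):
        powers.append(Pk)
        Pk = Pk @ P                 # powers[k] = P^k
    theta = np.zeros((n, n))
    for i in range(n):
        theta[i, i] = 1.0
        for j in range(i + 1, n):
            Q = powers[j - i]       # (j-i)-step transition matrix
            theta[i, j] = 0.5 * np.abs(Q[0] - Q[1]).sum()  # row-wise TV
    return theta

P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # assumed transition matrix
n = 50
theta = theta_matrix(P, n)
H_n = theta.sum(axis=1).max()       # the paper's ||Theta_n||_inf
print(H_n, n / H_n**2)              # H_n and effective sample size n/H_n^2
```

For this chain the row-wise TV shrinks geometrically (like $0.7^{j-i}$ here), so $\Eta_n$ stays $O(1)$ and the effective sample size $n/\Eta_n^2$ grows linearly in $n$, just deflated by a constant.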
To sum up, if we have just one function $f$ with the
bounded-difference property, then we have a deviation inequality: we can bound
how far below its mean it is likely to fall. Ultimately the functions we're going to
be concerned with are the combinations of models with a loss function, so we
want to control deviations for not just one model but for a whole model class...
Vectorized Functions and Collective Stability
In a lot of contexts with structured data, we might want to make a prediction
(assign a label, take an action) for each component of $Z$. If $Z$ is an
image, for instance, and we're doing image segmentation, we might want to say
which segment each pixel is in. If $Z$ is text, we might want to assign each
word to a part of speech. If $Z$ is a social network, we might want to
categorize each node (or edge) in some way. We might also want to output
probability distributions over categories, rather than making a hard choice of
category. So we will now consider functions $f$ which map $Z$ to
$\mathcal{Y}^n$, where $\mathcal{Y}$ is some suitable space of predictions or
actions. In other words, our functions output vectors.
(In fact, at some points in the paper London et al. distinguish
between the dimension of the data ($n$) and the dimension of the output vector
($N$). Their core theorems presume $n=N$, but I think one could maintain the
distinction, just at some cost in notational complexity.)
Ordinarily, when people make stability arguments in learning theory, they
have the stability
of algorithms in mind: perturbing (or omitting) one data point
should lead to only a small change in the algorithm's output. London
et al., in contrast, are interested in the stability of
hypotheses: small tweaks to $z$ should lead to only small changes in
the vector $f(z)$.
Definition. A vector-valued function $f$ is collectively $\beta$-stable iff,
when $z$ and $\zprime$ are off-by-one, then $\| f(z) - f(\zprime) \|_1 \leq
\beta$. The function class $\mathcal{F}$ is uniformly collectively
$\beta$-stable iff every $f \in \mathcal{F}$ is collectively
$\beta$-stable.
Now we need to devectorize our functions. (Remember, ultimately we're
interested in the loss of models, so it would make sense to average their
losses over all the dimensions over which we're making predictions.) For any
$f$, set
\[
\overline{f}(z) \equiv \frac{1}{n}\sum_{i=1}^{n}{f_i(z)}
\]
(In what seems to me a truly unfortunate notational choice, London et
al. wrote what I'm calling $\overline{f}(z)$ as $F(z)$, and wrote
$\Expect{\overline{f}(Z)}$ as $\overline{F}$. I, and much of the reading-group
audience, found this confusing, so I'm trying to streamline.)
Now notice that if $\mathcal{F}$ is uniformly collectively $\beta$-stable, then for any
$f$ in $\mathcal{F}$, its sample average $\overline{f}$ must obey the bounded-difference
property with constant $\beta/n$. So sample averages of
collectively stable functions will obey the deviation bound in Theorem 1.
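As an illustration of the definition (my own toy example, not the paper's), take a vector-valued predictor whose $i$-th output is a local average of $z$ around $i$: perturbing one coordinate moves at most three outputs, each by a third of the perturbation, so for inputs in $[0,1]$ the function is collectively $\beta$-stable with $\beta = 1$, and its coordinate average then has bounded differences with constant $\beta/n$.

```python
import numpy as np

# Sketch (mine): a simple vector-valued predictor -- each output is a
# local average of the input over a radius-1 window -- is collectively
# beta-stable with beta = 1 for inputs in [0,1], and its coordinate
# average has bounded differences with constant beta/n.

rng = np.random.default_rng(0)

def f(z):
    """f_i(z) = average of z over the window {i-1, i, i+1} (clipped)."""
    n = len(z)
    return np.array([z[max(0, i - 1):i + 2].mean() for i in range(n)])

n = 30
z = rng.uniform(size=n)
zprime = z.copy()
i = 7
zprime[i] = rng.uniform()           # perturb a single interior coordinate

# Collective stability: changing z_i by d moves the 3 windows containing
# it by |d|/3 each, so ||f(z) - f(zprime)||_1 = |d| <= beta = 1.
beta = 1.0
l1_change = np.abs(f(z) - f(zprime)).sum()
print(l1_change <= beta * abs(z[i] - zprime[i]) + 1e-12)

# Bounded differences of the average: |fbar(z) - fbar(zprime)| <= beta/n.
fbar_change = abs(f(z).mean() - f(zprime).mean())
print(fbar_change <= beta / n)
```

Both checks print `True`; the second is exactly the hypothesis Theorem 1 needs for $\overline{f}$.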
Stability of the Worst-Case Deviation
Can we extend this somehow into a concentration inequality,
a deviation bound that holds uniformly over $\mathcal{F}$?
Let's look at the worst case deviation:
\[
\Phi(z) = \sup_{f \in \mathcal{F}}{\Expect{\overline{f}(Z)} - \overline{f}(z)}
\]
(Note: Strictly speaking, $\Phi$ is also a function of $\mathcal{F}$ and $n$,
but I am suppressing that in the notation. [The authors included the
dependence on $\mathcal{F}$.])
To see why controlling $\Phi$ gives us concentration, start with the fact
that, by the definition of $\Phi$,
\[
\Expect{\overline{f}(Z)} - \overline{f}(Z) \leq \Phi(Z)
\]
so
\[
\Expect{\overline{f}(Z)} \leq \overline{f}(Z) + \Phi(Z)
\]
not just almost surely but always. If in turn $\Phi(Z) \leq
\Expect{\Phi(Z)} + \epsilon$, at least with high probability, then we've got
\[
\Expect{\overline{f}(Z)} \leq \overline{f}(Z) + \Expect{\Phi(Z)} + \epsilon
\]
with the same probability.
There are many ways one could try to show that $\Phi$ obeys a deviation
inequality, but the one which suggests itself in this context is that of
showing $\Phi$ has bounded differences. Pick any $z, \zprime$ which differ in
just one coordinate. Then
\begin{eqnarray*}
\left|\Phi(z) - \Phi(\zprime)\right| & = & \left| \sup_{f\in\mathcal{F}}{\left\{ \Expect{\overline{f}(Z)} - \overline{f}(z)\right\}} - \sup_{f\in\mathcal{F}}{\left\{ \Expect{\overline{f}(Z)} - \overline{f}(\zprime)\right\}} \right|\\
& \leq & \left| \sup_{f \in \mathcal{F}}{ \Expect{\overline{f}(Z)} - \overline{f}(z) - \Expect{\overline{f}(Z)} + \overline{f}(\zprime)}\right| ~ \text{(supremum over differences is at least difference in suprema)}\\
& = & \left|\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{f_i(\zprime) - f_i(z)}}\right| \\
&\leq& \sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{\left|f_i(\zprime) - f_i(z)\right|}} ~ \text{(Jensen's inequality)}\\
& = & \frac{1}{n}\sup_{f\in \mathcal{F}}{\|f(\zprime) - f(z)\|_1} ~ \text{(definition of} \ \|\cdot\|_1) \\
& \leq & \frac{\beta}{n} ~ \text{(uniform collective stability)}
\end{eqnarray*}
Thus Theorem 1 applies to $\Phi$:
\[
\Prob{\Phi(Z) - \Expect{\Phi(Z)} \geq \epsilon} \leq \exp{\left\{ -\frac{2n\epsilon^2}{\beta^2 \Eta_n^2} \right\}}
\]
Set the right-hand side to $\delta$ and solve for $\epsilon$:
\[
\epsilon = \beta \Eta_n \sqrt{\frac{\log{1/\delta}}{2n}}
\]
Then we have, with probability at least $1-\delta$,
\[
\Phi(Z) \leq \Expect{\Phi(Z)} + \beta \Eta_n \sqrt{\frac{\log{1/\delta}}{2n}}
\]
Hence, with the same probability, uniformly over $f \in \mathcal{F}$,
\[
\Expect{\overline{f}(Z)} \leq \overline{f}(Z) + \Expect{\Phi(Z)} + \beta \Eta_n \sqrt{\frac{\log{1/\delta}}{2n}}
\]
Rademacher Complexity
Our next step is to replace the expected supremum of the empirical process,
$\Expect{\Phi(Z)}$, with something more tractable and familiar-looking.
Really any bound on this could be used, but the authors provide a
particularly nice one, in terms of the Rademacher complexity.
Recall how the Rademacher complexity works when we
have a class $\mathcal{G}$ of scalar-valued functions $g$ of an IID sequence
$X_1, \ldots X_n$: it's
\[
\mathcal{R}_n(\mathcal{G}) \equiv \Expect{\sup_{g\in\mathcal{G}}{\frac{1}{n}\sum_{i=1}^{n}{\sigma_i g(X_i)}}}
\]
where we introduce the Rademacher random variables $\sigma_i$, which
are $\pm 1$ with equal probability, independent of each other and of
the $X_i$. Since the Rademacher variables are the binary equivalent of white
noise, this measures how well our functions can seem to
correlate with noise, and so how well they can seem to match any damn
thing.
What the authors do in Definition 2 is adapt the definition of Rademacher
complexity to their setting in the simplest possible way:
\[
\mathcal{R}_n(\mathcal{F}) \equiv \Expect{\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{\sigma_i f_i(Z)}}}
\]
In the IID version of Rademacher complexity, each summand involves applying the
same function ($g$) to a different random variable ($X_i$). Here, in contrast,
each summand applies a different function ($f_i$) to the same random vector
($Z$). This second form can of course include the first as a special case.
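A quick Monte Carlo sketch of this structured Rademacher complexity (the class of shift functions below is my own toy example, not one of the paper's models): draw $Z$ and the signs $\sigma$, take the supremum over a small finite class, and average.

```python
import numpy as np

# Monte Carlo sketch (mine) of the structured Rademacher complexity
# defined above: one random vector Z per draw, a small finite class of
# vector-valued functions, expectation over both Z and the signs sigma.

rng = np.random.default_rng(1)
n = 100

# Toy finite class: f^{(k)}_i(z) = z_{(i+k) mod n} for offsets k = 0..4
# (an assumed example, chosen only so that each f maps R^n -> R^n).
F = [lambda z, k=k: np.roll(z, k) for k in range(5)]

def rademacher_complexity(F, n, n_mc=2000):
    total = 0.0
    for _ in range(n_mc):
        z = rng.standard_normal(n)              # one realization of Z
        sigma = rng.choice([-1.0, 1.0], size=n) # Rademacher signs
        total += max((sigma * f(z)).mean() for f in F)
    return total / n_mc

R = rademacher_complexity(F, n)
print(R)   # small and positive, shrinking roughly like 1/sqrt(n)
```

For a finite class like this, the expected supremum shrinks roughly like $\sqrt{\log |\mathcal{F}| / n}$, which is the kind of decay the uniform bound below needs from $\mathcal{R}_n$.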
Now we would like to relate the Rademacher complexity somehow to the
expectation of $\Phi$. Let's take a closer look at the definition there:
\[
\Expect{\Phi(Z)} = \Expect{\sup_{f\in\mathcal{F}}{\Expect{\overline{f}(Z)} - \overline{f}(Z)}}
\]
Let's introduce an independent copy of $Z$, say $\Zprime$, i.e., $Z \equdist
\Zprime$, $Z\indep \Zprime$. (These are sometimes called "ghost samples".)
Then of course $\Expect{\overline{f}(Z)} = \Expect{\overline{f}(\Zprime)}$, so
\begin{eqnarray}
\nonumber \Expect{\Phi(Z)} & = & \Expect{\sup_{f\in\mathcal{F}}{\Expect{\overline{f}(\Zprime)} - \overline{f}(Z)}} \\
\nonumber & \leq & \Expect{\sup_{f\in\mathcal{F}}{\overline{f}(\Zprime) - \overline{f}(Z)}} ~ \text{(Jensen's inequality again)}\\
\nonumber & = & \Expect{\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{f_i(\Zprime) - f_i(Z)}}}\\
& = & \Expect{\Expect{ \sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{f_i(\Zprime) - f_i(Z)}} \middle| \sigma}} ~ \text{(law of total expectation)} \label{eqn:phiaftersymmetrizing}
\end{eqnarray}
Look at the summands. No matter what $f_i$ might be, $f_i(\Zprime) - f_i(Z)
\equdist f_i(Z) - f_i(\Zprime)$, because $Z$ and $\Zprime$ have the same
distribution but are independent. Since multiplying something by $\sigma_i$
randomly flips its sign, this suggests we should be able to introduce
$\sigma_i$ terms without changing anything. This is true, but it needs a bit
of trickery, because of the (possible) dependence between the different
summands. Following the authors, but simplifying the notation a bit, define
\[
T_i = \left\{ \begin{array}{cc} Z & \sigma_i = +1\\ \Zprime & \sigma_i = -1
\end{array} \right. ~ , ~ T^{\prime}_i = \left\{ \begin{array}{cc} \Zprime & \sigma_i = +1 \\ Z & \sigma_i = -1 \end{array}\right.
\]
Now notice that if $\sigma_i = +1$, then
\[
f_i(\Zprime) - f_i(Z) = \sigma_i(f_i(\Zprime) - f_i(Z)) = \sigma_i(f_i(T^{\prime}_i) - f_i(T_i))
\]
On the other hand, if $\sigma_i = -1$, then
\[
f_i(\Zprime) - f_i(Z) = \sigma_i(f_i(Z) - f_i(\Zprime)) = \sigma_i(f_i(T^{\prime}_i) - f_i(T_i))
\]
Since $\sigma_i$ is either $+1$ or $-1$, we have
\begin{equation}
f_i(\Zprime) - f_i(Z) = \sigma_i(f_i(T^{\prime}_i) - f_i(T_i)) \label{eqn:symmetricdifferenceintermsofradvars}
\end{equation}
Substituting \eqref{eqn:symmetricdifferenceintermsofradvars} into \eqref{eqn:phiaftersymmetrizing} gives
\begin{eqnarray*}
\Expect{\Phi(Z)} & \leq & \Expect{\Expect{\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{f_i(\Zprime) - f_i(Z)}} \middle| \sigma}} \\
& = & \Expect{\Expect{\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{\sigma_i (f_i(T^{\prime}_i) - f_i(T_i))}} \middle| \sigma}} \\
& = & \Expect{\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{\sigma_i(f_i(\Zprime) - f_i(Z))}}}\\
& \leq & \Expect{\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{\sigma_i f_i(\Zprime)}} + \sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{-\sigma_i f_i(Z)}}}\\
& = & 2\Expect{\sup_{f\in\mathcal{F}}{\frac{1}{n}\sum_{i=1}^{n}{\sigma_i f_i(Z)}}}\\
& = & 2\mathcal{R}_n(\mathcal{F})
\end{eqnarray*}
This is, I think, a very nice way to show that Rademacher complexity still
controls overfitting with dependent data. (This result in fact subsumes our
result in
arxiv:1106.0730, and London et al. have, I think, a more elegant proof.)
Collective Stability and Generalizing from One Big Example
Now we put everything together.
Suppose that $\mathcal{F}$ is uniformly collectively $\beta$-stable.
Then with probability at least $1-\delta$, uniformly over $f \in \mathcal{F}$,
\[
\Expect{\overline{f}(Z)} \leq \overline{f}(Z) + 2\mathcal{R}_n(\mathcal{F}) + \beta \Eta_n \sqrt{\frac{\log{1/\delta}}{2n}}
\]
This is not quite Theorem 2 of London et al., because they go
through some additional steps to relate the collective stability
of predictions to the collective stability of loss functions,
but at this point I think the message is clear.
That message, as promised in the abstract, has three parts. The three
conditions which are jointly sufficient to allow generalization from a single
big, interdependent instance are:
 An isolated change to
one part of the instance doesn't change the predictions very much (collective
stability, $\beta$ exists and is small);
Very distant parts of the instance are nearly independent ($\eta$-mixing,
$\Eta_n = O(1)$); and
 Our hypothesis class isn't so flexible it could seem to fit any damn thing
(shrinking Rademacher complexity, $\mathcal{R}_n \rightarrow 0$).
I suspect this trio of conditions is not jointly necessary as well,
but that's very much a topic for the future. I
also have some thoughts about
whether, with dependent data, we really want to control
$\Expect{\overline{f}(Z)}$, or rather whether the goal shouldn't be something
else, but that'll take another post.
Enigmas of Chance
Posted by crshalizi at June 22, 2014 10:54 | permanent link
May 31, 2014
Books to Read While the Algae Grow in Your Fur, May 2014
Attention conservation notice: I have no taste.
 Robert Hughes, Rome: A Cultural, Visual, and Personal History
As the subtitle suggests, a bit of a grab-bag of Hughes talking about Rome,
or Rome-related, subjects, seemingly as they caught his attention. Thus the
chapter on ancient Rome contains a mix of recitals of the legends,
archaeological findings, the military history of the Punic Wars (including a
description of
the corvus
filled with what I recognize as schoolboy enthusiasm), the rise of the Caesars
— and then he gets to the art, especially the architecture, of the
Augustan age, and takes off, before wandering back into political history
(Diocletian-Constantine-Julian). The reader should, in other words, be
prepared for a ramble.
Hughes is, unsurprisingly, at his best when talking about art. There he is
knowledgeable, clear, sympathetic to a wide range of art but definitely
unafraid of rendering judgment. If he doesn't always persuade (I remain
completely immune to the charms of Baroque painting and sculpture), he
definitely does his best to catalyze an appreciative reaction to the art in his
reader, and one can hardly ask more of a critic.
 He's at his second best in the "personal" parts, conveying his impressions
of Rome as he first found it in the early 1960s, and as he left it in the
2000s, to the detriment of the latter. (He's self-aware enough to reflect
that some of that is the difference between being a young and an old
man.) His ventures into the political and religious history of Rome are not as
good — he has nothing new to say — but not bad.
 Overall: no masterpiece, but always at least pleasant, and often
informative and energizing.
 R. A. Scotti, Basilica: The Splendor and
the Scandal: Building St. Peter's
Mind candy: engaging-enough popular history, by a very obviously Catholic
writer. (My own first reaction to St. Peter's, on seeing it again for the
first time as a grown-up, was that Luther had a point; my second and more
charitable reaction was that there was an awesome space beneath the idolatrous
and servile
rubbish.)
 Pacific Rim
 Mind candy. While I like giant robots battling giant monsters, and I
appreciate playing with elements of the epic (the warrior sulking in his tent;
the catalog of ships), I'd have liked it better if the plot made more sense.
 Sara Poole, Poisoner and The Borgia Betrayal
Mind candy: decent historical thrillers, though implausibly proto-feminist,
philo-Semitic, and proto-Enlightenment for the period.
 Patrizia Castiglione, Massimo Falcioni, Annick Lesne and Angelo Vulpiani,
Chaos and Coarse Graining in Statistical Mechanics
 A good modern tour of key issues in what might be called the "Boltzmannian"
tradition of work on the foundations of statistical mechanics, emphasizing the
importance of understanding what happens in single, very large mechanical
assemblages. Both "single" and "very large" here are important, and important
by way of contrasts.
 The emphasis on the dynamics of single assemblages contrasts with
approaches (descending from Gibbs and
from Jaynes) emphasizing
"ensembles", or probability distributions over assemblages. (Ensembles are
still used here, but as ways of simplifying calculations, not fundamental
objects.) The entropy one wants to show (usually) grows over time is the
Boltzmann entropy of the macrostate, not the Gibbs or Shannon entropy of the
ensemble. (Thus studies of the dynamics of ensembles are, pace, e.g.,
Mackey, irrelevant
to this question, whatever their other merits.) One wants to know that a
typical microscopic trajectory will (usually) move the assemblage from a
low-entropy (low-volume) macrostate to a high-volume macrostate, and moreover
that once in the latter region, most trajectories that originated from the
low-entropy macrostate will act like ones that began in the
equilibrium macrostate. One wants, though I don't recall that they put it this
way, a Markov property at the
macroscopic level.
 Randomizing behavior for macroscopic variables seems to require some amount
of instability at the microscopic level, but not necessarily the very strict
form of instability, of sensitive dependence on initial conditions, which we've
come to call "chaos". Castiglione et al. present a very nice review
of the definitions of chaos, the usual measures of chaos
(Lyapunov
exponents
and Kolmogorov-Sinai
entropy rate), and "finite-size" or non-asymptotic analogs, in the course
of arguing that microscopic chaos is neither necessary nor sufficient for the
applicability of statistical mechanics.
The single-assemblage viewpoint on statistical mechanics has often
emphasized
ergodicity, but
Castiglione et al. downplay it. The ergodic property, as that has
come to be understood in dynamical systems theory, is both too weak and too
strong to really be useful. It's too weak because it doesn't say anything
about how quickly time averages converge on expectations. (If it's too slow,
it's irrelevant to short-lived creatures like us, but if it's too fast, we
should never be able to observe nonequilibrium behavior!) It's too strong in
that it applies to all integrable functions of the microscopic state,
not just physically relevant ones.
The focus on large assemblages contrasts with low-dimensional
dynamical systems. Here the authors closely follow the pioneering work of
Khinchin,
pointing out that if one has a non-interacting assemblage of particles and
considers macroscopic observables which add up over molecules (e.g., total
energy), one can prove that they are very close to their expectation values
with very high probability. (This is
a concentration
of measure result, though the authors do not draw connections to that
literature.) This still
holds even when one relaxes non-interaction to weak, short-range
interaction, and from strictly additive observables to ones where each
microscopic degree of freedom has only a small influence on the total.
(Again, familiar ground for concentration of measure.)
This is a distinctly high-dimensional phenomenon, not found in low-dimensional
systems even if very chaotic.
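Khinchin's point about additive observables is easy to see numerically. A toy demonstration (my own, not the book's) shows the relative fluctuation of an extensive quantity falling off like $1/\sqrt{N}$ as the number of degrees of freedom grows:

```python
import numpy as np

rng = np.random.default_rng(1)

def relative_spread(n_dof, n_samples=1000):
    """Std/mean ratio of an additive observable: the total of n_dof i.i.d.
    exponential 'energies', sampled across many independent assemblages."""
    totals = rng.exponential(size=(n_samples, n_dof)).sum(axis=1)
    return totals.std() / totals.mean()

for n in (10, 100, 10000):
    print(n, relative_spread(n))   # shrinks roughly like 1/sqrt(n)
```

The same qualitative behavior survives weak, short-range interactions, which is the concentration-of-measure point made above.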
Putting these two ingredients together — some form of randomizing local
instability and high-dimensional concentration of measure — it becomes
reasonable to think that something like statistical mechanics, in the
microcanonical ensemble, will work. Moreover, for special systems one can
actually rigorously prove results like Boltzmann's H-theorem in suitable
asymptotic limits. An
interesting large-deviations
argument by De Roeck et
al. suggests that there will generally be an H-theorem when (i) the
macroscopic variables evolve autonomously in the largeassemblage limit, and
(ii) microscopic phasespace volume is conserved. This conclusion is very
congenial to the perspective of this book, but unfortunately the work of De
Roeck et al. is not discussed.
 One feature which pushes this book beyond being just a careful and
judicious defense of the Boltzmannian viewpoint on statistical mechanics is its
treatment of how, generally, one might obtain macroscopic dynamics from
microscopic physics. This begins with an interesting discussion of multiscale
methods for differential equations, as an alternative to the usual series
expansions of perturbation theory. This is then used to give a new-to-me
perspective on renormalization, and why differences in microscopic dynamics
wash out when it comes to aggregated, macroscopic variables. I found this
material intriguing, but not as fully persuasive as the earlier parts.
 Conor Fitzgerald, The Dogs of Rome
 Mind candy. Police procedural with local color for Rome. (Having an
American protagonist seems like a cheat.)
 Colin Crouch, Making Capitalism Fit for Society
 A plea for an "assertive" rather than a "defensive" social democracy, on
the grounds that social democracy has nothing to be defensive about,
and that the taming of capitalism is urgently necessary. That neoliberalism
has proved to be a combination of a sham and a disaster, I agree; that lots of
his policy proposals, about combining a stronger safety net with more
microeconomic flexibility, are both desirable and possible, I agree. But what
he seems to skip over, completely, is where the power for this
assertion will come from.
 Disclaimer: I've never met Prof. Crouch, but he's a friend of a friend.
 Elliott Kay, Poor Man's Fight
 Mind candy science fiction. Does a good job of presenting the villains
sympathetically, from inside their own minds. Also, props for having the
hero's stepmother make a prediction of how joining the navy will distort the
hero's character, for it coming true, and for the hero realizing it, to his
great discomfort.
 Sequel.
Books to Read While the Algae Grow in Your Fur;
Pleasures of Detection, Portraits of Crime;
Scientifiction and Fantastica;
Writing for Antiquity;
Tales of Our Ancestors;
Physics;
The Progressive Forces
Posted by crshalizi at May 31, 2014 23:59 | permanent link
April 30, 2014
Books to Read While the Algae Grow in Your Fur, April 2014
Attention conservation notice: I have no taste.
 Chris Willrich, Scroll of Years
 Mind candy fantasy. The blurbs are over the top, but it is fun and
decidedly better-written than average. Mildly orientalist, though in a
respectful mode.
 Matthew Bogart, The Chairs' Hiatus
 Kelly Sue DeConnick and Emma Rios, Pretty Deadly
 Joe Harris, Great Pacific: 1, Trashed; 2, Nation Building
 Brian K. Vaughan and Fiona Staples, Saga, vols. 2 and 3
 L. Frank Weber, Bikini Cowboy
 Terry LaBan, Muktuk Wolfsbreath, Hard Boiled Shaman: The Spirit of Boo
 Comic book mind candy. Muktuk Wolfsbreath is especially
notable for ethnographic accuracy, Pretty Deadly for the gorgeous
art and genuinely-mythic weirdness, and Saga for general
awesomeness. (Previously
for Saga.)
 Jeff VanderMeer, Annihilation
 Mind candy, but incredible mind candy. The basic story is a familiar one for
SF: an expedition into an unknown and hostile environment quickly goes
spectacularly awry, as the explorers don't appreciate just how strange that
environment really is. But from there it builds to a gripping story that
combines science fiction about bizarre biology with genuinely creepy horror.
It's Lovecraftian in the best sense, not because it uses the props of
Cthulhiana, but because it gives the feeling of having encountered something
truly, frighteningly alien. (In
contrast.)
 There are two sequels coming out later this year; I've ordered both.
 Adam Christopher, The Burning Dark
Mind candy: a haunted house story, with a space-opera setting.
(Self-presentation.)
 S. Frederick Starr, Lost Enlightenment: Central Asia's Golden Age from the Arab Conquest to Tamerlane
 Starr has been a historian of Central Asia throughout his long professional
career, and like many such, he feels that the region doesn't get enough respect
in world history. This is very much an effort in rectifying that, along the
way depicting medieval Central Asia as a center of the Hellenistic rationalism
which he sees as being the seed of modern science and enlightenment. (It's
pretty unashamedly whiggish history.)
 Starr's Central Asia is urban and mercantile. It should be understood as
the historic network of towns in, very roughly, the basins of
the Amu Darya
and Syr Darya rivers,
or Transoxiana
plus Khorasan and
Khwarezm. Starr argues that this region formed a fairly coherent cultural
area from a very early period, characterized by intensive irrigation, the
cultural and political dominance of urban elites, the importance of
long-distance overland trade (famously but not exclusively, the Silk Road),
and so cross-exposure to ideas and religions developed in the better-known
civilizations of the Mediterranean, the Fertile Crescent, Iran, India and
China. One consequence of this, he suggests, was an interest in systematizing
these traditions, e.g., compiling versions of the Buddhist canon.
With the coming of Islam, which he depicts as a very drawn-out process,
some of these same traditions led to directions like
compiling hadith. Beyond this, the coming of Islam exposed local
intellectuals to Muslim religious concepts, to the works of Greek science
and philosophy, and to Indian mathematics and science. (He gives a lot more
emphasis to the Arab and Greek contributions than the Indian.) In his telling,
it was the tension between these which led to the great contributions of the
great figures of medieval Islamic intellectual history. Starr is at pains to
claim as many of these figures for Central Asia as possible, whether by where
they lived and worked, where their families were from, where they trained, or
sometimes where their teachers were from. [0] He even, with some justice,
depicts the rise of the 'Abbasid dynasty as a conquest of Islam by Khurasan.
 Much of the book is accordingly devoted to the history of mathematics,
natural science, philosophy, theology, and belles lettres in Central
Asia, with glances at the fine arts (especially painting and architecture) and
politics (especially royal patronage). This largely takes the form of capsule
biographies of the most important scholars, and sketches of the cities in which
they worked. These seem generally reliable, though there are some
grounds for worry. One is that I can't tell whether Starr is just awkward at
explaining what mathematicians did, or whether he doesn't understand it and is
garbling his sources. The other is that there are places where he definitely
overreaches in claiming influence [1]. Even deducting for these exaggerations
and defects, Starr makes a sound case that there was a long period of time —
as he says, from the Arab conquests to the coming of the Timurids — when
Central Asia was the home to much of the best intellectual activity of the old
world. That this amounted to an "age of Enlightenment" comparable to 17th and
18th century Europe seems another overfond exaggeration.
 What Starr would have liked to produce is something as definitive,
and as revelatory, as Joseph Needham's Science and Civilisation in
China. (He's pretty up front about this.) He knows that he hasn't
gotten there. He can't be blamed for this: even for so extraordinary a figure
as Needham, it was the work of a lifetime, backed by a massive team. Still,
one can hope that his book will help make such an effort more likely. In the
meanwhile, it's a decently-written and mostly-accurate popular history about a
time and place which were once quite important, and have since faded into
undeserved obscurity.
What the book is doing with blurbs from various reactionary foreign-affairs
pundits, up to and including Henry Kissinger, I couldn't say, though I have
suspicions.
 0: He also feels it necessary to make the
elementary point that writing in Arabic didn't make these men "Arabs",
any more than writing in Latin made contemporary European scholars "Romans". I
will trust his judgment that there are still people who need to hear
this.
 1: E.g., on p. 421, it's baldly asserted that Hume
found Ghazali's arguments against causality "congenial". Now,
the similarity between the two men's arguments has often been pointed
out, and the relevant book of
Ghazali's, The
Incoherence of the Philosophers, was known to the Scholastics in
Latin translation. It's conceivable that Hume encountered a copy he
could have read. Nonetheless, Ghazali's name does not appear, in any
romanization, in
Hume's Treatise of Human
Nature,
Enquiry Concerning Human Understanding, Enquiry Concerning the Principles of Morals, Dialogues Concerning Natural Religion, or
Essays,
Moral, Political, and Literary. (I have not searched Hume's
complete works.) No other writer on either philosopher, that I am aware of,
suggests either a direct influence or even the transmission of a tradition, as
opposed to a reinvention, and Starr provides no supporting citation or
original evidence.
 Arkady and Boris Strugatsky (trans. Olena Bormashenko), Roadside Picnic
 Mind candy, at the edge of being something greater. Humanity is visited by
ridiculously advanced aliens, who leave behind artifacts which we understand no
more than ants could comprehend the relics of the titular picnic. Owing to
human greed, stupidity, and (it must be said) capitalism, this goes even worse
for us than it would for the ants.
 M. John Harrison, Nova Swing
 Mind candy: noir science fiction, owing a massive debt
to Roadside Picnic.
 Elizabeth Bear, Steles of the Sky
 Conclusion to the trilogy begun
in Range of
Ghosts
and Shattered
Pillars. It is, to my mind, magnificent; all the promise of the
earlier books is fulfilled.
 ObLinkage: Astute comments by Henry Farrell.
 Felix Gilman, The Revolutions
 Historical fantasy set in Edwardian London, and the outer spheres of the
solar system, featuring underemployed young people with literary ambitions,
dueling occult societies, interplanetary romances, and distributed Chinese
rooms.
 Gene Wolfe, The Claw of the Conciliator
 My comments on The Shadow of
the Torturer apply with even greater force.
 Darwyn Cooke, Parker
(1,
2,
3,
4)
 Mind candy: comic book versions of the classic crime novels
by Richard Stark. The pictures
are a nice complement to the highenergy stories about characters with no
morally redeeming qualities whatsoever.
Books to Read While the Algae Grow in Your Fur;
Scientifiction and Fantastica;
Pleasures of Detection, Portraits of Crime;
Afghanistan and Central Asia;
Islam;
Philosophy;
Writing for Antiquity
Posted by crshalizi at April 30, 2014 23:59 | permanent link
April 01, 2014
"Proper scoring rules and linear estimating equations in exponential families" (Next Week at the Statistics Seminar)
Attention conservation notice:
Only of interest if you (1) care about estimating complicated statistical
models, and (2) will be in Pittsburgh on Monday.
 Steffen Lauritzen, "Proper scoring rules and linear estimating equations in exponential families"
 Abstract: In models of high complexity, the computational burden involved in calculating the maximum likelihood estimator can be forbidding. Proper scoring rules (Brier 1950, Good 1952, Bregman 1967, de Finetti 1975) such as the logarithmic score, the Brier score, and others, induce natural unbiased estimating equations that generally lead to consistent estimation of unknown parameters. The logarithmic score corresponds to maximum likelihood estimation whereas a score function introduced by Hyvärinen (2005) leads to linear estimation equations for exponential families.
 We shall briefly review the facts about proper scoring rules and their associated divergences, entropy measures, and estimating equations. We show how Hyvärinen's rule leads to particularly simple estimating equations for Gaussian graphical models, including Gaussian graphical models with symmetry.
 The lecture is based on joint work with Philip Dawid, Matthew Parry, and Peter Forbes. For a recent reference see: P. G. M. Forbes and S. Lauritzen (2013), "Linear Estimating Equations for Exponential Families with Application to Gaussian Linear Concentration Models", arXiv:1311.0662.
Time and place: 4-5 pm on Monday, 7 April 2014, in 125 Scaife Hall.
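To make the "linear estimating equations" in the abstract concrete: for a zero-mean Gaussian with precision matrix $K$, Hyvärinen's score objective is quadratic in $K$, so its first-order condition is the linear (Lyapunov) equation $KS + SK = 2I$, with $S$ the sample covariance. Here is a minimal numerical sketch of my own (an illustration, not code from the talk or the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# For p(x) propto exp(-x'Kx/2): grad log p = -Kx and laplacian log p = -tr K,
# so the empirical Hyvarinen objective is tr(K^2 S)/2 - tr K, quadratic in K.
# Setting its gradient to zero gives the linear equation  K S + S K = 2 I.

def score_matching_precision(X):
    """Hyvarinen-score estimate of a zero-mean Gaussian's precision matrix,
    solving K S + S K = 2 I by diagonalizing the sample covariance S."""
    S = X.T @ X / X.shape[0]
    lam, V = np.linalg.eigh(S)
    rhs = V.T @ (2.0 * np.eye(S.shape[0])) @ V        # rotate the right-hand side
    K = V @ (rhs / (lam[:, None] + lam[None, :])) @ V.T
    return K

A = rng.normal(size=(4, 4))
X = rng.normal(size=(2000, 4)) @ A                    # correlated Gaussian sample
K_hat = score_matching_precision(X)
```

In this unrestricted case the solution coincides with the inverse sample covariance; the interesting applications in the talk are graphical models, where the same equation stays linear under sparsity and symmetry restrictions on $K$.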
Much of what I know
about graphical
models I learned from Prof. Lauritzen's book. His work
on sufficient statistics and extremal
models, and their connections
to symmetry
and prediction, has shaped how I
think about big chunks of statistics,
including stochastic
processes and networks. I am really looking forward to this.
(To add some commentary purely of my own: I sometimes encounter the idea
that frequentist statistics is somehow completely committed to maximum
likelihood, and has nothing to offer when that fails, as
it sometimes
does [1]. While I can't of course speak for
every frequentist statistician, this seems silly. Frequentism is a family of
ideas about when probability makes sense, and it leads to some ideas about how
to evaluate statistical models and methods, namely, by
their error properties. What
justifies maximum likelihood estimation, from this perspective, is not the
intrinsic inalienable rightness of taking that function and making it big.
Rather, it's that in many situations maximum likelihood converges to the right
answer (consistency), and in a somewhat narrower range will converge as fast as
anything else (efficiency). When those fail, so much the worse for maximum
likelihood; use something else that is consistent. In situations where
maximizing the likelihood has nice mathematical properties but is
computationally intractable, so much the worse for maximum likelihood; use
something else that's consistent and
tractable. Estimation by minimizing a wellbehaved
objective function has many nice features, so when we give up on likelihood
it's reasonable to try minimizing some other
proper
scoring function, but again, there's nothing which says we must.)
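A toy instance of that last point (my own illustration, with a made-up function name): estimating a Bernoulli parameter by minimizing the empirical Brier score instead of maximizing the likelihood. The minimizer is consistent, and that error property is all the frequentist evaluation asks for; here, as it happens, the two estimators coincide at the sample mean.

```python
import numpy as np

rng = np.random.default_rng(3)

def brier_estimate(x, grid=np.linspace(0.0, 1.0, 1001)):
    """Estimate a Bernoulli parameter by minimizing the empirical Brier
    score (1/n) sum_i (p - x_i)^2 over a grid of candidate values p."""
    scores = ((grid[None, :] - x[:, None]) ** 2).mean(axis=0)
    return grid[np.argmin(scores)]

x = rng.binomial(1, 0.3, size=5000)
print(brier_estimate(x))   # converges on the true 0.3 as n grows
```

The Brier score is proper, so the population minimizer is the true parameter; the empirical minimizer inherits consistency for exactly the error-property reasons sketched above.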
[1]: It's not worth my time today
to link to particular examples; I'll just say that from my own reading and
conversation, this opinion is not totally confined to the kind of website which
proves that rule 34
applies even to Bayes's
theorem. ^
Posted by crshalizi at April 01, 2014 10:45 | permanent link
March 31, 2014
Books to Read While the Algae Grow in Your Fur, March 2014
Attention conservation notice: I have no taste.
 Brian Michael Bendis, Michael Avon Oeming, et al. Powers
(1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14)
 Comic book mind candy, at the superhero/policeprocedural interface.
 Martha Wells, The Fall of IleRien (The Wizard Hunters, The Ships of Air, The Gate of Gods)
 Mind candy. I read these when they were first published, a decade ago, and
revisited them when they came out as audiobooks. They remain unusually smart
fantasy with refreshingly human protagonists. (Though I think the narrator for
the recordings paused too often, in odd places, to be really effective.)
 Graydon Saunders, The March North
 Mind candy. Readers know I am a huge fan of science fiction and fantasy
(look around you), but there's no denying both genres have a big soft spot for
authoritarianism and feudalism. I picked this up because it promised to be
an egalitarian fantasy novel, and because (back in the Late Bronze Age
of Internet time) I used to like Saunders's posts
on rec.arts.sf.written. I am glad I did: it's not the best-written
novel ever, but it's more than competent, it scratches the genre itch, and
Saunders has thought through how people-power could work in
a world
where part of the normal life cycle of a wizard would ordinarily be ruling as a
Dark Lord for centuries. (I suspect his solution owes something to
the pike square.) The
setup calls out for sequels, which I would eat up with a spoon.
 Marie Brennan, The Tropic of Serpents: A Memoir by Lady Trent
 Mind candy: further adventures in
the natural history of dragons.
 Franklin
M. Fisher, Disequilibrium Foundations of Equilibrium Economics
 This is a detailed, clear and innovative treatment of what was known in
1983 about the stability of economic equilibrium. (It begins with an argument,
which I find entirely convincing, that this is an important
question, especially for economists who only want to reason about
equilibria.) The last half of the book is a very detailed treatment of a
disequilibrium model of interacting rational agents, and the conditions under
which it will approach a Walrasian (pricesupported) equilibrium. (These
conditions involve nonzero transaction costs, each agent setting its own
prices, and something Fisher calls "No Favorable Surprise", the idea
that unexpected changes never make things better.)
Remarkably, Fisher's model recovers such obvious features of the real world as
(i) money existing and being useful, and (ii) agents continuing to trade for as
long as they live, rather than going through a spurt of exchange at the dawn of
time and then never trading again. It's a tour de force, especially
because of the clarity of the writing. I wish I'd read it long ago.
 Fisher has a 2010 paper, reflecting on the state of the art a few years
ago: "The Stability of General
Equilibrium  What Do We Know and Why Is It Important?".
 — One disappointment with this approach: Fisher doesn't consider the
possibility that aggregated variables might be in equilibrium, even
though
individuals are not in "personal equilibrium". E.g., prevailing
prices and quantities are stable around some equilibrium values (up to small
fluctuations), even though each individual is perpetually
alternating between seizing on arbitrage opportunities and being frustrated in
their plans. This is more or less the gap that is being filled
by Herb Gintis and collaborators
in their recent work
(1,
2,
3,
4).
Gintis et al. also emphasize the importance of agents setting their
own prices, rather than having a centralized auctioneer decree a single binding
price vector.
 The Borgias (1, 2, 3)
 Mind candy. Fun in its own way, but I'm disappointed that they made the
politics less treacherous and backstabbing than it actually was.
(Also, there's a chunk of anachronistic yearning for equal rights and repulsion
over corruption.)
 ObLinkage: Comparative appreciation
by a historian of the Renaissance. (I haven't seen the rival series.)
 House of Cards
 Mind candy. Rather to my surprise, I enjoyed this at least as much as the
original. (I will not comment on the economics of the second season.)
 Homeland
(1, 2,
3)
Mind candy. Well-acted, but I find the politics very dubious. In
particular, "let the CIA do whatever it wants in Iran" is a big part of how we got into this mess.
 Dexter
(5, 6, 7
and 8)
 Mind candy. Not as good as the first four seasons, but then, very little
is. I refuse to believe the ending. Previously: 2, 3, 4.
 Allie Brosh, Hyperbole and a Half: Unfortunate Situations, Flawed Coping Mechanisms, Mayhem, and Other Things That Happened
 As a sentient being with a working Internet connection, you are aware that
Brosh is one of this age's greatest writers
on moral
psychology
(and dogs).
This is a collection of some of her best pieces.
 John Dollard, Caste and Class in a Southern Town
 The easy way to read this ethnography of "Southerntown"
(= Indianola,
Miss.) in 1936 would be as a horror story, with a complacent "thankfully
we're not like that" air. (At least, it would be easy for me to read it so.)
Better, I think, to realize both how horrifying this was, and to reflect on
what injustices are similarly woven into my own way of life...
 Seanan
McGuire, HalfOff
Ragnarok
Mind candy: the travails of a mild-mannered Ohio cryptozoologist. (I think
it could be enjoyed without the earlier books in the series.)
Previously: 1, 2.
Books to Read While the Algae Grow in Your Fur;
Scientifiction and Fantastica;
The Beloved Republic;
Commit a Social Science;
The Dismal Science;
Linkage
Posted by crshalizi at March 31, 2014 23:59 | permanent link
