May 30, 2012

In Soviet Union, Optimization Problem Solves You

Attention conservation notice: Over 7800 words about optimal planning for a socialist economy and its intersection with computational complexity theory. This is about as relevant to the world around us as debating whether a devotee of the Olympian gods should approve of transgenic organisms. (Or: centaurs, yes or no?) Contains mathematical symbols but no actual math, and uses Red Plenty mostly as a launching point for a tangent.

Cross-posted at Crooked Timber, as part of the seminar on Red Plenty. That version has uglier math, but allows comments.

For ZMS.

There's lots to say about Red Plenty as a work of literature; I won't do so. It's basically a work of speculative fiction, where one of the primary pleasures is having a strange world unfold in the reader's mind. More than that, it's a work of science fiction, where the strangeness of the world comes from its being reshaped by technology and scientific ideas --- here, mathematical and economic ideas.

Red Plenty is also (what is a rather different thing) a work of scientist fiction, about the creative travails of scientists. The early chapter, where linear programming breaks in upon the Kantorovich character, is one of the most true-to-life depictions I've encountered of the experiences of mathematical inspiration and mathematical work. (Nothing I will ever do will be remotely as important or beautiful as what the real Kantorovich did, of course.) An essential part of that chapter, though, is the way the thoughts of the Kantorovich character split between his profound idea, his idealistic political musings, and his scheming about how to cadge some shoes, all blind to the incongruities and ironies.

It should be clear by this point that I loved Red Plenty as a book, but I am so much in its target demographic1 that it's not even funny. My enthusing about it further would not therefore help others, so I will, to make better use of our limited time, talk instead about the central idea, the dream of the optimal planned economy.

That dream did not come true; indeed, it never even came close to being implemented, because strong forces blocked it, forces which Red Plenty describes vividly. But could it even have been tried? Should it have been?

"The Basic Problem of Industrial Planning"

Let's think about what would have to have gone in to planning in the manner of Kantorovich.

I. We need a quantity to maximize. This objective function has to be a function of the quantities of all the different goods (and services) produced by our economic system.
Here "objective" is used in the sense of "goal", not in the sense of "factual". In Kantorovich's world, the objective function is linear, just a weighted sum of the output levels. Those weights tell us about trade-offs: we will accept getting one less bed-sheet (queen-size, cotton, light blue, thin, fine-weave) if it lets us make so many more diapers (cloth, unbleached, re-usable), or this many more lab coats (men's, size XL, non-flame-retardant), or for that matter such-and-such an extra quantity of toothpaste. In other words, we need to begin our planning exercise with relative weights. If you don't want to call these "values" or "prices", I won't insist, but the planning exercise has to begin with them, because they're what the function being optimized is built from.
It's worth remarking that in Best Use of Economic Resources, Kantorovich side-stepped this problem by a device which has "all the advantages of theft over honest toil". Namely, he posed only the problem of maximizing the production of a "given assortment" of goods --- the planners have fixed on a ratio of sheets to diapers (and everything else) to be produced, and want the most that can be coaxed out of the inputs while keeping those ratios. This doesn't really remove the difficulty: either the planners have to decide on relative values, or they have to decide on the ratios in the "given assortment".
Equivalently, the planners could fix the desired output, and try to minimize the resources required. Then, again, they must fix relative weights for resources (cotton fiber, blue dye #1, blue dye #2, bleach, water [potable], water [distilled], time on machine #1, time on machine #2, labor time [unskilled], labor time [skilled, sewing], electric power...). In some contexts these might be physically comparable units. (The first linear programming problem I was ever posed was to work out a diet which will give astronauts all the nutrients they need from a minimum mass of food.) In a market system these would be relative prices of factors of production. Maintaining a "given assortment" (fixed proportions) of resources used seems even less reasonable than maintaining a "given assortment" of outputs, but I suppose we could do it.
For now (I'll come back to this), assume the objective function is given somehow, and is not to be argued with.
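The weighted-sum structure of a linear objective can be sketched in a few lines of Python. The weights and quantities below are entirely invented for illustration; the point is only that the weights are the trade-off ratios the planners must begin with:

```python
# A minimal sketch of step (I): the planners' objective function as a
# weighted sum of output levels. All weights and quantities here are
# invented for illustration, not drawn from any actual planning data.

# Relative weights ("values", if you like) per unit of each good.
weights = {"bed-sheet": 6.0, "diaper": 1.5, "lab coat": 9.0}

def objective(plan, weights):
    """Linear objective: sum of weight * quantity over all goods."""
    return sum(weights[good] * qty for good, qty in plan.items())

plan_a = {"bed-sheet": 1000, "diaper": 20000, "lab coat": 500}
# Give up one bed-sheet for four diapers: 6.0 = 4 * 1.5, so the
# objective is unchanged -- the weights ARE the trade-off ratios.
plan_b = {"bed-sheet": 999, "diaper": 20004, "lab coat": 500}

print(objective(plan_a, weights) == objective(plan_b, weights))  # True
```

Every choice of weights encodes a different set of trade-offs, which is exactly why the planning exercise cannot begin without them.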
IIA. We need complete and accurate knowledge of all the physical constraints on the economy, the resources available to it.
IIB. We need complete and accurate knowledge of the productive capacities of the economy, the ways in which it can convert inputs to outputs.
(IIA) and (IIB) require us to disaggregate all the goods (and services) of the economy to the point where everything inside each category is substitutable. Moreover, if different parts of our physical or organizational "plant" have different technical capacities, that needs to be taken into account, or the results can be decidedly sub-optimal. (Kantorovich actually emphasizes this need to disaggregate in Best Use, by way of scoring points against Leontief. The numbers in the latter's input-output matrices, Kantorovich says, are aggregated over huge swathes of the economy, and so far too crude to be actually useful for planning.) This is, to belabor the obvious, a huge amount of information to gather.
(It's worth remarking at this point that "inputs" and "constraints" can be understood very broadly. For instance, there is nothing in the formalism which keeps it from including constraints on how much the production process is allowed to pollute the environment. The shadow prices enforcing those constraints would indicate how much production could be increased if marginally more pollution were allowed. This wasn't, so far as I know, a concern of the Soviet economists, but it's the logic behind cap-and-trade institutions for controlling pollution.)
Subsequent work in optimization theory lets us get away, a bit, from requiring complete and perfectly accurate knowledge in stage (II). If our knowledge is distorted by merely unbiased statistical error, we could settle for stochastic optimization, which runs some risk of being badly wrong (if the noise is large), but at least does well on average. We still need this unbiased knowledge about everything, however, and aggregation is still a recipe for distortions.
More serious is the problem that people will straight-up lie to the planners about resources and technical capacities, for reasons which Spufford dramatizes nicely. There is no good mathematical way of dealing with this.
III. For Kantorovich, the objective function from (I) and the constraints and production technology from (II) must be linear.
Nonlinear optimization is possible, and I will come back to it, but it rarely makes things easier.
IV. Computing time must be not just too cheap to meter, but genuinely immense.
It is this point which I want to elaborate on, because it is a mathematical rather than a practical difficulty.

"Numerical Methods for the Solution of Problems of Optimal Planning"

It was no accident that mathematical optimization went hand-in-hand with automated computing. There's little point to reasoning abstractly about optima if you can't actually find them, and finding an optimum is a computational task. We pose a problem (find the plan which maximizes this objective function subject to these constraints), and want not just a solution, but a method which will continue to deliver solutions even as the problem posed is varied. We need an algorithm.

Computer science, which is not really so much a science as a branch of mathematical engineering, studies questions like this. A huge and profoundly important division of computer science, the theory of computational complexity, concerns itself with understanding what resources algorithms require to work. Those resources may take many forms: memory to store intermediate results, samples for statistical problems, communication between cooperative problem-solvers. The most basic resource is time, measured not in seconds but in operations of the computer. This is something Spufford dances around, in II.2: "Here's the power of the machine: that having broken arithmetic down into tiny idiot steps, it can then execute those steps at inhuman speed, forever." But how many steps? If it needs enough steps, then even inhuman speed is useless for human purposes...

The way computational complexity theory works is that it establishes some reasonable measure of the size of an instance of a problem, and then asks how much time is absolutely required to produce a solution. There can be several aspects of "size"; there are three natural ones for linear programming problems. One is the number of variables being optimized over, say \( n \). The second is the number of constraints on the optimization, say \( m \). The third is the amount of approximation we are willing to tolerate in a solution --- we demand that it come within \( \epsilon \) of the optimum, and that if any constraints are violated it is also by no more than \( \epsilon \). Presumably optimizing many variables ( \( n \gg 1 \) ), subject to many constraints ( \( m \gg 1 \) ), to a high degree of approximation ( \( \epsilon \ll 1 \) ), is going to take more time than optimizing a few variables ( \( n \approx 1 \) ), with a handful of constraints ( \( m \approx 1 \) ), and accepting a lot of slop ( \( \epsilon \approx 1 \) ). How much, exactly?

The fastest known algorithms for solving linear programming problems are what are called "interior point" methods. These are extremely ingenious pieces of engineering, useful not just for linear programming but for a wider class of problems called "convex programming". Since the 1980s they have revolutionized numerical optimization, and are, not so coincidentally, among the intellectual children of Kantorovich (and Dantzig). The best guarantee we have on the number of "idiot steps" (arithmetic operations) such algorithms need to solve a linear programming problem is that it's proportional to \[ (m+n)^{3/2} n^2 \log{1/\epsilon} \] (I am simplifying just a bit; see sec. 4.6.1 of Ben-Tal and Nemirovski's Lectures on Modern Convex Optimization [PDF].)
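To get a feel for the bound, here is a small sketch evaluating it. The problem sizes and tolerances are made up, and the bound is an order of growth, so constant factors are ignored; what matters is how the count moves as each knob is turned:

```python
import math

# Evaluating the interior-point operation-count bound quoted above,
# (m+n)^(3/2) * n^2 * log(1/eps), for some made-up problem sizes.
# This is an order of growth, so constant factors are ignored.

def ip_bound(n, m, eps):
    """Arithmetic operations, up to a constant factor."""
    return (m + n) ** 1.5 * n ** 2 * math.log(1.0 / eps)

# Tightening the tolerance a thousand-fold costs very little...
loose = ip_bound(n=10**6, m=10**6, eps=1e-3)
tight = ip_bound(n=10**6, m=10**6, eps=1e-6)
print(tight / loose)   # ~2: log(1/eps) merely doubles

# ...but making the problem a thousand times bigger costs a lot.
small = ip_bound(n=10**6, m=10**6, eps=1e-3)
big   = ip_bound(n=10**9, m=10**9, eps=1e-3)
print(big / small)     # ~1000**3.5, i.e. about 30 billion
```

The logarithmic dependence on the tolerance and the polynomial dependence on the problem size are the two facts the rest of the argument leans on.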

Truly intractable optimization problems --- of which there are many --- are ones where the number of steps needed grows exponentially2. If linear programming were in this "complexity class", it would be truly dire news, but it's not. The complexity of the calculation grows only polynomially with \( n \), so it falls in the class theorists are accustomed to regarding as "tractable". But the complexity still grows super-linearly, like \( n^{3.5} \). Expanding the problem size by a factor of a thousand takes us not a thousand times as long, but about 30 billion times as long. Where does this leave us?

A good modern commercial linear programming package can handle a problem with 12 or 13 million variables in a few minutes on a desktop machine. Let's be generous and push this down to 1 second. (Or let's hope that Moore's Law rule-of-thumb has six or eight iterations left, and wait a decade.) To handle a problem with 12 or 13 billion variables then would take about 30 billion seconds, or roughly a thousand years.
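The arithmetic behind that estimate can be checked directly, assuming (as above) that a 12-million-variable problem takes one second and that time scales as \( n^{3.5} \):

```python
# Back-of-the-envelope check of the "thousand years" claim: if 12 million
# variables take 1 second, and time scales as n^3.5, what do 12 billion take?
# The one-second baseline is the generous assumption made in the text.

scaling_exponent = 3.5
factor = 1000 ** scaling_exponent          # problem grows 1000-fold
seconds = 1.0 * factor                      # ~3.16e10 seconds
years = seconds / (365.25 * 24 * 3600)
print(round(years))                         # 1002 -- about a thousand years
```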

Naturally, I have a reason for mentioning 12 million variables:

In the USSR at this time [1983] there are 12 million identifiably different products (disaggregated down to specific types of ball-bearings, designs of cloth, size of brown shoes, and so on). There are close to 50,000 industrial establishments, plus, of course, thousands of construction enterprises, transport undertakings, collective and state farms, wholesaling organs and retail outlets.
-- Alec Nove, The Economics of Feasible Socialism (p. 36 of the revised [1991] edition; Nove's italics)
This 12 million figure will conceal variations in quality; and it is not clear to me, even after tracking down Nove's sources, whether it included the provision of services, which are a necessary part of any economy.

Let's say it's just twelve million. Even if the USSR could never have invented a modern computer running a good LP solver, if someone had given it one, couldn't Gosplan have done its work in a matter of minutes? Maybe an hour, to look at some alternative plans?

No. The difficulty is that there aren't merely 12 million variables to optimize over, but rather many more. We need to distinguish a "coat, winter, men's, part-silk lining, wool worsted tricot, cloth group 29--32" in Smolensk from one in Moscow. If we don't "index" physical goods by location this way, our plan won't account for the need for transport properly, and things simply won't be where they're needed; Kantorovich said as much under the heading of "the problem of a production complex". (Goods which can spoil, or are needed at particular occasions and neither earlier nor later, should also be indexed by time; this is Kantorovich's "dynamic problem".) A thousand locations would be very conservative, but even that factor would get us into the regime where it would take us a thousand years to work through a single plan. With 12 million kinds of goods and only a thousand locations, to have the plan ready in less than a year would need computers a thousand times faster.

This is not altogether unanticipated by Red Plenty:

A beautiful paper at the end of last year had skewered Academician Glushkov's hypercentralized rival scheme for an all-seeing, all-knowing computer which would rule the physical economy directly, with no need for money. The author had simply calculated how long it would take the best machine presently available to execute the needful program, if the Soviet economy were taken to be a system of equations with fifty million variables and five million constraints. Round about a hundred million years, was the answer. Beautiful. So the only game in town, now, was their own civilised, decentralized idea for optimal pricing, in which shadow prices calculated from opportunity costs would harmonise the plan without anyone needing to possess impossibly complete information. [V.2]

This alternative vision, the one which Spufford depicts those around Kantorovich as pushing, was to find the shadow prices needed to optimize, fix the monetary prices to track the shadow prices, and then let individuals or firms buy and sell as they wish, so long as they are within their budgets and adhere to those prices. The planners needn't govern men, nor even administer things, but only set prices. Does this, however, actually set the planners a more tractable, a less computationally-complex, problem?

So far as our current knowledge goes, no. Computing optimal prices turns out to have the same complexity as computing the optimal plan itself3. It is (so far as I know) conceivable that there is some short-cut to computing prices alone, but we have no tractable way of doing that yet. Anyone who wants to advocate this needs to show that it is possible, not just hope piously.

How then might we escape?

It will not do to say that it's enough for the planners to approximate the optimal plan, with some dark asides about the imperfections of actually-existing capitalism thrown into the mix. The computational complexity formula I quoted above already allows for only needing to come close to the optimum. Worse, the complexity depends only very slowly, logarithmically, on the approximation to the optimum, so accepting a bit more slop buys us only a very slight savings in computation time. (The optimistic spin is that if we can do the calculations at all, we can come quite close to the optimum.) This route is blocked.

Another route would use the idea that the formula I've quoted is only an upper bound, the time required to solve an arbitrary linear programming problem. The problems set by economic planning might, however, have some special structure which could be exploited to find solutions faster. What might that structure be?

The most plausible candidate is to look for problems which are "separable", where the constraints create very few connections among the variables. If we could divide the variables into two sets which had nothing at all to do with each other, then we could solve each sub-problem separately, at tremendous savings in time. The supra-linear, \( n^{3.5} \) scaling would apply only within each sub-problem. We could get the optimal prices (or optimal plans) just by concatenating the solutions to sub-problems, with no extra work on our part.

Unfortunately, as Lenin is supposed to have said, "everything is connected to everything else". If nothing else, labor is both required for all production, and is in finite supply, creating coupling between all spheres of the economy. (Labor is not actually extra special here, but it is traditional4.) A national economy simply does not break up into so many separate, non-communicating spheres which could be optimized independently.

So long as we are thinking like computer programmers, however, we might try a desperately crude hack, and just ignore all kinds of interdependencies between variables. If we did that, if we pretended that the over-all high-dimensional economic planning problem could be split into many separate low-dimensional problems, then we could speed things up immensely, by exploiting parallelism or distributed processing. An actually-existing algorithm, on actually-existing hardware, could solve each problem on its own, ignoring the effect on the others, in a reasonable amount of time. As computing power grows, the supra-linear complexity of each planning sub-problem becomes less of an issue, and so we could be less aggressive in ignoring couplings.

At this point, each processor is something very much like a firm, with a scope dictated by information-processing power, and the mis-matches introduced by their ignoring each other in their own optimizations are something very much like "the anarchy of the market". I qualify with "very much like", because there are probably lots of institutional forms these could take, some of which will not look much like actually existing capitalism. (At the very least, the firm-ish entities could be publicly owned: by the state, through Roemeresque stock-market socialism, by workers' cooperatives, or indeed in other forms.)

Forcing each processor to take some account of what the others are doing, through prices and quantities in markets, removes some of the grosser pathologies. (If you're a physicist, you think of this as weak coupling; if you're a computer programmer, it's a restricted interface.) But it won't, in general, provide enough of a communication channel to actually compute the prices swiftly — at least not if we want one set of prices, available to all. Rob Axtell, in a really remarkable paper, shows that bilateral exchange can come within \( \epsilon \) of an equilibrium set of prices in a time proportional to \( n^2 \log{1/\epsilon} \), which is much faster than any known centralized scheme. But the equilibrium reached in this way has a lot of strange, ill-controlled properties that the centralized equilibrium doesn't (it's path-dependent, for starters). In any event, this is a market economy, not a planned one.
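A quick way to see how much faster Axtell's scaling is: if we take the centralized cost to be the interior-point bound quoted earlier, the ratio between the two is just \( (m+n)^{3/2} \). The sizes below are illustrative, not drawn from his paper:

```python
import math

# Comparing the two scalings quoted in the text: a centralized interior-point
# solve needs on the order of (m+n)^(3/2) * n^2 * log(1/eps) operations, while
# Axtell's bilateral-exchange result is n^2 * log(1/eps). Their ratio is
# (m+n)^(3/2). Sizes here are invented purely for illustration.

def centralized(n, m, eps):
    return (m + n) ** 1.5 * n ** 2 * math.log(1.0 / eps)

def bilateral(n, eps):
    return n ** 2 * math.log(1.0 / eps)

n, m, eps = 10**7, 10**7, 1e-3
print(centralized(n, m, eps) / bilateral(n, eps))  # (2e7)^1.5, about 9e10
```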

Now, we might hope that yet faster algorithms will be found, ones which would, say, push the complexity down from cubic in \( n \) to merely linear. There are lower bounds on the complexity of optimization problems which suggest we could never hope to push it below that. No such algorithms are known, and we have no good reason to think that they exist. We also have no reason to think that alternative computing methods would lead to such a speed-up5.

I said before that increasing the number of variables by a factor of 1000 increases the time needed by a factor of about 30 billion. To cancel this out would need a computer about 30 billion times faster, which would need about 35 doublings of computing speed, taking, if Moore's rule-of-thumb continues to hold, another half century. But my factor of 1000 for prices was quite arbitrary; if it's really more like a million, then we're talking about increasing the computation by a factor of \( 10^{21} \) (a more-than-astronomical, rather a chemical, increase), which is just under 70 doublings, or just over a century of Moore's Law.
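The doubling arithmetic is easy to check, assuming one doubling of speed every 18 months (one common reading of Moore's rule-of-thumb; the exact period is arguable):

```python
import math

# How many doublings of computing speed cancel an n^3.5 blow-up in problem
# size? Assumes one doubling every 18 months, a common (and debatable)
# reading of Moore's rule-of-thumb.

def doublings_needed(size_factor, exponent=3.5):
    """Doublings required to offset (size_factor ** exponent) more work."""
    return math.log2(size_factor ** exponent)

d_thousand = doublings_needed(1000)      # ~34.9 doublings
d_million  = doublings_needed(10**6)     # ~69.8 doublings
print(round(d_thousand), round(d_million))   # 35 70
print(d_thousand * 1.5, d_million * 1.5)     # ~52 and ~105 years
```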

If someone like Iain Banks or Ken MacLeod wants to write a novel where they say that the optimal planned economy will become technically tractable sometime around the early 22nd century, then I will read it eagerly. As a serious piece of prognostication, however, this is the kind of thinking which leads to "where's my jet-pack?" ranting on the part of geeks of a certain age.

Nonlinearity and Nonconvexity

In linear programming, all the constraints facing the planner, including those representing the available technologies of production, are linear. Economically, this means constant returns to scale: the factory need put no more, and no fewer, resources into its 10,000th pair of diapers than into its 20,000th, or its first.

Mathematically, the linear constraints on production are a special case of convex constraints. If a constraint is convex, then if we have two plans which satisfy it, so would any intermediate plan in between those extremes. (If plan A calls for 10,000 diapers and 2,000 towels, and plan B calls for 2,000 diapers and 10,000 towels, we could do half of plan A and half of plan B, make 6,000 diapers and 6,000 towels, and not run up against the constraints.) Not all convex constraints are linear; in convex programming, we relax linear programming to just require convex constraints. Economically, this corresponds to allowing decreasing returns to scale, where the 10,000th pair of diapers is indeed more expensive than the 9,999th, or the first.
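The diapers-and-towels example can be sketched directly. The constraint numbers are invented so that both plans just satisfy a shared linear (hence convex) constraint:

```python
# The diapers-and-towels example from the text: with a linear (hence convex)
# constraint, any blend of two feasible plans is feasible. The budget of
# 12,000 units is invented so that both plans sit exactly on the constraint.

def feasible(diapers, towels, budget=12000):
    # One unit of a shared resource (cloth, say) per diaper and per towel.
    return diapers + towels <= budget

plan_a = (10000, 2000)
plan_b = (2000, 10000)
half   = tuple((a + b) / 2 for a, b in zip(plan_a, plan_b))  # (6000, 6000)

print(feasible(*plan_a), feasible(*plan_b), feasible(*half))  # True True True
```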

Computationally, it turns out that the same "interior-point" algorithms which bring large linear-programming problems within reach also work on general convex programming problems. Convex programming is more computationally complex than linear programming, but not radically so.

Unfortunately for the planners, increasing returns to scale in production mean non-convex constraints; and increasing returns are very common, if only from fixed costs. If the plan calls for regular flights from Moscow to Novosibirsk, each flight has a fixed minimum cost, no matter how much or how little the plane carries. (Fuel; the labor of pilots, mechanics, and air-traffic controllers; wear and tear on the plane; wear and tear on runways; the lost opportunity of using the plane for something else.) Similarly for optimization software (you can't make any copies of the program without first expending the programmers' labor, and the computer time they need to write and debug the code). Or academic papers, or for that matter running an assembly line or a steel mill. In all of these cases, you just can't smoothly interpolate between plans which have these outputs and ones which don't. You must pay at least the fixed cost to get any output at all, which is non-convexity. And there are other sources of increasing returns, beyond fixed costs.
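The way a fixed cost breaks convexity can be shown in a few lines. Convexity of a cost function would require the cost at a midpoint to be no more than the average of the costs at the endpoints; the fixed charge violates that. The fixed and per-unit costs below are invented for illustration:

```python
# A fixed cost makes the cost function non-convex: convexity would require
# cost(q/2) <= (cost(0) + cost(q)) / 2, and the fixed charge breaks that.
# The fixed cost and per-unit cost here are invented for illustration.

FIXED, PER_UNIT = 1000.0, 2.0

def cost(q):
    """Zero output costs nothing; any positive output pays the fixed cost."""
    return 0.0 if q == 0 else FIXED + PER_UNIT * q

q = 500
midpoint_of_costs = (cost(0) + cost(q)) / 2   # 1000.0
cost_of_midpoint = cost(q / 2)                # 1500.0
print(cost_of_midpoint > midpoint_of_costs)   # True: convexity fails
```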

This is bad news for the planners, because there are no general-purpose algorithms for optimizing under non-convex constraints. Non-convex programming isn't roughly as tractable as linear programming; it's generally quite intractable. Again, the kinds of non-convexity which economic planners would confront might, conceivably, all turn out to be especially benign, so that everything becomes tractable again, but why should we think that?

If it's any consolation, allowing non-convexity messes up the markets-are-always-optimal theorems of neo-classical/bourgeois economics, too. (This illustrates Stiglitz's contention that if the neo-classicals were right about how capitalism works, Kantorovich-style socialism would have been perfectly viable.) Markets with non-convex production are apt to see things like monopolies, or at least monopolistic competition, path dependence, and actual profits and power. (My university owes its existence to Mr. Carnegie's luck, skill, and ruthlessness in exploiting the non-convexities of making steel.) Somehow, I do not think that this will be much consolation.

The Given Assortment, and Planner's Preferences

So far I have been assuming, for the sake of argument, that the planners can take their objective function as given. There does need to be some such function, because otherwise it becomes hard, if not impossible, to choose between competing plans which are all technically feasible. It's easy to say "more stuff is better than less stuff", but at some point more towels means fewer diapers, and then the planners have to decide how to trade off among different goods. If we take desired output as fixed and try to minimize inputs, the same difficulty arises (is it better to use this much less cotton fiber if it requires that much more plastic?), so I will just stick with the maximization version.

For the capitalist or even market-socialist firm, there is in principle a simple objective function: profit, measured in dollars, or whatever else the local unit of account is. (I say "in principle" because a firm isn't a unified actor with coherent goals like "maximize profits"; to the extent it acts like one, that's an achievement of organizational social engineering.) The firm can say how many extra diapers it would have to sell to be worth selling one less towel, because it can look at how much money it would make. To the extent that it can take its sales prices as fixed, and can sell as much as it can make, it's even reasonable for it to treat its objective function as linear.

But what about the planners? Even if they wanted to just look at the profit (value added) of the whole economy, they get to set the prices of consumption goods, which in turn set the (shadow) prices of inputs to production. (The rule "maximize the objective function" does not help pick an objective function.) In any case, profits are money, i.e., claims, through exchange, on goods and services produced by others. It makes no sense for the goal of the economy, as a whole, to be to maximize its claims on itself.

As I mentioned, Kantorovich had a way of evading this, which was clever if not ultimately satisfactory. He imagined the goal of the planners to be to maximize the production of a "given assortment" of goods. This means that the desired ratio of goods to be produced is fixed (three diapers for every towel), and the planners just need to maximize production at this ratio. This only pushes back the problem by one step, to deciding on the "given assortment".

We are pushed back, inevitably, to the planners having to make choices which express preferences or (in a different sense of the word) values. Or, said another way, there are values or preferences — what Nove called "planners' preferences" — implicit in any choice of objective function. This raises both a cognitive or computational problem, and at least two different political problems.

The cognitive or computational problem is that of simply coming up with relative preferences or weights over all the goods in the economy, indexed by space and time. (Remember we need such indexing to handle transport and sequencing.) Any one human planner would simply have to make up most of these, or generate them according to some arbitrary rule. To do otherwise is simply beyond the bounds of humanity. A group of planners might do better, but it would still be an immense amount of work, with knotty problems of how to divide the labor of assigning values, and a large measure of arbitrariness.

Which brings us to the first of the two political problems. The objective function in the plan is an expression of values or preferences, and people have different preferences. How are these to be reconciled?

There are many institutions which try to reconcile or adjust divergent values. This is a problem of social choice, and subject to all the usual pathologies and paradoxes of social choice. There is no universally satisfactory mechanism for making such choices. One could imagine democratic debate and voting over plans, but the sheer complexity of plans, once again, makes it very hard for members of the demos to make up their minds about competing plans, or how plans might be changed. Every citizen is put in the position of the solitary planner, except that they must listen to each other.

Citizens (or their representatives) might debate about, and vote over, highly aggregated summaries of various plans. But then the planning apparatus has to dis-aggregate, has to fill in the details left unfixed by the democratic process. (What gets voted on is a compressed encoding of the actual plan, for which the apparatus is the decoder.) I am not worried so much that citizens are not therefore debating about exactly what the plan is. Under uncertainty, especially uncertainty from complexity, no decision-maker understands the full consequences of their actions. What disturbs me about this is that filling in those details in the plan is just as much driven by values and preferences as making choices about the aggregated aspects. We have not actually given the planning apparatus a tractable technical problem (cf.).

Dictatorship might seem to resolve the difficulty, but doesn't. The dictator is, after all, just a single human being. He (and I use the pronoun deliberately) has no more ability to come up with real preferences over everything in the economy than any other person. (Thus Ashby's "law of requisite variety" strikes again.) He can, and must, delegate details to the planning apparatus, but that doesn't help the planners figure out what to do. I would even contend that he is in a worse situation than the demos when it comes to designing the planning apparatus, or figuring out what he wants to decide directly, and what he wants to delegate, but that's a separate argument. The collective dictatorship of the party, assuming anyone wanted to revive that nonsense, would only seem to give the worst of both worlds.

I do not have a knock-down proof that there is no good way of evading the problem of planners' preferences. Maybe there is some way to improve democratic procedures or bureaucratic organization to turn the trick. But any such escape is, now, entirely conjectural. In its absence, if decisions must be made, they will get made, but through the sort of internal negotiation, arbitrariness and favoritism which Spufford depicts in the Soviet planning apparatus.

This brings us to the second political problem. Even if everyone agrees on the plan, and the plan is actually perfectly implemented, there is every reason to think that people will not be happy with the outcome. They're making guesses about what they actually want and need, and they are making guesses about the implications of fulfilling those desires. We don't have to go into "Monkey's Paw" territory to realize that getting what you think you want can prove thoroughly unacceptable; it's a fact of life, which doesn't disappear in economics. And not everyone is going to agree on the plan, which will not be perfectly implemented. (Nothing is ever perfectly implemented.) These are all signs of how even the "optimal" plan can be improved, and ignoring them is idiotic.

We need then some systematic way for the citizens to provide feedback on the plan, as it is realized. There are many, many things to be said against the market system, but it is a mechanism for providing feedback from users to producers, and for propagating that feedback through the whole economy, without anyone having to explicitly track that information. This is a point which both Hayek, and Lange (before the war) got very much right. The feedback needn't be just or even mainly through prices; quantities (especially inventories) can sometimes work just as well. But what sells and what doesn't is the essential feedback.

It's worth mentioning that this is a point which Trotsky got right. (I should perhaps write that "even Trotsky sometimes got right".) To repeat a quotation:

The innumerable living participants in the economy, state and private, collective and individual, must serve notice of their needs and of their relative strength not only through the statistical determinations of plan commissions but by the direct pressure of supply and demand. The plan is checked and, to a considerable degree, realized through the market.

It is conceivable that there is some alternative feedback mechanism which is as rich, adaptive, and easy to use as the market but is not the market, not even in a disguised form. Nobody has proposed such a thing.

Errors of the Bourgeois Economists

Both neo-classical and Austrian economists make a fetish (in several senses) of markets and market prices. That this is crazy is reflected in the fact that even under capitalism, immense areas of the economy are not coordinated through the market. There is a great passage from Herbert Simon in 1991 which is relevant here:

Suppose that ["a mythical visitor from Mars"] approaches the Earth from space, equipped with a telescope that reveals social structures. The firms reveal themselves, say, as solid green areas with faint interior contours marking out divisions and departments. Market transactions show as red lines connecting firms, forming a network in the spaces between them. Within firms (and perhaps even between them) the approaching visitor also sees pale blue lines, the lines of authority connecting bosses with various levels of workers. As our visitor looked more carefully at the scene beneath it, it might see one of the green masses divide, as a firm divested itself of one of its divisions. Or it might see one green object gobble up another. At this distance, the departing golden parachutes would probably not be visible.

No matter whether our visitor approached the United States or the Soviet Union, urban China or the European Community, the greater part of the space below it would be within green areas, for almost all of the inhabitants would be employees, hence inside the firm boundaries. Organizations would be the dominant feature of the landscape. A message sent back home, describing the scene, would speak of "large green areas interconnected by red lines." It would not likely speak of "a network of red lines connecting green spots."6

This is not just because the market revolution has not been pushed far enough. ("One effort more, shareholders, if you would be libertarians!") The conditions under which equilibrium prices really are all a decision-maker needs to know, and really are sufficient for coordination, are so extreme as to be absurd. (Stiglitz is good on some of the failure modes.) Even if they hold, the market only lets people "serve notice of their needs and of their relative strength" up to a limit set by how much money they have. This is why careful economists talk about balancing supply and "effective" demand, demand backed by money.

This is just as much an implicit choice of values as handing the planners an objective function and letting them fire up their optimization algorithm. Those values are not pretty. They are that the whims of the rich matter more than the needs of the poor; that it is more important to keep bond traders in strippers and cocaine than feed hungry children. At the extreme, the market literally starves people to death, because feeding them is a less "efficient" use of food than helping rich people eat more.

I don't think this sort of pathology is intrinsic to market exchange; it comes from market exchange plus gross inequality. If we want markets to signal supply and demand (not just tautological "effective demand"), then we want to ensure not just that everyone has access to the market, but also that they have (roughly) comparable amounts of money to spend. There is, in other words, a strong case to be made for egalitarian distributions of resources being a complement to market allocation. Politically, however, good luck getting those to go together.

We are left in an uncomfortable position. Turning everything over to the market is not really an option. Beyond the repulsiveness of the values it embodies, markets in areas like health care or information goods are always inefficient (over and above the usual impossibility of informationally-efficient prices). Moreover, working through the market imposes its own costs (time and effort in searching out information about prices and qualities, negotiating deals, etc.), and these costs can be very large. This is one reason (among others) why Simon's Martian sees such large green regions in the capitalist countries — why actually-existing capitalism is at least as much an organizational as a market economy.

Planning is certainly possible within limited domains — at least if we can get good data to the planners — and those limits will expand as computing power grows. But planning is only possible within those domains because making money gives firms (or firm-like entities) an objective function which is both unambiguous and blinkered. Planning for the whole economy would, under the most favorable possible assumptions, be intractable for the foreseeable future, and deciding on a plan runs into difficulties we have no idea how to solve. The sort of efficient planned economy dreamed of by the characters in Red Plenty is something we have no clue of how to bring about, even if we were willing to accept dictatorship to do so.

That planning is not a viable alternative to capitalism (as opposed to a tool within it) should disturb even capitalism's most ardent partisans. It means that their system faces no competition, nor even any plausible threat of competition. Those partisans themselves should be able to say what will happen then: the masters of the system will be tempted, and more than tempted, to claim more and more of what it produces as monopoly rents. This does not end happily.

Calling the Tune for the Dance of Commodities

There is a passage in Red Plenty which is central to describing both the nightmare from which we are trying to awake, and the vision we are trying to awake into. Henry has quoted it already, but it bears repeating.

Marx had drawn a nightmare picture of what happened to human life under capitalism, when everything was produced only in order to be exchanged; when true qualities and uses dropped away, and the human power of making and doing itself became only an object to be traded. Then the makers and the things made turned alike into commodities, and the motion of society turned into a kind of zombie dance, a grim cavorting whirl in which objects and people blurred together till the objects were half alive and the people were half dead. Stock-market prices acted back upon the world as if they were independent powers, requiring factories to be opened or closed, real human beings to work or rest, hurry or dawdle; and they, having given the transfusion that made the stock prices come alive, felt their flesh go cold and impersonal on them, mere mechanisms for chunking out the man-hours. Living money and dying humans, metal as tender as skin and skin as hard as metal, taking hands, and dancing round, and round, and round, with no way ever of stopping; the quickened and the deadened, whirling on. ... And what would be the alternative? The consciously arranged alternative? A dance of another nature, Emil presumed. A dance to the music of use, where every step fulfilled some real need, did some tangible good, and no matter how fast the dancers spun, they moved easily, because they moved to a human measure, intelligible to all, chosen by all.

There is a fundamental level at which Marx's nightmare vision is right: capitalism, the market system, whatever you want to call it, is a product of humanity, but each and every one of us confronts it as an autonomous and deeply alien force. Its ends, to the limited and debatable extent that it can even be understood as having them, are simply inhuman. The ideology of the market tells us that we face not something inhuman but superhuman, tells us to embrace our inner zombie cyborg and lose ourselves in the dance. One doesn't know whether to laugh or cry or run screaming.

But, and this is I think something Marx did not sufficiently appreciate, human beings confront all the structures which emerge from our massed interactions in this way. A bureaucracy, or even a thoroughly democratic polity of which one is a citizen, can feel, can be, just as much of a cold monster as the market. We have no choice but to live among these alien powers which we create, and to try to direct them to human ends. It is beyond us, it is even beyond all of us, to find "a human measure, intelligible to all, chosen by all", which says how everyone should go. What we can do is try to find the specific ways in which these powers we have conjured up are hurting us, and use them to check each other, or deflect them into better paths. Sometimes this will mean more use of market mechanisms, sometimes it will mean removing some goods and services from market allocation, either through public provision7 or through other institutional arrangements8. Sometimes it will mean expanding the scope of democratic decision-making (for instance, into the insides of firms), and sometimes it will mean narrowing its scope (for instance, not allowing the demos to censor speech it finds objectionable). Sometimes it will mean leaving some tasks to experts, deferring to the internal norms of their professions, and sometimes it will mean recognizing claims of expertise to be mere assertions of authority, to be resisted or countered.

These are all going to be complex problems, full of messy compromises. Attaining even second best solutions is going to demand "bold, persistent experimentation", coupled with a frank recognition that many experiments will just fail, and that even long-settled compromises can, with the passage of time, become confining obstacles. We will not be able to turn everything over to the wise academicians, or even to their computers, but we may, if we are lucky and smart, be able, bit by bit, to make a world fit for human beings to live in.

[1] Vaguely lefty? Check. Science fiction reader? Check. Interested in economics? Check. In fact: family tradition of socialism extending to having a relative whose middle name was "Karl Marx"? Check. Gushing Ken MacLeod fan? Check. Learned linear programming at my father's knee as a boy? Check. ^

[2] More exactly, many optimization problems have the property that we can check a proposed solution in polynomial time (these are the class "NP"), but no one knows a polynomial-time way to work out a solution from the problem statement (which would put them in the class "P"). If a problem is in NP but not in P, we cannot do drastically better than just systematically go through candidate solutions and check them all. (We can often do a bit better, especially on particular cases, but not drastically better.) Whether there are any such problems, that is, whether P≠NP, is not known, but it sure seems like it. So while for most common optimization problems no polynomial-time algorithm is known, linear and even convex programming are in P. ^
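To make the asymmetry in footnote 2 concrete, here is a toy subset-sum instance (the numbers and target are invented for illustration): verifying a proposed solution takes time linear in the size of the instance, while the only general methods known for finding one amount to searching an exponentially large space of candidates.

```python
import itertools

def check(numbers, target, candidate):
    """Verify a proposed solution: time linear in the instance size."""
    return sum(numbers[i] for i in candidate) == target

def solve(numbers, target):
    """Find a solution by exhaustive search over all 2^n subsets."""
    for r in range(len(numbers) + 1):
        for candidate in itertools.combinations(range(len(numbers)), r):
            if check(numbers, target, candidate):
                return candidate
    return None

numbers = [267, 493, 869, 961, 1000, 1153, 1246, 1598, 1766, 1922]
sol = solve(numbers, 5745)
print(sol, check(numbers, 5745, sol))
```

Doubling the length of the list roughly doubles the cost of `check`, but squares the number of candidate subsets `solve` may have to examine; that gap is the practical content of "in NP but (apparently) not in P".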

[3]: Most of the relevant work has been done under a slightly different cover --- not determining shadow prices in an optimal plan, but equilibrium prices in Arrow-Debreu model economies. But this is fully applicable to determining shadow prices in the planning system. (Bowles and Gintis: "The basic problem with the Walrasian model in this respect is that it is essentially about allocations and only tangentially about markets — as one of us (Bowles) learned when he noticed that the graduate microeconomics course that he taught at Harvard was easily repackaged as 'The Theory of Economic Planning' at the University of Havana in 1969.") Useful references here are Deng, Papadimitriou and Safra's "On the Complexity of Price Equilibria" [STOC '02. preprint], Codenotti and Varadarajan's "Efficient Computation of Equilibrium Prices for Markets with Leontief Utilities", and Ye's "A path to the Arrow-Debreu competitive market equilibrium". ^

[4]: In the mathematical appendix to Best Use, Kantorovich goes to some length to argue that his objectively determined values are compatible with the labor theory of value, by showing that the o.d. values are proportional to the required labor in the optimal plan. (He begins by assuming away the famous problem of equating different kinds of labor.) A natural question is how seriously this was meant. I have no positive evidence that it wasn't sincere. But, carefully examined, all that he proves is proportionality between o.d. values and the required consumption of the first component of the vector of inputs — and the ordering of inputs is arbitrary. Thus the first component could be any input to the production process, and the same argument would go through, leading to many parallel "theories of value". (There is a certain pre-Socratic charm to imagining proponents of the labor theory of value arguing it out with the water-theorists or electricity-theorists.) It is hard for me to believe that a mathematician of Kantorovich's skill did not see this, suggesting that the discussion was mere ideological cover. It would be interesting to know at what stage in the book's "adventures" this part of the appendix was written. ^

[5]: In particular, there's no reason to think that building a quantum computer would help. This is because, as some people have to keep pointing out, quantum computers don't provide a general exponential speed-up over classical ones. ^

[6]: I strongly recommend reading the whole of this paper, if these matters are at all interesting. One of the most curious features of this little parable was that Simon was red-green color-blind. ^

[7]: Let me be clear about the limits of this. Already, in developed capitalism, such public or near-public goods as the protection of the police and access to elementary schooling are provided universally and at no charge to the user. (Or they are supposed to be, anyway.) Access to these is not regulated by the market. But the inputs needed to provide them are all bought on the market, the labor of teachers and cops very much included. I cannot improve on this point on the discussion in Lindblom's The Market System, so I will just direct you to that (i, ii). ^

[8]: To give a concrete example, neither scientific research nor free software are produced for sale on the market. (This disappoints some aficionados of both.) Again, the inputs are obtained from markets, including labor markets, but the outputs are not sold on them. How far this is a generally-viable strategy for producing informational goods is a very interesting question, which it is quite beyond me to answer. ^

The Dismal Science; The Progressive Forces; Scientifiction and Fantastica

Posted by crshalizi at May 30, 2012 09:19 | permanent link

May 23, 2012

Cognitive Democracy

Attention conservation notice: 8000+ words of political theory by geeks.

For quite a while now, Henry Farrell and I have been worrying at a skein of ideas running between institutions, networks, evolutionary games, democracy, collective cognition, the Internet and inequality. Turning these threads into a single seamless garment is beyond our hopes (and not our style anyway), but we think we've been getting somewhere in at least usefully disentangling them. An upcoming workshop gave us an excuse to set out part (but not all) of the pattern, and so here's a draft. To the extent you like it, thank Henry; the remaining snarls are my fault.

This is a draft, and anyway it's part of the spirit of the piece that feedback would be appropriate. I do not have comments, for my usual reasons, but it's cross-posted at Crooked Timber and The Monkey Cage, and you can always send an e-mail.

(Very Constant Readers will now see the context for a lot of my non-statistical writing over the last few years, including the previous post.)

Cognitive Democracy

Henry Farrell (George Washington University) and Cosma Rohilla Shalizi (Carnegie Mellon University/The Santa Fe Institute)

In this essay, we outline a cognitive approach to democracy. Specifically, we argue that democracy has unique benefits as a form of collective problem solving in that it potentially allows people with highly diverse perspectives to come together in order collectively to solve problems. Democracy can do this better than either markets or hierarchies, because it brings these diverse perceptions into direct contact with each other, allowing forms of learning that are unlikely either through the price mechanism of markets or the hierarchical arrangements of bureaucracy. Furthermore, democracy can, by experimenting, take advantage of novel forms of collective cognition that are facilitated by new media.

Much of what we say is synthetic --- our normative arguments build on both the academic literature (Joshua Cohen's and Josiah Ober's arguments about epistemic democracy; Jack Knight and James Johnson's pragmatist account of the benefits of a radically egalitarian democracy and Elster and Landemore's forthcoming collection on Collective Wisdom), and on arguments by public intellectuals such as Steven Berlin Johnson, Clay Shirky, Tom Slee and Chris Hayes. We also seek to contribute to new debates on the sources of collective wisdom. Throughout, we emphasize the cognitive benefits of democracy, building on important results from cognitive science, from sociology, from machine learning and from network theory.

We start by explaining what social institutions should do. Next, we examine sophisticated arguments that have been made in defense of markets (Hayek's theories about catallaxy) and hierarchy (Richard Thaler and Cass Sunstein's "libertarian paternalism") and discuss their inadequacies. The subsequent section lays out our arguments in favor of democracy, illustrating how democratic procedures have cognitive benefits that other social forms do not. The penultimate section discusses how democracy can learn from new forms of collective consensus formation on the Internet, treating these forms not as ideals to be approximated, but as imperfect experiments, whose successes and failures can teach us about the conditions for better decision making; this is part of a broader agenda for cross-disciplinary research involving computer scientists and democratic theorists.

Justifying Social Institutions

What are broad macro-institutions such as politics, markets and hierarchies good for? Different theorists have given very different answers to this question. The dominant tradition in political theory tends to evaluate them in terms of justice --- whether institutions use procedures, or give results, that can be seen as just according to some reasonable normative criterion. Others, perhaps more cynically, have focused on their potential contribution to stability --- whether they produce an acceptable level of social order, which minimizes violence and provides some modicum of predictability. In this essay, we analyze these institutions according to a different criterion. We start with a pragmatist question - whether these institutions are useful in helping us to solve difficult social problems.1

Some of the problems that we face in politics are simple ones (not in the sense that solutions are easy, but in the sense that they are simple to analyze). However, the most vexing problems are usually ones without any very obvious solutions. How do we change legal rules and social norms in order to mitigate the problems of global warming? How do we regulate financial markets so as to minimize the risk of new crises emerging, and limit the harm of those that happen? How do we best encourage the spread of human rights internationally?

These problems are pressing --- yet they are difficult to think about systematically, let alone solve. They all share two important features. First, they are all social problems. That is, they are problems which involve the interaction of large numbers of human beings, with different interests, desires, needs and perspectives. Second, as a result, they are complex problems, in the sense that scholars of complexity understand the term. To borrow Scott Page's (2011, p. 25) definition, they involve "diverse entities that interact in a network or contact structure."2 They are a result of behavior that is difficult to predict, so that the consequences of changing behavior are extremely hard to map out in advance. Finding solutions is difficult, and even when we find one, it is hard to know whether it is good in comparison to other possible solutions, let alone the best.

We argue that macro-institutions will best be able to tackle these problems if they have two features. First, they should foster a high degree of direct communication between individuals with diverse viewpoints. This kind of intellectual diversity is crucial to identifying good solutions to complex problems. Second, we argue that they should provide relative equality among affected actors in decision-making processes, so as to prevent socially or politically powerful groups from blocking socially beneficial changes to the detriment of their own particular interests.

We base these contentions on two sets of arguments, one from work on collective problem solving, the other from theories of political power. Both are clarified if we think of the possible solutions to a difficult problem as points on a landscape, where we seek the highest point. Difficult problems present many peaks, solutions that are better than the points close to them. Such landscapes are rugged --- they have some degree of organization, but are not so structured that simple algorithms can quickly find the best solution. There is no guarantee that any particular peak is globally optimal (i.e. the best solution across the entire landscape) rather than locally optimal (the best solution within a smaller subset of the landscape).

Solving a complex problem involves a search across this landscape for the best visible solutions. Individual agents have limited cognitive abilities, and (usually) limited knowledge of the landscape. Both of these make them likely to get stuck at local optima, which may be much worse than even other local peaks, let alone the global optimum. Less abstractly, people may settle for bad solutions, because they do not know better (they cannot perceive other, better solutions), or because they have difficulty in reaching these solutions (e.g. because of coordination problems, or because of the ability of powerful actors to veto possible changes).

Lu Hong and Scott Page (2004) use mathematical models to argue that diversity of viewpoints helps groups find better solutions (higher peaks on the landscape). The intuition is that different individuals, when confronting a problem, "see" different landscapes --- they organize the set of possible solutions in different ways, some of which are useful in identifying good peaks, some less so. Very smart individuals (those with many mental tools) have better organized landscapes than less smart individuals, and so are less likely to get trapped at inferior local optima. However, at the group level, diversity of viewpoints matters a lot. Hong and Page find that "diversity trumps ability". Groups with high diversity of internal viewpoints are better able to identify optima than groups composed of much smarter individuals with more homogenous viewpoints. By putting their diverse views together, the former are able to map out more of the landscape and identify possible solutions that would be invisible to groups of individuals with more similar perspectives.
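The flavor of the Hong-Page result can be caught in a deliberately crude simulation (a sketch in the spirit of their model, not a reimplementation of it; the ring landscape, the step-size "heuristics", and the random seed are all invented): each agent searches a rugged landscape with its own set of jump sizes, and a team relays the search, each member climbing from wherever the last one stopped.

```python
import random

random.seed(2012)

N = 200
# A rugged "solution landscape": independent random heights on a ring of N points.
landscape = [random.random() for _ in range(N)]

def climb(start, steps):
    """Greedy search from `start`: try each jump size in `steps` (clockwise
    on the ring) and move whenever the jump lands somewhere higher."""
    x = start
    improved = True
    while improved:
        improved = False
        for s in steps:
            y = (x + s) % N
            if landscape[y] > landscape[x]:
                x, improved = y, True
    return x

def team_value(heuristics, start=0):
    """Team search: members take turns climbing from wherever the previous
    member stopped, until nobody can improve the shared solution."""
    x = start
    improved = True
    while improved:
        improved = False
        for h in heuristics:
            y = climb(x, h)
            if landscape[y] > landscape[x]:
                x, improved = y, True
    return landscape[x]

# A homogeneous "smart" team: three copies of the same fine-grained heuristic.
homogeneous = [[1, 2, 3]] * 3
# A diverse team: individually cruder heuristics, but different from each other.
diverse = [[1, 11], [5, 7], [3, 13]]

print("homogeneous:", team_value(homogeneous))
print("diverse:    ", team_value(diverse))
```

Since the landscape is random, neither team is guaranteed to win on any single draw; the Hong-Page claim is about averages, so the honest comparison is to rerun this over many seeds. The mechanism is visible either way: identical agents all get stuck at the same local peak, while agents with different step sets can rescue each other from one another's dead ends.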

Hong and Page do not model the social processes through which individuals can bring their diverse points of view together into a common framework. However, their arguments surely suggest that actors' different points of view need to be exposed directly to each other, in order to identify the benefits and drawbacks of different points of view, the ways in which viewpoints can be combined to better advantage, and so on. These arguments are supported by a plethora of work in sociology and elsewhere (Burt, Rossman etc). As we explain at length below, some degree of clumping is also beneficial, so that individuals with divergent viewpoints do not converge too quickly.

The second issue for collective problem solving is more obvious. Even when groups are able to identify good solutions (relatively high peaks in the solution landscape), they may not be able to reach them. In particular, actors who benefit from the status quo (or who would prefer less generally-beneficial solutions) may be able to use political and social power to block movement towards such peaks, and instead compel movement towards solutions that have lower social and greater individual benefits. Research on problem solving typically does not talk about differences in actors' interests, or in actors' ability successfully to pursue their interests. While different individuals initially perceive different aspects of the landscape, researchers assume that once they are able to communicate with each other, they will all agree on how to rank visible solutions from best to worst. But actors may have diverse interests as well as diverse understandings of the world (and the two may indeed be systematically linked). They may even be working in such different landscapes, in terms of personal advantage, that one actor's peak is another's valley, and vice versa. Moreover, actors may differ in their ability to ensure that their interests are prosecuted. Recent work in political theory (Knight 1992, Johnson and Knight 2011), economics (Bowles and Naidu, 2008), political science (Hacker and Pierson 2010) and sociology details how powerful actors may be able to compel weaker ones to accept solutions that are to the advantage of the former, but that have lower overall social benefits.

Here, relative equality of power can have important consequences. Individuals in settings with relatively equal power relations, are, ceteris paribus more likely to converge on solutions with broad social benefits, and less likely to converge on solutions that benefit smaller groups of individuals at the expense of the majority. Furthermore, equal power relations may not only make it easier to converge on "good" solutions when they have been identified, but may stimulate the process of search for such solutions. Participating in the search for solutions and in decision-making demands resources (at a minimum, time), and if those resources are concentrated in a small set of actors, with similar interests and perspectives, the solutions they will find will be fewer and worse than if a wide variety of actors can also search.

With this in mind, we ask whether different macro-institutions are better, or worse, at solving the complex problems that confront modern economies and societies. Institutions will tend to do better to the extent that they both (i) bring together people with different perspectives, and (ii) share decision-making power relatively equally. Our arguments are, obviously, quite broad. We do not speak much to the specifics of how macro-institutions work, instead focusing on the broad logics of these different macro-institutions. Furthermore, we do not look at the ways in which our desiderata interact with other reasonable desiderata (such as social stability, justice and so on). Even so, we think that it is worth clarifying the ways in which different institutions can, or cannot, solve complex problems. In recent decades, for example, many scholars and policy makers have devoted time and energy to advocating markets as the way to address social problems that are too complex to be solved by top-down authority. As we show below, markets, to the extent that they imply substantial power inequalities, and increasingly homogenize human relations, are unlikely to possess the virtues attributed to them, though they can have more particular benefits under specific circumstances. Similarly, hierarchy suffers from dramatic informational flaws. This prompts us to reconsider democracy, not for the sake of justice or stability, but as a tool for solving the complex problems faced by modern societies.

Markets and Hierarchies as Ways to Solve Complex Problems

Many scholars and public intellectuals believe that markets or hierarchies provide better ways to solve complex problems than democracy. Advocates of markets usually build on the groundbreaking work of F. A. von Hayek, to argue that market based forms of organization do a better job of eliciting information and putting it to good work than does collective organization. Advocates of hierarchy do not write from any such unified tradition. However, Richard Thaler and Cass Sunstein have recently made a sophisticated case for the benefits of hierarchy. They advocate a combination of top-down mechanism design and institutions designed to guide choices rather than to constrain them - what they call libertarian paternalism - as a way to solve difficult social problems. Hayek's arguments are not the only case for markets, and Thaler and Sunstein's are not the only justification for hierarchy. They are, however, among the best such arguments, and hence provide a good initial way to test the respective benefits of markets, hierarchies and democracies in solving complex problems. If there are better arguments, which do not fall victim to the kinds of problems we point to, we are not aware of them (but would be very happy to be told of them).

Hayek's account of the informational benefits of markets is groundbreaking. Although it builds on the insights of others (particularly Michael Polanyi), it is arguably the first real effort to analyze how social institutions work as information-processors. Hayek reasons as follows. Much of human knowledge (as Polanyi argues) is practical, and cannot be fully articulated ("tacit"). This knowledge is nonetheless crucial to economic life. Hence, if we are to allocate resources well, we must somehow gather this dispersed, fragmentary, informal knowledge, and make it useful.

Hayek is explicit that no one person can know all that is required to allocate resources properly, so there must be a social mechanism for such information processing. Hayek identifies three possible mechanisms: central planning, planning by monopolistic industries, and decentralized planning by individuals. He argues that the first and second of these break down when we take account of the vast amount of tacit knowledge, which cannot be conveyed to any centralized authority. Centralized or semi-centralized planning are especially poor at dealing with the constant flows of major and minor changes through which an economy (or, as Hayek would prefer, a catallaxy) approaches balance. To deal with such changes, we need people to make the necessary decisions on the spot --- but we also need some way to convey the appropriate information about changes in the larger economic system to him or her. The virtue of the price system, for Hayek, is to compress diffuse, even tacit, knowledge about specific changes in specific circumstances into a single index, which can guide individuals as to how they ought to respond to changes elsewhere. I do not need to grasp the intimate local knowledge of the farmer who sells me tomatoes in order to decide whether to buy their products. The farmer needs to know the price of fertilizer, not how it is made, or what it could be used for other than tomatoes, or the other uses of the fertilizer's ingredients. (I do not even need to know the price of fertilizer.) The information that we need, to decide whether to buy tomatoes or to buy fertilizer, is conveyed through prices, which may go up or down, depending on the aggregate action of many buyers or suppliers, each working with her own tacit understandings.
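The price-as-compressed-signal idea can be caricatured in a few lines (the demand and supply curves here are made up, and the adjustment rule is the textbook Walrasian tatonnement, a stand-in for the aggregate of buyers and sellers rather than anything in Hayek's own account): no one computes the equilibrium; the price simply drifts in the direction of excess demand until the two sides balance.

```python
def excess_demand(price):
    """Toy single-good market: demand falls with price, supply rises with it
    (invented curves, standing in for dispersed tacit knowledge)."""
    demand = 100.0 / price
    supply = 2.0 * price
    return demand - supply

# Tatonnement: the price moves in the direction of excess demand, which is
# the only aggregate signal observed; the curves themselves stay "tacit".
price = 1.0
for _ in range(200):
    price += 0.1 * excess_demand(price)

# The fixed point solves 100/p = 2p, i.e. p = sqrt(50) ~ 7.07.
print(round(price, 2))  # 7.07
```

The design point is that the update rule uses only the one-dimensional signal `excess_demand(price)`, never the curves themselves; that is the sense in which the index coordinates people who know nothing of each other's circumstances.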

This insight is both crucial and beautiful3, yet it has stark limits. It suggests that markets will be best at conveying a particular kind of information about a particular kind of underlying fact, i.e., the relative scarcity of different goods. As Stiglitz (2000) argues, market signals about relative scarcity are always distorted, because prices embed information about many other economically important factors. More importantly, although information about relative scarcity surely helps markets approach some kind of balance, it is little help in solving more complicated social problems, which may depend not on allocating existing stocks of goods in a useful way, given people's dispersed local knowledge, so much as discovering new goods or new forms of allocation. More generally, Hayek's well-known detestation for projects with collective goals leads him systematically to discount the ways in which aggregate knowledge might work to solve collective rather than individual problems.

This is unfortunate. To the extent that markets fulfil Hayek's criteria, and mediate all relevant interactions through the price mechanism, they foreclose other forms of exchange that are more intellectually fruitful. In particular, Hayek's reliance on arguments about inarticulable tacit knowledge means that he leaves no place for reasoned discourse or the useful exchange of views. In Hayek's markets, people communicate only through prices. The advantage of prices, for Hayek, is that they inform individuals about what others want (or don't want), without requiring anyone to know anything about anyone else's plans or understandings. But there are many useful forms of knowledge that cannot readily be conveyed in this way.

Individuals may learn something about those understandings as a by-product of market interactions. In John Stuart Mill's description:

But the economical advantages of commerce are surpassed in importance by those of its effects which are intellectual and moral. It is hardly possible to overrate the value, in the present low state of human improvement, of placing human beings in contact with persons dissimilar to themselves, and with modes of thought and action unlike those with which they are familiar. Commerce is now what war once was, the principal source of this contact.

However, such contact is largely incidental --- people engage in market activities to buy or to sell to best advantage, not to learn. As markets become purer, in both the Hayekian and neo-classical senses, they produce ever less of the contact between different modes of life that Mill regards as salutary. The resurgence of globalization; the creation of an Internet where people who will only ever know each other by their account names buy and sell from each other; the replacement of local understandings with global standards; all these provide enormous efficiency gains and allow information about supply and demand to flow more smoothly. Yet each of them undermines the Millian benefits of commerce, by making it less likely that individuals with different points of view will have those perspectives directly exposed to each other. More tentatively, markets may themselves have a homogenizing impact on differences between individuals and across societies, again reducing diversity. As Albert Hirschman shows, there is a rich, if not unambiguous, literature on the global consequences of market society. Sociologists such as John Meyer and his colleagues find evidence of increased cultural and social convergence across different national contexts, as a result of exposure to common market and political forces.

In addition, it is unclear whether markets in general reduce power inequalities or reinforce them in modern democracies. It is almost certainly true that the spread of markets helped undermine some historical forms of hierarchy, such as feudalism (Marx). It is not clear that they continue to do so in modern democracies. On the one hand, free market participation provides individuals with some ability (presuming equal market access, etc.) to break away from abusive relationships. On the other, markets provide greater voice and choice to those with more money; if money talks in politics, it shouts across the agora. Nor are these effects limited to the marketplace. The market facilitates and fosters asymmetries of wealth which in turn may be directly or indirectly translated into asymmetries of political influence (Lindblom). Untrammeled markets are associated with gross income inequalities, which in turn infect politics with a variety of pathologies. This suggests that markets fail in the broader task of exposing individuals' differing perspectives to each other. Furthermore, markets are at best indifferent levelers of unequal power relations.

Does hierarchy do better? In an influential recent book, Richard Thaler and Cass Sunstein suggest that it does. They argue that "choice architects", people who have "responsibility for organizing the context in which people make decisions," can design institutions so as to spur people to make better choices rather than worse ones. Thaler and Sunstein are self-consciously paternalist, claiming that flawed behavior and thinking consistently stop people from making the choices that are in their best interests. However, they also find direct control of people's choices morally objectionable. Libertarian paternalism seeks to guide but not eliminate choice, so that the easiest option is the "best" choice that individuals would make, if they only had sufficient attention and discipline. It provides paternalistic guidance through libertarian means, shaping choice contexts to make it more likely that individuals will make the right choices rather than the wrong ones.

This is, in Thaler and Sunstein's words, a politics of "nudging" choices rather than dictating them. Although Thaler and Sunstein do not put it this way, it is also a plea for the benefits of hierarchy in organizations and, in particular, in government. Thaler and Sunstein's "choice architects" are hierarchical superiors, specifically empowered to create broad schemes that will shape the choices of many other individuals. Their power to do this does not flow from, e.g., accountability to those whose choices get shaped. Instead, it flows from positions of authority within firm or government, which allow them to craft pension contribution schemes within firms, environmental policy within the government, and so on.

Thaler and Sunstein's recommendations have outraged libertarians, who believe that a nudge is merely a well-aimed shove --- that individuals' freedom will be reduced nearly as much by Thaler and Sunstein's choice architecture, as it would be by direct coercion. We are also unenthusiastic about libertarian paternalism, but for different reasons. While we do not talk, here, about coercion, we have no particular normative objection to it, provided that it is proportionate, directed towards legitimate ends, and constrained by well-functioning democratic controls. Instead, we worry that the kinds of hierarchy that Thaler and Sunstein presume actively inhibit the unconstrained exchange of views that we see as essential to solving complex problems.

Bureaucratic hierarchy is an extraordinary political achievement. States with clear, accountable hierarchies can achieve vast and intricate projects, and businesses use hierarchies to coordinate highly complex chains of production and distribution.4 Even so, there are reasons why bureaucracies have few modern defenders. Hierarchies rely on power asymmetries to work. Inferiors take orders from superiors, in a chain of command leading up to the chief executive officer (in firms) or some appointed or non-appointed political actor (in government). This is good for pushing orders down the chain, but notoriously poor at transmitting useful information up, especially kinds of information superiors did not anticipate wanting. As scholars from Max Weber on have emphasized, bureaucracies systematically encourage a culture of conformity in order to increase predictability and static efficiency.

Thaler and Sunstein presume a hierarchy in which orders are followed and policies are implemented, but ignore what this implies about feedback. They imagine hierarchically-empowered architects shaping the choices of a less well-informed and less rational general population. They discuss ordinary people's bad choices at length. However, they have remarkably little to say about how it is that the architects housed atop the hierarchy can figure out better choices on these individuals' behalf, or how the architects can actually design choice systems that will encourage these choices. Sometimes, Thaler and Sunstein suggest that choice architects can rely on introspection: "Libertarian paternalists would like to set the default by asking what reflective employees in Janet's position would actually want." At other times, they imply that choice architects can use experimental techniques. The book's opening analogy proposes a set of experiments, in which the director of food services for a system "with hundreds of schools" (p. 1), "who likes to think about things in non-traditional ways," experiments with different arrangements of food in order to discover which displays encourage kids to pick the healthier options. Finally, Thaler and Sunstein sometimes argue that choice architects can use results from the social sciences to find optima.

One mechanism of information gathering that they systematically ignore is active feedback from citizens. Although they argue in passing that feedback from choice architects can help guide consumers, e.g., giving information about the content of food, or by shaping online interactions to ensure that people are exposed to others' points of view, they have no place for feedback from the individuals whose choices are being manipulated to help guide the choice architects, let alone to constrain them. As Suzanne Mettler (2011) has pointed out, Thaler and Sunstein depict citizens as passive consumers, who need to be guided to the desired outcomes, rather than active participants in democratic decision making.

This also means that Thaler and Sunstein's proposals don't take advantage of diversity. Choice architects, located within hierarchies which tend generically to promote conformity, are likely to have a much more limited range of ways of understanding problems than the population whose choices they are seeking to structure. In Scott Page's terms, these actors may be very "able" --- they will have sophisticated and complex heuristics, so that each individual choice architect is better able than each individual member of the population to see a large portion of the landscape of possible choices and outcomes. However, the architects will be very similar to each other in background and training, so that as a group they will see a far more limited set of possibilities than a group of randomly selected members of the population (who are likely to have less sophisticated but far more diverse heuristics). Cultural homogeneity among hierarchical elites helps create policy disasters (the "best and brightest" problem). Direct involvement of a wider selection of actors with more diverse heuristics would alleviate this problem.
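Page's contrast between individual ability and group diversity can be made concrete with a toy simulation in the spirit of Hong and Page's model. (This is an illustrative simplification, not their exact setup: the ring landscape, the step-size "heuristics", and the pooling rule below are all our own choices.) Each agent's heuristic is a small set of jump sizes it uses to hill-climb; a group pools its members' answers and keeps climbing from the best one:

```python
import random

def climb(values, start, heuristic):
    """Greedy hill-climbing on a ring, using a fixed set of jump sizes."""
    n, pos = len(values), start
    improved = True
    while improved:
        improved = False
        for step in heuristic:
            if values[(pos + step) % n] > values[pos]:
                pos = (pos + step) % n
                improved = True
    return pos

def group_climb(values, start, heuristics):
    """Each member climbs from the group's current position; the group
    adopts the best endpoint and repeats until nobody can improve it."""
    pos = start
    while True:
        best = max((climb(values, pos, h) for h in heuristics),
                   key=lambda p: values[p])
        if values[best] <= values[pos]:
            return pos
        pos = best

rng = random.Random(0)
N = 200
values = [rng.random() for _ in range(N)]                    # a rugged landscape
heuristics = [tuple(rng.sample(range(1, 13), 3)) for _ in range(10)]

def average_score(solver):
    return sum(values[solver(s)] for s in range(N)) / N

best_solo = max(average_score(lambda s, h=h: climb(values, s, h))
                for h in heuristics)
group = average_score(lambda s: group_climb(values, s, heuristics))
print(best_solo, group)
```

By construction the group does at least as well as its best member from every starting point (its first round includes each member's solo climb), and it can only get stuck where every member's step set is stuck --- which is the mechanism behind "diversity trumps ability": a group of diverse mediocre searchers has far fewer shared local optima than any one sophisticated searcher has alone.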

However, precisely because choice architects rely on hierarchical power to create their architectures, they will have difficulty in eliciting feedback, even if they want to. Inequalities of power notoriously dampen real exchanges of viewpoints. Hierarchical inferiors within organizations worry about contradicting their bosses. Ordinary members of the public are uncomfortable when asked to contradict experts or officials. Work on group decision making (including, e.g., Sunstein 2003) is full of examples of how perceived power inequalities lead less powerful actors either to remain silent, or merely to affirm the views of more powerful actors, even when they have independently valuable perspectives or knowledge.

In short, libertarian paternalism is flawed, not because it restricts peoples' choices, but because it makes heroic assumptions about choice architects' ability to figure out what the actual default choices should be, and blocks their channels for learning better. Choice architects will be likely to share a narrow range of sophisticated heuristics, and to have difficulty in soliciting feedback from others with more diverse heuristics, because of their hierarchical superiority and the unequal power relations that this entails. Libertarian paternalism may still have value in situations of individual choice, where people likely do "want" e.g. to save more or take more exercise, but face commitment problems, or when other actors have an incentive to misinform these people or to structure their choices in perverse ways in the absence of a 'good' default choice. However, it will be far less useful, or even actively pernicious, in complex situations, where many actors with different interests make interdependent choices. Indeed, Thaler and Sunstein are far more convincing when they discuss how to encourage people to choose appropriate pension schemes than when they suggest that environmental problems are the "outcome of a global choice architecture system" that could be usefully rejiggered via a variety of voluntaristic mechanisms.

Democracy as a way to solve complex problems

Is democracy better at identifying solutions to complex problems? Many --- even on the left --- doubt that it is. They point to problems of finding common ground and of partisanship, and despair of finding answers to hard questions. The dominant tradition of American liberalism actually has considerable distaste for the less genteel aspects of democracy. The early 20th century Progressives and their modern heirs deplore partisanship and political rivalry, instead preferring technocracy, moderation and deliberation (Rosenblum 2008). Some liberals (e.g., Thaler and Sunstein) are attracted to Hayekian arguments for markets and libertarian paternalist arguments for hierarchy exactly because they seem better than the partisan rancor of democratic competition.

We believe that they are wrong, and that democracy offers a better way of solving complex problems. Since, as we've argued, power asymmetries inhibit problem-solving, democracy has a large advantage over both markets and technocratic hierarchy. The fundamental democratic commitment is to equality of power over political decision making. Real democracies do not deliver on this commitment any more than real markets deliver perfect competition, or real hierarchies deliver an abstractly benevolent interpretation of rules. But a commitment to democratic improvements is a commitment to making power relations more equal, just as a commitment to markets is to improving competition, and a commitment to hierarchy (in its positive aspects) is a commitment to greater disinterestedness. This implies that a genuine commitment to democracy is a commitment to political radicalism. We embrace this.

Democracy, then, is committed to equality of power; it is also well-suited to exposing points of view to each other in a way that leads to identifying better solutions. This is because democracy also involves debate. In competitive elections and in more intimate discussions, democratic actors argue over which proposals are better or worse, exposing their different perspectives to each other.

Yet at first glance, this interchange of perspectives looks ugly: it is partisan, rancorous and vexatious, and people seem to never change their minds. This leads some on the left to argue that we need to replace traditional democratic forms with ones that involve genuine deliberation, where people will strive to be open-minded, and to transcend their interests. These aspirations are hopelessly utopian. Such impartiality can only be achieved fleetingly at best, and clashes of interest and perception are intrinsic to democratic politics.

Here, we concur with Jack Knight and Jim Johnson's important recent book (2011), which argues that politics is a response to the problem of diversity. Actors with differing --- indeed conflicting --- interests and perceptions find that their fates are bound together, and that they must make the best of this. Yet, Knight and Johnson argue, politics is also a matter of seeking to harness diversity so as to generate useful knowledge. They specifically do not argue that democracy requires impartial deliberation. Instead, they claim that partial and self-interested debate can have epistemological benefits. As they describe it, "democratic decision processes make better use of the distributed knowledge that exists in a society than do their rivals" such as market coordination or judicial decision making (p. 151). Knight and Johnson suggest that approaches based on diversity, such as those of Scott Page and Elizabeth Anderson, provide a better foundation for thinking about the epistemic benefits of democracy than the arguments of Condorcet and his intellectual heirs.

We agree. Unlike Hayek's account of markets, and Thaler and Sunstein's account of hierarchy, this argument suggests that democracy can foster communication among individuals with highly diverse viewpoints. This is an argument for cognitive democracy, for democratic arrangements that take best advantage of the cognitive diversity of their population. Like us, Knight and Johnson stress the pragmatic benefits of equality. Harnessing the benefits of diversity means ensuring that actors with a very wide range of viewpoints have the opportunity to express their views and to influence collective choice. Unequal societies will select only over a much smaller range of viewpoints --- those of powerful people. Yet Knight and Johnson do not really talk about the mechanisms through which clashes between different actors with different viewpoints result in better decision making. Without such a theory, it could be that conflict between perspectives results in worse rather than better problem solving. To make a good case for democracy, we not only need to bring diverse points of view to the table, but show that the specific ways in which they are exposed to each other have beneficial consequences for problem solving.

There is micro-level work which speaks to this issue. Hugo Mercier and Dan Sperber (2011) advance a purely 'argumentative' account of reasoning, on which reasoning is not intended to reach right answers, but rather to evaluate the weaknesses of others' arguments and come up with good arguments to support one's own position. This explains both why confirmation bias and motivated reasoning are rife, and why the quality of argument is significantly better when actors engage in real debates. Experimentally, individual performance when reasoning in non-argumentative settings is 'abysmal,' but is 'good' in argumentative settings. This, in turn, means that groups are typically better in solving problems than is the best individual within the group. Indeed, where there is diversity of opinion, confirmation bias can have positive consequences in pushing people to evaluate and improve their arguments in a competitive setting.

When one is alone or with people who hold similar views, one's arguments will not be critically evaluated. This is when the confirmation bias is most likely to lead to poor outcomes. However, when reasoning is used in a more felicitous context — that is, in arguments among people who disagree but have a common interest in the truth — the confirmation bias contributes to an efficient form of division of cognitive labor. When a group has to solve a problem, it is much more efficient if each individual looks mostly for arguments supporting a given solution. They can then present these arguments to the group, to be tested by the other members. This method will work as long as people can be swayed by good arguments, and the results reviewed ... show that this is generally the case. This joint dialogic approach is much more efficient than one where each individual on his or her own has to examine all possible solutions carefully (p. 65).

A separate line of research in experimental social psychology (Nemeth et al. (2004), Nemeth and Ormiston (2007), and Nemeth (2012)) indicates that problem-solving groups produce more solutions, which outsiders assess as better and more innovative, when they contain persistent dissenting minorities, and are encouraged to engage in, rather than refrain from, mutual criticism. (Such effects can even be seen in school-children: see Mercer, 2000.) This, of course, makes a great deal of sense from Mercier and Sperber's perspective.

This provides micro-level evidence that political argument will improve problem solving, even if we are skeptical about human beings' ability to abstract away from their specific circumstances and interests. Neither a commitment to deliberation, nor even standard rationality is required for argument to help solve problems. This has clear implications for democracy, which forces actors with very different perspectives to engage with each others' viewpoints. Even the most homogenous-seeming societies contain great diversity of opinion and of interest (the two are typically related) within them. In a democracy, no single set of interests or perspectives is likely to prevail on its own. Sometimes, political actors have to build coalitions with others holding dissimilar views, a process which requires engagement between these views. Sometimes, they have to publicly contend with others holding opposed perspectives in order to persuade uncommitted others to favor their interpretation, rather than another. Sometimes, as new issues arise, they have to persuade even their old allies of how their shared perspectives should be reinterpreted anew.

More generally, many of the features of democracy that skeptical liberals deplore are actually of considerable benefit. Mercier and Sperber's work provides microfoundations for arguments about the benefits of political contention, such as John Stuart Mill's, and of arguments for the benefits of partisanship, such as Nancy Rosenblum's (2008) sympathetic critique and reconstruction of Mill. Their findings suggest that the confirmation bias that political advocates are subject to can have crucial benefits, so long as it is tempered by the ability to evaluate good arguments in context.

Other work suggests that the macro-structures of democracies too can have benefits. Lazer and Friedman (2007) find on the basis of simulations that problem solvers connected via linear networks (in which there are few links) will find better solutions over the long run than problem solvers connected via totally connected networks (in which all nodes are linked to each other). In a totally connected network, actors copy the best immediately visible solution quickly, driving out diversity from the system, while in a linear network, different groups explore the space around different solutions for a much longer period, making it more likely that they will identify better solutions that were not immediately apparent. Here, the macro-level structure of the network does the same kind of work that confirmation bias does in Mercier and Sperber's work --- it preserves diversity and encourages actors to keep exploring solutions that may not have immediate payoffs.5
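The intuition behind the Lazer and Friedman result can be reproduced in a few lines. (This is an illustrative toy, not their actual model: the landscape here is just a random lookup table over bit-strings, and at each step an agent either copies the best solution visible among its network neighbors or flips one bit of its own solution.)

```python
import random

def make_landscape(n_bits, rng):
    # A maximally rugged toy landscape: every bit-string gets an
    # independent random score, so neighbors are uncorrelated.
    return [rng.random() for _ in range(2 ** n_bits)]

def step(solutions, landscape, neighbors, rng, n_bits):
    new = []
    for i, sol in enumerate(solutions):
        # Best solution visible among network neighbors (and self).
        best = max(neighbors[i] + [i], key=lambda j: landscape[solutions[j]])
        if landscape[solutions[best]] > landscape[sol]:
            new.append(solutions[best])                # imitate a neighbor
        else:
            flip = sol ^ (1 << rng.randrange(n_bits))  # explore locally
            new.append(flip if landscape[flip] > landscape[sol] else sol)
    return new

def run(neighbors, n_agents=20, n_bits=12, steps=50, seed=1):
    rng = random.Random(seed)
    landscape = make_landscape(n_bits, rng)
    sols = [rng.randrange(2 ** n_bits) for _ in range(n_agents)]
    for _ in range(steps):
        sols = step(sols, landscape, neighbors, rng, n_bits)
    # Return the best score found and how many distinct solutions remain.
    return max(landscape[s] for s in sols), len(set(sols))

n = 20
full = {i: [j for j in range(n) if j != i] for i in range(n)}  # all linked
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}       # few links

print("fully connected:", run(full))
print("ring:", run(ring))
```

On the fully connected network, imitation spreads the current best solution almost immediately and the count of distinct solutions collapses; on the ring, separate neighborhoods keep exploring different regions for much longer. Averaged over many seeds, the sparser network typically ends at higher peaks, which is the simulation result the text describes.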

This work offers a cognitive justification for the macro-level organization of democratic life around political parties. Party politics tends to organize debate into intense clusters of argument among people (partisans for the one or the other party) who agree in broad outline about how to solve problems, but who disagree vigorously about the specifics. Links between these clusters are much rarer than links within them, and are usually mediated by competition. Under a cognitive account, one might see each of these different clusters as engaged in exploring the space of possibilities around a particular solution, maintaining some limited awareness of other searches being performed within other clusters, and sometimes discreetly borrowing from them in order to improve competitiveness, but nonetheless preserving an essential level of diversity (cf. Huckfeldt et al., 2004). Such very general considerations do not justify any specific partisan arrangement, as there may be better (or worse) arrangements available. What it does is highlight how party organization and party competition can have benefits that are hard or impossible to match in a less clustered and more homogenous social setting. Specifically, it shows how partisan arrangements can be better at solving complex problems than non-partisan institutions, because they better preserve and better harness diversity.

This leads us to argue that democracy will be better able to solve complex problems than either markets or hierarchy, for two reasons. First, democracy embodies a commitment to political equality that the other two macro-institutions do not. Clearly, actual democracies achieve political equality more or less imperfectly. Yet if we are right, the better a democracy is at achieving political equality, the better it will be, ceteris paribus, at solving complex problems. Second, democratic argument, which people use either to ally with or to attack those with other points of view, is better suited to exposing different perspectives to each other, and hence capturing the benefits of diversity, than either markets or hierarchies. Notably, we do not make heroic claims about people's ability to deliberate in some context that is free from faction and self-interest. Instead, even under realistic accounts of how people argue, democratic argument will have cognitive benefits, and indeed can transform private vices (confirmation bias) into public virtues (the preservation of cognitive diversity)6. Democratic structures --- such as political parties --- that are often deplored turn out to have important cognitive advantages.

Democratic experimentalism and the Internet

As we have emphasized several times, we have no reason to think that actually-existing democratic structures are as good as they could be, or even close. If nothing else, designing institutions is, itself, a highly complex problem, where even the most able decision-makers have little ability to foresee the consequences of their actions. Even when an institution works well at one time, the array of other institutions, social and physical conditions in which it must function is constantly changing. Institutional design and reform, then, is unavoidably a matter of more or less ambitious "piecemeal social experiments", to use the phrase of Popper (1957). As emphasized by Popper, and independently by Knight and Johnson, one of the strengths of democracy is its ability to make, monitor, and learn from such experiments.7 (Knight and Johnson particularly emphasize the difficulty markets have in this task.) Democracies can, in fact, experiment with their own arrangements.

For several reasons, the rise of the Internet makes this an especially propitious time for experimenting with democratic structures themselves. The means available for communication and information-processing are obviously going to change the possibilities for collective decision-making. (Bureaucracy was not an option in the Old Stone Age, nor representative democracy without something like cheap printing.) We do not yet know the possibilities of Internet-mediated communication for gathering dispersed knowledge, for generating new knowledge, for complex problem-solving, or for collective decision-making, but we really ought to find out.

In fact, we are already starting to find out. People are building systems to accomplish all of these tasks, in narrower or broader domains, for their own reasons. Wikipedia is, of course, a famous example of allowing lots of more-or-less anonymous people to concentrate dispersed information about an immense range of subjects, and to do so both cheaply and reliably8. Crucially, however, it is not unique. News-sharing sites like Digg, Reddit, etc. are ways of focusing collective attention and filtering vast quantities of information. Sites like StackExchange have become a vital part of programming practice, because they encourage the sharing of know-how about programming, with the same system spreading to many other technical domains. The knowledge being aggregated through such systems is not tacit but articulated and discursive; it was, however, dispersed, and is now shared. Similar systems are even being used to develop new knowledge. One mode of this is open-source software development, but it is also being used in experiments like the Polymath Project for doing original mathematics collaboratively9.

At a more humble level, there are the ubiquitous phenomena of mailing lists, discussion forums, etc., etc., where people with similar interests discuss them, on basically all topics of interest to people with enough resources to get on-line. These are, largely inadvertently, experiments in developing collective understandings, or at least shared and structured disagreements, about these topics.

All such systems have to face tricky problems of coordinating their computational architecture, their social organization, and their cognitive functions (Shalizi, 2007; Farrell and Schwartzberg, 2008). They need ways of making findings (or claims) accessible, of keeping discussion productive, and so forth and so on. (Often, participants are otherwise strangers to each other, which is at the least suggestive of the problems of trust and motivation which will face efforts to make mass democracy more participative.) This opens up an immense design space, which is still very poorly understood --- but almost certainly presents a rugged search landscape, with an immense number of local maxima and no very obvious path to the true peaks. (It is even possible that the landscape, and so the peaks, could vary with the subject under debate.) One of the great aspects of the current moment, for cognitive democracy, is that it has become (comparatively) very cheap and easy for such experiments to be made online, so that this design space can be explored.

There are also online ventures which are failures, and these, too, are informative. They range from poorly-designed sites which never attract (or actively repel) a user base, or produce much of value, to online groupings which are very successful in their own terms, but are, cognitively, full of fail, such as thriving communities dedicated to conspiracy theories. These are not just random, isolated eccentrics, but highly structured communities engaged in sharing and developing ideas, which just so happen to be very bad ideas. (See, for instance, Bell et al. (2006) on the networks of those who share delusions that their minds are being controlled by outside forces.) If we want to understand what makes successful online institutions work, and perhaps even draw lessons for institutional design more generally, it will help tremendously to contrast the successes with such failures.

The other great aspect for learning right now is that all these experiments are leaving incredibly detailed records. People who use these sites or systems leave detailed, machine-accessible traces of their interactions with each other, even ones which tell us about what they were thinking. This is an unprecedented flood of detail about experiments with collective cognition, and indeed with all kinds of institutions, and about how well they served various functions. Not only can we observe successes and failures, but we can also probe the mechanisms behind those outcomes.

This points, we think, to a very clear constructive agenda. To exaggerate a little, it is to see how far the Internet enables modern democracies to make as much use of their citizens' minds as did Ober's Athens. We want to learn from existing online ventures in collective cognition and decision-making. We want to treat these ventures as, more or less, spontaneous experiments10, and compare the successes and failures (including partial successes and failures) to learn about institutional mechanisms which work well at harnessing the cognitive diversity of large numbers of people who do not know each other well (or at all), and meet under conditions of relative equality, not hierarchy. If this succeeds, what we learn from this will provide the basis for experimenting with the re-design of democratic institutions themselves.

We have, implicitly, been viewing institutions through the lens of information-processing. To be explicit, the human actions and interactions which instantiate an institution also implement abstract computations (Hutchins, 1995). Especially when designing institutions for collective cognition and decision-making, it is important to understand them as computational processes. This brings us to our concluding suggestions about some of the ways social science and computer science can help each other.

Hong and Page's work provides a particularly clear, if abstract, formalization of the way in which diverse individual perspectives or heuristics can combine for better problem-solving. This observation is highly familiar in machine learning, where the large and rapidly-growing class of "ensemble methods" works, explicitly, by combining multiple imperfect models, which helps only because the models are different (Domingos, 1999) --- in some cases it helps exactly to the extent that the models are different (Krogh and Vedelsby, 1995). Different ensemble techniques correspond to different assumptions about the capacities of individual learners, and about how to combine or communicate their predictions. The combination schemes are typically extremely simplistic, and understanding the possibilities of non-trivial organizations for learning seems like a crucial question both for machine learning and for social science.
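The Krogh-Vedelsby point can be stated as an exact identity: an averaging ensemble's squared error equals the average of its members' squared errors, minus the variance ("ambiguity") of the members' predictions around the ensemble mean. The gain from combining models is thus precisely their disagreement. A minimal sketch in Python, with made-up numbers purely for illustration:

```python
import random

# Ambiguity decomposition (Krogh and Vedelsby, 1995), illustrated with
# five toy "models", each just a noisy guess at the same true value.
random.seed(1)
truth = 3.0
predictions = [truth + random.gauss(0, 1) for _ in range(5)]

ensemble = sum(predictions) / len(predictions)
ensemble_error = (ensemble - truth) ** 2
mean_member_error = sum((p - truth) ** 2 for p in predictions) / len(predictions)
# "ambiguity": how much the members disagree with the ensemble mean
ambiguity = sum((p - ensemble) ** 2 for p in predictions) / len(predictions)

# The identity holds exactly, whatever the predictions happen to be:
assert abs(ensemble_error - (mean_member_error - ambiguity)) < 1e-9
```

Since the ambiguity term is non-negative, the ensemble is never worse than its average member, and it beats that average exactly to the extent that the members differ.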

Conclusions: Cognitive Democracy

Democracy, we have argued, has a capacity unmatched among other macro-structures to actually experiment, and to make use of cognitive diversity in solving complex problems. To make the best use of these potentials, democratic structures must themselves be shaped so that social interaction and cognitive function reinforce each other. But the cleverest institutional design in the world will not help unless the resources --- material, social, cultural --- needed for participation are actually broadly shared. This is not, or not just, about being nice or equitable; cognitive diversity is itself a resource, a source of power, and not something we can afford to waste.

[Partial] Bibliography

Badii, Remo and Antonio Politi (1997). Complexity: Hierarchical Structures and Scaling in Physics. Cambridge, England: Cambridge University Press.

Bell, Vaughan, Cara Maiden, Antonio Muñoz-Solomando and Venu Reddy (2006). "'Mind Control' Experience on the Internet: Implications for the Psychiatric Diagnosis of Delusions", Psychopathology, vol. 39, pp. 87--91.

Blume, Lawrence and David Easley (2006). "If You're so Smart, Why Aren't You Rich? Belief Selection in Complete and Incomplete Markets", Econometrica, vol. 74, pp. 929--966.

Bowles, Samuel and Suresh Naidu (2008). "Persistent Institutions", working paper 08-04-015, Santa Fe Institute.

Domingos, Pedro (1999). "The Role of Occam's Razor in Knowledge Discovery", Data Mining and Knowledge Discovery, vol. 3, pp. 409--425.

Farrell, Henry and Melissa Schwartzberg (2008). "Norms, Minorities, and Collective Choice Online", Ethics and International Affairs, vol. 22, no. 4.

Hacker, Jacob S. and Paul Pierson (2010). Winner-Take-All Politics: How Washington Made the Rich Richer --- And Turned Its Back on the Middle Class. New York: Simon and Schuster.

Hayek, Friedrich A. (1948). Individualism and Economic Order. Chicago: University of Chicago Press.

Hong, Lu and Scott E. Page (2004). "Groups of diverse problem solvers can outperform groups of high-ability problem solvers", Proceedings of the National Academy of Sciences, vol. 101, pp. 16385--16389.

Huckfeldt, Robert, Paul E. Johnson and John Sprague (2004). Political Disagreement: The Survival of Diverse Opinions within Communication Networks. Cambridge, England: Cambridge University Press.

Hutchins, Edwin (1995). Cognition in the Wild. Cambridge, Massachusetts: MIT Press.

Judd, Stephen, Michael Kearns and Yevgeniy Vorobeychik (2010). "Behavioral dynamics and influence in networked coloring and consensus", Proceedings of the National Academy of Sciences, vol. 107, pp. 14978--14982, doi:10.1073/pnas.1001280107.

Knight, Jack (1992). Institutions and Social Conflict. Cambridge, England: Cambridge University Press.

Knight, Jack and James Johnson (2011). The Priority of Democracy: Political Consequences of Pragmatism. Princeton: Princeton University Press.

Krogh, Anders and Jesper Vedelsby (1995). "Neural Network Ensembles, Cross Validation, and Active Learning", pp. 231--238 in G. Tesauro et al. (eds.), Advances in Neural Information Processing Systems 7 [NIPS 1994].

Laughlin, Patrick R. (2011). Group Problem Solving. Princeton: Princeton University Press.

Lazer, David and Allan Friedman (2007). "The Network Structure of Exploration and Exploitation", Administrative Science Quarterly, vol. 52, pp. 667--694.

Lindblom, Charles (1982). "The Market as Prison", The Journal of Politics, vol. 44, pp. 324--336.

Mason, Winter A., Andy Jones and Robert L. Goldstone (2008). "Propagation of Innovations in Networked Groups", Journal of Experimental Psychology: General, vol. 137, pp. 427--433, doi:10.1037/a0012798.

Mason, Winter A. and Duncan J. Watts (2012). "Collaborative Learning in Networks", Proceedings of the National Academy of Sciences, vol. 109, pp. 764--769, doi:10.1073/pnas.1110069108.

Mercer, Neil (2000). Words and Minds: How We Use Language to Think Together. London: Routledge.

Mercier, Hugo and Dan Sperber (2011). "Why do humans reason? Arguments for an argumentative theory", Behavioral and Brain Sciences, vol. 34, pp. 57--111, doi:10.1017/S0140525X10000968.

Mitchell, Melanie (1996). An Introduction to Genetic Algorithms. Cambridge, Massachusetts: MIT Press.

Moore, Cristopher and Stephan Mertens (2011). The Nature of Computation. Oxford: Oxford University Press.

Nelson, Richard R. and Sidney G. Winter (1982). An Evolutionary Theory of Economic Change. Cambridge, Massachusetts: Harvard University Press.

Nemeth, Charlan J., Bernard Personnaz, Marie Personnaz and Jack A. Goncalo (2004). "The Liberating Role of Conflict in Group Creativity: A Study in Two Countries", European Journal of Social Psychology, vol. 34, pp. 365--374, doi:10.1002/ejsp.210.

Nemeth, Charlan Jeanne and Margaret Ormiston (2007). "Creative Idea Generation: Harmony versus Stimulation", European Journal of Social Psychology, vol. 37, pp. 524--535, doi:10.1002/ejsp.373.

Nemeth, Charlan Jeanne (2012). "Minority Influence Theory", pp. 362--378 in P. A. M. Van Lange et al. (eds.), Handbook of Theories in Social Psychology, vol. II. New York: Sage.

Page, Scott E. (2011). Diversity and Complexity. Princeton: Princeton University Press.

Pfeffer, Jeffrey and Robert I. Sutton (2006). Hard Facts, Dangerous Half-Truths and Total Nonsense: Profiting from Evidence-Based Management. Boston: Harvard Business School Press.

Popper, Karl R. (1957). The Poverty of Historicism. London: Routledge.

Popper, Karl R. (1963). Conjectures and Refutations: The Growth of Scientific Knowledge. London: Routledge.

Rosenblum, Nancy (2008). On the Side of the Angels: An Appreciation of Parties and Partisanship. Princeton: Princeton University Press.

Salganik, Matthew J., Peter S. Dodds and Duncan J. Watts (2006). "Experimental study of inequality and unpredictability in an artificial cultural market", Science, vol. 311, pp. 854--856.

Shalizi, Cosma Rohilla (2007). "Social Media as Windows on the Social Life of the Mind", in AAAI 2008 Spring Symposia: Social Information Processing.

Shalizi, Cosma Rohilla, Kristina Lisa Klinkner and Robert Haslinger (2004). "Quantifying Self-Organization with Optimal Predictors", Physical Review Letters, vol. 93, art. 118701.

Shalizi, Cosma Rohilla and Andrew C. Thomas (2011). "Homophily and Contagion Are Generically Confounded in Observational Social Network Studies", Sociological Methods and Research, vol. 40, pp. 211--239.

Shapiro, Carl and Hal R. Varian (1998). Information Rules: A Strategic Guide to the Network Economy. Boston: Harvard Business School Press.

Stiglitz, Joseph E. (2000). "The Contributions of the Economics of Information to Twentieth Century Economics", The Quarterly Journal of Economics, vol. 115, pp. 1441--1478.

Sunstein, Cass R. (2003). Why Societies Need Dissent. Cambridge, Massachusetts: Harvard University Press.

Swartz, Aaron (2006). "Who Writes Wikipedia?".

  1. Two qualifications are in order. First, we don't think that justice and social order are unimportant. If our arguments imply social institutions that are either profoundly unjust or likely to cause socially devastating instability, they are open to challenge on these alternative normative criteria. Second, our normative arguments about what these institutions are good for should not be taken as an empirical statement about how these institutions have come into being. Making institutions, like making sausages and making laws, is usually an unpleasant process.[return to main text]

  2. Much more could of course be said about the meaning of the term "complexity". In particular, it may later be useful to look at formal measures of the intrinsic complexity of problems in terms of the resources required to solve them ("computational complexity" theory, see Moore and Mertens), or the degree of behavioral flexibility of systems, such as interacting decision-makers (Badii and Politi; Shalizi, Klinkner and Haslinger). We should also note here that several decades of work in experimental psychology indicates that groups are better at problem-solving than the best individuals within the group (Laughlin, 2011). We do not emphasize this interesting experimental tradition, however, because it is largely concerned with problems which are, in our terms, rather simple, and so suitable to the psychology laboratory.[return to main text]

  3. Imagine trying to discover whether a locally-grown tomato in Pittsburgh is better, from the point of view of greenhouse-gas emission, than one imported from Florida. After working out the differences in emissions from transport, one has to consider the emissions involved in growing the tomatoes in the first place, the emissions-cost of producing different fertilizers, farm machinery, etc., etc. The problem quickly becomes intractable --- and this is before a consumer with limited funds must decide how much a ton of emitted carbon dioxide is worth to them. Let there be a price on greenhouse-gas emission, however, and the whole informational problem disappears, or rather gets solved implicitly by ordinary market interactions.[return to main text]

  4. "Thus bridges are built; harbours open'd; ramparts rais'd; canals form'd; fleets equip'd; and armies disciplin'd every where, by the care of government, which, tho' compos'd of men subject to all human infirmities, becomes, by one of the finest and most subtle inventions imaginable, a composition, which is, in some measure, exempted from all these infirmities." --- Hume, Treatise of Human Nature, book III, part II, sect. vii.[return to main text]

  5. Broadly similar results have come from experiments on learning and problem-solving in controlled networks of human subjects in the laboratory (Mason et al., 2008; Judd et al., 2010; Mason and Watts, 2012). However, we are not aware of experiments on human subjects which have deliberately varied network structure in a way directly comparable to Lazer and Friedman's simulations. We also note that using multiple semi-isolated sub-populations ("islands") is a common trick in evolutionary optimization, precisely to prevent premature convergence on sub-optimal solutions (Mitchell, 1996).[return to main text]

  6. This resonates with Karl Popper's insistence (1957, 1963) that, to the extent science is rational and objective, it is not because individual scientists are disinterested, rational, etc. --- he knew perfectly well that individual scientists are often pig-headed and blinkered --- but because of the way the social organization of scientific communities channels scientists' ambition and contentiousness. The reliability of science is an emergent property of scientific institutions, not of scientists.[return to main text]

  7. Bureaucracies can do experiments, such as field trials of new policies, or "A/B" tests of new procedures, now quite common with Internet companies. (See, e.g., the discussion of such experiments in Pfeffer and Sutton.) Power hierarchies, however, are big obstacles to experimenting with options which would upset those power relations, or threaten the interests of those high in the hierarchy. Market-based selection of variants (explored by Nelson and Winter, 1982) also has serious limits (see, e.g., Blume and Easley). There are, after all, many reasons why there are no markets in alternative institutions. E.g., even if such a market could get started, it would be a prime candidate for efficiency-destroying network externalities, leading at best to monopolistic competition. (Cf. Shapiro and Varian's advice to businesses about manipulating standards-setting processes.)[return to main text]

  8. Empirically, most of the content of Wikipedia seems to come from a large number of users each of whom makes a substantial contribution or contributions to a very small number of articles. The needed formatting, clean-up, coordination, etc., on the other hand, comes disproportionately from a rather small number of users very dedicated to Wikipedia (see Swartz, 2006). On the role of internal norms and power in the way Wikipedia works, see Farrell and Schwartzberg (2008).[return to main text]

  9. For an enthusiastic and intelligent account of ways in which the Internet might be used to enhance the practice of science, see Nielsen. (We cannot adequately explore, here, how scientific disciplines fit into our account of institutions and democratic processes.)[return to main text]

  10. Obviously, the institutions people volunteer to participate in on-line will depend on their pre-existing characteristics, and it would be naive to ignore this. We cannot here go into strategies for causal inference in the face of such endogenous selection bias, which is pretty much inescapable in social networks (Shalizi and Thomas, 2011). Deliberate experimentation with online institutional arrangements is attractive, if it could be done effectively and ethically (cf. Salganik et al., 2006).[return to main text]

Manual trackback: 3 Quarks Daily; The Debate Link; Boing-Boing; MetaFilter (appropriately enough); Quomodocumque; Sunlight Foundation; Matthew Yglesias; Cows and Graveyards; Dead Voles; Bookforum; ABC Democracy

Update, 29 May 2012: Added some references to the bibliography.

The Collective Use and Evolution of Concepts; The Progressive Forces

Posted by crshalizi at May 23, 2012 15:00 | permanent link

May 21, 2012

If Peer Review Did Not Exist, We Would Have to Invent Something Very Like It to Serve Highly Similar Ends

Attention conservation notice: 1400 words on a friend's proposal to do away with peer review, written many weeks ago when there was actually some debate about this.

Larry is writing about peer review (again), this time to advocate "A World Without Referees". Every scientist, of course, has day-dreamed about this, in a first-lets-kill-all-the-lawyers way, but Larry is serious, so let's treat this seriously. I'm not going to summarize his argument; it's short and you can and should go read it yourself.

I think it helps, when thinking about this, to separate two functions peer-reviewed journals and conferences have traditionally served. One is spreading claims (dissemination), and the other is letting readers know about claims worthy of their attention (certification).

Arxiv, or something like it, can take over dissemination handily. Making copies of papers is now very cheap and very fast, so we no longer have to be choosy about which ones we disseminate. In physics, this use of Arxiv is just as well-established as Larry says. In fact, one reason Arxiv was able to establish itself so rapidly and thoroughly among physicists was that they already had a well-entrenched culture of circulating preprints long before journal publication. What Arxiv did was make this public and universally accessible.

But physicists still rely on journals for certification. People pay more attention to papers which come out in Physical Review Letters, or even just Physical Review E, than ones which are only on Arxiv. "Could it make it past peer review?" is used by many people as a filter to weed out the stuff which is too wrong or too unimportant to bother with. This doesn't work so well for those directly concerned with a particular research topic, but if something is only peripherally of interest, it makes a lot of sense.

Even within a specialized research community, consisting entirely of experts who can evaluate new contributions on their own, there is a rankling inefficiency to the world without referees. Larry talks about spending a minute or two looking at new stats. papers on Arxiv every day. But everyone filtering Arxiv for themselves is going to get harder and harder as more potentially-relevant stuff gets put on it. I'm interested in information theory, so I've long looked at cs.IT, and it's become notably more time-consuming as that community has embraced the Arxiv. Yet within any given epistemic community, lots of people are going to be applying very similar filters. So the world-without-referees has an increasing amount of work being done by individuals, but a lot of that work is redundant. Efficiency, the division of labor, points to having a few people put their time into filtering, and the rest of us relying on it, even when in principle we could do the filtering ourselves. To be fair, of course, we should probably take this job in turns...

So: if all papers get put on Arxiv, filtering becomes a big job, so efficiency pushes us towards having only some members of the research community do the filtering for the rest. We have re-invented something very much like peer review, purely so that our lives are not completely consumed by evaluating new papers, and we can actually get some work done.

Larry's proposal for a world without referees also doesn't seem to take into account the needs of researchers to rely on findings in fields in which they are not experts, and so can't act as their own filters. (Or could, only after first putting in a few years of study.) If I need some result from neuroscience, or for that matter from topology, I do not have the time to spend becoming a neuroscientist or topologist, and it is an immense benefit to have institutions I can trust to tell me "these claims about cortical columns, or locally compact Hausdorff spaces, are at least not crazy". This is also a kind of filtering, and there is the same push, based on the division of labor, to rely on only some neuroscientists or topologists to do the filtering for outsiders (or on all of them only some of the time), and again we have re-created something very much like refereeing.

So: some form or forms of filtering are inevitable, and the forces pushing for a division of labor in filtering are very strong. I don't know of any reason to think that the current, historically-evolved peer review system is the best way of organizing this cognitive triage, but we're not going to avoid having some such system, nor should we want to. Different ways of organizing the work of filtering will have different costs and benefits, and we should be talking about those trade-offs, not hoping that we can just wish the problem away now that making copies is cheap1. It's not at all obvious, for instance, that attention-filtering for the internal benefit of members of a research community should be done in the same way as reliability-filtering for outsiders. But, to repeat, we are going to have filters, and they are almost certainly going to involve a division of labor.

Lenin, supposedly, said that "small production engenders capitalism and the bourgeoisie daily, hourly, spontaneously and on a mass scale" (Nove, The Economics of Feasible Socialism Revisited, p. 46). Whether he was right about the bourgeoisie or not, the rate of production of the scientific literature, the similarity of interests and standards within a community, and the need to rely on other fields' findings are all going to engender refereeing-like institutions, "daily, hourly, spontaneously and on a mass scale". I don't think Larry would go to the same lengths to get rid of referees that Lenin went to in order to get rid of the bourgeoisie, but in any case the truly progressive course is not to suppress the old system by force, but to provide a superior alternative.

Speaking personally, I am attracted to a scenario we might call "peer review among consenting adults". Let anyone put anything on Arxiv (modulo the usual crank-screen). But then let others create filtered versions, applying such standards of topic, rigor, applicability, writing quality, etc., as they please --- and be explicit about what those standards are. These can be layered as deep as their audience can support. Presumably the later filters would be intended for those further from active research in the area, and so would be less tolerant of false alarms, and more tolerant of missing possible discoveries, than the filters for those close to the work. But this could be an area for experiment, and for seeing what people actually find useful. This is, I take it, more or less what Paul Ginsparg proposes, and it has a lot to recommend it. Every contribution is available if anyone wants to read it, but no one is compelled to try to filter the whole flow of the scholarly literature unaided, and human intelligence can still be used to amplify interesting signals, or even to improve papers.

Attractive as I find this idea, I am not saying it is historically inevitable, or even the best possible way of ordering these matters. The main point is that peer review does some very important jobs for the community of inquirers (whether or not it evolved to do them), and that if we want to get rid of it, it would be a good idea to have something else ready to do those jobs.

[1]: For instance, many people have suggested that referees should have to take responsibility, in some way, for their reports, so that those who do sloppy or ignorant or merely-partisan work will be at least shamed. There is genuinely a lot to be said for this. But it does run into the conflicting demand that science should not be a respecter of persons --- if Grand Poo-Bah X writes a crappy paper, people should be able to call X on it, without fear of retribution or considering the (inevitable) internal politics of the discipline and the job-market. I do not know if there is a way to reconcile these, but that's one of the kinds of trade-offs we have to consider as we try to re-design this institution. ^

Learned Folly; Kith and Kin; The Collective Use and Evolution of Concepts

Posted by crshalizi at May 21, 2012 02:00 | permanent link

May 03, 2012

Ten Years of Monster Raving Egomania and Utter Batshit Insanity

Sometimes, all you can do is quote verbatim* from your inbox:

Date: Tue, 17 Apr 2012 09:31:57 -0400
From: Stephen Wolfram
To: Cosma Shalizi
Subject: 10-year followup on "A New Kind of Science"

Next month it'll be 10 years since I published "A New Kind of Science"
... and I'm planning to take stock of the decade of commentary, feedback and
follow-on work about the book that's appeared.

My archives show that you wrote an early review of the book:

At the time reviews like yours appeared, most of the modern web apparatus
for response and public discussion had not yet developed.  But now it has,
and there seems to be considerable interest in the community in me using
that venue to give my responses and comments to early reviews.

I'm writing to ask if there's more you'd like to add before I embark on my
analysis in the next week or so.

I'd like to take this opportunity to thank you for the work you put into
writing a review of my book.  I know it was a challenge to review a book of
its size, especially quickly.  I plan to read all reviews with forbearance,
and hope that---especially leavened by the passage of a decade---useful
intellectual points can be derived from discussing them.

If you don't have anything to add to your early review, it'd be very helpful
to know that as soon as possible.

Thanks in advance for your help.

-- Stephen Wolfram

P.S. Nowadays you can find the whole book online at  If you'd like a new
physical copy, just let me know and I can have it shipped...

I wrote my review in 2002 (though I didn't put it out until 2005). The idea that complex patterns can arise from simple rules was already old then, and has only become more commonplace since. A lot of interesting, substantive, specific science has been done on that theme in the ensuing decade. To this effort, neither Wolfram nor his book has contributed anything of any note. The one respect in which I was overly pessimistic is that I have not, in fact, had to spend much time "de-programming students [who] read A New Kind of Science before knowing any better" — but I get a rather different class of students these days than I did in 2002.

Otherwise, and for the record, I do indeed still stand behind the review.

Manual trackback: Hacker News; Wolfgang; Andrew Gelman

*: I removed our e-mail addresses, because no one deserves spam.

Self-Centered; Complexity; Psychoceramica

Posted by crshalizi at May 03, 2012 23:10 | permanent link

May 02, 2012

Installing pcalg

Attention conservation notice: Boring details about getting finicky statistical software to work; or, please read the friendly manual.

Some of my students are finding it difficult to install the R package pcalg; I share these instructions in case others are also in difficulty.

  1. For representing graphs, pcalg relies on two packages called RBGL and graph. These are not available on CRAN, but rather are on the other R software repository, BioConductor. To install them, follow the instructions at those links; to summarize, run this (the biocLite script being BioConductor's standard installation mechanism):

        source("http://bioconductor.org/biocLite.R")
        biocLite("RBGL")

    (Since RBGL depends on graph, this should automatically also install graph; if not, run biocLite("graph"), then biocLite("RBGL").)
  2. Now install pcalg from CRAN, along with the packages it depends on. You will get a warning about not having the Rgraphviz package. However, you will be able to load pcalg and run it. You should be able to step through the example labeled "Using Gaussian Data" at the end of help(pc), though it will not produce any plots.

    You can still extract the graph by hand from the fitted models returned by functions like pc --- if one of those objects is fit, then fit@graph@edgeL is a list of lists, where each node has its own list, naming the other nodes it has arrows to (not from). If you are doing this for the final in ADA, you don't actually need anything beyond this to do the assignment, as explained in question A1a.

  3. Rgraphviz is what pcalg relies on for drawing pictures of causal graphs. Its installation is somewhat tricky, so there is a README file, which you should read.
    The key point is that Rgraphviz itself relies on a non-R suite of programs called graphviz. You will want to install these. Go to, and download and install the software. (If you use a Mac, the standard download also includes, which is a nice visual interface to the actual graph-drawing functions, and what I use for drawing the DAGs in the lecture notes.)
  4. You have to make sure that your operating system will let other software (like R) call on graphviz. The way to do this is to add the directory (or folder) where you installed graphviz to the list of places your computer recognizes as containing executable programs --- the system's "command path". The README for installing Rgraphviz explains what you have to add to the path. (If you are a Windows user and do not know how to alter the command path, read this.)
  5. If you have R open, close it. (If you do not, it will probably not know about the new software you've just gotten the system to recognize.) Re-open R, and install Rgraphviz; since it, too, lives on BioConductor, the basic installation command is just

        biocLite("Rgraphviz")
    The README for Rgraphviz gives some checks which you should be able to run if everything is working; try them.
  6. You should now be able to generate pictures of DAGs with pc and the other functions in pcalg; try stepping through all the examples at the end of help(pc).
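For step 4 above, on a Unix-like system the path change amounts to the following; the graphviz directory shown is hypothetical, and should be replaced with wherever graphviz actually put its programs (the Rgraphviz README says which programs must be findable). On Windows, the path is instead edited through the system settings, as described in the page linked above.

```shell
# Append the (hypothetical) graphviz binary directory to the command path,
# so that programs launched from R can find graphviz's layout tools.
GRAPHVIZ_BIN="/usr/local/graphviz/bin"   # replace with your install location
export PATH="$PATH:$GRAPHVIZ_BIN"
# confirm the directory is now on the command path
echo "$PATH" | grep -q "$GRAPHVIZ_BIN" && echo "path updated"
```

Putting the export line in your shell's startup file makes the change persistent; otherwise it lasts only for the current session.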

When I installed pcalg on my laptop two weeks ago, it was painless, because (1) I already had graphviz, and (2) I knew about BioConductor. (In fact, the R graphical interface on the Mac will switch between installing packages from CRAN and from BioConductor.) To check these instructions, I just now deleted all the packages from my computer and re-installed them, and everything worked; elapsed time, ten minutes, mostly downloading.

Advanced Data Analysis from an Elementary Point of View

Posted by crshalizi at May 02, 2012 21:30 | permanent link

May 01, 2012

Final Exam (Advanced Data Analysis from an Elementary Point of View)

In which we are devoted to two problems of political economy, viz., strikes, and macroeconomic forecasting.

Assignment; macro.csv

Advanced Data Analysis from an Elementary Point of View

Posted by crshalizi at May 01, 2012 10:31 | permanent link

Time Series I (Advanced Data Analysis from an Elementary Point of View)

What time series are. Properties: autocorrelation or serial correlation; other notions of serial dependence; strong and weak stationarity. The correlation time and the world's simplest ergodic theorem; effective sample size. The meaning of ergodicity: a single increasingly long time series becomes representative of the whole process. Conditional probability estimates; Markov models; the meaning of the Markov property. Autoregressive models, especially additive autoregressions; conditional variance estimates. Bootstrapping time series. Trends and de-trending.
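The effective-sample-size idea can be illustrated with a quick simulation. For an AR(1) process with lag-one autocorrelation rho, the standard large-n approximation says n correlated observations pin down the mean about as well as n(1 - rho)/(1 + rho) independent ones would. A sketch in Python (the formula is the usual textbook approximation, not taken from the lecture notes themselves):

```python
import random

# Simulate an AR(1) series x_t = rho * x_{t-1} + noise, then estimate its
# lag-one autocorrelation and the implied effective sample size.
random.seed(2)
rho, n = 0.9, 100000
x = [random.gauss(0, 1)]
for _ in range(n - 1):
    x.append(rho * x[-1] + random.gauss(0, 1))

mean = sum(x) / n
num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
den = sum((xi - mean) ** 2 for xi in x)
rho_hat = num / den                     # sample lag-one autocorrelation

n_eff = n * (1 - rho_hat) / (1 + rho_hat)
print(rho_hat, n_eff)
```

With rho near 0.9, n_eff comes out at roughly a twentieth of n: most of the "observations" in a strongly autocorrelated series are redundant, which is why naive standard errors for time series are so over-optimistic.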

Reading: Notes, chapter 26; R for examples; gdp-pc.csv

Advanced Data Analysis from an Elementary Point of View

Posted by crshalizi at May 01, 2012 10:30 | permanent link

Three-Toed Sloth:   Hosted, but not endorsed, by the Center for the Study of Complex Systems