March 28, 2004

Creationism and Stupid Complexity Measures Make a bête noire with Two Backs

P. Z. Myers at Pharyngula alerts me to a new variety of rubbish from our friends at the Discovery Institute: "ontogenetic depth". (See also here at The Panda's Thumb.) This notion, the brainchild of one Paul Nelson, is supposed to be a way of measuring the complexity of organisms. I can't find any serious preprints about it online (there are, of course, no published papers), so there is some slight chance it is not as completely wrong-headed as this note by Nelson (PDF, 540k) makes it seem. (The other primary source seems to be this two-paragraph note, also by Nelson.) With that caveat, here goes.

Nelson defines ontogenetic depth as "the distance, in terms of cell division and differentiation, between a unicellular condition and a macroscopic adult metazoan able to reproduce itself". The kicker, says Nelson, is that "natural selection only 'sees' reproductive output", so selection can't work on the "ontogenetic network" which produces the final reproductive organism, and so can't increase ontogenetic depth. Now, I can't improve on Myers's explanation of all the ways ontogenetic depth is biologically ill-founded, so I'll lay into it as a complex systems geek who (cue MC Hawking) is tired of seeing our ideas misused by these charlatans. (We misuse them enough ourselves.)

The most basic point, of course, is that when Nelson writes about "the causal inefficacy of natural selection for constructing ontogenetic networks", I don't know whether to laugh, or to have several tons of papers dropped on his head. Never mind actual evolutionary developmental biology. There's a neat and active body of work on using evolutionary computation to find neural networks which do useful computations, where the network structure is encoded as a set of developmental rules. It took me about five minutes to Google up Dara Curran and Colm O'Riordan's review paper on "Applying Evolutionary Computation to Designing Neural Networks" (Postscript, 420k), about a third of which is taken up by various developmental encoding methods. In all these applications, what selection sees is the performance of the final network, but what gets selected on is a developmental program or strategy. Una-May O'Reilly applies the same trick to structural design: what evolves is a developmental program (technically, a modified L-system) which responds to its environment as it grows spatial forms; what selection sees is the ultimate form. (Actually, she even evolves the language the developmental program is in.) You can build the results, and people have, and you can download Una-May's software to play with. For that matter, as I've mentioned before, I used to work in a research group where we evolved cellular automaton rules to perform distributed computation. The dynamics set up by the rules naturally generated a complicated set of emergent structures --- grew some organs, if you will --- which interacted to actually do the calculations. Damaging one of them, or changing its properties, would in general mess up the calculation. All selection looked at was how often the computation came out right; all selection had to work with was the CA rules, which didn't even explicitly represent the organs. This was not a problem, and we can tell you how and why it worked, with considerable mechanistic and mathematical detail.
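To make the logic concrete, here is a toy sketch, not any of the systems cited above. It assumes a two-symbol rewriting system (a bare-bones L-system) as the "developmental program", a made-up target form, and plain truncation selection; every name and parameter in it is hypothetical. The only point is the division of labor: the fitness function is handed the grown form and nothing else, while mutation and inheritance act on the rewrite rules.

```python
import random

ALPHABET = "AB"
TARGET = "AB" * 16   # made-up target form, chosen only for illustration
STEPS = 5            # number of developmental (rewriting) steps
MAX_LEN = 256        # safety cap on the size of the grown form


def random_rule(rng):
    """A rule body: 1 to 3 symbols that replace one symbol at each step."""
    return "".join(rng.choice(ALPHABET) for _ in range(rng.randint(1, 3)))


def random_genome(rng):
    """The genome is a developmental program: one rewrite rule per symbol."""
    return {symbol: random_rule(rng) for symbol in ALPHABET}


def develop(genome, steps=STEPS):
    """Grow the phenotype from a single 'cell' by repeated rewriting."""
    body = "A"
    for _ in range(steps):
        body = "".join(genome[s] for s in body)[:MAX_LEN]
    return body


def fitness(phenotype):
    """Selection sees only the grown form, never the rules that grew it."""
    matches = sum(p == t for p, t in zip(phenotype, TARGET))
    return matches - abs(len(phenotype) - len(TARGET))


def mutate(genome, rng):
    """Replace one randomly chosen rule with a fresh random one."""
    child = dict(genome)
    child[rng.choice(ALPHABET)] = random_rule(rng)
    return child


def evolve(generations=200, pop_size=20, seed=0):
    """Truncation selection with elitism over developmental programs."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Rank genomes by the fitness of what they develop into.
        population.sort(key=lambda g: fitness(develop(g)), reverse=True)
        history.append(fitness(develop(population[0])))
        parents = population[: pop_size // 2]  # the best half survives unchanged
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=lambda g: fitness(develop(g)))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best rules:", best)
    print("grown form:", develop(best))
    print("fitness:", fitness(develop(best)), "of a possible", len(TARGET))
```

Because the best half of each generation is carried over unchanged, the best fitness recorded in `history` can never go down. Nothing in `fitness` mentions the rules, yet the rules are what improves.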

(I can't resist making a biological point here, which is that someone who doesn't get how things like this could be possible is simply failing to grasp elementary, 1858-vintage natural selection. Whether in actual metazoan development or in the artificial cases, what matters is that the genome has a reliable and causal, though indirect, effect on aspects of the phenotype which themselves affect fitness. One does not require a very sophisticated understanding of evolution to see that this is enough for selection to work with, though such an understanding helps explain, for instance, just why metazoans with complex internal structure always start from small, simple eggs, something which for Nelson must be put down to a whim of YHWH, and/or the Elohim Space Brothers.)

So much for the bizarre idea that natural selection can't act on developmental mechanisms, and so couldn't generate ontogenetic depth. Logically, this is a separate question from whether ontogenetic depth is a good way of measuring the complexity of organisms. (It could be that it was a useful concept, quite aside from its origins in creationist apologetics. One doubts it, but if they come up with enough ideas...) As it happens, I'm something of a specialist in complexity measures, and referee a new manuscript on the measurement of complexity about once every six months. In fact my first published paper (PDF, 215k) was an analysis of a complexity measure called "thermodynamic depth". That in its turn was inspired by something called logical depth, and both make good points of comparison with ontogenetic depth.

Logical depth was proposed by the physicist Charles Bennett in the early 1980s as a complexity measure for strings and other discrete mathematical objects. (The best reference is not Bennett's original papers but Li and Vitanyi's An Introduction to Kolmogorov Complexity and Its Applications.) Given a string --- the digits of pi, say --- there is a shortest program which will generate the characters of the string, in order. The classical, Kolmogorov notion of algorithmic complexity is just the length of this program. This turns out to make random objects maximally complex --- in fact, Kolmogorov showed you could define randomness in terms of maximal algorithmic complexity. It turns out that pi is not algorithmically complex, because a fixed-length program (e.g., this) will generate arbitrarily many digits of pi. Logical depth, roughly speaking, is how long it takes the minimal program to run. It's entirely possible that something which is algorithmically simple (like pi) is logically deep, because while there's a short program which generates that object, it has to do a lot of work, in the form of recursive calculation and storing partial results. The problem with logical depth is that it depends on knowing the minimal program, and if you could do that, you'd have a way around Gödel's Theorem, and be able to do uncountably many impossible things before breakfast. So, no go; but, through the miracle of mathematics, it is possible to establish many properties of logical depth, even if you can't calculate it, and you can put bounds on the depth by showing quick or slow ways to do the relevant computations.
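As an illustration of a fixed-length program for pi, here is a sketch using Gibbons' unbounded spigot algorithm. This is a swapped-in example, not necessarily the program the original link pointed to, and it makes no claim to be the minimal program, so it says nothing exact about either the algorithmic complexity or the logical depth of pi. It only shows that the program text stays the same size however many digits you ask for, while the integers it carries around (the stored partial results) keep growing.

```python
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi one at a time (Gibbons' spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is now certain: emit it and rescale the state.
            yield n
            q, r, t, k, n, l = (10 * q, 10 * (r - n * t), t, k,
                                (10 * (3 * q + r)) // t - 10 * n, l)
        else:
            # Not enough information yet: absorb one more term of the series.
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)


def first_digits(count):
    """Return the first `count` digits of pi as a list of ints."""
    return list(islice(pi_digits(), count))


if __name__ == "__main__":
    print("".join(str(d) for d in first_digits(50)))
```

The generator never terminates on its own; `islice` is what cuts it off. Asking for ten times as many digits changes one argument, not the length of the program.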

Thermodynamic depth, as proposed by Seth Lloyd and Heinz Pagels, took the idea that "deep", complex objects are hard to construct and tried to make it more physical, less computational, and ultimately tractable. ("Complexity as Thermodynamic Depth", Annals of Physics 188 (1988): 186--213.) I won't go into the technical details here, but the intuition is that the thermodynamic depth of a system is how much information its final state contains about the trajectory leading to that state. This turns out to be a kind of measure of how far you have to drive the system from equilibrium to produce the object of interest, integrated over time. Thus monkeys and hurricanes are both far-from-equilibrium phenomena, and a hurricane may even be further from thermodynamic equilibrium than a monkey, but the monkey implies a fairly specific, and quite long, far-from-equilibrium evolutionary history, and the hurricane does not. (As D'Arcy Thompson said, in a related context, "A snow-crystal is the same to-day as when the first snows fell".) Thermodynamic depth was a very clever idea, but sadly there are technical problems with it, having to do with the definition of "state", that keep it from really working out. (See, if you really care, this paper.)

Now, I can't find any place where Nelson mentions either logical or thermodynamic depth, but I haven't searched very exhaustively, and anyway I refuse to believe that he decided to measure how complex something was by how hard it would be to assemble, and to call the result "depth", without common conceptual descent. (Admittedly, I haven't done a formal phylogenetic analysis.) So let's contrast.

Recall that Nelson wants to make ontogenetic depth proportional to the number of cell divisions and differentiations that separate a unicellular egg from a reproductive adult. Now, divisions and differentiations are two different kinds of events, and he doesn't specify how much weight division will have relative to differentiation. He also doesn't say whether all differentiations add equally to the depth, or whether some changes in cell type are deeper than others. There doesn't seem to be any obvious way to decide on these conversion factors. Ontogenetic depth is not, then, a mathematically well-defined measure, and to make it well-defined we'd have to pull some conversion ratios out of, to be polite, the air. (Neither logical depth nor thermodynamic depth has such problems.) Let us rise above our shock at a Senior Fellow of the Discovery Institute being mathematically sloppy and press on. The idea behind this attempt at a definition seems to be that ontogenetically deep organisms would be ones requiring elaborate developmental mechanisms (Nelson's "ontogenetic network"), and so be the ones which are hard to grow.
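A small sketch shows why the missing conversion factors matter. Nelson gives no formula, so the weighted sum below is purely an assumption made for illustration, and the two organisms and their counts are invented. Under that assumption, which organism comes out "deeper" depends entirely on the weights one happens to pick.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    name: str
    divisions: int         # cell divisions from egg to adult (invented numbers)
    differentiations: int  # changes of cell type along the way (invented numbers)


def ontogenetic_depth(org, w_division, w_differentiation):
    """One ASSUMED way to fill in the definition: a weighted sum of event counts."""
    return w_division * org.divisions + w_differentiation * org.differentiations


def rank(organisms, w_division, w_differentiation):
    """Names ordered from 'deepest' to 'shallowest' under the given weights."""
    ordered = sorted(
        organisms,
        key=lambda o: ontogenetic_depth(o, w_division, w_differentiation),
        reverse=True,
    )
    return [o.name for o in ordered]


# Two hypothetical organisms: one large and uniform, one small and varied.
big_simple = Organism("big_simple", divisions=1000, differentiations=5)
small_varied = Organism("small_varied", divisions=100, differentiations=50)
organisms = [big_simple, small_varied]

if __name__ == "__main__":
    for weights in [(1, 1), (1, 100)]:
        print(weights, rank(organisms, *weights))
```

With equal weights the large organism ranks first (1005 against 150); with differentiation weighted a hundred times more heavily the order reverses (1500 against 5100). Nothing in the definition says which weighting is the right one.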

There are a couple of problems here. One is that, however one fills in Nelson's incomplete definition, the result is going to be biased towards saying large organisms are more complex than small ones, because big creatures will need more cell divisions. Dwarf mammoths may have been less complex than their full-sized cousins, but this hardly seems axiomatic.
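The size bias is a matter of arithmetic. Assuming every division turns one cell into two and no cells die (a simplification made here for illustration, not a claim about real development), growing a body of N cells from a single egg takes N - 1 divisions in total, and at least ceil(log2 N) successive rounds of division along some lineage. Whichever of the two one counts, it grows with body size and with nothing else. The cell counts below are invented.

```python
def total_divisions(n_cells):
    """Divisions needed to reach n_cells from one cell: each division adds one cell."""
    if n_cells < 1:
        raise ValueError("an organism needs at least one cell")
    return n_cells - 1


def min_rounds(n_cells):
    """Fewest successive doublings that can yield n_cells, i.e. ceil(log2(n_cells))."""
    if n_cells < 1:
        raise ValueError("an organism needs at least one cell")
    return (n_cells - 1).bit_length()


if __name__ == "__main__":
    # Hypothetical cell counts for a full-sized animal and a dwarf relative.
    for label, cells in [("full-sized", 10**15), ("dwarf", 10**14)]:
        print(label, total_divisions(cells), min_rounds(cells))
```

Both functions take the cell count as their only argument, which is the problem: any depth built on division counts inherits a term that tracks size alone.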

The more fundamental objection, I think, is that if you want to measure the complexity of the developmental mechanism, counting changes in cell types is a clumsy way of going about it. It's not, for instance, going to tell you anything about the extent to which the organism is exploiting environmental asymmetry in pattern formation, or regulating away environmental perturbations, or engaging in self-organization, or adapting its phenotype in response to environmental cues, which it would have to sense and process. (As Myers says, Nelson appears to think of development as a process in which cells follow a pre-scripted list of instructions, without noticing what's happening around them.) If you want to measure that kind of organization (what Gerhart and Kirschner call "contingency"), counting cell cleavages isn't going to get you very far. I have an idea about the right way to measure the complexity of network dynamics; other people have other measures. These techniques could be used to estimate the complexity of developmental processes (e.g., the formation of segment polarity in Drosophila). We'd need to determine the genetic regulatory network involved, and get a considerable volume of data on those genes' expression levels over time. Gene chip experiments are still expensive enough that this isn't really a feasible project, otherwise I'd be working on it right now. But someday soon that kind of data will be too cheap to meter, and somebody will do that study, and the result will say how much information processing is intrinsic to that developmental process. Rather than just having a raw number at the end of this, we would, as a necessary part of the process, learn a lot about how the regulatory network performs its function by processing that information, and even something about the thermodynamics involved. This is important; a good complexity measure doesn't just give you a number, it gives you a number with consequences. Learning an organism's ontogenetic depth would seem to tell us absolutely nothing else about it.

I have to admit that ontogenetic depth is not among the absolute worst of the complexity measures I come across, but only because it has some content. But it's full of confusion and ambiguity, it lacks a clear physical or biological meaning, it doesn't really measure what it hopes to, and it has no implications for anything else. If this is the best the Discovery Institute can do in the way of ideas inspired by the program of Intelligent Design creationism --- I'm not at all surprised.

Update, 10 April 2004: Somehow the work of Jordan Pollack's Dynamical and Evolutionary Machine Organization group at Brandeis escaped my mind when I wrote about the combination of ontogenetic mechanisms and evolutionary computation above, even though that work is quite relevant and extremely impressive. See, e.g., the paper on "Three Generations of Automatically Designed Robots", or this paper on evolving noise-tolerant ontogenetic mechanisms. Hugues Juillé's papers on finding CA rules for distributed computational problems (e.g., this one) are also quite cool, and show that these kinds of results do not depend on having a fixed, externally-given fitness function (despite many claims to the contrary from the usual suspects).

Creationism; Complexity

Posted by crshalizi at March 28, 2004 01:33

Three-Toed Sloth:   Hosted, but not endorsed, by the Center for the Study of Complex Systems