Attention conservation notice: It's long, and it's about something which makes eyes glaze over even as tempers flare up, and it's not funny at all. Worse yet, there's a part II which is even more mathematical and boring. You could always read it later, but time spent now is gone forever.
Disclaimer: A decade ago, some of the senior faculty in my department, i.e., some of the people who will be voting on my contract renewal and tenure, helped put together a book called Intelligence, Genes and Success: Scientists Respond to The Bell Curve. Most, but not all, of the responses in that book were exceedingly negative. I cite some of that work below. Whether this should alter your evaluation of the case I make is for you to decide.
Thanks are due to (alphabetically) Carl Bergstrom, John Burke, Henry Farrell, Mark Liberman, Aryaman Shalizi and Aaron Swartz for many helpful suggestions. But, of course, I'm the only one responsible for this, all remaining errors are my own, and it's not in any sense authorized or endorsed by anyone (in particular not by them).
People seem to be experiencing more than the usual difficulty grasping what I was getting at in my posts on accent and intelligence. This is my fault, for trying to be cute rather than trying to be clear. (I realize I'm too murky even when I am trying to be clear.) I am already heartily sick of the subject, which is turning into the huge time-suck I was afraid it would be, and which presents a depressing prospect from every point of view, not least those which make it clear how rare it is for anyone to change their mind on any aspect of it for any cause at all. (I do wonder if I should've stuck with the original title of "Duet for Leo and Razib.") My aim here is to lay everything out cleanly and explicitly, and be done with this matter.
I was originally going to do just one post, explaining why I called the general factor of intelligence a "statistical myth", why I don't put any real faith in what I regard as even the best of the current estimates of IQ's heritability, and the evidence for IQ's malleability. But the thing grew unwieldy, and the only thing which I find more dreary, right now, than discussing heritability and malleability is explaining why factor analysis can't do what people want it to, so I'll save that for later, and stick to the heritability and plasticity of IQ here. [That post is now out.] Whether IQ means anything or not, it is, unlike general intelligence, unquestionably something we can measure, so we can consider how heritable and malleable it is. I am going to assume that you know what "variance" and "correlation" are, but not too much else.
To summarize: Heritability is a technical measure of how much of the variance in a quantitative trait (such as IQ) is associated with genetic differences, in a population with a certain distribution of genotypes and environments. Under some very strong simplifying assumptions, quantitative geneticists use it to calculate the changes to be expected from artificial or natural selection in a statistically steady environment. It says nothing about how much the over-all level of the trait is under genetic control, and it says nothing about how much the trait can change under environmental interventions. If, despite this, one does want to find out the heritability of IQ for some human population, the fact that the simplifying assumptions I mentioned are clearly false in this case means that existing estimates are unreliable, and probably too high, maybe much too high.
I should add that nothing I'm saying here is in any way original. Almost thirty years ago, Oscar Kempthorne — a man who knew a thing or two about statistical genetics — made pretty much all these points in a paper in Biometrics, working in swipes at Dick Lewontin while he was at it. (I would quibble with him some about the possibility of causal inference from observational data, but these rely on methods which didn't then exist, and are certainly not used by the parties to this dispute.) I do not, of course, pretend to be in Kempthorne's league, or anywhere close. For me, this is another episode of "Why oh why can't we have a better intelligentsia?". Kempthorne's exasperation, on the other hand, was that of someone seeing the tools of their life's work being wretchedly abused.
When we take our favorite population of organisms (e.g., last year's residents of the Morewood Gardens dorm at CMU), and measure the value of our favorite quantitative trait for each organism (e.g., their present zip code), we get a certain distribution of this trait:
If we are limited to the tools of early 20th century statistics (in particular, if we are the great R. A. Fisher, and so simultaneously forging those tools while helping to found evolutionary genetics), we summarize the distribution with a mean and a variance. We can inquire as to where the variance in the population comes from. In particular, assuming the organisms are not all clones, it is reasonable to suppose that some of the variation goes along with differences in genes. The fraction of variance which does so is, roughly speaking, the "heritability" of the trait.
The most basic sort of analysis of variance (see also: Fisher) would make this conceptually simple, though practically unsuccessful. Simply take all the organisms in the population, and group them by their genotypes. For each group of genetically identical organisms, compute the average value of the trait. Compare the variance of these within-genotype averages (that is, the across-genotype variance) to the total population variance; this is the fraction of variation associated with genotypes. In most mammalian populations, where clones (identical twins, triplets, ...) are rare and every organism otherwise has a unique genotype, this would tell you that almost all of the variance of any trait is associated with genetic differences. On such an analysis, almost all of the variance in zip codes in my example would be "due to" genetic differences, and the same would be true of telephone numbers, social security numbers, etc.
To see why, look at my table again. With one exception (the twins who live in 15213 and 48104), in this population changing zip code means changing your genotype. The vast majority (81%) of the variance in zip codes is between genotypes, not within them. With real human data, a quarter of the people wouldn't be twins living apart, and the proportion of variance in zip codes "due to" genotype would be even higher.
Naively, then, on this analysis we would say that the "heritability" of zip code, the fraction of its variance which goes along with genetic variations, is 81%. It is crucial to be clear on what this means, which is merely and exactly this: in this population, if we take a random group of genetically identical people, the variance within that group should be 19% (=100-81) of the total variance in the population.
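To make the bookkeeping concrete, here is a minimal sketch of the naive between-genotype analysis of variance. The genotype labels and zip codes below are made up for illustration (they are not the table from this post, so the share they give will not be 81%), and the function name `between_genotype_share` is my own. Zip codes are treated as plain numbers, which is exactly as silly as it sounds.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def between_genotype_share(records):
    """Fraction of total trait variance lying between genotype-group means.

    records: list of (genotype_label, trait_value) pairs.
    Uses population (divide-by-n) variances, so that
    total variance = between-group + within-group holds exactly.
    """
    values = [value for _, value in records]
    grand_mean = fmean(values)
    total = pvariance(values)

    groups = defaultdict(list)
    for genotype, value in records:
        groups[genotype].append(value)

    # Each group's squared deviation from the grand mean, weighted by group size.
    between = sum(
        len(vals) * (fmean(vals) - grand_mean) ** 2 for vals in groups.values()
    ) / len(values)
    return between / total


# Hypothetical population: one pair of identical twins (genotype "g1")
# living in different zip codes, everyone else genetically unique.
toy_population = [
    ("g1", 15213), ("g1", 48104),
    ("g2", 15213), ("g3", 15217), ("g4", 94720),
    ("g5", 2139), ("g6", 60637), ("g7", 87501),
]

share = between_genotype_share(toy_population)
print(f"share of zip-code variance between genotypes: {share:.2f}")
```

If every organism has its own genotype, the function returns 1; the only thing keeping the share below 1 in the toy data is the single pair of twins who live in different places.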
Of course, there is nothing special about the genotype, and one could do a completely parallel analysis of variance based on environmental histories. If those are captured at a fine-enough grained level, every organism has a unique history, so all variance is "due to" the environment. Clearly, the sense in which the phrases "due to", "explained by", "accounted for", etc., are used in analysis of variance and regression has nothing to do with their ordinary, causal meanings. I am going to try to avoid using the causal-sounding phrases, because I think they encourage confusion, and instead stick with the vocabulary of "associated with", or at most "described by", because anything stronger is unjustified here.
Nobody credible has ever seriously proposed doing this sort of ridiculous analysis of variance. It seems clear that not every aspect of the environment matters. (The number of snowflakes which hit my face during February was either odd or even, but it's hard to see how that could change my present weight.) Similarly, not every genetic distinction makes a difference to the trait. If we could somehow identify relevant distinctions, and group together organisms with relevantly-identical genotypes, we'd be doing something much more reasonable. The true heritability of the trait is defined to be the ratio between the variance associated with genetic differences and the total variance in the trait.
If you've taken any kind of statistics course at all, what I've just said may be enough to give you an idea of how to figure out heritability: identify the relevant environmental variables, measure them, regress the trait on them, and figure that the residual variance has to be genetic. Many people, I find, have the impression that heritability studies control for the environment, in the sense of regression. (Leave aside, for now, whether "controlling for" really does what people seem to think.) Some studies in experimental genetics on plants and animals do this, near enough, but that's basically never how it's done with human beings. Instead, the procedure is vastly more indirect and model-dependent, which matters a lot when evaluating the results, as we'll see.
Supposing that we somehow learn the genetic variance, the usual next step is to split it into two uncorrelated components, one associated with the distinct, additive contribution of each individual gene to the trait ("additive genetic variance"), and one associated with specific combinations of genes. (Making this split is a straightforward problem in linear algebra; I won't get into it.) The ratio of additive genetic variance to total trait variance is the "strict" or "narrow", as opposed to the earlier "broad" heritability. Conventionally, the symbol for heritability is h^2 (not h), with a subscript to indicate whether it's meant narrowly or broadly; I'll write h^2 for the narrow sense and H^2 for the broad. (The square is here for the same reason that the fraction of variance accounted for by a regression is R^2 and not R.)
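In symbols, the two ratios just described can be written as follows. The V-notation is my own shorthand, not anything fixed by the text: V_P is the total variance of the trait, V_G the genetic variance, V_A its additive part, V_NA the part associated with specific combinations of genes, and V_E the environmental variance. The last equality only holds if genetic and environmental contributions add up without interacting and are uncorrelated.

```latex
\[
  H^2 = \frac{V_G}{V_P},
  \qquad
  h^2 = \frac{V_A}{V_P},
  \qquad
  V_G = V_A + V_{NA},
  \qquad
  V_P = V_G + V_E .
\]
```

Since V_A can be no larger than V_G, the narrow heritability can never exceed the broad one.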
Implicit in the last steps is the assumption that the value of the trait is, in each organism, just the sum of a genetic contribution and an environmental one, i.e., that there is no interaction between the relevant genes and the relevant environments; also the assumption that the genetic contribution to the trait is completely uncorrelated with the environmental contribution. If these assumptions fail, one can still calculate heritability-like quantities, somewhat like in the simple analyses of variance, which can play similar roles in some evolutionary calculations, but they become strongly context-dependent, so it no longer makes any sense to speak of the heritability of a trait. We also come back to absurdities, since the tendency of families to cluster geographically makes zip codes look heritable — and more heritable in larger and more representative samples of the nation than in more geographically localized ones!
Saying a trait is highly heritable is saying that, in a given distribution of genotypes and environments, most of the variance in that trait is associated with genetic differences. Maybe the most important point I'll make here is that this is not the same as most of the value of the trait being genetically controlled. The textbook example is that (essentially) all of the variance in the number of eyes, hearts, hands, kidneys, heads, etc. people have is environmental. (There are very, very few mutations which alter how many eyes people have, because those are strongly selected against, but people do lose eyes to environmental causes, such as accident, disease, torture, etc.) The heritability of these numbers is about as close to zero as possible, but the genetic control of them is about as absolute as possible. Similarly, heritability says nothing about malleability, about how much or how easily the trait changes in response to environmental manipulations: heritability is defined with respect to a given distribution of environments, and does not predict the response to environmental changes. (I will come back to this below.)
What heritability does predict is the response to selection, in a constant distribution of environments. This is why quantitative geneticists developed and retained the concept. If a population is subjected to directional selection on a trait, whether the selection is natural or artificial, and the trait follows the classical decomposition into additive, uncorrelated components, the degree to which the genetic component of the trait changes will depend on the intensity of selection, the variance in the trait, and its heritability. The response to selection, the phenotypic change in the next generation, will be large if the selection pressure, the trait's variance, and the trait's heritability are all high, assuming that the distribution of environments is held fixed and uncorrelated with genotype. In sexually-reproducing organisms, where the genes get reshuffled every generation, the relevant heritability is the narrow heritability, involving only the additive term for the genes, not the broad heritability, involving the total genetic variance. (Feel free to guess which number The Bell Curve obsesses over, despite supposedly being concerned with evolution.) To reinforce the context-dependence of heritability, note that selection will tend to reduce genetic variance, and so heritability, especially when selection is strong.
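The standard way of writing this dependence is the so-called breeder's equation of quantitative genetics, given here only as a reminder of the form the calculation takes. The symbols are the usual textbook ones rather than anything defined in this post: R is the response to selection (the change in the mean of the trait from one generation to the next), S is the selection differential (the mean of the selected parents minus the mean of the whole parental population), sigma_P is the standard deviation of the trait, and i = S / sigma_P is the standardized intensity of selection.

```latex
\[
  R = h^2 S = h^2 \, i \, \sigma_P .
\]
```

It inherits all the assumptions listed above: additive, uncorrelated components and a fixed distribution of environments.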
(Fisher, incidentally, was very far from being enslaved by the product of his own labors, at least here, describing it as "one of those unfortunate short-cuts, which have often emerged in biometry for lack of a more thorough analysis of the data", dismissing the denominator as a "hotch-potch", and complaining that "the same herd, measured in the same character, would give widely different estimates of 'heritability' according to the practical precision obtained by the care and skilful control of the experimenter.")
So how does one estimate the heritability? The typical tactic, which again goes back to the early days of Fisher and friends, is to measure the covariance among individuals who share some but not all of their genes, and some but not all of their environments, and use the differences in covariances to gauge the size of the sundry components of variance. The standard biometric model decomposes the trait, call it Q, as follows:
If one takes identical (monozygotic) twins, raised together, and assumes that they are otherwise just like other members of the population, one obtains for the covariance
Let's break for a moment to be sure we understand this estimate, which is easy to get wrong. Heritability tells us how much IQ should differ among people who all happen to be genetically identical (twins, triplets, clones...), but are otherwise randomly distributed across the population, as compared to how much it differs across the whole population. By a convention of the test-makers, the standard deviation of IQ for the whole population is 15 points. That is, if we take a totally random person, we should expect their IQ to be about 15 points from the population average, which another convention fixes at 100. If we take two totally random individuals, then, we'd expect them to differ in IQ by about 21 points [= 15 * sqrt(2)]. Now imagine we have one of these groups of genetically-identical-but-otherwise-random people. If the heritability of IQ is 0.75, we expect the variance of IQ scores within such a group to be only 0.25 (=1-0.75) of the population variance in IQ. The standard deviation is the square root of the variance (that's why it's h^2), so the expected standard deviation of a genetically-identical group should be 0.5 of the population standard deviation. Thus, in the end, what h^2 amounts to is that two randomly chosen but genetically identical people should differ in IQ by about 11 points rather than 21. It is not the case that a heritability of .75 means that there is a three-quarters chance of having identical IQ scores, or that three-quarters of the value of the IQ score is set genetically; heritability is always about dividing up shares in the spread around an average level, not the level itself.
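The arithmetic in that paragraph is easy to check. The sketch below reads "expected difference" as the root-mean-square difference between two independent draws (an assumption on my part about which kind of typical difference is meant), and the function names are just labels for this example.

```python
import math


def rms_difference(sd):
    """Root-mean-square difference between two independent draws
    from a distribution with standard deviation `sd`."""
    return sd * math.sqrt(2)


def within_genotype_sd(population_sd, heritability):
    """Standard deviation left within a genetically identical group,
    if a fraction `heritability` of the variance goes with genotype."""
    return population_sd * math.sqrt(1 - heritability)


population_sd = 15.0   # convention of the test-makers
heritability = 0.75    # the value used in the text

random_pair = rms_difference(population_sd)
identical_pair = rms_difference(within_genotype_sd(population_sd, heritability))

print(f"two random people:         {random_pair:.1f} points")
print(f"two identical-genotype people: {identical_pair:.1f} points")
```

Under these assumptions the two figures come out at roughly 21 and 11 points, and the second is always exactly sqrt(1 - heritability) times the first.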
How could this possibly go wrong? Well, one sign that all is not well comes from comparing such "direct" estimates with "indirect" ones. The full algebra is tedious, but I'll go through a little bit of it to show the logic of what's going on. If we take fraternal ("dizygotic") twins, or other siblings, they will share somewhat more than half of their genes, because assortative mating means that their parents are already apt to be more similar genetically than a random pair of individuals from the population. Taking this into account, the covariances are
This is a good place to remind ourselves that the different contributions to the variance are basically never, in humans, measured in any direct way, or "controlled for" by regression. C and S are not measured variables, nor are A and D. Rather, they are all inferred indirectly, by comparing the correlations predicted by the model with observed correlations. (Doing otherwise, in experiments on other organisms, involves doing things like breeding lineages of known genetic composition.) If the model's specification is wrong, then even the best estimator of a parameter like heritability can give you rubbish. (Incidentally, almost nobody who works with these models in psychology seems to use the kind of mis-specification testing or more elaborate specification analyses econometricians have developed; but, to be fair, most economists don't either.) In particular, to get the right values for the genetic contributions to the variance, it is essential to account for all environmental sources of correlations, otherwise we are back to the genetics of zip codes. It is also crucial that the form of the model (additive contributions from uncorrelated sources) be correct. If there are correlations between genes and environments, or there are interactions between genetic contributions and environmental ones, then all bets are off.
To see why gene-environment interactions matter, consider one of the best-established links between genetic variations and intelligence, phenylketonuria. This is a recessive genetic disease which interferes with the normal metabolism of the amino acid phenylalanine. If someone with one of the defective forms of the gene for phenylalanine hydroxylase consumes too much dietary phenylalanine, it leads, among other problems, to serious mental retardation. Under suitable diets low in phenylalanine, however, they grow up mentally normal. Assigning shares of this effect to the genes and to the environment is exactly as sensible as trying to say how much of the fact that a car can go is due to its having an engine and how much is due to there being fuel in the tank. The best the usual biometric model could do here would be to predict that having the gene always reduced intelligence, as did consuming phenylalanine (which would be bad news for makers of artificial sweeteners); the fact that it's the combination, and only the combination, which is a problem would be missed, and the predicted size of the effect would be badly wrong. (The situation is similar for, say, hypothyroidism and a lack, rather than an excess, of iodine, but the genetics there are messier.) So while everyone piously says that genes and environments interact in development, they typically use models which assume that they do so only in trivial ways, and hope that any actual interactions are small enough to be treated as noise.
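A toy calculation shows what an additive model does with a pure interaction of this kind. Everything numerical here is invented for illustration: a balanced two-by-two table (gene variant present or absent, diet high or low in phenylalanine), with an outcome of 100 in every cell except the combination, where it is 40. In a balanced table the least-squares additive fit is just grand mean plus row effect plus column effect, which is what the hypothetical `additive_fit` below computes.

```python
from statistics import fmean


def additive_fit(cells):
    """Additive (no-interaction) least-squares fit to a balanced 2x2 table.

    cells: dict mapping (gene, diet), each 0 or 1, to that cell's mean outcome.
    Returns a dict of fitted values: grand mean + gene effect + diet effect.
    """
    grand = fmean(cells.values())
    gene_effect = {
        g: fmean(cells[(g, d)] for d in (0, 1)) - grand for g in (0, 1)
    }
    diet_effect = {
        d: fmean(cells[(g, d)] for g in (0, 1)) - grand for d in (0, 1)
    }
    return {
        (g, d): grand + gene_effect[g] + diet_effect[d]
        for g in (0, 1)
        for d in (0, 1)
    }


# Made-up outcomes: gene=1 means the defective variant is present,
# diet=1 means a diet high in phenylalanine. Only the combination hurts.
observed = {(0, 0): 100, (0, 1): 100, (1, 0): 100, (1, 1): 40}

fitted = additive_fit(observed)
for cell in sorted(observed):
    print(cell, "observed:", observed[cell], "additive fit:", fitted[cell])
```

With these made-up numbers the additive fit lowers the predicted outcome for carriers on the safe diet and for non-carriers on the risky diet, neither of whom is affected at all, and it understates the damage in the one cell where there actually is a problem.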
The basic biometric model predicts that direct and indirect estimates of heritability should agree. Since they do not, the model cannot be right. It needs to be modified to include additional sources of variance, or correlated components, or interactions, or some combination of these. Which?
The best estimate of IQ's heritability that I've seen, at least within the usual biometric framework, is that of Devlin, Daniels and Roeder (Nature 388 (1997): 468--471), and they take the first tack, identifying a neglected source of variance. (Disclaimer: Roeder is one of the aforementioned senior members of my department. On the other hand, this is one of the few studies of the question by actual statistical geneticists.) They performed a meta-analysis of all the usable correlation studies they could find, including the justly famous Minnesota twins study, which came to 212 estimated correlations covering 50,470 different pairs of people. This gave them data on identical twins raised together and apart, fraternal twins raised together, non-twin siblings raised together and apart, parents and children living together and apart, and adoptive parents and their adopted children. It's worth noting that they were able to find only five studies on identical twins raised apart (none with more than 60 pairs), two studies on non-twin siblings raised apart (none with more than 150 pairs), and three studies of separated parents and children (none with more than 400 pairs). Even assuming the data quality is excellent, there isn't a lot of it that isn't "contaminated" with shared family environments.
What Devlin et al. took from the studies were the estimated correlations and the sample sizes, not the heritability estimates. Then they tried to fit them all into a single coherent picture. If you assume any particular magnitudes for all the different components of the variance, you can work out the expected correlation you'd see for any one of the nine kinds of pairings, and, using the sample sizes, the likelihoods of getting the observed configuration of sample correlations. If I'd been the one doing it, I'd have tried to find the likelihood-maximizing value of the parameters, and then done some bootstrapping to get confidence regions for this estimate. Since I suspect that would turn out to be horridly intractable, I am happy with what they did instead, which is to do Bayesian estimation with a thoroughly uninformative prior, one basically indifferent to all decompositions of the variance which weren't arithmetically impossible. (The paper I ought to be finishing, instead of writing this, is about when my fellow frequentists should find such procedures unobjectionable.) In other words, they did a perfectly standard Bayesian meta-analysis.
Beyond that, though, they changed the model, so it looked like this:
Devlin et al. fit several permutations of this model, including ones with and without the maternal effect term, and allowing more or less shared environmental influence on people with different sorts of relationships. What they found was that including the maternal effects substantially improved the fit, despite the fact that just about everyone previously had ignored it as negligible. To summarize, in their preferred model, the narrow-sense heritability (h^2) was 0.34 (with a 95% credible interval of 0.27 to 0.40), the broad-sense heritability (H^2) was 0.48 (with a CI of 0.43 to 0.54; coincidentally, very close to the estimate in Jencks's old book), and the share of the maternal term was 0.20 for twins (CI of 0.15 to 0.24) and 0.05 for non-twin siblings (CI of 0.01 to 0.08). They also compared their model with maternal effects to an alternative which embodied the often-repeated claim that the heritability of IQ rises with age: maternal effects fit much better. In retrospect, it's kind of astonishing that everyone had ignored maternal effects before, if only because including them is quite standard in animal quantitative genetics. Once, like Devlin et al., you do include them, voila, the discrepancy between direct and indirect estimates of heritability goes away. There's more to this study (read it, if you're not sick of these matters already), but that's the core of it.
So, if I like this study so much, and it puts the narrow-sense heritability of IQ at 0.34 and the broad-sense heritability at 0.48, why do I say that we presently know squat about the heritability of intelligence? Partly this is because I deeply object to the confusion of "IQ" with "intelligence", but that's a subject for the sequel. Even if we stick to IQ, whatever that might be, though, I don't see how this study, or any similar one, can really answer the question on technical grounds. It's about as far as you can go within the classical assumptions, but those assumptions are horridly shaky.
Let me point out three things that Devlin et al. didn't do (limitations they're fully aware of, I hasten to add, but which couldn't be helped given the data available). First, they had to assume that the magnitude of the different additive components was the same across all the studies in their meta-analysis. In particular, the magnitude of the environmental components not included in maternal or within-family terms was assumed to be the same for everyone. Second, they neglected any correlations between the components. Third, they neglected any interactions between the components, and in particular any gene-environment interactions. Not only did they neglect the last two possibilities, they did not compare their model to models incorporating them, so they give no reason to think that they are negligible. I think there are pretty good reasons to believe these things would make a big difference. I am not going to say much more about the first problem, of heterogeneity or heteroskedasticity, but I want to expand on the issues of missing environmental correlations, of interactions and nonlinearities, and of cultural transmission and gene-environment correlations.
Many twin data sets show that the correlations in twins' IQs actually change with the environment, in a pretty crude way which nonetheless goes beyond what people generally include in their models. I will quote from an old paper by Bronfenbrenner [3, pp. 159--160], because it's handy and it makes the point:
The importance of degree of environmental variation in influencing the correlation between identical twins reared apart, and hence the estimate of heritability based on this statistic, is revealed by the following examples.
(Let us pause a moment to contemplate the sense in which identical twins, growing up in the same town and attending the same school, are "raised apart".)
a. Among 35 pairs of separated twins for whom information was available about the community in which they lived, the correlation in Binet IQ for those raised in the same town was .83; for those brought up in different towns, the figure was .67.
b. In another sample of 38 separated twins, tested with a combination of verbal and non-verbal intelligence scales, the correlation for those attending the same school in the same town was .87; for those attending schools in different towns, the coefficient was .66. In the same sample, separated twins raised by relatives showed a correlation of .82; for those brought up by unrelated persons, the coefficient was .63.
c. When the communities in the preceding sample were classified as similar vs. dissimilar on the basis of size and economic base (e.g. mining vs. agricultural), the correlation for separated twins living in similar communities was .86; for those residing in dissimilar localities the coefficient was .26.
d. In the Newman, Holzinger, and Freeman study, ratings are reported of the degree of similarity between the environments into which the twins were separated. When these ratings were divided at the median, the twins reared in the more similar environments showed a correlation of .91 between their IQ's; for those brought up in less similar environments, the coefficient was .42.
By the time you're done partitioning the twin pairs into classes in this way, n is pretty small, and the sampling errors in the correlations are going to be large, so I wouldn't give the 0.86 vs 0.26 contrast a lot of credence, but the fact that the differences are all in the same direction, and get pretty big, ought to be hard to ignore. (And it's worth remembering that n is never very large for identical-twins-raised-apart studies.) The obvious explanation for such results is that the developmentally-relevant environment of twins raised apart, but in similar towns, is much more highly correlated than that of twins raised apart in dissimilar towns. This means that a substantial chunk of the correlation you thought was genetic is actually due to shared environment, and pushes your heritability estimate down. Alternately, you could abandon the lack of correlation between genetic and environmental contributions, or the strictly additive nature of the model by including a very substantial interaction between genes and environment, so that identical genotypes respond very differently to those differences in environment. However you slice it, your estimate of heritability was too high.
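To get a feel for how large those sampling errors are, here is a sketch using the standard Fisher z-transformation to put an approximate 95% interval around a sample correlation. The subgroup size of 19 pairs is purely hypothetical (the quoted passage does not report how the 38 pairs split), and the interval assumes the scores are roughly bivariate normal; `correlation_ci` is just a name for this example.

```python
import math


def correlation_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a correlation coefficient,
    via the Fisher z-transformation (assumes bivariate normality).

    r: sample correlation, n: number of pairs (must exceed 3).
    """
    z = math.atanh(r)                 # Fisher's z
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


hypothetical_n = 19  # assumed subgroup size, NOT taken from the studies
for r in (0.86, 0.26):
    low, high = correlation_ci(r, hypothetical_n)
    print(f"r = {r:.2f}, n = {hypothetical_n}: roughly {low:.2f} to {high:.2f}")
```

With subgroups this small each interval spans a large part of the possible range, which is the reason for not leaning on any single contrast while still taking the consistent direction of the differences seriously.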
This is, of course, an old criticism; also a correct one. Kempthorne put it like this:
What, indeed, is the "grip" on environment in the human IQ area? It is no more than "reared together" versus "reared apart", and what does "reared apart" mean? Nothing more than at some age two related individuals, e.g., identical twins or full-sibs were separated by adoption, and then placed in homes that could be related familially and/or, of similar economic and social nature. I can only comment: Really, how naive can one be? The Burt study was characterized in the literature as the "only experiment". Some experiment!
Devlin et al. were only able to detect maternal ("intra-uterine") effects because their meta-analysis had a very large sample, which made their tests more sensitive to the presence of such effects, and because it is fairly easy to tell when people had the same birth-mother.
As soon as one turns to any behavioral measurements, the need to incorporate intra-uterine, family and community environment is obvious. I have the view that the "hereditarians" are utterly naive. It is obvious that parental IQ influences offspring environment. It is obvious that there is cultural transmission. To ignore the existence of this is merely stupid. I see no point in mincing words. If non-scientists sometimes have scorn for some supposed "scientific work", they should not be faulted.
To this naivete in model formulation, must be added a statistical naivete. Any statistical test has, within its conceptual underpinning, a sensitivity or power function. It needs essentially no deep thought to realize that the sensitivity of statistical tests for maternal effects, for genotype-environment interaction, etc., etc., is so low that to say, as has been said often, that such and such a model modification has been examined and found unnecessary is utter naivete. [p. 18, his italics, my links]
When people have included interactions between genetic and environmental variables, and done so in an even half-way decent manner, the results are quite dramatic, and make it impossible to talk about a value of heritability at all. For instance, allowing a (sucky) measure of socioeconomic status to interact with genetic and environmental variables, Turkheimer and co. found a massive dependence of the broad-sense heritability of IQ on status, running from nearly zero at low status to about 0.8 at high status. (Since the data, very unusually for this sort of study, actually included a lot of poor people, the former number should be rather more precise than the latter.) Of course, this is another one of the uses of regression which is merely descriptive, and drawing any causal inferences from it is very risky indeed (a point I'll expand on in the sequel; for now, you can read Clark Glymour explaining why this is Bad and Wrong). But if you're willing to believe, say, Edward Prescott's statistics, let alone Arthur Jensen's, you have no right to complain about Turkheimer et al. Anyone who tells you that the heritability of IQ has any particular value needs to explain away findings like this.
Let us consider trying to estimate the heritability of something which is transmitted culturally. Its real heritability is zero, since there is no genetic component to its variance, but the question is rather what the estimated heritability would be, employing the usual methods. In particular, suppose I came up with some quantitative measures of, say, accent. (Talk to some phoneticians if you think that wouldn't be possible.) Now I attempt to estimate their heritability. I'd find that identical twins reared apart had more similarity in accent than random members of the general population. They were born at the same time (and population-wide accents drift over time: see, again, Labov), in the same place (and so they will tend to grow up near each other, even though raised apart). Moreover, the kind of families receiving children being raised apart are not random samples of the general population. (To pick an example, in one of the Minnesota studies on IQ, race, and adoption, the average IQ of the adoptive fathers, all of them white, was over 120, while the state average for white males was 105. For a more formal and detailed version of this critique, see, e.g., Mike Stoolmiller.) The more geographic and social (and, to a lesser extent, temporal) variability I include in my sample, the more such twins will stand out as more similar in accent than the general population. Of course children born and raised in the same family are going to be even more similar, through obvious non-genetic processes. Applying the usual direct estimate based on twins raised apart, or even the kind of analysis done by Devlin et al., I will estimate a non-zero heritability, which is entirely an artifact of neglecting cultural transmission. Another direct estimate of the narrow heritability is proportional to the correlation between separated parents and children. 
(It's twice that correlation if there's no assortative mating, shading down to just equal to the correlation when mating is perfectly assortative.) This, too, will be positive, if only owing to geographic clustering. The typical indirect estimate, based on comparing the correlation of identical and fraternal twins raised together, is the only one which should work, if they experience environments of equal variance. If identical twins experience more similar environmental influences on accent than fraternal twins, then even it will conclude that accent is, in fact, heritable.
(In fact, the usual methods should lead us to conclude that latitude and longitude are heritable. [To be concrete, imagine measuring this by averaging 1000 GPS readings taken at random times over the course of a week, commencing when the subjects are exactly 17 years and 17 days old.] Take identical twins who are adopted out right after birth — what should be ideal cases. Being born in the same place, at the same time, and placed in new households by the same mechanism, they will wind up physically closer than randomly-selected infants born at the same time. [Remember, on top of the tendency to wind up near where they were born, the kind of households which adopt are not socially representative and so will tend to cluster in space, a phenomenon demographers refer to as "neighborhoods".] Their latitude and longitude will thus be positively correlated, and since this correlation is the "direct" estimate of broad-sense heritability, that will be non-zero. I'd even be willing to bet, modestly, that identical twins adopted out will be more correlated in location than fraternal twins adopted out, if only because some of the fraternal twins will be of different sexes and some adoption processes will tend to treat them differently. Since identical twins raised together tend to be emotionally and physically closer than fraternal twins raised together, even the indirect method will tell us that location is heritable. If they applied the same principles here as they do in formally-similar cases, the hereditarian psychologists would recommend inflating the sample correlations before using them to estimate heritability, to compensate for the restricted geographic range of such studies, making the error worse.)
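The location example can be sketched in a few lines of simulation. This is only an illustration under made-up assumptions: one coordinate in arbitrary units, Gaussian scatter of birthplaces (`birth_sd`) and of adoptive placements around the birthplace (`placement_sd`), and hypothetical function names. Nothing genetic appears anywhere in the model, yet the correlation between separated twins, which is what the "direct" method reads as broad-sense heritability, comes out well above zero.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_separated_twins(n_pairs=5000, birth_sd=1.0, placement_sd=0.5, seed=0):
    """Correlation in location between twins adopted into different homes.

    Each pair shares only a birthplace; each twin is then placed
    independently, somewhere near that birthplace. No genes are modeled.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_pairs):
        birthplace = rng.gauss(0.0, birth_sd)
        first.append(birthplace + rng.gauss(0.0, placement_sd))
        second.append(birthplace + rng.gauss(0.0, placement_sd))
    return pearson(first, second)


direct_estimate = simulate_separated_twins()
print(f"'heritability' of location from separated twins: {direct_estimate:.2f}")
```

Under these assumptions the expected correlation is birth_sd^2 / (birth_sd^2 + placement_sd^2), so the more geographic spread in the sample relative to the scatter of placements, the more "heritable" location looks. Setting `birth_sd=0.0` (everyone born in the same spot) drives the estimate back to roughly zero.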
One of the sound tenets of a lot of conservative social and political thought is an insistence on the importance of tradition and tacit knowledge, its transmission through families and communities, and the difficulty of making up for the absence of early immersion in a tradition with later explicit instruction. The fact is, however, that if I studied anything which is transmitted via tradition in the way people estimate IQ's heritability, I'd conclude that it had a genetic component. If, in particular, there are traditions which affect IQ, the estimated genetic component of the variance is going to actually include at least some of the variance in traditions.
It's not hard to see how to do a proper study of heritability for accent. What you would need to do is take separated twins, adopted into other families, and see if they were more or less similar to each other than randomly-chosen children adopted into the same family, after matching on covariates (age at adoption, time with the family, time spent in one place, etc.). Similarly, the requirements for a really sound twin study of IQ have been known at least since Flynn laid them out in his 1980 book on Race, IQ, and Jensen: twins from a representative sample of the genetic variation in the population would need to be adopted into families with a representative sample of the environmental variation in the population, with no gene-environment correlations introduced by the adoption process itself. Even so, we would need an indirect method, or lots of standardized uterine replicators, to remove maternal effects, and the problem of post-adoption processes creating such correlations would remain. I am unable to discover anyone, in the last twenty-seven years, actually doing such a study, though I'd be interested to learn of one which even comes close.
This leads us to the topic of gene-environment correlation, as opposed to interaction. It is very easy to think of ways in which genes and environments can become correlated during the life cycle. A standard example is the child who shows some early proclivity for music, perhaps genetically primed, which results in music lessons, more exposure to music, praise and interest in their efforts, support for them practicing, etc., the flip side being the child who seems dull early on, and so gets written off. The well-known paper by Dickens and Flynn shows how far you can push this idea, especially if you include some social amplification. But making this the main scenario for such correlations is indulging in that optimistic individualism which is one of our more admirable national traits (really!), but here gets in the way of thinking clearly.
All hitherto-existing civilized societies are divided, more or less sharply, into classes or groups, which tend to reproduce themselves through non-genetic means. (Bowles and Gintis give an excellent review of the contemporary situation, though even they use too high an estimate of the heritability of IQ.) The members of these groups differ systematically in their access to material and cultural resources. These are a large part of what is meant by "environment" in the biometric models. (Throwing "socioeconomic status" into a regression does not make these variables go away.) Social classes also have a tendency towards endogamy, of variable strength. The result will be a tendency to create genetic differences between groups, and so correlation between genes and environment. If we look at traits which respond favorably to material resources (like height) or cultural resources (like cognitive skills), and ignore this correlation, we will get a systematically upward-biased estimate of heritability, because genetic similarity will predict the value of the trait. Moreover, selection on the trait will alter the frequency of the associated genes. This is true even if the genetic differences have no causal influence on the trait at all, i.e., if there would be no systematic differences were the distribution of environments equalized (hopefully, up).
To the best of my knowledge, we currently have no idea of the magnitude of this effect, because:
I want to expand a bit on the importance of cultural resources, and how it plays in here. Consider a society where (for the sake of argument) the middle classes have dramatically expanded within living memory, and where our sample is disproportionately biased towards those classes. (Most of the family and adoption studies used for IQ are, in fact, biased in that way.) Many middle class families will then have parents one or two generations removed from poverty; others will have been in the middle classes for much longer. While the two groups may have comparable incomes and even formal educational credentials, any conservative thinker (Oakeshott, Barzun, Loury...) will be happy to explain to you that the latter will be more likely to have traditions which are fitted to their station in life, and adaptive in society more generally. (The difference will attenuate over time.) There will also, because of previous endogamy, tend to be genetic differences between the two sub-groups. Even if those genetic differences are causally irrelevant, genes and environment will be correlated, and genetic differences will predict trait values.
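The last claim, that causally irrelevant genetic differences will still predict trait values, is easy to check in a toy simulation. Everything here is a hypothetical set-up chosen for illustration: two equally sized sub-groups, a neutral marker whose allele frequency is 0.7 in one and 0.3 in the other (standing in for the residue of past endogamy), and a trait that depends only on which group's traditions a person grew up with, plus noise. The marker never enters the trait equation.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_neutral_marker(n_people=20000, seed=1):
    """Return (established, marker, trait) rows for a structured population."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_people):
        established = rng.random() < 0.5          # long-standing vs. new middle class
        allele_freq = 0.7 if established else 0.3  # hypothetical frequencies
        marker = sum(rng.random() < allele_freq for _ in range(2))  # 0, 1 or 2 copies
        # Trait depends on group environment and noise only, never on the marker.
        trait = (1.0 if established else 0.0) + rng.gauss(0.0, 1.0)
        rows.append((established, marker, trait))
    return rows


def marker_trait_correlation(rows, group=None):
    """Correlation of marker with trait, overall or within one sub-group."""
    chosen = [r for r in rows if group is None or r[0] == group]
    return pearson([m for _, m, _ in chosen], [t for _, _, t in chosen])


rows = simulate_neutral_marker()
overall = marker_trait_correlation(rows)
within = {g: marker_trait_correlation(rows, g) for g in (True, False)}
print(f"overall: {overall:.2f}")
print(f"within established: {within[True]:.2f}, within new: {within[False]:.2f}")
```

In the pooled sample the marker predicts the trait; within either sub-group, where environment no longer varies with ancestry, the association should vanish up to sampling noise. An analysis that ignores the group structure would credit the marker with an effect it does not have.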
(An aside, because I'm a glutton for punishment: Ashkenazi Jews in the United States are represented in high-average-IQ professions far out of proportion to their share in the population, and are also more educated than the general population. It would be astonishing if they did not have above-average IQs. They are systematically different, genetically, from the rest of the population. Some genetic diseases, e.g., Tay-Sachs, are more common among them, while others, e.g., cystinosis, are rarer. More trivially, and so more suitably for my purposes, they are much more likely to be able to wear their hair in what one of my Israeli colleagues calls his "Isro". It follows that genetic variation at Isro-relevant loci has some ability to predict IQ, at least in the US. [It would not be so predictive in the Sitka District, another sign that we are dealing with correlates and not causes.] Ashkenazi Jews are also, by definition, systematically different culturally from the rest of the US population, so the environment in which children grow up differs systematically too. It's been argued, with some plausibility, that many of those cultural traditions are highly adapted to modern life, while not, of course, developed for that purpose — "pre-adaptations" or "exaptations", in the jargon. Now postulate — I hope this will not be a stretch — some scientists who are very sophisticated about molecular biology, but very naive about tradition, and for that matter about statistical methods for structured populations. It is only too easy to imagine them establishing quite precise degrees of genetic commonality by tracking the Isro loci or ones linked to them, but neglecting gene-culture covariation, and concluding that those loci are pleiotropic, affecting both hair curliness and IQ. Similar remarks apply to, say, the association, at least in the US, between the Scots-Irish and violence in defense of personal honor. 
Designing a study which could handle this kind of covariation is left to the reader.)
I could go on with other reasons why, even if the genetic variance-component of IQ was always zero (which, for the record, I doubt it is), we should expect a higher correlation in the IQ of twins raised in the same family than of other siblings raised together. For starters, any fluctuations in the family's resources, quality of available schools, etc., are going to hit the twins at the same point in their development, which will not be the case with other siblings. But this grows (or has grown) tedious.
Does a trait's heritability tell us anything about its malleability, about how easy it is to change the trait with environmental manipulations? The answer is "no, of course not", even assuming (1) the basic biometric model holds, and (2) we are talking about true heritability and not biased-to-nonsensical estimated heritabilities.
It's banging on an often-sounded drum, but it's worth doing because it makes the point clearly: height is heritable, and estimates for the population of developed countries put the heritability around 0.8. Moreover, tall people tend to be at something of a reproductive advantage. Applying the standard formulas for response to selection, we straightforwardly predict that average height should increase. If we select a population without a lot of immigration or emigration to mess this up, say 20th century Norway, we find that that's true: the average height of Norwegian men increased by about 10 centimeters over the century. But that's much more than selection can account for. Doing things by discrete generations, rather than in continuous time, height grew by 2.5 centimeters per generation. (The conclusion is not substantially altered by going to continuous time.) If the heritability of height is 0.8, for this change to be due entirely to selection, the average Norwegian parent must have been 3 centimeters taller than the average Norwegian. This, needless to say, was not how it happened; the change was almost entirely environmental. The moral is that highly heritable traits with an indubitable genetic basis can be highly responsive to changes in environment (such as nutrition, disease, environmental influences on hormone levels, etc.).
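For readers who want the arithmetic spelled out: the simplest of the standard response-to-selection formulas is the breeder's equation. The symbols are notation introduced for this illustration, with R the change in the population mean per generation, S the amount by which the average parent exceeds the population average, and h^2 the heritability. Plugging in the figures from the paragraph above:

```latex
R = h^2 S
\quad\Longrightarrow\quad
S = \frac{R}{h^2} = \frac{2.5\ \text{cm}}{0.8} \approx 3.1\ \text{cm}
```

That is the "3 centimeters" quoted above, and it would have had to hold in every one of the four generations.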
Conversely, the very low heritability of eye number does not tell us that it is easy to increase how many eyes someone has by exercise, education and training, manipulating diet, manipulating ambient light, trepanation, etc.
So, does this apply to IQ? Well, if we couldn't find any environmental interventions which affected IQ, that would indeed be strange and suspicious. But in fact it's really not hard. Winship and Korenman (pp. 215--234 of Intelligence, Genes, and Success; large PDF reprint) re-analyzed the National Longitudinal Survey of Youth data used by Herrnstein and Murray, but did it without their technical blunders. (I omit a blow-by-blow.) They found that the impact of each extra year of schooling beyond 8th grade is somewhere between 2 and 4 IQ points, depending on exactly how the model is specified, with most specifications giving estimates of 2.5 to 3 points per year. (This is in line with other studies, including some that Herrnstein and Murray cite and claim, contrary to fact, show negligible impact.) The difference in IQ between someone who drops out of school in the 8th grade and someone who finishes college, all else being equal, would thus be somewhere between 20 and 24 points. Now, you might object that maybe this is because a higher IQ makes you stay in school longer, but Winship and Korenman use two measures of IQ, at different ages, and control for the direct effect of early IQ on later IQ, the direct effect of early IQ on years of education, and sundry other covariates. This study isn't perfect, methodologically, because regression generally sucks as a tool for causal inference, and here in particular there could be all kinds of unmeasured influences, hiding in their structural equations, confounding their results. But if you start discounting studies on these grounds (which, let's be honest, you should), you soon find that you can not only recycle The Bell Curve, but also safely ignore the bulk of what's published in Intelligence, The American Economic Review, etc. If you are not prepared to do that, it's hard to see how you can object to Winship and Korenman's methods.
There are, however, other studies of education's impact on IQ where it's hard to see how endogeneity could creep in. (I draw these from Wahlsten, pp. 71--88 of Intelligence, Genes, and Success, because it's to hand, but it would be easy to multiply examples.) A cute one comes from Alberta, where there used to be an arbitrary birth-day cut-off for starting schooling; children born just before it thus start school a year before those born just a day later. The mean difference in IQ between such children is significant, positive, and about four points in favor of the early starters. (The same thing happens with many sports, where the discrepancies grow with age, perhaps due to a positive feedback loop from practice to success to motivation to practice.) A smaller effect, but broadly applicable, is that children's test scores are higher at the end of one school year than at the beginning of the next, after which they recover. There are also randomized studies of interventions with "at risk" families, e.g. ones with unusually low birth-weight children. Depending on the study, the treated groups had IQs 10 to 15 points above the controls. Because of the random assignment, not only is there no problem of endogeneity, it's also idle to worry about placebo effects; it would be fantastic if a placebo raised IQ by 10 points. Another nice example (not from Wahlsten) comes from Heber's work on Rehabilitation of Families at Risk for Mental Retardation. Rather than summarizing it in my own words, I'll quote someone else's summary (though no doubt I'll be told he understood neither genetics nor experimental methods):
It describes an experiment on ghetto children whose mothers had IQs of below 70. Some of these children received special care and training, while others were a control group. Four years after the training period the IQs of the former averaged 127 and those of the latter 90, a spectacular difference of 37 points. The fact that the control children had a 20-point advantage over their mothers is not unexpected [because of regression toward the mean]. [4, pp. 14--15]
At this point, the ritual is for people to begin saying things like "there's nothing you can do if the environment is already decent", "the changes didn't last long after the program" (which would equally show that exercise can't really change physical fitness; see below), or to raise irrelevancies. (My favorite, among the last, is to point to adoption studies showing that adoptees' IQs are more correlated with their biological than their adoptive parents' IQs, conveniently side-stepping which set of parents had scores closer to the adoptees'.) But now "we've established what you are, we're just haggling over your price".
On top of all this, there is, to repeat what I said in the dialogue, the Flynn Effect. (See the links there for references.) The population average IQ rose monotonically, and pretty steadily, over the 20th century in every country for which we can find suitable records, including ones where we can definitely rule out immigration or emigration as significant contributory causes. (If it really is global, and I think we don't know enough yet to say either way, then the idea that it could be due to migration is ... peculiar.) The magnitude of the gains is, as these things go, huge: two to three IQ points per decade. As I said in the earlier post, this puts the average 1900 IQ at 70 to 80 in 2000 terms. Let's check how intense natural selection would have to be to explain this. Over a twenty-five-year generation, we're looking at an IQ change of 5 to 7.5 IQ points. Sticking with the usual biometric model, and taking the best estimate of heritability within that model, namely 0.34, we'd have to see a reproductive differential of between 14 and 22 points, i.e., the average parent would have to have an IQ that much higher than the average person. (I am neglecting correcting for assortative mating and for continuous time, which don't change things much.) Since 15 IQ points is one standard deviation, this would imply a huge bias in reproductive rates towards those with higher IQs. Needless to say, nothing of the kind is observed in any of the countries where the Flynn Effect has been documented.
(If you follow Herrnstein and Murray's account of the data they used, which is always hazardous, you should conclude that the mean IQ of parents is about 1 point below that of the general population; the result of this adaptive advantage of stupidity would be to lower the mean IQ by about a third of a point per generation. For comparison, the black-white IQ gap is about 15 points, and even, or rather especially, such raving hereditarians will tell you it is at least 20 to 50 percent "environmental". Following such people in placing a completely unwarranted causal interpretation on the models, this would mean that bringing the environment of black Americans up to the present white standard would raise the former's IQ by 3 to 7 points, and so the national average by somewhere in the range of one-third to two-thirds of a point. It is hard to understand why anyone who values average IQ so much they care about such small differences would worry more about dysgenic pressure than racial injustice.)
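The numbers in the last two paragraphs can be reproduced in a few lines. This sketch assumes the simple breeder's-equation form of the "usual biometric model" (response = heritability times selection differential), ignores assortative mating and continuous time just as the text does, and uses hypothetical function names.

```python
def response_to_selection(h2, selection_differential):
    """Change in the population mean per generation, given how far the
    average parent sits from the population average."""
    return h2 * selection_differential


def required_selection_differential(h2, response):
    """How far the average parent must exceed the population average
    for selection alone to produce the given per-generation change."""
    return response / h2


H2_IQ = 0.34           # heritability estimate quoted in the text
GENERATION_YEARS = 25  # generation length used in the text

for gain_per_decade in (2.0, 3.0):
    gain_per_generation = gain_per_decade * GENERATION_YEARS / 10
    needed = required_selection_differential(H2_IQ, gain_per_generation)
    print(f"{gain_per_decade} points/decade -> parents must average "
          f"{needed:.1f} points above the population")

# Parents about 1 point below the population mean:
dysgenic_shift = response_to_selection(H2_IQ, -1.0)
print(f"shift per generation from a -1 point differential: {dysgenic_shift:.2f}")
```

The required differentials come out at roughly one to one and a half standard deviations, while the shift implied by a one-point deficit among parents is about a third of a point per generation, against observed gains of five or more points per generation in the other direction.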
Nobody doubts that athletic abilities have genetic causes. Bodily shape is strongly influenced genetically, as, undoubtedly, are all manner of things like lung capacity, the properties of muscle fibers, reflexes, visual acuity, etc. I can say this with complete confidence because these traits have clearly evolved, and so must be under substantial genetic control. (Even so, careful attempts to find genetic bases for even very striking group differences in high-level performance fail to identify any. [Thanks to Leo Kontorovich for that link.]) On the other hand, it is plainly insane to suppose that athletic performance is not very largely learned, and a result of interaction with the environment.
To be really concrete, think about distance running. With practice, just about everyone can increase the distance they can run, the speed they can sustain, etc. Presumably there are physiological limits on what any one body can attain, even with an ideal training regimen, but performance (which is all one sees on any test) is malleable. Practice can take someone from being winded after sprinting two blocks to being able to run a marathon. Very basic physiological parameters, like the rate of oxygen uptake, demonstrably respond to training, and over a matter of weeks at that.
Of course, the flip side of this is that not practicing reduces ability, and sufficiently drastic lack of practice takes someone from being able to run marathons to being winded after two blocks. It is not enough to have practiced at some point in the past. There needs to be continuing practice, which means continuing opportunity and motivation, as well as sheer physical capacity. If we take a bunch of kids from an environment where physical exercise is discouraged, and make them run laps every day for a year, at the end of that they will (on average) be better at running than their peers. (They may have acquired other issues, but they will be better at running.) If we now return them to their environment, with a pat on the back and perhaps a souvenir pair of sneakers, is there anyone who doubts that in, say, five years most of them will be pretty much as sluggish as their peers? Is there anyone who would look at the result of such an experiment and conclude that exercise cannot, in fact, alter the ability to run?
Of course, not every kind of physical performance is as malleable as is distance running. No amount of training is ever going to let anyone hold their breath under water for an hour. Similarly, I do not expect any sort of learning will be able to alter some fairly basic aspects of the mind, e.g., to force the capacity of short-term working memory up to twenty chunks (rather than "the magical number seven plus or minus two"). We have evolved certain kinds of physical adaptability — feel free to speculate on why running might've been more useful than breath-holding, but not so useful as to be automatic — and similarly we have evolved some kinds of mental adaptability, but not others.
Let me sum up.
I realize I'm inviting the suspicion that I'm protesting too much. If I really think heritability is irrelevant to malleability, why shouldn't I be happy to accept, say, Jensen's old favored value of 0.8 for IQ's broad-sense heritability, which puts it in the same range as highly-malleable height? Why go on at such length about an irrelevancy? I can only offer two replies. One is that I am trying to meet people half way: even if I can't persuade you that heritability has nothing to do with malleability, I hope to persuade you that the current estimates are not reliable, that the notion of a value for IQ's heritability is silly, and that we do, indeed, know squat about that question. The other and more basic reply, however, is that these people are wrong in ways I find intensely irritating.
So: Do I really believe that the heritability of IQ is zero? Well, I hope by this point I've persuaded you that's not a well-posed question. What I hope you really want to ask is something like: Do I think there are currently any genetic variations which, holding environment fixed to within some reasonable norms for prosperous, democratic, industrial or post-industrial societies, would tend to lead to differences in IQ? There my answer is "yes, of course". I've mentioned phenylketonuria and hypothyroidism already, and many other in-born errors of metabolism also lead to cognitive deficits, including lower IQ, at least in certain environments. More interestingly, conditions like Williams Syndrome, Down Syndrome, etc., are genetically caused, and lead to reasonably predictable patterns of cognitive deficits, affecting different abilities in different ways. In many of these cases, it seems very likely (but is not yet established) that these variants cause problems with the signaling pathways which set how gene expression responds to environmental cues. Manipulating those signaling pathways during the right time windows would change what kind of mind the organism has later. The fact that different genetic disorders lead to different patterns of cognitive deficits, rather than just generally making people duller all around, suggests ways of disentangling which genes are relevant to which abilities through which molecular mechanisms. (Cf.) At a popular level, I've still not run across a better description of the way the regulation of gene expression couples genotypes and environments during mental development than Gary Marcus's writings, but if you want details there is a whole rapidly-growing field of molecular developmental neurobiology (as I'm not-infrequently reminded).
I suspect this answer will still not satisfy some people, who really want to know about differences between people who do not have significant developmental disorders. Here, my honest answer would be that I presently have no evidence one way or the other. If you put a gun to my head and asked me to guess, and I couldn't tell what answer you wanted to hear, I'd say that my suspicion is that there are, mostly on the strength of analogy to other areas of biology where we know much more. I would then — cautiously, because you have a gun to my head — suggest that you read, say, Dobzhansky on the distinction between "human equality" and "genetic identity", and ask why it is so important to you that IQ be heritable and unchangeable.
[1]: "Limits to Intensive Production in Animals", British Agricultural Bulletin 4 (1951): 217--218, reprinted on pp. 219--223 of volume 5 of his Collected Papers (ed. J. H. Bennett, Adelaide: University of Adelaide Press, 1974). My quotations are all from p. 221 of the reprint.
[2]: Christopher Jencks et al., Inequality: A reassessment of the effect of family and schooling in America (New York: Basic Books, 1972). I have not gone back over Jencks's calculations to see if they are sound, so this may indeed just be coincidence.
[3]: Urie Bronfenbrenner, "Nature with Nurture: A Reinterpretation of the Evidence", pp. 153--183 of Ashley Montagu (ed.), Race and IQ (second edition, New York: Oxford University Press, 1999; first edition 1975), ISBN 0-19-510220-7.
[4]: Theodosius Dobzhansky, Genetic Diversity and Human Equality (New York: Basic Books, 1973).
Update, 23 November: Typo correction (thanks to Matt Bonakdarpour).
Posted by crshalizi at September 27, 2007 14:55