
Evolution in Structured Populations

The Phase of Mass Selection and Long Term Selection Experiments

Posted: July 2nd, 2014 by Charles Goodnight

On to Phase 2 of Wright’s Shifting Balance Process. But before I do I should probably start with a shameless attempt to up my standings in the next Carnival of Evolution World Cup Competition by alerting the committee responsible to the following figure that I found:


Obvious evidence of Pre-Cambrian Bunnies (unapologetically lifted from http://clubschadenfreude.com/2013/02/19/not-so-polite-dinner-conversation-part-9-the-second-half-of-19-limestone-coelacanths-and-circular-reasoning/)

OK, on to phase 2: the phase of mass selection. In Wright’s words “. . . the set of gene frequencies drifts far enough to cross one of the . . . saddles in the surface of fitness values . . . There ensues a period of relatively rapid change in this deme, dominated by selection among individuals (or families) until the set approaches the equilibrium . . ..” (Page 455, Wright 1977. Evolution and Genetics of Populations. Vol. III. Univ. Chicago Press).

There would appear not to be much controversy about this. In particular, Wright’s claim for this phase is that the population will climb the nearest peak and approach the local optimum. I doubt that Fisher would argue much with that. However, there actually is the potential for some discussion. In particular, in an additive world the response to selection occurs strictly by changes in gene frequency of alleles with fixed effects. However, one of the points I have made before (https://blog.uvm.edu/cgoodnig/2013/07/31/drift-and-epistasis-the-odd-effects-of-small-population-sizes/) is that drift can convert epistatic variance into additive variance, and in the process change the average effects of alleles. As I mentioned last week, this may be the important role of Wright’s phase one: Drift causing shifts in local average effects. As I also discussed last week it is unlikely that these shifts will be major, since in general epistasis tends to be small as a variance component, and thus in most cases there won’t be much material for the “conversion” process to work on. This is where Wright’s phase two comes in.

To see where this is important it is useful to look at long-term selection experiments, and note two anomalies that are consistently found in such experiments. First, they work too well. That is, you typically can get 100 or more generations with a nearly linear response to selection (and MUCH more if we acknowledge the existence of laboratory selection experiments involving bacteria — Wiser, Ribeck, and Lenski 2013. Science 342:1364-1367). Second, there are typically intermediate selection plateaus.

Corn Selection copy

The results of 100 years of selection for oil and protein in corn. Note the overall long term linear response to selection (orange highlight), which is nonetheless punctuated by extended selection plateaus (green highlights).

Turning first to the long-term linear response to selection: this is actually expected under Fisher’s infinitesimal model, which has the odd feature that, because there are infinitely many loci of infinitely small effect, selection changes the gene frequencies of the individual loci by an infinitesimal amount, which is to say gene frequencies do not change. Of course in the real world there is a finite number of loci. Nevertheless, this long-term linear response to selection implies that there are indeed a VERY large number of loci contributing to the response to selection. It turns out that there are two other possibilities. One is the ongoing input of new mutations, which due to space constraints I will not talk about, and the second is epistasis. As with genetic drift, selection will drive the conversion of epistasis to additive effects. The rather surprising empirical result from an old paper (Goodnight 2004 in: Plant Breeding Reviews. J. Janick, ed.) is that in epistatic systems selection seems to convert epistasis into additive effects at a more or less steady rate. Thus, this conversion of epistasis into additive variance driven by selection is a possible explanation for the extended response to selection.

VA with selection and epistasis

The additive genetic variance in four simulated populations with two loci and 100 alleles per locus under selection. Dashed lines are populations with only additive effects. Note the approximately exponential decline in the additive genetic variance. Solid lines are populations with additive-by-additive epistasis. Note that the additive genetic variance remains elevated for 10 generations before beginning an exponential decline.

The intermediate selection plateaus are also consistent with an epistatic model. The typical, and adequate, explanation for intermediate selection plateaus is that the population has run out of selectable variation, and is waiting for either a favorable mutation, or a favorable recombination event. However, consider the simple case of additive-by-additive epistasis.

VA fitness surface

The fitness surface for additive-by-additive epistasis.


From this fitness surface you can see that there are two possible outcomes of selection, fixation of the A1A1B1B1 or A2A2B2B2 fitness peaks. Interestingly, in a completely deterministic system there is a boundary dividing the domains of attraction for the two peaks. Thus, we can start two populations near fixation in one of the fitness valleys, such as nearly fixed for A1A1B2B2, and with arbitrarily small changes in gene frequency they will go to different peaks.

deterministic selection

The response to selection for two populations, both with a starting gene frequency of A1 = 0.99, and with B1 = 0.0101 in one and 0.0099 in the other. Note in this deterministic model the outcome is a function of minor differences in gene frequency at the B locus. Each arrowhead represents one generation.
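For readers who want to play with this, here is a minimal sketch of a deterministic two-locus model of this kind. It is not the code behind the figure. The fitness function `w = 1 + s*(i - 1)*(j - 1)`, the selection strength `s = 0.1`, and the assumption of Hardy-Weinberg and linkage equilibrium are all illustrative choices of mine, and the function names `step` and `run` are hypothetical.

```python
def step(p, q, s):
    """One generation of selection; p = freq(A1), q = freq(B1).

    Assumed fitness: w = 1 + s*(i - 1)*(j - 1), where i and j count the
    A1 and B1 alleles (0, 1 or 2) in a diploid genotype. Loci are assumed
    to stay in Hardy-Weinberg and linkage equilibrium.
    """
    mean_w = 1 + s * (2 * p - 1) * (2 * q - 1)
    # Standard one-locus recursions, using marginal allele fitnesses.
    dp = s * p * (1 - p) * (2 * q - 1) / mean_w
    dq = s * q * (1 - q) * (2 * p - 1) / mean_w
    return p + dp, q + dq


def run(p0, q0, s=0.1, generations=2000):
    """Return the list of (p, q) pairs, one entry per generation."""
    path = [(p0, q0)]
    for _ in range(generations):
        path.append(step(*path[-1], s))
    return path


if __name__ == "__main__":
    for q0 in (0.0101, 0.0099):
        p_end, q_end = run(0.99, q0)[-1]
        print(f"B1 start {q0}: final A1 = {p_end:.3f}, final B1 = {q_end:.3f}")
```

Under these assumptions the line where the A1 and B1 frequencies sum to one separates the two domains of attraction, so the two starting points sit on opposite sides of it. Both populations first move toward the saddle at frequencies of 0.5 and then peel off toward opposite peaks.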

This simple system gives us both the results typical of long term selection experiments, that is surprisingly long responses to selection, and intermediate selection plateaus.

response to selection

The response to selection in the example described above. Note the long response to selection with an intermediate selection plateau.

Two more fun graphs, then I will conclude and leave you alone. First, the intermediate plateau is due to a lack of additive genetic variance, but not total genetic variance. The problem is that when the population nears a gene frequency of 0.5 the genetic variance is mostly expressed as epistasis, and the population cannot respond to selection. When the population is dominated by either allele at either locus (or both) the additive genetic variance increases.

VA during selection

During the selection process total genetic variance remains relatively constant (except near fixation) but additive genetic variance dominates during the early and late stages of the selection when frequencies are far from 0.5, and the epistatic variance dominates during the middle stages when the gene frequencies are near 0.5.
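The shift between additive and epistatic variance can be worked out directly for a simple case. The sketch below assumes genotypic values `G = a*(i - 1)*(j - 1)`, where `i` and `j` count A1 and B1 alleles, with Hardy-Weinberg proportions and linkage equilibrium. That parametrization is my assumption and is not necessarily the one used for the figure. Under it, the additive variance is the allele-count variance at each locus times the squared average effect, and what is left over is additive-by-additive variance.

```python
def variance_components(p, q, a=1.0):
    """Return (V_A, V_AA) for genotypic value G = a*(i - 1)*(j - 1).

    i, j = number of A1, B1 alleles (0, 1, 2); p, q = their frequencies.
    Assumes Hardy-Weinberg proportions and linkage equilibrium.
    """
    var_a = 2 * p * (1 - p)      # variance of the A1 allele count
    var_b = 2 * q * (1 - q)      # variance of the B1 allele count
    alpha_a = a * (2 * q - 1)    # average effect of an A1-for-A2 substitution
    alpha_b = a * (2 * p - 1)    # average effect of a B1-for-B2 substitution
    v_additive = var_a * alpha_a ** 2 + var_b * alpha_b ** 2
    v_epistatic = a ** 2 * var_a * var_b
    return v_additive, v_epistatic


if __name__ == "__main__":
    # Walk from near the A1A1B2B2 valley toward the saddle at 0.5, 0.5.
    for p in (0.99, 0.9, 0.7, 0.5):
        va, vaa = variance_components(p, 1 - p)
        print(f"A1 = {p:.2f}, B1 = {1 - p:.2f}: V_A = {va:.4f}, V_AA = {vaa:.4f}")
```

In this toy model the additive variance is exactly zero when both frequencies are 0.5, and all of the genetic variance is then epistatic, which is the situation described for the plateau.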

And, because I cannot resist, the average effects of the A1 and A2 alleles reverse over the course of the selection experiment.

figure 12 AXA LAE

The local average effects of the A1 allele. The A1 allele starts out being the low fitness allele. Half way through the response to selection the gene frequencies approach 0.5 and the local average effects reverse with the A1 allele becoming the high fitness allele.

So, returning to Wright’s phase two, the phase of mass selection, we see, again, that Fisher and Wright were seeing two sides of the same process. Within populations selection will appear to act exactly as Fisher said, as a process that refines adaptations and leads to the (locally) optimal phenotype. Between populations it must be recognized that in an epistatic world selection is a diversifying force that acts to magnify the small differences in local average effect, potentially driving populations to different adaptive peaks.

In other words, to paraphrase Dave McCauley when he was a postdoc and spiritual leader of us graduate students at the University of Chicago, it is the interaction of stochastic and deterministic evolutionary forces that gives meaning to life.

Was Fisher (W)right?

Posted: June 28th, 2014 by Charles Goodnight

Once again without internet, thundering my way north on the Silver Star. It is hard to keep focus on the subject I intended to write about. The Evolution meetings were inspiring to say the least. Lots of great talks, many of which I could easily write a blog about, but I must stay on phase one for the moment!

Like Barton and Charlesworth and Carson and Templeton, Fisher and Wright would have disagreed on the importance of genetic drift (That sentence must win some sort of prize for name dropping!). Fisher would have emphasized that small population size would decrease the number of alleles and the (molecular) genetic variation within populations. He also would have, in all likelihood, argued that most meaningful genetic variation was additive, and epistasis would not be present to any great extent. In contrast, Wright would have argued that gene interaction is important in populations, and as a result genetic drift could result in shifts in genetic architecture, and with that the potential to form what Dobzhansky called adaptive gene complexes. What I want to talk about is that, much like the blind men examining the elephant, they could both be right.

Blind men elephant 2

In the classic fable of the blind men and the elephant each man examines only a part of the elephant, and fails to understand what the whole is.  (from http://www.newsfromthehill.com/2011/06/keep-on-sunny-side.html)

First, Fisher. If you have a population with a large amount of additive by additive epistasis and send it through a bottleneck you will find that the additive genetic variance increases (yea! That’s a big part of the reason I have tenure!). However, although the total genetic variance increases a little bit, the additive variance increases a lot more, so the epistatic variance declines precipitously. The net result is that after a few generations of small population size the epistatic genetic variance basically disappears. Selection will typically do the same thing. In general both drift and selection tend to drive genetic variance into lower-order components. Thus dominance by dominance epistasis tends to be converted to additive by dominance, dominance, and additive variance. Similarly additive by additive variance tends to get converted to additive variance. It is because of this tendency that there is virtually no reason to bother with modeling or measuring three way and higher epistatic variance. It also means that most populations most of the time will also tend to have relatively little two locus epistatic variance. The bottom line: Fisher was right: Within populations under most circumstances we can ignore gene interactions, and treat populations as if they were additive. This means that in many cases genetic drift will simply have the effect of reducing the variance within populations.

epistasis and drift

After a few generations of small population size the nonadditive variance (green line minus red line) becomes very small. Fisher was right, within populations we can ignore gene interaction.

Importantly, however, this view that has Fisher as correct is focused within populations. Within populations the epistasis disappears as a variance component, but it does not go away. The interactions are still there, it is just that most of the time populations will be in a gene frequency space where most of the epistatic variance has been “converted” into additive variance. The other way that you can think of this is that as the epistatic variance disappears it reappears as variance among populations. The way this happens conceptually is that when one of the loci gets fixed the epistatic interaction between a pair of loci will be converted into additive effects. Of course in reality, one allele doesn’t get fixed while the other stays at intermediate frequency, however, thinking simultaneously about both loci going to partial fixation tends to hurt my head.

So it is this between population component where Wright comes in. As I said the fixation of alleles leads to the conversion of epistasis to additive variance, however, in different populations it may be different alleles that get fixed by drift. Thus, in an interaction between the A and B locus it may be the B1 allele that moves towards fixation in one population and B2 that moves towards fixation in a second population. In both populations the additive variance will increase due to conversion of epistasis to additive variance, but the increased additive variance will be different in that different alleles at the A locus will be favored in the two populations. In other words, you don’t get something for nothing. An increase in additive genetic variance is always accompanied by a shift in what alleles do. I have mentioned this before and identified this as a shift in the local average effects of alleles. However, even without the fancy name it is an important effect. In interacting systems genetic drift has the potential to send a population down a new evolutionary trajectory. Although the mathematical tools to describe it were not available to him, I believe that this is what Wright was talking about when he was thinking about phase one of his shifting balance process.

Mixing of alleles

When there is epistasis genetic drift will not only increase the additive genetic variance, it will also change the average effects of alleles. Thus, an allele that was “good” prior to a bottleneck may be “bad” after the bottleneck.

Thus, it appears that both Fisher and Wright were correct. Within populations genetics will typically act in an additive manner, and we will see that in many circumstances population bottlenecks will do little more than decrease the ability of a population to adapt. Thus, within populations a Fisherian view is expected to be entirely adequate. However, that decrease might not be as much as you might expect (for what it is worth, the VA should increase whenever VAA > 1/3 VA – that is ignoring the typo in the relevant equation), and there may be some shifting of local average effects. This shift in the effects of alleles on the phenotype is what Wright was talking about. If we are looking at a metapopulation we need to acknowledge this effect of gene interactions on the differentiation of populations.

Just to remind you this shift of local average effects due to gene interactions is an entirely different form of population differentiation than differentiation of population means. Two populations with identical mean phenotypes can nevertheless be differentiated for average effects.
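A small numerical illustration may help here. The sketch assumes a toy additive-by-additive model with genotypic values `G = a*(i - 1)*(j - 1)` (with `i` and `j` counting A1 and B1 alleles), Hardy-Weinberg proportions, and linkage equilibrium. It uses the ordinary regression average effect of an allele substitution as a stand-in for the local average effect. The model, the function name, and the frequencies are all illustrative assumptions.

```python
def mean_and_average_effect(p, q, a=1.0):
    """Population mean and average effect of A1 under G = a*(i - 1)*(j - 1).

    p, q = frequencies of A1 and B1. The average effect is the regression
    of genotypic value on A1 allele count, which here equals a*(2q - 1).
    """
    mean = a * (2 * p - 1) * (2 * q - 1)
    alpha_a = a * (2 * q - 1)
    return mean, alpha_a


if __name__ == "__main__":
    # Two hypothetical populations in which drift has moved the B locus
    # in opposite directions while A1 stays at 0.5 in both.
    for name, q in (("population 1", 0.8), ("population 2", 0.2)):
        mean, alpha = mean_and_average_effect(0.5, q)
        print(f"{name}: mean = {mean:+.2f}, average effect of A1 = {alpha:+.2f}")
```

Both hypothetical populations have a mean of zero, yet A1 raises the phenotype in the first and lowers it in the second, so selection in the same direction would favor different alleles.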

In conclusion, then, we can see that indeed Fisher and Wright were looking at different parts of the same elephant. Fisher was looking at the apparently additive world that is within a population, and Wright was looking at the emphatically non-additive world that is between populations. I would argue that a modern enlightened view of phase one of Wright’s shifting balance theory would combine these two views. Within populations an additive view will typically be adequate. Genetic drift will as often as not lead to a decrease in additive genetic variance, and epistatic variance will typically not be detectable. The true effects of epistasis will primarily be seen among populations, and they will be seen in the form of shifts in local average effects. These are measures we typically do not make, so until we have more data we will not know how important epistasis is in population differentiation.

The 1984 founder event debate: Its relation to Phase 1 of Wright’s Shifting Balance Process

Posted: June 20th, 2014 by Charles Goodnight

Today I am speeding south on the Empire State in the morning and the Silver Star in the afternoon. I should be in Raleigh Durham for the Evolution meetings late this evening. For the uninitiated Amtrak trains have names that reflect where they are going. Thus, the famous Steve Goodman/Arlo Guthrie song, “City of New Orleans” is about the train that runs from Chicago to New Orleans. Finally, long distance trains have no internet, so this post is a bit sketchy. But, enough about trains, and on to phase one of Wright’s shifting balance. As I mentioned last week Wright identified three phases of his shifting balance process of how he thought populations might evolve on a complex adaptive landscape. The first of these phases is phase 1, the phase of random drift.


From an historical perspective, a pair of papers in Annual Reviews in 1984 (Carson and Templeton 1984 Ann. Rev. 15:97-131, Barton and Charlesworth 1984 Ann. Rev. 15:133-164) are worth discussing. Although ostensibly about founder event speciation, they do a great job of laying out the state of the art for genetic drift in 1984. Taking the side that founder events (and by implication genetic drift) are relatively unimportant in evolution were Barton and Charlesworth, while Carson and Templeton attempted to defend founder event speciation.

Barton and Charlesworth’s points can be summarized in basically one sentence: Genetic drift and population bottlenecks reduce genetic variation, and honestly don’t change gene frequencies that much. In short genetic drift is unlikely to be important because it reduces a population’s ability to respond to selection, and will cause the random loss of variation. From this perspective there is really no reason to give genetic drift an important role in evolution. In contrast Carson and Templeton argued that indeed there is something special about founder events, especially when there was epistasis, that they can drive a population to a new adaptive peak, and they should not be ignored. Here, however, is where they fell flat. In 1984 there was no theory on the effects of epistasis and genetic drift, or even a sense of how we should model it. The bottom line on these is that, in my view, Carson and Templeton were soundly defeated, not because they were wrong, but because they lacked any mathematical framework to use as scaffolding for their argument, whereas Barton and Charlesworth had the entire corpus of Fisher’s and Dobzhansky’s work to draw upon.

This pair of papers actually had quite an impact on me since they came out right after I finished my thesis, which had my original model of epistasis and founder events. I was sitting there with an unpublished manuscript on exactly the topic that would have completely changed the debate that these two pairs of authors were having (Goodnight 1987 Evolution 41:80-91).

What changed between 1984 and 1987 (actually 1995 – Goodnight 1995 Evolution 49: 501-511) was the development of formal models of the effect of genetic drift on epistatic variance. The important detail that Barton and Charlesworth did not have available to them was that the genetic variance components are a statistical property of a population, and they change as gene frequencies change. In the rather odd world of Fisher’s infinitesimal model even selection does not change gene frequencies, and as a result in Fisher-World© (OK, it’s not really copyrighted) it is completely valid to assume that additive genetic variance is a constant. However, Fisher also assumed that populations were infinitely large. If they are not infinitely large then gene frequencies DO change, and my models showed that on average these changes in gene frequency lead to an increase in the additive genetic variance. More importantly, Barton and Charlesworth also assumed that the average effects of alleles were constants, which is a valid assumption in Fisher-World. In my later models I showed that indeed there is only one way that additive genetic variance can increase following a population bottleneck. That is, the only way to increase variance is to change the average effects of alleles. Thus, in retrospect, the problem that Barton and Charlesworth had is that they were trying to apply Fisher-World reasoning to a situation in which it did not apply.

So, the way this applies to phase 1 of the shifting balance process is that we need to acknowledge that drift is not just change in gene frequencies. Of course at a molecular level that is exactly what it is, but what we are interested in is the phenotype, and the quantitative genetic variance components. I would even go so far as to say it is not the increase in additive genetic variance that is particularly important. Of course, increasing the additive genetic variance increases the rate at which a population can respond to selection; however, getting to your destination a few generations early does not strike me as the stuff of the shifting balance process.

Instead, I think it is the shift in local average effects of alleles that is the interesting feature of drift with respect to the shifting balance process. What this means is that if two populations are isolated from each other then genetic drift can lead to a slight increase in additive genetic variance, but shifts in average effects of alleles that are different in the two populations. These shifts mean that the same allele is doing different things in different populations. It also means that selection acting on the same phenotype in the same manner in the two populations will favor different alleles. What is good in one population may be bad in another population.

Importantly, Wright, and Fisher, and even Barton and Charlesworth and Carson and Templeton did not have access to this. They considered average effects to be unchanging, and as a consequence missed one of the major features of drift in small populations. What these models tell us is that the population genetics of gene frequencies is very different from the quantitative genetics of phenotypes, and since evolution is about phenotypes in most cases it is the quantitative genetics that will be important.

Final note: Yes, of course Barton and Charlesworth, and all good quantitative geneticists know that average effects are a function of population characteristics and gene frequency, but this is not something that will normally enter their thinking or intuition.

Wright’s Shifting Balance Process

Posted: June 13th, 2014 by Charles Goodnight

Now that I have talked about how Wright thought evolution didn’t occur on adaptive landscapes, it is time to talk about how he thought it did occur. The 7 assumptions and the adaptive topography were all basically background for his “shifting balance” process of evolution (Wright 1977 Evolution and Genetics of Populations. Vol. III. Experimental Results and Evolutionary Deductions. Univ. Chicago Press). To quickly summarize the previous few posts, Wright thought that the evolutionary possibilities could be visualized as an adaptive topography with fitness peaks and valleys.

wsbp adaptive topography

Wright’s adaptive topography. The horizontal axes represent aspects of the genotype or phenotype; the vertical axis represents fitness. The red dot is the current position of a population on the adaptive topography. Wright’s central question was how a population crosses an adaptive valley to move from its current local adaptive peak to a second higher peak.

Wright felt that very large or very small populations, or even a single medium sized population would not be able to navigate this topography, either because very large populations would be dominated by selection and unable to cross fitness valleys, or single very small populations would be so small and isolated that they would not be able to adequately explore the landscape. Instead, he thought that a metapopulation, or population of populations, would be the ideal population structure for evolution. (Of course, he did not use the term “metapopulation”: That term was coined by Levins, 1969 Bull. Entom. Soc. Am. 15: 237-240.) He basically thought that a medium size population would have a balance of drift and selection that would allow the population to drift away from an adaptive peak and randomly explore the adaptive landscape. However, he also thought that a single population would be inadequate since the chance of that population actually coming under the selective domain of a new higher peak would be very small. Thus, he thought that the metapopulation structure with numerous moderate size populations was necessary since collectively they would be able to drift away from an adaptive peak and adequately explore the adaptive landscape.

wsbp metapopulation

A metapopulation is a population of populations that is connected by a low level of migration

The process he envisioned is his “shifting balance” process, which he imagined as a three phase process (I prefer calling it a process since the “theory” is whether or not the “process” is important. I also prefer SBP, because as an infant my daughter Sylvia’s nickname was Sylvia Bilvia Pilvia, or SBP for short.).   The three phases he identified were:


(1) the phase of random drift. During this phase the subpopulations drift at random across the adaptive landscape. Drift is random with respect to fitness, thus, during this phase the populations are not constrained by selection, and can easily cross adaptive valleys.

wsbp phase 1

Phase 1, the phase of random drift. During this phase the populations move randomly without respect to fitness due to random sampling processes in small populations.

(2) the phase of mass selection. During this phase the subpopulations come under the selective influence of local adaptive peaks and are driven by natural selection to climb the closest peak. Selection is a deterministic process that always drives the population “up hill” towards higher fitness, regardless of the height of the peak relative to other peaks.

wsbp phase 2

Phase 2, the phase of mass selection. During this phase selection drives the populations towards the nearest local adaptive peak.

(3) the phase of interdeme selection. During this phase the subpopulations that are on higher peaks are more successful, and as a result are net exporters of migrants, whereas those on lower peaks are less successful and net importers of migrants. Wright thought that this differential migration would effectively export successful adaptive gene combinations to other subpopulations, and eventually shift the balance of adaptation over to the new adaptive peak.

WSBP phase 3

Phase 3, the phase of interdeme selection. During this phase populations on higher peaks are net exporters of migrants, spreading their adaptive gene combinations to populations on lower peaks.

In Wright’s discussion it is clear that he considered these three processes to be occurring simultaneously within a metapopulation. Presumably any given subpopulation may at some times be dominated by random drift, with selection being a relatively weak force in that population, whereas other subpopulations may be under the influence of an adaptive peak and be more strongly influenced by selection. Finally, all subpopulations would be sending and receiving migrants. A subpopulation may be a net recipient of migrants while it is in an adaptive valley, but perhaps at a later point become a net exporter of migrants as it climbed a particularly good adaptive peak. It is this constant shifting of the balance of migration and selection from one peak to another that is the reason that Wright named this the “shifting balance” process.

It is clear why this process is so attractive to me, and many others. It is a theory that combines the stochastic processes of genetic drift with the deterministic processes of selection at multiple levels to lead to not only adaptation, but also to the evolution of novel solutions to the process of adaptation. That said there are more than a few reasons to be skeptical about the process (Coyne, Barton and Turelli 1997 Evolution 51: 643-671). Perhaps the most obvious is that the different phases of this proposed process require very different population sizes. For example, the drift phase will presumably be most effective if populations are small and isolated. In contrast, in phase two, the phase of mass selection, larger population sizes will make selection more effective, and random drift less important. Finally, phase three, the phase of interdeme selection is most effective with high migration rates. Thus, we have phase one requiring isolation and small population size; phase two requiring large population size; and phase 3 requiring high migration. This suggests that the three phases would be working at odds with each other, and is one of the main conceptual reasons that the shifting balance process is often discounted. My main thought on this is that, as we have seen in past posts, this is not the first time that we have seen theory being used to discredit the intuition of very smart experimentalists, and to repeat my favorite adage, when theory and experiment are in conflict, the theory is wrong.

As I have written elsewhere, we (Wade and Goodnight 1998 Evolution 52: 1537-1553.) think that it is premature to dismiss Wright’s shifting balance process based on intuition and parsimony reasoning. That said, Wright’s model is old, and it needs to be brought into the 21st century. What I will do over the next several weeks is discuss each of the phases in turn, and suggest ways in which discoveries made over the last 70 years can be incorporated and used to modify our understanding of the process originally proposed by Wright.

Of Population Structure and the Adaptive Landscapes

Posted: June 5th, 2014 by Charles Goodnight

Last week I talked about adaptive topographies, and while my discussion may have done little more than add to the confusion, at least it got across Wright’s view that there are multiple selective peaks, which in essence means that there are multiple solutions to the problem of achieving high fitness.


Figure taken from http://locustofauthority.wordpress.com/

Wright was interested in how a population could move from one local peak (such as the center intermediate height peak in the figure above) to another higher peak. He considered several options, which are summarized in the figure I showed last week:


He speculated that there were only a few ways that a population could move from one peak to another. In his figure above A, B, and C all depict very large population sizes: in the figure N is population size, U is mutation rate, and S is selection intensity, thus 4NU and 4NS large means that the population size is very large relative to the mutation and selection rates. The point of these top figures is that he thought that the only way a very large population can move from one peak to another is by a change in the environment (C).

His point is well taken in that in large populations drift will have only a small effect, and the population’s position on the adaptive landscape will be dominated by the effects of selection. Selection can only drive a population “up hill” to higher fitness, thus there is no way for the population to move down hill in fitness and cross a valley to a higher adaptive peak.

However, there are two reasons this may not necessarily be true. The first is a model put forth by Weinreich and Chao (2005 Evolution 59: 1175), that in retrospect is rather obvious (Isn’t that true of all great models?). Consider a population of bacteria (remembering they are the prokaryote equivalent of haploid) that has two loci A and B. Further imagine that A2B2 is intermediate fitness, A1B2 and A2B1 are of low fitness and A1B1 is the high fitness genotype. Based on Wright’s reasoning if we started a population off in a chemostat fixed for A2B2 it could not evolve to become A1B1 because of the mixed genotype low fitness valley. However, what Wright was not figuring on was just how large these populations are. A typical chemostat might have 10^6 to 10^9 cells per milliliter. If we imagine a mutation rate of 10^-5 per locus then in each milliliter of chemostat there will be between 20 and 20,000 bacteria that have mutated from A2B2 to A1B2 or A2B1. These mutants are effectively a low fitness population that is one mutation away from moving to the higher peak. Obviously it is in constant flux: new mutants are being added continuously due to mutation, but lost due to their lower division rate, and being eluted from the chemostat. Nevertheless, there is this substantial standing population of single mutant low fitness bacteria. At the higher density (10^9 cells) we would expect roughly 2 double mutant cells per milliliter at any given time. Note in the figure below the chemostat has 650 ml, thus such a system should have between 1.3 and 1300 high fitness individuals at any given time, even before taking into account the effects of selection. Thus, in the very large population sizes of bacteria, two locus peak shifts, far from being rare, become nearly a certainty. Whether or not this works for the much smaller population sizes of multicellular organisms, or adaptive peaks that require the assembly of more than a few interacting loci remains an open question.
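The arithmetic behind these numbers can be laid out as a short sketch. The single-mutant count is computed as two loci times the mutation rate times the cell density. The double-mutant density of roughly 2 per milliliter at the higher cell density is taken from the text as given rather than derived, and the low-density value is simply that figure scaled down in proportion. The constant and function names are hypothetical.

```python
import math

MUTATION_RATE = 1e-5    # per locus, from the text
LOCI = 2                # A and B
CHAMBER_ML = 650        # chemostat volume, from the text
LOW_DENSITY = 1e6       # cells per ml
HIGH_DENSITY = 1e9      # cells per ml
DOUBLE_MUTANTS_PER_ML_HIGH = 2.0   # quoted in the text, not derived here


def single_mutants_per_ml(cells_per_ml):
    """Cells per ml carrying a mutation at either A or B."""
    return LOCI * MUTATION_RATE * cells_per_ml


def double_mutants_in_chamber(cells_per_ml):
    """Scale the quoted double-mutant density linearly with cell density."""
    per_ml = DOUBLE_MUTANTS_PER_ML_HIGH * cells_per_ml / HIGH_DENSITY
    return per_ml * CHAMBER_ML


if __name__ == "__main__":
    for density in (LOW_DENSITY, HIGH_DENSITY):
        print(f"{density:.0e} cells/ml: "
              f"{single_mutants_per_ml(density):.0f} single mutants per ml, "
              f"{double_mutants_in_chamber(density):.1f} double mutants in chamber")
```

This reproduces the range of 20 to 20,000 single mutants per milliliter and the range of 1.3 to 1300 double mutants in a 650 ml chamber.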


A typical chemostat setup.  Media is added to the chamber at a constant rate, and effluent is removed.  When properly set up such chambers will maintain a constant population size of the experimental organisms. (image taken from http://openi.nlm.nih.gov/detailedresult.php?img=2906461_1471-2180-10-149-1&req=4)

The second model is Gavrilets’ “holey landscape” model (Gavrilets 1997 TrEE 12: 307). In this model Gavrilets points out that real adaptive landscapes have very high dimensionality, and that high-dimensional graphs do not behave the same way as the three-dimensional graphs we are familiar with. He argued that with a large number of horizontal axes there would nearly always be a ridge along some dimension connecting two adaptive peaks. Thus, he argued that rather than an adaptive landscape of hills and valleys we should think of adaptive topographies as flat plains with holes in them. The holes represent fitness valleys that selection would prevent the populations from entering. In this model the flat plain means that there is no selection, and the changes are neutral. Without selection all populations will drift at random over the landscape regardless of the population size (although large populations will move more slowly). For more on this check out some of Østman’s work on the evolutionary dynamics of holey landscapes. For what it is worth, my own perspective is that Gavrilets’ model may or may not be correct. Either way it does not qualitatively change Wright’s model. In Wright’s model it is necessary to cross adaptive valleys; in Gavrilets’ model there is a high-dimensional ridge connecting the peaks. Either way it will take a combination of selection and drift to explore the landscape and move to a higher peak. After all, are we really surprised that the details of an 80-year-old model may not be exactly correct?


Gavrilets’ holey landscape model. Taken from http://evolvingthoughts.net/2012/12/evopsychopathy-4-adaptive-scenarios/

Returning to Wright’s figure at the top of the post, parts D, E, and F consider what happens when populations are smaller. In D he imagines that a population is very small. He suspected that such populations would be so small that selection would be relatively ineffective, and that they would be so dominated by drift (not to mention inbreeding depression) that they would have little chance of evolving to a new peak. In figure E he suggests that a medium size population would be the ideal balance between selection and drift, with drift allowing exploration of the landscape, but selection still being strong enough to cause it to tend to climb towards peaks. The problem with the E scenario is that a single population can only explore a small part of the landscape, and it is unlikely that it would stumble upon a higher adaptive peak.

This leaves us with F, which is a metapopulation structure, that is, a set of moderate size to small subpopulations that are joined together by a low level of migration. He felt that this population structure provided a set of subpopulations that were small enough to be strongly influenced by drift, and because there were a large number of subpopulations they could explore a much larger portion of the adaptive landscape. Finally, because they were tied together by low levels of migration, when a population evolved towards a new adaptive peak it could export this fitness solution to other populations.

So, this is why Wright focused on what are now known as metapopulations. He reasoned that it was only in these structured populations that you had the conditions that would allow the kind of random exploration of the adaptive landscape that he thought was essential for a population to discover and move to a higher adaptive peak.

Some thoughts on adaptive topographies

Posted: May 29th, 2014 by Charles Goodnight

Last week I discussed Wright’s “seven generalizations” about populations. His seventh generalization, that there were multiple selective peaks, led him to develop his famous “adaptive topography” metaphor. As Provine (2001) discussed, there is considerable controversy over exactly what Wright meant by an adaptive topography. My understanding is that Wright never meant his topography to be a formal model, and as a result much of the ambiguity may be the result of Wright himself not being clear.

There are three big ambiguities that need to be resolved before his adaptive topography can be formalized. The first is whether a point on the adaptive topography refers to an individual or a population. According to Provine, and I am inclined to agree, Wright himself was ambiguous on this. If you look at the pictures in his 1977 book (and 1932 paper) it appears to me that he is thinking of a point on the topography as being an individual. The reason I say this is that in that paper the populations are drawn as regions on the topography, implying that indeed an individual represents a single point. I think this makes sense and I will stick with that.



Figure 4 from Wright (1932, Proc. VI Intern. Congr. Genet. 1:356-366), image taken from (http://pleiotropy.fieldofscience.com/2014/01/sewall-wrights-last-paper.html)

The second issue is what the axes are. Since I have announced that a point on the surface represents an individual, the vertical (z axis) is the fitness of an individual. More problematical are the “horizontal” axes. There are two problems with these. First, what are they? In the literature you can find examples where the axes are different trait values, and thus aspects of the phenotype. Indeed, the experimental literature almost entirely treats the axes as if they are some aspect of the phenotype. The problem with this is that we would like the adaptive topography to give insights into evolutionary change. With gene interaction it is quite possible to have two different genotypes giving the same phenotype, and to have these genotypes be incompatible. This is not a hypothetical issue: there are several examples of outbreeding depression in crosses between organisms that are phenotypically very similar (e.g., Edmands 1999 Evolution 53: 1757-68). Unfortunately these populations, to the extent that they have similar phenotypes, would show up on the same area of the adaptive landscape. From Wright’s discussion it appears that he is, at least at times, thinking of the axes as aspects of the genotype, possibly allele frequency. Of course this leads to its own problem. If the axes are allele frequencies at a pair of loci, and the points are individuals, then for any given individual the value of an axis can only take on three values: 0, 1, or 2 copies of a particular allele. Nevertheless, from Wright’s figure it is clear that he is thinking of the axes as taking on continuous values. Gavrilets (1997 Trends in Ecology and Evolution 12, 307-12) handles this by suggesting that the axes take on discrete values, but that it is more convenient to represent genotype space as a continuous function. Also, there is the problem that fitness is a function of phenotype, not genotype.

The final problem with the horizontal axes is that in most illustrations of Wright’s metaphor there are three axes, but of course, there are vastly more axes. If we include loci in our topography, we need, say, one (or several) axes for each locus. If we take the view I have been pushing, we would also need axes describing the nature of interacting partners and the cultural milieu, and the number of axes becomes very large indeed. Thus, the final ambiguity becomes: what should and should not be included in the horizontal axes?

Here is where this becomes an opinion piece: Can we come up with a way to put all of this into a single framework that allows a more formal modeling? I think we can. The way I would resolve this problem is by embracing the complexity, and recognizing that there are a near infinite number of axes, and certainly more than we can deal with. We should also recognize that conceptually there is a mapping from the combined patterning and nonheritable factors to the phenotypic compartment. Finally we need to recognize that only the phenotype affects fitness.

I suggest that since it is only phenotype that affects fitness, the first step in developing a general framework is to consider an adaptive topography that has aspects of the phenotype as the horizontal axes. This actually saves us nothing since, for example, differences in alleles at a locus are actually chemical sequence differences in the DNA, and that aspect of a gene is part of the phenotype (oh well). Also, there will always be aspects of the phenotype that we don’t (either can’t or choose not to) measure. Thus, again we are back to our problem of a plethora of axes, certainly more than we can deal with, and some that we are unable to deal with.


An orthographic projection of a cube onto a plane. The resulting object is a square. (from http://illuminations.nctm.org/Lesson.aspx?id=4228)

Generally, we will be interested in only a few traits (measured aspects of the phenotype). If we measure a large number of individuals for those traits and for fitness (have fun with that) then we are in effect doing an orthographic projection onto those traits. The result will be that different individuals with the same phenotype at the traits measured will have different fitnesses (what else is new). The fitness value at any point will then be similar to the genotypic value of quantitative genetics. That is, it will be the mean fitness of individuals who have the values of the measured traits at that point, averaged across all other aspects of the phenotype.
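To make the projection idea concrete, here is a small sketch. The individuals, the trait names ("size", "colour") and the fitness values are all invented for illustration; the only point is that projecting onto the measured trait means averaging fitness over everything we did not measure.

```python
from collections import defaultdict
from statistics import mean


def project_fitness(individuals, measured):
    """Mean fitness for each combination of measured trait values,
    averaged across all unmeasured aspects of the phenotype."""
    groups = defaultdict(list)
    for ind in individuals:
        key = tuple(ind[trait] for trait in measured)
        groups[key].append(ind["fitness"])
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical data: 'size' is measured, 'colour' is an unmeasured trait.
individuals = [
    {"size": 1, "colour": 0, "fitness": 0.8},
    {"size": 1, "colour": 1, "fitness": 1.2},
    {"size": 2, "colour": 0, "fitness": 1.5},
    {"size": 2, "colour": 1, "fitness": 0.9},
]

surface = project_fitness(individuals, measured=("size",))
for key, value in sorted(surface.items()):
    print(key, round(value, 3))
```

Individuals with the same size have different fitnesses because they differ in colour, and the projected surface records only their mean.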

Of course, we can’t forget that we are frequently interested in genotypes. To deal with this it seems to me we can imagine that there is a lower level much larger set of axes representing the patterning compartment (genes, and other heritable factors) and the environmental compartment (non-heritable factors).



The genotype (e.g., patterning compartment)-phenotype map leads to a much larger set of axes, and different genotypes that map to the same phenotype. (image copied from http://en.paperblog.com/genotype-phenotype-maps-and-mathy-biology-317240/ who cites: A testable genotype-phenotype map: Modeling evolution of RNA molecules. In: Lässig, M. and Valleriani, A., editors, Biological Evolution and Statistical Physics, pp. 56–83. Springer-Verlag, Berlin, 2002).

The immediate effect of this is that because different heritable elements can create the same (or vanishingly similar) phenotypes, in translating from a phenotype based topography to a patterning compartment based topography the same phenotype (and thus the same fitness) will be represented on different parts of the graph. Again, if we are interested in only a few loci (or other aspects of the patterning compartment) then we can do an orthographic projection onto those loci, and proceed with the subset. Also, it is probably useful to remember that a topography with genotype or other heritable elements as the horizontal axes is itself a projection from an underlying topography that also includes axes associated with non-heritable aspects of the environment.

Thus the point of this essay is that we can incorporate all of the different concepts, including the horizontal axes representing phenotype, or genotype, or more generally any heritable element, or even non-heritable aspects of the environment. The result is that the underlying conceptual topography is impossibly complex, nevertheless we can work with the concept by imagining that we are doing orthographic projections onto the elements of the genotype or phenotype that we are interested in. The nice thing about this is that, like the phenotypic view in general, this is an open ended view of adaptive topographies that can be easily expanded to include things like social interactions, or contextual traits.

As a final note: if this essay sounds a bit confused, it is because I am also confused by this topic.

Sewall Wright’s Seven Generalizations about Populations

Posted: May 22nd, 2014 by Charles Goodnight

Once again I seem to be reorganizing my plan of attack, and this will be a big one. I think it would be entertaining to move over to a discussion of Wright’s shifting balance theory. This is not a minor topic, and indeed, I am told that the two longest papers ever published in Evolution were on this topic (Coyne, Barton and Turelli 1997, Evolution 51:643-671; Wade and Goodnight 1998, Evolution 52:1537-1548; see also Coyne, Barton and Turelli 2000, Evolution 54:306-317; Goodnight and Wade 2000, Evolution 54:317-324 – FYI there is a hidden message in Goodnight and Wade. Write down the first letter of each paragraph).

It is surprising I haven’t talked much about Wright up to this point. He is one of my heroes, and one of the first luminaries of evolutionary biology I ever met—ok, lying there a bit. I was at the University of Chicago when Wade, Lande, Arnold, Teeri, and later Schemske were all starting out, but they really don’t count because none of them had tenure when I first met them. . .


One of the classic pictures of Sewall Wright while he was at the University of Chicago.

One of the fascinating things about Wright is that, unlike Fisher, he was an experimentalist, rather than a pure theoretician. The problem with doing experiments, of course, is that they are messy and rarely fit into the simple schemes that we develop for our models. Wright, as is often noted, and as is obvious from his books, bred a lot of guinea pigs and carefully examined the genetics of their coat colors and patterns.


Guinea pigs come in a wide array of colors and patterns (http://emmasguineapigs.blogspot.com/p/cavy-colours.html).


Wright’s view of developmental genetics.  (Wright 1968: Evolution and the Genetics of Populations, Vol. 1)

This is a situation we have seen before. True experimentalists are confronted with complexity that theoreticians are inclined to ignore. The difference is that Wright was a genius, and both a good experimentalist and a brilliant theoretician, putting him in the position of being both aware of this complexity and having the mathematical skills to actually do something with it.

I actually think that there is another piece of history that may have been realized by Crow (Crow 1998 Genetics 148:923-928), but has generally been ignored. Wright received his Ph.D. in 1915 under the direction of William Castle, first at the University of Illinois, and ending at Harvard. Parallel to this, Shull began his work on corn genetics at the University of Illinois in 1905 (and coined the term heterosis in 1914), and East was similarly working on corn genetics at the Connecticut State Agricultural College. What is important here is that Connecticut State is where the Harvard scientists went during the summer, both because it was cooler and because there was more land for field experiments. This was the time when the concepts of inbreeding and hybridization and the “magic” of hybrid corn were first being discovered. Importantly, this was also the time and place where Wright was a graduate student, and my guess is that he would have been interacting with the corn breeders on a more or less daily basis. Thus, it is interesting to speculate that the development of hybrid corn gave us not only hybrid corn, but also the shifting balance theory.

In any case, whether from guinea pigs or corn or both, Wright came up with a set of seven generalizations about populations. In his own words (Wright 1968: Evolution and the Genetics of Populations, Vol. 1):

“There are a number of broad generalizations that follow from this netlike relationship between genome and complex characters. These are all fairly obvious but it may be well to state them explicitly.

(1)   The variations of most characters are affected by a great many loci (the multifactor hypothesis).

(2)  In general, each gene replacement has effects on many characters (the principle of universal pleiotropy)

(3)  each of the innumerable possible alleles at any locus has a unique array of differential effects on taking account of pleiotropy (uniqueness of alleles)

(4)  The dominance relation of two alleles is not an attribute of them but of the whole genome and the environment. Dominance may differ for each pleiotropic effect and is in general easily modifiable (relativity of dominance).

(5)  The effects of multiple loci on a character in general involve much nonadditive interaction (universality of interaction effects)

(6)  Both ontogenetic and phylogenetic homology depend on calling into play similar chains of gene-controlled reactions under similar developmental conditions (homology)

(7)  The contributions of measurable characters to overall selective value usually involve interaction effects of the most extreme sort because of the usually intermediate position of the optimum grade, a situation that implies the existence of innumerable different selective peaks (multiple selective peaks).”

This is the genetics that Wright envisioned. It was a world in which all traits are determined by a large number of loci (multifactor hypothesis), and each of those loci affected a large number of traits (universal pleiotropy) and interacted intensively with each other (universal interaction effects). The conclusion from this is that there are multiple ways to achieve high fitness (multiple selective peaks). What I find remarkable about this is that this theory was laid out in 1931 (Wright 1931, Genetics 16: 97-159), and yet it is clearly the outline of a complex system model, even down to using words like “netlike relationship” and “complex characters”. Basically, this model was developed before computers, before DNA, and really before we knew anything about genes other than that they behaved in a Mendelian fashion. In contrast, complexity theory, being generous, traces back to the 1940s at the earliest (http://www.ralph-abraham.org/articles/MS%23108.Complex/complex.pdf).

Wright was interested in how evolution could occur in this complex system (in both the formal and informal sense) he was envisioning. His world was very different from Fisher’s. The big difference is that the genetic complexity he was embracing, and his belief that species tended to be divided into small semi-isolated demes (more on that on another day), resulted in his seventh generalization, that there were multiple selective peaks. In contrast, Fisher thought that migration rates were generally large enough that the species could be considered approximately a single random mating population. In this situation, regardless of the amount of gene interaction, there will be only a single adaptive peak. Thus, the big difference between their world views was whether we could model evolution as a single fitness peak (Fisher), or whether we needed to model it as multiple fitness peaks (Wright).


Adaptive landscapes. Top: Fisher’s world view implies a simple fitness landscape with a single adaptive peak. No matter where it starts on the landscape, with only mutation and selection a population will eventually evolve to the top of the peak. Bottom: Wright’s world view explicitly incorporates a complex landscape with multiple adaptive peaks. In this landscape, with only mutation and selection a population will always climb the nearest peak whether or not it is the “optimal” solution. Once on a local adaptive peak the population will be stuck there. (image taken from http://www.terrorismanalysts.com/pt/index.php/pot/article/view/30/html)

In Wright’s view Fisher’s model, in which mutation and recombination generated variation and selection sorted out the good alleles from the bad ones, was simply not adequate to describe how evolution would occur on these complex landscapes. Thus, we can imagine his goal was to describe how evolution actually did occur on this complex landscape.

Wright’s model, as I have emphasized, was published in the 1930s. A lot has changed in the last 80 plus years. In subsequent posts I will describe Wright’s shifting balance process, but not from the historical perspective of what Wright envisioned, rather from my perspective in 2014 embracing my (obviously insufficient) knowledge of modern biology. I hope people will chime in when I have missed things or gotten things wrong.



Soft Selection: Why it is Multilevel Selection

Posted: May 14th, 2014 by Charles Goodnight

It has come to my attention that it makes sense to spend a blog entry talking strictly about contextual analysis and soft selection. The problem, which as Okasha (2006 Evolution and the levels of selection) puts it some “theorists find deeply counter-intuitive”, is that in soft selection every group puts out exactly the same number of offspring individuals each generation. As a result there can be no variance in fitness among groups, and yet, using contextual analysis one would conclude that despite (and actually because of) this lack of variation in fitness among groups there must be group selection acting. As evidence I can stand up a few of my favorite straw men, West, Griffin and Gardner (2007, J. Evol Biol 21:374-385, p. 380):

However, now consider that, because of localized resource competition, all groups have a fixed productivity (soft selection; Wallace, 1968) and all competition for reproductive success occurs within the group. . . . Contextual analysis therefore identifies both an impact of individual-genotype and also an impact of group-genotype on the individual’s fitness, and hence diagnoses the operation of both individual and group selection. Again, this is undesirable, as group selection should not be in operation when all groups have the same fitness.

I have to mention they earn full credit for cluelessness, which is immediately obvious in this quote with the discussion of “individual” and “group” genotype. (Um, we are phenotype view evolutionary biologists. Get with the program guys.), but also because the quote is from a section titled “There is no formal theory of group selection”. (Dang, and all this time I thought quantitative genetics WAS a formal theory.)


Clueless: A classic movie that, of course, is not relevant to this week’s post. (Hey, diss group selection, prepare to be dissed back)

Soft selection is a selection scheme put forward by Wallace (1968 Polymorphism, population size, and genetic load. In Population biology and evolution. Lewontin RC (ed.), pp. 87-108) in which there is a set of populations. At the end of each reproductive cycle each population produces the same number of migrants, which are the winners of individual selection acting within the population.


Soft selection: There is individual selection within each population for a fixed number of migrants that will be produced at the end of the reproductive cycle. This is contrasted with hard selection in which individual fitness is unconstrained by group membership. (http://pedrovale.wordpress.com/2013/07/08/killing-them-softly-managing-pathogen-polymorphism-and-virulence-in-spatially-variable-environments/)

The point is, because there is no variation in output of the different groups, there is by definition no variation in mean group fitness, and as a result it is reasoned that there is no group selection. Indeed this is the conclusion reached by Wade (1985 Am Natur. 125, 61-73), in which he states “For soft selection this covariance [between group mean relative fitness and group mean phenotype] is zero by assumption. Even if the within-group genotypic fitnesses, Wij, were frequency dependent, the assumption [of constant mean group fitnesses] would prohibit the operation of group selection.” I doctored the quote to remove math notation that is specific to that paper. However, this is an interesting problem. Intuitively, in soft selection the fitness of an individual is indeed a function of group membership. After all it is the phenotype of the individual relative to the population mean that determines its fitness, with an intermediate phenotype individual having a high fitness in a group of low phenotype individuals, and a low fitness in a group of high phenotype individuals.

So, what is going on? I think the easiest thing to do is to do some math. Further, to remove the politics, let’s not think about contextual analysis. Instead, think of selection on two correlated traits, say body length and body weight. These we can imagine to be correlated because generally longer animals are also bigger. In this population we can imagine that the population has a mean length of 60 inches and a mean weight of 100 pounds. Further, there is a correlation between length and weight of 0.5, giving a phenotypic covariance matrix of (Phenotypic NOT genotypic covariance matrix. Also, keeping it as simple as possible):

[Equation 1: the 2 × 2 phenotypic covariance matrix P for length and weight]

So, now the problem. Because these traits are correlated, there is not only direct selection, that is if you select on length, length will change, but also indirect selection: Selecting on weight will also change body length. It turns out we can separate the effects of selection acting directly on a trait (direct selection) from selection acting on another correlated trait (indirect selection). The vector that does this is called the selection gradient, β. It is just like the selection vector, except that it mathematically removes the effects of indirect selection. Thus, if we have selection only acting on body length then the selection gradient might look like this (again keeping it simple)

[Equation 2: the selection gradient β, with entries 1 (length) and 0 (weight)]

Indicating a selection strength of 1 on length, and a selection strength of 0 on weight. Of course, we are asserting this direct selection, but what we actually observe is S, the selection vector, which includes both the effects of direct AND indirect selection. So what would S look like? To determine this we need to recognize that the gradient is actually equal to β = P⁻¹S, where P⁻¹ is the inverse of the phenotypic covariance matrix:

[Equation 3: β = P⁻¹S]

Then doing a bunch of algebra nobody needs to know we can solve for S, and discover that:

[Equation 4: the resulting selection vector S for length and weight]

In other words, if we apply selection only on length because the traits are correlated we will also see a change in weight.
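For readers who want to see the algebra, here is a minimal numerical sketch. The original equation images are not reproduced here, so the numbers are my assumption: both traits are standardized to unit variance, which with a correlation of 0.5 gives the covariance matrix P used below. The helper functions are written out by hand so nothing beyond the Python standard library is needed.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Assumed phenotypic covariance matrix: unit variances, correlation 0.5.
P = [[1.0, 0.5],
     [0.5, 1.0]]
beta = [1.0, 0.0]          # direct selection on length only

# Since beta = P^-1 S, the observed selection vector is S = P beta.
S = mat_vec(P, beta)
print("S =", S)

# Going the other way recovers the gradient from the observed S.
beta_back = mat_vec(inverse_2x2(P), S)
print("beta recovered =", [round(x, 6) for x in beta_back])
```

Under these assumed values S has a nonzero entry for weight even though the gradient on weight is zero, which is the indirect selection described above.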

OK, Now contextual analysis and soft selection. In the case of soft selection by assumption we have:

[Equation 5: Cov(group mean trait, group mean fitness) = 0]

In other words, we are assuming that there is no covariance between the group mean trait and group mean fitness. I am running out of space, but basically if we use only simple covariances:

[Equation 6: the selection vector S expressed as simple covariances]

But, of course, what is important is not S, but β. Thus, if we want to calculate the selection gradient we need to calculate β:

[Equation 7: the selection gradient β = P⁻¹S for the individual and group traits]

The point is that in order to have no variation in fitness among groups you need to have selection at the group level to remove that variation in fitness.

I could have made the same story with the original example of length and weight. If you want to select for longer animals without changing their weight you will need to select against weight. That is, you will need to select for long skinny animals. In other words, lack of covariance between fitness and a trait correlated with one under selection does not come for free. You need to actively apply selection to remove that covariation.
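Both versions of the story can be checked numerically. This is a sketch under assumptions of my own choosing, since the original equation images are not reproduced here: traits standardized to unit variance, a hypothetical among-group variance of 0.25, and a hypothetical covariance of 0.75 between individual fitness and the individual trait. For the contextual analysis case the two "traits" are the individual phenotype and the group mean phenotype, whose covariance equals the variance of group means when groups are of equal size.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def selection_gradient(P, S):
    """beta = P^-1 S."""
    return mat_vec(inverse_2x2(P), S)


# Length and weight: observed change in length but none in weight.
P_lw = [[1.0, 0.5], [0.5, 1.0]]          # assumed unit variances, r = 0.5
beta_lw = selection_gradient(P_lw, [1.0, 0.0])
print("length/weight gradient:", [round(x, 3) for x in beta_lw])

# Soft selection: zero covariance between fitness and the group mean trait.
var_among = 0.25                          # hypothetical variance of group means
P_ctx = [[1.0, var_among],                # Var(z),        Cov(z, zbar)
         [var_among, var_among]]          # Cov(z, zbar),  Var(zbar)
S_ctx = [0.75, 0.0]                       # hypothetical Cov(w, z); Cov(w, zbar) = 0
beta_ctx = selection_gradient(P_ctx, S_ctx)
print("individual/group gradient:", [round(x, 3) for x in beta_ctx])
```

In both cases the second gradient comes out negative. With these made-up numbers the group level gradient is equal and opposite to the individual level gradient, which is what it takes to cancel the covariance that individual selection would otherwise generate among groups.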

Thus, the point of this whole story is that the lack of covariance between group mean fitness and the group trait is NOT evidence for lack of group selection. Far from it, it is in fact evidence that group selection is acting. To believe otherwise is to simply not understand how selection on correlated characters works.

Why I Like the Multilevel Selection Approach

Posted: May 9th, 2014 by Charles Goodnight

For the past two weeks I have been rather destructionist (is that a word), with my diatribe against kin selection. It seems to me that if you are going to tear down a structure and declare it not useful then you had better be willing to provide an alternative and explain why your alternative is a better choice. With that in mind, this week I will be talking about why I think the multilevel selection approach is the best, and possibly only legitimate, approach for studying social evolution.

In MLS theory the distinction between selection and the response to selection is explicit. MLS theory is an outgrowth of quantitative genetics. The classic breeder’s equation, R = GP⁻¹S, divides the response to selection, R, into the ecological process of selection, P⁻¹S, and the mechanism of inheritance, G. This is important because it also provides a guide to research. The reason that we did experimental studies of group selection in the laboratory is that it provided a means of studying G. That is, if we experimentally apply group selection, does it cause a response to selection? The answer, of course, is yes. I could go on with how this response was explored in some detail, but the point is these lab studies were explicitly designed to study G, the inheritance part of the breeder’s equation. On the other hand, the contextual analysis studies I have been talking about are primarily useful as phenotypic analyses that can be applied to natural populations. Thus, we have a growing number of studies demonstrating that multilevel selection is quite common in nature. These studies tell us nothing about the inheritance, for the simple reason that the research is specifically designed to inform us about P⁻¹S. The point is that experimental studies of inheritance are, from both a conceptual and a practical perspective, very different from studies of selection. Thus this distinction is not a minor triviality of the mathematics; it is a central and useful feature of the theory.
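As a rough illustration of how the two parts separate, here is a sketch of the multivariate breeder’s equation with made-up numbers. The matrices P and G and the vector S below are hypothetical values chosen for the example, not estimates from any study.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


P = [[1.0, 0.5], [0.5, 1.0]]   # phenotypic covariance matrix (hypothetical)
G = [[0.4, 0.1], [0.1, 0.3]]   # additive genetic covariance matrix (hypothetical)
S = [1.0, 0.5]                 # observed selection vector (hypothetical)

# Selection: the ecological part, measurable from phenotypes and fitness alone.
beta = mat_vec(inverse_2x2(P), S)

# Inheritance: G turns the gradient into a response across generations.
R = mat_vec(G, beta)

print("gradient beta =", [round(x, 6) for x in beta])
print("response R    =", [round(x, 6) for x in R])
```

The first step needs no genetic information at all, and the second needs nothing about fitness, which is why field studies of selection and lab studies of inheritance can be done separately.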

In MLS theory fitness is seen as a function of phenotype. In kin selection theory the modeled relationship is between fitness and gene. Efforts have been made to relax this, but ultimately the method is about the effect of single genes or at least very simple genetics on fitness. In MLS theory the modeled relationship is between fitness and phenotype. This is much more realistic. Phenotypes are what we measure in real populations. The relationship between phenotype and genotype is potentially very complex, and certainly not knowable in real field studies. The MLS approach acknowledges this reality, and as a result it is a method that can realistically be used to study natural selection in the wild. This is an area where kin selection simply fails.

In MLS theory the relationship between genotype and phenotype is acknowledged to be complex. In kin selection theory a single gene (or aspect of the genotype) is considered to affect both the individual trait (the cost) and the group trait (benefit). In MLS theory the group and individual trait are considered to be separate but correlated traits, and the genetics are expressed in the form of the G matrix. This allows for the simple system of kin selection (in which the correlation between expression at the two levels is 1), but allows for the possibility that they are not perfectly correlated. As an example, in a typical kin selection model you are either selfish or an altruist. However, if the correlation is not perfect, then you could get different degrees of “efficiency” for altruists. Then selection might favor the phenotype (note not genotype!) that provides the maximum altruism for the minimum cost. In the putative Haldane case, rather than sacrificing his life for his two brothers maybe he just has to cut off his arm. As importantly, as we discover added complexity in the nature of inheritance the “inheritance” part of the equation can be modified as needed, either by modifying the G matrix or, when a clever enough theoretician comes along, by replacing it with something new (Anyone want to take on coskewness tensors?).

MLS theory focuses on similarity regardless of cause. Hamilton’s rule in kin selection format:

[Equation image: Hamilton’s rule, kin selection version]

can be reconstructed in MLS format in which case the equation comes down to:

[Equation image: Hamilton’s rule, contextual analysis version]

This makes the point that any cause of variance among groups, or equivalently (because of the bizarre nature of statistics) any cause of similarity within groups, can be on the right side of this equation. Kin selection, with its focus on shared genes as the cause of similarity, falls short here. What else can cause similarity within groups? How about shared cultural heritage, or traditions? How about policing enforcing uniformity? This hugely broadens the range of conditions under which altruism can evolve.
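One way to see how a variance ratio can end up on the right side is the following sketch. It assumes the standard contextual analysis regression and the Price equation with faithful transmission, equal group sizes, and a group mean that includes the focal individual. The notation is mine and may not match the original equations term for term.

```latex
% Contextual analysis: relative fitness w of individual i in group j regressed
% on its own trait z_{ij} and on its group mean \bar{Z}_j.
\begin{align*}
w_{ij} &= \alpha + \beta_{ind}\, z_{ij} + \beta_{grp}\, \bar{Z}_j + e_{ij} \\[4pt]
% Price equation without transmission bias; residuals are uncorrelated with z.
\Delta \bar{z} &= \mathrm{Cov}(w, z)
  = \beta_{ind}\,\mathrm{Var}(z) + \beta_{grp}\,\mathrm{Cov}(z, \bar{Z}) \\
  &= \beta_{ind}\,\mathrm{Var}(z) + \beta_{grp}\,\mathrm{Var}(\bar{Z}) \\[4pt]
% Altruism: \beta_{ind} < 0 (individual cost), \beta_{grp} > 0 (group benefit).
\Delta \bar{z} > 0 \;&\Longleftrightarrow\;
  -\frac{\beta_{ind}}{\beta_{grp}} < \frac{\mathrm{Var}(\bar{Z})}{\mathrm{Var}(z)}
\end{align*}
```

Read next to the familiar form of Hamilton’s rule, c/b < r, the cost-to-benefit ratio on the left becomes a ratio of selection gradients, and relatedness on the right becomes the fraction of the total variance that lies among groups. Nothing in the derivation cares why that variance is there.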

The MLS approach is not obsessed with the evolution of altruism. The case where group and individual selection are acting in opposition is certainly interesting, but in kin selection it is the ONLY thing that is interesting. This is because it is an optimality approach, and when they are acting in the same direction the equilibrium (fixation of the overall good gene) is trivial and uninteresting. Because the MLS approach measures selection as it is acting, there is no need to focus solely on altruism. In general the word “altruism” is relatively rare in the MLS literature. It is an interesting sidelight, not the main focus of the research. This has led to some interesting findings that are generally not appreciated outside of the MLS community. For example, individual selection interferes with group selection and is itself often ineffective due to indirect genetic effects. As a result, the overall response to simultaneous group selection and individual selection acting in concert is often less than simply the response to group selection acting alone, and both are generally greater than the response to individual selection acting alone.

The MLS approach treats selection as a competing rates problem. Optimality approaches such as kin selection can at best tell us where a population “ought” to go, all things being equal. The problem is that all things are not equal, populations are not optimal, and there are a thousand contingencies keeping them away from simplistic optima. Because the MLS approach deals with the here and now, and with how things are changing under the present conditions, many of these problems go away. MLS approaches can be used to study stabilizing selection – either classic stabilizing selection at either the group or individual level, or stabilizing selection due to group and individual selection acting in opposition. However, note that I am not calling the stable equilibrium the “optimum”. In the example of group and individual selection acting in opposition, the equilibrium point will be determined by a combination of the strength of selection at the two levels, the heritability of the group and individual level traits, and the genetic correlation between them.

In sum, I think that MLS is simply a better conceptual framework for thinking about social evolution. MLS theory fits firmly into the phenotypic approach, whereas kin selection theory, because of its focus on genes, is basically incompatible with the phenotypic approach. MLS is by no means a mature theory, and there is much still to be done. But, heck, that makes it exciting. The important point is that unlike kin selection theory, which is sadly stuck back in the 1960s, MLS theory is an open-ended field that is ripe to grow along with our increasing understanding of the subtleties of evolution.


Dynamical models of multilevel selection: Another problem with Kin selection

Posted: May 1st, 2014 by Charles Goodnight

First off, if you haven’t seen it, check out the American Museum’s online collection of photographs. I haven’t had a chance to really explore the hundreds of thousands of photos they have, but I am certain there are some real gems in there.


One of the photos from the American Museum of Natural History. This is G. G. Simpson at his desk, at the museum. AMNH has over a million photographs on line.

It turns out I am not quite done “dissing” kin selection, although my discussion this time is nothing I would have thought of as a problem. What I want to talk about is a pair of papers that appeared in Evolution in a special section on multilevel selection that I edited.

The first of these is my own paper on direct fitness and contextual analysis (Goodnight. 2013 Evolution 67, 1539-48). In this paper I work through the relationship between direct fitness and contextual analysis. It turns out that both of these approaches are using multiple regression to analyze selection. In direct fitness the equation is:

[Equation image: dynamical models eq 1]

Where W is absolute fitness, (ind) is the individual trait, (grp) is the trait in interacting partners, and x is a measure of genotype. Without loss of generality I converted absolute fitness to relative fitness (come on guys, working with absolute fitness is for chumps!), and I recognize that because these models are so naïve there must be a function relating genotype to phenotype. Thus, there is a value [inline equation image: dynamical models eq 2] that relates a change in phenotype to a change in genotype. So multiplying through by that value we get the equation for contextual analysis:

[Equation image: dynamical models eq 3]

which is really the same equation, but to me much more aesthetically pleasing for two reasons. First, as I said, working with absolute fitness is for chumps (AND it makes a difference for contextual analysis), and second, get real, we cannot measure “genotype”, hell, I don’t even know what that means, whereas I have a very clear idea of what I mean when I measure the phenotype.
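For readers who have not seen contextual analysis in practice, the following is a rough sketch of the kind of multiple regression involved: relative fitness regressed on the individual phenotype and on the group mean phenotype. The data are simulated, and the group structure, sample sizes, and “true” coefficients are all invented for illustration. This is not the analysis from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n_groups, group_size = 200, 10

# Simulated phenotypes: a shared group effect plus an individual deviation,
# so that there is variance both among and within groups.
group_effect = rng.normal(0.0, 1.0, size=(n_groups, 1))
z = group_effect + rng.normal(0.0, 1.0, size=(n_groups, group_size))

# Contextual trait: the mean phenotype of each individual's group.
Z_bar = np.repeat(z.mean(axis=1, keepdims=True), group_size, axis=1)

# Hypothetical absolute fitness: the trait is costly to the individual
# but beneficial to everyone in the group.
true_b_ind, true_b_grp = -0.3, 0.8
W = 5.0 + true_b_ind * z + true_b_grp * Z_bar
W = W + rng.normal(0.0, 0.5, size=z.shape)

# Convert absolute fitness to relative fitness before the analysis.
w = W / W.mean()

# Multiple regression of relative fitness on individual and group traits.
X = np.column_stack([np.ones(z.size), z.ravel(), Z_bar.ravel()])
coef, *_ = np.linalg.lstsq(X, w.ravel(), rcond=None)
intercept, b_ind, b_grp = coef

print("individual selection gradient:", b_ind)
print("group selection gradient:", b_grp)
```

The two partial regression slopes are the quantities of interest: a negative slope on the individual trait with a positive slope on the group mean is the signature of individual and group selection acting in opposition. Only phenotypes and fitness go in; no genotype is needed.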

Anyway, be that as it may, the end result is that the difference between the direct fitness approach (or neighborhood modulated fitness approach) and the multilevel selection approach of contextual analysis does not lie in the equations they use. Rather it lies in HOW the equations are used. In the direct fitness approach the equation is solved for the point where dW/dx = 0. Mathematically this has to be one of three types of points: a fitness maximum, a fitness minimum, or an inflection point. Simple inspection can distinguish between these three possibilities (or second derivatives if you prefer). In contrast, in contextual analysis the slope is analyzed at the point where the population is currently residing, and dw/dz becomes a measure of the rate of change in relative fitness as a result of multilevel selection. In any case, it is quite reasonable to argue that kin selection and multilevel selection are very similar if not the same thing.
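The two uses of the same slope can be sketched side by side. The fitness function below is entirely hypothetical (a single smooth peak), and the helper names are mine. The first half plays the optimality game of finding and classifying the point where the slope is zero; the second half simply reports the slope where the population currently sits.

```python
def fitness(x):
    """Hypothetical fitness function with a single peak at x = 2."""
    return 1.0 + 4.0 * x - x ** 2

def slope(f, x, h=1e-5):
    """Central-difference estimate of the first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def curvature(f, x, h=1e-4):
    """Central-difference estimate of the second derivative."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

def find_stationary_point(f, lo, hi, tol=1e-8):
    """Bisection on the slope; assumes one sign change of the slope on [lo, hi]."""
    s_lo = slope(f, lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        s_mid = slope(f, mid)
        if (s_mid > 0) == (s_lo > 0):
            lo, s_lo = mid, s_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def classify(f, x, eps=1e-3):
    """Second-derivative test at a stationary point."""
    c = curvature(f, x)
    if c < -eps:
        return "maximum"
    if c > eps:
        return "minimum"
    return "inflection or flat"

# Optimality use: where is the slope zero, and what kind of point is it?
x_star = find_stationary_point(fitness, 0.0, 5.0)
kind = classify(fitness, x_star)

# Contextual analysis use: what is the slope where the population is now?
x_now = 0.5
slope_now = slope(fitness, x_now)

print("stationary point:", x_star, kind)
print("slope at current population value:", slope_now)
```

The first answer says where the population “ought” to end up. The second says how fast, and in which direction, selection is pushing it right now.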

Next, we turn to Simon, Fletcher and Doebeli (2013 Evolution 67, 1561-72.). This is a dynamical model of two level selection using a continuous-time Markov chain, and a companion deterministic partial differential equation model. One of the first things I got out of this model is that Burt Simon is a better mathematician than I am, but as far as my little mind is capable of understanding such things, this model is quite complete, and an excellent general model of multilevel selection. Without going into details, they develop a pair of partial differential equations. In the first it is assumed that there is no change in the number or types of groups: basically, the frequency of individual types is allowed to change, but the overall change in group types is zero:

[Equation image: dynamical models eq 4]

where αi is the growth rate (births minus deaths) of the ith type of individual, xi is the trait value of the ith individual, and t is time. They then go on to argue that there are group level processes (group extinction, recolonization, fusion, fission, differential growth) that enter into the equation. On the other hand, if no changes in individual fitness are allowed then:

[Equation image: dynamical models eq 6]

Thus, and without going into detail, they then show that the overall change in the trait is:

[Equation image: dynamical models eq 5a]

Please remember I am not doing this model justice, so either believe that what I say is true, or read it yourself. (Word of advice: as Reagan, citing an old Russian proverb, said: “trust, but verify”.)

The result of this is that they argue for three cases. If a selective event changes only the αi (the growth rate of the ith type) without affecting the distribution of group types, then only individual selection is acting. If the selective event changes the distribution of group types without affecting the growth rate of individual types, it is a pure group selection event. Finally, if both change, it is a multilevel selection event.

They then go through two examples that show the logic of what they are talking about, and eventually ask whether inclusive fitness can do the same job, that is, whether there is in all cases a function bi that successfully combines individual fitness effects and group fitness effects. The answer to this is no. They point out that the two level approach can be solved directly, but the one level approach necessarily requires the prior solution to the two level approach. In their words, the reductionist approach is not “dynamically sufficient”, and there is a real difference between multilevel selection and kin selection models.

This is an interesting conundrum. On the one hand, the non-dynamical models of kin selection and contextual analysis arguably suggest that the two processes are the same. A dynamical model indicates that they are not the same. Who is right?

The answer seems a bit complex. First off, Goodnight and Simon et al. actually have different definitions of group selection. The Goodnight definition is that group selection is acting when the fitness of an individual is a function of group membership. The Simon et al. definition is that group selection is acting when the outcome of selection depends on group level fitness effects. However, I don’t think this is the problem. I think the bigger problem is that direct fitness and contextual analysis are statistical models that measure the conditions at the current value of the population. Contextual analysis works here because it is measuring the regression slopes at the current population values. It is certainly possible to imagine a system that overall had multilevel selection acting, but in which, at a particular set of gene frequencies (or what have you), group selection was not acting at that moment. Thus, at least in theory, the strength of selection at the two levels may change from generation to generation, and selection at one level might even disappear briefly. This rather minor problem for contextual analysis is a huge problem for kin selection. Another way of stating the complaint about these regression models is that there are nonlinearities built into multilevel selection. I suspect that if you could force the models to be linear, the manipulation of equating inclusive fitness with multilevel selection in a dynamical model just might work. However, because the two levels will inevitably have nonlinearities, and in most cases will in some way interact, the linear approximation of kin selection models is doomed to failure.

In other words, kin selection practitioners are guilty of one of the basic errors that all undergraduate statistics students are taught. They are extrapolating from the current population conditions to some point in the far distant frequency space. In short, they are extrapolating beyond their data.
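The extrapolation problem is easy to see in a toy example. The fitness function below is hypothetical and deliberately simple: it is linear in the individual trait and the group trait except for one interaction term. It is not the Simon et al. model, just a sketch of what happens when a linear description fitted at the current population values is pushed far away from them.

```python
def fitness(z, Z):
    """Hypothetical fitness with an individual-by-group interaction term."""
    return 1.0 - 0.2 * z + 0.5 * Z + 0.3 * z * Z

def local_gradients(z0, Z0):
    """Exact partial derivatives of the toy fitness function at (z0, Z0)."""
    b_ind = -0.2 + 0.3 * Z0
    b_grp = 0.5 + 0.3 * z0
    return b_ind, b_grp

def local_linear_model(z0, Z0):
    """Linear approximation of fitness around the current values (z0, Z0)."""
    w0 = fitness(z0, Z0)
    b_ind, b_grp = local_gradients(z0, Z0)

    def predict(z, Z):
        return w0 + b_ind * (z - z0) + b_grp * (Z - Z0)

    return predict

# Fit the linear description where the population currently is.
predict = local_linear_model(0.0, 0.0)

near = (0.1, 0.1)   # close to the current population values
far = (3.0, 3.0)    # a distant point in trait space

err_near = abs(predict(*near) - fitness(*near))
err_far = abs(predict(*far) - fitness(*far))

print("error near the current values:", err_near)
print("error far from the current values:", err_far)
print("individual gradient here and far away:",
      local_gradients(0.0, 0.0)[0], local_gradients(*far)[0])
```

Near the current values the linear description is nearly perfect. Far away it is badly wrong, and in this toy case the individual-level gradient has even changed sign along the way. Remeasuring the slopes each generation, as contextual analysis does, avoids the problem; extrapolating a single linear description does not.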


