A phenotypic view of evolution Evolution in Structured Populations


Now I know I am “El Lobo Solitario”: I don’t even agree with Allen, Nowak, and Wilson

And now for something completely different.


Up until this point I have been writing my own thoughts about my own little world.  Given that Allen, Nowak and Wilson recently published a sure to be controversial paper on inclusive fitness (Allen, Nowak and Wilson 2013 Limitations of inclusive fitness.  PNAS early edition), and given the role I played in their thinking (i.e., none at all), I figured I should weigh in on their paper.  So, on with the show.

The main point of the paper is that fitness components are not additive.  That is, you cannot make a clean partitioning of the effects of a behavior into the actor’s direct effects on itself and its indirect effects on others.  They quote Hamilton’s original definition in support of this.  They are of course correct; however, here they might have done well to read Williams:

“No matter how functionally dependent a gene may be, and no matter how complicated its interactions with other genes and environmental factors, it must always be true that a given gene substitution will have an arithmetic mean effect on fitness in any population.” (G.C.Williams, Adaptation and Natural Selection).

The point is that at any given moment it must be true that you can divide anything, including fitness, into components by whatever criteria you choose.  The other point is that I have used that quote for years as a straw man, and I simply cannot believe that I just used it in a positive setting!  What they are really complaining about is whether or not that partitioning is meaningful.  Their point is thus incorrect in that yes, you can partition fitness into direct and inclusive components, but it is correct that in a non-additive system that partitioning will be good for the moment, and will qualitatively change every time the conditions change.

Their second point is that this applies both to Hamilton’s original formulation of inclusive fitness and to the neighbor modulated, or direct, fitness approach.  They make the point that the direct fitness approach is a regression method, and regression is not a causal analysis, that in fact, it is little more than a glorified correlation:

regression equation

As we all know, correlation is not causation.  The way regression is supposed to be used is to find the best prediction of Y given a known X.  This is the classic problem that I discussed as recently as last week.  Yes, we should all be using path analysis, yes it is a lot of work and, as often as not, we don’t use path analysis.  Nevertheless, there has been a lot of darn good work on selection that has used a regression approach, so I, at least, hesitate to dismiss correlational approaches outright.

There is another important point here that strikes close to home.  That is, contextual analysis not only is a regression approach, but as I have shown elsewhere (Goodnight, 2013. Evolution 67: 1539-1548) contextual analysis uses the SAME regression equation used in the direct fitness approach.  So, why do I like contextual analysis and not like the direct fitness approach?  It turns out that there is a difference between the two approaches, and the difference is exactly the problem that Allen et al. are complaining about.  When using contextual analysis a tangent is calculated for the fitness surface at the point that the population currently occupies.  This tangent is used to calculate the strength of selection acting on the population at the current moment.  This calculation is frequently used to project the future potential response to selection, but this is an extension, not a basic part of the model.  In contrast, the direct fitness approach establishes the equation at the current state of the population, and then the equation is solved for the conditions under which the slope (dW/dX) equals zero.  This, then, is the problem that Allen et al. identified.  Using the kin selection approach we extrapolate from the current conditions to identify the optimal conditions that the population is presumably moving towards.  This extrapolation works fine in a linear system, but such extrapolations are notoriously problematic in complex non-linear systems.
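To make the contrast between reading off a tangent and extrapolating to an optimum concrete, here is a toy sketch in Python.  Everything in it is an assumption made for illustration: the cubic fitness function, the starting point, and the helper names are invented, and the "extrapolation" is a loose caricature (a local quadratic fit solved for zero slope), not the actual direct fitness machinery.

```python
# Toy illustration: local selection gradient versus extrapolating to an optimum.
# The fitness function and all numbers are invented for illustration only.

def fitness(x):
    # A made-up non-linear fitness surface whose curvature changes with x.
    return 1.0 + 0.5 * x - 0.1 * x ** 3

def local_gradient(f, x, h=1e-6):
    # Central-difference estimate of dW/dX at the population's current state.
    return (f(x + h) - f(x - h)) / (2 * h)

def local_curvature(f, x, h=1e-4):
    # Central-difference estimate of the second derivative at x.
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def extrapolated_optimum(f, x):
    # Treat the slope as changing linearly away from x (using the local
    # curvature) and solve for the point where that projected slope is zero.
    return x - local_gradient(f, x) / local_curvature(f, x)

current = 0.5
slope_here = local_gradient(fitness, current)      # the tangent: about 0.425
guess = extrapolated_optimum(fitness, current)     # extrapolated from x = 0.5
true_optimum = (0.5 / 0.3) ** 0.5                  # where 0.5 - 0.3 x^2 = 0

print(slope_here, guess, true_optimum)
```

The tangent at the current point is a perfectly good description of selection right now.  The extrapolated optimum, on the other hand, lands well past the true optimum of this surface, because the curvature measured at the starting point does not hold elsewhere.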

[Figure: contextual analysis versus kin selection]

So, as far as this point goes, Allen et al. and I are on the same page.  Kin selection has a serious problem because it is an optimality approach that uses extrapolation of the current conditions in a non-linear complex system to predict the outcome of evolution.  We actually saw the same thing happen with optimal foraging theory.  Twenty years ago it was a hot topic.  Nowadays it basically doesn’t exist.  When I ask people about it, the general answer seems to be that it only works in overly simplistic situations, and at most can provide a qualitative guide to the real world, which is acknowledged to be a complex system.

There is actually a fairly confusing ambiguity in the paper over this topic.  The real problem with kin selection stems from it being based in optimality theory, and thus requires simplistic assumptions so that it is possible to extrapolate from current to future conditions.  In my mind this would bring all optimality approaches into question.  Nevertheless in the discussion they bring in game theory, which is a variant on optimality theory, as one of the saviors of sociobiology.  It seems to me that if kin selection is suspect, then game theory ought to be just as suspect.  Perhaps I am wrong, and perhaps one of my readers can explain it to me.

It is the last part where Allen et al. and I part ways.  In their last section (common-sense approaches to evolutionary theory) they state: “The target of selection is not the individual, but the allele or the genomic ensemble that affects behavior.”  This just floored me.  They have just explained to us that biological systems are complex, and yet somehow they fail to understand that genetic systems are complex?  Kin selection can be useful for developing an intuitive or qualitative grasp of social evolution, but for many reasons, including those given by Allen et al., it fails when it is applied to the real world.  Similarly, purely genetic models can be useful for developing intuitive and qualitative understandings of evolution, but for many of the same reasons they fail when applied to the real world.  Selection acts on the phenotype, not genes, and that is a fact.  Even in rare situations where we see selection acting on the sequence of a gene, it is in fact acting on the gene’s phenotype, not on the gene as a purveyor of information.


Like New Zealand runner Nikki Hamblin, Allen, Nowak and Wilson ran a good race, but ultimately fell short. (http://www.stuff.co.nz/sport/other-sports/5522807/Hamblin-protest-fails-after-world-champs-fall)

Introduction to Contextual Analysis

First off, I have been told that you can’t talk about social evolution without mentioning kin selection:  “kin selection”.  With that done, let’s now talk about contextual analysis.  (Ok, let’s be honest, in future posts I will more fully diss kin selection.  Suffice it to say, as presently constructed it simply makes no sense from a phenotypic, or for that matter, a reality-based perspective.)

The basic idea with contextual analysis is that we do a standard Arnold/Wade selection analysis almost exactly as I described it last week.  The first thing to remember, then, is that we are talking about phenotypic selection and, at least as originally formulated and as typically used in experimental situations, we are talking about the selection vector, S, and the selection gradient, β = P⁻¹S.

R = G P⁻¹S = Gβ

The difference is that the S vector will include traits measured on more than one level of organization.  For example, in a standard group selection setting, both traits measured on the organism and on the group would be included in the selection analysis.  These group level traits could potentially be summary traits for organismal level traits, or they could be traits that only can be measured on the group. As an example, an individual trait might be the leaf area of a plant, then a group summary could be the group mean leaf area, and group size might be a contextual trait.

[Figure: the S vector used in contextual analysis]

In this S vector, one set of ΔZ1 through ΔZN are observed changes in N organismal level traits, a second set of ΔZ1 through ΔZN are observed changes in the group means of those N organismal level traits, and the green ΔC1 through ΔCK are observed changes in the group mean of K “contextual” traits that can only be measured on the group.

As with any selection analysis, the choice of traits is critical.  From a practical perspective the total number of traits needs to be kept small.  As anybody who has worked with real data knows, multivariate methods eat degrees of freedom for lunch.


In addition, it is important to thoughtfully select the traits based on the biology of the organism.  One major issue with selection analyses in general is that the outcome of the analysis may qualitatively change depending on the selection of the traits.  For this reason people are beginning to advocate for using path analysis in preference to a standard multivariate regression (e.g., Scheiner, Mitchell & Callahan J. Evol. Biol. 13:423-433.  I believe that Michael Morrissey is working on a paper on the topic, but I can’t find it right now).

The problem of thoughtfully choosing the correct trait becomes a bit more complicated when you move to contextual or group summary traits.  Consider that in the above I defined the summary trait as the group mean of the organismal traits.  In fact, you can immediately ask the question whether it should be the group mean, or some jackknifed version of the group mean that leaves out the focal individual.  Also, there may be times when the group mean isn’t important; rather, it is some trait of the most extreme individual.  For example, in lions, usurping males often kill the cubs, and for cubs the probability of survival is influenced by the ability of the dominant male to fend off other males.  In this case, the group mean could be very misleading.
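The three candidate summary traits mentioned above (the plain group mean, the jackknifed mean that leaves out the focal individual, and the value of the most extreme individual) can give quite different pictures of the same group.  A minimal sketch, with invented leaf areas and hypothetical function names:

```python
# Three ways of summarizing one group for a contextual analysis.
# The leaf areas are invented; the function names are hypothetical.

def group_mean(values):
    # Every member of the group gets the same value.
    return sum(values) / len(values)

def leave_one_out_means(values):
    # Jackknifed mean: each focal individual gets the mean of the OTHER members.
    total = sum(values)
    n = len(values)
    return [(total - v) / (n - 1) for v in values]

def group_extreme(values):
    # Trait of the most extreme individual (here simply the largest value).
    return max(values)

leaf_area = [10.0, 12.0, 20.0]

print(group_mean(leaf_area))           # 14.0
print(leave_one_out_means(leaf_area))  # [16.0, 15.0, 11.0]
print(group_extreme(leaf_area))        # 20.0
```

Note that the jackknifed version gives each individual its own value even within a single group, which changes how the contextual trait covaries with the individual trait.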


The survival of the cubs may depend on the outcome of this fight. (http://www.governorscamp.com/blog/migration-wildlife-update-masai-mara)

In any case, once the traits are identified it is then possible to do a regression of fitness on phenotype, giving the multilevel selection gradient.   This raises the interesting point that there is only a single dependent variable (relative fitness), thus, fitness can only be assigned at one level.  Thus, in our classic group selection example, we assign fitness at the level of the organism, and measure traits on the organism and on the group to which they belong.  In essence, using a perhaps bizarre perspective, we are treating the group level traits as if they were individual traits, thus we are treating, say, population size as the population size that an individual experiences.

This, then, gives us a very nice definition of group selection:  Group selection is occurring when there is significant selection on a contextual trait.  Or in more general terms, group selection is occurring when the fitness of an individual is a function of the characteristics of the group to which they belong.
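As a rough illustration of this definition, the following sketch regresses relative fitness on an individual trait and on the group mean of that trait.  The data, the fitness functions, and the helper names are all invented, and the hand-rolled two-predictor regression is only a stand-in for whatever statistical package one would actually use.

```python
# Toy contextual analysis: regress relative fitness on an individual trait
# and on the group mean of that trait. All data are invented.

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def partial_slopes(x1, x2, w):
    # Solve the 2x2 normal equations for the two partial regression slopes.
    p11, p12, p22 = cov(x1, x1), cov(x1, x2), cov(x2, x2)
    s1, s2 = cov(x1, w), cov(x2, w)
    det = p11 * p22 - p12 * p12
    return (p22 * s1 - p12 * s2) / det, (p11 * s2 - p12 * s1) / det

groups = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]]
z = [v for g in groups for v in g]            # individual trait
zbar = [mean(g) for g in groups for _ in g]   # group mean each individual experiences

# Scenario 1: fitness depends only on the individual's own trait.
w_individual = [0.6 + 0.1 * zi for zi in z]
# Scenario 2: fitness depends only on the mean of the individual's group.
w_group = [0.6 + 0.1 * gi for gi in zbar]

# Simple regression of fitness on the group mean alone (scenario 1).
naive_group_slope = cov(zbar, w_individual) / cov(zbar, zbar)

b_ind_1, b_ctx_1 = partial_slopes(z, zbar, w_individual)
b_ind_2, b_ctx_2 = partial_slopes(z, zbar, w_group)

print(naive_group_slope, (b_ind_1, b_ctx_1), (b_ind_2, b_ctx_2))
```

In the first scenario the simple regression on the group mean has a positive slope, yet the contextual (partial) coefficient is zero, so by the definition above there is no group selection.  In the second scenario the individual coefficient is zero and the contextual coefficient is not, which is group selection in the sense used here.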

I will not lie:  When I started out I did not believe that contextual analysis would work.  It was a HUGE paradigm shift for me (Kuhn, forgive me! I really try to avoid that term! ).  First, as a graduate student I had always used individual fitness and group mean fitness, and group selection to me was the differential survival and reproduction of groups.  Contextual analysis does neither of these things.  Nevertheless, it does work – Next week I will talk about how I tried, and failed, to prove that contextual analysis would not work – and for the moment I will ask you to believe it works.

In any case this change in perspective significantly broadens the concept of group selection.  When Maynard Smith (1964. Nature 201: 1145-1147) first imposed himself on the group selection debate he basically defined a group as something with clear boundaries; metaphorically speaking, something you can walk around.  Contextual analysis makes it clear that the phenomenon of what Maynard Smith called group selection is part of a much larger phenomenon.  Contextual analysis works equally well with “continuous” groups.  In a continuous plant population a contextual trait might be the mean leaf area of all plants within a 30 cm radius of the focal plant.  Notice that in this case every individual is at the center of their own “group”, so unlike classic group selection every individual will have a unique value for their contextual traits.  Another point is that the groups need not be physical at all.  For species with kin recognition, selection on kin groups makes perfect sense.  Finally, it quickly becomes clear that many forms of frequency dependent and density dependent selection are indeed forms of multilevel selection.
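For the continuous-population case, a contextual trait of this kind could be computed along the following lines.  The plant positions and leaf areas are invented, the function name is hypothetical, and the sketch assumes that the focal plant is left out of its own neighborhood mean, which the description above does not settle either way.

```python
import math

def neighborhood_means(positions, trait, radius):
    # For each focal plant, return the mean trait value of all OTHER plants
    # within `radius` of it, or None if it has no neighbors in range.
    # Excluding the focal plant is an assumption of this sketch.
    result = []
    for i, (xi, yi) in enumerate(positions):
        neighbors = [
            trait[j]
            for j, (xj, yj) in enumerate(positions)
            if j != i and math.hypot(xi - xj, yi - yj) <= radius
        ]
        result.append(sum(neighbors) / len(neighbors) if neighbors else None)
    return result

# Invented data: four plants on a line, positions in cm, leaf areas arbitrary.
positions = [(0.0, 0.0), (20.0, 0.0), (40.0, 0.0), (100.0, 0.0)]
leaf_area = [10.0, 14.0, 20.0, 30.0]

context = neighborhood_means(positions, leaf_area, radius=30.0)
print(context)  # [14.0, 15.0, 14.0, None]
```

Each plant ends up with its own contextual value, and an isolated plant has none at all, which is one of the practical decisions a real analysis would have to make.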

Basically, what contextual analysis shows is that classic group selection is part of some sort of multivariate continuum that at one end has Maynard Smith’s clearly delineated groups and at the other end(s) has frequency dependent selection, continuous groups and “virtual” groups.  If you want to say that there are some things that are group selection and others that are frequency dependent selection you can do that, but since they are on a continuum, there can be no objective criterion for drawing that line and where you draw it will inevitably be arbitrary.



Measuring Selection on Multiple Traits

Models and laboratory studies of multilevel selection are all well and good, but, of course, the real question is how important is group selection in nature. All proselytizing by old dead white men aside (ok I admit, I am an old live white man), this is an empirical question that no amount of theorizing will answer.  In order to answer that question we need a tool.  Fortunately, that tool, which is contextual analysis, was provided to us by the social sciences, by way of a paper by Heisler and Damuth (1987 Am. Nat. 130: 582-602).  I already talked about this in an earlier blog post.

Before describing contextual analysis it is worth spending some time on multivariate selection analysis.  Arnold and Wade (1984 Evolution 38: 709-718;  Evolution 38: 720-734) started with the standard multivariate breeder’s equation (shown here in four forms):

A&W Equation 1

Where S is the change in a vector of traits, Z, as a result of selection but before reproduction, and R is the change in Z in the next generation.  Z is a vector of K traits (Z1 to ZK), and three means of Z are involved: the mean of the original population; the fitness weighted mean of the population, that is, the mean of the population after selection but before reproduction; and the mean of the offspring in the next generation.  G and P⁻¹ are the additive genetic covariance matrix and the inverse of the phenotypic covariance matrix, respectively.  Again, this is something I have talked about before.

Arnold and Wade pointed out that in quantitative genetics there is a distinction between selection and the response to selection.  The response to selection, R, is really selection, S, passed through the filter of the patterning compartment (in quantitative genetics GP⁻¹).  In other words, S is logically about ecology and best studied in the field, and R is about inheritance, and under most circumstances best studied in the lab (or in any case using breeding designs).  As a result, a selection study will typically only measure S, and ignore inheritance.

Most people combine P⁻¹S as the selection gradient β:

β = P⁻¹S, so that R = Gβ

Without showing the math, β is the selection gradient.  That is, it is the vector of direct effects of selection.  The problem with using S alone is that a trait can change both from selection acting directly on the trait and from selection acting on a trait with which it is phenotypically correlated.  The selection gradient avoids this problem by mathematically removing the indirect effects of selection.  Note that this indirect selection is different from a correlated response to selection.  Indirect selection is a change that occurs within a generation as a result of selection acting on another trait.  This indirect selection occurs when there are phenotypic correlations.  In contrast, correlated responses to selection depend on genetic correlations, and are a response to selection (in the offspring!) in one trait due to selection acting on another trait.
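A small worked example may help separate the two ideas.  The matrices below are invented for illustration (they are not from Arnold and Wade or from any data set), and the helper functions are hypothetical: selection acts directly only on trait 1, the two traits are phenotypically correlated, and they are also genetically correlated.

```python
# Worked two-trait example of S, beta = P^-1 S, and R = G beta.
# P, G and S are invented numbers chosen so the arithmetic is easy to follow.

def invert_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

P = [[1.0, 0.5], [0.5, 1.0]]   # phenotypic covariance matrix (assumed)
G = [[0.5, 0.2], [0.2, 0.5]]   # additive genetic covariance matrix (assumed)
S = [0.4, 0.2]                 # observed within-generation change in each trait

beta = mat_vec(invert_2x2(P), S)   # direct selection on each trait
R = mat_vec(G, beta)               # response in the offspring generation

print(beta)  # approximately [0.4, 0.0]
print(R)     # approximately [0.2, 0.08]
```

Trait 2 shifts within the generation (S is 0.2) even though its selection gradient is zero; that shift is indirect selection through the phenotypic correlation.  Trait 2 also changes in the offspring (R is about 0.08), but that is a correlated response, and it comes from the genetic covariance in G rather than from P.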

indirect selection


In an earlier post I showed that:

A&W equation 3


where pi is the frequency of the ith type, and wi is its relative fitness.  From this it should be apparent that the change in phenotype due to the direct effects of selection is:

A&W equation 4

where ∇ is called the gradient.  The point is that if we do a regression of relative fitness on phenotype the slope of this regression line will be the strength of selection.  More importantly, it is the strength of selection with the indirect selection effects stripped off.  As an example, consider a sample data set I made up (One feature of occasionally teaching an applied statistics course:  I can fudge data with the best of them!)

made up data set

In this data set fitness is a function of Z1 plus a little noise, so we get at least a little error variance.  Thus, if we do a simple regression of fitness on phenotype we get:

Simple regressions

Notice that we get a significant relationship between Z2 and fitness, even though selection is acting only on Z1.  Indeed

A&W equation 5

In this example ΔZ2 is purely indirect selection because of the phenotypic correlation with Z1.  Fortunately, when we do the partial regression this issue is resolved:

multiple regression

Note that the estimate for Z1 is virtually identical to the estimate from the simple regression.  Since we know that selection was only on Z1, in theory they should be identical, but in this example the sample size is small and I put in some stochastic variation in the relative fitness measure.  On the other hand, it is now obvious that there is no selection acting on Z2, with the parameter estimate being almost zero, and not even “marginally insignificant”.

The point is that, as we all know, multiple regression on correlated characters will give a different answer than will simple regression.  In practical terms this means that it may appear that selection is acting on one trait, when in fact it is acting on a second correlated trait.  As long as we measure that trait and include it in the multiple regression, we will in fact only measure the direct effects of selection.
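The same comparison is easy to reproduce with a few lines of Python.  The numbers below are a fresh made-up data set, not the one in the tables above, and to keep the arithmetic exact there is no noise in fitness, so the partial slopes come out exactly rather than approximately.  The helper names are hypothetical.

```python
# Simple versus multiple regression of relative fitness on two correlated traits.
# Invented data: fitness depends on Z1 only, and Z2 is correlated with Z1.

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def simple_slope(x, w):
    # Slope of the simple regression of w on x.
    return cov(x, w) / cov(x, x)

def partial_slopes(x1, x2, w):
    # Partial regression slopes from the 2x2 normal equations.
    p11, p12, p22 = cov(x1, x1), cov(x1, x2), cov(x2, x2)
    s1, s2 = cov(x1, w), cov(x2, w)
    det = p11 * p22 - p12 * p12
    return (p22 * s1 - p12 * s2) / det, (p11 * s2 - p12 * s1) / det

z1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]      # correlated with z1, but not selected
w = [0.65 + 0.1 * v for v in z1]         # relative fitness, mean 1, no noise

slope_z1 = simple_slope(z1, w)           # 0.1
slope_z2 = simple_slope(z2, w)           # about 0.083, purely indirect
beta_z1, beta_z2 = partial_slopes(z1, z2, w)

print(slope_z1, slope_z2, beta_z1, beta_z2)
```

The simple regression on Z2 picks up a clearly positive slope, while the partial regression assigns all of the selection to Z1 and none to Z2.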

I am rehashing this issue that most are aware of because next week I will be talking about contextual analysis.  It is important to remember this issue with indirect selection, as it will rear its ugly little head in the form of apparent group selection when none is acting.  To get an intuitive sense of where this will be going, contextual analysis IS the Arnold Wade selection approach applied to a situation where Z1 is an individual trait and Z2 is the group trait.  In this case if there is only individual selection acting then there will be indirect selection at the group level.  That is, because groups made up of high fitness individuals will have a high average fitness it will appear that there is group selection acting, even though that is not true.


Changes in Species Abundance and the Response to Community Selection

In all of the previous discussions I have been talking about evolution through changes within species.  However, when considering community selection there is another way that evolution can occur, that is evolution can occur through changes in species composition.

First off, I should mention that this is a classic case of listening to experiments.  This was not a thought that occurred to me until after I read a pair of papers by Swenson and his colleagues (Swenson et al. 2000. P.N.A.S. 97: 9110-9114; Swenson et al. 2000 Env. Microbiol. 2: 564-571).  For that reason, and because the genetic school of group selection is quintessentially an experimental field, it makes sense to briefly describe one of their experiments.

In the experiment that I think is most compelling they went out to a nearby garden and took a sample of soil.  They took this soil, soaked it in water, and used the water to inoculate sterile soil.  This soil was divided into 30 pots.  Fifteen of these pots were assigned to the high selection line and fifteen were assigned to the low selection line (Yes, my big complaint with this experiment is the lack of replication).  A number (about 50) of Arabidopsis seeds were planted into each pot.  At the end of each “generation” the three pots with the highest (high selection line) or lowest (low selection line) plant dry weight were selected.  The remaining 12 pots were discarded.  In each line the soil from the three selected pots was mixed, water was added, and the water was used to inoculate fifteen pots for the next generation.  New Arabidopsis seeds from a stock population were again seeded into the soil.  This process was continued for 16 generations.  Note that the Arabidopsis were always drawn from a stock population, and as a result they could not evolve.  Instead, Swenson and his colleagues were selecting for soil communities that either promoted or retarded the growth of the Arabidopsis plants.
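The logic of this protocol (fifteen pots per line, keep three, mix, re-inoculate, repeat for 16 generations) can be sketched as a toy simulation.  Only those counts come from the description above.  Everything else is an assumption invented for illustration: the three functional classes of microbes, the way plant dry weight depends on them, and the sampling noise at inoculation.  It is not a model of the actual soil communities.

```python
import random

SPECIES = ("helper", "neutral", "harmful")  # hypothetical functional classes

def plant_dry_weight(pot):
    # Invented response: helpers raise plant growth, harmful microbes lower it.
    return 1.0 + pot["helper"] - pot["harmful"]

def inoculate(source, n_pots, rng, noise=0.05):
    # Each new pot receives the source community plus random sampling noise,
    # renormalized so that the proportions sum to one.
    pots = []
    for _ in range(n_pots):
        raw = {sp: max(1e-6, source[sp] + rng.gauss(0.0, noise)) for sp in SPECIES}
        total = sum(raw.values())
        pots.append({sp: raw[sp] / total for sp in SPECIES})
    return pots

def pool(pots):
    # Mixing the soil of the selected pots averages their compositions.
    return {sp: sum(p[sp] for p in pots) / len(pots) for sp in SPECIES}

def select_line(start, high, generations=16, n_pots=15, n_selected=3, seed=1):
    rng = random.Random(seed)
    source = dict(start)
    for _ in range(generations):
        pots = inoculate(source, n_pots, rng)
        # The plants never evolve; only the pots (communities) are ranked.
        ranked = sorted(pots, key=plant_dry_weight, reverse=high)
        source = pool(ranked[:n_selected])
    return source

start = {sp: 1.0 / 3.0 for sp in SPECIES}
high_line = select_line(start, high=True)
low_line = select_line(start, high=False)

print(plant_dry_weight(high_line), plant_dry_weight(low_line))
```

In this toy version no microbe ever changes genetically; the lines diverge only because the relative abundances of the classes shift under selection on the pots.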


Experimental design of the soil community selection of Swenson et al (Swenson et al. 2000. P.N.A.S.97: 9110-9114).  In this experiment soil communities were selected based on their ability to support test populations of a standardized Arabidopsis strain.

As with all group selection type experiments, there was a significant response to selection (yeah, I know it gets boring to say that, but remember that it was less than 20 years ago that Harrison and Hastings (1996 T.R.E.E. 11: 180-183) said “ . . . extinction and recolonization have only a limited potential to create, or coexist with, strong genetic differentiation . . ..  This implies that adaptive evolution is unlikely to occur by classic interdemic selection, a conclusion that has often been reached.”)


Plant dry weight plotted as a difference between the high and low lines.  Solid triangles: selection for low dry weight; open triangles: selection for high dry weight.

The interesting thing about this study is that they did not know what species were in the soil community.  All they had was an extract of soil gathered from under some trees on campus.  Nevertheless they got a substantial response to selection.  It is likely true that some of the response to selection took place through genetic changes within the species in the communities, but it is more likely that the majority of the response to selection is due to change in species composition.

What appears to be happening here is that communities are varying in their species composition.  Presumably there is little migration into the communities (but who knows with potentially airborne micro-organisms).  Also the founding propagules were large:  The small inoculum treatment had 0.6 grams of soil, which presumably had a LOT of micro-organisms.  Thus, it is reasonable to speculate that every species was present in every experimental pot.  What differed was the relative numbers of each species.  Selection then presumably favored those communities that had high populations of species that promoted (or suppressed) plant growth.  I have spoken with D. S. Wilson about this, and he told me that there were clear differences in the surface of the pots.  For example, the pots that suppressed plant growth were often covered with algae.

In the Swenson study the communities were founded using a migrant pool model of migration.  This considerably limits the modes by which adaptation can occur.  As pointed out last time, in a migrant pool the interactions among species are randomized every generation.  One presumes that community selection by species composition changes would be more efficient with a multispecies propagule pool.  In that case favorable communities would have been transferred together, preserving the details of the community structure.  Indeed, in a second experiment on community selection for pH of water they did use a multispecies propagule pool model.  This second experiment was similar to the soil experiment, except that sterile pond water samples were inoculated with filtered pond water.

[Figure: experimental design of the pH community selection experiment]

As usual there was a response to selection:

[Figure: response to community selection on pH (Swenson et al.)]

Unfortunately, since this is a very different system it is impossible to compare the two experiments for the effect of migration pattern.

Perhaps more important is the question of what is evolving here.  In these experiments, as in all selection experiments, the selective agent is the investigator, and it is the investigator who decides what fitness is.  In the case of the soil community experiments fitness was defined by plant growth, and yet the plants were not allowed to evolve.  The soil community is evolving; however, at least in theory, this need not involve any genetic changes in the constituent species.  If I may make a loose analogy, it seems that in this case the ecological “niches” are analogous to the loci of traditional genetics, and the micro-organism species are analogous to alleles.  The analogy breaks down in the sense that in diploid genetic systems each individual has exactly two alleles per locus, whereas in these communities it is not clear whether it is the individual organism or the species that is the “allele”, and in either case there is not a fixed number of entities (alleles) per niche (locus).  Interestingly, if we were somehow able to look at the constituent species, and pretend that evolution only occurred at the individual level we would (at least potentially) see no genetic changes in the community members, and conclude that there was no evolution going on.  This emphasizes the absurdity of the reductionistic gene selection view.  We have selection leading to a response to selection; however, because the changes are in species numbers rather than in genes, a gene selectionist would conclude that there was no evolution occurring.

Finally, the sad thing about this experiment is that it is a one-off experiment and as far as I am aware nobody has followed up on it.  A molecular evolutionist could have a field day by screening the extracts for some measure of the community diversity; however, for whatever reason this experiment has not come to wide attention.  In contrast, people like myself are unlikely to have the skills or resources to be able to explore the changes taking place in a microbial community.

Group structure, Ecology and Heritability.

One of the interesting points raised by the community selection experiments I talked about last week is the issue of the mode of founding of new units.  When we think about genes and individual level traits in a typical sexual organism this is a non-issue.  Every individual has two parents, and each contributes half of the patterning compartment (genes) to the offspring.  When we start moving to different levels of organization and different modes of inheritance the pattern is no longer invariant.  This raises the important point that what is heritable, and what types of traits selection can act on, is not only a function of the level at which selection acts, but also of the ecology and the process of the founding of new units.

This is not a new idea.  In the recent history of multilevel selection the first discussion of this that I am aware of is in Wade’s review in the Quarterly Review of Biology (1978. Q. Rev. Biol. 53: 101-114).  In this review he makes the important distinction between a “migrant pool” and a “propagule pool” form of group founding.

[Figure: migrant pool and propagule pool models of group founding]

The important point here is that in the migrant pool individuals move independently to found the next generation.  As a result interactions among individuals are broken up at each founding event, and genetic differences are homogenized.  In contrast, in the propagule pool, groups of individuals move together to found the next generation of groups.  This means that social interactions and mating structure are preserved, and genetic differences among groups are maintained rather than being homogenized.
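The difference between the two founding modes is easy to see in a small sketch.  The function names, group sizes, and labels here are all hypothetical, and sampling founders with replacement is a simplification; each individual is labeled only by the group it was born in, so that we can see whether former group-mates stay together.

```python
import random

def found_migrant_pool(groups, n_groups, group_size, rng):
    # All individuals enter one common pool; each founder is drawn
    # independently, so former group-mates are separated.
    pool = [ind for group in groups for ind in group]
    return [[rng.choice(pool) for _ in range(group_size)]
            for _ in range(n_groups)]

def found_propagule_pool(groups, n_groups, group_size, rng):
    # Each new group is founded from a single parent group, so its
    # founders share a history of interaction.
    new_groups = []
    for _ in range(n_groups):
        parent = rng.choice(groups)
        new_groups.append([rng.choice(parent) for _ in range(group_size)])
    return new_groups

# Ten parent groups of ten individuals, each labeled by group of origin.
parents = [[label] * 10 for label in "ABCDEFGHIJ"]
rng = random.Random(0)

migrant = found_migrant_pool(parents, 10, 10, rng)
propagule = found_propagule_pool(parents, 10, 10, rng)

# Number of distinct parent groups represented in each new group.
origins_per_group_migrant = [len(set(g)) for g in migrant]
origins_per_group_propagule = [len(set(g)) for g in propagule]

print(origins_per_group_migrant)
print(origins_per_group_propagule)
```

Under the propagule pool every new group traces back to exactly one parent group, so whatever differences existed among the parent groups are carried forward.  Under the migrant pool each new group is a mixture of origins, which is the homogenization described above.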

There are actually two processes going on here.  First, there is the genetic homogenization.  This is primarily concerned with what might be termed the field of recombination.  I have already discussed how in a large panmictic population epistasis can be ignored, but in small populations gene interactions can be statistically converted to additive genetic variance.  Population structure, by changing gene frequencies within the group, and excluding some alleles at interacting loci, has the effect of driving this conversion, and in effect giving “heritability” to these epistatic interactions.  From a gene’s eye view this would equate to limiting the field of interacting partners, giving more reliability to the genotypic effect of our focal allele, and assuring that sexual reproduction does not completely scramble the epistatic interactions.  It may seem odd to think of population structure as an ecological means of modifying linkage and recombination, but from a genetic standpoint that is exactly what it does.  Returning to the migrant pool versus the propagule pool, it should be apparent that the propagule pool will do a much better job of preserving these genetic combinations and limiting the field of recombination than will the migrant pool.

The second important point is the interactions among individuals.  That is, with the migrant pool the individuals that are interacting will have parents drawn randomly from the metapopulation, and interacting partners will not have predictable behaviors and physiologies.  In contrast, with the propagule pool these ecological interactions will be preserved, since the parents of interacting individuals would also have interacted.  The propagule pool makes these ecological interactions predictable and thus heritable.

This can be made much clearer if we think about community selection.  For community selection we can keep the migrant pool model, but we can add nuance to the propagule pool model.  In particular we can distinguish between single species and the multi-species propagule pools (note I have changed the color scheme.  A is species A, B is species B, and I have dropped the subscripts to simplify the drawing) (Goodnight 2011. Phil. Trans.  R. Soc. B 366: 1401-1409).

[Figure: single species (SS) propagule pool and multispecies (MS) propagule pool]

In the community selection experiment I described last week I used a multispecies propagule pool.  This means that every generation the same strains of the two species were transferred together.  As a result the interactions between the two species became heritable.  Thus, a change in the predation rate of T. confusum that affected the population size of T. castaneum would be heritable at the community level and could contribute to the response to community selection on T. castaneum population size.

In that experiment I saw clear evidence that interactions between the species were contributing to the response to selection.  It is tempting to speculate that had I used a single species propagule pool model to found the next generation I would not have seen these interspecies interactions contributing to the response to selection (I tried to get NSF funding for that once, but well, we all know how such things go, and now I am too allergic to the beetles to do the experiment).

The other thing that I have not talked about, mainly because I don’t have any clear ideas about it, is the effect of propagule size on group heritability.  It seems reasonable to speculate that on the one hand smaller propagules should increase the rate of divergence of populations, but on the other hand, lower the heritability of group characteristics.  I have no idea where these two forces would balance out, although some experimental data (e.g., Swenson, Wilson and Elias 2000. P.N.A.S. 97: 9110-9114;  Wade 1982.  Evolution 36: 945-961) suggest that the answer will not necessarily be simple.

What about nature?  Well, it is pretty unimaginable that whole forest biomes migrate together to found a new community on bare ground.  We all love Clements, Shelford, and Emerson, but I, and I dare say all, or at least nearly all, modern biologists have to remain skeptical about their expansive view of the “superorganism”.  However, it is quite reasonable that parts of a community may migrate together to found new communities.  For example, more than half of the cells in a typical human belong to some other species.  When we migrate to a new location we bring our gut fauna with us.  At a macro level, many organisms carry ectoparasites (fleas and lice) and phoretic organisms (seeds and some mites) with them as they move from place to place. Thus, at this level multispecies propagule pools do not seem that unreasonable. I suspect the reality is that real communities are founded by a mixture of individual migrants, single species propagules, and multispecies propagules.  How much these different modes contribute to the heritability of traits is an interesting question, and one that I suspect can only be answered empirically.

Interactions among Individuals and the response to Community Selection

Last week, based on theory, I made the claim that group selection could act on interactions among individuals.  At the risk of exposing my over-inflated ego I think that one of the best experiments demonstrating this is my postdoctoral research on selection in two species communities of Tribolium (Goodnight 1990. Evolution 44: 1614-1624; Goodnight 1990. Evolution 44: 1625-1636).

As I discussed last time, Wade’s lab was convinced that it was interactions among individuals that was the reason that group selection was so effective.  The question became how to demonstrate this.  As with any experiment, the question is how to create a weak link that can be exploited to get at the question you are interested in.  I realized that there was no easy way to look at within species interactions, but that a two species competitive system provided a seam across which I could distinguish these inter-individual interactions from within individual genetic effects.  In particular, there were two things to look for.  First, I could look for correlated responses to selection in one species due to selection on the other species – such effects are prima-facie evidence of interspecies pleiotropic effects, since they can only be mediated by interactions among individuals.  Second, I could examine the effect of modifying the community structure.  If removing the competing species affected the response to selection this would indicate that the competing species was an important part of the environment.  More importantly, if changing the strain of the competing species affected the response this would indicate that the specific coevolved strain was important in the response.

For my postdoc I moved to the Mertz lab at the University of Illinois at Chicago where I did an experiment that involved sixteen independent selection experiments.  Each experiment was a set of ten vials set up with 16 T. castaneum and 16 T. confusum.  The populations were allowed to grow for a generation and four traits were measured (I am resisting the temptation to describe the treatments in detail):  population size in each species and emigration rate in each species.  In each selection treatment I selected in one direction on one trait.  Each selection treatment was replicated.  (4 traits X 2 directions X 2 replicates = 16 selection treatments).  In each selection treatment the five most extreme vials were selected.  For example, in the selection treatments for increased T. castaneum population size, the five vials with the largest T. castaneum population size were retained, and the remaining five discarded.  Each surviving community was then used to set up two communities in the next generation, bringing the total back up to ten.

Exp design

The selection protocol.  Selected communities each founded two communities in the next generation.  Importantly, the two species were transferred together, resulting in the potential for community level heritability.
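The bookkeeping of this protocol can be sketched in a few lines of Python. This is only an illustration of the truncation-and-refounding step described above: the function name, the dictionary layout, and the trait values are all hypothetical, and nothing here models the beetles themselves.

```python
def select_and_refound(vials, trait, direction="up", n_keep=5, n_daughters=2):
    """Keep the n_keep most extreme vials and let each found n_daughters new ones.

    vials: list of dicts, each with an "id" and a measured trait value.
    direction: "up" keeps the largest values, "down" keeps the smallest.
    Both species in a surviving vial are assumed to be transferred together,
    so each daughter community simply records its parent community.
    """
    ranked = sorted(vials, key=lambda v: v[trait], reverse=(direction == "up"))
    survivors = ranked[:n_keep]
    return [{"parent": v["id"]} for v in survivors for _ in range(n_daughters)]


# Hypothetical T. castaneum population sizes in ten vials of one treatment.
sizes = [12, 40, 7, 33, 25, 18, 51, 9, 29, 44]
vials = [{"id": i, "castaneum_size": s} for i, s in enumerate(sizes)]

next_generation = select_and_refound(vials, "castaneum_size", direction="up")
print(len(next_generation), sorted({v["parent"] for v in next_generation}))
```

With five survivors each founding two daughters, the treatment returns to ten vials every generation, which is what keeps the selection intensity constant across generations in this sketch.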

As with all group selection experiments (and I do mean ALL group selection experiments), there was a rapid direct response to selection.

Comm sel fig 1   Comm sel fig 2

In addition, there were also within species correlated responses to selection:

comm Sel win sp cor resp

And more importantly, between species correlated responses to selection

Bet sp. corr response

It is this last result that is perhaps the most important.  That is, the between species correlated responses to selection can only occur if selection on one species is causing heritable changes that affect both species.  Of course, let’s get real here, we know what is going on.  The high population vials are really not pleasant places for the beetles to live.  Those treatments selected for high population size of either species became overcrowded, and the beetles were more than eager to emigrate.  However, just because we have a good idea of the ecological cause does not alter the fact that this is interspecies pleiotropy.

At the end of the selection experiment I did three assays.  Assay A was simply the intact community in the generation after the last generation of selection.  In assay B I broke up the communities into single species populations.  In assay C I again broke up the community, but then added back a naïve test strain of the opposite species.

Assay Design

(Sorry for the cut and paste from a class lecture):  Left is assay A, middle assay B, right assay C

The results of this set of assays were telling:

Community assays 3

In all of the intact communities (assay A) there was a significant difference between the up and down lines.  This difference entirely disappeared in the single species populations (assay B).  The “reconstructed” communities were significant for emigration, but not for population size.

Assay C shows that for population size the response to selection was completely (and based on the figures dramatically) dependent on the genetic identity of the competing species.  Ecology clearly plays an important role.  For emigration the response to selection was restored when the ecology was restored in the reconstructed communities, but for population size it is clearly more than that:  It is apparent that indirect genetic effects are contributing to the response to selection.

So group selection can act on interactions among individuals.  One can ask why this isn’t just Dawkins’ extended phenotype, and why this cannot just be reduced to selection among genes, albeit genes in another species.  The difference comes in how an individual gets its fitness.  In the treatments in which I was selecting for large population size in T. castaneum, any behavior on the part of the T. confusum (read changes in cannibalism rate) that increases the population size of the T. castaneum will increase its chance for survival and thus its fitness.  However, it increases its chance of survival because it increases the chance of the group’s survival.  The only way that lowering of the cannibalism rate in T. confusum increases the strain’s fitness is by virtue of its effect on the community, and with it the average fitness of all members of the community.  In a single panmictic group a change in cannibalism would change the population size of T. castaneum, but there would be no mechanism for that to feed back to an increase in fitness of the T. confusum strain or individual.  In short, we are left with the same conclusion I arrived at last week.  Community selection, as with group selection, leads to adaptations that are qualitatively different than adaptations that are possible when selection is acting at a lower level.





Indirect Effects, Bruce Griffing, and Mean Plants

One of the striking results of the Wade group selection experiment is just how effective group selection was.  Indeed, it was far more effective than anybody ever expected.  On thinking about this, a likely cause of this unexpected response quickly became clear to Wade (and to me, as a hanger-on starting graduate student).  When Thomas Park did his work on the ecology of his population size strains of Tribolium confusum he concluded that the different population sizes were maintained by differences in cannibalism rate.  From this we (ok, really Wade) speculated that the unexpected response to selection was due to group selection being able to act on interactions among individuals in general, or in the case of Tribolium on cannibalism rates.

This is an important point, because David Mertz told me (sadly, I doubt this is published) that they were unable to identify any nutritive advantage to cannibalism in the standard high nutrient culture conditions typically used in the lab.  If this is the case then cannibalism is a purely neutral trait at the individual level, but based on Park’s results, changes in cannibalism have huge consequences at the population level.  Thus, we can speculate that the reason that group selection is so effective is that it can act on genetic effects (interactions among individuals) that have no effect on individual fitness, and simply cannot contribute to a response to selection at the individual level.

Not surprisingly, at some level we were re-inventing the wheel.  It turns out that we were rediscovering something that plant breeders had known for a long time.  That is, it has long been known that if you have a field of plants and you select the individuals with the greatest yield and plant up their seeds in the next generation you will frequently see a negative response to selection.  That is, if you select individuals for high yield, the yield per hectare will go down.  The solution that plant breeders arrived at is “strain selection”, in which they planted a plot of a particular cross, and chose the plots with the highest yield.  In other words, they realized that if you want to increase yield per hectare, you should select on yield per hectare, not on yield per individual.

It also turns out that there are some very smart theoreticians in the plant-breeding world, and one of them was Bruce Griffing.  Griffing decided to develop a quantitative genetic model that was done excruciatingly correctly, and as a result very confusingly.  To get some idea of how detailed these models are, between 1981 and 1982 he published 10 papers on the subject in the Journal of Theoretical Biology.  Fortunately for us an abstract of this opus was published in 1977 (Griffing 1977. Proceedings of the International Congress on Quantitative Genetics, August 16-21, 1976).  (See Wolf, Brodie, Cheverud, Moore and Wade 1998. TREE 13: 64-69 for a more modern approach to this problem.)

Without going into the actual model, Griffing assumed that there were two traits, one of which was the direct effect of an individual on itself, and the other of which was the indirect effect of an individual on others.  Thus, for example, a direct effect trait might be the ability of a plant to absorb nutrients from the soil, and an indirect effect trait might be the extent to which a plant prevents neighbors from absorbing nutrients from the soil.  The phenotype of an individual then is a combination of its ability to absorb nutrients, plus its neighbors’ ability to prevent it from absorbing nutrients.


Root competition

(Taken from: http://plantscience.psu.edu/research/labs/roots/research-projects/root-competition )

He also assumed that there was a genetic correlation between them.  In the example of our nutrient uptake, plants that were good at taking up nutrients would do so by taking from their neighbors, and thus suppressing their nutrient uptake.  This makes logical sense.  That is, the way a plant obtains more nutrients is to increase the root system, and to aggressively extract nutrients out of the soil.  The nutrients have to come from somewhere, and where they come from is from nutrients their neighbors would otherwise have absorbed.

This is the problem with individual selection.  Individual selection can only act on the direct effects.  The indirect effects are neutral with respect to individual selection.  Thus, selecting for the highest yielding plants selects for those plants that can most successfully extract nutrients, regardless of what effect they may have on neighboring plants.  The next generation, when the selected offspring are planted together, the aggressive interactions prevail, and instead of getting an increase in yield, you get an increase in aggressiveness, and with it an overall lowering of yield.  In other words, aggressiveness is a neutral trait that evolves as a correlated response to selection, and this correlated response lowers the overall yield of the population.  In still other words, you get a field of mean plants that spend their time beating on each other instead of doing their job.

In contrast, under group selection aggressiveness is not neutral, since selection is on the yield of the entire group.  When selection is applied at the group level those fields that have the overall highest yield will be selected.  This will favor a balance between the direct effects (ability to garner nutrients) and indirect effects (aggressiveness) that maximizes overall yield of the field or group.  Notice that the yield of individual plants in this group is likely to be lower than could be attained by individual selection alone, but so will the aggressiveness towards other plants.  In short it will be a typical compromise, where nobody is happy (that is, no sub-trait is maximized), but the overall outcome is best for all (total yield is maximized).
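The contrast between the two levels of selection can be made concrete with a deliberately crude toy. This is not Griffing's model: the linear yield function, the baseline of 2.0, the cost of 1.5, and the function names are all assumptions chosen only so that a plant's aggressiveness raises its own yield by less than it lowers its neighbors' yield.

```python
def individual_yields(aggr, base=2.0, cost=1.5):
    """Yield of each plant in one plot (toy assumption, not Griffing's equations).

    Direct effect: a plant's own aggressiveness adds to its yield.
    Indirect effect: the mean aggressiveness of its neighbors, scaled by
    `cost`, is subtracted from its yield.
    """
    n = len(aggr)
    total = sum(aggr)
    return [base + a - cost * (total - a) / (n - 1) for a in aggr]


def plot_yield(aggr, base=2.0, cost=1.5):
    """Mean yield per plant of a whole plot."""
    ys = individual_yields(aggr, base, cost)
    return sum(ys) / len(ys)


# Individual selection: pick the highest yielding plant in a mixed field.
mixed_field = [0.2, 0.4, 0.6, 0.8, 1.0]
yields = individual_yields(mixed_field)
best_individual = mixed_field[yields.index(max(yields))]

# Group selection: pick the highest yielding plot among pure stands.
pure_stands = {a: plot_yield([a] * 5) for a in mixed_field}
best_plot = max(pure_stands, key=pure_stands.get)

print("individual selection favors aggressiveness", best_individual)
print("group selection favors aggressiveness", best_plot)
```

In the mixed field the most aggressive plant has the highest yield, so individual selection picks it. Planted as a pure stand, that same genotype gives the lowest yield per plot, and selecting on plot yield picks the least aggressive genotype instead. The outcome depends entirely on the assumed cost being greater than one.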

The point of this parable is that there is an important shift that takes place as selection moves from one level to another.  Aggressiveness is not a trait that individual selection can act on and genetic variation for aggressiveness has no effect on individual fitness.  As we move to the group level, aggressiveness does contribute to the response to group selection.  Thus, this higher level of selection is drawing in components of variation that were not available to individual selection.  In short group selection leads to a qualitatively different adaptation than will occur as a result of individual selection.  That is also why the classical models failed so miserably.  They assumed that traits measured at the group level were simply composites (averages) of traits measured at the individual level.  Instead, group level traits must be assumed to be different traits with a different genetic basis that simply cannot be extrapolated from individual level effects.

Take that Haystack Model! (Maynard Smith 1964. Nature 201: 1145-1147.) (http://www.sightswithin.com/Claude.Oscar.Monet/Page_3/)


Wynne-Edwards, Theoreticians and Group Selection: When Data Meets Theory

When talking about multilevel selection it is worth giving a bit of history.  This does two things.  First, it shows us where we came from, but more importantly, the study of MLS is fraught with missteps that perhaps we can avoid in the future. . .

In talking about MLS one should probably go back to Darwin and his pondering about sterile castes in insects; however, the start of the modern controversy can be traced back to V. C. Wynne-Edwards, and that is where I will start.  Wynne-Edwards, like many fine biologists of his time, was first and foremost an exceptional naturalist.  What he knew best were the Arctic pelagic birds.  Apparently he got free tickets on the Canadian Pacific line and the Cunard line, and made several trips to Europe and back watching the pelagic birds.  He also did a number of expeditions into the deep north (Mark Borrello has a good account of this).  What he noticed was that overfishing could occur, but those occurrences were rare, and he speculated that some form of group selection was preventing that from happening.  Theoreticians – read Maynard Smith and Williams – quickly jumped all over this and explained, in no uncertain terms, that while group selection could happen, theory indicated that the chances of it being important were vanishingly small.

Wynne Edwards Baffen Island Wynne Edwards

Wynne-Edwards on Baffin Island, and later in life.

This is a story we have heard many times before.  A naturalist has an idea about how nature works, but does not have the theory to back it up.  Theoreticians then trot out the standard models and show why the naturalist is wrong.  The case I am most familiar with, besides Wynne-Edwards, is that of Mayr and Carson arguing for genetic revolutions.  It was a cheap and easy put down to use the models of the time to show that population bottlenecks weren’t evolutionarily important.  Perhaps the best showcase of this controversy is the interchange between Carson and Templeton on the one hand (1984. Ann. Rev. Ecol. Syst. 15: 97-131) and Barton and Charlesworth on the other hand (1984. Ann. Rev. Ecol. Syst.  15: 133-164).  In that interchange Carson and Templeton lack any formal theory to back up their claims, and are quickly outmatched by the theoretical cannons of Barton and Charlesworth.  As my previous blog entries hint, the problem was not in the naturalists’ insights, but in the theoreticians’ models.  The point is, it happens all the time.

In the past, theoreticians did not like group selection, and this bias has continued to this day.  For example, Coyne, in a 2012 blog entry, makes the same hackneyed arguments that were made back in the early 1970s.  I find this stunning.  It seems to me that you ignore the masters at your own peril.  The people who are supporting group selection (and founder events) are people with their noses in the dirt studying the organisms. They KNOW the biology of their organisms, and by and large when they have a strong opinion about their study organism they are very likely at least mostly right.  On the other hand the theoreticians that are complaining have a bunch of mathematical gibberish that few understand (I know, I am one).  They are complaining about a master’s knowledge of a system that they have never seen.  So, here is my first adage:

When a good naturalist has an insight that does not agree with your world-view do not dismiss that insight until you truly understand it.

I know nothing about Wynne-Edwards’ pelagic birds, but I do know that he was an excellent naturalist.  If he says there is group selection I am inclined to believe that, while he may not be right, he is probably not wrong.  By that I mean he is almost certainly correct that something important is going on, but he may not be exactly correct as to the mechanism.  In short when a dirty fingernails biologist sees a pattern that does not fit with theory, theoreticians should ask themselves what is wrong with their theory.  Instead they seem more inclined to tell them that they are “just a naturalist” and if they listened to the theory instead of the organisms they would understand why they were wrong.

Of course it gets worse than that.  It was soon readily accepted that group selection was unlikely to be important, a view nicely captured by Harrison and Hastings in 1996 (TREE 11: 180-183):

“. . . extinction and recolonization have only a limited potential to create, or coexist with, strong genetic differentiation . . ..  This implies that adaptive evolution is unlikely to occur by classic interdemic selection, a conclusion that has often been reached.”

Into this mix comes Michael Wade.  His advisor was Thomas Park, another fine naturalist, but a laboratory naturalist.  Park spent his career working with Tribolium flour beetles.  One of the results of his work was that in just a few generations he was able to get three strains of Tribolium confusum, one with a large, one with a medium and one with a small population size.  When raised in the same conditions the large population size strain had a ten-fold larger population size than the small population size strain.  Park was an ecologist, so he focused on the ecology of why these strains had such different population sizes.  After much work he showed that the differences were due to cannibalism rates (Park, Mertz, and Petrusewicz 1961. Physiol. Zoöl. 34: 62-80).


Tribolium confusum

Given his work with Tribolium, it is hardly surprising that he was open to Wade doing an experimental study of group selection.  Note that here is an experimentalist and naturalist (Park), with his knowledge of Tribolium biology, learning about the group selection theory with his graduate student, and deciding that the two are not in agreement.  Here the experimentalist is at a huge advantage over the theorist, since they do not need to work from the old theories.  Does group selection work?  They decided to try it and find out.  The resulting thesis produced one of the classic studies of group selection (Wade, M. J. 1977. Evolution 31: 134-153).

Without going into details, Wade got a much greater response to selection than he expected.  Since that time there have been numerous studies that have confirmed that group selection is much more effective than anybody would have believed (Goodnight and Stevens 1997. Am. Nat. 150(Supplement): S59-S79).  Yet, the majority of evolutionary biologists, and a lot of theoreticians, dismiss group selection as being ineffective.  This has the makings of a theoretician’s dream.  We have experimental results that are at odds with the theoretical predictions.  This is ripe for asking why the old models don’t work.  So this brings us to my second adage:

When data and theory disagree it is the theory that is wrong.

For reasons that are not clear to me, even though experiments show that group selection is surprisingly effective over a wide range of conditions, it has not garnered theoretical attention among evolutionary biologists.

Wade Results


Wade’s results.  Red line:  Group selection for large population size, Blue line:  Group selection for low population size.

In any case, to sum up this essay (rant?), I think the real problem is that biologists, and especially theoreticians, are like everybody else: They have their own preconceived ideas about how the world works.  When presented with data that disagree with those views the first reaction is to either ignore or dismiss those data.  However, the truth is that there is gold in those disagreements, for it is in reconciling inconvenient data with our world-view that real advances are made.

Multilevel Selection: The Adaptation Approach and the Evolutionary Change Approach

Last week I finished talking about gene interaction for at least a while.  I hope I convinced you that assigning fitnesses to individual genes is a fool’s errand, both because in a world with interactions the assignment of fitnesses to individual alleles would be so context dependent as to be useless, and because it ends up being an NP-hard problem that would require a computer larger than the universe to solve.  At this point I want to turn to multilevel selection. I will start by talking about what I think is an important distinction that is rarely recognized, and ends up being a major contributor to the ongoing controversy that seems to surround multilevel selection.

It turns out that there are two distinct approaches to the study of multilevel selection.  One of these I will call the “adaptation” approach, and the other I will call the “evolutionary change” approach.  (In Goodnight and Stevens (1997. Am. Nat. 150(Supplement): S59-S79) we called these the adaptationist school and the genetical school.)  If I come off biased it is because my thinking falls squarely in the evolutionary change school; however, both of these approaches are quite valid.

The adaptation approach is the most well known.  In this approach an adaptation is identified, and plausible scenarios for its evolution are identified.  The investigator then designs experiments and uses rules for deciding which of the plausible explanations is the most likely to be true.  Some are eliminated due to the rules of science:  any explanation involving a supernatural force (an intelligent designer or aliens) is automatically ruled out because such explanations cannot be subjected to scientific investigation, and science is only concerned with naturalistic explanations.  Others must be eliminated by experimentation.  For example, scientists have debated the origin of snakes, and why they are legless, since the 19th century.  In recent years two major explanations have arisen for why snakes are legless.  One is that snakes are of terrestrial origin, and lost their legs possibly as an adaptation to burrowing underground.  The second, and currently more widely accepted, is that they evolved from mosasaurs and are of marine origin.  Under this model they lost their legs since they were fully aquatic.  Distinguishing these two explanations is not trivial, and the debate is still on-going, with Vidal and Hedges (2004 Proc R. Soc Lond B 271:S226) using molecular phylogenetic evidence to argue for a terrestrial origin, and Lee (Biol Lett 2005 1:227), also using molecular phylogenetic evidence, arguing for a marine origin for snakes.



Two models for the evolution of leglessness in snakes.  One is terrestrial, the other aquatic.  The currently favored hypothesis is that ancient monitor lizards gave rise to mosasaurs, which in turn gave rise to snakes.  In this hypothesis the garter snake sunning in your back yard is the descendant of a sea monster.


Unfortunately it is often impossible to eliminate all but one plausible naturalistic explanation.  In this case it is necessary to establish rules for choosing among the available possibilities to decide on which is the most plausible.   In the multilevel selection literature perhaps the most famous such rule is Williams’ principle of parsimony, which is worth quoting in full:

“In explaining adaptation, one should assume the adequacy of the simplest form of natural selection, that of alternative alleles in Mendelian populations, unless the evidence clearly shows that this theory does not suffice.”  (G. C. Williams 1974.  Adaptation and Natural Selection).

The important element that defines the adaptation approach is “in explaining adaptations”.  Outside of subjects strictly associated with evolutionary theory nearly all of biology takes an adaptation approach.  That is, in physiology and anatomy it is taken as a given that a structure or behavior has an adaptive value, and the goal of most of the research is, in essence, determining what that adaptive value is.  Within evolutionary biology, all of kin selection, game theory, some of multilevel selection, and optimization theory in general fall into the realm of the adaptation approach.  An excellent example of the adaptation approach using multilevel selection is given in Wilson and Caldwell (Evolution 1981.  35:882).

In contrast the evolutionary change approach is used to study changes in populations resulting from ongoing evolutionary forces.  It is usually applied to multilevel selection in two ways.  First there is a long tradition of experiments in which selection is applied at the group level, and responses to that selection are observed.  Second, contextual analysis has been used to study selection both theoretically and in the field.  In the selection experiments group selection is applied as a treatment, and it is typically applied so that in different treatments it may be acting in the same direction as, or in opposition to, individual selection.  Contextual analysis is a statistical analysis method that is basically a multiple regression selection analysis in which group and individual level traits are simultaneously included in the same analysis.  Rather counterintuitively this system works to partition selection at multiple levels, a result that has been demonstrated both theoretically (e.g.,  Goodnight, Schwartz and Stevens 1992. Am. Nat. 140:743) and experimentally (e.g., Eldakar et al. 2010. Evolution 64: 3183).  As with any selection analysis approach what is being measured is the covariance between group and individual level traits with relative fitness.  This deserves further discussion, but in the interests of space we will leave discussion of the details for a later time.  The important point is that what is measured is the change in phenotype within a generation due to selection.
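As a rough illustration of what such a regression looks like, here is a minimal sketch with one individual trait and one group trait (the group mean of that same trait). The function name and the toy data are hypothetical, and a real contextual analysis would typically include more traits and proper significance tests; the closed-form expressions below are just the standard partial regression coefficients for two predictors.

```python
def contextual_analysis(groups, fitness):
    """Regress relative fitness on individual trait and group mean trait.

    groups:  list of groups, each a list of individual trait values.
    fitness: parallel list of lists of absolute fitnesses.
    Returns (b_individual, b_group), the partial regression coefficients.
    Requires variation both within and among groups, otherwise the two
    predictors are collinear and the denominator is zero.
    """
    z, zbar, w = [], [], []
    for traits, fits in zip(groups, fitness):
        group_mean = sum(traits) / len(traits)
        for t, f in zip(traits, fits):
            z.append(t)
            zbar.append(group_mean)
            w.append(f)
    w_mean = sum(w) / len(w)
    rel = [x / w_mean for x in w]  # relative fitness

    def cov(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

    s11, s22, s12 = cov(z, z), cov(zbar, zbar), cov(z, zbar)
    s1w, s2w = cov(z, rel), cov(zbar, rel)
    det = s11 * s22 - s12 ** 2
    b_individual = (s1w * s22 - s2w * s12) / det
    b_group = (s2w * s11 - s1w * s12) / det
    return b_individual, b_group


groups = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]

# Toy case 1: fitness depends only on the group mean (pure group selection).
group_only = [[1 + 0.5 * (sum(g) / len(g)) for _ in g] for g in groups]
# Toy case 2: fitness depends only on the individual's own trait.
individual_only = [[1 + 0.5 * t for t in g] for g in groups]

print(contextual_analysis(groups, group_only))
print(contextual_analysis(groups, individual_only))
```

In the first toy case the individual-level coefficient comes out as zero (up to rounding) and all of the selection is assigned to the group trait; in the second case the reverse happens, even though individual trait and group mean are correlated in the data.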

The distinction between these two approaches is critical, and unfortunately not often appreciated.  Most importantly, in the adaptation approach rules are needed to distinguish between different explanations, whereas such rules are of no use in the evolutionary change approach.  This is the reason that Williams’ principle of parsimony is central to kin selection theory, but is almost never mentioned in papers using the evolutionary change approach.  This is also why kin selection theory essentially cares only about the evolution of altruism, whereas in MLS theory the evolution of altruism, if it is mentioned at all, is mentioned only as an afterthought.  It really is stunning how little discussion of altruism there is in either the experimental group selection literature or in the contextual analysis literature.

To me one of the most confusing aspects of the multilevel selection vs kin selection controversy is that multilevel selection experiments are virtually never cited in kin selection discussions.  More importantly, the central discovery of these studies, that group selection is more effective than expected because of genetically based interactions among individuals, has never been incorporated into kin selection theory.  This distinction between the adaptation approach and the evolutionary change approach may be an explanation.  The evolutionary change approach is interested in process, and interactions qualitatively affect the rate at which change occurs.  In contrast, the adaptation approach is interested in endpoints.  The rate at which those endpoints are achieved is of relatively little interest to this approach.  Adaptationists should be careful, however.  One thing that should be clear from my past blog posts on epistasis is that interactions can also change the endpoints.

Why I like the phenotypic view: Epistasis and the mean field approximation

Before we move on to multilevel selection, I realized that I had not finished an important aspect of the story I was developing.  In particular, I did not step back and ask how all the talk of gene interaction affected the main thesis of this blog, the phenotypic view of evolution.  I tell others that it is always essential to step back and ask why what they are doing is interesting, and think it wise to follow my own advice.  Also, in fact, my thinking about epistasis was one of the important reasons I have come to reject the genic view.  I promise promise promise that I will move on to multilevel selection next week.

There are two ways that the genic view fails. The first is philosophical.  Even if we could reduce selection to changes in gene frequency, that does not mean we should.  That is, because selection acts on the whole phenotype our best understanding will come from a world view that focuses on the phenotype, and not on one that focuses on its parts.  An engineer wants to know what works, and does not particularly need to know why it works, but scientists want to know why things work.  If selection acts on phenotypes, then that is where we should focus our theory.  The importance of this will hopefully become more obvious when we talk about multilevel selection.  (Oh, and apologies to engineers.  They do care about why things work, it’s just that they don’t NEED to know the why to do their job.)

The second is practical.  In short the genic approach doesn’t work.

Here is where a little history helps.  Sir Ronald Fisher was the initial developer of what is now quantitative genetics (The Genetical Theory of Natural Selection).  When Fisher developed his theory in the early 1900s (the first edition was published in 1930) the nature of genes was not known, and there was an ongoing controversy over whether the major mode of inheritance was particulate or continuous (Provine’s book is an excellent summary of the history of this period).  The point is that he made a number of simplifying assumptions that were clearly not true, but nevertheless reasonable first approximations.  In particular, he assumed that population sizes were infinitely large, that mating was random, and that traits were determined by an infinite number of loci, each with infinitesimal effect.

It is almost certainly true that Fisher was aware that these simplifying assumptions were at best approximations.  He was well aware of epistasis, and he was aware of the problems of assortative mating.  My best guess (based on talking with Yaneer Bar-Yam, a complex systems friend) is that Fisher was aware of the limitations of his assumptions, but when the assumptions of infinite population size etc. were relaxed by later investigators, the consequences were assumed to still hold.  In particular, in an infinitely large random mating population every gene experiences every genotype in proportion to its frequency.  In addition, Fisher’s model assumes infinitely many loci each with infinitesimal effects, with the counter-intuitive result that selection does not change the frequencies of the underlying genes (or, more correctly, it results in infinitesimal changes in gene frequency).  Under these circumstances the average effect of an allele is constant and unchanging.  Even if there is epistasis it averages out and can be ignored.  Fisher quite explicitly put epistasis into the residual or environmental variance.  Thus, Fisher’s assumption of additivity was perfectly reasonable given his assumption of infinite population size.  The problem is that later modelers relaxed the assumption of large population size etc., but retained the assumption of additivity.

Selection never acts directly on the gene; however, under Fisher’s assumption of infinite population size and random mating – the “mean field approximation” – there is no practical reason why we cannot reduce selection on organisms to the sum of selection on the individual genes.  The problem is that the real world is not like this.  The number of loci is finite, population sizes are finite, and mating and interactions are not random.  As a result epistatic interactions do not average out, and the average effects of alleles change over space and time.

Where this becomes a problem for the genic view is that the value of a gene is ephemeral.  It is contextual and depends on a myriad of epistatic and social interactions.  In theory at any given moment it is possible to assign fitnesses to individual loci, and caricature (by that I mean an imitation of a thing in which striking characteristics are exaggerated in order to create a grotesque effect) selection as if it were acting on individual genes.  The grotesque nature of this reductionism becomes apparent in the next round of random mating, or when gene frequencies change, or when the social milieu changes.  The reduction to selection coefficients on individual genes can be done again; however, the selection coefficients will be substantially different from what they were in the previous generation.  As a result, fitnesses assigned to individual genes change from generation to generation, and they have no predictive power.  One has to ask whether a theory with no predictive power is of any use.
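To make the context dependence concrete, here is a minimal sketch of my own, not anything from Fisher.  It assumes a haploid organism with two loci, alleles at the two loci associated at random, and a fitness table whose numbers are entirely made up.  The function name and the table are hypothetical.  It computes the mean fitness difference between carriers of allele A1 and carriers of A0 as the frequency of allele B1 at the other locus changes.

```python
# Hypothetical fitnesses for a two-locus haploid model.
# Keys are (allele at locus A, allele at locus B); the values are invented
# purely to illustrate epistasis and do not come from any real data set.
FITNESS = {(0, 0): 1.0, (0, 1): 1.0, (1, 0): 0.9, (1, 1): 1.2}


def marginal_effect_of_A1(freq_B1, fitness=FITNESS):
    """Mean fitness of A1 carriers minus mean fitness of A0 carriers.

    Assumes alleles at locus B are randomly associated with alleles at
    locus A (no linkage disequilibrium), so each A allele "sees" B1 with
    probability freq_B1.
    """
    w_A1 = (1 - freq_B1) * fitness[(1, 0)] + freq_B1 * fitness[(1, 1)]
    w_A0 = (1 - freq_B1) * fitness[(0, 0)] + freq_B1 * fitness[(0, 1)]
    return w_A1 - w_A0


# Same allele, same fitness table, different genetic backgrounds.
for q in (0.1, 0.5, 0.9):
    print(f"freq(B1) = {q:.1f}: effect of A1 = {marginal_effect_of_A1(q):+.3f}")
```

With these invented numbers the "fitness of A1" is negative when B1 is rare and positive when B1 is common, even though nothing about A1 itself has changed.  Only the background it finds itself in is different.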

The second problem is that while it may in theory be possible to assign fitnesses to individual alleles at individual loci, there are some 25,000 loci in humans, which seems to be about right for most organisms.  Reducing selection coefficients to individual loci requires solving a problem in 25,000 unknowns.  This problem is exacerbated by the fact that the genes are linked, and in linkage disequilibrium.  In short it is what is called an NP hard problem.  Colloquially what that means is that while it is conceptually possible to assign fitnesses to individual genes, a computer large enough to solve the problem cannot even theoretically be made.  PLEASE NOTE:  I have been corrected on this.  NP hard problems can be solvable for finite N, and in fact this problem might be possible for a large enough computer.  The real problem is obtaining enough quality data.  (thanks to Peter for this)

[Image: Deep Thought]

Not even Deep Thought can solve the problem of assigning fitnesses to genes.


Of course there certainly are examples of single locus diseases, such as sickle cell anemia, where we can assign individual fitnesses to genotypes; however, even here we find that these are not simple single locus diseases.  Sickle cell anemia is particularly severe among the Bantu, but is apparently frequently much milder in Arabia, India and Senegal.  Thus, even the archetypical single locus genetic disease is really polygenic (A. Gabriel 2010.  Nature Education 3:2 ).

The other exaggeration in my thoughts above is that many of those 25,000 loci are fixed, or have a small enough effect that they can be safely ignored.  However, remember that if there are only two alleles per locus the number of possible genotypes grows as 2^n, where n is the number of loci, a number that goes up very quickly as the number of loci increases.  Somewhere above about 20 loci this starts to become an NP hard problem, and certainly beyond the range of experiments with real organisms.
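As a rough illustration of how quickly this gets out of hand, the sketch below asks how many loci you could cover if you had to rear one individual of every genotype.  The budget of one million individuals is an arbitrary assumption of mine, and the function names are hypothetical.

```python
def genotype_count(n_loci, variants_per_locus=2):
    """Number of distinct combinations across n_loci independent loci."""
    return variants_per_locus ** n_loci


def max_loci_within_budget(budget, variants_per_locus=2):
    """Largest number of loci whose full set of combinations fits in budget."""
    n = 0
    while genotype_count(n + 1, variants_per_locus) <= budget:
        n += 1
    return n


# Assumed (arbitrary) experimental budget: one million individuals.
BUDGET = 1_000_000
print(max_loci_within_budget(BUDGET))  # loci coverable at 2 variants per locus
print(genotype_count(20))              # combinations needed for 20 loci
```

Even a million-organism experiment runs out just short of 20 loci, and every additional locus doubles the requirement.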

Now an important point:  I cannot prove this, but selection and drift both appear to have the effect of reducing dimensionality within populations.  Thus in any given population it will often be the case that an additive dominance genic view works just fine.  The problem is a classic complexity issue of a model working for the wrong reasons.  That is, selection and drift can reduce complexity to the point that the genic view works WITHIN a population, but this reducibility will be heavily context dependent, and will not be descriptive of the genes’ performance outside of the specific population.

The bottom line is that no matter what model we use nature will do what it will do.  Having a genic versus a phenotypic perspective does not change the result of evolution, it changes our perception of it.  My argument is that the reductionist genic view is constraining and simplistic, whereas the phenotypic view is a perspective that can embrace these complexities, and incorporate them into the phenotype to phenotype transition equation as needed.


Added after the fact:

OK, this is embarrassing, but I think Peter was wrong in his comment below, but so was I.  There are about 25,000 loci in humans give or take.  If each one has two alleles (a serious underestimate), then there are three possible genotypes at each locus.  To fully implement the genic view we would need to construct and phenotype all possible genotypes, which would be 3^N (not 2^N).  When N = 25,000 this is a very large number, a number far larger than the number of electrons in the universe (usually estimated to be around 10^80).  Because the number is so large it is not even theoretically possible to build a computer that is large enough that it can store, let alone process, this information.  This is the essence of an NP hard problem.  How large is 3^25000?  Basically infinity.  The best I can find is that 3^83 = 3,990,838,394,187,339,929,534,246,675,572,349,035,227.  My only excuse is that I just wanted to make the point that the number of possible genotypes is a very large number, and I made the statement without thinking it through.
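For anyone who wants to check the arithmetic, here is a small sketch.  It counts the decimal digits of 3^25000 using logarithms rather than printing the number itself, and it confirms the value of 3^83 quoted above with exact integer arithmetic.  The helper name is my own, and the digit count is a floating-point estimate rather than an exact computation.

```python
import math


def decimal_digits_of_power(base, exponent):
    """Estimate the number of decimal digits in base**exponent.

    Uses floor(exponent * log10(base)) + 1.  This is a floating-point
    estimate, not an exact big-integer computation.
    """
    return math.floor(exponent * math.log10(base)) + 1


N_LOCI = 25_000          # approximate number of loci, as in the text
GENOTYPES_PER_LOCUS = 3  # two alleles per locus gives three genotypes

digits = decimal_digits_of_power(GENOTYPES_PER_LOCUS, N_LOCI)
print(f"3^{N_LOCI} has about {digits} decimal digits")
print("10^80 has 81 decimal digits")

# Python integers are exact, so 3^83 can be checked directly.
print(f"3^83 = {3 ** 83:,}")
```

So 3^25000 is a number nearly twelve thousand digits long, set against an 81-digit estimate for the number of electrons in the universe.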
