A phenotypic view of evolution Evolution in Structured Populations


How to include group selection without really trying: Adding contextual traits to a selection analysis

Last week I introduced the multivariate breeder’s equation.  Before moving on I want to talk a little about the types of traits that can be included in the selection vector.  Of course it is logical enough to include morphological traits such as body size or leaf area.  Only slightly more complicated are behavioral traits, and these are more complicated only because they are more difficult to quantify.  But what about heritability?  Nancy Burley (1988 Animal Behaviour 36: 1235-1237) found that female zebra finches in one population preferred males that had a red band placed on their leg.  Clearly this is not heritable (leg bands were put there by a scientist!) and yet just as clearly there is a covariance between color of leg band and fitness (read mating success).  Even though this is not heritable there is no reason, other than the obvious silliness of it, that it cannot be included as a trait in the selection analysis.  This actually is something of an important point in that there are often times when selection analyses will be done on traits for which the heritability is unknown, or for which there are confounding environmental or developmental factors.  A good example of this is the original paper describing measuring selection in natural populations, Arnold and Wade (1984. Evolution 38(4): 720-734).  In this study they found that there was sexual selection for body size in frogs, with females preferring larger males.  We know nothing about the heritability of body size, except that there is the confounding factor that older frogs are larger than younger frogs.  Thus, there are both environmental and developmental factors contributing to this trait; nevertheless, the selection analysis is perfectly valid, and this paper ends up being a landmark contribution to evolutionary biology.

zebra finch with band


However, what I really want to talk about is when traits are not clearly attached to the individual.  For example, if we can study selection on bird song dialects, can we study selection on deviation from the “mean” bird song dialect in a particular group (e.g., the rare-male advantage; Spiess 1968.  Am. Nat. 363)?  How about something even further removed from the individual, such as the population density, or the speed of the slowest member of the herd?  It turns out that yes, these can be treated as if they were traits of the individual.  The general approach is called contextual analysis.  In a perfect world it would be so obvious that it wouldn’t have a name, but given biologists’ propensity to name things, and to dismiss that which they don’t understand, I am afraid a name is required.

Contextual analysis was originally developed in the social sciences as a way of including social factors in an analysis of a person’s opinions or behaviors.  For example, whether or not a person votes, or goes to church, or owns a gun is strongly influenced by the social norms in their community.  (I will save space by omitting examples.  You can make up your own.)  This general idea was brought into the realm of biology by Heisler and Damuth (1987. American Naturalist 130: 582-602).  Since then it has been roundly criticized by those who don’t bother to understand it (e.g., West et al. Journal of Evolutionary Biology 20: 415-432), and enthusiastically embraced by those who collect and analyze data (e.g., Eldakar et al. 2010. Evolution 64: 3183-3189; Moorad 2013. Evolution 67: 1635-1648).  Of course I am totally unbiased on this issue. . .

The basic idea is that a selection analysis can include not only “individual” traits (e.g., leaf area), but also group or neighborhood summary traits (e.g., group mean leaf area), and contextual traits, that is traits that cannot be measured on an individual (e.g., population size).  The crazy thing is that when you do this it works.  In an older paper we sort of beat it to death theoretically, and it stood up to all challenges (Goodnight, C. J., et al. 1992. American Naturalist 140:743-761.).  Bottom line is that contextual analysis works.   It is no better or worse than the selection analysis championed by Arnold and Wade.

I have come around to thinking about traits like population density as properties of the individual – that is, population density can be thought of as the population density that an individual experiences.  If there is a classic group structure then all members of the same group will experience the same population density, however, in a viscous continuous population every individual is at the center of their own neighborhood and will experience slightly different population densities.  Thus, in the neighborhood situation it is clear that “contextual” traits really are properties of the individual.

The problem, of course, is that we really don’t know how to measure, or even how to define the heritability of contextual traits.  In principle it is the regression of the population traits that an offspring experiences on the population traits the parents experience; however, standard genetic methods specifically randomize the social environment.  Obviously protocols for measuring the heritability of contextual traits will not be simple modifications of standard breeding methods.  Nevertheless, as pointed out at the beginning of this essay, even if we aren’t sure of the heritability of contextual traits, we can still use them in selection analyses.

The point is that contextual analysis (CA) is really just the idea that you can include contextual traits in a selection analysis.  As such it is not multiple regression per se.  It could be used just as well in a path analysis (contextual path analysis?  I like it) as was done by Stevens et al. (1995. American Naturalist 145: 513-526), as well as any of the various means of measuring selection that have been suggested over the years.

I should deal with a few complaints about CA.  One common one is that the traits we used in Goodnight, Schwartz, and Stevens are a nebulous trait and the group mean of that trait.  This has led many to think you MUST use the group mean of the trait.  The reality is that, just as with any selection analysis, choice of traits is important and nuanced.  In our model we were dealing with theoretical infinitely sized groups, and the group mean of the trait made sense.  In actual experimental situations it may make sense to leave out the focal individual in a jack-knifed group mean (note that in this case every individual’s contextual trait will be slightly different).  In other cases other summary measures may be appropriate.  As an example, in humans single combat, or champion warfare, was used as a means of settling disputes.  Each side would choose a champion and the two champions would compete in a duel.  The side with the winning champion would win the dispute.  In this case the correct summary statistic might be the strength of the strongest member of the group.


“The Jan. 1593 single combat, using war elephants, between Siamese King Naresuan and the Burmese crown prince Crown Prince Minchit Sra – still celebrated in Thai history (statue in Samut Prakan Province, Thailand)” (http://en.wikipedia.org/wiki/Single_combat).
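As a rough illustration of the jack-knifed group mean mentioned above, here is a minimal sketch of a contextual analysis done as a multiple regression. Everything in it is an assumption made for illustration: the group structure, the trait distribution, and the fitness function (a trait that costs the individual but benefits its group mates) are invented, and multiple regression is only one of the ways the analysis could be carried out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, group_size = 200, 10

# Hypothetical individual trait with some between-group variation.
group_effect = rng.normal(0.0, 1.0, size=(n_groups, 1))
z = group_effect + rng.normal(0.0, 1.0, size=(n_groups, group_size))

# Contextual trait: mean of the OTHER members of the focal individual's
# group (a leave-one-out mean, so every individual gets its own value).
context = (z.sum(axis=1, keepdims=True) - z) / (group_size - 1)

# Made-up fitness function: the trait is costly to the individual (-0.3)
# but beneficial to its group mates (+0.5).
b_ind, b_ctx = -0.3, 0.5
W = 2.0 + b_ind * z + b_ctx * context + rng.normal(0.0, 0.1, size=z.shape)
w = W / W.mean()  # relative fitness

# Multiple regression of relative fitness on the individual and contextual traits.
X = np.column_stack([np.ones(z.size), z.ravel(), context.ravel()])
coef, *_ = np.linalg.lstsq(X, w.ravel(), rcond=None)
beta_ind, beta_ctx = coef[1], coef[2]
print(f"individual gradient: {beta_ind:.3f}, contextual gradient: {beta_ctx:.3f}")
```

In this toy setup the individual-level coefficient comes out negative and the contextual coefficient positive, which is the pattern usually read as individual selection and group selection acting in opposite directions.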

The other rather nebulous concern is that CA is somehow flawed, and that kin selection is a better choice.  I will have much more to say about this, but suffice it to say that I have addressed this issue in a recent Evolution article (Goodnight 2013. Evolution 67:1539-1548).  Contextual analysis and the direct fitness approach to kin selection are based on the same equation.  In this respect they are both either flawed or not flawed.  The difference between contextual analysis (and multilevel selection in general) and direct fitness (and kin selection in general) is that contextual analysis measures the strength of selection at a given point in time, whereas kin selection identifies the optimum.  You will almost never see multilevel selection identifying an optimum (the exceptions are studies of stabilizing selection), and you never see a measure of the strength of kin selection.  They do not exist, they never have, and they never will.

The multivariate breeder’s equation

Last week I talked about the univariate breeder’s equation.  A major extension of this was provided by Lande (1979 Evolution 33: 402-416).  In this paper Lande shows how the basic breeder’s equation can be expanded to include multiple traits.  Putting my tongue firmly in my cheek, the basic technique involves (1) separating heritability into its components, and (2) capitalizing all of the letters:

$$\mathbf{R} = \mathbf{G}\mathbf{P}^{-1}\mathbf{S}$$

Yea, multivariate quantitative genetics in one easy lesson!  Of course this assumes you know linear algebra.  For those poor pitiful folk who maybe aren’t completely satisfied with my derivation I perhaps should expand on this.

First, Lande (OK, I am in Montpellier without my reprints, or for that matter reasonable internet, so this is from memory) examined the relationship between brain size and body size.  These are two traits, but they are correlated.  That is larger bodied individuals tend to have larger brains etc., however this relationship is not perfect.  (Nor do I have Illustrator on this computer!).  Thus, this relationship might have looked something like this:

[Figure: scatter plot of brain size against body size, showing a positive but imperfect correlation]

The point is that individuals with larger body sizes tend to have larger brains, but this isn’t always true.  Lande suggests that we consider the traits as being a vector, in this case with two elements:

$$\mathbf{z} = \begin{pmatrix} z_{\text{body}} \\ z_{\text{brain}} \end{pmatrix}$$

There are some interesting points to note about this.  Most important is that although the traits are listed separately, they are correlated.  Thus, if we select on one trait, the other trait will also change.  Thus, even if selection (i.e., survival and reproduction) is ONLY affected by body size, changes in brain size will also occur.  This is rarely talked about, but it will be important when we talk about multilevel selection.  As a result, let’s give it a name.  When selection acts on one trait and causes a change in another trait within a generation, and before reproduction, let’s call it indirect selection.  This is different than a correlated response to selection, which would occur if selection on body size were to change the brain size in the offspring.

$$\mathbf{S} = \begin{pmatrix} s_{\text{body}} \\ s_{\text{brain}} \end{pmatrix}$$

Thus, looking at S alone will not actually tell us what selection is acting upon.  To find this out we need to include the phenotypic covariance matrix.  This is a matrix that has the phenotypic variances on the diagonal, and covariances on the off-diagonal.

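$$\mathbf{P} = \begin{pmatrix} V_{P,\text{body}} & \mathrm{Cov}_P(\text{body},\text{brain}) \\ \mathrm{Cov}_P(\text{body},\text{brain}) & V_{P,\text{brain}} \end{pmatrix}$$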

Of course we need to “divide” by the P matrix, which we cannot do, but we can multiply by the inverse of the P matrix.  This is easy enough with a 2X2 matrix, but if you want to use more than two variables get a computer!

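$$\mathbf{P}^{-1} = \frac{1}{V_{P,\text{body}}\,V_{P,\text{brain}} - \mathrm{Cov}_P^{2}} \begin{pmatrix} V_{P,\text{brain}} & -\mathrm{Cov}_P \\ -\mathrm{Cov}_P & V_{P,\text{body}} \end{pmatrix}$$

(here $V_P$ denotes a phenotypic variance and $\mathrm{Cov}_P$ the phenotypic covariance between body size and brain size)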

With the fervent hope that I got this right from memory (damn, I need internet) we can multiply this with the selection vector to get the “selection gradient”, $\boldsymbol{\beta} = \mathbf{P}^{-1}\mathbf{S}$.  Although I appear to be unable to do the math while sitting here listening to talk of Franz Boas, issues of race and racism, and the new synthesis, this gradient gives us the direct effects of selection.  That is, it mathematically removes the indirect selection from the equation, and only shows the direct effects of selection on the trait.  Thus, if there is selection acting only on body size, both brain size and body size will show changes as a result of selection (the S vector); however, the gradient (beta) will show that there is only selection acting on body size.
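To make the difference between S and the gradient concrete, here is a small numerical sketch. The P matrix and the amount of direct selection are invented numbers, chosen so that selection acts directly on body size only.

```python
import numpy as np

# Hypothetical phenotypic covariance matrix for (body size, brain size).
P = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# Assume direct selection acts on body size only.
beta_true = np.array([0.5, 0.0])

# The selection differentials include indirect selection via the covariance,
# so brain size shows a nonzero differential even though it is not a target.
S = P @ beta_true

# Recover the gradient from S. Solving P x = S is equivalent to multiplying
# by the inverse of P, and is the numerically safer way to do it.
beta = np.linalg.solve(P, S)

print("S    =", S)
print("beta =", beta)
```

The S vector has two nonzero entries, but the recovered gradient is zero for brain size, which is the sense in which the gradient removes indirect selection.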

Finally, the G matrix is the genetic covariance matrix.  (I will always be a bit miffed at Russ for this.  As a person who works on epistasis I know that it should be the A matrix.  If anybody wants to lead the nomenclatural charge, I will provide artillery support!).  This is similar to the phenotypic covariance matrix, except that the variances and covariances are additive genetic variances and additive genetic covariances.  Similarly, the R vector is a vector of the change in mean phenotype between generations.  Thus, we can rewrite the breeder’s equation:

$$\mathbf{R} = \mathbf{G}\boldsymbol{\beta} = \mathbf{G}\mathbf{P}^{-1}\mathbf{S}$$
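A short sketch of the multivariate response, again with invented numbers: the G and P matrices below are hypothetical, and selection is assumed to act directly on body size only.

```python
import numpy as np

# Hypothetical covariance matrices for (body size, brain size).
P = np.array([[4.0, 2.0],     # phenotypic variances and covariance
              [2.0, 3.0]])
G = np.array([[2.0, 1.0],     # additive genetic variances and covariance
              [1.0, 1.5]])

# Selection differentials produced by direct selection on body size only.
S = np.array([2.0, 1.0])

beta = np.linalg.solve(P, S)   # selection gradient, P^{-1} S
R = G @ beta                   # response to selection, G P^{-1} S

print("beta =", beta)
print("R    =", R)
```

Brain size has a gradient of zero, yet its mean still changes in the next generation because of the genetic covariance with body size. That change is the correlated response to selection.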

So, it is really that simple.  There are a couple of random thoughts that need to be mentioned here.  One is the oft-repeated point that with multiple traits a population may appear to “spiral” up a fitness peak:

[Figure: a population mean appearing to “spiral” up a fitness peak]

from:  http://www.bio.tamu.edu/users/ajones/gmatrixonline/whatisg.html

However, there are a number of other issues that are more rarely discussed.  One is that you really can only study selection on a few traits, perhaps four.  More than this is a statistical nightmare.

More importantly, if you actually do a study of this sort you will find that it is very sensitive to the assumption that you have in fact measured the traits that are under selection.  If you add or remove a trait from the analysis you will get a qualitatively different result.  The reason for this is that this analysis of selection is technically misusing multiple regression.  No matter how we dress it up regression is basically a glorified correlation.  Thus properly used we would ask how is fitness best predicted by the traits we have measured.  In fact, in the setup I have outlined we are asking how is fitness caused by the traits we have measured.  Thus we are misusing regression and pretending it is a causal analysis.

For the most part we will live with this lie, however, if we wanted to be honest we would do a path analysis, which Sewall Wright specifically called a causal system of analysis.  To learn this I recommend the out of print book Li, C. C. (1975). Path Analysis: A primer. Pacific Grove, CA, The Boxwood Press. (Interestingly, adjusted for inflation the cost is the same as it was when it was originally printed).  The important insight from this is that properly done the causal paths of path analysis are determined by study and an understanding of the biology of the system.  The statistics is added on afterwards to determine the strength of the paths.  This I think is a generally important point.  Statistics is a tool, and never can and never will substitute for a knowledge of a system.

Another point is that in essence all selection is on “fitness”, and in a sense all selection on traits is indirect selection.  This is why Arnold and Wade called the variance in relative fitness the “opportunity for selection” (Arnold, S. J. and M. J. Wade. 1984. Evolution 38: 709-718).  That particular conceit aside, another point is that traits we measure are almost never the traits that are actually under selection.  Remember that I described the phenotype as a vector that goes from birth to death.  As such when we measure a trait it is usually a measurement made at a particular point, whereas selection is acting on a more cumulative measure of a similar correlated trait.  For example, I once did a study where we measured photosynthetic rates.  These were instantaneous measures made at a particular time of day, on a randomly sampled leaf, at a particular point in the life of the plants.  We made the reasonable assumption that this instantaneous measure was representative of (read correlated with) the trait we were really interested in, which is the intrinsic photosynthetic rate of that plant.  The point of this is that all selection studies need to be taken with a grain of salt.  That said most investigators do a pretty good job, and most results are probably at least qualitatively correct.  We have to be careful not to be curmudgeons that think that all science is useless because we cannot do it perfectly.

The univariate breeder’s equation

Sorry this is late.  I am posting it from a cafe in Montpellier (the place in France, not the Vermont state capital).

I got a question on Facebook asking what the breeder’s equation is.  This is a little embarrassing since I have been acting as if everybody obviously knew this equation, when in fact, only those who have studied quantitative genetics can be expected to be familiar with it.  I will first discuss the original form of the equation that considers only a single trait.  Next week I will talk about the multivariate form of the equation.  Thus, without further ado, the breeder’s equation. . .

The equation is called the breeder’s equation because it is the basic equation that animal breeders use to predict the response to selection.  Often this will have a profit motive.  That is a breeder may want to know whether it is worth the effort to try to select for a desired trait, such as the number of eggs a hen produces, or the amount of marbling in the meat of a cow.  To do this they need to decide how strong a selection pressure they want to apply, and whether or not there is enough genetic variance for the trait to make the result worth the effort.  The basic equation itself is very simple:

$$r = h^2 s$$

Here r is the response to selection, $h^2$ is the heritability and s is the selection differential.  Let’s take these elements one at a time.

I discussed s last week, so that is only a few inches below this post.  You can look at that if you want a numerical example of how s can be calculated.  The selection differential is the difference between the original population mean for the trait and the relative fitness weighted population mean:

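$$s = \frac{1}{N}\sum_{i=1}^{N} w_i z_i - \frac{1}{N}\sum_{i=1}^{N} z_i$$

To make a point it is convenient to rewrite this with $1/N = p_i$ (N being the number of individuals), and also remembering that the mean relative fitness, $\bar{w} = \sum_i p_i w_i$, is 1:

$$s = \sum_i p_i w_i z_i - \left(\sum_i p_i w_i\right)\left(\sum_i p_i z_i\right) = \mathrm{Cov}(w, z)$$

Using this notation it becomes clear that $p_i w_i$ is the frequency after selection, and again reminds us that s is the covariance between a trait and relative fitness.  Again, I refer you to last week’s post for a numerical example.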

One of the big and obvious issues with this is defining fitness.  At one level “fitness” is a nearly metaphysical term for something that is really impossible to measure.  Any one definition that you can come up with has a counter example.  For example, lifetime reproductive success is often given as a measure of fitness, but one only need raise the example of the “grandchildless” mutations in both Drosophila and Caenorhabditis (I suspect they are actually mutations in different genes).  Females that have this mutation have normal fertility, but their children are sterile, and one would thus have to say that they had a fitness of zero.

Fortunately, many years ago Arnold and Wade (1984 Evolution 38: 709-718) came to our rescue on this issue.  In this paper they show that, provided fitness components are multiplicative, we can divide fitness, and with it selection, into multiple episodes.  Summarizing their results (By the way, their efforts to partition the selection gradient don’t work.  I am pretty sure it’s a mistake):

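$$W = W_1 W_2 \cdots W_k \quad\Rightarrow\quad s = s_1 + s_2 + \cdots + s_k$$

(where $W_j$ is the fitness component for episode j and $s_j$ is the selection differential during that episode)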

In words, this is saying that fitness can be divided into a series of episodes.  Thus, for example, you might have “survival to first molt”, “survival to second molt given that you survived the first molt”, etc.  Note that to make the fitness components multiplicative, one simply needs to make them conditional.  Thus, “survival to adulthood” and “mating success” are not multiplicative, but “survival to adulthood” and “mating success given that you survived to adulthood” would be.
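A minimal sketch of the episode idea, using made-up trait values and fitness components. It checks that when the components are multiplicative (the second one conditional on the first), the selection differentials for the episodes add up to the total selection differential.

```python
import numpy as np

# Hypothetical data for five individuals.
z = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # trait values
W1 = np.array([0.2, 0.4, 0.6, 0.8, 1.0])   # survival to adulthood
W2 = np.array([3.0, 3.0, 2.0, 1.0, 1.0])   # mating success GIVEN survival

def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights`."""
    return np.sum(weights * values) / np.sum(weights)

z0 = z.mean()                    # mean before any selection
z1 = weighted_mean(z, W1)        # mean after the survival episode
z2 = weighted_mean(z, W1 * W2)   # mean after both episodes

s1 = z1 - z0                     # selection differential, episode 1
s2 = z2 - z1                     # selection differential, episode 2

# Total selection differential from total fitness, as a covariance.
W = W1 * W2
w = W / W.mean()
s_total = np.mean(w * z) - w.mean() * z.mean()

print(f"s1 = {s1:.4f}, s2 = {s2:.4f}, s1 + s2 = {s1 + s2:.4f}, total = {s_total:.4f}")
```

In this invented example the two episodes pull in opposite directions (survival favors large values, mating success favors small ones), so measuring either episode alone would give a misleading picture of total selection.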

The way that this saves us is that we can simply admit the impossibility of measuring total fitness in nearly all circumstances.  Instead we measure one of these episodes of selection, and call the appropriate fitness component “fitness”.  Thus, we can use the total number of seeds produced as a measure of reproductive success even though we don’t know how many seedlings died before the first census.

There are two caveats that need to be mentioned here.  First, since we are not measuring total fitness we need to be aware that selection during another episode may be countering the selection we measure.  We may find that bigger male frogs have greater mating success indicating that sexual selection is favoring larger males (this is what Arnold and Wade found), and ask why it is that male frogs are not huge.  Perhaps unknown to us is that survival selection favors smaller individuals that are less likely to starve, and it is this counter selection that keeps the giant killer man eating frogs at bay.  The second issue is that this assumes the selection episodes are independent.  If they aren’t there are a whole bunch of cross-products that make the results ugly.  To see where this might happen, imagine you are in a pair of back-to-back foot races.  Clearly, if you run hard in the first race, you will have a better chance of winning the first, but may be too exhausted to perform well in the second.  Thus, the two races are not independent.  There is no very good solution to this second problem other than to be aware that it may complicate your results.

In any case, to repeat myself yet again, selection is quite measurable in natural populations, and even though we cannot measure total fitness, we can still do good biology by settling for working with fitness components.

The second term is the heritability.  This is a formal term, and is the ratio of the additive genetic variance to the phenotypic variance.  Additive genetic variance was originally defined by Ronald Fisher to be the covariance between the average effect and the average excess, so this becomes the formal definition.  It is also unmeasurable, and frankly not useful.  For practical purposes the additive genetic variance is the variance that can be passed from parent to offspring.  Basically, every individual has a “breeding value”, which is the trait value of their offspring when they are randomly mated in the population.  To reiterate, it is not what an individual looks like, it is what their offspring look like.  This leads to the rather odd point that, for example, bulls have a breeding value for milk yield, which is the amount of milk their daughters produce.  The offspring of a pair of parents is expected to be the average of the breeding values of the two parents, but of course for any particular offspring it will vary from this expectation.  The important thing is that the variance in breeding values is, at least for our present purposes, equal to the additive genetic variance.  There are many ways to measure additive genetic variance; all involve some sort of comparison of relatives, such as parent offspring regression or a comparison of sibs and half sibs.  For our story one is worth mentioning, that is the regression of offspring on “midparents”.  In this method a regression is done of the value of the trait in the offspring against the mean of their parents, $(z_{\text{mother}} + z_{\text{father}})/2$.  This does not measure the additive genetic variance per se, but the slope of this regression is equal to the heritability.  The phenotypic variance is simply the variance of the trait in the population.
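The claim that the midparent regression slope equals the heritability can be checked with a small simulation. This is only a sketch under strong assumptions: purely additive gene action, normally distributed breeding values and environmental deviations, random mating, and invented variances (additive 0.4, environmental 0.6).

```python
import numpy as np

rng = np.random.default_rng(42)
n_families = 200_000
VA, VE = 0.4, 0.6              # assumed additive and environmental variances
h2_true = VA / (VA + VE)       # heritability implied by those variances

def phenotype(breeding_values):
    """Phenotype = breeding value + independent environmental deviation."""
    return breeding_values + rng.normal(0.0, np.sqrt(VE), breeding_values.size)

# Parental breeding values and the midparent phenotype for each family.
A_mother = rng.normal(0.0, np.sqrt(VA), n_families)
A_father = rng.normal(0.0, np.sqrt(VA), n_families)
midparent = 0.5 * (phenotype(A_mother) + phenotype(A_father))

# Offspring breeding value: the parental average plus Mendelian segregation
# noise (variance VA / 2 under these additive assumptions).
A_offspring = 0.5 * (A_mother + A_father) + rng.normal(0.0, np.sqrt(VA / 2), n_families)
z_offspring = phenotype(A_offspring)

# Slope of the regression of offspring phenotype on midparent phenotype.
slope = np.polyfit(midparent, z_offspring, 1)[0]
print(f"midparent regression slope = {slope:.3f}, true h2 = {h2_true:.3f}")
```

With this many families the estimated slope should land very close to the heritability used to generate the data.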

Finally, the response to selection is the difference between the mean of the offspring (after selection), $\bar{z}'$, and the mean of the parents (before selection), $\bar{z}$:

$$r = \bar{z}' - \bar{z}$$

Thus, we have our whole equation:

$$r = \bar{z}' - \bar{z} = h^2 s = h^2\,\mathrm{Cov}(w, z)$$

That is the classical breeder’s equation.  This can be seen very nicely using a graph I put up earlier.

[Figure: the breeder’s equation shown as the regression of offspring on mid-parents]

In this figure the regression of offspring on mid-parents is the heritability (I will clean this up later.  I modified it while on a plane).  From it you can see that the slope of the regression line “translates” the change in the mean of the parents into a change in the mean of the offspring.  This figure suffers mainly from the fact that the offspring are variable, thus, there should be a distribution around the regression line.
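The way the regression slope translates a change in the parents into a change in the offspring can also be sketched by simulation. The setup is hypothetical: additive inheritance with normally distributed effects, invented variances, truncation selection that keeps the larger half of the population, and random mating among the survivors.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
VA, VE = 0.4, 0.6              # assumed additive and environmental variances
h2 = VA / (VA + VE)

# Parental generation: breeding values and phenotypes.
A = rng.normal(0.0, np.sqrt(VA), n)
z = A + rng.normal(0.0, np.sqrt(VE), n)

# Truncation selection: only individuals above the median reproduce.
survive = z > np.median(z)
s = z[survive].mean() - z.mean()          # selection differential

# Random mating among survivors (pairing each survivor with a shuffled list
# of survivors; an occasional self-pairing is ignored in this sketch).
A_sel = A[survive]
A_mate = rng.permutation(A_sel)
A_off = 0.5 * (A_sel + A_mate) + rng.normal(0.0, np.sqrt(VA / 2), A_sel.size)
z_off = A_off + rng.normal(0.0, np.sqrt(VE), A_sel.size)

r = z_off.mean() - z.mean()               # observed response to selection
print(f"s = {s:.3f}, observed r = {r:.3f}, predicted h2 * s = {h2 * s:.3f}")
```

Under these assumptions the observed response should match the prediction from the breeder’s equation up to sampling error.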

The last thing to add is that I am presenting an argument for a phenotypic view of evolution.  The important thing about the breeder’s equation is that it tends to work, and it works well, at least for short periods of time.  However, Fisher developed this using the assumption of a specific underlying model of additive gene action.  To allow for non-linearities and the possibility of inheritance of factors other than purely parentally inherited genetic factors the classic breeder’s equation should probably be modified to allow a more generic function for the transition equation.  Thus we might write it as $r = g(s)$, that is the response to selection is an unspecified function of the selection differential.  In most cases multiplying the selection differential by a constant between 0 and 1 (i.e., $g(s) = h^2 s$) will probably give a pretty accurate prediction.

Selection as an Ecological Process

Having discussed relative fitness it is now time to turn to selection.  The first important point is to think about the “breeder’s” equation:

$$r = h^2 s$$


here r is the response to selection, $h^2$ is the heritability, and s is the selection differential.  The important point is that evolution by natural selection (r) has two components, $h^2$ and s.  In other words, there is s, which is the ecological process of selection, and $h^2$, which is a statement about genetics.  This is an important detail.  Selection is an ecological process with no statements made about the patterning process (read genetics) underlying the trait.  I emphasize this because this has often been a source of confusion.  Don’t feel bad if you were confused about this.  No less than John Endler in his book “Natural Selection in the Wild” made the mistake of saying that it was only selection if it was acting on a heritable trait.  If quantitative genetics isn’t enough for you there is a philosophical reason that selection is acting regardless of the heritable basis of a trait.  That is that heritability is a continuous value going from 0 to 1, with 1 meaning that the offspring trait is exactly intermediate between the parents, and 0 meaning that there is no relationship between the parental values of the trait and the offspring value.  If selection can only act on heritable traits the question becomes how low can the heritability be before it no longer counts as selection.  Obviously regardless of what value you may choose it will be arbitrary.  Much better to say that selection is simply an ecological process without regard to the transmission characteristics.

Endler’s mistake is worth commenting on.  He cites the famous “necessary and sufficient” criteria for evolution by natural selection that were first put forward by Lewontin (Lewontin, R. C. 1970. Ann. Rev. Ecol. Syst. 1: 1-18).  That is, Lewontin tells us that for evolution by natural selection to occur you need three conditions:

  1.    There must be variation among individuals in phenotype (phenotypic variation)
  2.    Differences in phenotype must affect the fitness (covariance between fitness and phenotype)
  3.    The phenotypes must be heritable (covariance between parents and offspring)

Lewontin was careful in two important regards.  The one that is important for this essay is that he calls it evolution by natural selection.  No less important, but not the subject of this essay, is that he refers to “heritability” generically, not in the formal sense that implies a well-defined genetic basis.  (Oddly, however, he specifies that it is fitness, not phenotype, that is heritable.  Seems incorrect to me.) Endler’s mistake was to call this natural selection.  For natural selection criteria 1 and 2 are necessary and sufficient.  For EVOLUTION by natural selection all three are required.

Making the distinction between selection and evolution by selection is important both for the conceptual and philosophical reasons outlined above and for a very practical reason.  That is, typically the experiments we do to determine heritability are very different from the experiments we do to study selection.  Studies of the genetic basis for a trait (I don’t call this heritability since heritability studies are a specific subset of methods for studying genetics) are typically done in the lab.  Examples include parent offspring regressions, in which a trait is measured in both the parents and the offspring and the regression of offspring on parents is used to determine the heritability; other quantitative genetic methods such as half sib breeding designs and QTL mapping; and molecular methods in which specific genes are identified and their effect on the phenotype assessed.  Also falling into this category, and deserving special mention because it tends to confuse people, are laboratory selection experiments.  In selection experiments a known selection pressure is applied, so that is not in question.  What is in question is the nature of the response to selection.  Thus, somewhat counter-intuitively, lab selection experiments are really about genetics and heritability.  In contrast, studies of selection are typically done in the field.  Selection is an ecological process, and studying it is really about measuring how an individual’s phenotype affects its fitness in nature.  Thus, under most circumstances it makes little sense to study selection in anything other than intact ecological settings.  I say under most circumstances because science is a broad field, and there are certainly exceptions to any general rule you can make.  A good example of a laboratory study that was concerned with how selection works rather than the response to a known selection pressure is Wade’s experimental study of kin selection (1980, Evolution 34: 844-855).  In this study he examined how changing the relatedness among members of a population changed the response to selection that was naturally occurring under the laboratory culture conditions.  There are also several good documented examples of evolution by natural selection in natural populations (Endler’s book is excellent on this, so having attacked him on what was probably, in his mind, a minor point, let me now recommend the larger message of his book).

Finally, in the Price equation, $\Delta\bar{z} = \mathrm{Cov}(w, z) + E(w\Delta z)$, it is tempting to include the second term, $E(w\Delta z)$, as part of selection, as I believe Okasha does.  I have chosen to call this “environmental change”.  Although $E(w\Delta z)$ has the relative fitness in it this does not imply that it is selection.  The “practical” formula for this would be:

$$E(w\Delta z) = \frac{1}{N}\sum_{i=1}^{N} w_i\,\Delta z_i$$

that is, it is the fitness weighted mean change in phenotype.  The fitness weighting corrects for the consequences of selection, but is not actually selection in this case.  In the absence of selection $w_i = 1$ for every individual; thus if there is no selection this reduces to the average change in phenotype as a result of an ecological event.

Let me end this discussion with a brief example of how selection is quantified.  Selection, s, is simply the difference in the population mean of the trait before and after selection but before reproduction.  That is:

$$s = \bar{z}^{*} - \bar{z}$$

where $\bar{z}^{*}$ is the mean after selection, and $\bar{z}$ is the mean before selection.  For example, say you go out in a pond and measure all the turtles.  There are 10, and they have the weights, in grams, of:

6, 7, 8, 9, 10, 11, 11, 12, 12, and 14 grams, for a mean, $\bar{z}$, of 10 grams.

If little children come along and catch the smallest ones for pets (which is NOT good for the turtles), but leave the big ones (which bite), then the surviving ones might be

11, 11, 12, 12, and 14 grams, for a mean, $\bar{z}^{*}$, of 12 grams.

$$s = \bar{z}^{*} - \bar{z} = 12 - 10 = 2 \text{ grams}$$

The point of this silly exercise is that s is a real number with a scale.  In this case it is two grams.  Selection in the form of taking the little ones away (as pets) increased the mean of the population by two grams.

Now we can bring back in the covariance approach (I just have to say, I used the Price equation for years before I knew it had a name . . . ).  I am assigning a 0 fitness if they were captured, and a 1 fitness if they survived to reproduce.  Then the mean fitness is 0.5, and the relative fitness is the absolute fitness divided by the mean fitness thus:

Absolute fitness   Relative fitness   Phenotype (grams)
0                  0                  6
0                  0                  7
0                  0                  8
0                  0                  9
0                  0                  10
1                  2                  11
1                  2                  11
1                  2                  12
1                  2                  12
1                  2                  14

Using the covariances:

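$$s = \mathrm{cov}(\tilde{w}, z) = \frac{1}{N}\sum_{i} \tilde{w}_i z_i - \bar{\tilde{w}}\,\bar{z} = \frac{2(11 + 11 + 12 + 12 + 14)}{10} - (1)(10) = 12 - 10 = 2 \text{ grams}$$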

I hope that is not too sophomoric, but I find it helpful to actually spell out the math once in a while so that everybody is clear on what is happening. The bottom line is that, as Price told us, the selection differential is simply the covariance between a trait and relative fitness. Note that this example fulfills Lewontin’s criteria. There is phenotypic variation, and that phenotypic variation covaries with fitness; thus we have fulfilled the first two criteria, and selection is occurring. I made no statements about the heritability of body weight; thus we do not know whether or not the phenotype is heritable (Lewontin’s third criterion), and we can make no statements about whether or not the selection will lead to evolution.
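For readers who would rather check the arithmetic by machine, here is a short Python sketch that computes the selection differential both ways, using the ten turtle weights listed in the text. The helper names are my own, and the covariance is the population covariance (dividing by N rather than N − 1).

```python
# Turtle weights in grams, from the example in the text.
weights = [6, 7, 8, 9, 10, 11, 11, 12, 12, 14]
# Absolute fitness: 0 if captured, 1 if the turtle survived to reproduce.
abs_fitness = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

def mean(xs):
    return sum(xs) / len(xs)

def pop_cov(xs, ys):
    """Population covariance (divides by N, not N - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Relative fitness is absolute fitness divided by mean absolute fitness.
w_bar = mean(abs_fitness)
rel_fitness = [w / w_bar for w in abs_fitness]

# Method 1: mean after selection minus mean before selection.
survivors = [z for z, w in zip(weights, abs_fitness) if w == 1]
s_direct = mean(survivors) - mean(weights)

# Method 2: covariance between relative fitness and the trait.
s_cov = pop_cov(rel_fitness, weights)

print(s_direct, s_cov)
```

Both routes give a selection differential of 2 grams.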

I am traveling to Montpellier France next week for the International Society for History Philosophy and Social Studies of Biology meeting (Giving a talk on community selection, and participating in a roundtable discussion on individuality).  I am not sure whether or not there will be a post next week.

Putting the fun back in the fundamental theorem

One final entertaining aspect of relative fitness is Fisher’s Fundamental Theorem.  In his 1930 book Fisher states “The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.”  This is a bit grandiose as will become clear below, but it is an interesting fact of algebra and the definition of relative fitness that this is in fact the case.

Selection table

By definition the mean relative fitness before selection is 1 (don’t make me do the math!).  The mean relative fitness after selection is:

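$$\bar{w}' = \sum_g P_g' w_g = \sum_g P_g w_g^2$$

where Pg and Pg’ are the frequencies of the gth genotype before and after selection respectively (so that Pg’ = Pg wg), and wg is the relative fitness of the gth genotype. So, the change in relative fitness is:

$$\Delta\bar{w} = \bar{w}' - \bar{w} = \sum_g P_g w_g^2 - 1$$

At this point it is convenient to square the 1 (I am allowed to do that right?):

$$\Delta\bar{w} = \sum_g P_g w_g^2 - 1^2 = \sum_g P_g w_g^2 - \left(\sum_g P_g w_g\right)^2 = \mathrm{var}(w)$$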

And thus we see that for Haldane selection Fisher’s Fundamental theorem is easy to the point of being silly. I constantly forget how to derive FFT. Every time I re-derive it, at some point it smacks me in the face as being almost bizarrely obvious.
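A small numerical check in Python, as a sketch only. It assumes the additive one-locus fitness scheme used in the relative fitness post (normalized fitnesses 1, 1 − S/2 and 1 − S at Hardy-Weinberg-Castle frequencies); the function name and the parameter values are arbitrary choices of mine.

```python
def haldane_fft(p, S):
    """Return (change in mean relative fitness, variance in relative fitness)."""
    q = 1 - p
    freqs = [p * p, 2 * p * q, q * q]        # A1A1, A1A2, A2A2 before selection
    norm_fitness = [1.0, 1 - S / 2, 1 - S]   # additive normalized fitnesses
    w_bar = sum(P * w for P, w in zip(freqs, norm_fitness))
    rel = [w / w_bar for w in norm_fitness]  # relative fitnesses

    # Frequencies after selection are frequencies before times relative fitness.
    freqs_after = [P * w for P, w in zip(freqs, rel)]

    mean_before = sum(P * w for P, w in zip(freqs, rel))       # equals 1
    mean_after = sum(P * w for P, w in zip(freqs_after, rel))
    variance = sum(P * w * w for P, w in zip(freqs, rel)) - mean_before ** 2
    return mean_after - mean_before, variance

delta, variance = haldane_fft(p=0.3, S=0.4)
print(delta, variance)
```

The two numbers agree to rounding error for any p and S in this scheme, which is the algebraic point: the change is measured against the relative fitnesses that held before selection.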

Well, naturally the same thing can be done without genes.  If we have a set of individuals that vary in relative fitness then the mean relative fitness is

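$$\bar{\tilde{w}} = \frac{1}{N}\sum_{i=1}^{N} \tilde{w}_i = 1$$

Similarly, the mean relative fitness of the population after selection, but before reproduction, is

$$\bar{\tilde{w}}' = \frac{1}{N}\sum_{i=1}^{N} \tilde{w}_i \cdot \tilde{w}_i = \frac{1}{N}\sum_{i=1}^{N} \tilde{w}_i^2$$

And the change in relative fitness is

$$\Delta\bar{\tilde{w}} = \frac{1}{N}\sum_{i=1}^{N} \tilde{w}_i^2 - 1 = \frac{1}{N}\sum_{i=1}^{N} \tilde{w}_i^2 - \left(\frac{1}{N}\sum_{i=1}^{N} \tilde{w}_i\right)^2 = \mathrm{var}(\tilde{w})$$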

Note the important difference:  in the Haldane equations we had the hubris to think that we exactly knew how the genetics affected fitness.  Thus we could get a version of Fisher’s fundamental theorem that went from one generation to the next.  In the phenotypic example I did not specify the patterning node or the transmission rules at all, thus I did not give enough information to make any statements about the next generation.  Nevertheless, working within a generation we are able to derive a form of the fundamental theorem that works quite well.

FFT equation 7
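The within-generation version can be checked with the turtles from the selection example, where five individuals have relative fitness 0 and five have relative fitness 2. This is only a sketch: it assumes that "after selection" means each individual is weighted by its own relative fitness, and the variable names are mine.

```python
# Relative fitnesses from the turtle example: captured = 0, survived = 2.
rel_fitness = [0, 0, 0, 0, 0, 2, 2, 2, 2, 2]
n = len(rel_fitness)

# Mean relative fitness before selection (1 by definition).
mean_before = sum(rel_fitness) / n

# After selection, individual i is represented in proportion to its relative
# fitness, so the mean becomes the average of the squared relative fitnesses.
mean_after = sum(w * w for w in rel_fitness) / n

# Population variance in relative fitness before selection.
variance = sum((w - mean_before) ** 2 for w in rel_fitness) / n

print(mean_after - mean_before, variance)
```

Both quantities come out to 1: the survivors all have relative fitness 2 on the pre-selection scale, so the mean rises from 1 to 2, and the variance before selection is also 1.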

If, for some reason, terrorists lock you in a cell for three weeks with Fisher’s “The genetical theory of natural selection”, and you actually read the book (OK, it’s only page 35, you can do it), as I said before, he states “The rate of increase in fitness of any organism at any time is equal to its genetic variance in fitness at that time.” So he was actually talking about the change that occurs between generations. The genetic variance he was referring to is the “additive genetic variance”, and the fitness he was referring to is the relative fitness.

Much has been written about the fundamental theorem.  Many have questioned its usefulness, with, for example, Lewontin calling it “Fisher’s not-so-fundamental theorem” (Lewontin, R. C. 1970. The Units of Selection. Annual Review of Ecology and Systematics 1: 1-18.).  The issue is that while the fundamental theorem is a mathematical truism, the real world is not always so simple that it is actually useful.  Fisher himself was obviously aware of this, as the last part of his chapter on the fundamental theorem is in fact about the deterioration of the environment, and what would eventually be called the Alice in Wonderland situation (Lerner 1954 genetic homeostasis  — you didn’t think I would cite Van Valen did you?).  If I could put words in his mouth I suspect that Fisher felt that in most situations the environment deteriorated at just about exactly the same rate at which fitness increased.  I am certain that he felt that the situation in which there was a steady increase in the fitness of a population over many generations would be rare indeed.

Incidentally, I find it interesting that Fisher spoke of “deterioration of the environment” in 1930, which Lerner claimed in 1954 as the “Alice in Wonderland situation”, and Van Valen claimed in 1973 as the red queen hypothesis. It is clear that this idea of the environment deteriorating becomes available for naming rights every 20 years more or less. Since nobody has claimed it in over 20 years, I hereby claim the “Cheshire Cat hypothesis”, which is the idea that the environment deteriorates at a rate fast enough blah blah blah, you get the picture. From this point forward everybody should refer to this idea as “Goodnight’s Cheshire Cat hypothesis”. This naming right shall last for 20 years, at which time somebody else can claim it since nothing will be left but the grin.


(taken from http://jieli1985.wordpress.com/2012/03/30/quantum-cheshire-cat-even-weirder-than-schrodingers/)

The most interesting variant of the fundamental theorem is probably the Price equation.  This has already come up, and will come up again in the future.  (Price is a very interesting fellow, and a close colleague of Hamilton of Hamilton’s rule fame).

First, it is worth pointing out that a variance is the covariance of a trait with itself.  Thus, the variance in relative fitness can also be thought of as the covariance of relative fitness with itself.  This immediately allows us to generalize Fisher’s fundamental theorem.  That is, the change in a trait is equal to the covariance between that trait and relative fitness.  To see this consider a trait z, that varies among individuals, thus zi is the state of the trait in the ith individual (note, as usual, no transition equations!).  The change in z as a result of selection, but before reproduction is:

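$$\begin{aligned}
\Delta\bar{z} &= \frac{1}{N}\sum_{i} \tilde{w}_i z_i - \frac{1}{N}\sum_{i} z_i \\
&= \frac{1}{N}\sum_{i} \tilde{w}_i z_i - \left(\frac{1}{N}\sum_{i} z_i\right)\left(\frac{1}{N}\sum_{i} \tilde{w}_i\right) \\
&= \mathrm{cov}(\tilde{w}, z)
\end{aligned}$$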

Note, I took the liberty of multiplying the right half of the first line by 1, but of course, a very special 1, which is the mean relative fitness.  This turns out to be a special case of the Price equation.  At the risk of repeating myself, Okasha (2006. Evolution and the levels of selection. New York, Oxford University Press.) gives a very nice quick derivation of the basic Price equation (Price himself uses non-standard and somewhat awkward notation).

$$\bar{w}\,\Delta\bar{z} = \mathrm{cov}(w_i, z_i) + E(w_i\,\Delta z_i)$$

So, as with the Red Queen hypothesis, we learn that there is nothing new under the sun, and that the Price equation is essentially Fisher’s fundamental theorem. Given that the Price equation is over 20 years old, and also directly traceable to Fisher, perhaps I should claim this one as well. Anybody up for the Goodnight equation?

There are actually two caveats that make the Price equation different from Fisher’s fundamental theorem. First, Fisher assumed that the environment was constant and that E(wΔz) = 0 (he probably actually assumed that Δzi = 0). Second, and significantly, notice that Price used absolute fitness rather than relative fitness. This is significant because Price was a colleague of Hamilton, and because Hamilton developed optimality models he (and Price) would not have seen the importance of using relative fitness. For me as a starting graduate student this was a huge problem, and a big part of the reason that kin selection models were absolutely incomprehensible to me for many years. Although the choice of scale for fitness is irrelevant for optimality models, for rate of evolution approaches (quantitative genetics) it is absolutely essential. Finally, it is important to point out that the E(wΔz) term is environmental change, the 5th force of evolution I discussed earlier.
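To see both terms at work, here is a Python sketch that checks the absolute-fitness form of the Price equation, mean fitness times the change in the trait mean equals cov(w, z) + E(wΔz), on a tiny made-up data set. Everything in it is hypothetical: the trait values, the offspring counts standing in for absolute fitness, and the per-parent changes Δz.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical parents: trait value, number of offspring (absolute fitness),
# and the average difference between each parent and its offspring.
z = [1.0, 2.0, 3.0, 4.0]
w = [0, 1, 2, 3]
dz = [0.5, -0.5, 0.0, 0.25]

w_bar = mean(w)
z_bar = mean(z)

# Mean trait among offspring: parent i leaves w[i] offspring whose mean
# phenotype is z[i] + dz[i].
z_bar_next = sum(wi * (zi + di) for wi, zi, di in zip(w, z, dz)) / sum(w)

# Left side: mean fitness times the change in the trait mean.
lhs = w_bar * (z_bar_next - z_bar)

# Right side: covariance term plus the fitness-weighted change term.
cov_wz = mean([wi * zi for wi, zi in zip(w, z)]) - w_bar * z_bar
e_w_dz = mean([wi * di for wi, di in zip(w, dz)])
rhs = cov_wz + e_w_dz

print(lhs, rhs, cov_wz, e_w_dz)
```

Setting every entry of `dz` to zero removes the second term, leaving only the covariance between fitness and the trait.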

Of course, there is much more to do with the Price equation. For those who are not faint of heart, check out Steve Frank’s work, either his book or his paper on the subject (Frank, S. A. 1997. The Price equation, Fisher’s fundamental theorem, kin selection, and causal analysis. Evolution 51: 1712-1729).

A discussion of the Price equation is not complete unless we extend it to structured populations. Basically, if we imagine that there are I individuals in each of J demes, then the relative fitness and trait value of the ith individual in the jth deme are $\tilde{w}_{ij}$ and $z_{ij}$ respectively. And we can take the Price equation and further divide it into within and between group components. It’s a lot of silly hard algebra, and eventually nearly everything cancels out, so I will spare you the gory details. In the final analysis the Price equation can be partitioned out as:

FFT equation 9


FFT equation 10

In words, we can divide the total change in a trait z into an among group covariance with group mean relative fitness, the average within group covariance between the trait and individual fitness, and similar changes in the trait itself.  Now a word of caution on a future topic.  I once talked to Steve Frank about the Price equation, and if I remember correctly, he denied ever calling cov(w.j,z.j) group selection.  It isn’t. He knows it, I know it, and now you know it.  (At some point in the future I will show you why it is not group selection.)
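The covariance part of this partition is easy to verify numerically. The Python sketch below covers only the selection terms, not the change-in-trait terms, and assumes equal deme sizes; the three demes and all their numbers are invented for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

def pop_cov(xs, ys):
    """Population covariance (divides by N, not N - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: J = 3 demes with I = 3 individuals each.
demes_z = [[1, 2, 3], [2, 4, 6], [5, 5, 8]]   # trait values
demes_w = [[1, 0, 2], [1, 1, 3], [0, 2, 2]]   # absolute fitnesses

# Relative fitness is scaled by the mean fitness of the whole population.
w_bar = mean([w for deme in demes_w for w in deme])
demes_rel = [[w / w_bar for w in deme] for deme in demes_w]

all_rel = [w for deme in demes_rel for w in deme]
all_z = [z for deme in demes_z for z in deme]

# Total covariance between relative fitness and the trait.
total = pop_cov(all_rel, all_z)

# Among-deme covariance of deme means, and the average within-deme covariance.
among = pop_cov([mean(d) for d in demes_rel], [mean(d) for d in demes_z])
within = mean([pop_cov(rw, dz) for rw, dz in zip(demes_rel, demes_z)])

print(total, among + within)
```

The total covariance equals the among-deme covariance plus the mean within-deme covariance. With unequal deme sizes the deme terms would need to be weighted by deme size.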

Relative Fitness

My intention over the next several weeks is to delve into the five forces of evolution – selection, mutation, migration, drift, and environmental change – in more detail. I will deal with the forces separately, but all of the forces interact.  For example drift is not particularly exciting unless there is selection to magnify its effects.  Similarly, there are (as yet unpublished – I’m working on it) examples of selection having drift like effects on genetic variance.  Finally, in Wade and Goodnight (1991. Science 253: 1015 – 1018) we imposed selection by differential migration.  Thus, it should be clear that the various forces are not independent.  It is not so clear, and honestly I am not certain, whether the forces really are logically distinct, or arbitrary boundaries dictated by the human need to categorize things.

(Please excuse some of the unfortunate formatting:  I don’t seem to be much of a mathematical type setter)

As I pointed out last time, even if they may be fuzzy around the edges these forces logically divide the world:

  • Selection – deterministic change
  • Drift – dispersive change
  • Mutation – random change
  • Migration – change in elements
  • Environmental change – change due to external causes.

I will start with selection, and today’s lecture, er, I mean post, will be on relative fitness, and unfortunately boring. One of the things that I have discovered as I have been doing this blog is that I have these very nice compact concepts, such as how selection works. When I start to unpack these concepts I discover they are big and messy, and that there are a lot of important details that cannot be just brushed under the carpet, even if I think they are obvious. Relative fitness is one of those concepts.

It is easiest to introduce relative fitness using a purely genetic system (is it any wonder that the gene centered view is so commonplace, given that concepts are often much easier to explain that way!). Fortunately, J.B.S. Haldane (http://en.wikipedia.org/wiki/J._B._S._Haldane) provided a very easy way of modeling selection in a one locus two allele system. We start with standard HWC proportions, but add to that a new entry, fitness. Notice that, in Haldane’s very gene centric view, phenotype is simply skipped.

Genotype    A1A1   A1A2      A2A2
Frequency   p²     2pq       q²
Fitness     1      1 − S/2   1 − S

Here S is the strength of selection, which is unfortunate, since in my next post S will mean something else.  Sigh, different modeling traditions lead to confusing symbol choices.  This is an example of “additive gene action”.  Every A1 allele confers a fitness of ½, and every A2 allele confers a fitness of (1-S)/2.  Thus an A1A1 genotype has a fitness of ½ + ½ = 1 etc.  In future posts I will talk about non-additive systems.  In non-additive systems genes interact so that the phenotype is not just the sum of the underlying genes.

The fitnesses are “normalized” fitnesses. That is, they are numbers that are arbitrarily scaled. Absolute fitness is the actual value of the fitness trait. In this example it may be that A1A1 genotypes lay 1000 eggs, A1A2 lay 750 eggs, and A2A2 lay 500 eggs, or perhaps they are probabilities of survival of 100%, 95% and 90%. In the first example the absolute fitnesses are 1000, 750 and 500, and in the second they are 100, 95 and 90. Normalizing is done by dividing by the absolute fitness of the best genotype so the best genotype has a fitness of 1, and (since the A2A2 fitness is 1 − S) S = 0.5 in the first case and 0.1 in the second case. The problem with normalized fitness is that it is not very useful for predicting the outcome of selection. It should be obvious that A1A1 has the highest fitness, so the population will eventually fix for the A1 allele (and A2 will be eliminated), but from the normalized fitnesses alone we can’t say anything about the rate at which evolution will proceed.
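As a concrete illustration, here is a small Python sketch of the normalization step applied to the egg-count example. The function name is mine, and S is read off on the assumption that the A2A2 fitness is 1 − S, as in the additive scheme described above.

```python
def normalize(abs_fitness):
    """Divide every absolute fitness by that of the best genotype."""
    best = max(abs_fitness)
    return [w / best for w in abs_fitness]

# Egg counts for A1A1, A1A2 and A2A2, from the example in the text.
eggs = [1000, 750, 500]
norm = normalize(eggs)

# In the additive scheme the A2A2 fitness is 1 - S, so S can be read off directly.
S = 1 - norm[2]

# Check that the heterozygote sits at 1 - S/2, as additivity requires.
heterozygote_is_additive = abs(norm[1] - (1 - S / 2)) < 1e-12

print(norm, S, heterozygote_is_additive)
```

Nothing in the output depends on gene frequencies, which is why these numbers alone say nothing about the rate of change.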

To determine the change in gene frequency as a result of selection we need to know the “relative” fitness. To get relative fitness the first step is to calculate the mean fitness in the population. I am going to do this using mathematical notation, as this will allow it to be generalized beyond the simple one-locus two-allele case:

$$\bar{w} = \sum_g P_g w_g = p^2(1) + 2pq\left(1 - \tfrac{S}{2}\right) + q^2(1 - S) = 1 - Sq$$

Here $\bar{w}$ is the mean normalized fitness, Pg is the frequency of the gth genotype (p², 2pq or q²) and wg is the genotype’s normalized fitness. The equation is general, regardless of the number of genotypes or their frequency, but the result (1 − Sq) is specific to this simple additive system. I mention that only because 1 − Sq is one of those meaningless numbers that is burned irrevocably into my skull. Don’t let that happen to you!

With the mean normalized fitness we can get the relative fitness by dividing each normalized fitness by the population mean normalized fitness.  The frequency after selection is simply the frequency before selection times the relative fitness.  Thus:

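Genotype                     A1A1          A1A2                    A2A2
Frequency before selection   p²            2pq                     q²
Normalized fitness           1             1 − S/2                 1 − S
Relative fitness             1/(1 − Sq)    (1 − S/2)/(1 − Sq)      (1 − S)/(1 − Sq)
Frequency after selection    p²/(1 − Sq)   2pq(1 − S/2)/(1 − Sq)   q²(1 − S)/(1 − Sq)

Now, we can calculate the rate of change due to selection. Remembering that the frequency of the A1 allele can be calculated as the frequency of the A1A1 homozygotes plus half the frequency of the heterozygotes:

$$\Delta p = p' - p = \frac{p^2 + pq\left(1 - \tfrac{S}{2}\right)}{1 - Sq} - p = \frac{Spq}{2(1 - Sq)}$$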

which is only important because it shows that we can in fact talk about the rate at which selection works. The relative fitness highlights a couple of important points, however. Most importantly, the relative fitnesses have the term (1 − Sq); that is, there is a gene frequency in them. Thus, unlike the normalized fitnesses, the relative fitnesses are a function not only of the individual, but also of the population in which they are measured, and they will change as the gene frequencies in the population change. The other thing to notice is that the mean relative fitness is always one (should be obvious), and that those genotypes that will increase in frequency have a relative fitness greater than one, and those that will decrease in frequency have a relative fitness less than one.
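The whole calculation fits in a few lines of Python. This is a sketch under the additive scheme described above (normalized fitnesses 1, 1 − S/2 and 1 − S, with genotypes starting at HWC proportions); the function name and the test values of p and S are my own choices.

```python
def select_once(p, S):
    """One round of selection in the additive one-locus, two-allele model.

    Returns (mean normalized fitness, relative fitnesses, new frequency of A1).
    """
    q = 1 - p
    freqs = {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}
    fitness = {"A1A1": 1.0, "A1A2": 1 - S / 2, "A2A2": 1 - S}

    # Mean normalized fitness, summed over genotypes.
    w_bar = sum(freqs[g] * fitness[g] for g in freqs)

    # Relative fitness: normalized fitness divided by the mean.
    rel = {g: fitness[g] / w_bar for g in freqs}

    # Frequency after selection is frequency before times relative fitness.
    after = {g: freqs[g] * rel[g] for g in freqs}

    # A1 frequency: homozygotes plus half the heterozygotes.
    p_next = after["A1A1"] + after["A1A2"] / 2
    return w_bar, rel, p_next

w_bar, rel, p_next = select_once(p=0.5, S=0.5)
print(w_bar, rel, p_next - 0.5)

# The same genotype has a different relative fitness in a different population.
_, rel_rare, _ = select_once(p=0.1, S=0.5)
print(rel["A1A1"], rel_rare["A1A1"])
```

The last two numbers differ even though the normalized fitness of A1A1 is 1 in both populations, because the mean fitness depends on q.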


The other thing to notice is that, other than being indexed by genotype, the relative fitnesses have no mention of the genes in them. Thus, we could easily re-write the mean fitness of the population indexing it by individuals: $\bar{w} = \frac{1}{N}\sum_i w_i$. In the example developed above it is simply a different means of indexing by individuals rather than genotypes, so it should give the same mean fitness, although the calculations would not be as pretty. The relative fitness of the ith individual then becomes $\tilde{w}_i = w_i / \bar{w}$ ($\tilde{w}$, an italic w with a tilde over it, is the universal symbol for relative fitness).


Relative fitness is used very frequently in talking about natural selection, so think of this as a tutorial rather than a real blog entry! By the way, in your meanderings through population genetics you will discover that people frequently refer to fitnesses that are normalized so that the most fit genotype has a fitness of 1 as relative fitness. This is a holdover from Haldane’s models, which as you saw above involved dividing by the absolute fitness of the most fit genotype. This normalized fitness has almost no utility outside of Haldane’s models, so I strongly urge you to refer to what Haldane did as “normalization” of fitness and what I showed you here as relative fitness.


The Forces of Evolution

There is more to say about inheritance, and I will get back to it in the future, but for the moment I want to move on and give an overview of the forces of evolution.  As before, it is useful to return to the Hardy Weinberg Castle (HWC) equilibrium I discussed a couple of weeks ago.  To remind you, the one-locus two-allele HWC equilibrium is a situation in which the genotype frequencies:

A1A1   A1A2   A2A2
p²     2pq    q²

remain constant through time. That is, the gene frequencies, genotype frequencies and phenotype frequencies don’t change. As mentioned before, the beauty of the HWC equilibrium is that if we use a genetic definition of evolution (change in gene frequencies), then when the HWC equilibrium is holding, no evolution is occurring.

The reason the HWC equilibrium is a convenient starting point is that we know exactly the conditions under which it holds. That is, the HWC equilibrium will hold if five conditions are met. These are: no mutation; no migration; no selection; very large population size (no genetic drift); and random mating.

Thus, we can quickly see that the forces of evolution are mutation, migration, selection and genetic drift. It turns out that non-random mating (the fifth condition) does not change gene frequencies (it changes genotype frequencies), and thus does not qualify as a force of evolution under this narrow definition.

To make it stand out, the forces of evolution under this narrow definition are:

Mutation,
Migration,
Selection,
Genetic Drift.

Yellow brick road

The problem, of course, is that “change in gene frequencies” is not an adequate definition. For the phenotype-based definition I am using (change in the distribution of phenotypes in a population due to the gain, loss, or replacement of individuals), it will be necessary to alter the definitions of the forces of evolution.

These four forces continue to be valid when moving to a phenotypic view of evolution. In particular they are actually the only four logical ways that we can have the addition or loss of individuals from a population in a constant environment. What changes is the tendency to tie the forces explicitly to genes. So, without further ado:

Selection: Selection is in many ways the easiest of the four forces. Selection is a main subject of quantitative genetics. In the standard quantitative genetics view evolution by selection is broken up into selection and the response to selection. Selection is an ecological process in which some individuals have a greater chance of reproducing than do other individuals. Note that it says nothing about genetics or inheritance. Thus, selection is qualitatively no different than what you do when you go to the grocery store and choose chicken noodle soup rather than cream of mushroom soup. The “genetics” comes in in the form of heritability, which in standard quantitative genetics is simply a constant of proportionality that converts within generation change in the mean of the population due to selection into between generation changes in the appearance of offspring.

Selection changes the distribution of sources of information (parents) that can contribute to that transition equation, but does not change the actual transition equation. Any time the probability that a source will be included in the transition equation is correlated with the source’s phenotype (i.e., not random with respect to phenotype) it can be considered selection.

Mutation: Campbell and Reece (Biology 8th edition) define mutation as “A change in the nucleotide sequence of an organism’s DNA, . . .”. This is an appropriate definition of genetic mutations, but actually not quite appropriate for an evolutionary perspective.

If an individual has a mutation occur during their lifetime this is not evolution.  It is actually (and rather counter intuitively) “development”.  That is, it is a change in an individual, not the gain or loss of an individual.  On the other hand, if an individual is formed from a pair of gametes, one of which is mutant, then this is an evolutionary change.  Thus, if an individual starts with one genotype, and due to a mutation ends up with another, that would technically be development, however, it would be evolutionary change if those mutational changes were passed on to their offspring.

From a phenotypic perspective mutation is a random change in the information contributing to the transition equation.  Thus, for example if the offspring of two AA individuals has an Aa phenotype, this indicates that there was a random change in the information contributing “genotype” to the patterning node, and results in the gain (or replacement) of an individual with a randomly different phenotype.  Note that this works perfectly well for any random change in the phenotype of an individual, whether it is genetic, cultural, or due to some other cause.  Thus mutation can be defined as a random change in the information contributing to the patterning node of an individual.

Migration:  This one is easy.  Migration is simply the gain or loss of individuals (and their phenotypes) because they move into or out of a population.

Drift:  Drift is traditionally seen as a random change in gene frequency due to sampling effects in a small population.  A phenotypic view of drift would be similar.  Drift is the random change in distribution of phenotypes that occurs when population sizes are less than infinite, and as a result this randomly changes the distribution of phenotypes that can contribute to the next generation.
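A phenotypic version of drift can be sketched in a few lines of Python. The sketch assumes a fixed population size, no selection, and offspring that copy their parent's phenotype exactly; the starting phenotypes, the seed, and the function name are arbitrary choices of mine.

```python
import random

def drift_generation(phenotypes, rng):
    """Each offspring copies the phenotype of a parent drawn at random, with replacement."""
    return [rng.choice(phenotypes) for _ in phenotypes]

rng = random.Random(1)                            # fixed seed so the run is repeatable
founders = [6, 7, 8, 9, 10, 11, 11, 12, 12, 14]   # hypothetical starting phenotypes

population = list(founders)
means = []
for generation in range(5):
    means.append(sum(population) / len(population))
    population = drift_generation(population, rng)

print(means)
```

The mean wanders from generation to generation even though every parent has the same chance of being sampled, and no phenotype absent from the founders can ever appear.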

I believe these four forces logically cover all of the possible forces of evolution in a constant environment.  Selection is deterministic changes due to fitness differences, mutation is random changes due to unpredictable changes in the patterning node of individuals, migration is adding or removing individuals from the population, and drift is random changes due to sampling in a finite population.

Now the thing that has disturbed me:  I have always accepted that all evolutionary change can be attributed to one of these four forces.  This may be true for a gene centered view of evolution, but it will not be true for the phenotypic centered view.  In one of the earlier postings I pointed out that a secular change in the environment, such as global warming, could cause phenotypic changes in a population that would have to be considered evolution.  This form of evolutionary change does not fall into any of the other previous categories, and thus we must include a new force: environmental change.  For those readers who are mathematically inclined it should not come as a surprise that this force exists.  It is equivalent to the “transmission bias” term that is a part of the Price equation discussed by Okasha (2006, Evolution and the levels of selection).  I will present a full discussion of why we need to include this force in a future post.

The phenotypic view: Stopping rules and Quantitative Genetics

As Henry Louis Mencken once said “For every complex problem there is an answer that is clear, simple and wrong.”  Dawkins’ gene centric view is exactly that: clear, simple and wrong.  In its place I suggested a phenotypic view of evolution in which phenotypes are viewed as creating new phenotypes with a transition equation describing this process.  The most obvious elements of the transition equation are genes, but there are numerous other factors ranging from cytoplasmic factors and vertically transmitted bacteria to culture and inherited environmental factors (e.g., taking over the parent’s nesting site when they die).

This raises the question of whether we have traded the incorrect gene centered view for one that is, while perhaps closer to the truth, nevertheless impossibly complex, and therefore intractable. The answer lies in deciding what we want to know and what we need to know about a system.

In the gene centered view of evolution we really don’t have a good understanding of how evolution acts on a gene until we have identified that gene, know its sequence, know the allelic variants and their effect on the phenotype, and all of the pleiotropic forces acting on that gene.  This information, which might be thought of as gene ecology and gene physiology, but certainly not gene evolution, has bred a sense that if you don’t understand everything there is to know about a gene you really can’t study evolution.

A good example is the alcohol dehydrogenase gene (ADH) in Drosophila. Since the discovery of ADH in 1964 (Johnson and Denniston Nature 204:906) and the identification of the latitudinal gradient in the ADH fast/slow allele polymorphism in 1973 (Vigue & Johnson Biochem. Genet. 13:721) an enormous amount has been discovered about the structure and function of the ADH locus. For example, we know its structure and the regions that are polymorphic (figure below), and we know that selection is affecting the F/S polymorphism (McDonald & Kreitman 1991 Nature 351:652):

ADH gene

Map of the ADH locus (taken from Powell 1997 Progress and Prospects in Evolutionary Biology: The Drosophila Model. Oxford University Press, page 423). Note that we know the specific location of the single nucleotide polymorphism that gives rise to the fast and slow alleles.

Nevertheless, all of this knowledge of the gene structure has told us little about the selective forces that create and maintain the ADH cline, nor has it explained the rather substantial variation in alcohol knockdown resistance that is seen within the different ADH genotypes:

ADH activity

ADH activity in 47 lines of D. melanogaster each with a different homozygous second chromosome from a natural population, but identical for all other chromosomes (taken from Powell 1997 Progress and Prospects in Evolutionary Biology: The Drosophila Model. Oxford University Press, page 423).

The point is not to criticize the very fundamental and interesting work on the ADH gene.  Rather, my point is that from the perspective of understanding the evolution of alcohol tolerance at the phenotypic level, this tremendous detail concerning the gene physiology has done little to advance our understanding of either selection or inheritance of this trait.  On the other hand, from a genic perspective, it is clear we still do not know enough about the ADH gene to explain its evolution in the manner we would like.

Unfortunately, demonstrating that the genic approach is not adequate does not save the phenotypic approach.  It should be clear that using the phenotypic approach the “true” transition equation from one phenotype to the next is impossibly difficult, and almost certainly complex in the formal sense of the word.  This, however, turns out to not be too serious a problem, and one for which the solution is already known.  In particular, we need to recognize that there is a difference between what we need to know, and what we want to know.  Certainly, we would like to know everything there is to know about the factors affecting a trait such as alcohol metabolism.  From a phenotypic evolution perspective what we need to know is enough to predict the offspring phenotype from the parental phenotypes.

The point is that we can work to define the between generations transition equation in ever greater detail;  however, at some point we can decide that we know the transition equation well enough and stop at that level of detail.  The best starting point for this is classic quantitative genetics.

Anybody who has studied quantitative genetics (which, sadly, is a small and dwindling number) knows that it was introduced by Fisher, and if you read the classic text of Falconer and Mackay, you know that the field was originally developed from a theoretical perspective of the phenotypic effects of numerous unmeasurable underlying loci. However, after giving it some serious thought I realized that what Fisher really did was make the world safe for blending inheritance.

In the early part of the last century, around the time of the rediscovery of Mendel, there were two schools of thought surrounding inheritance (see Provine “The origins of theoretical population genetics” 2001, U of C Press). One of these schools was the Biometricians. The Biometricians wanted to believe in blending inheritance, but they were caught in the statistical problem that with bi-parental inheritance the variance in a population should go down by half every generation. This is a crippling problem for the blending inheritance model. Fisher came along and showed that with an infinite number of Mendelian loci of infinitesimal effect (the infinitesimal model) the offspring mean would in fact be the average of the two parents, but that because of hidden variation (the segregation variance) the variance in the population would not decline. Thus, simplistically, quantitative genetics is based on the assumption of blending inheritance with no loss of variation.

The point for our story, however, is that Fisher suggested a simple regression of offspring on midparents to describe the transition equation between generations. The slope of this simple regression is (in theory) the heritability, and is proportional to the response to selection (yes, I am getting ahead of myself). (By the way, I say in theory, because Fisher presented a very explicit model of additive genetic variance, and by extension heritability, in terms of the underlying Mendelian loci. It turns out that the regression of offspring on midparents conforms to this formal definition of heritability only under certain circumstances.)

Breeders equation

The regression of offspring on midparents.  The slope of the regression is the heritability, which is proportional to the response to selection based on the breeder’s equation.  Empirically there is ample evidence that in many cases this simple regression approach is predictive of the response to selection, often for 100 or more generations – and actually far longer than it should work if the simple genetic model on which it is based is correct.
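As a rough illustration of using the regression as a transition equation, here is a Python sketch that estimates the slope of offspring on midparent values by least squares and plugs it into the breeder's equation (response = heritability × selection differential). All of the numbers are made up for the example, and the function names are mine.

```python
def mean(xs):
    return sum(xs) / len(xs)

def slope(xs, ys):
    """Least-squares slope of ys on xs: cov(x, y) / var(x)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data, invented for illustration only.
midparent = [8.0, 9.0, 10.0, 11.0, 12.0]   # average phenotype of each pair of parents
offspring = [9.0, 9.5, 10.0, 10.5, 11.0]   # mean phenotype of their offspring

h2 = slope(midparent, offspring)   # regression slope, read as the heritability
sel_diff = 2.0                     # hypothetical selection differential
response = h2 * sel_diff           # breeder's equation

print(h2, response)
```

With a slope of 0.5, half of the within-generation shift in the mean is predicted to show up in the next generation. Nothing in the calculation asks why the offspring resemble their parents.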

Although Fisher (and Falconer and Mackay) developed this model on the assumption of underlying genetic effects, the truth is that the linear regression is probably a reasonable first approximation of the transition equation from parent to offspring for a wide range of possible modes of inheritance. Although we may be very curious, we do not need to know why this regression works; we simply need to know that it does in fact work. This simple transition equation in many cases may work well enough and we can stop there. In other cases a more complex transition equation might be called for. In the example of alcohol metabolism in Drosophila I would be tempted to use a linear model that had both a continuous effect (parental alcohol resistance) and a discrete effect (genotype at the ADH locus).

Thus, although the “true” transition equation is indeed impossibly complex, in almost all cases a relatively simple model will be good enough.  As evolutionary biologists the goal is to predict the phenotype in the next generation.  For this job quantitative genetics, or an extension of classic quantitative genetics, is the fundamentally correct approach.  How much detail we choose to put into describing the transition equation will be a function of how good a prediction we need to make.  Given that humans are inherently curious, we almost certainly will want to understand the details of why our transition equation works, but when we choose to do that, we need to be clear that we are studying gene anatomy, gene physiology, or gene ecology, but we are not studying the evolutionary biology of the original trait.


The phenotype as the center of Evolution

When I first learned about evolution I was taught that evolution was change in gene frequency.  As I pointed out before, this definition is inadequate; however, the gene frequency definition has the interesting property that evolution can be described in terms of deviations from the Hardy-Weinberg-Castle (HWC) equilibrium.  For all its faults the HWC equilibrium gives us a familiar starting place for discussing the forces of evolution.

Consider a one-locus two-allele system with an A1 and an A2 allele.  If the frequency of the A1 allele is p, and the frequency of the A2 allele is q (p + q = 1), then the HWC proportions are simply given by the expansion of (p + q)^2:

Genotype    A1A1    A1A2    A2A2
Frequency   p^2     2pq     q^2

This result was independently “discovered” by Hardy, Weinberg, and Castle, which is why it bears their name.  (Note I put discovered in quotes.  Hardy, at least, was always a bit embarrassed that his name was attached to such an obvious result, and Castle’s graduate student, Sewall Wright, is reputed, although I can’t find a reference to this, to have referred to it as the “well known law of proportions”).

Importantly, the HWC equilibrium (as opposed to the proportions) is a phenomenon in which the HWC proportions are maintained for multiple generations, and gene frequencies and genotype frequencies do not change.  This was the beauty of the “change in gene frequencies” definition of evolution.  The HWC equilibrium forms a null hypothesis for that definition:  If the HWC equilibrium holds then evolution is not occurring.
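The difference between the proportions and the equilibrium can be seen in a small sketch.  It assumes the usual idealized conditions (random mating, and no selection, drift, mutation, or migration), and the starting population is a hypothetical one with no heterozygotes.  The function names are mine.

```python
def hwc_proportions(p):
    """Expected genotype frequencies (A1A1, A1A2, A2A2) for A1 frequency p."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def allele_frequency(genotypes):
    """Frequency of the A1 allele given (A1A1, A1A2, A2A2) frequencies."""
    f11, f12, _f22 = genotypes
    return f11 + 0.5 * f12


def next_generation(genotypes):
    """One generation of random mating under the idealized assumptions above."""
    return hwc_proportions(allele_frequency(genotypes))


# Hypothetical starting population that is NOT in HWC proportions:
# half A1A1, half A2A2, no heterozygotes, so p = 0.5.
start = (0.5, 0.0, 0.5)
gen1 = next_generation(start)
gen2 = next_generation(gen1)

print("start:", start)
print("gen 1:", gen1)
print("gen 2:", gen2)
```

One generation of random mating produces the HWC proportions, and after that neither the gene frequencies nor the genotype frequencies change.  That unchanging state is the null hypothesis.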

To move towards a phenotypic view of evolution we need to think about the informational function of genes.  As I described in my last post, genes are a bit complex in that they have three different functions.  They transmit information between generations, they interact with other genes and other heritable elements and the environment to create the phenotype, and they (or at least their sequence and position in the genome) are part of the phenotype.  It is the transmission of information between generations aspect of genes that is of interest here.

Think of the birth of a new individual as the creation of a new phenotype based on standard rules.  In our simple one-locus two-allele genetic system the rules are that the offspring’s phenotype is determined by randomly selecting one allele from each parent.  It is convenient to think of the phenotype of the offspring as being determined by a “transition equation” that converts the phenotypes of the parents into the phenotype of the offspring.

In our simple genetic system the simplest phenotype is the number of A1 alleles.  Thus, an A1A1 individual has a phenotype of 2, an A1A2 a phenotype of 1, and an A2A2 a phenotype of 0.  From this we can easily set up a table of the possible phenotypes that any pair of parents can produce:


Mean offspring phenotype (and possible offspring genotypes) for each pair of parents:

                                     Father’s phenotype (and genotype)
Mother’s phenotype (and genotype)    2 (A1A1)             1 (A1A2)                  0 (A2A2)
2 (A1A1)                             2 (A1A1)             1.5 (A1A1 & A1A2)         1 (A1A2)
1 (A1A2)                             1.5 (A1A1 & A1A2)    1 (A1A1 & A1A2 & A2A2)    0.5 (A1A2 & A2A2)
0 (A2A2)                             1 (A1A2)             0.5 (A1A2 & A2A2)         0 (A2A2)


Thus, in this simple world the transition equation translating the phenotypes of the parents into the mean phenotype of the offspring is clear, and easily described as the mean of the two parents.  Note that even here, however, there is a stochastic element.  We can predict the mean phenotype for the offspring, but not its actual value.  For example in a cross between two A1A2 individuals, the mean phenotype of the offspring is one, but it is impossible to exactly predict the phenotype of a specific offspring from the parental phenotypes.
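For readers who want to check the table, here is a minimal sketch that enumerates the four equally likely allele combinations for each cross.  The dictionary and function names are my own, and the only assumption is the Mendelian rule described above: one allele drawn at random from each parent.

```python
from fractions import Fraction
from itertools import product

# Phenotype = number of A1 alleles; the genotype is the pair of alleles.
GENOTYPES = {2: ("A1", "A1"), 1: ("A1", "A2"), 0: ("A2", "A2")}


def offspring_distribution(mother, father):
    """Probability of each offspring phenotype for a given pair of parents."""
    dist = {}
    for m_allele, f_allele in product(GENOTYPES[mother], GENOTYPES[father]):
        phenotype = (m_allele == "A1") + (f_allele == "A1")
        dist[phenotype] = dist.get(phenotype, 0) + Fraction(1, 4)
    return dist


def mean_offspring(mother, father):
    """Expected offspring phenotype for a given pair of parents."""
    dist = offspring_distribution(mother, father)
    return sum(phenotype * prob for phenotype, prob in dist.items())


# The transition equation: the mean offspring phenotype is the midparent value.
for mother, father in product((2, 1, 0), repeat=2):
    assert mean_offspring(mother, father) == Fraction(mother + father, 2)

# The stochastic element: a cross between two A1A2 parents has mean 1,
# but any single offspring may be 0, 1, or 2.
print(offspring_distribution(1, 1))
```

The mean is fixed by the parents, but the distribution around that mean is what keeps us from predicting any single offspring.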

In real systems with gene interaction, maternal effects, epigenetics, culture, etc., the transition equations governing the formation of phenotypes are far more complicated, and almost certainly contain both discrete (genes) and continuous (culture) elements.  Although it may be computationally difficult, there is no conceptual difficulty in adding additional factors to the “transition equation” from parental phenotypes to offspring phenotypes.  In essence the transition equation of phenotypes in one generation to phenotypes in the next is simply the rules of inheritance for the factors that contribute to the patterning node described in the previous post.

Nothing is lost and much is gained by viewing these Mendelian rules as “transition equations” that convert phenotypes in one generation into phenotypes in the next generation.  For example, one of the serious problems with the gene-centric view of evolution is how to incorporate cultural inheritance.  Culture is inherently a continuous phenomenon.  You learn language primarily from your parents, but also from many others.  I grew up in the Midwest, but my children grew up in Vermont.  They swallow their t’s just like any red-blooded Vermon’er does, and they did not learn that from me.  Language is certainly heritable, but the rules are not discrete or simple like the Mendelian rules I outlined above.  Incorporating continuous factors such as culture, and the complications of multiple cultural “parents”, is a technical difficulty, but not a conceptual difficulty for the phenotypic view of evolution.

Note the strong contrast of this phenotypic view to the Dawkinsian genic view of evolution.  In the phenotypic view of the world “genes” take on the subservient role of simply being mathematical constructs (transition equations) that have a physical reality in the form of DNA.  In the genic view genes are central in evolution.  In this view the genes are immortal “replicators” and create “vehicles” (phenotypes) that carry them forward to the next generation.  You can’t make this stuff up.  Here is the real quote:  “The fundamental units of natural selection, the basic things that survive or fail to survive, that form lineages of identical copies with occasional random mutations, are called replicators.  DNA molecules are replicators.  They generally, . . ., gang together into large communal survival machines or ‘vehicles’.  The vehicles that we know best are individual bodies like our own.” (Dawkins, The Selfish Gene).  From a phenotypic perspective this quote is just silly.

I think Dawkins’s concept of the “meme” is particularly telling.  As I pointed out, culture is intrinsically continuous and no problem for the phenotypic view of evolution; however, for the genic view it is a huge problem.  In Dawkins’s view the gene as an object is the center of evolution.  Culturally inherited traits cannot be objects from the genic perspective unless they are atomized.  The meme is an attempt to force this intrinsically continuous concept into the particulate framework that is essential for the genic view of evolution.

Historically I find this fascinating, since in the early 1900’s the biometricians refused to accept the idea of particulate inheritance.  Fisher came along and saved us by explaining how particles of infinitesimal effect could appear continuous.  From that we became so enamored of genes and particulate inheritance that, when our theory eventually was confronted with an aspect of inheritance that was continuous, we find Dawkins being the mirror image of the biometricians and refusing to accept continuous inheritance.

If I may be a bit sarcastic for a moment (and I have always wanted to do this), to paraphrase Wolfgang Pauli, Dawkins isn’t right, he’s not even wrong.  The only “replicator” I am aware of was invented by Xerox, and in the living world the only “vehicles” I am aware of are horses, but only when they are wearing saddles.  (My colleagues point out that we need to include donkeys, camels and especially elephants – can’t forget elephants – in our definition of vehicles.)

Defining phenotype

Having divided the formation of the phenotype into a patterning node and non-heritable nodes, I want to spend some time discussing what is and isn’t part of the phenotype.

The phenotype is the interface of an object with its environment.  Everything, even a rock, has a phenotype.  It is probably useful to use a somewhat philosophical concept of the “phenotype” as the sum of all aspects of the organism that interact with the environment, or conceptually a gestalt concept of the totality of the organism’s appearance, behavior, and physiology.

Such a concept of phenotype is not particularly useful in practice; thus, it is convenient to divide the phenotype into “characters” or “traits” (I will use the two more or less interchangeably).  A trait is a measurable aspect of the phenotype that is of interest to us.  Thus, for example, many flowers have ultraviolet nectar guides that attract pollinators.  Humans were unable to detect ultraviolet radiation until it was discovered by Johann Ritter in 1801 (http://coolcosmos.ipac.caltech.edu/cosmic_classroom/classroom_activities/ritter_bio.html).  Basically, the distinction I am making between phenotype and trait means that nectar guides were always part of the phenotype of flowers, but didn’t become traits until 1801, when humans were able to detect them.  Finally, given that a trait is a particular measurable aspect of the phenotype, I will use the term “trait state” (or character state) to refer to the value that a given individual has for a particular trait.  Thus, for the trait “height” my trait state is five feet five inches.

Aspects of the phenotype obviously include those traits that are of interest to evolutionary biologists, such as body weight, running speed, amylase activity, and foraging behavior, but also traits that are generally not of interest because they are ephemeral, or have little chance of being heritable.  Examples of the latter would include such things as the time since the last meal, what that meal was, tattoos and other body art, and the color of the clothes a person wears.  We don’t normally think of these as “traits”; however, non-heritable changes, such as a person getting a tattoo or a lizard losing its tail, are an important aspect of the phenotype.

My definition of evolution is about the gain and loss of individuals.  This raises an issue about traits that change over time.  Many of these traits will be of trivial interest.  For example, whether you are inhaling or exhaling would seem to be part, if perhaps an uninteresting part, of your phenotype at any given moment.  On the other hand, weight is clearly part of the phenotype of an organism, and it can fluctuate with development and time.  The question is how to deal with traits that change, whether slowly or quickly.  My own solution, and maybe others have a different solution, is to think of an individual’s phenotype as a vector through time.  In this view every trait has two values: the trait value and the age at which the trait was measured.  Thus, we could measure body weight at age 6, 18, and 32.  These would all presumably be different.  We could also count the number of fingers on the left hand, and discover that in nearly all cases this was constant.  In the finger number example, while the time element is technically there, there is little reason to keep track of it.
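One way to picture the phenotype as a vector through time is as a list of records, each pairing a trait value with the age at which it was measured.  This is only a sketch of the bookkeeping.  The class name, the trait names, and the numbers for this hypothetical individual are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    trait: str    # name of the trait, e.g. "body_weight_kg"
    age: float    # age in years at which the trait was measured
    value: float  # trait state at that age


def trait_through_time(measurements, trait):
    """Return (age, value) pairs for one trait, ordered by age."""
    return sorted((m.age, m.value) for m in measurements if m.trait == trait)


def is_constant(measurements, trait):
    """True if every recorded value of the trait is the same."""
    values = {m.value for m in measurements if m.trait == trait}
    return len(values) <= 1


# Hypothetical individual measured at ages 6, 18, and 32.
records = [
    Measurement("body_weight_kg", 32, 70.0),
    Measurement("body_weight_kg", 6, 20.0),
    Measurement("body_weight_kg", 18, 60.0),
    Measurement("left_hand_fingers", 6, 5),
    Measurement("left_hand_fingers", 18, 5),
    Measurement("left_hand_fingers", 32, 5),
]

print(trait_through_time(records, "body_weight_kg"))
print(is_constant(records, "left_hand_fingers"))
```

For a trait like finger number the age column carries no information, which is why we usually drop it without thinking about it.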

When I talk about multilevel selection it will also be useful to think of the phenotype as including some aspects of the social environment.  For example, there will be times when the size of the group an individual experiences can be treated as a trait of the individual.  This is best left for a later discussion.  Briefly, however, these supra-individual level traits will be useful to consider when features of the group affect an individual’s fitness.

It is worth discussing genes at this point as an example of what is and isn’t part of the phenotype.  In particular, genes actually have three more or less independent functions.

First, they are passed from parent to offspring, and they carry with them information.  This is traditionally what one thinks of when referring casually to genes.  Basically genes are units of heritable information that are passed vertically between generations.  It is worth noting that as we have gotten a better handle on the structure of the genome, the concept of what a gene really is has become less clear.  That is, a gene is a strip of DNA, but it may be DNA that is transcribed and translated into a protein, DNA that is transcribed into one of the many kinds of RNA without being translated into a protein, or it might be one of the many controlling elements.  For our purposes it really doesn’t matter; the first function is that it is information that is passed from parent to offspring.

Second, genes are part of the patterning node, and as such they participate in the formation of the phenotype.  This is a distinct function from the inheritance aspect, a point that is made very obvious by the single-celled protist, Paramecium.  Paramecium have two nuclei, a micronucleus and a macronucleus.  Briefly, the micronucleus is reserved for reproduction, and is only used for transmitting information when the cell divides.  On the other hand, the macronucleus has multiple copies of the genome, and functions in protein synthesis and the management of the cell.  In animals we have only a single nucleus that functions both for information transfer and protein synthesis; however, even here there are cells that become “polytene”, meaning that they have multiple copies of the genome.  These tend to be cells that will no longer divide, but have a heavy protein synthesis load, such as salivary gland cells in Drosophila, and liver cells in humans.


Figure:  Drawing of a Paramecium (http://www.flickr.com/photos/worldworldworld/4095866648/sizes/o/in/photostream/).  The micronucleus, which divides when the cell divides, transmits information between the mother and daughter cell.  The macronucleus has multiple copies of the genome and produces the proteins and RNA that interact with the environment to produce the phenotype.

The third aspect of genes is that they are part of the phenotype.  That is, they are traits of an individual since they have a sequence, and a position in the chromosome.  As such they have to be considered part of the phenotype.  For example, there are circumstances in which we might want to identify and select animals with a particular allele at a particular locus.  In this case we are in fact selecting for those animals that have the trait state of having the desired gene sequence at the locus of interest.  Typically we are selecting for individuals with certain alleles because we hope it will have a desired effect on some other trait; however, this is correctly viewed as a “correlated response to selection”.  That is, we are selecting individuals with the desired gene sequence, and the phenotype changes because it is correlated with the genic phenotype. (I have not talked about it, so this has to remain an aside; however, notice that even in this case selection is on individuals, not genes.  With the exception of transposable elements I have not been able to come up with a single example that can legitimately be called “genic selection”.  I will talk about this when I discuss selection.)

Thus, in summary, the gene sequence, and corollaries such as the location of a band on a gel, can be considered to be part of the phenotype.  The information that the gene carries, and its function in the patterning node in creating other characteristics of the organism, are not part of the phenotype.

