Models and laboratory studies of multilevel selection are all well and good, but, of course, the real question is how important group selection is in nature. All proselytizing by old dead white men aside (ok, I admit, I am an old live white man), this is an empirical question that no amount of theorizing will answer. To answer it we need a tool. Fortunately, that tool, contextual analysis, was provided to us by the social sciences, by way of a paper by Heisler and Damuth (1987 Am. Nat. 130: 582-602). I already talked about this in an earlier blog post.

Before describing contextual analysis it is worth spending some time on multivariate selection analysis. Arnold and Wade (1984 Evolution 38: 709-718; Evolution 38: 720-734) started with the standard multivariate breeder’s equation (shown here in four forms):
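With R as the response to selection and S as the selection differential, two of the equivalent ways of writing the equation can be sketched as follows. The bar, star and prime notation for the three population means is assumed here.

$$
R = G P^{-1} S
$$

$$
\bar{Z}' - \bar{Z} = G P^{-1} \left( \bar{Z}^{*} - \bar{Z} \right)
$$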

Where S is the change in a vector of traits, Z, as a result of selection but before reproduction, and R is the change in Z in the next generation. Z is a vector of K traits (Z_{1} to Z_{K}), \bar{Z} is the mean of the original population, \bar{Z}^{*} is the fitness weighted mean of the population, that is, the mean of the population after selection but before reproduction, and \bar{Z}' is the mean of the offspring in the next generation. G and P^{-1} are the additive genetic covariance matrix and the inverse of the phenotypic covariance matrix, respectively. Again, this is something I have talked about before.

Arnold and Wade pointed out that in quantitative genetics there is a distinction between selection and the response to selection. The response to selection, R, is really selection, S, passed through the filter of the patterning compartment (in quantitative genetics, GP^{-1}). In other words, S is logically about ecology and best studied in the field, while R is about inheritance and under most circumstances best studied in the lab (or in any case using breeding designs). As a result, a selection study will typically measure only S and ignore inheritance.

Most people combine P^{-1}S as the selection gradient β:
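Written out, and substituted back into the breeder's equation (a restatement of the definitions in the text, with β a vector of K gradients, one per trait):

$$
\beta = P^{-1} S, \qquad R = G \beta
$$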

Without showing the math, β is the vector of direct effects of selection. The problem with using S alone is that a trait can change both from selection acting directly on the trait and from selection acting on a trait with which it is phenotypically correlated. The selection gradient avoids this problem by mathematically removing the indirect effects of selection. Note that this indirect selection is different from a correlated response to selection. Indirect selection is a change that occurs within a generation as a result of selection acting on another trait, and it occurs when there are phenotypic correlations. In contrast, correlated responses to selection depend on genetic correlations, and are a response to selection (in the offspring!) in one trait due to selection acting on another trait.
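To make the direct versus indirect distinction concrete, here is a minimal numerical sketch. The covariance matrix `P` and the gradient `beta_true` are invented for illustration and are not taken from any data set. Selection acts only on the first trait, yet the second trait still shows a nonzero selection differential until the phenotypic correlation is removed.

```python
import numpy as np

# Hypothetical phenotypic covariance matrix for two traits with unit
# variances and a phenotypic correlation of 0.6 (invented values).
P = np.array([[1.0, 0.6],
              [0.6, 1.0]])

# Assumed direct selection: only trait 1 is under selection.
beta_true = np.array([0.5, 0.0])

# The selection differential mixes direct and indirect effects: S = P beta.
S = P @ beta_true

# Removing the phenotypic correlations recovers the gradient: beta = P^{-1} S.
beta = np.linalg.solve(P, S)

print("S    =", S)     # trait 2 changes within the generation (indirect selection)
print("beta =", beta)  # but its direct selection gradient is zero
```

Using `np.linalg.solve` rather than explicitly inverting `P` is just the usual numerical habit; either gives the same gradient here.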

In an earlier post I showed that:

where p_{i} is the frequency of the *i*th type, and w_{i} is its relative fitness. From this it should be apparent that the change in phenotype due to the direct effects of selection is:

where ∇ is called the gradient. The point is that if we do a regression of relative fitness on phenotype the slope of this regression line will be the strength of selection. More importantly, it is the strength of selection with the indirect selection effects stripped off. As an example, consider a sample data set I made up (One feature of occasionally teaching an applied statistics course: I can fudge data with the best of them!)

In this data set fitness is a function of Z1 plus a little noise, so that we get at least a little error variance. Thus, if we do a simple regression of fitness on phenotype we get:

Notice that we get a significant relationship between Z2 and fitness, even though selection is acting only on Z1. Indeed

In this example ΔZ2 is purely indirect selection because of the phenotypic correlation with Z1. Fortunately, when we do the partial regression this issue is resolved:

Note that the estimate for Z1 is virtually identical to the estimate from the simple regression. Since we know that selection was only on Z1, in theory they should be identical, but in this example the sample size is small and I put in some stochastic variation in the relative fitness measure. On the other hand, it is now obvious that there is no selection acting on Z2, with the parameter estimate being almost zero, and not even “marginally insignificant”.
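The same pattern can be reproduced with a short simulation. This is a sketch on synthetic stand-in data, not the data set discussed above: the sample size, the correlation of about 0.6, the slope of 0.15, and the noise level are all assumptions chosen for illustration. It compares the simple regressions with the partial regression, and also checks that the partial regression coefficients match P^{-1}S computed from the covariances.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed so the sketch is repeatable
n = 5000

# Synthetic stand-in data (NOT the data set from the post).
z1 = rng.normal(size=n)
z2 = 0.6 * z1 + 0.8 * rng.normal(size=n)               # phenotypic correlation near 0.6
fitness = 1.0 + 0.15 * z1 + 0.05 * rng.normal(size=n)  # selection acts on Z1 only
w = fitness / fitness.mean()                           # relative fitness

# Simple regressions: one trait at a time.
slope_z1 = np.polyfit(z1, w, 1)[0]
slope_z2 = np.polyfit(z2, w, 1)[0]

# Multiple (partial) regression: intercept, Z1 and Z2 together.
X = np.column_stack([np.ones(n), z1, z2])
coef, *_ = np.linalg.lstsq(X, w, rcond=None)
partial_z1, partial_z2 = coef[1], coef[2]

# Same gradient by the covariance route: beta = P^{-1} S, with S = cov(w, z).
C = np.cov(np.vstack([z1, z2, w]))
P, S = C[:2, :2], C[:2, 2]
beta = np.linalg.solve(P, S)

print("simple slopes :", slope_z1, slope_z2)
print("partial slopes:", partial_z1, partial_z2)
print("P^-1 S        :", beta)
```

With these settings the simple slope for Z2 is clearly positive, while its partial slope sits near zero and the Z1 estimate barely moves.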

The point is that, as we all know, multiple regression on correlated characters will give a different answer than will simple regression. In practical terms this means that it may appear that selection is acting on one trait, when in fact it is acting on a second, correlated trait. As long as we measure that second trait and include it in the multiple regression, we will in fact measure only the direct effects of selection.

I am rehashing this issue that most are aware of because next week I will be talking about contextual analysis. It is important to remember this issue with indirect selection, as it will rear its ugly little head in the form of apparent group selection when none is acting. To get an intuitive sense of where this will be going, contextual analysis IS the Arnold and Wade selection approach applied to a situation where Z1 is an individual trait and Z2 is the group trait. In this case, if there is only individual selection acting, then there will be indirect selection at the group level. That is, because groups made up of high fitness individuals will have a high average fitness, it will appear that there is group selection acting, even though that is not true.
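As a preview of that argument, here is a minimal sketch on simulated data. The number of groups, the group size, the slope of 0.1 and the use of the group mean as the group trait are all assumptions made for illustration. Fitness depends only on the individual trait, yet a regression on the group mean alone suggests group selection. Including both traits removes it.

```python
import numpy as np

rng = np.random.default_rng(7)  # fixed seed so the sketch is repeatable
n_groups, group_size = 200, 20

# Individuals belong to groups that differ in their mean trait value.
group_id = np.repeat(np.arange(n_groups), group_size)
group_effect = rng.normal(size=n_groups)
z = group_effect[group_id] + rng.normal(size=group_id.size)  # individual trait (Z1)

# Group trait (Z2): the mean of z in each individual's own group.
group_mean = (np.bincount(group_id, weights=z) / group_size)[group_id]

# Only individual selection: fitness depends on z, not on the group.
fitness = 1.0 + 0.1 * z + 0.05 * rng.normal(size=z.size)
w = fitness / fitness.mean()  # relative fitness

# Regression on the group trait alone: apparent group selection.
apparent_group_slope = np.polyfit(group_mean, w, 1)[0]

# Regression on both traits together: direct effects only.
X = np.column_stack([np.ones(z.size), z, group_mean])
coef, *_ = np.linalg.lstsq(X, w, rcond=None)
individual_gradient, group_gradient = coef[1], coef[2]

print("group mean alone     :", apparent_group_slope)
print("individual | group   :", individual_gradient)
print("group | individual   :", group_gradient)
```

The group mean picks up a slope close to the individual gradient when used alone, and a partial coefficient near zero once the individual trait is in the model.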