Last week I gave an informal seminar for our Friday lunch group, and realized that not everybody knows what I am talking about when I talk about epistasis. I didn’t do a “what is epistasis” post originally because this was supposed to be a blog about the phenotype, and well, epistasis is genetics. That said, much of what I want to talk about is about epistasis, so here is a bit of remediation for my blog. If you know all about this stuff you can skip it. . .
Our original models of population genetics were based on the “additive dominance” model. That is, all loci act independently. We have to allow dominance, since its existence has been obvious since Mendel’s time. This assumption of only additive effects was originally quite reasonable. Fisher fully understood that there were interactions among loci, and he is in fact the person who first termed these interactions “epistasis”. However, in developing his model he assumed an infinite number of loci, random interactions, and an infinitely large population size. This is a reasonable first set of assumptions for developing the (then) brand new fields of population genetics and quantitative genetics. Physicists and complex systems people will know this as the “mean field approximation”, and they will both laud the successes based on the mean field approximation and lament its limitations.
The problem comes when we start relaxing these assumptions, and allow finite population sizes, non-random interactions, and a finite number of loci. In this case gene interaction starts to become much more important. When it comes to gene interaction we know that enzyme chains can be quite long, and that there are lots of other subtle interactions. Thus, from one perspective there must be epistatic interactions involving tens or even hundreds of loci. Although these interactions are undoubtedly real, there are a number of reasons to think that they can be reasonably ignored, at least over short evolutionary time scales. The main one is that, as a variance component, the contribution of additive-type epistasis (AXA, AXAXA, etc.) to the variance among demes for an N-way interaction is approximately proportional to 2 times the inbreeding coefficient raised to the Nth power (≈ (2f)^N V_{A×...}). In other words, if there were large variances due to high-order epistasis, populations would suddenly blow up and additive genetic variance would blossom out of nowhere when populations were inbred. This doesn’t happen. As a result, I work with two-locus models, figuring that going above two loci is a lot of work for very little return. In any case these low-order models are sufficient to move beyond the rarefied additivistic world we grew up in (that’s supposed to be a pun on atavistic).
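To get a feel for how fast that (2f)^N factor falls off with the order of the interaction, here is a rough Python sketch. It only tabulates the approximate multiplier quoted above; the function name and the particular values of f are illustrative assumptions, not data from any real population.

```python
def among_deme_factor(f, n):
    """Approximate multiplier (2f)^n on an n-way additive-type
    epistatic variance component, as quoted in the text above."""
    return (2 * f) ** n

# Illustrative (assumed) inbreeding coefficients, all well below 0.5.
for f in (0.05, 0.1, 0.25):
    factors = [among_deme_factor(f, n) for n in (1, 2, 3, 4)]
    print(f, [round(x, 4) for x in factors])
```

For modest inbreeding the multiplier shrinks geometrically with each extra locus in the interaction, which is the sense in which the higher-order terms contribute little.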
In any case, if we have a system with two alleles at each of two loci there are 9 possible genotypes. The genotypic values are numbers, and if you know your statistics you know that a set of nine numbers has a mean and 8 degrees of freedom. In other words these 9 numbers can always be divided into a set of 8 vectors that are independent of each other. Any set of 8 vectors that are orthogonal will do, but I like to use the following 8 (sigh. If I was any good with HTML this would be formatted better. . .):
Additive A locus
Additive B locus
Dominance A locus
Dominance B locus
Additive by Additive Epistasis
Additive by Dominance Epistasis
Dominance by Additive Epistasis
Dominance by Dominance Epistasis
Statisticians in the crowd will hate these, since these are not traditional orthogonal contrasts (they don’t all add up to zero), but they are independent of each other, and they do work. (Making them orthogonal contrasts that obey the rules changes the intercepts, but otherwise has no effect other than making them less aesthetic).
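For the curious, here is a minimal Python sketch of one way such a set of vectors could be written down. The coding is an assumption on my part for illustration (additive = -1, 0, 1 and dominance = 0, 1, 0 for the three genotypes at a locus, with the epistasis vectors built as products), and the names are hypothetical. The point is only to check that the mean plus 8 vectors of this kind are linearly independent, even though some of them do not sum to zero.

```python
import numpy as np

# Assumed single-locus codings for genotypes ordered A2A2, A1A2, A1A1.
ONES = np.array([1.0, 1.0, 1.0])   # the mean
ADD = np.array([-1.0, 0.0, 1.0])   # additive
DOM = np.array([0.0, 1.0, 0.0])    # dominance (does not sum to zero)

def two_locus_vectors():
    """Build the mean plus 8 two-locus vectors as outer products.
    Keys read 'A-locus term x B-locus term', e.g. 'Ax1' is additive A,
    '1xD' is dominance B, 'AxD' is additive by dominance epistasis."""
    single = {"1": ONES, "A": ADD, "D": DOM}
    vectors = {}
    for a_name, a_vec in single.items():
        for b_name, b_vec in single.items():
            vectors[a_name + "x" + b_name] = np.outer(a_vec, b_vec).ravel()
    return vectors

vectors = two_locus_vectors()
design = np.column_stack(list(vectors.values()))
rank = np.linalg.matrix_rank(design)
print(rank)  # full rank means the 9 vectors are independent
```

Because the 9 columns have full rank, any set of 9 genotypic values can be written as a unique combination of them, which is all the decomposition needs.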
These values are what Jim Cheverud calls “physiological epistasis”. That is, they are fixed genotypic values that are constants regardless of the characteristics of the population in which they are measured. This is not particularly interesting for studying evolution. Instead we need to think about “statistical epistasis”. Statistical epistasis is genetic variance that can be attributed to gene interaction. Unlike physiological epistasis, statistical epistasis is a property of both the individual and the population in which they are measured.
A quick analogy is appropriate here. I have a fixed height (five feet 5 inches if you must know). I have that height whether I am measured in Holland or Peru. However, if I want to watch a parade, well, in Holland, where people tend to be tall, I am short and since I am an adult at the back of the pack I probably won’t see much. In contrast, in Peru, where people tend to be much shorter, I will be relatively tall, and very likely I will be able to see the parade. My height of 5’ 5” is a fixed value similar to the genotypic values of physiological epistasis, whereas whether I am tall or short is similar to statistical epistasis.
So to convert physiological epistasis into statistical epistasis you need to partition the variance in genotypic values into statistical variance components. This is done by the old platitude of “doing a regression of phenotype on genotype”. Following Falconer and Mackay, here is a good graph of the additive genetic variance for one locus with dominance:
In this figure the additive genetic variance is the variance due to regression, and the dominance variance is the residual variance. For additive by additive epistasis we need to do a multiple regression, which results in a three-dimensional graph:
This graph shows two slopes. The red plane is the regression of phenotype on genotype for the frequency of the A1 allele = 0.5, and the brown plane is the regression for the A1 allele = 0.25. For both regressions the frequency of the B1 allele = 0.5.
The important point is that the statistical components of variance change as gene frequencies change. In the above example, when the gene frequency of the A1 allele changes from 0.5 to 0.25 the additive genetic variance at the B locus changes from zero to being non-zero, with the B2 allele favored. If the gene frequency of the A1 allele had changed to 0.75 instead, the additive genetic variance for the B locus would also have increased, but it would be the B1 allele that was favored (and the graph would not have been as pretty). In the real world we generally do not have access to the actual interacting loci, or their gene frequencies. Thus, we simply have to recognize that the additive genetic variance changes in a complex way as inbreeding, selection and drift act to change the underlying gene frequencies in complex manners.
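The regression of phenotype on genotype described above can be sketched in a few lines of Python. This is a toy version under stated assumptions, not the calculation behind the figures: genotypes are coded as the number of "1" alleles minus one (-1, 0, 1), the genotypic values are pure additive by additive epistasis (the product of the two codes), and the population is in Hardy-Weinberg and linkage equilibrium. All function names are hypothetical.

```python
import numpy as np

CODES = np.array([-1.0, 0.0, 1.0])  # A2A2, A1A2, A1A1 (same for B)

def hardy_weinberg(p):
    """Genotype frequencies for codes -1, 0, 1 given frequency p of allele 1."""
    q = 1.0 - p
    return np.array([q * q, 2 * p * q, p * p])

def additive_regression(p_a, p_b):
    """Weighted multiple regression of genotypic value on genotype codes.
    Returns (slope_a, slope_b, var_add_a, var_add_b)."""
    xa, xb = np.meshgrid(CODES, CODES, indexing="ij")
    xa, xb = xa.ravel(), xb.ravel()
    g = xa * xb  # assumed pure additive-by-additive genotypic values
    w = np.outer(hardy_weinberg(p_a), hardy_weinberg(p_b)).ravel()
    X = np.column_stack([np.ones(9), xa, xb])
    sw = np.sqrt(w)  # weighted least squares via sqrt-weight scaling
    beta, *_ = np.linalg.lstsq(X * sw[:, None], g * sw, rcond=None)

    def weighted_var(x):
        mean = np.sum(w * x)
        return np.sum(w * (x - mean) ** 2)

    # Additive variance at a locus = slope^2 * variance of its genotype code.
    return (beta[1], beta[2],
            beta[1] ** 2 * weighted_var(xa),
            beta[2] ** 2 * weighted_var(xb))

for p_a in (0.5, 0.25, 0.75):
    slope_a, slope_b, va_a, va_b = additive_regression(p_a, 0.5)
    print(p_a, round(slope_b, 3), round(va_b, 3))
```

Under these assumptions the slope for the B locus is zero when the A1 frequency is 0.5, negative at 0.25 (B2 favored) and positive at 0.75 (B1 favored), which is the qualitative pattern described in the text.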