12.6 - Hardy-Weinberg Equilibrium


In diploid populations (with 2 copies of each chromosome), consider a single locus (a SNP or other variant) with two alleles:

  • A with frequency p
  • a with frequency (1 - p)

Then, under random mating with equal survival probability for all offspring:

  • AA has a frequency of  \(p^2\)
  • Aa has a frequency of  \(2p(1-p)\)
  • aa has a frequency of  \((1-p)^2\)
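
As a quick numerical check of the three proportions above, here is a minimal Python sketch. The function name `hw_genotype_frequencies` and the example value p = 0.7 are illustrative choices, not part of the original notes.

```python
def hw_genotype_frequencies(p):
    """Expected (AA, Aa, aa) frequencies when allele A has frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p  # frequency of allele a
    return p * p, 2.0 * p * q, q * q


# Illustrative allele frequency (an assumption, chosen only for the example).
freq_AA, freq_Aa, freq_aa = hw_genotype_frequencies(0.7)
print(freq_AA, freq_Aa, freq_aa)
print(freq_AA + freq_Aa + freq_aa)  # the three frequencies sum to 1 (up to rounding)
```

Because \(p^2 + 2p(1-p) + (1-p)^2 = (p + (1-p))^2 = 1\), the three genotype frequencies always sum to one.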

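This is called the Hardy-Weinberg equilibrium. It is assessed by checking whether the observed proportions of samples with each genotype match the expected proportions, with p estimated from the observed allele counts. This can be tested using a Chi-square goodness-of-fit test, which has 1 degree of freedom here because there are three genotype classes and one estimated parameter.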
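
The goodness-of-fit test can be sketched in a few lines of Python. This is a minimal illustration and not a substitute for an established genetics package. The function name `hwe_chi_square` and the genotype counts are hypothetical, and the p-value uses the fact that a Chi-square variable with 1 degree of freedom has survival function `erfc(sqrt(x / 2))`.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions.

    Returns (statistic, p_value) with 1 degree of freedom:
    3 genotype classes - 1 - 1 estimated allele frequency.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one genotyped sample is required")
    # Estimate the frequency of allele A from the observed allele counts.
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Cells with zero expected count can only occur when p is 0 or 1,
    # in which case the observed count is also zero, so they are skipped.
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts with far fewer heterozygotes than HW proportions predict.
stat, p_value = hwe_chi_square(300, 200, 500)
print(stat, p_value)
```

With small expected counts the Chi-square approximation becomes unreliable, and an exact test is usually preferred in that situation.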

Genes can fail to be in Hardy-Weinberg (HW) equilibrium for a number of reasons. Firstly, mating is seldom completely random: some mates are more attractive or more fertile and hence contribute more offspring to the population. Genes associated with these traits will not be in HW equilibrium. Another cause of non-random mating is proximity: physically isolated subpopulations are more inbred than the larger population from which they originated. A variant may then be in HW equilibrium within the subpopulation but not with respect to the larger population. Finally, some genotypes may reduce fitness so that no offspring survive. This occurs, for example, in diseases in which one homozygote is highly unfit but the heterozygote has enhanced fitness. SNPs in linkage disequilibrium (LD) with a variant that is not in HW equilibrium are also likely to be out of HW equilibrium.
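
The subpopulation point can be made concrete with a small calculation. The sketch below assumes two equally sized subpopulations, each in HW equilibrium internally but with very different allele frequencies (0.9 and 0.1, values chosen only for illustration). Pooling them produces a deficit of heterozygotes relative to HW expectations, a phenomenon commonly known as the Wahlund effect. The function names are hypothetical.

```python
def hw_freqs(p):
    """Expected (AA, Aa, aa) frequencies under HW equilibrium."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)


def pooled_genotype_freqs(subpops):
    """Genotype frequencies after pooling subpopulations.

    subpops is a list of (relative_size, allele_A_frequency) pairs;
    each subpopulation is assumed to be in HW equilibrium internally.
    """
    total = sum(size for size, _ in subpops)
    pooled = [0.0, 0.0, 0.0]
    for size, p in subpops:
        for i, f in enumerate(hw_freqs(p)):
            pooled[i] += (size / total) * f
    return tuple(pooled)


# Two equally sized, isolated subpopulations (illustrative values).
subpops = [(1.0, 0.9), (1.0, 0.1)]
observed = pooled_genotype_freqs(subpops)

# Allele frequency in the pooled population: AA plus half of Aa.
pooled_p = observed[0] + 0.5 * observed[1]
expected = hw_freqs(pooled_p)

print("pooled allele frequency:", pooled_p)
print("observed heterozygotes: ", observed[1])
print("HW-expected heterozygotes:", expected[1])
```

In this setup the pooled allele frequency is about 0.5, so HW proportions predict roughly 50% heterozygotes, while the pooled population contains only about 18%.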

In a typical workflow for SNP analysis, researchers often remove SNPs that are not in HW equilibrium. This step can be controversial from a statistician's point of view: HW disequilibrium may be due to some type of selection, which would make these SNPs more interesting rather than candidates for removal. On the other hand, lack of HW equilibrium suggests some type of population structure that might need to be taken into account. As an example, suppose that a genetic disease is more prevalent in a northern European population than in other human subpopulations. The alleles that confer blonde hair (and many other phenotypes) are also more prevalent in the northern European population. If we do not screen for HW equilibrium, many of these alleles will appear to be associated with the disease.