Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Saturday, May 30, 2009

Individual Genomes Instead of Race for Personalized Medicine

Only two data points but nonetheless interesting.

Ng PC, Zhao Q, Levy S, Strausberg RL, Venter JC. Individual genomes instead of race for personalized medicine. Clin Pharmacol Ther. 2008 Sep;84(3):306-9. [PubMed]

Abstract

The cost of sequencing and genotyping is aggressively decreasing, enabling pervasive personalized genomic screening for drug reactions. Drug-metabolizing genes have been characterized sufficiently to enable practitioners to go beyond simplistic ethnic characterization and into the precisely targeted world of personal genomics. We examine six drug-metabolizing genes in J. Craig Venter and James Watson, two Caucasian men whose genomes were recently sequenced. Their genetic differences underscore the importance of personalized genomics over a race-based approach to medicine. To attain truly personalized medicine, the scientific community must aim to elucidate the genetic and environmental factors that contribute to drug reactions and not be satisfied with a simple race-based approach.

Wednesday, May 27, 2009

Synthetic Biology

Here are a few new reviews on synthetic biology. I think the potential to use synthetic gene networks to understand epistasis is enormous.

Bhalerao KD. Synthetic gene networks: the next wave in biotechnology? Trends Biotechnol. 2009 Jun;27(6):368-74. [PubMed]

Deplazes A. Piecing together a puzzle. An exposition of synthetic biology. EMBO Rep. 2009 May;10(5):428-32. [PubMed]

Purnick PE, Weiss R. The second wave of synthetic biology: from modules to systems. Nat Rev Mol Cell Biol. 2009 Jun;10(6):410-22. [PubMed]

Sunday, May 17, 2009

Origins of magic: review of genetic and epigenetic effects

An interesting take on genetics.

Sreeram V Ramagopalan, Marian Knight, George C Ebers, Julian C Knight. Origins of magic: review of genetic and epigenetic effects. British Medical Journal 2007;335:1299-301. [PubMed]

OBJECTIVE: To assess the evidence for a genetic basis to magic. DESIGN: Literature review. SETTING: Harry Potter novels of J K Rowling. PARTICIPANTS: Muggles, witches, wizards, and squibs. INTERVENTIONS: Limited. MAIN OUTCOME MEASURES: Family and twin studies, magical ability, and specific magical skills. RESULTS: Magic shows strong evidence of heritability, with familial aggregation and concordance in twins. Evidence suggests magical ability to be a quantitative trait. Specific magical skills, notably being able to speak to snakes, predict the future, and change hair colour, all seem heritable. CONCLUSIONS: A multilocus model with a dominant gene for magic might exist, controlled epistatically by one or more loci, possibly recessive in nature. Magical enhancers regulating gene expression may be involved, combined with mutations at specific genes implicated in speech and hair colour such as FOXP2 and MCR1.

Thursday, May 14, 2009

Genetic Programming Theory and Practice (GPTP)

I am at the 2009 Genetic Programming Theory and Practice (GPTP) workshop organized by the Center for the Study of Complex Systems at the University of Michigan. I gave a talk this morning on our computational evolution system for detecting and modeling epistasis. Tonight I will give a demo of our new 3D visualization system. Follow along on my Twitter feed.

Friday, May 08, 2009

Genetic Architecture of Quantitative Traits

This is a very nice new review from Flint and Mackay. I might add this to the list of 100 below.

Flint J, Mackay TF. Genetic architecture of quantitative traits in mice, flies, and humans. Genome Res. 2009 May;19(5):723-33. [PubMed]

Abstract

We compare and contrast the genetic architecture of quantitative phenotypes in two genetically well-characterized model organisms, the laboratory mouse, Mus musculus, and the fruit fly, Drosophila melanogaster, with that found in our own species from recent successes in genome-wide association studies. We show that the current model of large numbers of loci, each of small effect, is true for all species examined, and that discrepancies can be largely explained by differences in the experimental designs used. We argue that the distribution of effect size of common variants is the same for all phenotypes regardless of species, and we discuss the importance of epistasis, pleiotropy, and gene by environment interactions. Despite substantial advances in mapping quantitative trait loci, the identification of the quantitative trait genes and ultimately the sequence variants has proved more difficult, so that our information on the molecular basis of quantitative variation remains limited. Nevertheless, available data indicate that many variants lie outside genes, presumably in regulatory regions of the genome, where they act by altering gene expression. As yet there are very few instances where homologous quantitative trait loci, or quantitative trait genes, have been identified in multiple species, but the availability of high-resolution mapping data will soon make it possible to test the degree of overlap between species.

Thursday, May 07, 2009

100 Publications Every Graduate Student Should Read

I have been wanting for several years now to make a list of 100 important publications that every one of my graduate students should read before they graduate. I will make this list here and will slowly add to it and edit it over the coming months. I will attempt to organize these by discipline. Please email me or post your suggestions or comments.

Bioinformatics

Benson D, Boguski M, Lipman D, Ostell J. The National Center for Biotechnology Information. Genomics. 1990 Mar;6(3):389-91. [PubMed]


Boguski MS. Bioinformatics. Curr Opin Genet Dev. 1994 Jun;4(3):383-8. [PubMed]

Gentleman R. R Programming for Bioinformatics. 2008, Chapman & Hall. [Amazon]

Moore JH. Bioinformatics. J Cell Physiol. 2007 Nov;213(2):365-9. [PubMed]

Reif DM, Dudek SM, Shaffer CM, Wang J, Moore JH. Exploratory visual analysis of pharmacogenomic results. Pac Symp Biocomput. 2005:296-307. [PubMed]

Biostatistics

Benjamini Y, Hochberg Y. Controlling the false-discovery rate: A practical and powerful approach to multiple testing. J. Royal Stat. Soc. B 1995; 57:289-300. [PDF]

Crawley MJ. The R Book. 2007, Wiley. [Amazon]

Dalgaard P. Introductory Statistics with R. 2008. Springer. [Amazon]

Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. 2009, Springer. [Amazon]

Lipton P. Testing hypotheses: prediction and prejudice. Science 2005 Jan 14;307(5707):219-21. [PubMed]

Sokal RR, Rohlf FJ. Biometry. 1994, W.H. Freeman. [Amazon]

Complex Adaptive Systems

Adami C. What is complexity? Bioessays. 2002 Dec;24(12):1085-94. [PubMed]

Di Paolo EA, Noble J, Bullock S. 2000. Simulation models as opaque thought experiments. In Bedau MA, McCaskill JS, Packard NH, Rasmussen S (eds) Artificial Life VII, pp 497-506. Cambridge MA: The MIT Press. [PDF]

Holland J. Hidden Order. Helix Books. 1996. [Amazon]

Kaplan D, Glass L. Understanding nonlinear dynamics. 2008, Springer. [Amazon]

Computer Science

Jakulin A, Bratko I. Analyzing attribute dependencies. Lecture Notes in Artificial Intelligence 2003; 2838:229-240. [PDF]

Robnik-Sikonja M, Kononenko I. Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning. 2003; 53:23-69. [PDF]

Epidemiology

Ioannidis JP. Why most published research findings are false. PLoS Med. 2005 Aug;2(8):e124. [PubMed]

Khoury MJ, Millikan R, Little J, Gwinn M. The emergence of epidemiology in the genomics age. Int J Epidemiol. 2004 Oct;33(5):936-44. [PubMed]

Rothman KJ. No adjustments are needed for multiple comparisons. Epidemiology. 1990 Jan;1(1):43-6. [PubMed]

Rothman KJ. Modern Epidemiology. 2008, Lippincott Williams & Wilkins. [Amazon]

Rothman KJ, Greenland S. Causation and causal inference in epidemiology. Am J Public Health. 2005;95 Suppl 1:S144-50. [PubMed]

Rothman KJ, Greenland S, Walker AM. Concepts of interaction. Am J Epidemiol. 1980 Oct;112(4):467-70. [PubMed]

Genetics

Association

Cordell HJ, Clayton DG. Genetic association studies. Lancet. 2005 Sep 24-30;366(9491):1121-31. [PubMed]

Risch N, Merikangas K. The future of genetic studies of complex human disease. Science 1996;273:1516–19. [PubMed]

Spielman RS, McGinnis RE, Ewens WJ. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet. 1993 Mar;52(3):506-16. [PubMed]

Canalization

Gibson G. Decanalization and the origin of complex disease. Nat Rev Genet. 2009 Feb;10(2):134-40. [PubMed]

Waddington CH. The canalization of development and genetic assimilation of acquired characters. Nature 1942;150:563–565.

Digital Genetics

Adami C. Digital genetics: unravelling the genetic basis of evolution. Nat Rev Genet. 2006 Feb;7(2):109-18. [PubMed]

Lenski RE, Ofria C, Collier TC, Adami C. 1999. Genome complexity, robustness and genetic interactions in digital organisms. Nature 400:661-4. [PubMed]

Ecological and Community Genetics

Handley LJ, Manica A, Goudet J, Balloux F. Going the distance: human population genetics in a clinal world. Trends Genet. 2007 Sep;23(9):432-9. [PubMed]

Sloan CD, Duell EJ, Shi X, Irwin R, Andrew AS, Williams SM, Moore JH. Ecogeographic genetic epidemiology. Genet Epidemiol. 2009 May;33(4):281-9. [PubMed]

Whitham TG, Bailey JK, Schweitzer JA, Shuster SM, Bangert RK, LeRoy CJ, Lonsdorf EV, Allan GJ, DiFazio SP, Potts BM, Fischer DG, Gehring CA, Lindroth RL, Marks JC, Hart SC, Wimp GM, Wooley SC. A framework for community and ecosystem genetics: from genes to ecosystems. Nat Rev Genet. 2006 Jul;7(7):510-23. [PubMed]

Epistasis

Cheverud JM, Routman EJ. Epistasis and its contribution to genetic variance components. Genetics. 139:1455-61 (1995). [PubMed]

Cordell HJ. 2002. Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. Hum Mol Genet 11:2463-8. [PubMed]

Culverhouse R, Suarez BK, Lin J, Reich T. A perspective on epistasis: limits of models displaying no main effect. Am J Hum Genet. 2002 Feb;70(2):461-71. [PubMed]

Fisher RA. The correlations between relatives on the supposition of Mendelian inheritance. Trans R Soc Edinburgh 52:399-433 (1918).

Frankel WN, Schork NJ. Who's afraid of epistasis? Nat Genet. 1996 Dec;14(4):371-3. [PubMed]

Hollander WF. 1955. Epistasis and hypostasis. J Hered 46:222-225.

Marchini J, Donnelly P, Cardon LR. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet. 2005 Apr;37(4):413-7. [PubMed]

McKinney BA, Reif DM, Ritchie MD, Moore JH. Machine learning for detecting gene-gene interactions: a review. Appl Bioinformatics. 2006;5(2):77-88. [PubMed]

Moore JH. 2003. The ubiquitous nature of epistasis in determining susceptibility to common human diseases. Hum Hered 56:73-82. [PubMed]

Moore JH. 2005. A global view of epistasis. Nat Genet 37:13-14. [PubMed]

Moore JH, Barney N, Tsai CT, Chiang FT, Gui J, White BC. Symbolic modeling of epistasis. Hum Hered. 2007;63(2):120-33. [PubMed]

Moore JH, Gilbert JC, Tsai CT, Chiang FT, Holden T, Barney N, White BC. A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. J Theor Biol. 2006 Jul 21;241(2):252-61. [PubMed]

Moore JH, Williams SM. New strategies for identifying gene-gene interactions in hypertension. Ann Med. 2002;34(2):88-95. [PubMed]

Moore JH, Williams SM. Traversing the conceptual divide between biological and statistical epistasis: systems biology and a more modern synthesis. Bioessays. 2005 Jun;27(6):637-46. [PubMed]

Nelson MR, Kardia SL, Ferrell RE, Sing CF. A combinatorial partitioning method to identify multilocus genotypic partitions that predict quantitative trait variation. Genome Res. 11:458-70 (2001). [PubMed]

Phillips PC. 1998. The language of gene interaction. Genetics 149:1167-71. [PubMed]

Phillips PC. Epistasis--the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet. 2008 Nov;9(11):855-67. [PubMed]

Remold SK, Lenski RE. Pervasive joint influence of epistasis and plasticity on mutational effects in Escherichia coli. Nat Genet. 2004 Apr;36(4):423-6. [PubMed]

Ritchie MD, Hahn LW, Roodi N, Bailey LR, Dupont WD, Parl FF, Moore JH. Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am J Hum Genet. 2001 Jul;69(1):138-47. [PubMed]

Segre D, Deluna A, Church GM, Kishony R. 2005. Modular epistasis in yeast metabolism. Nat Genet 37:77-83. [PubMed]

Wolf J, Brodie III B, Wade M (eds.) Epistasis and the Evolutionary Process. Oxford University Press, New York (2000). [Amazon]

Evolutionary Genetics

Tishkoff SA, Reed FA, Friedlaender FR, Ehret C, Ranciaro A, Froment A, Hirbo JB, Awomoyi AA, Bodo JM, Doumbo O, Ibrahim M, Juma AT, Kotze MJ, Lema G, Moore JH, Mortensen H, Nyambo TB, Omar SA, Powell K, Pretorius GS, Smith MW, Thera MA, Wambebe C, Weber JL, Williams SM. The Genetic Structure and History of Africans and African Americans. Science. 2009, in press. [PubMed]

Tishkoff SA, Williams SM. Genetic analysis of African populations: human evolution and complex disease. Nat Rev Genet. 2002 Aug;3(8):611-21. [PubMed]

Wright S. The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proc 6th Intl Congr Genet 1:356-366 (1932).

Gene-Environment Interaction (Reaction Norms)

Lewontin RC. Annotation: the analysis of variance and the analysis of causes. American Journal of Human Genetics 1974;26:400-411. [PubMed]

Wahlsten D. Insensitivity of the analysis of variance to heredity-environment interaction. Behavioral and Brain Sciences 1990; 13:109–161.

Sing CF, Stengård JH, Kardia SL. Dynamic relationships between the genome and exposures to environments as causes of common human diseases. World Rev Nutr Diet. 2004;93:77-91. [PubMed]

Genetic Architecture

Flint J, Mackay TF. Genetic architecture of quantitative traits in mice, flies, and humans. Genome Res. 2009 May;19(5):723-33. [PubMed]

Rea TJ, Brown CM, Sing CF. Complex adaptive system models and the genetic analysis of plasma HDL-cholesterol concentration. Perspect Biol Med. 2006 Autumn;49(4):490-503. [PubMed]

Sing CF, Haviland MB, Reilly SL. Genetic architecture of common multifactorial diseases. Ciba Found Symp. 1996;197:211-29. [PubMed]

Sing CF, Stengard JH, Kardia SL. Genes, environment, and cardiovascular disease. Arterioscler Thromb Vasc Biol. 2003 Jul 1;23(7):1190-6. [PubMed]

Sing CF, Zerba KE, Reilly SL. Traversing the biological complexity in the hierarchy between genome and CAD endpoints in the population at large. Clinical Genetics 1994; 46:6-14. [PubMed]

Thornton-Wells TA, Moore JH, Haines JL. 2004. Genetics, statistics, and human disease: Analytical retooling for complexity. Trends Genet 20:640-7. [PubMed]

Genetic Heterogeneity

Sing CF, Boerwinkle E, Moll PP. Strategies for elucidating the phenotypic and genetic heterogeneity of a chronic disease with a complex etiology. Progress in Clinical and Biological Research 1985; 194:36-66. [PubMed]

Thornton-Wells TA, Moore JH, Haines JL. Dissecting trait heterogeneity: a comparison of three clustering methods applied to genotypic data. BMC Bioinformatics. 2006 Apr 12;7:204. [PubMed]

Genome-Wide Association Studies (GWAS)

Clark AG, Boerwinkle E, Hixson J, Sing CF. Determinants of the success of whole-genome association testing. Genome Res. 2005 Nov;15(11):1463-7. [PubMed]

Hirschhorn JN, Daly MJ. Genome-wide association studies for common diseases and complex traits. Nat Rev Genet. 2005 Feb;6(2):95-108. [PubMed]

Moore JH. From genotypes to genometypes: putting the genome back in genome-wide association studies. Eur J Hum Genet. 2009, in press. [PubMed]

Moore JH, Ritchie MD. The challenges of whole-genome approaches to common diseases. JAMA. 2004; 291:1642–1643. [PubMed]

Pattin KA, Moore JH. Exploiting the proteome to improve the genome-wide genetic analysis of epistasis in common human diseases. Hum Genet. 2008 Aug;124(1):19-29. [PubMed]

Wang WY, Barratt BJ, Clayton DG, Todd JA. Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet. 2005 Feb;6(2):109-18. [PubMed]

Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007 Jun 7;447(7145):661-78. [PubMed]

Hardy-Weinberg Equilibrium

Nielsen DM, Ehm MG, Weir BS. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am J Hum Genet. 1998 Nov;63(5):1531-40. [PubMed]

Ryckman KK, Jiang L, Li C, Bartlett J, Haines JL, Williams SM. A prevalence-based association test for case-control studies. Genet Epidemiol. 2008 Nov;32(7):600-5. [PubMed]

Heritability

Feldman MW, Lewontin RC. The heritability hang-up. Science 1975;190:1163-1168. [PubMed]

Visscher PM, Hill WG, Wray NR. Heritability in the genomics era--concepts and misconceptions. Nat Rev Genet. 2008 Apr;9(4):255-66. [PubMed]

Linkage

Blangero J, Williams JT, Almasy L. Variance component methods for detecting complex trait loci. Adv Genet. 42:151-81 (2001). [PubMed]

Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 1980 May;32(3):314-31. [PubMed]

Hall JM, Lee MK, Newman B, Morrow JE, Anderson LA, Huey B, King MC. Linkage of early-onset familial breast cancer to chromosome 17q21. Science. 1990 Dec 21;250(4988):1684-9. [PubMed]

Haseman JK, Elston RC. The investigation of linkage between a quantitative trait and a marker locus. Behavior Genetics 1972;2:3-19. [PubMed]

Morton NE. Sequential tests for the detection of linkage. Am J Hum Genet. 1955;7:277-318.

Linkage Disequilibrium and Haplotypes

Hill WG. Estimation of linkage disequilibrium in randomly mating populations. Heredity. 1974 Oct;33(2):229-39. [PubMed]

Lewontin RC. On measures of gametic disequilibrium. Genetics. 1988 Nov;120(3):849-52. [PubMed]

Lewontin RC, Kojima K. The evolutionary dynamics of complex polymorphisms. Evolution 14:450-472 (1960).

Nordborg M, Tavare S. Linkage disequilibrium: what history has to tell us. Trends Genet. 2002 Feb;18(2):83-90. [PubMed]

Templeton AR, Maxwell T, Posada D, Stengård JH, Boerwinkle E, Sing CF. Tree scanning: a method for using haplotype trees in phenotype/genotype association studies. Genetics. 2005 Jan;169(1):441-53. [PubMed]

The International HapMap Consortium. The International HapMap Project. Nature. 2003 Dec 18;426(6968):789-96. [PubMed]

International HapMap Consortium. A haplotype map of the human genome. Nature. 2005 Oct 27;437(7063):1299-320. [PubMed]

Weiss KM, Clark AG. Linkage disequilibrium and the mapping of complex human traits. Trends Genet. 2002 Jan;18(1):19-24. [PubMed]

Mendelian Genetics

Dipple KM, McCabe ER. Phenotypes of patients with "simple" Mendelian disorders are complex traits: thresholds, modifiers, and systems dynamics. Am J Hum Genet. 2000 Jun;66(6):1729-35. [PubMed]

Dipple KM, McCabe ER. Modifier genes convert "simple" Mendelian disorders to complex traits. Mol Genet Metab. 2000 Sep-Oct;71(1-2):43-50. [PubMed]

Pharmacogenetics

Wilke RA, Reif DM, Moore JH. Combinatorial pharmacogenetics. Nat Rev Drug Discov. 2005 Nov;4(11):911-8. [PubMed]

Pleiotropy

Hodgkin J. Seven types of pleiotropy. Int J Dev Biol. 1998;42:501-505. [PubMed]

Tyler AL, Asselbergs FW, Williams SM, Moore JH. Shadows of complexity: what biological networks reveal about epistasis and pleiotropy. Bioessays. 2009 Feb;31(2):220-7. [PubMed]

Quantitative Genetics

Reilly SL, Ferrell RE, Kottke BA, Kamboh MI, Sing CF. The gender-specific apolipoprotein E genotype influence on the distribution of lipids and apolipoproteins in the population of Rochester, MN. I. Pleiotropic effects on means and variances. Am J Hum Genet. 49:1155-66 (1991). [PubMed]

Reilly SL, Ferrell RE, Kottke BA, Sing CF. The gender-specific apolipoprotein E genotype influence on the distribution of plasma lipids and apolipoproteins in the population of Rochester, Minnesota. II. Regression relationships with concomitants. Am J Hum Genet. 1992 Dec;51(6):1311-24. [PubMed]

Reilly SL, Ferrell RE, Sing CF. The gender-specific apolipoprotein E genotype influence on the distribution of plasma lipids and apolipoproteins in the population of Rochester, MN. III. Correlations and covariances. Am J Hum Genet. 55:1001-18 (1994). [PubMed]

Rockman MV, Kruglyak L. Genetics of global gene expression. Nat Rev Genet. 2006 Nov;7(11):862-72. [PubMed]

Textbooks

Elston RC, Johnson W. Basic Biostatistics for Geneticists and Epidemiologists. 2008, Wiley. [Amazon]

Hartl D, Clark AG. Principles of Population Genetics. 2006, Sinauer. [Amazon]

Strachan T, Read A. Human Molecular Genetics. 2003, Taylor & Francis Group. [Amazon]

Ziegler A, Koenig IR. A Statistical Approach to Genetic Epidemiology. 2006, Wiley. [Amazon]

Genomics

Gibson G, Muse S. A Primer of Genome Science. 2009, Sinauer Associates. [Amazon]

Moore JH, Parker JS, Olsen NJ, Aune TM. Symbolic discriminant analysis of microarray data in autoimmune disease. Genet Epidemiol. 2002 Jun;23(1):57-69. [PubMed]

Mathematics

Shannon CE. A mathematical theory of communication. The Bell System Technical Journal. 1948; 27:379-423. [PDF]

Systems Biology

Ideker T, Galitski T, Hood L. 2001. A new approach to decoding life: systems biology. Ann Rev Genomics Hum Genet 2:343-72. [PubMed]

Ideker T, Thorsson V, Ranish JA, Christmas R, Buhler J, Eng JK, Bumgarner R, Goodlett DR, Aebersold R, Hood L. 2001. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292:929–934. [PubMed]

Jansen RC. 2003. Studying complex biological systems using multifactorial perturbation. Nat Rev Genet 4:145-51. [PubMed]

Moore JH, Boczko EM, Summar ML. Connecting the dots between genes, biochemistry, and disease susceptibility: systems biology modeling in human genetics. Mol Genet Metab. 2005 Feb;84(2):104-11. [PubMed]

Oliveri P, Davidson EH. Gene regulatory network controlling embryonic specification in the sea urchin. Curr Opin Genet Dev. 2004 Aug;14(4):351-60. [PubMed]

Friday, May 01, 2009

The Genetic Structure and History of Africans and African Americans

Our paper with Drs. Sarah Tishkoff and Scott Williams on "The Genetic Structure and History of Africans and African Americans" has been published online in Science. You can access the online version here. The full paper will be in print soon. The Dartmouth press release is here.

Sarah A. Tishkoff, Floyd A. Reed, Françoise R. Friedlaender, Christopher Ehret, Alessia Ranciaro, Alain Froment, Jibril B. Hirbo, Agnes A. Awomoyi, Jean-Marie Bodo, Ogobara Doumbo, Muntaser Ibrahim, Abdalla T. Juma, Maritha J. Kotze, Godfrey Lema, Jason H. Moore, Holly Mortensen, Thomas B. Nyambo, Sabah A. Omar, Kweli Powell, Gideon S. Pretorius, Michael W. Smith, Mahamadou A. Thera, Charles Wambebe, James L. Weber, Scott M. Williams. Science, published online April 30, 2009. [Science] [PubMed]

Abstract

Africa is the source of all modern humans, but characterization of genetic variation and of relationships among populations across the continent has been enigmatic. We studied 121 African populations, 4 African American populations, and 60 non-African populations for patterns of variation at 1327 nuclear microsatellite and insertion/deletion markers. We identified 14 ancestral population clusters in Africa that correlate with self-described ethnicity and shared cultural and/or linguistic properties. We observe high levels of mixed ancestry in most populations, reflecting historic migration events across the continent. Our data also provide evidence for shared ancestry among geographically diverse hunter-gatherer populations (Khoesan-speakers and Pygmies). The ancestry of African Americans is predominantly from Niger-Kordofanian (~71%), European (~13%), and other African (~8%) populations, although admixture levels varied considerably among individuals. This study helps tease apart the complex evolutionary history of Africans and African Americans, aiding both anthropological and genetic epidemiologic studies.