Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Thursday, May 07, 2009

100 Publications Every Graduate Student Should Read

I have been wanting for several years now to make a list of 100 important publications that every one of my graduate students should read before they graduate. I will make this list here and will slowly add to it and edit it over the coming months. I will attempt to organize these by discipline. Please email me or post your suggestions or comments.


Benson D, Boguski M, Lipman D, Ostell J. The National Center for Biotechnology Information. Genomics. 1990 Mar;6(3):389-91. [PubMed]

Boguski MS. Bioinformatics. Curr Opin Genet Dev. 1994 Jun;4(3):383-8. [PubMed]

Gentleman R. R Programming for Bioinformatics. 2008, Chapman & Hall. [Amazon]

Moore JH. Bioinformatics. J Cell Physiol. 2007 Nov;213(2):365-9. [PubMed]

Reif DM, Dudek SM, Shaffer CM, Wang J, Moore JH. Exploratory visual analysis of pharmacogenomic results. Pac Symp Biocomput. 2005:296-307. [PubMed]


Benjamini Y, Hochberg Y. Controlling the false-discovery rate: A practical and powerful approach to multiple testing. J. Royal Stat. Soc. B 1995; 57:289-300. [PDF]

Crawley MJ. The R Book. 2007, Wiley. [Amazon]

Dalgaard P. Introductory Statistics with R. 2008. Springer. [Amazon]

Hastie T, Tibshirani R, Friedman J. The elements of statistical learning. 2009, Springer. [Amazon]

Lipton P. Testing hypotheses: prediction and prejudice. Science 2005 Jan 14;307(5707):219-21. [PubMed]

Sokal RR, Rohlf FJ. Biometry. 1994, W.H. Freeman. [Amazon]

Complex Adaptive Systems

Adami C. What is complexity? Bioessays. 2002 Dec;24(12):1085-94. [PubMed]

Di Paolo EA, Noble J, Bullock S. 2000. Simulation models as opaque thought experiments. In Bedau MA, McCaskill JS, Packard NH, Rasmussen S (eds) Artificial Life VII, pp 497-506. Cambridge MA: The MIT Press. [PDF]

Holland J. Hidden Order. Helix Books. 1996. [Amazon]

Kaplan D, Glass L. Understanding nonlinear dynamics. 2008, Springer. [Amazon]

Computer Science

Jakulin A, Bratko I. Analyzing attribute dependencies. Lecture Notes in Artificial Intelligence 2003; 2838:229-240. [PDF]

Robnik-Sikonja M, Kononenko I. Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning. 2003; 53:23-69. [PDF]


Ioannidis JP. Why most published research findings are false. PLoS Med. 2005 Aug;2(8):e124. [PubMed]

Khoury MJ, Millikan R, Little J, Gwinn M. The emergence of epidemiology in the genomics age. Int J Epidemiol. 2004 Oct;33(5):936-44. [PubMed]

Rothman KJ. No adjustments are needed for multiple comparisons. Epidemiology. 1990 Jan;1(1):43-6. [PubMed]

Rothman KJ. Modern Epidemiology. 2008, Lippincott Williams & Wilkins. [Amazon]

Rothman KJ, Greenland S. Causation and causal inference in epidemiology. Am J Public Health. 2005;95 Suppl 1:S144-50. [PubMed]

Rothman KJ, Greenland S, Walker AM. Concepts of interaction. Am J Epidemiol. 1980 Oct;112(4):467-70. [PubMed]



Cordell HJ, Clayton DG. Genetic association studies. Lancet. 2005 Sep 24-30;366(9491):1121-31. [PubMed]

Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science 1996;273:1516–17. [PubMed]

Spielman RS, McGinnis RE, Ewens WJ. Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet. 1993 Mar;52(3):506-16. [PubMed]


Gibson G. Decanalization and the origin of complex disease. Nat Rev Genet. 2009 Feb;10(2):134-40. [PubMed]

Waddington CH. Canalization of development and the inheritance of acquired characters. Nature 1942;150:563–565.

Digital Genetics

Adami C. Digital genetics: unravelling the genetic basis of evolution. Nat Rev Genet. 2006 Feb;7(2):109-18. [PubMed]

Lenski RE, Ofria C, Collier TC, Adami C. 1999. Genome complexity, robustness and genetic interactions in digital organisms. Nature 400:661-4. [PubMed]

Ecological and Community Genetics

Handley LJ, Manica A, Goudet J, Balloux F. Going the distance: human population genetics in a clinal world. Trends Genet. 2007 Sep;23(9):432-9. [PubMed]

Sloan CD, Duell EJ, Shi X, Irwin R, Andrew AS, Williams SM, Moore JH. Ecogeographic genetic epidemiology. Genet Epidemiol. 2009 May;33(4):281-9. [PubMed]

Whitham TG, Bailey JK, Schweitzer JA, Shuster SM, Bangert RK, LeRoy CJ, Lonsdorf EV, Allan GJ, DiFazio SP, Potts BM, Fischer DG, Gehring CA, Lindroth RL, Marks JC, Hart SC, Wimp GM, Wooley SC. A framework for community and ecosystem genetics: from genes to ecosystems. Nat Rev Genet. 2006 Jul;7(7):510-23. [PubMed]


Cheverud JM, Routman EJ. Epistasis and its contribution to genetic variance components. Genetics. 139:1455-61 (1995). [PubMed]

Cordell HJ. 2002. Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. Hum Mol Genet 11:2463-8. [PubMed]

Culverhouse R, Suarez BK, Lin J, Reich T. A perspective on epistasis: limits of models displaying no main effect. Am J Hum Genet. 2002 Feb;70(2):461-71. [PubMed]

Fisher RA. The correlation between relatives on the supposition of Mendelian inheritance. Trans R Soc Edinburgh 52:399-433 (1918).

Frankel WN, Schork NJ. Who's afraid of epistasis? Nat Genet. 1996 Dec;14(4):371-3. [PubMed]

Hollander WF. 1955. Epistasis and hypostasis. J Hered 46:222-225.

Marchini J, Donnelly P, Cardon LR. Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet. 2005 Apr;37(4):413-7. [PubMed]

McKinney BA, Reif DM, Ritchie MD, Moore JH. Machine learning for detecting gene-gene interactions: a review. Appl Bioinformatics. 2006;5(2):77-88. [PubMed]

Moore JH. 2003. The ubiquitous nature of epistasis in determining susceptibility to common human diseases. Hum Hered 56:73-82. [PubMed]

Moore JH. 2005. A global view of epistasis. Nat Genet 37:13-14. [PubMed]

Moore JH, Barney N, Tsai CT, Chiang FT, Gui J, White BC. Symbolic modeling of epistasis. Hum Hered. 2007;63(2):120-33. [PubMed]

Moore JH, Gilbert JC, Tsai CT, Chiang FT, Holden T, Barney N, White BC. A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. J Theor Biol. 2006 Jul 21;241(2):252-61. [PubMed]

Moore JH, Williams SM. New strategies for identifying gene-gene interactions in hypertension. Ann Med. 2002;34(2):88-95. [PubMed]

Moore JH, Williams SM. Traversing the conceptual divide between biological and statistical epistasis: systems biology and a more modern synthesis. Bioessays. 2005 Jun;27(6):637-46. [PubMed]

Nelson MR, Kardia SL, Ferrell RE, Sing CF. A combinatorial partitioning method to identify multilocus genotypic partitions that predict quantitative trait variation. Genome Res. 11:458-70 (2001). [PubMed]

Phillips PC. 1998. The language of gene interaction. Genetics 149:1167-71. [PubMed]

Phillips PC. Epistasis--the essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet. 2008 Nov;9(11):855-67. [PubMed]

Remold SK, Lenski RE. Pervasive joint influence of epistasis and plasticity on mutational effects in Escherichia coli. Nat Genet. 2004 Apr;36(4):423-6. [PubMed]

Ritchie MD, Hahn LW, Roodi N, Bailey LR, Dupont WD, Parl FF, Moore JH. Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am J Hum Genet. 2001 Jul;69(1):138-47. [PubMed]

Segre D, Deluna A, Church GM, Kishony R. 2005. Modular epistasis in yeast metabolism. Nat Genet 37:77-83. [PubMed]

Wolf J, Brodie III B, Wade M (eds.) Epistasis and the Evolutionary Process. Oxford University Press, New York (2000). [Amazon]

Evolutionary Genetics

Tishkoff SA, Reed FA, Friedlaender FR, Ehret C, Ranciaro A, Froment A, Hirbo JB, Awomoyi AA, Bodo JM, Doumbo O, Ibrahim M, Juma AT, Kotze MJ, Lema G, Moore JH, Mortensen H, Nyambo TB, Omar SA, Powell K, Pretorius GS, Smith MW, Thera MA, Wambebe C, Weber JL, Williams SM. The Genetic Structure and History of Africans and African Americans. Science. 2009, in press. [PubMed]

Tishkoff SA, Williams SM. Genetic analysis of African populations: human evolution and complex disease. Nat Rev Genet. 2002 Aug;3(8):611-21. [PubMed]

Wright S. The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proc 6th Intl Congr Genet 1:356-366 (1932).

Gene-Environment Interaction (Reaction Norms)

Lewontin RC. Annotation: the analysis of variance and the analysis of causes. American Journal of Human Genetics 1974;26:400-411. [PubMed]

Wahlsten D. Insensitivity of the analysis of variance to heredity-environment interaction. Behavioral and Brain Sciences 1990; 13:109–161.

Sing CF, Stengård JH, Kardia SL. Dynamic relationships between the genome and exposures to environments as causes of common human diseases. World Rev Nutr Diet. 2004;93:77-91. [PubMed]

Genetic Architecture

Flint J, Mackay TF. Genetic architecture of quantitative traits in mice, flies, and humans. Genome Res. 2009 May;19(5):723-33. [PubMed]

Rea TJ, Brown CM, Sing CF. Complex adaptive system models and the genetic analysis of plasma HDL-cholesterol concentration. Perspect Biol Med. 2006 Autumn;49(4):490-503. [PubMed]

Sing CF, Haviland MB, Reilly SL. Genetic architecture of common multifactorial diseases. Ciba Found Symp. 1996;197:211-29. [PubMed]

Sing CF, Stengard JH, Kardia SL. Genes, environment, and cardiovascular disease. Arterioscler Thromb Vasc Biol. 2003 Jul 1;23(7):1190-6. [PubMed]

Sing CF, Zerba KE, Reilly SL. Traversing the biological complexity in the hierarchy between genome and CAD endpoints in the population at large. Clinical Genetics 1994; 46:6-14. [PubMed]

Thornton-Wells TA, Moore JH, Haines JL. 2004. Genetics, statistics, and human disease: Analytical retooling for complexity. Trends Genet 20:640-7. [PubMed]

Genetic Heterogeneity

Sing CF, Boerwinkle E, Moll PP. Strategies for elucidating the phenotypic and genetic heterogeneity of a chronic disease with a complex etiology. Progress in Clinical and Biological Research 1985; 194:36-66. [PubMed]

Thornton-Wells TA, Moore JH, Haines JL. Dissecting trait heterogeneity: a comparison of three clustering methods applied to genotypic data. BMC Bioinformatics. 2006 Apr 12;7:204. [PubMed]

Genome-Wide Association Studies (GWAS)

Clark AG, Boerwinkle E, Hixson J, Sing CF. Determinants of the success of whole-genome association testing. Genome Res. 2005 Nov;15(11):1463-7. [PubMed]

Hirschhorn JN, Daly MJ. Genome-wide association studies for common diseases and complex traits. Nat Rev Genet. 2005 Feb;6(2):95-108. [PubMed]

Moore JH. From genotypes to genometypes: putting the genome back in genome-wide association studies. Eur J Hum Genet. 2009, in press. [PubMed]

Moore JH, Ritchie MD. The challenges of whole-genome approaches to common diseases. JAMA. 2004; 291:1642–1643. [PubMed]

Pattin KA, Moore JH. Exploiting the proteome to improve the genome-wide genetic analysis of epistasis in common human diseases. Hum Genet. 2008 Aug;124(1):19-29. [PubMed]

Wang WY, Barratt BJ, Clayton DG, Todd JA. Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet. 2005 Feb;6(2):109-18. [PubMed]

Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007 Jun 7;447(7145):661-78. [PubMed]

Hardy-Weinberg Equilibrium

Nielsen DM, Ehm MG, Weir BS. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am J Hum Genet. 1998 Nov;63(5):1531-40. [PubMed]

Ryckman KK, Jiang L, Li C, Bartlett J, Haines JL, Williams SM. A prevalence-based association test for case-control studies. Genet Epidemiol. 2008 Nov;32(7):600-5. [PubMed]


Feldman MW, Lewontin RC. The heritability hang-up. Science 1975;190:1163-1168. [PubMed]

Visscher PM, Hill WG, Wray NR. Heritability in the genomics era--concepts and misconceptions. Nat Rev Genet. 2008 Apr;9(4):255-66. [PubMed]


Blangero J, Williams JT, Almasy L. Variance component methods for detecting complex trait loci. Adv Genet. 42:151-81 (2001). [PubMed]

Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 1980 May;32(3):314-31. [PubMed]

Hall JM, Lee MK, Newman B, Morrow JE, Anderson LA, Huey B, King MC. Linkage of early-onset familial breast cancer to chromosome 17q21. Science. 1990 Dec 21;250(4988):1684-9. [PubMed]

Haseman JK, Elston RC. The investigation of linkage between a quantitative trait and a marker locus. Behavior Genetics 1972;2:3-19. [PubMed]

Morton NE. Sequential tests for the detection of linkage. Am J Hum Genet. 1955;7:277-318.

Linkage Disequilibrium and Haplotypes

Hill WG. Estimation of linkage disequilibrium in randomly mating populations. Heredity. 1974 Oct;33(2):229-39. [PubMed]

Lewontin RC. On measures of gametic disequilibrium. Genetics. 1988 Nov;120(3):849-52. [PubMed]

Lewontin RC, Kojima K. The evolutionary dynamics of complex polymorphisms. Evolution 14:450-472 (1960).

Nordborg M, Tavare S. Linkage disequilibrium: what history has to tell us. Trends Genet. 2002 Feb;18(2):83-90. [PubMed]

Templeton AR, Maxwell T, Posada D, Stengård JH, Boerwinkle E, Sing CF. Tree scanning: a method for using haplotype trees in phenotype/genotype association studies. Genetics. 2005 Jan;169(1):441-53. [PubMed]

The International HapMap Consortium. The International HapMap Project. Nature. 2003 Dec 18;426(6968):789-96. [PubMed]

International HapMap Consortium. A haplotype map of the human genome. Nature. 2005 Oct 27;437(7063):1299-320. [PubMed]

Weiss KM, Clark AG. Linkage disequilibrium and the mapping of complex human traits. Trends Genet. 2002 Jan;18(1):19-24. [PubMed]

Mendelian Genetics

Dipple KM, McCabe ER. Phenotypes of patients with "simple" Mendelian disorders are complex traits: thresholds, modifiers, and systems dynamics. Am J Hum Genet. 2000 Jun;66(6):1729-35. [PubMed]

Dipple KM, McCabe ER. Modifier genes convert "simple" Mendelian disorders to complex traits. Mol Genet Metab. 2000 Sep-Oct;71(1-2):43-50. [PubMed]


Wilke RA, Reif DM, Moore JH. Combinatorial pharmacogenetics. Nat Rev Drug Discov. 2005 Nov;4(11):911-8. [PubMed]


Hodgkin J. Seven types of pleiotropy. Int J Dev Biol. 1998;42:501-505. [PubMed]

Tyler AL, Asselbergs FW, Williams SM, Moore JH. Shadows of complexity: what biological networks reveal about epistasis and pleiotropy. Bioessays. 2009 Feb;31(2):220-7. [PubMed]

Quantitative Genetics

Reilly SL, Ferrell RE, Kottke BA, Kamboh MI, Sing CF. The gender-specific apolipoprotein E genotype influence on the distribution of lipids and apolipoproteins in the population of Rochester, MN. I. Pleiotropic effects on means and variances. Am J Hum Genet. 49:1155-66 (1991). [PubMed]

Reilly SL, Ferrell RE, Kottke BA, Sing CF. The gender-specific apolipoprotein E genotype influence on the distribution of plasma lipids and apolipoproteins in the population of Rochester, Minnesota. II. Regression relationships with concomitants. Am J Hum Genet. 1992 Dec;51(6):1311-24. [PubMed]

Reilly SL, Ferrell RE, Sing CF. The gender-specific apolipoprotein E genotype influence on the distribution of plasma lipids and apolipoproteins in the population of Rochester, MN. III. Correlations and covariances. Am J Hum Genet. 55:1001-18 (1994). [PubMed]

Rockman MV, Kruglyak L. Genetics of global gene expression. Nat Rev Genet. 2006 Nov;7(11):862-72. [PubMed]


Elston RC, Johnson W. Basic Biostatistics for Geneticists and Epidemiologists. 2008, Wiley. [Amazon]

Hartl D, Clark AG. Principles of Population Genetics. 2006, Sinauer. [Amazon]

Strachan T, Read A. Human Molecular Genetics. 2003, Taylor & Francis Group. [Amazon]

Ziegler A, Koenig IR. A Statistical Approach to Genetic Epidemiology. 2006, Wiley. [Amazon]


Gibson G, Muse S. A Primer of Genome Science. 2009, Sinauer Associates. [Amazon]

Moore JH, Parker JS, Olsen NJ, Aune TM. Symbolic discriminant analysis of microarray data in autoimmune disease. Genet Epidemiol. 2002 Jun;23(1):57-69. [PubMed]


Shannon CE. A mathematical theory of communication. The Bell System Technical Journal. 1948; 27:379-423. [PDF]

Systems Biology

Ideker T, Galitski T, Hood L. 2001. A new approach to decoding life: systems biology. Ann Rev Genomics Hum Genet 2:343-72. [PubMed]

Ideker T, Thorsson V, Ranish JA, Christmas R, Buhler J, Eng JK, Bumgarner R, Goodlett DR, Aebersold R, Hood L. 2001. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292:929–934. [PubMed]

Jansen RC. 2003. Studying complex biological systems using multifactorial perturbation. Nat Rev Genet 4:145-51. [PubMed]

Moore JH, Boczko EM, Summar ML. Connecting the dots between genes, biochemistry, and disease susceptibility: systems biology modeling in human genetics. Mol Genet Metab. 2005 Feb;84(2):104-11. [PubMed]

Oliveri P, Davidson EH. Gene regulatory network controlling embryonic specification in the sea urchin. Curr Opin Genet Dev. 2004 Aug;14(4):351-60. [PubMed]


At 11:51 AM, Blogger Stephen Turner said...

John Ioannidis - Why Most Published Research Findings Are False:

And as a group, the 4 NEJM articles on GWAS:

At 3:12 PM, Blogger Stephen Turner said...

One more - NHGRI's working group on what replication means.


At 3:43 PM, Anonymous Konrad said...

Good job. Stephen was quicker and I fully agree with him: "Why Most Published Research Findings Are False" by John P. A. Ioannidis should definitely be on the list.

At 4:03 PM, Blogger Alberto said...

Wow, how much time do you give your students to fulfill this task?

At 5:58 PM, Anonymous Anonymous said...

May I add Kacser and Burns: The Molecular Basis of Dominance.


