Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Saturday, March 24, 2007

Genomic Mining for Complex Disease Traits with “Random Chemistry”

Our paper with Dr. Maggie Eppstein from the University of Vermont has been accepted for publication in Genetic Programming and Evolvable Machines. This paper describes a random chemistry approach for detecting epistasis in genome-wide association studies.

Genomic Mining for Complex Disease Traits with “Random Chemistry”
Eppstein MJ, Payne JL, White BC, Moore JH
Genetic Programming and Evolvable Machines, in press (2007).

Our rapidly growing knowledge regarding genetic variation in the human genome offers great potential for understanding the genetic etiology of disease. This, in turn, could revolutionize detection, treatment, and in some cases prevention of disease. While genes for most of the rare monogenic diseases have already been discovered, most common diseases are complex traits, resulting from multiple gene-gene and gene-environment interactions. Detecting epistatic genetic interactions that predispose for disease is an important, but computationally daunting, task currently facing bioinformaticists. Here, we propose a new evolutionary approach that attempts to hill-climb from large sets of candidate epistatic genetic features to smaller sets, inspired by Kauffman’s “random chemistry” approach to detecting small auto-catalytic sets of molecules from within large sets. Although the algorithm is conceptually straightforward, its success hinges upon the creation of a fitness function able to discriminate large sets that contain subsets of interacting genetic features from those that don’t. Here, we employ an approximate and noisy fitness function based on the ReliefF data mining algorithm. We establish proof-of-concept using synthetic data sets, where individual features have no marginal effects. We show that the resulting algorithm can successfully detect epistatic pairs from up to 1000 candidate single nucleotide polymorphisms in time that is linear in the size of the initial set, although success rate degrades as heritability declines. Research continues into seeking a more accurate fitness approximator for large sets and other algorithmic improvements that will enable us to extend the approach to larger data sets and to lower heritabilities.
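
To make the shrinking idea in the abstract concrete, here is a minimal sketch in Python. It is not the implementation from the paper. The names `random_chemistry` and `passes_test`, and the parameters `shrink` and `max_tries`, are hypothetical. The paper uses an approximate, noisy fitness function based on ReliefF; this sketch swaps in an exact yes/no test supplied by the caller, so it only illustrates how large candidate sets are repeatedly reduced to smaller ones that still appear to contain the interacting features.

```python
import math
import random


def random_chemistry(features, passes_test, target_size=2,
                     shrink=0.5, max_tries=1000, seed=0):
    """Shrink a large feature set to target_size by random subsetting.

    passes_test(subset) is a caller-supplied stand-in (an assumption of
    this sketch) for a fitness function that reports whether a subset
    still seems to contain the interacting features.
    """
    rng = random.Random(seed)
    current = list(features)
    while len(current) > target_size:
        # Next subset size: a fraction of the current set, never below
        # the target and always strictly smaller than the current set.
        new_size = max(target_size, math.ceil(len(current) * shrink))
        new_size = min(new_size, len(current) - 1)
        for _ in range(max_tries):
            candidate = rng.sample(current, new_size)
            if passes_test(candidate):
                current = candidate
                break
        else:
            # No passing subset was found within max_tries draws.
            return None
    return sorted(current)


# Usage: 1000 candidate SNP indices, with a hypothetical interacting
# pair at indices 3 and 17 that the exact test can recognize.
pair = {3, 17}
result = random_chemistry(range(1000), lambda s: pair <= set(s))
```

With an exact test, any set that survives to size two must be the planted pair. With a noisy fitness function like the one described in the abstract, a wrong subset can pass at any step, which is consistent with the reported drop in success rate at lower heritability.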

Monday, March 19, 2007

Fisher, Faith, and Eugenics

As you know, Sir Ronald Fisher had a huge impact on both statistics and genetics. A new paper by James Moore (no relation) of The Open University in the UK, published in the journal Studies in History and Philosophy of Biological and Biomedical Sciences, explores Fisher’s simultaneous commitment to Darwinism, Anglican Christianity and eugenics. It is worth reading for anyone interested in Fisher and in how faith shapes the professional lives of scientists. There are several other interesting papers in the same issue on gene expression and heritability.

R. A. Fisher: a faith fit for eugenics

James Moore, Department of History of Science, Technology and Medicine, The Open University, Milton Keynes, MK7 6AA, UK

Stud Hist Philos Biol Biomed Sci. 2007 Mar;38(1):110-35.

In discussions of ‘religion-and-science’, faith is usually emphasized more than works, scientists’ beliefs more than their deeds. By reversing the priority, a lingering puzzle in the life of Ronald Aylmer Fisher (1890–1962), statistician, eugenicist and founder of the neo-Darwinian synthesis, can be solved. Scholars have struggled to find coherence in Fisher’s simultaneous commitment to Darwinism, Anglican Christianity and eugenics. The problem is addressed by asking what practical mode of faith or faithful mode of practice lent unity to his life? Families, it is argued, with their myriad practical, emotional and intellectual challenges, rendered a mathematically-based eugenic Darwinian Christianity not just possible for Fisher, but vital.

Friday, March 09, 2007

MDR Applied to a Large Prospective Investigation of Gene-Gene Interactions

A nice new paper by Manuguerra et al. in Carcinogenesis applies multifactor dimensionality reduction (MDR) to detect gene-gene and gene-environment interactions in bladder cancer, lung cancer, and myeloid leukemia.

Manuguerra M et al. Multi-factor dimensionality reduction applied to a large prospective investigation on gene-gene and gene-environment interactions. Carcinogenesis. 2007 Feb;28(2):414-22.

It is becoming increasingly evident that single-locus effects cannot explain complex multifactorial human diseases like cancer. We applied the multi-factor dimensionality reduction (MDR) method to a large cohort study on gene-environment and gene-gene interactions. The study (case-control nested in the EPIC cohort) was established to investigate molecular changes and genetic susceptibility in relation to air pollution and environmental tobacco smoke (ETS) in non-smokers. We have analyzed 757 controls and 409 cases with bladder cancer (n=124), lung cancer (n=116) and myeloid leukemia (n=169). Thirty-six gene variants (DNA repair and metabolic genes) and three environmental exposure variables (measures of air pollution and ETS at home and at work) were analyzed. Interactions were assessed by prediction error percentage and cross-validation consistency (CVC) frequency. For lung cancer, the best model was given by a significant gene-environment association between the base excision repair (BER) XRCC1-Arg399Gln polymorphism, the double-strand break repair (DSBR) BRCA2-Asn372His polymorphism and the exposure variable 'distance from heavy traffic road', an indirect and robust indicator of air pollution (mean prediction error of 26%, P<0.001, [...] P=0.02). [...]T (mean prediction error of 22%, P<0.001, [...]T, MnSOD-Ala9Val and CYP1A1-Ile462Val had a minimum prediction error of 31% (P<0.001) and a maximum CVC of 4.40 (P=0.086). The MDR method seems promising, because it provides a limited number of statistically stable interactions; however, the biological interpretation remains to be understood.
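
For readers new to MDR, the core step behind the prediction errors quoted above can be sketched in a few lines of Python. This is an illustrative toy, not the code of the MDR software package or of the study. The function names, the 0/1 genotype coding, and the default threshold of 1.0 (which assumes equal numbers of cases and controls) are all assumptions. Each multilocus genotype combination is labeled high-risk or low-risk by its case-to-control ratio, and the resulting one-dimensional variable is scored by how often it misclassifies.

```python
from collections import Counter
from itertools import combinations


def mdr_high_risk_cells(rows, labels, loci, threshold=1.0):
    """Return the genotype combinations at `loci` labeled high-risk."""
    cases, controls = Counter(), Counter()
    for row, label in zip(rows, labels):
        cell = tuple(row[i] for i in loci)
        (cases if label == 1 else controls)[cell] += 1
    # High-risk when cases / controls exceeds the threshold; written
    # as a product so that cells with zero controls need no division.
    return {cell for cell in set(cases) | set(controls)
            if cases[cell] > threshold * controls[cell]}


def mdr_error(rows, labels, loci, high_risk):
    """Fraction of subjects misclassified by the high/low-risk rule."""
    wrong = sum(
        (tuple(row[i] for i in loci) in high_risk) != (label == 1)
        for row, label in zip(rows, labels)
    )
    return wrong / len(labels)


def best_model(rows, labels, n_loci, order=2):
    """Locus combination of the given order with the lowest error."""
    def err(loci):
        cells = mdr_high_risk_cells(rows, labels, loci)
        return mdr_error(rows, labels, loci, cells)
    return min(combinations(range(n_loci), order), key=err)


# Usage: three binary loci, with case status set by an XOR of locus 0
# and locus 2, so neither locus has an effect on its own.
rows = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ c for a, b, c in rows]
model = best_model(rows, labels, n_loci=3)
```

The sketch scores models on the same data used to build them. A full MDR analysis repeats the search inside cross-validation folds, reports the prediction error on the held-out subjects, and counts how often the same model is chosen across folds, which is the CVC figure given in the abstract.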

Thursday, March 08, 2007

MDR is #40

Our open-source MDR software package ranks #40 out of over 1,000 bioinformatics software packages on SourceForge.net, with over 8,100 downloads. MDR also ranks #108 out of over 1,700 artificial intelligence software packages. A new version will be available for download soon.

Wednesday, March 07, 2007

MDR 1.1 Beta Available Soon!

We are putting the finishing touches on a new version of MDR. We hope to have this ready for you to evaluate by the end of next week.