Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Friday, September 30, 2005

Open-Source MDR 0.5.1

The Dartmouth Computational Genetics Laboratory is pleased to announce the release of version 0.5.1 BETA of our open-source Multifactor Dimensionality Reduction (MDR) software package.

Download information can be found here.

New features in MDR 0.5.1 include:

1) Filter methods

This new feature lets you 'filter' your list of SNPs or other attributes down to a subset that can be analyzed using MDR. We provide ReliefF and chi-square statistics for filtering. We also provide tools for visualizing the fitness landscape in the form of a line plot, histogram, or raw text. This approach is useful when the number of attributes or variables exceeds the number that is practical for an exhaustive MDR analysis of attribute combinations. For a recent review of the ReliefF algorithm see Robnik-Sikonja M, Kononenko I. 2003. Theoretical and empirical analysis of ReliefF and RReliefF. Machine Learning 53:23-69. We are preparing a manuscript reporting results that show ReliefF works much better than chi-square when gene-gene interactions are present. We think this approach will be useful for MDR analysis of genome-wide data.
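To make the filtering idea concrete, here is a minimal Python sketch of a chi-square filter. It is not the MDR source code: the function names, the 0/1 genotype coding, and the toy data are all assumptions made for illustration. It scores each attribute on its own against case/control status and keeps the top k.

```python
from collections import Counter

def chi_square(values, labels):
    """Pearson chi-square statistic for one attribute against class labels."""
    n = len(labels)
    observed = Counter(zip(values, labels))
    row_totals = Counter(values)
    col_totals = Counter(labels)
    stat = 0.0
    for v in row_totals:
        for c in col_totals:
            expected = row_totals[v] * col_totals[c] / n
            stat += (observed.get((v, c), 0) - expected) ** 2 / expected
    return stat

def filter_top_k(data, labels, k):
    """Return the names of the k attributes with the largest chi-square score."""
    scores = {name: chi_square(values, labels) for name, values in data.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data: 8 subjects, genotypes coded 0/1 for brevity.
snp_x = [0, 0, 1, 1, 0, 0, 1, 1]
snp_y = [0, 1, 0, 1, 0, 1, 0, 1]
snp_z = [0, 1, 1, 0, 0, 1, 1, 1]  # tracks the class on its own, with one mismatch
# Case/control status is the XOR of snp_x and snp_y (a pure interaction).
labels = [x ^ y for x, y in zip(snp_x, snp_y)]
data = {"snp_x": snp_x, "snp_y": snp_y, "snp_z": snp_z}

for name, values in data.items():
    print(name, round(chi_square(values, labels), 2))
print(filter_top_k(data, labels, k=1))
```

In this toy example the two SNPs that jointly determine the class each score zero, because neither has a marginal effect, so a chi-square filter with k=1 keeps only snp_z. That is the kind of situation in which a filter that looks at one attribute at a time can discard interacting attributes.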

2) New configuration option

We have included the option to disable the fitness landscape in the configuration tab. Computing and saving the fitness landscape consumes a lot of memory when there are many attribute combinations to evaluate.
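A quick way to see why memory becomes a problem is to count the attribute combinations. The Python sketch below assumes that one landscape entry is kept per combination of each order from 1 up to the maximum; that storage model and the function name are assumptions for illustration, not a description of MDR's internals.

```python
from math import comb

def landscape_size(n_attributes, max_order):
    """Number of attribute combinations of order 1 through max_order."""
    return sum(comb(n_attributes, k) for k in range(1, max_order + 1))

for n in (20, 100, 1000):
    print(n, "attributes:", landscape_size(n, 3), "combinations up to order 3")
```

With 20 attributes there are 1,350 combinations up to order 3. With 1,000 attributes there are 166,667,500.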

3) New statistics

The output for the Best Model tab includes new statistics such as balanced accuracy, chi-square, precision, kappa, and F-measure. Are there others you would like to see? Let us know here!
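For readers unfamiliar with these statistics, the Python sketch below computes them from a 2x2 confusion matrix using the standard textbook definitions. Whether MDR computes each one in exactly this way is an assumption. The function name and the example counts are hypothetical, and the sketch does not guard against empty rows or columns.

```python
def classification_stats(tp, fp, fn, tn):
    """Textbook statistics from a 2x2 confusion matrix (cases = positives)."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / n
    # Balanced accuracy averages the per-class rates, so it is not
    # inflated when cases and controls are unequal in number.
    balanced_accuracy = (sensitivity + specificity) / 2
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: agreement beyond what is expected by chance.
    chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - chance) / (1 - chance)
    # Pearson chi-square for a 2x2 table (no continuity correction).
    chi_square = n * (tp * tn - fp * fn) ** 2 / (
        (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn)
    )
    return {
        "balanced_accuracy": balanced_accuracy,
        "chi_square": chi_square,
        "precision": precision,
        "kappa": kappa,
        "f_measure": f_measure,
    }

# Hypothetical counts: 60 cases (40 classified correctly), 40 controls (30 correct).
stats = classification_stats(tp=40, fp=10, fn=20, tn=30)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```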

4) Minor interface changes

We have also implemented some minor interface changes that we think will make MDR easier to use. For example, the Configuration options are now in a tab instead of a new window. We have removed the Raw Results tab due to its redundancy.

5) Source code redesigned

The source code for the new version of MDR has been redesigned for easier use and reuse.

6) Documentation

The documentation we distribute with the MDR software has been updated. The docs describe each of the features and their use.

Plans for the Future:

The next major release of MDR will include several wrapper algorithms, such as simulated annealing and genetic algorithms, for stochastic searching when an exhaustive search is not possible, as is the case in genome-wide association studies. We plan to have these new features ready sometime this fall.
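As a rough illustration of what a stochastic wrapper search looks like, here is a generic simulated annealing sketch in Python. It is not the planned MDR implementation. The neighbor move, the cooling schedule, all parameter values, and the toy scoring function (which stands in for a real model-quality measure) are assumptions chosen for illustration.

```python
import math
import random

def simulated_annealing(n_attributes, order, score, steps=2000,
                        start_temp=1.0, cooling=0.995, seed=0):
    """Search for a high-scoring combination of `order` attributes.

    Assumes order < n_attributes and that higher scores are better.
    """
    rng = random.Random(seed)
    current = rng.sample(range(n_attributes), order)
    current_score = score(current)
    best, best_score = list(current), current_score
    temp = start_temp
    for _ in range(steps):
        # Neighbor move: swap one attribute in the combination
        # for one that is currently outside it.
        candidate = list(current)
        outside = [a for a in range(n_attributes) if a not in current]
        candidate[rng.randrange(order)] = rng.choice(outside)
        candidate_score = score(candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temp *= cooling
    return sorted(best), best_score

# Hypothetical stand-in for a model-quality score: the fraction of a
# hidden "causal" pair that the candidate combination contains.
CAUSAL = {3, 17}

def toy_score(combination):
    return len(CAUSAL & set(combination)) / len(CAUSAL)

best, best_score = simulated_annealing(n_attributes=30, order=2, score=toy_score)
print(best, best_score)
```

The search evaluates only 2,000 candidate combinations here regardless of how many attributes there are, whereas an exhaustive search must evaluate every combination. The trade-off is that a stochastic search is not guaranteed to find the best model.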

Remember! The MDR software is still in the beta testing phase. Your feedback is very important to us. Questions, criticisms, and suggestions can be directed to me (jason DOT h DOT moore {at} Dartmouth DOT edu). Feel free to post your feedback at the MDR project on Sourceforge.net.

Meetings and Presentations for Oct.-Nov.

I will be presenting our work on epistasis and human disease at several upcoming conferences. Here are the details:

"A flexible framework for data mining and knowledge discovery in psychiatric genetics"
Platform Presentation
Session on "Novel Analytical Approaches to Gene Discovery"
Saturday, Oct. 15th @ 1:15
XIII World Congress on Psychiatric Genetics, Boston, MA

"A flexible data mining framework for detecting and interpreting gene-gene interactions"
Platform Presentation
Session on "Computational Analysis"
Sunday, Oct. 23rd @ 9:15
2005 annual meeting of the International Genetic Epidemiology Society, Park City, Utah

"Open-source multifactor dimensionality reduction (MDR) software for detecting and interpreting gene-gene interactions"
Poster Presentation
Oct. 23-24
2005 annual meeting of the International Genetic Epidemiology Society, Park City, Utah

"A flexible framework for data mining and knowledge discovery in human genetics"
Poster Presentation (#1247)
Session on "Genomics"
Wednesday, Oct. 26th @ 4:30
2005 annual meeting of the American Society of Human Genetics, Salt Lake City, Utah

"Open-source multifactor dimensionality reduction (MDR) software for detecting, characterizing, and interpreting gene-gene interactions"
Poster Presentation (#2304)
Session on "Statistical Genetics and Genetic Epidemiology"
Thursday, Oct. 27th @ 4:30
2005 annual meeting of the American Society of Human Genetics, Salt Lake City, Utah

"Traversing the conceptual divide between biological and statistical epistasis: Systems biology and a more modern synthesis"
Oral Presentation
Friday, Nov. 4th
Department of Biological Sciences, University of Idaho, Moscow, Idaho

"A global view of epistasis"
Keynote Speaker
Thursday, Nov. 10th
2005 Meeting of the Portuguese Society of Human Genetics, Portugal

Wednesday, September 21, 2005

Epistasis for Fitness-Related Quantitative Traits in Arabidopsis thaliana

Russell Malmberg at the University of Georgia has published a new paper in Genetics on epistasis in Arabidopsis.

Malmberg RL, Held S, Waits A, Mauricio R. Epistasis for Fitness-Related Quantitative Traits in Arabidopsis thaliana Grown in the Field and in the Greenhouse. Genetics. 2005 Sep 12 [PubMed]


The extent to which epistasis contributes to adaptation, population differentiation, and speciation is a long-standing and important problem in evolutionary genetics. Using recombinant inbred lines of Arabidopsis thaliana grown under natural field conditions, we have examined the genetic architecture of fitness-correlated traits with respect to epistasis; we identified both single locus additive and two locus epistatic QTLs for natural variation in fruit number, germination, and seed length and width. For fruit number, we found 7 significant epistatic interactions, but only 2 additive QTLs. For seed germination, length, and width, there were from 2 to 4 additive QTLs and 5-8 epistatic interactions. The epistatic interactions were both positive and negative. In each case, the magnitude of the epistatic effects were roughly double those of the effects of the additive QTLS, varying from -41% to +29% for fruit number, and from -5% to +4% for seed germination, length, and width. A number of the QTLs we describe participate in more than one epistatic interaction, and some loci identified as additive also may participate in an epistatic interaction; the genetic architecture for fitness traits may be a network of additive and epistatic effects. We compared the map positions of the additive and epistatic QTLs for germination, seed width, and seed length from plants grown in both the field and greenhouse. While the total number of significant additive and epistatic QTLs was similar between the two growth conditions, the map locations were largely different. We found a small number of significant epistatic QTL x environment effects when we tested directly for them. Our results support the idea that epistatic interactions are an important part of natural genetic variation, and reinforce the need for caution in comparing results from greenhouse-grown and field-grown plants.

Saturday, September 17, 2005

The synthetic genetic interaction spectrum of essential genes

A new paper by Davierwala et al. in Nature Genetics describes an interesting study of gene networks in yeast.

Davierwala AP et al. The synthetic genetic interaction spectrum of essential genes. Nat Genet. 2005 Sep 11 [PubMed]


The nature of synthetic genetic interactions involving essential genes (those required for viability) has not been previously examined in a broad and unbiased manner. We crossed yeast strains carrying promoter-replacement alleles for more than half of all essential yeast genes to a panel of 30 different mutants with defects in diverse cellular processes. The resulting genetic network is biased toward interactions between functionally related genes, enabling identification of a previously uncharacterized essential gene (PGA1) required for specific functions of the endoplasmic reticulum. But there are also many interactions between genes with dissimilar functions, suggesting that individual essential genes are required for buffering many cellular processes. The most notable feature of the essential synthetic genetic network is that it has an interaction density five times that of nonessential synthetic genetic networks, indicating that most yeast genetic interactions involve at least one essential gene.

Genome-wide interaction networks

A new paper by Calvano et al. in Nature studies genome-wide interaction networks in human subjects receiving an inflammatory stimulus.

Calvano et al. A network-based analysis of systemic inflammation in humans. Nature. 2005 Aug 31 [PubMed]


Oligonucleotide and complementary DNA microarrays are being used to subclassify histologically similar tumours, monitor disease progress, and individualize treatment regimens. However, extracting new biological insight from high-throughput genomic studies of human diseases is a challenge, limited by difficulties in recognizing and evaluating relevant biological processes from huge quantities of experimental data. Here we present a structured network knowledge-base approach to analyse genome-wide transcriptional responses in the context of known functional interrelationships among proteins, small molecules and phenotypes. This approach was used to analyse changes in blood leukocyte gene expression patterns in human subjects receiving an inflammatory stimulus (bacterial endotoxin). We explore the known genome-wide interaction network to identify significant functional modules perturbed in response to this stimulus. Our analysis reveals that the human blood leukocyte response to acute systemic inflammation includes the transient dysregulation of leukocyte bioenergetics and modulation of translational machinery. These findings provide insight into the regulation of global leukocyte activities as they relate to innate immune system tolerance and increased susceptibility to infection in humans.

Thursday, September 15, 2005

The quantitative genetics of transcription

A new paper by Greg Gibson and Bruce Weir in Trends in Genetics reviews the detection and characterization of expression quantitative trait loci (eQTLs). Previous genetic studies of transcription have demonstrated widespread epistasis. See, for example, my blog entry from April 24th.

Gibson G, Weir B. The quantitative genetics of transcription. Trends Genet. 2005 Sep 7 [PubMed]

Quantitative geneticists have become interested in the heritability of transcription and detection of expression quantitative trait loci (eQTLs). Linkage mapping methods have identified major-effect eQTLs for some transcripts and have shown that regulatory polymorphisms in cis and in trans affect expression. It is also clear that these mapping strategies have little power to detect polygenic factors, and some new statistical approaches are emerging that paint a more complex picture of transcriptional heritability. Several studies imply pervasive non-additivity of transcription, transgressive segregation and epistasis, and future studies will soon document the extent of genotype-environment interaction and population structure at the transcriptional level. The implications of these findings for genotype-phenotype mapping and modeling the evolution of transcription are discussed.

Saturday, September 10, 2005

Simultaneous mapping of epistatic QTL in DU6i x DBA/2 mice

A new paper by Carlborg et al. in Mammalian Genome describes epistatic effects on growth and body composition traits in mouse crosses. Note the six types of genotype-phenotype patterns identified among the interacting QTL pairs.

Carlborg O, Brockmann GA, Haley CS. Simultaneous mapping of epistatic QTL in DU6i x DBA/2 mice. Mamm Genome. 2005 Jul;16(7):481-94. [PubMed]


We have mapped epistatic quantitative trait loci (QTL) in an F(2) cross between DU6i x DBA/2 mice. By including these epistatic QTL and their interaction parameters in the genetic model, we were able to increase the genetic variance explained substantially (8.8%-128.3%) for several growth and body composition traits. We used an analysis method based on a simultaneous search for epistatic QTL pairs without assuming that the QTL had any effect individually. We were able to detect several QTL that could not be detected in a search for marginal QTL effects because the epistasis cancelled out the individual effects of the QTL. In total, 23 genomic regions were found to contain QTL affecting one or several of the traits and eight of these QTL did not have significant individual effects. We identified 44 QTL pairs with significant effects on the traits, and, for 28 of the pairs, an epistatic QTL model fit the data significantly better than a model without interactions. The epistatic pairs were classified by the significance of the epistatic parameters in the genetic model, and visual inspection of the two-locus genotype means identified six types of related genotype-phenotype patterns among the pairs. Five of these patterns resembled previously published patterns of QTL interactions.

A recent review paper on epistasis by Carlborg and Haley:

Carlborg O, Haley CS. Epistasis: too often neglected in complex trait studies? Nat Rev Genet. 2004 Aug;5(8):618-25. [PubMed]

Thursday, September 08, 2005

Evolutionary Computation and Machine Learning in Bioinformatics (EVOBIO 2006)

I will be chairing (along with Dr. Carlos Cotta) the 4th annual European Workshop on Evolutionary Computation and Machine Learning in Bioinformatics (EVOBIO 2006) in Budapest, Hungary on April 10-12 of next year. Click here for more information. Paper submissions for the conference are due Nov. 4th. All submitted papers will be peer-reviewed and those accepted will be published in the Lecture Notes in Computer Science series. The workshop will be held in conjunction with the 2006 European Genetic Programming (EuroGP) conference. I hope to see you in Budapest!

Vermont Job Opportunity

Department of Biology, University of Vermont

Applications are invited for a RESEARCH FACULTY member in the Department of Biology in the area of molecular genetics and bioinformatics. This faculty member will be supported by the Vermont Genetics Network, an NIH-sponsored program, and will be expected to work with baccalaureate college faculty and students in colleges around Vermont to bring technology at the University of Vermont (UVM), such as microarrays and proteomics, into undergraduate classrooms. All applicants are expected to hold a Ph.D. degree, have experience in the general area of bioinformatics and molecular biology, and have an interest in teaching or working with undergraduates. Candidates must apply online at http://www.uvmjobs.com and must attach to that application a curriculum vitae, a statement of interest in working with undergraduates in molecular biology settings, and the names and contact information of three references. We will be accepting applications until Friday, September 16, 2005.