Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Sunday, July 31, 2005

New Journal: Cancer Informatics

Cancer Informatics is a new journal that will provide a forum for publishing research related to computational methods for detecting, characterizing, and interpreting epistasis or gene-gene interactions. I am currently serving on the editorial board.

Aim and Scope:

Cancer Informatics is a peer-reviewed, open-access research journal where those engaged in cancer research can turn for rapid communication of the latest advances in the application of bioinformatics and computational biology toward the discovery of new knowledge in oncology and cancer biology, and toward the clinical translation of that knowledge to increase the efficacy of practicing oncologists, radiologists and pathologists.

Papers submitted for review in Cancer Informatics should focus on the following areas:

1) advances in high throughput genomic, proteomic and genetic data analysis, including new insights from the re-analysis or integrated analysis of published data sets

2) advances in cancer detection, imaging and visualization through improved algorithms

3) cancer information databases

4) advances in telemedicine and medical imaging algorithms

5) steps toward personalized and molecular medicine, biomarker validation studies, integrative and translational cancer research

6) improvements in the understanding of computational cancer biology through the study of protein structures and protein-protein interactions, cancer pathway analysis and modeling

7) improved understanding of cancer as a process through systems biology

8) cancer modeling

9) cancer surveillance

10) improvements in in silico drug discovery algorithms

11) improved methods for studying survivorship patterns in clinical trials related to novel biomarkers

12) genomic and genetic mapping algorithms to improve the discovery of components of the genome important to the occurrence, development of cancer and responses of cancer to treatment

The speed of reporting advances in informatics solutions for cancer research, the open access model, and the highest editorial standards with which Cancer Informatics is produced will have the combined effect of maximizing the synergistic impact of advances in bioinformatics and computational biology on cancer research.

Saturday, July 30, 2005

Multiple Locus Linkage Analysis of Genomewide Expression in Yeast

A new paper by Storey et al. in PLoS Biology documents epistasis in yeast.

Storey JD, Akey JM, Kruglyak L. Multiple Locus Linkage Analysis of Genomewide Expression in Yeast. PLoS Biol. 2005 Jul 26;3(8):e267 [PubMed] [PDF]


With the ability to measure thousands of related phenotypes from a single biological sample, it is now feasible to genetically dissect systems-level biological phenomena. The genetics of transcriptional regulation and protein abundance are likely to be complex, meaning that genetic variation at multiple loci will influence these phenotypes. Several recent studies have investigated the role of genetic variation in transcription by applying traditional linkage analysis methods to genomewide expression data, where each gene expression level was treated as a quantitative trait and analyzed separately from one another. Here, we develop a new, computationally efficient method for simultaneously mapping multiple gene expression quantitative trait loci that directly uses all of the available data. Information shared across gene expression traits is captured in a way that makes minimal assumptions about the statistical properties of the data. The method produces easy-to-interpret measures of statistical significance for both individual loci and the overall joint significance of multiple loci selected for a given expression trait. We apply the new method to a cross between two strains of the budding yeast Saccharomyces cerevisiae, and estimate that at least 37% of all gene expression traits show two simultaneous linkages, where we have allowed for epistatic interactions. Pairs of jointly linking quantitative trait loci are identified with high confidence for 170 gene expression traits, where it is expected that both loci are true positives for at least 153 traits. In addition, we are able to show that epistatic interactions contribute to gene expression variation for at least 14% of all traits. We compare the proposed approach to an exhaustive two-dimensional scan over all pairs of loci. Surprisingly, we demonstrate that an exhaustive two-dimensional scan is less powerful than the sequential search used here. 
In addition, we show that a two-dimensional scan does not truly allow one to test for simultaneous linkage, and the statistical significance measured from this existing method cannot be interpreted among many traits.
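
To make "epistatic interaction" concrete for a quantitative trait such as an expression level, here is a minimal sketch of a two-locus interaction contrast. It is not the sequential search method of Storey et al.; the function name `interaction_contrast`, the 0/1 genotype coding (one code per parental strain), and the toy expression values are all assumptions made for illustration. When two loci act additively, the contrast of the four genotype-cell means is zero; a nonzero value indicates departure from additivity.

```python
from statistics import mean

def interaction_contrast(records):
    """Return m11 - m10 - m01 + m00 from (locus_a, locus_b, trait) records.

    Genotypes are coded 0 or 1 at each locus (hypothetical coding).
    A value of zero means the two loci combine additively.
    """
    cells = {}
    for a, b, y in records:
        cells.setdefault((a, b), []).append(y)
    m = {cell: mean(values) for cell, values in cells.items()}
    return m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + m[(0, 0)]

# Toy data: locus A adds 1 and locus B adds 2, with no interaction.
additive = [(0, 0, 1.0), (0, 0, 1.0), (1, 0, 2.0), (0, 1, 3.0), (1, 1, 4.0)]

# Toy data: the trait changes only when both loci carry allele 1.
epistatic = [(0, 0, 1.0), (1, 0, 1.0), (0, 1, 1.0), (1, 1, 3.0)]

additive_contrast = interaction_contrast(additive)
epistatic_contrast = interaction_contrast(epistatic)
```

In practice the contrast would be tested against its sampling variability rather than compared with zero directly.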

Saturday, July 23, 2005

New Data Mining Book

The second edition of "Data Mining: Practical Machine Learning Tools and Techniques" by Witten and Frank is now available. More information can be found here. The Amazon.com link is here.

This book provides a very nice introduction to many different data mining methods. In addition, it is based on the Weka data mining software package that is freely available from here along with Java source files. Weka is very comprehensive and provides a wide variety of different machine learning algorithms along with data manipulation tools, all in an easy-to-use software package with a nice GUI. Many of these tools are useful for detecting and characterizing gene-gene and gene-environment interactions.

We have included the kernel of our Multifactor Dimensionality Reduction (MDR) algorithm in a special version of Weka for Computational Genetics (Weka-CG). Weka-CG can be found here.
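
For readers new to MDR, the core idea can be sketched in a few lines. This is a minimal illustration only, not the code in Weka-CG or the MDR package: the function names `mdr_classify` and `training_accuracy`, the toy data, and the threshold of 1.0 are assumptions made for this example. Each multilocus genotype combination is labeled high risk when its case:control ratio exceeds the threshold, which reduces several loci to a single two-level attribute.

```python
from collections import Counter

def mdr_classify(genotypes, status, threshold=1.0):
    """Label each multilocus genotype cell 'high' or 'low' risk.

    genotypes: list of tuples, one tuple of genotype codes per subject
    status:    list of 1 (case) or 0 (control), same length as genotypes
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    labels = {}
    for cell in set(cases) | set(controls):
        n_case, n_ctrl = cases[cell], controls[cell]
        # A cell with cases but no controls gets an infinite ratio.
        ratio = float("inf") if n_ctrl == 0 else n_case / n_ctrl
        labels[cell] = "high" if ratio > threshold else "low"
    return labels

def training_accuracy(genotypes, status, labels):
    """Fraction of subjects whose cell label matches their status."""
    correct = sum((labels[g] == "high") == (s == 1)
                  for g, s in zip(genotypes, status))
    return correct / len(status)

# Toy XOR-style data: risk depends on the combination of the two loci.
genotypes = [(0, 0), (0, 0), (0, 1), (0, 1), (1, 0), (1, 0), (1, 1), (1, 1)]
status = [0, 0, 1, 1, 1, 1, 0, 0]

labels = mdr_classify(genotypes, status)
accuracy = training_accuracy(genotypes, status, labels)
```

A real analysis would evaluate every candidate combination of loci with cross-validation; the sketch covers only the pooling step for one fixed pair.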

Gene Networks

Understanding how genes interact with one another through vast, interconnected networks will play a very important role in characterizing the genetic architecture of disease susceptibility. A recent paper by Basso et al. in Nature Genetics presents a new method for inferring gene networks from gene expression profiles.

Basso K, Margolin AA, Stolovitzky G, Klein U, Dalla-Favera R, Califano A. Reverse engineering of regulatory networks in human B cells. Nat Genet. 2005 Apr;37(4):382-90. [PubMed]


Cellular phenotypes are determined by the differential activity of networks linking coregulated genes. Available methods for the reverse engineering of such networks from genome-wide expression profiles have been successful only in the analysis of lower eukaryotes with simple genomes. Using a new method called ARACNe (algorithm for the reconstruction of accurate cellular networks), we report the reconstruction of regulatory networks from expression profiles of human B cells. The results are suggestive of a hierarchical, scale-free network, where a few highly interconnected genes (hubs) account for most of the interactions. Validation of the network against available data led to the identification of MYC as a major hub, which controls a network comprising known target genes as well as new ones, which were biochemically validated. The newly identified MYC targets include some major hubs. This approach can be generally useful for the analysis of normal and pathologic networks in mammalian cells.

A nice introduction to scale-free networks is given by Barabasi in Scientific American.

Barabasi AL, Bonabeau E. Scale-free networks. Sci Am. 2003 May;288(5):60-9. [PubMed]
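
The notion of a hub can be made concrete by counting each gene's connections in an edge list. The sketch below uses a made-up toy network: apart from MYC, the gene names are placeholders, the edges are not taken from Basso et al., and `degree_counts` and `hubs` are hypothetical helper names.

```python
from collections import Counter

def degree_counts(edges):
    """Count how many edges touch each node of an undirected network."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def hubs(edges, top=1):
    """Return the `top` most highly connected nodes."""
    return [gene for gene, _ in degree_counts(edges).most_common(top)]

# Toy network: one gene is connected to most of the others.
edges = [
    ("MYC", "G1"), ("MYC", "G2"), ("MYC", "G3"), ("MYC", "G4"),
    ("G1", "G2"), ("G5", "G6"),
]

degree = degree_counts(edges)
top_hub = hubs(edges)
# Fraction of all edges that involve the most connected gene.
hub_share = degree["MYC"] / len(edges)
```

In a scale-free network the degree distribution has a long tail, so a handful of nodes such as the toy MYC here account for a large share of the edges.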

A very interesting endomesodermal gene network can be found at the Davidson Lab web page.

Google "gene networks"

Google Scholar "gene networks"

PubMed "gene networks"

Citeseer "gene networks"

See related posts on July 16, June 12, and February 16

Thursday, July 21, 2005

Open-source MDR 0.4.5

Version 0.4.5 of our Multifactor Dimensionality Reduction (MDR) software has been posted. This new version includes several minor bug fixes.

Download the latest version of the MDR software from here.

Learning Petri net models of non-linear gene interactions

A recent paper by Mayo in BioSystems studies the use of Petri nets for building models of biochemical systems that are consistent with epistatic models of disease susceptibility.

Mayo M. Learning Petri net models of non-linear gene interactions. Biosystems. 2005 Jul 14; [Epub ahead of print] [PubMed]

This work builds on our previous work in this area. See, for example:

Moore JH, Boczko EM, Summar ML. Connecting the dots between genes, biochemistry, and disease susceptibility: systems biology modeling in human genetics. Mol Genet Metab. 2005 Feb;84(2):104-11. [PubMed]

Moore JH, Hahn LW. Petri net modeling of high-order genetic systems using grammatical evolution. Biosystems. 2003 Nov;72(1-2):177-86. [PubMed]

For more information on Petri nets visit the Petri Nets World website.
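
For readers unfamiliar with Petri nets, a net consists of places that hold tokens and transitions that consume and produce tokens when they fire. The sketch below is a generic discrete Petri net, not a model from any of the papers cited above; the place names, the `convert` transition, and the helper functions `enabled` and `fire` are invented for illustration.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= n
               for place, n in transition["in"].items())

def fire(marking, transition):
    """Return the new marking after firing an enabled transition."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, n in transition["in"].items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition["out"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking

# Hypothetical reaction: an enzyme turns one substrate token into one
# product token, and the enzyme token is returned to its place.
convert = {
    "in": {"substrate": 1, "enzyme": 1},
    "out": {"product": 1, "enzyme": 1},
}
marking = {"substrate": 2, "enzyme": 1, "product": 0}

steps = 0
while enabled(marking, convert):
    marking = fire(marking, convert)
    steps += 1
```

Removing the enzyme token from the initial marking leaves the transition permanently disabled, which is one simple way such a net can represent a genotype that blocks a biochemical step.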

Saturday, July 16, 2005

Interactome modeling

A new review in FEBS Letters by Marc Vidal explores networks of interacting molecules and their impact on human diseases. To what extent do the statistical interactions we identify in human populations reflect an underlying interactome? Dr. Scott Williams and I explore this question in our recent paper in BioEssays (2005 Jun;27(6):637-46; see my May 18th posting).

Vidal M. Interactome modeling. FEBS Lett. 2005 Mar 21;579(8):1834-8. [PubMed]


A long-term goal of the field of interactome modeling is to understand how global and local properties of complex macromolecular networks impact on observable biological properties, and how changes in such properties can lead to human diseases. The information available at this stage of development of the field provides strong evidence for the existence of such interesting global and local properties, but also demonstrates that many more datasets will be needed to provide accurate models with increasingly predictive capacity. This review focuses on an early attempt at mapping a multicellular interactome network and on the lessons learned from that attempt.

Epistatic Pleiotropy and the Genetic Architecture of Skull Trait Complexes in Mice.

A new paper by Wolf et al. in Genetics presents evidence that genetic integration in the skull is achieved by a complex combination of pleiotropic effects.

Wolf J, Leamy LJ, Routman E, Cheverud JM. Epistatic Pleiotropy and the Genetic Architecture of Covariation Within Early- and Late-Developing Skull Trait Complexes in Mice. Genetics. 2005 Jul 14; [Epub ahead of print] [PubMed]


The role of epistasis as a source of trait variation is well established, but its role as a source of covariation among traits (i.e., as a source of "epistatic pleiotropy") is rarely considered. In this study we examine the relative importance of epistatic pleiotropy in producing covariation within early- and late-developing skull trait complexes in a population of mice derived from an intercross of the Large and Small inbred strains. Significant epistasis was found for several pair-wise combinations of the 21 quantitative trait loci (QTL) affecting early-developing traits, and among the 20 QTL affecting late-developing traits. The majority of the epistatic effects were restricted to single traits but epistatic pleiotropy still contributed significantly to covariances. Because of their proportionally larger effects on variances than on covariances, epistatic effects tended to reduce within-group correlations of the early-developing (but not late-developing) traits and reduce their overall degree of integration. The expected contributions of single-locus and two-locus epistatic pleiotropic QTL effects to the genetic covariance between traits were analyzed using a two-locus population genetic model. The model demonstrates that, in order for single-locus or epistatic pleiotropy to contribute to trait covariances in the study population, both traits must show the same pattern of genetic effects. In general, covariance patterns produced by single-locus and epistatic pleiotropy predicted by the model agreed well with actual values calculated from the QTL analysis. Nearly all single-locus and epistatic pleiotropic effects contributed positive components to covariances between traits, suggesting that genetic integration in the skull is achieved by a complex combination of pleiotropic effects.

Thursday, July 14, 2005

The Inherent Complexities of Gene–Environment Interactions

A recent paper by Grigorenko reviews the complexities of gene-environment interactions.

Grigorenko EL. The inherent complexities of gene-environment interactions. J Gerontol B Psychol Sci Soc Sci. 2005 Mar;60 Spec No 1:53-64. [PubMed]


The article outlines the complexities of gene-environment interactions in the determination of human disease, especially as they relate to aging, and stresses the importance of continuing such studies, in spite of their inherent difficulties. First, a capsule review of the literature pertaining to studies of gene-environment interactions is presented, and designs and methodologies used to detect these interactions are briefly discussed. Second, research questions and problems that can be addressed as outcomes of gene-environment interaction studies are exemplified. Third, a number of illustrative examples of gene-environment interactions are presented. Fourth, various types of gene-environment interactions are briefly discussed. Fifth, concluding remarks are offered, and possibilities of studying gene-environment interaction within social and biological research on aging are outlined.

Saturday, July 09, 2005

Leo Breiman died Tuesday, July 5th

Leo Breiman, professor emeritus of statistics at UC Berkeley, died on Tuesday, July 5th. Dr. Breiman is best known for his work on classification and regression trees (CART), which are applicable to genetic analysis. The UC Berkeley press release can be read here.

The following paper illustrates the use of classification trees for identifying gene-gene interactions:

Bureau et al. Identifying SNPs predictive of phenotype using random forests. Genet Epidemiol. 2005 Feb;28(2):171-82. [PubMed]
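
The toy calculation below shows how a tree can capture an interaction through successive splits. In a made-up XOR-style data set, neither SNP reduces Gini impurity on its own, but once the data are split on SNP A, a split on SNP B separates cases from controls perfectly within a branch. The data and the function names `gini` and `split_gain` are invented for illustration; this is not the analysis from Bureau et al.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def split_gain(rows, labels, feature):
    """Reduction in Gini impurity from splitting on one feature."""
    weighted = 0.0
    for value in set(row[feature] for row in rows):
        child = [y for row, y in zip(rows, labels) if row[feature] == value]
        weighted += len(child) / len(labels) * gini(child)
    return gini(labels) - weighted

# Toy data: (SNP A, SNP B) genotypes with case status following XOR.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 2
labels = [0, 1, 1, 0] * 2

gain_a = split_gain(rows, labels, 0)   # SNP A alone
gain_b = split_gain(rows, labels, 1)   # SNP B alone

# Restrict to the branch where SNP A == 0, then split on SNP B.
branch = [(row, y) for row, y in zip(rows, labels) if row[0] == 0]
branch_rows = [row for row, _ in branch]
branch_labels = [y for _, y in branch]
gain_b_given_a0 = split_gain(branch_rows, branch_labels, 1)
```

Because neither SNP shows a marginal gain here, a greedy tree has no reason to prefer either one for the first split. Random forests help in this situation by choosing splits from random subsets of SNPs and averaging over many trees.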

Wednesday, July 06, 2005

Epistasis and the genetics of human diseases

A new paper by Nagel in Comptes Rendus Biologies discusses the role of epistasis in disease susceptibility. Our paper on the ubiquitous nature of epistasis in human disease susceptibility is specifically cited and quoted (Moore, Hum Hered. 2003;56(1-3):73-82).

Nagel RL. Epistasis and the genetics of human diseases. C R Biol. 2005 Jul;328(7):606-15. [PubMed]


Epistasis or modifier genes, that is, gene-gene interactions of non-allelic partners, play a major role in susceptibility to common human diseases. This old genetic concept has experienced a major renaissance recently. Interestingly, epistatic genes can make the disease less severe, or make it more severe. Hence, most diseases are of different intensities in different individuals and in different ethnicities. This phenomenon affects sickle-cell anemia carriers and other hemoglobinopathies, systemic lupus erythematosus, cystic fibrosis, complex autoimmune diseases, venous thromboembolism, and many others. It is likely, and fortunate, that 20 years from now, patients entering a medical facility will be subjected to a genomic scanning, including pathogenic genes as well as epistatic genes.