Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Saturday, September 15, 2012

New NIH R01: Bioinformatics Strategies for Brain Imaging Genetics

Our new R01 with Drs. Li Shen and Andy Saykin from IUPUI has been funded by the National Library of Medicine. Andy and I serve as PIs with Li as the contact PI. The grant started September 1st. More information can be found on the NIH RePORT website.

Abstract

Today's generation of multi-modal imaging systems produces massive high dimensional data sets, which when coupled with high throughput genotyping data such as single nucleotide polymorphisms (SNPs), provide exciting opportunities to enhance our understanding of phenotypic characteristics and the genetic architecture of human diseases. However, the unprecedented scale and complexity of these data sets have presented critical computational bottlenecks requiring new concepts and enabling tools. To address these challenges, using the study of Alzheimer's disease (AD) as a test bed, this project will develop and validate novel bioinformatics strategies for multidimensional brain imaging genetics. Aim 1 is to develop a novel bi-multivariate analysis strategy, S3K-CCA, for studying imaging genetic associations. Existing imaging genetics methods are typically designed to discover single-SNP-single-QT, single-SNP-multi-QT or multi-SNP-single-QT associations, and have limited power in revealing complex relationships between interlinked genetic markers and correlated brain phenotypes. To overcome this limitation, S3K-CCA is designed to be a sparse bi-multivariate learning model that simultaneously uses multiple response variables with multiple predictors for analyzing large-scale multi-modal neurogenomic data. Aim 2 is to develop HD-BIG, a visualization and systems biology framework for integrative analysis of High-Dimensional Brain Imaging Genetics data. Machine learning strategies that seamlessly incorporate valuable domain knowledge to produce biologically meaningful results remain an under-explored area in imaging genetics. In this aim, we will develop a user-friendly heat map interface to visualize high-dimensional results, adjust learning parameters and strategies, interact with existing bioinformatics resources and tools, and facilitate visual exploratory and systems biology analysis.
A novel imaging genetic enrichment analysis (IGEA) method will be developed to identify relevant genetic pathways and associated brain circuits, and to reveal complex relationships among them. Aim 3 is to evaluate the proposed S3K-CCA and IGEA methods and the HD-BIG framework using both simulated and real imaging genetics data. This project is expected to produce novel bioinformatics algorithms and tools for comprehensive joint analysis of large scale heterogeneous imaging genetics data. The availability of these powerful methods is critical to the success of many imaging genetics initiatives. In addition, they can also help enable new computational applications in other areas of biomedical research where systematic and integrative analysis of large-scale multi-modal data is critical. Using AD as an exemplar, the proposed methods will demonstrate the potential for enhancing mechanistic understanding of complex disorders, which can benefit public health outcomes by facilitating diagnostic and therapeutic progress.

Tuesday, September 11, 2012

Epistasis dominates the genetic architecture of Drosophila quantitative traits

Great new paper from Dr. Trudy Mackay at NC State. Ask yourself this simple question: if epistasis is so common in flies, why not in humans?

Huang W, Richards S, Carbone MA, Zhu D, Anholt RR, Ayroles JF, Duncan L, Jordan KW, Lawrence F, Magwire MM, Warner CB, Blankenburg K, Han Y, Javaid M, Jayaseelan J, Jhangiani SN, Muzny D, Ongeri F, Perales L, Wu YQ, Zhang Y, Zou X, Stone EA, Gibbs RA, Mackay TF. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc Natl Acad Sci U S A. 2012 Sep 4. [PubMed]

Abstract

Epistasis (nonlinear genetic interactions between polymorphic loci) is the genetic basis of canalization and speciation, and epistatic interactions can be used to infer genetic networks affecting quantitative traits. However, the role that epistasis plays in the genetic architecture of quantitative traits is controversial. Here, we compared the genetic architecture of three Drosophila life history traits in the sequenced inbred lines of the Drosophila melanogaster Genetic Reference Panel (DGRP) and a large outbred, advanced intercross population derived from 40 DGRP lines (Flyland). We assessed allele frequency changes between pools of individuals at the extremes of the distribution for each trait in the Flyland population by deep DNA sequencing. The genetic architecture of all traits was highly polygenic in both analyses. Surprisingly, none of the SNPs associated with the traits in Flyland replicated in the DGRP and vice versa. However, the majority of these SNPs participated in at least one epistatic interaction in the DGRP. Despite apparent additive effects at largely distinct loci in the two populations, the epistatic interactions perturbed common, biologically plausible, and highly connected genetic networks. Our analysis underscores the importance of epistasis as a principal factor that determines variation for quantitative traits and provides a means to uncover genetic networks affecting these traits. Knowledge of epistatic networks will contribute to our understanding of the genetic basis of evolutionarily and clinically important traits and enhance predictive ability at an individualized level in medicine and agriculture.
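
One way to see how apparent additive effects could land on different loci in two populations is a toy two-locus calculation. The sketch below is purely illustrative and is not the analysis from the paper: the XOR-style trait, the 0/1 genotype coding, and the function name `marginal_effect_of_a` are all assumptions made for the example. It computes the apparent single-locus effect of locus A when the trait depends only on the interaction between A and B.

```python
def marginal_effect_of_a(freq_b):
    """Apparent single-locus effect of A under a pure interaction model.

    Hypothetical model: genotypes are coded 0/1 (as in fully inbred lines),
    the trait equals A XOR B, and B is independent of A with
    frequency `freq_b` for the B=1 genotype.
    """
    def trait(a, b):
        return a ^ b

    # Mean trait among A=1 and A=0 carriers, averaging over B's genotypes.
    mean_a1 = (1 - freq_b) * trait(1, 0) + freq_b * trait(1, 1)
    mean_a0 = (1 - freq_b) * trait(0, 0) + freq_b * trait(0, 1)
    return mean_a1 - mean_a0


# The same locus looks positive, null, or negative depending on
# the allele frequency at its interaction partner.
for freq_b in (0.1, 0.5, 0.9):
    print(freq_b, round(marginal_effect_of_a(freq_b), 3))
```

In this toy model the apparent effect of A works out to 1 - 2 * freq_b, so a single-SNP test would see A as trait-increasing in one population, invisible in another, and trait-decreasing in a third, even though the underlying interaction never changes.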