New NIH R01: Bioinformatics Strategies for Brain Imaging Genetics
Our new R01 with Drs. Li Shen and Andy Saykin from IUPUI has been funded by the National Library of Medicine. Andy and I serve as PIs with Li as the contact PI. The grant started September 1st. More information can be found on the NIH RePORT website.
Today's generation of multi-modal imaging systems produces massive high-dimensional data sets, which, when coupled with high-throughput genotyping data such as single nucleotide polymorphisms (SNPs), provide exciting opportunities to enhance our understanding of phenotypic characteristics and the genetic architecture of human diseases. However, the unprecedented scale and complexity of these data sets have presented critical computational bottlenecks requiring new concepts and enabling tools. To address these challenges, using the study of Alzheimer's disease (AD) as a test bed, this project will develop and validate novel bioinformatics strategies for multidimensional brain imaging genetics. Aim 1 is to develop a novel bi-multivariate analysis strategy, S3K-CCA, for studying imaging genetic associations. Existing imaging genetics methods are typically designed to discover single-SNP-single-QT, single-SNP-multi-QT or multi-SNP-single-QT associations (QT: quantitative trait), and have limited power in revealing complex relationships between interlinked genetic markers and correlated brain phenotypes. To overcome this limitation, S3K-CCA is designed to be a sparse bi-multivariate learning model that simultaneously uses multiple response variables with multiple predictors for analyzing large-scale multi-modal neurogenomic data. Aim 2 is to develop HD-BIG, a visualization and systems biology framework for integrative analysis of High-Dimensional Brain Imaging Genetics data. How machine learning strategies can seamlessly incorporate valuable domain knowledge to produce biologically meaningful results is still an under-explored area in imaging genetics. In this aim, we will develop a user-friendly heat map interface to visualize high-dimensional results, adjust learning parameters and strategies, interact with existing bioinformatics resources and tools, and facilitate visual exploratory and systems biology analysis.
A novel imaging genetic enrichment analysis (IGEA) method will be developed to identify relevant genetic pathways and associated brain circuits, and to reveal complex relationships among them. Aim 3 is to evaluate the proposed S3K-CCA and IGEA methods and the HD-BIG framework using both simulated and real imaging genetics data. This project is expected to produce novel bioinformatics algorithms and tools for comprehensive joint analysis of large-scale heterogeneous imaging genetics data. The availability of these powerful methods is critical to the success of many imaging genetics initiatives. They can also help enable new computational applications in other areas of biomedical research where systematic and integrative analysis of large-scale multi-modal data is critical. Using AD as an exemplar, the proposed methods will demonstrate the potential for enhancing mechanistic understanding of complex disorders, which can benefit public health outcomes by facilitating diagnostic and therapeutic progress.
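
For readers unfamiliar with the "sparse bi-multivariate" idea behind Aim 1, here is a minimal sketch of a generic sparse canonical correlation analysis (CCA) on simulated data. It is not S3K-CCA, whose details are not given in the abstract; it is a textbook-style alternating soft-thresholding scheme, and the function names, penalty values, and simulated "SNP" and "QT" matrices are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry toward zero by lam; entries below lam become exactly 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_cca(X, Y, lam_u=0.4, lam_v=0.4, n_iter=100, tol=1e-8):
    """Generic sparse CCA sketch (hypothetical helper, not the S3K-CCA model).

    Finds sparse weight vectors u (over columns of X) and v (over columns of Y)
    that make X @ u and Y @ v strongly correlated. Assumes no constant columns.
    """
    # Standardize columns so the cross-product below is a correlation matrix.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]

    # Start from the leading right singular vector of the cross-correlation.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[0]
    u = np.zeros(C.shape[0])

    for _ in range(n_iter):
        u_new = soft_threshold(C @ v, lam_u)
        if not u_new.any():          # penalty too strong: everything was zeroed
            return np.zeros_like(u), np.zeros_like(v)
        u_new /= np.linalg.norm(u_new)

        v_new = soft_threshold(C.T @ u_new, lam_v)
        if not v_new.any():
            return np.zeros_like(u), np.zeros_like(v)
        v_new /= np.linalg.norm(v_new)

        converged = (np.linalg.norm(u_new - u) < tol
                     and np.linalg.norm(v_new - v) < tol)
        u, v = u_new, v_new
        if converged:
            break
    return u, v

# Simulated example: 200 subjects, 50 "SNP" columns, 20 "QT" columns.
# Only the first 3 SNP columns and first 2 QT columns share a latent signal z.
rng = np.random.default_rng(0)
n, p, q = 200, 50, 20
z = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, :3] = z[:, None] + 0.5 * rng.standard_normal((n, 3))
Y[:, :2] = z[:, None] + 0.5 * rng.standard_normal((n, 2))

u, v = sparse_cca(X, Y)
print("selected SNP columns:", np.flatnonzero(u).tolist())
print("selected QT columns:", np.flatnonzero(v).tolist())
```

The point of the sketch is the contrast with single-SNP or single-QT tests: both weight vectors are estimated jointly, and the sparsity penalty keeps only the columns on each side that contribute to the shared association.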