Epistasis Blog

From the Computational Genetics Laboratory at the University of Pennsylvania (www.epistasis.org)

Saturday, January 13, 2018

News piece on Gene Medic

Here is a news piece on my new Atari 2600 game Gene Medic that appeared in the Daily Pennsylvanian.

Thursday, January 11, 2018

A heuristic method for simulating open-data of arbitrary complexity that can be used to compare and evaluate machine learning methods

A new version of our HIBACHI approach for simulating more realistic data.

Moore JH, Shestov M, Schmitt P, Olson RS. A heuristic method for simulating open-data of arbitrary complexity that can be used to compare and evaluate machine learning methods. Pac Symp Biocomput. 2018;23:259-267. [PDF]

A central challenge of developing and evaluating artificial intelligence and machine learning methods for regression and classification is access to data that illuminates the strengths and weaknesses of different methods. Open data plays an important role in this process by making it easy for computational researchers to access real data for this purpose. Genomics has in some cases taken a leading role in the open data effort, starting with DNA microarrays. While real data from experimental and observational studies is necessary for developing computational methods, it is not sufficient, because it is not possible to know the ground truth in real data. Real data must therefore be accompanied by simulated data in which the balance between signal and noise is known and can be directly evaluated. Unfortunately, there is a lack of methods and software for simulating data with the kind of complexity found in real biological and biomedical systems. We present here the Heuristic Identification of Biological Architectures for simulating Complex Hierarchical Interactions (HIBACHI) method and prototype software for simulating complex biological and biomedical data. Further, we introduce new methods for developing simulation models that generate data that specifically allows discrimination between different machine learning methods.
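To make the idea of simulated data with a known balance between signal and noise concrete, here is a minimal sketch. It is not the HIBACHI method or its software. The function names, the XOR-like interaction rule, and the `noise_rate` parameter are all assumptions chosen for illustration.

```python
import random


def simulate_dataset(n_samples, n_snps, noise_rate, seed=0):
    """Simulate genotypes and a binary phenotype with a known ground truth.

    Hypothetical illustration only: the rule and parameters are assumptions,
    not the models that HIBACHI discovers.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        # Genotypes coded as 0, 1, or 2 copies of the minor allele.
        genotypes = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
        # Ground truth: an XOR-like interaction between SNP 0 and SNP 1.
        signal = (genotypes[0] + genotypes[1]) % 2
        # Noise: flip the label with a known probability.
        flipped = rng.random() < noise_rate
        label = 1 - signal if flipped else signal
        rows.append((genotypes, signal, label))
    return rows


def observed_noise(rows):
    """Fraction of samples whose label disagrees with the ground truth."""
    return sum(1 for _, signal, label in rows if signal != label) / len(rows)
```

Because the generating rule and the noise rate are both known, any classifier trained on the labels can be scored against the true signal, which is not possible with real data.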

Wednesday, January 10, 2018

Leveraging putative enhancer-promoter interactions to investigate two-way epistasis in Type 2 Diabetes GWAS

We presented this paper at the 2018 Pacific Symposium on Biocomputing. This is an effort to incorporate functional genomics annotations into epistasis analysis in regulatory regions.

Manduchi E, Chesi A, Hall MA, Grant SFA, Moore JH. Leveraging putative enhancer-promoter interactions to investigate two-way epistasis in Type 2 Diabetes GWAS. Pac Symp Biocomput. 2018;23:548-558. [PDF]

We utilized evidence for enhancer-promoter interactions from functional genomics data in order to build biological filters to narrow down the search space for two-way Single Nucleotide Polymorphism (SNP) interactions in Type 2 Diabetes (T2D) Genome-Wide Association Studies (GWAS). This led us to identify a reproducible, statistically significant SNP pair associated with T2D. As more functional genomics data are being generated that can help identify potentially interacting enhancer-promoter pairs in larger collections of tissues/cells, this approach has implications for investigation of epistasis from GWAS in general.
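The filtering idea can be sketched as follows: rather than testing every SNP pair, keep only pairs in which one SNP lies in an enhancer and the other lies in a promoter that the enhancer putatively interacts with. This is a rough illustration, not the pipeline used in the paper. The SNP names, coordinates, and helper functions are hypothetical.

```python
from itertools import combinations


def snps_in_region(snps, region):
    """Return names of SNPs that fall inside a (chrom, start, end) region."""
    chrom, start, end = region
    return [name for name, (c, pos) in snps.items()
            if c == chrom and start <= pos <= end]


def filtered_pairs(snps, interactions):
    """Keep SNP pairs that span a putative enhancer-promoter interaction."""
    pairs = set()
    for enhancer, promoter in interactions:
        for a in snps_in_region(snps, enhancer):
            for b in snps_in_region(snps, promoter):
                if a != b:
                    pairs.add(tuple(sorted((a, b))))
    return pairs


# Hypothetical toy data: SNP name -> (chromosome, position).
snps = {
    "snpA": ("chr1", 100),
    "snpB": ("chr1", 5000),
    "snpC": ("chr1", 9000),
    "snpD": ("chr2", 150),
}
# One assumed enhancer-promoter interaction on chr1.
interactions = [(("chr1", 50, 200), ("chr1", 4900, 5100))]

all_pairs = list(combinations(sorted(snps), 2))
candidate_pairs = filtered_pairs(snps, interactions)
```

In this toy example the filter reduces the six possible SNP pairs to one candidate pair. At GWAS scale a reduction of this kind lowers the number of interaction tests, and with it the multiple-testing burden.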

Monday, January 01, 2018

Gene Medic - a retro edutainment game for the Atari 2600


I am pleased to announce the release of my new retro edutainment game of genome medicine for the Atari 2600 video computer system (VCS). The game is called Gene Medic and the goal is to edit a patient's mutations to restore health. You can find information about the game along with the binary and source code here.