Epistasis Blog

From the Computational Genetics Laboratory at the University of Pennsylvania (www.epistasis.org)

Sunday, December 23, 2007

An Open-Ended Computational Evolution System for the Genetic Analysis of Epistasis

Our paper on the development and evaluation of a prototype computational evolution system for epistasis analysis has been accepted for oral presentation at the EvoBIO'08 conference in Naples, Italy in March 2008. The peer-reviewed paper will be published by Springer in Lecture Notes in Computer Science. Email me after Jan. 1st for a preprint. A complete list of accepted papers for the conference can be found here. This paper was inspired by the Banzhaf et al. review that distinguishes 'artificial evolution' from 'computational evolution', and it fits with our theme of using expert knowledge to guide stochastic search algorithms for genetic analysis. I have included Figure 1 below the abstract.

Moore, J.H., Andrews, P.C., Barney, N., White, B.C. Development and Evaluation of an Open-Ended Computational Evolution System for the Genetic Analysis of Susceptibility to Common Human Diseases. Lecture Notes in Computer Science, in press (2008).

Abstract. An important goal of human genetics is to identify DNA sequence variations that are predictive of susceptibility to common human diseases. This is a classification problem with data consisting of discrete attributes and a binary outcome. A variety of different machine learning methods based on artificial evolution have been developed and applied to modeling the relationship between genotype and phenotype. While artificial evolution approaches show promise, they are far from perfect and are only loosely based on real biological and evolutionary processes. It has recently been suggested that a new paradigm is needed where ‘artificial evolution’ is transformed to ‘computational evolution’ by incorporating more biological and evolutionary complexity into existing algorithms. It has been proposed that computational evolution systems will be more likely to solve problems of interest to biologists and biomedical researchers. The goal of the present study was to develop and evaluate a prototype computational evolution system for the analysis of human genetics data. We describe here this new open-ended computational evolution system and provide initial results from a simulation study that suggest more complex operators result in better solutions. This study represents a first step towards the use of computational evolution for bioinformatics problem-solving in the domain of human genetics.

Figure 1. Visual overview of our prototype computational evolution system for discovering symbolic discriminant functions that differentiate diseased subjects from healthy subjects using information about single nucleotide polymorphisms (SNPs). The hierarchical structure is shown on the left while some specific examples at each level are shown on the right. The top two levels of the hierarchy (A and B) exist to generate variability in the operators that modify the solutions. Shown in C is an example set of operators that will perform recombination on the two solutions shown in D. As illustrated in B, there is a 0.50 probability that a mutation to the recombination operator in C will add an operator, thus making this particular operator more complex. This system allows operators of any arbitrary complexity to modify solutions. Note that we used a 24x24 grid of solutions in the present study. A 12x12 grid is shown as an illustrative example.
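
For readers who think in code, here is a rough sketch of the idea in the caption: solutions sit on a grid, each with an operator that can itself mutate, and a mutation adds a step to the operator with probability 0.50. This is not the code from the paper. The primitive names in `PRIMITIVES`, the functions `mutate_operator` and `make_grid`, the list-of-steps representation, and what happens in the other half of the cases (one step is dropped) are all assumptions made for illustration; only the 0.50 probability and the 24x24 grid size come from the caption.

```python
import random

# Hypothetical primitive steps; the post does not list the real building blocks.
PRIMITIVES = ["copy", "swap", "delete", "point_mutate"]


def mutate_operator(operator, rng, p_add=0.50):
    """Return a mutated copy of `operator`, a list of primitive step names."""
    mutated = list(operator)
    if rng.random() < p_add:
        # Growth: append one more primitive, making the operator more complex.
        mutated.append(rng.choice(PRIMITIVES))
    elif len(mutated) > 1:
        # Assumed alternative (not from the caption): drop one step,
        # but never leave the operator empty.
        del mutated[rng.randrange(len(mutated))]
    return mutated


def make_grid(size, rng):
    """Build a size x size grid; each cell pairs a solution slot with its own operator."""
    return [
        [{"solution": None, "operator": [rng.choice(PRIMITIVES)]} for _ in range(size)]
        for _ in range(size)
    ]


rng = random.Random(0)          # fixed seed so runs are repeatable
grid = make_grid(24, rng)       # 24x24, the grid size used in the study
cell = grid[0][0]
cell["operator"] = mutate_operator(cell["operator"], rng)
```

Because an operator is just a list that can keep growing, nothing in this sketch caps how complex it can become, which is the sense in which the system is "open-ended".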

