Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Wednesday, April 26, 2006

Interaction Graphs and Dendrograms

Our paper applying logistic regression, MDR, interaction graphs, and interaction dendrograms to the relationship between DNA repair gene SNPs, smoking, and bladder cancer susceptibility has been published in Carcinogenesis. The final PDF is available for download.

Andrew AS, Nelson HH, Kelsey KT, Moore JH, Meng AC, Casella DP, Tosteson TD, Schned AR, Karagas MR. Concordance of multiple analytical approaches demonstrates a complex relationship between DNA repair gene SNPs, smoking and bladder cancer susceptibility. Carcinogenesis. 2006 May;27(5):1030-1037. [PubMed]

Sunday, April 16, 2006

MDR Downloads

Our open-source MDR software packages were downloaded 498 times from Sourceforge.net during March. This tops the previous record of 388, set in October 2005. The total download count across all versions since the first release in February 2005 is 4310. The current version of MDR is 1.0.0rc1 and can be downloaded from here. This is the last beta version. The next release will be MDR 1.0.0, which should be ready by the end of May. Thanks to all who have provided feedback on the software.

Saturday, April 15, 2006

Dissecting Trait Heterogeneity

Thornton-Wells et al. have published a new paper in BMC Bioinformatics that explores the use of cluster analysis as an approach for dealing with trait heterogeneity.

Thornton-Wells TA, Moore JH, Haines JL. Dissecting trait heterogeneity: a comparison of three clustering methods applied to genotypic data. BMC Bioinformatics. 2006 Apr 12;7(1):204 [PubMed]


BACKGROUND: Trait heterogeneity, which exists when a trait has been defined with insufficient specificity such that it is actually two or more distinct traits, has been implicated as a confounding factor in traditional statistical genetics of complex human disease. In the absence of detailed phenotypic data collected consistently in combination with genetic data, unsupervised computational methodologies offer the potential for discovering underlying trait heterogeneity. The performance of three such methods--Bayesian Classification, Hypergraph-Based Clustering, and Fuzzy k-Modes Clustering--appropriate for categorical data was compared. Also tested was the ability of these methods to detect trait heterogeneity in the presence of locus heterogeneity and/or gene-gene interaction, which are two other complicating factors in discovering genetic models of complex human disease. To determine the efficacy of applying the Bayesian Classification method to real data, the reliability of its internal clustering metrics at finding good clusterings was evaluated using permutation testing.

RESULTS: Bayesian Classification outperformed the other two methods, with the exception that the Fuzzy k-Modes Clustering performed best on the most complex genetic model. Bayesian Classification achieved excellent recovery for 75% of the datasets simulated under the simplest genetic model, while it achieved moderate recovery for 56% of datasets with a sample size of 500 or more (across all simulated models) and for 86% of datasets with 10 or fewer nonfunctional loci (across all simulated models). Neither Hypergraph Clustering nor Fuzzy k-Modes Clustering achieved good or excellent cluster recovery for a majority of datasets even under a restricted set of conditions. When using the average log of class strength as the internal clustering metric, the false positive rate was controlled very well, at three percent or less for all three significance levels (0.01, 0.05, 0.10), and the false negative rate was acceptably low (18 percent) for the least stringent significance level of 0.10.

CONCLUSIONS: Bayesian Classification shows promise as an unsupervised computational method for dissecting trait heterogeneity in genotypic data. Its control of false positive and false negative rates lends confidence to the validity of its results. Further investigation of how different parameter settings may improve the performance of Bayesian Classification, especially under more complex genetic models, is ongoing.
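
The abstract does not spell out how the permutation testing was carried out, so what follows is only a generic sketch of the idea, not the procedure used in the paper. It assumes a toy statistic (the number of individuals with matching genotypes at two loci) in place of the paper's internal clustering metric, and the names `matching_rows` and `permutation_p_value` are hypothetical. Shuffling one locus breaks any real association between the two, so the shuffled statistics approximate what chance alone would produce.

```python
import random


def matching_rows(col_a, col_b):
    """Toy statistic (an assumption): individuals with the same genotype at both loci."""
    return sum(1 for x, y in zip(col_a, col_b) if x == y)


def permutation_p_value(col_a, col_b, statistic, n_perm=199, seed=0):
    """Empirical p-value for `statistic` under random shuffling of col_b."""
    rng = random.Random(seed)          # fixed seed so the sketch is reproducible
    observed = statistic(col_a, col_b)
    shuffled = list(col_b)             # copy so the caller's data is untouched
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break any association between the loci
        if statistic(col_a, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed data as one of the permutations.
    return (hits + 1) / (n_perm + 1)


# Hypothetical genotypes coded 0/1/2 for 30 individuals at two loci.
locus_a = [0, 1, 2] * 10
locus_b = list(locus_a)                # perfectly associated toy example
p_value = permutation_p_value(locus_a, locus_b, matching_rows)
```

A small p-value here means the observed statistic is rarely matched by shuffled data. In the setting of the abstract, the statistic would be a clustering quality metric and the comparison would be made at the significance levels it lists (0.01, 0.05, 0.10).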

Friday, April 07, 2006

Epistatic effects between two genes in the renin-angiotensin system and systolic blood pressure and coronary artery calcification

A new paper by Dr. Sharon Kardia et al. in Medical Science Monitor reports evidence for epistatic effects of the insertion/deletion polymorphism in the angiotensin-converting enzyme (ACE) gene and the -6 promoter polymorphism of the angiotensinogen (AGT) gene on systolic blood pressure and coronary artery calcification. It is important to note that this is one of the few examples of the successful application of Dr. Jim Cheverud's physiological epistasis approach in a human study. Cheverud's original paper on physiological epistasis appeared in Genetics in 1995 [see PubMed].

Here is Dr. Kardia's paper:

Kardia SL, Bielak LF, Lange LA, Cheverud JM, Boerwinkle E, Turner ST, Sheedy PF II, Peyser PA. Epistatic effects between two genes in the renin-angiotensin system and systolic blood pressure and coronary artery calcification. Med Sci Monit. 2006 Mar 28;12(4):CR150-158 [PubMed]


Background: Coronary artery calcification (CAC) is an important indicator of future coronary artery disease events. Since elevated blood pressure (BP) is an important predictor of CAC, genetic polymorphisms in the renin-angiotensin system and their interaction may play a role in explaining CAC quantity variation. Material/Methods: As part of the Epidemiology of Coronary Artery Calcification Study, 166 asymptomatic women and 166 asymptomatic men were genotyped for the insertion/deletion polymorphism in the angiotensin-converting enzyme (ACE) gene and the -6 promoter polymorphism of the angiotensinogen (AGT) gene. We used a novel method to detect gene-gene interaction and compared it to the standard two-way analysis of variance (ANOVA) method. Results: Based on a two-way ANOVA model, there was no evidence for epistasis for either systolic BP or CAC in either men or women. However, using a novel method, we found evidence of significant gene-gene interaction in systolic BP in men and gene-gene interaction in both systolic BP levels and CAC quantity in women. Conclusions: Our study demonstrates that new methods of assessing epistasis may be important in understanding the complex genetics of systolic blood pressure as well as subclinical coronary atherosclerosis.
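
For readers unfamiliar with the baseline that the abstract calls the standard two-way ANOVA method, here is a minimal sketch of its interaction term. It is not the novel method from the paper, which the abstract does not describe. The sketch assumes a balanced design (the same number of individuals in every two-locus genotype cell), which real genotype data rarely satisfy, and the function name `interaction_f` and the toy phenotype values are hypothetical.

```python
from statistics import mean


def interaction_f(cells):
    """Interaction sum of squares and F statistic for a balanced two-way layout.

    cells[i][j] holds the phenotype values of individuals with genotype i at
    the first locus and genotype j at the second locus.
    """
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])
    if n < 2 or any(len(cells[i][j]) != n for i in range(a) for j in range(b)):
        raise ValueError("need a balanced design with at least 2 values per cell")

    cell_mean = [[mean(cells[i][j]) for j in range(b)] for i in range(a)]
    row_mean = [mean(cell_mean[i]) for i in range(a)]
    col_mean = [mean([cell_mean[i][j] for i in range(a)]) for j in range(b)]
    grand = mean(row_mean)

    # Interaction: what is left of each cell mean after removing both main effects.
    ss_int = n * sum(
        (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # Error: spread of individuals around their own cell mean
    # (must be nonzero for the F ratio to be defined).
    ss_err = sum(
        (y - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )
    df_int = (a - 1) * (b - 1)
    df_err = a * b * (n - 1)
    return ss_int, (ss_int / df_int) / (ss_err / df_err)


# Hypothetical purely additive phenotype: each locus shifts the mean independently.
additive = [[[i + 2 * j - 1, i + 2 * j + 1] for j in range(3)] for i in range(3)]
ss_additive, f_additive = interaction_f(additive)
```

For purely additive data such as the toy example, the interaction sum of squares is zero. A departure from additivity in even a single genotype cell makes it positive, and the F statistic would then be compared against an F distribution with (a-1)(b-1) and ab(n-1) degrees of freedom.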