Epistasis Blog

From the Artificial Intelligence Innovation Lab at Cedars-Sinai Medical Center (www.epistasis.org)

Friday, January 08, 2010

A novel approach to simulate gene-environment interactions in complex diseases

This looks interesting and perhaps useful. Let me know if you try it.

Amato R, Pinelli M, D'Andrea D, Miele G, Nicodemi M, Raiconi G, Cocozza S. A novel approach to simulate gene-environment interactions in complex diseases. BMC Bioinformatics. 2010 Jan 5;11(1):8. [PubMed]


BACKGROUND: Complex diseases are multifactorial traits caused by both genetic and environmental factors. They represent the majority of human diseases and include those with the largest prevalence and mortality (cancer, heart disease, obesity, etc.). Despite the large amount of information that has been collected about both genetic and environmental risk factors, there are relatively few examples of studies on their interactions in the epidemiological literature. One reason may be the incomplete knowledge of the power of statistical methods designed to search for risk factors and their interactions in these data sets. An improvement in this direction would lead to a better understanding and description of gene-environment interactions. To this end, a possible strategy is to challenge the different statistical methods against data sets where the underlying phenomenon is completely known and fully controllable, such as simulated ones.

RESULTS: We present a mathematical approach that models gene-environment interactions. With this method it is possible to generate simulated populations having gene-environment interactions of any form, involving any number of genetic and environmental factors and also allowing non-linear interactions such as epistasis. In particular, we implemented a simple version of this model in a Gene-Environment iNteraction Simulator (GENS), a tool designed to simulate case-control data sets where a one gene-one environment interaction influences the disease risk. The main effort has been to allow the user to describe the characteristics of the population by using standard epidemiological measures and to implement constraints that make the simulator's behavior biologically meaningful.

CONCLUSIONS: With the multi-logistic model implemented in GENS it is possible to simulate case-control samples of complex diseases where gene-environment interactions influence the disease risk. The user has full control of the main characteristics of the simulated population, and a Monte Carlo process allows random variability. A knowledge-based approach reduces the complexity of the mathematical model by using reasonable biological constraints and makes the simulation more understandable in biological terms. Simulated data sets can be used for the assessment of novel statistical methods or for the evaluation of statistical power when designing a study.
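
For readers who want a feel for what "simulating case-control data with a one gene-one environment interaction" can look like, here is a minimal Python sketch. It is not GENS and does not reproduce the paper's multi-logistic model. It assumes an ordinary logistic risk function with a product term for the interaction, a biallelic locus in Hardy-Weinberg equilibrium, and a standard-normal exposure. The function name `simulate_case_control` and all parameter names and default values are hypothetical, chosen only for illustration.

```python
import math
import random


def simulate_case_control(n_cases, n_controls, maf=0.3, beta0=-3.0,
                          beta_g=0.2, beta_e=0.3, beta_ge=0.5, seed=0):
    """Monte Carlo sketch: draw individuals from a population until the
    requested numbers of cases and controls have been collected.

    All parameters are illustrative assumptions, not values from the paper.
    Returns a list of (genotype, exposure, status) tuples.
    """
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        # Genotype: count of risk alleles (0, 1 or 2), assuming
        # Hardy-Weinberg equilibrium with risk-allele frequency `maf`.
        g = int(rng.random() < maf) + int(rng.random() < maf)
        # Environmental exposure: standard normal, independent of genotype.
        e = rng.gauss(0.0, 1.0)
        # Logistic disease risk; beta_ge * g * e is the interaction term.
        logit = beta0 + beta_g * g + beta_e * e + beta_ge * g * e
        risk = 1.0 / (1.0 + math.exp(-logit))
        affected = rng.random() < risk
        # Keep the individual only if the matching group still has room.
        if affected and len(cases) < n_cases:
            cases.append((g, e, 1))
        elif not affected and len(controls) < n_controls:
            controls.append((g, e, 0))
    return cases + controls
```

Calling `simulate_case_control(500, 500)` returns 1,000 records that could be handed to whatever interaction-detection method you want to stress-test. Setting `beta_ge=0.0` gives a matched null data set with no interaction.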

