A central goal of human genetics is to identify susceptibility genes for common human diseases. An important challenge is modelling gene-gene interaction, or epistasis, which can result in nonadditivity of genetic effects. The multifactor dimensionality reduction (MDR) method was developed as a machine learning alternative to parametric logistic regression for detecting interactions in the absence of significant marginal effects. MDR reduces the dimensionality inherent in modelling combinations of polymorphisms by means of a computational approach called constructive induction. Here, we propose a Robust Multifactor Dimensionality Reduction (RMDR) method that performs constructive induction using Fisher's exact test rather than a predetermined threshold. The advantage of this approach is that only statistically significant genotype combinations are considered in the MDR analysis. We use simulation studies to demonstrate that this approach increases the success rate of MDR when only a few genotype combinations are significantly associated with case-control status, and that there is no loss of success rate when this is not the case. We then apply the RMDR method to the detection of gene-gene interactions in genotype data from a population-based study of bladder cancer in New Hampshire.
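To make the constructive induction step concrete, the following minimal Python sketch labels each multilocus genotype combination using Fisher's exact test. It is an illustration under stated assumptions, not the implementation used in this study. The function names `fisher_exact_two_sided` and `label_cells`, the significance level `alpha = 0.05`, the "unknown" label for non-significant combinations, the choice of 2x2 table (the combination versus the rest of the sample), and the toy counts are all hypothetical.

```python
from collections import Counter
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def label_cells(genotypes, status, alpha=0.05):
    """Label each genotype combination as 'high', 'low', or 'unknown'.

    genotypes: one tuple per subject (the multilocus genotype).
    status:    1 for a case, 0 for a control.
    Assumption: each combination is tested against the rest of the sample.
    """
    cases = Counter(g for g, s in zip(genotypes, status) if s == 1)
    controls = Counter(g for g, s in zip(genotypes, status) if s == 0)
    n_cases, n_controls = sum(cases.values()), sum(controls.values())

    labels = {}
    for g in set(cases) | set(controls):
        a, b = cases[g], controls[g]
        p = fisher_exact_two_sided(a, b, n_cases - a, n_controls - b)
        if p >= alpha:
            labels[g] = "unknown"  # not significant: left out of the analysis
        elif a * n_controls > b * n_cases:
            labels[g] = "high"     # cases over-represented in this combination
        else:
            labels[g] = "low"      # controls over-represented
    return labels


# Toy data (hypothetical): (cases, controls) observed per two-locus genotype.
counts = {(0, 0): (12, 2), (0, 1): (5, 5), (1, 1): (3, 13)}
genotypes, status = [], []
for g, (n_case, n_control) in counts.items():
    genotypes += [g] * (n_case + n_control)
    status += [1] * n_case + [0] * n_control

labels = label_cells(genotypes, status)
print(labels)
```

In this sketch, a combination whose case and control counts do not differ significantly from the rest of the sample is marked "unknown" instead of being forced into a high-risk or low-risk group, which is the contrast with a fixed threshold drawn in the text above.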
- Data mining
- Machine learning