For genome-wide association studies, it is increasingly recognized that the popular locus-by-locus search for DNA variants associated with disease susceptibility may not be effective, especially when two or more loci interact; in such cases a multilocus search strategy may be more productive. However, even when computationally feasible, a genome-wide search over all possible combinations of loci requires exploring a huge model space and making a costly adjustment for multiple testing, which reduces statistical power. On the other hand, accumulating data suggest that the protein products of many disease-causing genes tend to interact with each other or to cluster in the same biological pathway. To incorporate this prior knowledge and existing data on gene networks, we propose a gene network-based method that improves statistical power over that of the exhaustive search by giving higher weights to models involving genes that are near each other in a network. We use simulated data under realistic scenarios, including a large-scale human protein-protein interaction network and 23 known ataxia-causing genes, to demonstrate the potential gain from the proposed method when disease genes are clustered in a network.
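To make the idea of weighting models by network proximity concrete, the following is a minimal sketch and not the method of the paper itself. It assumes a toy four-gene chain network, two-locus models only, a weight that decays geometrically with shortest-path distance, and a weighted Bonferroni rule for testing. The function names, the `decay` parameter, the handling of unreachable pairs, and the p-values are all hypothetical choices made for illustration.

```python
from collections import deque
from itertools import combinations


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: distance in edges from source to each reachable gene."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def pair_weights(adjacency, genes, decay=0.5):
    """Weight each gene pair by decay**distance, normalized so the weights sum to 1.

    Assumption: pairs with no connecting path are treated as being
    len(genes) edges apart, so they receive a small but nonzero weight.
    """
    raw = {}
    for a, b in combinations(sorted(genes), 2):
        d = shortest_path_lengths(adjacency, a).get(b, len(genes))
        raw[(a, b)] = decay ** d
    total = sum(raw.values())
    return {pair: w / total for pair, w in raw.items()}


def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Reject a two-locus model when its p-value is at most alpha times its weight."""
    return {pair for pair, p in p_values.items() if p <= alpha * weights[pair]}


# Toy chain network A - B - C - D (hypothetical, for illustration only).
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
genes = ["A", "B", "C", "D"]
weights = pair_weights(adjacency, genes)

# Hypothetical p-values for two of the six two-locus models.
p_values = {("A", "B"): 0.010, ("A", "D"): 0.005}

# Unweighted Bonferroni corresponds to equal weights of 1/6 per pair.
uniform = {pair: 1.0 / len(weights) for pair in weights}

rejected_weighted = weighted_bonferroni(p_values, weights)
rejected_uniform = weighted_bonferroni(p_values, uniform)
print("weighted:", rejected_weighted)
print("uniform: ", rejected_uniform)
```

In this toy setting the adjacent pair (A, B) passes the weighted threshold but not the uniform one, while the distant pair (A, D) shows the opposite pattern. This illustrates the trade-off in the abstract: power is gained when the causal genes are close in the network, at the cost of power for models involving distant genes.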
Bibliographical note
Funding Information:
Acknowledgments: This research was partially supported by NIH grants GM081535 and HL65462. The author is grateful to Dr. Trey Ideker for providing the PPI network data, and thanks two reviewers for helpful comments.