TY - JOUR
T1 - Effects of environment, genetics and data analysis pitfalls in an esophageal cancer genome-wide association study
AU - Statnikov, Alexander
AU - Li, Chun
AU - Aliferis, Constantin F.
PY - 2007/9/26
Y1 - 2007/9/26
N2 - Background. The development of new high-throughput genotyping technologies has allowed fast evaluation of single nucleotide polymorphisms (SNPs) on a genome-wide scale. Several recent genome-wide association studies employing these technologies suggest that panels of SNPs can be a useful tool for predicting cancer susceptibility and discovering potentially important new disease loci. Methodology/Principal Findings. In the present paper we undertake a careful examination of the relative significance of genetics, environmental factors, and biases of the data analysis protocol that was used in a previously published genome-wide association study. That prior study reported a nearly perfect discrimination of esophageal cancer patients and healthy controls on the basis of only genetic information. On the other hand, our results strongly suggest that SNPs in this dataset are not statistically linked to the phenotype, while several environmental factors and especially family history of esophageal cancer (a proxy for both environmental and genetic factors) have only a modest association with the disease. Conclusions/Significance. The main component of the previously claimed strong discriminatory signal is due to several data analysis pitfalls that in combination led to the strongly optimistic results. Such pitfalls are preventable and should be avoided in future studies since they create misleading conclusions and generate many false leads for subsequent research.
AB - Background. The development of new high-throughput genotyping technologies has allowed fast evaluation of single nucleotide polymorphisms (SNPs) on a genome-wide scale. Several recent genome-wide association studies employing these technologies suggest that panels of SNPs can be a useful tool for predicting cancer susceptibility and discovering potentially important new disease loci. Methodology/Principal Findings. In the present paper we undertake a careful examination of the relative significance of genetics, environmental factors, and biases of the data analysis protocol that was used in a previously published genome-wide association study. That prior study reported a nearly perfect discrimination of esophageal cancer patients and healthy controls on the basis of only genetic information. On the other hand, our results strongly suggest that SNPs in this dataset are not statistically linked to the phenotype, while several environmental factors and especially family history of esophageal cancer (a proxy for both environmental and genetic factors) have only a modest association with the disease. Conclusions/Significance. The main component of the previously claimed strong discriminatory signal is due to several data analysis pitfalls that in combination led to the strongly optimistic results. Such pitfalls are preventable and should be avoided in future studies since they create misleading conclusions and generate many false leads for subsequent research.
UR - http://www.scopus.com/inward/record.url?scp=41549136980&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=41549136980&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0000958
DO - 10.1371/journal.pone.0000958
M3 - Article
C2 - 17895998
AN - SCOPUS:41549136980
SN - 1932-6203
VL - 2
JO - PLoS One
JF - PLoS One
IS - 9
M1 - e958
ER -