A second generation human haplotype map of over 3.1 million SNPs

Kelly A. Frazer, Dennis G. Ballinger, David R. Cox, David A. Hinds, Laura L. Stuve, Richard A. Gibbs, John W. Belmont, Andrew Boudreau, Paul Hardenbol, Suzanne M. Leal, Shiran Pasternak, David A. Wheeler, Thomas D. Willis, Fuli Yu, Huanming Yang, Changqing Zeng, Yang Gao, Haoran Hu, Weitao Hu, Chaohua Li, Wei Lin, Siqi Liu, Hao Pan, Xiaoli Tang, Jian Wang, Wei Wang, Jun Yu, Bo Zhang, Qingrun Zhang, Hongbin Zhao, Hui Zhao, Jun Zhou, Stacey B. Gabriel, Rachel Barry, Brendan Blumenstiel, Amy Camargo, Matthew Defelice, Maura Faggart, Mary Goyette, Supriya Gupta, Jamie Moore, Huy Nguyen, Robert C. Onofrio, Melissa Parkin, Jessica Roy, Erich Stahl, Ellen Winchester, Liuda Ziaugra, David Altshuler, Yan Shen, Zhijian Yao, Wei Huang, Xun Chu, Yungang He, Li Jin, Yangfan Liu, Yayun Shen, Weiwei Sun, Haifeng Wang, Yi Wang, Ying Wang, Xiaoyan Xiong, Liang Xu, Mary M Y Waye, Stephen K W Tsui, Hong Xue, J. Tze Fei Wong, Luana M. Galver, Jian Bing Fan, Kevin Gunderson, Sarah S. Murray, Arnold R. Oliphant, Mark S. Chee, Alexandre Montpetit, Fanny Chagnon, Vincent Ferretti, Martin Leboeuf, Jean François Olivier, Michael S. Phillips, Stéphanie Roumy, Clémentine Sallée, Andrei Verner, Thomas J. Hudson, Pui Yan Kwok, Dongmei Cai, Daniel C. Koboldt, Raymond D. Miller, Ludmila Pawlikowska, Patricia Taillon-Miller, Ming Xiao, Lap Chee Tsui, William Mak, Qiang Song You, Paul K H Tam, Yusuke Nakamura, Takahisa Kawaguchi, Takuya Kitamoto, Takashi Morizono, Atsushi Nagashima, Yozo Ohnishi, Akihiro Sekine, Toshihiro Tanaka, Tatsuhiko Tsunoda, Panos Deloukas, Christine P. Bird, Marcos Delgado, Emmanouil T. Dermitzakis, Rhian Gwilliam, Sarah Hunt, Jonathan Morrison, Don Powell, Barbara E. Stranger, Pamela Whittaker, David R. Bentley, Mark J. Daly, Paul I W De Bakker, Jeff Barrett, Yves R. Chretien, Julian Maller, Steve McCarroll, Nick Patterson, Itsik Pe'er, Alkes Price, Shaun Purcell, Daniel J. Richter, Pardis Sabeti, Richa Saxena, Stephen F. Schaffner, Pak C. Sham, Patrick Varilly, Lincoln D. Stein, Lalitha Krishnan, Albert Vernon Smith, Marcela K. Tello-Ruiz, Gudmundur A. Thorisson, Aravinda Chakravarti, Peter E. Chen, David J. Cutler, Carl S. Kashuk, Shin Lin, Gonçalo R. Abecasis, Weihua Guan, Yun Li, Heather M. Munro, Zhaohui Steve Qin, Daryl J. Thomas, Gilean McVean, Adam Auton, Leonardo Bottolo, Niall Cardin, Susana Eyheramendy, Colin Freeman, Jonathan Marchini, Simon Myers, Chris Spencer, Matthew Stephens, Peter Donnelly, Lon R. Cardon, Geraldine Clarke, David M. Evans, Andrew P. Morris, Bruce S. Weir, Todd A. Johnson, James C. Mullikin, Stephen T. Sherry, Michael Feolo, Andrew Skol, Houcan Zhang, Ichiro Matsuda, Yoshimitsu Fukushima, Darryl R. Macer, Eiko Suda, Charles N. Rotimi, Clement A. Adebamowo, Ike Ajayi, Toyin Aniagwu, Patricia A. Marshall, Chibuzor Nkwodimmah, Charmaine D M Royal, Mark F. Leppert, Missy Dixon, Andy Peiffer, Renzong Qiu, Alastair Kent, Kazuto Kato, Norio Niikawa, Isaac F. Adewole, Bartha M. Knoppers, Morris W. Foster, Ellen Wright Clayton, Jessica Watkin, Donna Muzny, Lynne Nazareth, Erica Sodergren, George M. Weinstock, Imtaz Yakub, Bruce W. Birren, Richard K. Wilson, Lucinda L. Fulton, Jane Rogers, John Burton, Nigel P. Carter, Christopher M. Clee, Mark Griffiths, Matthew C. Jones, Kirsten McLay, Robert W. Plumb, Mark T. Ross, Sarah K. Sims, David L. Willey, Zhu Chen, Hua Han, Le Kang, Martin Godbout, John C. Wallenburg, Paul L'Archevêque, Guy Bellemare, Koji Saeki, Hongguang Wang, Daochang An, Hongbo Fu, Qing Li, Zhen Wang, Renwu Wang, Arthur L. Holden, Lisa D. Brooks, Jean E. McEwen, Mark S. Guyer, Vivian Ota Wang, Jane L. Peterson, Michael Shi, Jack Spiegel, Lawrence M. Sung, Lynn F. Zacharia, Francis S. Collins, Karen Kennedy, Ruth Jamieson, John Stewart

We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.

Original language: English (US)
Pages: 851-861
Number of pages: 11
Issue: 7164
Published: Oct 18 2007

