Construction and comparison of three reference-quality genome assemblies for soybean

Babu Valliyodan, Steven B. Cannon, Philipp E. Bayer, Shengqiang Shu, Anne V. Brown, Longhui Ren, Jerry Jenkins, Claire Y.L. Chung, Ting Fung Chan, Christopher G. Daum, Christopher Plott, Alex Hastie, Kobi Baruch, Kerrie W. Barry, Wei Huang, Gunvant Patil, Rajeev K. Varshney, Haifei Hu, Jacqueline Batley, Yuxuan Yuan, Qijian Song, Robert M. Stupar, David M. Goodstein, Gary Stacey, Hon Ming Lam, Scott A. Jackson, Jeremy Schmutz, Jane Grimwood, David Edwards, Henry T. Nguyen

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

We report reference-quality genome assemblies and annotations for two accessions of soybean (Glycine max) and for one accession of Glycine soja, the closest wild relative of G. max. The G. max assemblies are for widely used US cultivars: the northern line Williams 82 (Wm82) and the southern line Lee. The Wm82 assembly improves on the previously published assembly, and the Lee and G. soja assemblies are new for these accessions. Comparisons among the three accessions show generally high structural conservation, but nucleotide differences of 1.7 single-nucleotide polymorphisms (SNPs) per kb between Wm82 and Lee, and 4.7 SNPs per kb between these lines and G. soja. SNP distributions and comparisons with genotypes of the Lee and Wm82 parents highlight patterns of introgression and haplotype structure. Comparisons against the US germplasm collection show the placement of the sequenced accessions relative to global soybean diversity. Analysis of a pan-gene collection shows generally high conservation, with variation occurring primarily in genomically clustered gene families. We found approximately 40–42 inversions per chromosome between either Lee or Wm82v4 and G. soja, and approximately 32 inversions per chromosome between Wm82 and Lee. We also investigated five domestication loci. For each locus, we found two different alleles with functional differences between G. soja and the two domesticated accessions. The genome assemblies for multiple cultivated accessions and for the closest wild ancestor of soybean provide a valuable set of resources for identifying causal variants that underlie traits for the domestication and improvement of soybean, serving as a basis for future research and crop improvement efforts for this important crop species.

Original language: English (US)
Pages (from-to): 1066-1082
Number of pages: 17
Journal: Plant Journal
Volume: 100
Issue number: 5
State: Published - Dec 1 2019

Bibliographical note

Funding Information:
The authors thank Sarah Kingan (Pacific Biosciences) for providing exploratory sequence and analysis using PacBio sequence data, Nathan Weeks (USDA-ARS, Ames, IA) for providing computational environments for genome analysis, Andrew Farmer (NCGR, Santa Fe, NM) for calculating functional annotations for predicted genes and the HudsonAlpha Genome Sequencing Group for contributions to the Glycine max (Wm82) V4 genome assembly and resources. Substantial funding for the work was provided by the United Soybean Board and three industrial partners, Bayer Crop Science (currently BASF), Monsanto (currently Bayer), and Dow AgroSciences (currently Corteva), for funding contributions to the USB soybean genome sequencing project to H.T.N. (1320-532-5615). Funding to T.-F.C. and H.-M.L. in support of this work was supplied by the Hong Kong Research Grants Council Area of Excellence Scheme (AoE/M-403/16). The work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the US Department of Energy under contract no. DE-AC02-05CH11231. Additional funding for analysis was provided by National Science Foundation (NSF) award 1444806 to S.B.C. for genome assembly and analysis. Genome finishing work for Wm82 provided by NSF award 0822258 to S.A.J. A portion of the analysis work in this project was provided through in-kind contributions from the USDA Agricultural Research Service, project 5030-21000-069-00-D. The USDA is an equal opportunity provider and employer. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture. This work was supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia.

Publisher Copyright:
© 2019 The Authors The Plant Journal © 2019 John Wiley & Sons Ltd

Keywords

  • Glycine max
  • Glycine soja
  • comparative genomics
  • domestication
  • genome assembly
  • soybean
