Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula

Karen M. Moll, Peng Zhou, Thiruvarangan Ramaraj, Diego Fajardo, Nicholas P. Devitt, Michael J Sadowsky, Robert M Stupar, Peter L Tiffin, Jason R. Miller, Nevin D Young, Kevin A Silverstein, Joann Mudge

Research output: Contribution to journal › Article

19 Citations (Scopus)

Abstract

Background: Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner.

Results: Here, we present high-quality genome assemblies of the model legume plant, Medicago truncatula (R108), using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high-quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly.

Conclusions: Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.

Original language: English (US)
Article number: 578
Journal: BMC Genomics
Volume: 18
Issue number: 1
DOIs: 10.1186/s12864-017-3971-4
State: Published - Aug 4 2017

Keywords

  • BioNano
  • Dovetail
  • Genome assembly
  • Medicago truncatula
  • Next generation sequencing
  • PacBio

Cite this

Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula. / Moll, Karen M.; Zhou, Peng; Ramaraj, Thiruvarangan; Fajardo, Diego; Devitt, Nicholas P.; Sadowsky, Michael J; Stupar, Robert M; Tiffin, Peter L; Miller, Jason R.; Young, Nevin D; Silverstein, Kevin A; Mudge, Joann.

In: BMC Genomics, Vol. 18, No. 1, 578, 04.08.2017.

@article{95a23e17604b4940a95ce81839d2de3a,
title = "Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula",
abstract = "Background: Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Results: Here, we present high-quality genome assemblies of the model legume plant, Medicago truncatula (R108), using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high-quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Conclusions: Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.",
keywords = "BioNano, Dovetail, Genome assembly, Medicago truncatula, Next generation sequencing, PacBio",
author = "Moll, {Karen M.} and Zhou, Peng and Ramaraj, Thiruvarangan and Fajardo, Diego and Devitt, {Nicholas P.} and Sadowsky, {Michael J} and Stupar, {Robert M} and Tiffin, {Peter L} and Miller, {Jason R.} and Young, {Nevin D} and Silverstein, {Kevin A} and Mudge, Joann",
year = "2017",
month = "8",
day = "4",
doi = "10.1186/s12864-017-3971-4",
language = "English (US)",
volume = "18",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "BioMed Central",
number = "1",
}

TY  - JOUR
T1  - Strategies for optimizing BioNano and Dovetail explored through a second reference quality assembly for the legume model, Medicago truncatula
AU  - Moll, Karen M.
AU  - Zhou, Peng
AU  - Ramaraj, Thiruvarangan
AU  - Fajardo, Diego
AU  - Devitt, Nicholas P.
AU  - Sadowsky, Michael J
AU  - Stupar, Robert M
AU  - Tiffin, Peter L
AU  - Miller, Jason R.
AU  - Young, Nevin D
AU  - Silverstein, Kevin A
AU  - Mudge, Joann
PY  - 2017/8/4
Y1  - 2017/8/4
AB  - Background: Third generation sequencing technologies, with sequencing reads in the tens of kilobases, facilitate genome assembly by spanning ambiguous regions and improving continuity. This has been critical for plant genomes, which are difficult to assemble due to high repeat content, gene family expansions, segmental and tandem duplications, and polyploidy. Recently, high-throughput mapping and scaffolding strategies have further improved continuity. Together, these long-range technologies enable quality draft assemblies of complex genomes in a cost-effective and timely manner. Results: Here, we present high-quality genome assemblies of the model legume plant, Medicago truncatula (R108), using PacBio, Dovetail Chicago (hereafter, Dovetail) and BioNano technologies. To test these technologies for plant genome assembly, we generated five assemblies using all possible combinations and ordering of these three technologies in the R108 assembly. While the BioNano and Dovetail joins overlapped, they also showed complementary gains in continuity and join numbers. Both technologies spanned repetitive regions that PacBio alone was unable to bridge. Combining technologies, particularly Dovetail followed by BioNano, resulted in notable improvements compared to Dovetail or BioNano alone. A combination of PacBio, Dovetail, and BioNano was used to generate a high-quality draft assembly of R108, a M. truncatula accession widely used in studies of functional genomics. As a test for the usefulness of the resulting genome sequence, the new R108 assembly was used to pinpoint breakpoints and characterize flanking sequence of a previously identified translocation between chromosomes 4 and 8, identifying more than 22.7 Mb of novel sequence not present in the earlier A17 reference assembly. Conclusions: Adding Dovetail followed by BioNano data yielded complementary improvements in continuity over the original PacBio assembly. This strategy proved efficient and cost-effective for developing a quality draft assembly compared to traditional reference assemblies.
KW  - BioNano
KW  - Dovetail
KW  - Genome assembly
KW  - Medicago truncatula
KW  - Next generation sequencing
KW  - PacBio
UR  - http://www.scopus.com/inward/record.url?scp=85026805940&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85026805940&partnerID=8YFLogxK
U2  - 10.1186/s12864-017-3971-4
DO  - 10.1186/s12864-017-3971-4
M3  - Article
C2  - 28778149
AN  - SCOPUS:85026805940
VL  - 18
JO  - BMC Genomics
JF  - BMC Genomics
SN  - 1471-2164
IS  - 1
M1  - 578
ER  - 