Recompleting the Caenorhabditis elegans genome

Jun Yoshimura, Kazuki Ichikawa, Massa J. Shoura, Karen L. Artiles, Idan Gabdank, Lamia Wahba, Cheryl L. Smith, Mark L. Edgley, Ann E. Rougvie, Andrew Z. Fire, Shinichi Morishita, Erich M. Schwarz

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology.

Original language: English (US)
Pages (from-to): 1009-1022
Number of pages: 14
Journal: Genome research
Volume: 29
Issue number: 6
DOI: https://doi.org/10.1101/gr.244830.118
State: Published - Jan 1 2019

Fingerprint

Caenorhabditis elegans
Genome
Tandem Repeat Sequences
Systems Biology
Genomics
Genes
Yeasts
Bacteria
DNA

PubMed: MeSH publication types

  • Journal Article
  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

Cite this

Yoshimura, J., Ichikawa, K., Shoura, M. J., Artiles, K. L., Gabdank, I., Wahba, L., ... Schwarz, E. M. (2019). Recompleting the Caenorhabditis elegans genome. Genome research, 29(6), 1009-1022. https://doi.org/10.1101/gr.244830.118

Recompleting the Caenorhabditis elegans genome. / Yoshimura, Jun; Ichikawa, Kazuki; Shoura, Massa J.; Artiles, Karen L.; Gabdank, Idan; Wahba, Lamia; Smith, Cheryl L.; Edgley, Mark L.; Rougvie, Ann E.; Fire, Andrew Z.; Morishita, Shinichi; Schwarz, Erich M.

In: Genome research, Vol. 29, No. 6, 01.01.2019, p. 1009-1022.

Yoshimura, J, Ichikawa, K, Shoura, MJ, Artiles, KL, Gabdank, I, Wahba, L, Smith, CL, Edgley, ML, Rougvie, AE, Fire, AZ, Morishita, S & Schwarz, EM 2019, 'Recompleting the Caenorhabditis elegans genome', Genome research, vol. 29, no. 6, pp. 1009-1022. https://doi.org/10.1101/gr.244830.118
Yoshimura J, Ichikawa K, Shoura MJ, Artiles KL, Gabdank I, Wahba L et al. Recompleting the Caenorhabditis elegans genome. Genome research. 2019 Jan 1;29(6):1009-1022. https://doi.org/10.1101/gr.244830.118
Yoshimura, Jun ; Ichikawa, Kazuki ; Shoura, Massa J. ; Artiles, Karen L. ; Gabdank, Idan ; Wahba, Lamia ; Smith, Cheryl L. ; Edgley, Mark L. ; Rougvie, Ann E. ; Fire, Andrew Z. ; Morishita, Shinichi ; Schwarz, Erich M. / Recompleting the Caenorhabditis elegans genome. In: Genome research. 2019 ; Vol. 29, No. 6. pp. 1009-1022.
@article{aefc80d73b1b445a9e38b825b00a3b9e,
title = "Recompleting the Caenorhabditis elegans genome",
abstract = "Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98{\%} identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84{\%}) were also found in two outgroup strains, implying deficiencies in N2. Over 98{\%} of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology.",
author = "Jun Yoshimura and Kazuki Ichikawa and Shoura, {Massa J.} and Artiles, {Karen L.} and Idan Gabdank and Lamia Wahba and Smith, {Cheryl L.} and Edgley, {Mark L.} and Rougvie, {Ann E.} and Fire, {Andrew Z.} and Shinichi Morishita and Schwarz, {Erich M.}",
year = "2019",
month = "1",
day = "1",
doi = "10.1101/gr.244830.118",
language = "English (US)",
volume = "29",
pages = "1009--1022",
journal = "Genome Research",
issn = "1054-9803",
publisher = "Cold Spring Harbor Laboratory Press",
number = "6",

}

TY - JOUR

T1 - Recompleting the Caenorhabditis elegans genome

AU - Yoshimura, Jun

AU - Ichikawa, Kazuki

AU - Shoura, Massa J.

AU - Artiles, Karen L.

AU - Gabdank, Idan

AU - Wahba, Lamia

AU - Smith, Cheryl L.

AU - Edgley, Mark L.

AU - Rougvie, Ann E.

AU - Fire, Andrew Z.

AU - Morishita, Shinichi

AU - Schwarz, Erich M.

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Caenorhabditis elegans was the first multicellular eukaryotic genome sequenced to apparent completion. Although this assembly employed a standard C. elegans strain (N2), it used sequence data from several laboratories, with DNA propagated in bacteria and yeast. Thus, the N2 assembly has many differences from any C. elegans available today. To provide a more accurate C. elegans genome, we performed long-read assembly of VC2010, a modern strain derived from N2. Our VC2010 assembly has 99.98% identity to N2 but with an additional 1.8 Mb including tandem repeat expansions and genome duplications. For 116 structural discrepancies between N2 and VC2010, 97 structures matching VC2010 (84%) were also found in two outgroup strains, implying deficiencies in N2. Over 98% of N2 genes encoded unchanged products in VC2010; moreover, we predicted ≥53 new genes in VC2010. The recompleted genome of C. elegans should be a valuable resource for genetics, genomics, and systems biology.

UR - http://www.scopus.com/inward/record.url?scp=85067907489&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85067907489&partnerID=8YFLogxK

U2 - 10.1101/gr.244830.118

DO - 10.1101/gr.244830.118

M3 - Article

C2 - 31123080

AN - SCOPUS:85067907489

VL - 29

SP - 1009

EP - 1022

JO - Genome Research

JF - Genome Research

SN - 1054-9803

IS - 6

ER -