Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon

Shawn T. O'Neil, Jason D.K. Dzurisin, Rory D. Carmichael, Neil F. Lobo, Scott J. Emrich, Jessica J. Hellmann

Research output: Contribution to journal; Article; peer-review

111 Scopus citations

Abstract

Background: Several recent studies have demonstrated the use of Roche 454 sequencing technology for de novo transcriptome analysis. Low error rates and high coverage also allow for effective SNP discovery and genetic diversity estimates. However, genetically diverse datasets, such as those sourced from natural populations, pose challenges for assembly programs and subsequent analysis. Further, estimating the effectiveness of transcript discovery using Roche 454 transcriptome data is still a difficult task.

Results: Using the Roche 454 FLX Titanium platform, we sequenced and assembled larval transcriptomes for two butterfly species: the Propertius duskywing, Erynnis propertius (Lepidoptera: Hesperiidae) and the Anise swallowtail, Papilio zelicaon (Lepidoptera: Papilionidae). The Expressed Sequence Tags (ESTs) generated represent a diverse sample drawn from multiple populations, developmental stages, and stress treatments. Despite this diversity, > 95% of the ESTs assembled into long (> 714 bp on average) and highly covered (> 9.6× on average) contigs. To estimate the effectiveness of transcript discovery, we compared the number of bases in the hit region of unigenes (contigs and singletons) to the length of the best match silkworm (Bombyx mori) protein; this "ortholog hit ratio" gives a close estimate of the amount of the transcript discovered relative to a model lepidopteran genome. For each species, we tested two assembly programs and two parameter sets; although CAP3 is commonly used for such data, the assemblies produced by Celera Assembler with modified parameters were chosen over those produced by CAP3 based on contig and singleton counts as well as ortholog hit ratio analysis. In the final assemblies, 1,413 E. propertius and 1,940 P. zelicaon unigenes had a ratio > 0.8; 2,866 E. propertius and 4,015 P. zelicaon unigenes had a ratio > 0.5.

Conclusions: Ultimately, these assemblies and SNP data will be used to generate microarrays for ecoinformatics examining climate change tolerance of different natural populations. These studies will benefit from high quality assemblies with few singletons (less than 26% of bases for each assembled transcriptome are present in unassembled singleton ESTs) and effective transcript discovery (over 6,500 of our putative orthologs cover at least 50% of the corresponding model silkworm gene).
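
As an illustration of the ortholog hit ratio described in the Results, the following is a minimal sketch, not the authors' pipeline. It assumes the protein length is given in amino acids and is converted to nucleotides by multiplying by 3; the abstract does not state this conversion. The function names and the example hit values are hypothetical.

```python
from typing import Iterable, List, Tuple


def ortholog_hit_ratio(hit_bases: int, protein_length_aa: int) -> float:
    """Ratio of bases in a unigene's hit region to the ortholog's coding length.

    Assumption (not stated in the abstract): the best-match protein length
    is in amino acids, so it is multiplied by 3 to compare against bases.
    """
    if protein_length_aa <= 0:
        raise ValueError("protein_length_aa must be positive")
    if hit_bases < 0:
        raise ValueError("hit_bases must be non-negative")
    return hit_bases / (3 * protein_length_aa)


def count_above(ratios: Iterable[float], threshold: float) -> int:
    """Count unigenes whose ratio is strictly greater than the threshold."""
    return sum(1 for r in ratios if r > threshold)


# Hypothetical (hit_bases, protein_length_aa) pairs, one per unigene.
example_hits: List[Tuple[int, int]] = [(900, 400), (300, 500), (1500, 520)]
example_ratios = [ortholog_hit_ratio(b, p) for b, p in example_hits]

if __name__ == "__main__":
    print("ratio > 0.5:", count_above(example_ratios, 0.5))
    print("ratio > 0.8:", count_above(example_ratios, 0.8))
```

Under this definition a ratio near 1.0 suggests that most of the orthologous coding sequence was recovered. Values above 1.0 are possible when the hit region is longer than three times the protein length.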

Original language: English (US)
Article number: 310
Journal: BMC Genomics
Volume: 11
Issue number: 1
State: Published - May 17 2010
Externally published: Yes

Bibliographical note

Funding Information:
This work was supported by the Office of Science (BER), US Department of Energy, Grant no. DE-FG02-05ER to JJH, and the Arthur J. Schmitt Foundation. We also thank Katrina Hill, Jessica Keppel, Chris Lambert, Shannon Pelini, Aubrey Podell, Sean Ryan, and Megan Stachura for field and laboratory assistance. Finally, we thank three anonymous reviewers for insightful comments.
