TY - JOUR
T1 - Transcriptomic resources for the medicinal legume Mucuna pruriens
T2 - De novo transcriptome assembly, annotation, identification and validation of EST-SSR markers
AU - Sathyanarayana, N.
AU - Pittala, Ranjith Kumar
AU - Tripathi, Pankaj Kumar
AU - Chopra, Ratan
AU - Singh, Heikham Russiachand
AU - Belamkar, Vikas
AU - Bhardwaj, Pardeep Kumar
AU - Doyle, Jeff J.
AU - Egan, Ashley N.
N1 - Funding Information:
This research was supported by funding from the Dept. of Biotechnology (DBT), Govt. of India (Grant No. BT/PR3489/PBD/16/945/2011) to NS and from the US National Science Foundation to ANE (DEB-1352217) and JJD (DEB-0948800).
Publisher Copyright:
© 2017 The Author(s).
PY - 2017/5/25
Y1 - 2017/5/25
N2 - Background: The medicinal legume Mucuna pruriens (L.) DC. has attracted attention worldwide as a source of the anti-Parkinson's drug L-Dopa. It is also a popular green manure cover crop that offers many agronomic benefits including high protein content, nitrogen fixation and soil nutrients. The plant currently lacks genomic resources and there is limited knowledge on gene expression, metabolic pathways, and genetics of secondary metabolite production. Here, we present transcriptomic resources for M. pruriens, including a de novo transcriptome assembly and annotation, as well as differential transcript expression analyses between root, leaf, and pod tissues. We also develop microsatellite markers and analyze genetic diversity and population structure within a set of Indian germplasm accessions. Results: A total of 191,233,242 bp cleaned reads were assembled into 67,561 transcripts with a mean length of 626 bp and an N50 of 987 bp. Assembled sequences were annotated using BLASTX against public databases, with over 80% of transcripts annotated. We identified 7,493 simple sequence repeat (SSR) motifs, including 787 polymorphic repeats between the parents of a mapping population. In total, 134 SSRs from expressed sequence tags (ESTs) were screened against 23 M. pruriens accessions from India, with 52 EST-SSRs retained after quality control. Population structure analysis using a Bayesian framework implemented in fastSTRUCTURE showed groupings similar to those from distance-based (neighbor-joining) and principal component analyses, with most of the accessions clustering by geographical origin. Pair-wise comparison of transcript expression in leaves, roots and pods identified 4,387 differentially expressed transcripts, with the highest number occurring between roots and leaves. Differentially expressed transcripts were enriched with transcription factors and transcripts annotated as belonging to secondary metabolite pathways. Conclusions: The M. pruriens transcriptomic resources generated in this study provide foundational resources for gene discovery and development of molecular markers. Polymorphic SSRs identified can be used for genetic diversity, marker-trait analyses, and development of functional markers for crop improvement. The results of differential expression studies can be used to investigate genes involved in L-Dopa synthesis and other key metabolic pathways in M. pruriens.
AB - Background: The medicinal legume Mucuna pruriens (L.) DC. has attracted attention worldwide as a source of the anti-Parkinson's drug L-Dopa. It is also a popular green manure cover crop that offers many agronomic benefits including high protein content, nitrogen fixation and soil nutrients. The plant currently lacks genomic resources and there is limited knowledge on gene expression, metabolic pathways, and genetics of secondary metabolite production. Here, we present transcriptomic resources for M. pruriens, including a de novo transcriptome assembly and annotation, as well as differential transcript expression analyses between root, leaf, and pod tissues. We also develop microsatellite markers and analyze genetic diversity and population structure within a set of Indian germplasm accessions. Results: A total of 191,233,242 bp cleaned reads were assembled into 67,561 transcripts with a mean length of 626 bp and an N50 of 987 bp. Assembled sequences were annotated using BLASTX against public databases, with over 80% of transcripts annotated. We identified 7,493 simple sequence repeat (SSR) motifs, including 787 polymorphic repeats between the parents of a mapping population. In total, 134 SSRs from expressed sequence tags (ESTs) were screened against 23 M. pruriens accessions from India, with 52 EST-SSRs retained after quality control. Population structure analysis using a Bayesian framework implemented in fastSTRUCTURE showed groupings similar to those from distance-based (neighbor-joining) and principal component analyses, with most of the accessions clustering by geographical origin. Pair-wise comparison of transcript expression in leaves, roots and pods identified 4,387 differentially expressed transcripts, with the highest number occurring between roots and leaves. Differentially expressed transcripts were enriched with transcription factors and transcripts annotated as belonging to secondary metabolite pathways. Conclusions: The M. pruriens transcriptomic resources generated in this study provide foundational resources for gene discovery and development of molecular markers. Polymorphic SSRs identified can be used for genetic diversity, marker-trait analyses, and development of functional markers for crop improvement. The results of differential expression studies can be used to investigate genes involved in L-Dopa synthesis and other key metabolic pathways in M. pruriens.
KW - Differential gene expression
KW - EST-SSRs
KW - Fabaceae
KW - Leguminosae
KW - Mucuna pruriens
KW - Population structure
KW - Transcriptomics
KW - Velvet bean
UR - http://www.scopus.com/inward/record.url?scp=85019726177&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85019726177&partnerID=8YFLogxK
U2 - 10.1186/s12864-017-3780-9
DO - 10.1186/s12864-017-3780-9
M3 - Article
C2 - 28545396
AN - SCOPUS:85019726177
SN - 1471-2164
VL - 18
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 409
ER -