Proteogenomics has emerged as a valuable approach in cancer research: it integrates genomic and transcriptomic data with mass spectrometry-based proteomics data to directly identify expressed variant protein sequences that may have functional roles in cancer. The approach is computationally intensive and requires integrating disparate software tools into sophisticated workflows, which has hindered its adoption by nonexpert bench scientists. To address this need, we have developed an extensible, Galaxy-based resource aimed at giving more researchers access to, and training in, proteogenomic informatics. Our resource brings together software from several leading research groups to address two foundational aspects of proteogenomics: (i) generation of customized, annotated protein sequence databases from RNA-Seq data; and (ii) accurate matching of tandem mass spectrometry data to putative variants, followed by filtering to confirm their novelty. Directions for accessing software tools and workflows, along with instructional documentation, can be found at z.umn.edu/canresgithub. Cancer Res; 77(21); e43-46.
Bibliographical note
Funding Information:
This work was supported by Ghent University Concerted Research Action-BOF12/GOA/014 to L. Martens; Bergen Research Foundation and the Research Council of Norway (H. Barsnes); BMBF grant 031 A538A RBC (de.NBI; B. Grüning); NCI-ITCR grant 1U24CA199347 and NSF (U.S.) grant 1458524 to M.C. Chambers, P.D. Jagtap, J.E. Johnson, T. McGowan, P. Kumar, G. Onsongo, C.R. Guerrero, and T.J. Griffin.
© 2017 American Association for Cancer Research.