Background: N-linked glycoproteins are a highly interesting class of proteins for clinical and biological research. The large-scale characterization of N-linked glycoproteins by mass spectrometry-based glycoproteomics has provided valuable insights into the interdependence of glycoprotein structure and protein function. However, these studies have focused mainly on the analysis of specific sample types and lack integration of glycoproteomic data from different tissues, body fluids or cell types. Methods: In this study, we collected human glycosite-containing peptides, identified by mass spectrometry through their de-glycosylated forms, from over 100 publications and from unpublished datasets generated in our laboratory. A database resource termed N-GlycositeAtlas was created and further used to analyze the distribution of glycoproteins among different human cells, tissues and body fluids. Finally, a web interface for N-GlycositeAtlas was created to maximize the utility and value of the database. Results: The N-GlycositeAtlas database contains more than 30,000 glycosite-containing peptides (representing >14,000 N-glycosylation sites) from more than 7,200 N-glycoproteins, drawn from over 100 studies of different biological sources including human-derived tissues, body fluids and cell lines. Conclusions: The entire human N-glycoproteome database, as well as 22 sub-databases associated with individual tissues or body fluids, can be downloaded from the N-GlycositeAtlas website at http://nglycositeatlas.biomarkercenter.org.
Bibliographical note
Funding Information:
This work was supported by the National Natural Science Foundation of China (Grant Nos. 91853123, 81773180, and 21705127) and the Natural Science Foundation of Shaanxi Province (Grant No. 2018JM7086074). HZ was supported by the National Institutes of Health (Grant Nos. U01CA152813, U24CA210985, P01HL107153, and R21AI122382).