Major Histocompatibility Complex (MHC) Class I molecules provide a pathway for cells to present endogenous peptides to the immune system, allowing it to distinguish healthy cells from those infected by pathogens. Neural-network-based software tools such as NetMHC and NetMHCpan predict whether peptides will bind to variants of MHC molecules. These tools are trained on experimental data consisting of the amino acid sequences of peptides and their observed binding strengths. Such tools generally do not explicitly consider hydrophobicity, a significant biochemical factor in peptide binding. These tools were observed to predict that some highly hydrophobic peptides are strong binders, which biochemical considerations suggest is incorrect. This paper investigates how the hydrophobicity of 9-mer peptides correlates with the binding strength these tools predict for the MHC variant HLA-A*0201. Two studies were performed, one using the data on which the neural networks were trained and the other using a sample of the human proteome. Both studies showed a significant bias within NetMHC-4.0 towards predicting highly hydrophobic peptides as strong binders. This suggests that hydrophobicity should be included in the training data of the neural networks. Retraining the neural networks with such biochemical annotations of hydrophobicity could increase the accuracy of their predictions, increasing their impact in applications such as vaccine design and neoantigen identification.
|Original language||English (US)|
|Title of host publication||Mathematical and Computational Oncology - Third International Symposium, ISMCO 2021, Proceedings|
|Editors||George Bebis, Terry Gaasterland, Mamoru Kato, Mohammad Kohandel, Kathleen Wilkie|
|Publisher||Springer Science and Business Media Deutschland GmbH|
|Number of pages||14|
|State||Published - 2021|
|Event||3rd International Symposium on Mathematical and Computational Oncology, ISMCO 2021 - Virtual, Online (Duration: Oct 11 2021 → Oct 13 2021)|
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Conference||3rd International Symposium on Mathematical and Computational Oncology, ISMCO 2021|
|Period||10/11/21 → 10/13/21|
Bibliographical note
Funding Information:
Supported by NSF Grant 2036064.
© 2021, Springer Nature Switzerland AG.
- Machine learning
- MHC Class I
- Neural networks