MHCRoBERTa
MHCRoBERTa is a pan-specific predictor of peptide–MHC class I binding affinity, intended to support neoantigen identification.
Key Features:
- RoBERTa pre-training approach: A RoBERTa-based pre-training strategy tailored to predicting interactions between peptides and MHC class I molecules.
- Transfer learning with label-agnostic protein sequences: The model is pre-trained on protein sequences that carry no binding labels and is then transferred to the binding prediction task to improve accuracy.
- Performance metrics: The reported benchmark results are a Spearman rank correlation coefficient (SRCC) of 0.785 and an area under the curve (AUC) of 0.817; the SRCC is 14.3% higher than that of NetMHCpan3.0 and 3% higher than that of MHCflurry.
- IC50 prediction improvement: Improves prediction of IC50 values, the quantity used to assess peptide–MHC binding affinity.
- Visualization of multi-head self-attention: Provides visualization of multi-head self-attention across layers and heads to assess learned syntax and semantics for prediction tasks.
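The two reported metrics can be reproduced on any set of predictions with a few lines of code. The sketch below is illustrative only and is not taken from the MHCRoBERTa code base: the IC50 values are made up, the `1 - log(IC50)/log(50000)` rescaling and the 500 nM binder cutoff are conventions commonly used in this field and are assumed here rather than confirmed for this tool, and all function names are hypothetical.

```python
import math

MAX_IC50_NM = 50000.0  # assumed cap used by the common log rescaling


def ic50_to_score(ic50_nm):
    """Map IC50 (nM) to [0, 1] with 1 - log(IC50)/log(50000); higher means stronger binding."""
    clipped = min(max(ic50_nm, 1.0), MAX_IC50_NM)
    return 1.0 - math.log(clipped) / math.log(MAX_IC50_NM)


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """SRCC: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up measurements and predictions (nM), for illustration only.
measured_ic50 = [12.0, 480.0, 3500.0, 20000.0, 150.0]
predicted_ic50 = [30.0, 900.0, 2500.0, 41000.0, 3000.0]

measured = [ic50_to_score(v) for v in measured_ic50]
predicted = [ic50_to_score(v) for v in predicted_ic50]
binder_labels = [1 if v <= 500.0 else 0 for v in measured_ic50]  # assumed 500 nM cutoff

srcc = spearman(measured, predicted)
auc = roc_auc(binder_labels, predicted)
print(f"SRCC={srcc:.3f} AUC={auc:.3f}")
```

SRCC is computed on the continuous affinities, whereas AUC needs binary binder labels, so the two numbers answer different questions: how well the ranking is preserved, and how well binders are separated from non-binders.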
Scientific Applications:
- Cancer immunotherapy and neoantigen identification: Used to screen and prioritize peptide candidates for neoantigen discovery and personalized cancer vaccine development.
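A screening workflow of this kind typically begins by enumerating the candidate peptides that span a mutated residue before any of them are scored. The sketch below shows only that enumeration step under stated assumptions: peptide lengths of 8 to 11 residues (the range usually considered for MHC class I), a toy protein sequence, and a hypothetical function name. It does not call MHCRoBERTa; each resulting peptide would be passed to the predictor together with an MHC allele.

```python
def candidate_peptides(protein, mutation_index, lengths=(8, 9, 10, 11)):
    """Return every window of the given lengths that covers mutation_index (0-based)."""
    peptides = []
    for length in lengths:
        # Earliest start that still reaches the mutated residue.
        first_start = max(0, mutation_index - length + 1)
        # Latest start that covers the residue and stays inside the protein.
        last_start = min(mutation_index, len(protein) - length)
        for start in range(first_start, last_start + 1):
            peptides.append(protein[start:start + length])
    return peptides


# Toy 20-residue sequence with an assumed substitution at 0-based index 11.
toy_protein = "MTEYKLVVVGAVGVGKSALT"
peptides = candidate_peptides(toy_protein, mutation_index=11)
print(len(peptides), peptides[:3])
```

Restricting the windows to those that cover the mutated position keeps the candidate list small and ensures every scored peptide differs from the unmutated sequence.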
Methodology:
The method combines RoBERTa pre-training tailored to peptide–MHC class I interaction prediction with transfer learning from label-agnostic protein sequences, and it supports visualization of multi-head self-attention across layers and heads.
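Pre-training on unlabeled sequences in the RoBERTa style relies on masked language modelling with dynamic masking: a fresh random subset of tokens is hidden each time a sequence is seen, and the model learns to recover them. The sketch below illustrates that general objective on a protein sequence. It assumes one token per residue, the standard 15% selection rate with an 80/10/10 split between mask, random replacement, and unchanged, and hypothetical names throughout; the tokenization and settings actually used by MHCRoBERTa are not stated in this entry.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK_TOKEN = "<mask>"


def mask_sequence(sequence, rng, mask_prob=0.15):
    """Apply BERT/RoBERTa-style masking to a residue-level sequence.

    Returns (inputs, targets). targets holds the original residue at each
    selected position and None everywhere else.
    """
    inputs, targets = [], []
    for residue in sequence:
        if rng.random() < mask_prob:
            targets.append(residue)  # the model must predict this residue
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK_TOKEN)               # 80%: replace with mask
            elif roll < 0.9:
                inputs.append(rng.choice(AMINO_ACIDS))  # 10%: random residue
            else:
                inputs.append(residue)                  # 10%: leave unchanged
        else:
            inputs.append(residue)
            targets.append(None)  # position is not part of the loss
    return inputs, targets


rng = random.Random(0)
toy_sequence = "MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEY"  # toy input, no label required

# Dynamic masking: each pass over the same sequence draws a new mask.
for epoch in range(2):
    inputs, targets = mask_sequence(toy_sequence, rng)
    print(epoch, "".join(t if len(t) == 1 else "_" for t in inputs))
```

Because the targets come from the sequence itself, no binding measurements are needed at this stage; labels only enter later, when the pre-trained model is fine-tuned on the binding task.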
Details
- License: Not licensed
- Cost: Free of charge
- Tool Type: command-line tool
- Operating Systems: Mac, Linux
- Programming Languages: Shell, Python
- Added: 7/26/2022
- Last Updated: 11/24/2024
Operations
- Epitope mapping
Publications
Wang F, Wang H, Wang L, Lu H, Qiu S, Zang T, Zhang X, Hu Y. MHCRoBERTa: pan-specific peptide–MHC class I binding prediction through transfer learning with label-agnostic protein sequences. Briefings in Bioinformatics. 2022;23(3). doi:10.1093/bib/bbab595. PMID:35443027.
Funding:
- National Natural Science Foundation of China: 62076082
- National Key Research and Development Project: 2016YFC0901605