MHCRoBERTa

MHCRoBERTa predicts binding between peptides and MHC class I molecules, estimating peptide–MHC class I binding affinities for neoantigen identification.


Key Features:

  • RoBERTa pre-training approach: A RoBERTa-based pre-training strategy is tailored for predicting interactions between MHC class I molecules and peptides.
  • Transfer learning with label-agnostic protein sequences: Representations learned from label-agnostic protein sequences are transferred to the binding prediction task to improve accuracy.
  • Performance metrics: Benchmark results report a Spearman rank correlation coefficient (SRCC) of 0.785 and an area under the curve (AUC) of 0.817, with the SRCC 14.3% higher than NetMHCpan3.0 and 3% higher than MHCflurry.
  • IC50 prediction improvement: Demonstrates enhanced prediction of IC50 values relevant to peptide–MHC binding affinity assessments.
  • Visualization of multi-head self-attention: Provides visualization of multi-head self-attention across layers and heads to assess learned syntax and semantics for prediction tasks.
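
The two reported metrics can be reproduced on any set of measured and predicted affinities. The following is a minimal sketch, not MHCRoBERTa's own evaluation code: the 1 - log(IC50)/log(50000) rescaling and the 500 nM binder cutoff are conventions common in peptide–MHC binding prediction and are assumed here, and the six data points are invented for illustration.

```python
import math

def ic50_to_score(ic50_nm, max_ic50=50000.0):
    """Map an IC50 in nM to [0, 1]; stronger binders get higher scores.
    ASSUMPTION: the common 1 - log(IC50)/log(50000) rescaling."""
    clipped = min(max(ic50_nm, 1.0), max_ic50)
    return 1.0 - math.log(clipped) / math.log(max_ic50)

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic (ties count as 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented example data: measured IC50 values (nM) and model scores.
measured_ic50 = [12.0, 85.0, 430.0, 2600.0, 18000.0, 41000.0]
predicted = [0.81, 0.52, 0.58, 0.30, 0.12, 0.15]

measured = [ic50_to_score(v) for v in measured_ic50]
# ASSUMPTION: peptides with IC50 <= 500 nM are labeled as binders.
labels = [1 if v <= 500.0 else 0 for v in measured_ic50]

print("SRCC:", spearman(measured, predicted))
print("AUC:", roc_auc(labels, predicted))
```

SRCC compares the full ranking of affinities, while AUC only asks whether binders are scored above non-binders, so the two numbers can diverge on the same predictions.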

Scientific Applications:

  • Cancer immunotherapy and neoantigen identification: Used to screen and prioritize peptide candidates for neoantigen discovery and personalized cancer vaccine development.
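
Screening typically starts by enumerating every short peptide that spans a mutated residue and then ranking the candidates by predicted binding. The sketch below illustrates that workflow under stated assumptions: the 8 to 11 residue window lengths are typical for MHC class I, the sequence is arbitrary, and `toy_score` is a hypothetical placeholder with no biological validity that stands in for a real predictor's output.

```python
def candidate_peptides(protein, mutation_index, lengths=(8, 9, 10, 11)):
    """Return every window of the given lengths that covers the mutated
    residue. mutation_index is 0-based."""
    peptides = []
    for k in lengths:
        first = max(0, mutation_index - k + 1)
        last = min(mutation_index, len(protein) - k)
        for start in range(first, last + 1):
            peptides.append(protein[start:start + k])
    return peptides

def prioritize(peptides, score_fn, top_n=5):
    """Score each peptide and return the top_n (score, peptide) pairs."""
    scored = [(score_fn(p), p) for p in peptides]
    scored.sort(reverse=True)
    return scored[:top_n]

def toy_score(peptide):
    # HYPOTHETICAL placeholder scorer: fraction of hydrophobic residues.
    # Replace with the scores produced by an actual binding predictor.
    return sum(res in "AILMFVWY" for res in peptide) / len(peptide)

# Arbitrary illustrative sequence; index 11 is treated as the mutated residue.
protein = "MTEYKLVVVGAVGVGKSALTIQ"
candidates = candidate_peptides(protein, mutation_index=11)
for score, peptide in prioritize(candidates, toy_score):
    print(f"{peptide}\t{score:.3f}")
```

Only windows that contain the mutated position are kept, because peptides without it are identical to the unmutated protein and cannot be neoantigens.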

Methodology:

The method combines RoBERTa pre-training tailored for peptide–MHC class I interaction prediction, transfer learning using label-agnostic protein sequences, and visualization of multi-head self-attention across layers and heads.
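
To make the attention visualization concrete, the sketch below computes scaled dot-product attention weights for a single head and prints them as a text heatmap. It is an illustration only: the embeddings from `toy_embedding` are hypothetical deterministic values, whereas a trained model uses learned query and key projections that differ for every layer and head.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights for one head.
    queries and keys are lists of equal-length vectors, one per token."""
    d = len(keys[0])
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights.append(softmax(scores))
    return weights

def render_heatmap(tokens, weights):
    """Render one row per query token; denser characters mean more weight."""
    shades = " .:-=+*#%@"
    lines = []
    for tok, row in zip(tokens, weights):
        cells = "".join(
            shades[min(int(w * len(shades)), len(shades) - 1)] for w in row
        )
        lines.append(f"{tok} |{cells}|")
    return "\n".join(lines)

def toy_embedding(residue, dim=4):
    # HYPOTHETICAL embedding: deterministic values, not learned parameters.
    return [math.sin(ord(residue) * (i + 1)) for i in range(dim)]

tokens = list("SIINFEKL")  # one token per residue of an example peptide
vectors = [toy_embedding(t) for t in tokens]
weights = attention_weights(vectors, vectors)
print(render_heatmap(tokens, weights))
```

Each row sums to 1, so a row shows how one residue distributes its attention over all residues in the input; repeating this for every layer and head gives the grid of maps that such visualizations inspect.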

Details

License: Not licensed
Cost: Free of charge
Tool Type: command-line tool
Operating Systems: Mac, Linux
Programming Languages: Shell, Python
Added: 7/26/2022
Last Updated: 11/24/2024

Operations

Epitope mapping

    Publications

    Wang F, Wang H, Wang L, Lu H, Qiu S, Zang T, Zhang X, Hu Y. MHCRoBERTa: pan-specific peptide–MHC class I binding prediction through transfer learning with label-agnostic protein sequences. Briefings in Bioinformatics. 2022;23(3). doi:10.1093/bib/bbab595. PMID:35443027.

    Funding:
      • National Natural Science Foundation of China: 62076082
      • National Key Research and Development Project: 2016YFC0901605