SnapKin

SnapKin predicts kinase substrates and their phosphorylation sites from mass spectrometry (MS)-based phosphoproteomic data, using a snapshot ensemble of deep learning models to improve prediction accuracy.


Key Features:

  • Snapshot Ensemble Deep Learning: Employs a snapshot ensemble that combines multiple deep learning models to improve predictive performance and model stability for kinase-substrate prediction.
  • Pseudo-Positive Learning Strategy: Uses a pseudo-positive learning strategy to mitigate small sample size issues arising from limited experimentally validated kinase substrates.
  • Data Re-Sampling Based Ensemble Learning: Incorporates data re-sampling techniques within the ensemble to stabilize models and address high noise levels in phosphoproteomics datasets.
  • Utilization of Large Phosphoproteomics Datasets: Developed and evaluated using seven large-scale phosphoproteomics datasets, including six previously published datasets and an additional muscle differentiation dataset.
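
The snapshot ensemble idea named in the first feature can be illustrated with a minimal sketch. This is not SnapKin's code: it shows the general technique (a cyclic cosine learning-rate schedule, one saved model per cycle, averaged predictions) on a toy logistic regression in pure Python. All function names, the toy data, and the hyperparameters are assumptions made for illustration only.

```python
import math
import random

def cosine_cyclic_lr(step, steps_per_cycle, lr_max):
    """Cyclic cosine annealing: restart at lr_max each cycle, decay toward 0."""
    pos = step % steps_per_cycle
    return 0.5 * lr_max * (1.0 + math.cos(math.pi * pos / steps_per_cycle))

def predict(weights, x):
    """Logistic model; weights = [bias, w1, w2, ...]."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def train_snapshots(X, y, n_cycles=3, steps_per_cycle=50, lr_max=0.5, seed=0):
    """Run one training trajectory and save a snapshot at the end of each cycle."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(len(X[0]) + 1)]
    snapshots = []
    for step in range(n_cycles * steps_per_cycle):
        lr = cosine_cyclic_lr(step, steps_per_cycle, lr_max)
        # Full-batch gradient of the logistic loss.
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict(weights, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
        # The learning rate is near its minimum here, so keep this model.
        if (step + 1) % steps_per_cycle == 0:
            snapshots.append(list(weights))
    return snapshots

def ensemble_predict(snapshots, x):
    """Average the predicted probabilities of all saved snapshots."""
    return sum(predict(w, x) for w in snapshots) / len(snapshots)

# Hypothetical toy data: two features per site, label 1 = substrate.
X = [[-1.0, -0.8], [-0.7, -1.1], [0.9, 1.0], [1.2, 0.8]]
y = [0, 0, 1, 1]
snapshots = train_snapshots(X, y)
```

Calling `ensemble_predict(snapshots, [1.0, 1.0])` returns the averaged probability across the three snapshots. In a deep learning setting the same schedule would drive a neural network optimizer, and each snapshot would be a copy of the network weights.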

Scientific Applications:

  • Kinase-substrate identification: Predicts kinase-specific substrates and phosphorylation sites from complex phosphoproteomic measurements.
  • Signaling pathway analysis: Supports elucidation of cellular signaling pathways and protein interaction networks by providing more accurate phosphorylation site assignments.
  • Proteomics research on cell differentiation: Applied to a dedicated muscle differentiation phosphoproteomics dataset.
  • Disease mechanism and therapeutic target research: Aids investigation of disease mechanisms and the identification of candidate therapeutic targets through improved substrate prediction.

Methodology:

Integrates traditional and deep learning models into a snapshot ensemble framework and applies pseudo-positive learning and data re-sampling strategies for training on MS-based phosphoproteomic data.
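
The re-sampling and pseudo-positive strategies can be sketched under stated assumptions. The entry does not spell out how SnapKin implements either one, so the code below is one plausible reading rather than the tool's method: each ensemble member pairs the known substrates with a random draw of unlabeled sites treated as negatives, and unlabeled sites that the ensemble scores above a threshold are then promoted to pseudo-positives. The centroid-based base learner, the threshold, all names, and the toy data are hypothetical.

```python
import random

def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_centroid_model(pos, neg):
    """Toy base learner: score near 1 when x is closer to the positive centroid."""
    cp, cn = centroid(pos), centroid(neg)
    def score(x):
        dp, dn = sq_dist(x, cp), sq_dist(x, cn)
        return dn / (dp + dn) if dp + dn > 0 else 0.5
    return score

def resampled_ensemble(pos, unlabeled, n_models=10, seed=0):
    """Train one model per random draw of unlabeled sites used as negatives."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        neg = rng.sample(unlabeled, k=min(len(pos), len(unlabeled)))
        models.append(fit_centroid_model(pos, neg))
    # Averaging across draws damps the effect of any single noisy negative set.
    return lambda x: sum(m(x) for m in models) / len(models)

def add_pseudo_positives(pos, unlabeled, scorer, threshold=0.9):
    """Promote high-scoring unlabeled sites to the positive set (assumed rule)."""
    pseudo = [x for x in unlabeled if scorer(x) >= threshold]
    rest = [x for x in unlabeled if scorer(x) < threshold]
    return pos + pseudo, rest

# Hypothetical toy data: two known substrates and six unlabeled sites,
# one of which ([2.1, 2.0]) resembles the known substrates.
known_pos = [[2.0, 2.1], [1.9, 2.2]]
unlabeled = [[2.1, 2.0], [0.0, 0.1], [-0.2, 0.0],
             [0.1, -0.1], [0.0, 0.2], [-0.1, -0.1]]

scorer = resampled_ensemble(known_pos, unlabeled)
expanded_pos, remaining = add_pseudo_positives(known_pos, unlabeled, scorer)
```

After this step `expanded_pos` holds the two known substrates plus the promoted site, and a second round of training could use the larger positive set. A real pipeline would replace the centroid scorer with the deep learning models and would choose the threshold by validation.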

Details

Tool Type: command-line tool, library
Added: 3/19/2021
Last Updated: 4/9/2021

Publications

Lin M, Xiao D, Geddes TA, Burchfield JG, Parker BL, Humphrey SJ, Yang P. SnapKin: a snapshot deep learning ensemble for kinase-substrate prediction from phosphoproteomics data. bioRxiv. 2021. doi:10.1101/2021.02.23.432610.