SnapKin
SnapKin predicts kinase substrates and phosphorylation sites from mass spectrometry (MS)-based phosphoproteomic data using a snapshot ensemble of deep learning models.
Key Features:
- Ensemble Deep Learning (snapshot ensemble): Employs a snapshot ensemble learning algorithm that integrates multiple deep learning models to enhance predictive performance and model stability for kinase-substrate prediction.
- Pseudo-Positive Learning Strategy: Uses a pseudo-positive learning strategy to mitigate small sample size issues arising from limited experimentally validated kinase substrates.
- Data Re-Sampling Based Ensemble Learning: Incorporates data re-sampling techniques within the ensemble to stabilize models and address high noise levels in phosphoproteomics datasets.
- Utilization of Large Phosphoproteomics Datasets: Developed and evaluated using seven large-scale phosphoproteomics datasets, including six previously published datasets and an additional muscle differentiation dataset.
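The re-sampling idea in the features above can be illustrated with a minimal sketch. This is not SnapKin's code: the listing does not specify the re-sampling scheme, so the sketch assumes a common variant in which every known substrate is kept and an equal-sized random subset of unlabeled sites is drawn in each round. The function names, the two-feature toy data, and the nearest-centroid base learner are all hypothetical stand-ins.

```python
import random

def balanced_resamples(positives, unlabeled, n_rounds, seed=0):
    """Yield n_rounds training sets: all positives plus an equal-sized
    random draw of unlabeled sites treated as negatives (assumed scheme)."""
    rng = random.Random(seed)
    k = min(len(positives), len(unlabeled))
    for _ in range(n_rounds):
        yield positives, rng.sample(unlabeled, k)

def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_centroid_model(pos, neg):
    """Toy base learner (a stand-in for a real classifier):
    positive score means the site is closer to the positive centroid."""
    cp, cn = centroid(pos), centroid(neg)
    def score(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, cp))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, cn))
        return d_neg - d_pos
    return score

def resampled_ensemble(positives, unlabeled, n_rounds=25, seed=0):
    """Train one base learner per re-sample and average their scores."""
    models = [fit_centroid_model(p, n)
              for p, n in balanced_resamples(positives, unlabeled, n_rounds, seed)]
    def score(x):
        return sum(m(x) for m in models) / len(models)
    return score

# Hypothetical toy data: 5 known substrates, 200 unlabeled sites, 2 features each.
positives = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [1.3, 1.1], [0.9, 0.8]]
data_rng = random.Random(1)
unlabeled = [[data_rng.gauss(-1, 0.6), data_rng.gauss(-1, 0.6)] for _ in range(200)]

score_site = resampled_ensemble(positives, unlabeled)
```

Averaging over many small balanced draws means no single noisy subset of unlabeled sites dominates the final score, which is the stabilizing effect the feature description refers to.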
Scientific Applications:
- Kinase-substrate identification: Predicts kinase-specific substrates and phosphorylation sites from complex phosphoproteomic measurements.
- Signaling pathway analysis: Supports elucidation of cellular signaling pathways and protein interaction networks by providing more accurate phosphorylation site assignments.
- Proteomics research on cell differentiation: Applied to studies of muscle differentiation using a dedicated muscle differentiation phosphoproteomics dataset.
- Disease mechanism and therapeutic target research: Aids investigation of disease mechanisms and the identification of candidate therapeutic targets through improved substrate prediction.
Methodology:
Integrates traditional and deep learning models into a snapshot ensemble framework and applies pseudo-positive learning and data re-sampling strategies for training on MS-based phosphoproteomic data.
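As a rough illustration of the snapshot ensemble idea, the sketch below trains a single model with a cyclic, cosine-annealed learning rate, saves a copy of the weights at the end of each cycle, and averages the predictions of the saved copies. It is a simplified sketch under stated assumptions, not the SnapKin implementation: a small logistic model stands in for a deep network, and the function names, hyperparameters, and toy two-feature data are hypothetical.

```python
import math
import random

def cosine_lr(step, steps_per_cycle, lr_max):
    """Cosine-annealed learning rate that restarts at lr_max every cycle."""
    pos = (step % steps_per_cycle) / steps_per_cycle
    return 0.5 * lr_max * (1.0 + math.cos(math.pi * pos))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_one(weights, x):
    """Logistic prediction; the last entry of weights is the bias."""
    return sigmoid(weights[-1] + sum(w * xi for w, xi in zip(weights, x)))

def train_snapshots(X, y, n_cycles=3, steps_per_cycle=200, lr_max=0.5, seed=0):
    """Run one SGD training pass and keep a weight snapshot per cycle."""
    rng = random.Random(seed)
    n_features = len(X[0])
    weights = [0.0] * (n_features + 1)
    snapshots = []
    for step in range(n_cycles * steps_per_cycle):
        lr = cosine_lr(step, steps_per_cycle, lr_max)
        i = rng.randrange(len(X))
        err = predict_one(weights, X[i]) - y[i]  # gradient of log-loss w.r.t. logit
        for j in range(n_features):
            weights[j] -= lr * err * X[i][j]
        weights[-1] -= lr * err
        if (step + 1) % steps_per_cycle == 0:
            snapshots.append(list(weights))  # save at the low-LR end of the cycle
    return snapshots

def ensemble_predict(snapshots, x):
    """Average the predicted probabilities of all snapshots."""
    return sum(predict_one(w, x) for w in snapshots) / len(snapshots)

# Hypothetical toy data: 50 "substrate" and 50 "non-substrate" sites, 2 features.
data_rng = random.Random(1)
X = [[data_rng.gauss(1, 0.5), data_rng.gauss(1, 0.5)] for _ in range(50)]
X += [[data_rng.gauss(-1, 0.5), data_rng.gauss(-1, 0.5)] for _ in range(50)]
y = [1] * 50 + [0] * 50

snapshots = train_snapshots(X, y)
```

With a deep network, each learning-rate restart can push the weights toward a different local minimum, so the snapshots differ more than they do for this convex toy model. The appeal of the approach is that the ensemble costs roughly one training run rather than one run per member.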
Details
- Tool Type: command-line tool, library
- Added: 3/19/2021
- Last Updated: 4/9/2021
Publications
Lin M, Xiao D, Geddes TA, Burchfield JG, Parker BL, Humphrey SJ, Yang P. SnapKin: a snapshot deep learning ensemble for kinase-substrate prediction from phosphoproteomics data. bioRxiv. 2021. doi:10.1101/2021.02.23.432610.