DeepHE
DeepHE predicts human essential genes by integrating DNA and protein sequence-derived features with protein-protein interaction (PPI) network embeddings and training a multilayer neural network to identify genes critical for human survival.
Key Features:
- Integration of biological data sources: Combines DNA sequence, protein sequence, and protein-protein interaction (PPI) network features for each gene.
- Network embedding: Employs a deep learning-based network embedding method to automatically extract features from PPI network data.
- Comprehensive sequence feature set: Uses 89 sequence-derived features per gene, drawn from both DNA and protein sequences.
- Multilayer neural network: Trains a multilayer neural network on the integrated feature set to predict gene essentiality.
- Cost-sensitive learning: Implements a cost-sensitive training approach to address class imbalance between essential and non-essential genes.
- Performance metrics: The authors report an average area under the ROC curve (AUC) above 94%, an area under the precision-recall curve (AP) above 90%, and accuracy above 90%, outperforming SVM, Naïve Bayes, Random Forest, and AdaBoost.
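For readers unfamiliar with the two ranking metrics quoted above, the following is a minimal pure-Python sketch of how AUC and AP can be computed from labels and predicted scores. It is not DeepHE's evaluation code; the function names `roc_auc` and `average_precision` and the toy labels and scores are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """AP as the mean precision at each rank where a positive is retrieved.

    Tied scores keep their input order; no special tie handling is done.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return precision_sum / hits


# Toy data (hypothetical): 1 = essential gene, 0 = non-essential gene.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
print(roc_auc(toy_labels, toy_scores))
print(average_precision(toy_labels, toy_scores))
```

AP is usually the more informative of the two when essential genes are a small minority, because it depends only on how the positives are ranked.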
Scientific Applications:
- Essential gene identification: Predicts human essential genes to support computational biology analyses and reduce reliance on labor-intensive wet-lab experiments.
- Drug target discovery: Prioritizes genes critical for human survival as potential therapeutic targets to accelerate drug discovery.
Methodology:
DeepHE extracts 89 sequence-derived features from DNA and protein sequences and learns network embeddings automatically from the PPI network. It then trains a multilayer neural network on the integrated feature set, applying cost-sensitive learning to mitigate the class imbalance between essential and non-essential genes.
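Two of the steps above, feature integration and cost-sensitive weighting, can be sketched in a few lines of pure Python. This is an illustrative sketch under stated assumptions, not DeepHE's implementation: the helper names, the toy gene identifiers and vector sizes, and the inverse-frequency weighting scheme are all hypothetical, since the description here does not specify how DeepHE sets its class costs.

```python
import math


def integrate_features(seq_features, ppi_embeddings):
    """Concatenate sequence features and PPI embeddings per gene.

    Only genes present in both inputs are kept (an assumption of this sketch).
    """
    shared = sorted(set(seq_features) & set(ppi_embeddings))
    return {g: list(seq_features[g]) + list(ppi_embeddings[g]) for g in shared}


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (2 * class_count)."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}


def weighted_bce(labels, probs, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class cost on each example."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += weights[y] * loss
    return total / len(labels)


# Toy inputs (hypothetical gene IDs and tiny vectors, not the real 89 features).
seq = {"GENE_A": [0.1, 0.2], "GENE_B": [0.3, 0.4], "GENE_C": [0.5, 0.6]}
emb = {"GENE_A": [1.0], "GENE_B": [2.0], "GENE_D": [3.0]}
features = integrate_features(seq, emb)
print(features)  # GENE_C and GENE_D are dropped: each lacks one feature type

labels = [1, 0, 0, 0]  # one essential gene among four
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, an equally confident mistake costs more on the minority (essential) class than on the majority class, which pushes a classifier away from the trivial solution of predicting every gene as non-essential.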
Details
- License: MIT
- Programming Languages: Python
- Added: 1/18/2021
- Last Updated: 2/27/2021
Publications
Zhang X, Xiao W, Xiao W. DeepHE: Accurately Predicting Human Essential Genes based on Deep Learning. bioRxiv. 2020. doi:10.1101/2020.02.14.950048.