Pheg
Pheg predicts human essential genes from sequence alone, using the λ-interval Z curve to capture nucleotide composition and association information.
Key Features:
- λ-interval Z curve representation: Represents nucleotide composition and association information by applying the λ-interval Z curve method to DNA sequences.
- Support Vector Machine (SVM) classifier: Uses an SVM model to classify genes as essential or non-essential based on the extracted sequence features.
- Performance: Achieves an Area Under the Curve (AUC) greater than 0.88 in both 5-fold cross-validation and jackknife tests.
- Prediction of additional essential genes: Identifies candidate essential genes that experimental datasets overlooked.
- Homology insights: Among newly predicted essential genes, 20 were homologous to known mouse essential genes.
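The Z curve representation named in the feature list can be illustrated with a minimal sketch. It applies the standard Z curve transform (purine vs. pyrimidine, amino vs. keto, weak vs. strong hydrogen bonding) to pairs of bases separated by a gap of `lam` nucleotides. The function name `interval_z_features`, the convention that `lam = 0` means adjacent bases, and the omission of codon-phase-specific counting are all assumptions made for illustration; this is not Pheg's actual feature definition.

```python
from collections import Counter

BASES = "ACGT"


def interval_z_features(seq, lam):
    """Return 12 Z-curve components for base pairs separated by `lam` bases.

    Hypothetical helper: for each first base X, the frequencies of the
    pairs (X, A), (X, C), (X, G), (X, T) are combined with the standard
    Z curve transform, giving 3 values per first base.
    """
    seq = seq.upper()
    step = lam + 1  # assumption: lam = 0 means the two bases are adjacent
    pairs = Counter()
    for i in range(len(seq) - step):
        first, second = seq[i], seq[i + step]
        # Skip N and other ambiguity codes.
        if first in BASES and second in BASES:
            pairs[first + second] += 1

    total = sum(pairs.values())
    if total == 0:
        return [0.0] * 12

    # Relative frequency of each of the 16 possible ordered pairs.
    p = {a + b: pairs[a + b] / total for a in BASES for b in BASES}

    features = []
    for first in BASES:
        a, c, g, t = (p[first + b] for b in BASES)
        features.append((a + g) - (c + t))  # x: purine vs. pyrimidine
        features.append((a + c) - (g + t))  # y: amino vs. keto
        features.append((a + t) - (g + c))  # z: weak vs. strong H-bonding
    return features


if __name__ == "__main__":
    # Concatenate features for several gap sizes into one vector.
    vector = []
    for lam in range(3):
        vector.extend(interval_z_features("ATGGCGTACGTTAGC", lam))
    print(len(vector))
```

Concatenating the output for several values of `lam` yields a fixed-length vector per gene, which is the kind of input a classifier such as an SVM expects.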
Scientific Applications:
- Gene function analysis: Facilitates identification of genes critical for cell survival and core biological processes.
- Cancer research: Supports prediction of gene essentiality in human cancer cell lines to inform potential therapeutic targets.
- Comparative genomics: Enables cross-species comparisons of essential genes and assessment of evolutionary conservation.
Methodology:
Extracts nucleotide composition and association features using the λ-interval Z curve, classifies genes with a Support Vector Machine, and evaluates performance using 5-fold cross-validation and jackknife tests.
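The evaluation protocol described above (k-fold cross-validation and the jackknife, scored by AUC) can be sketched in dependency-free Python. To keep the example self-contained, a nearest-centroid scorer stands in for the SVM; it is not the classifier Pheg uses. All function names and the toy data are hypothetical, and the sketch assumes every training split contains both classes.

```python
def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k):
    """Yield (train, test) index lists; k == n gives the jackknife (leave-one-out)."""
    for fold in range(k):
        test = list(range(fold, n, k))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def centroid_score(train_x, train_y, x):
    """Stand-in scorer (NOT an SVM): higher means closer to the positive centroid."""
    pos_c = centroid([r for r, y in zip(train_x, train_y) if y == 1])
    neg_c = centroid([r for r, y in zip(train_x, train_y) if y == 0])
    return squared_distance(x, neg_c) - squared_distance(x, pos_c)


def cross_validated_auc(features, labels, k):
    """Score every example while it is held out, then compute one pooled AUC."""
    scores = [0.0] * len(labels)
    for train, test in k_fold_indices(len(labels), k):
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        for i in test:
            scores[i] = centroid_score(train_x, train_y, features[i])
    return auc(labels, scores)


if __name__ == "__main__":
    # Toy data for illustration only: 1 = essential, 0 = non-essential.
    X = [[1.0, 0.9], [0.8, 1.1], [0.9, 1.0],
         [-1.0, -0.9], [-0.8, -1.1], [-0.9, -1.0]]
    y = [1, 1, 1, 0, 0, 0]
    print("5-fold AUC:", cross_validated_auc(X, y, 5))
    print("jackknife AUC:", cross_validated_auc(X, y, len(y)))
```

Pooling held-out scores across folds before computing a single AUC is one common convention; averaging per-fold AUCs is another, and the two can differ slightly on small datasets.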
Details
- Tool Type: web application
- Operating Systems: Linux, Windows, Mac
- Added: 6/5/2018
- Last Updated: 11/25/2024
Publications
Guo F, Dong C, Hua H, Liu S, Luo H, Zhang H, Jin Y, Zhang K. Accurate prediction of human essential genes using only nucleotide composition and association information. Bioinformatics. 2017;33(12):1758-1764. doi:10.1093/bioinformatics/btx055. PMID:28158612. PMCID:PMC7110051.