PhD-SNP
PhD-SNP predicts the deleteriousness of non-synonymous single nucleotide polymorphisms (nsSNPs) in human proteins to identify variants associated with genetic disease.
Key Features:
- Support Vector Machine Algorithm: Employs support vector machines (SVMs) to process protein sequence information and discriminate disease-related from neutral single-point mutations.
- High Accuracy: Demonstrates prediction accuracy exceeding 74% for determining whether a single point mutation is associated with genetic diseases.
- Comprehensive Dataset: Developed and validated on a dataset comprising 21,185 single-point mutations across 3,587 proteins, with 61% labeled as disease-related.
- Focus on Non-Synonymous SNPs (nsSNPs): Concentrates on nsSNPs that cause amino acid changes in protein sequences.
- Application to Significant Diseases: Addresses mutations relevant to diseases including Alzheimer's disease, Parkinson's disease, and Creutzfeldt-Jakob disease.
Scientific Applications:
- Disease Prediction and Research: Identifies nsSNPs that may contribute to genetic disorders, aiding research into pathogenic variants.
- Genetic Variation Analysis: Provides insights into how specific amino acid substitutions affect protein function and disease susceptibility.
- Biomedical Informatics: Integrates machine learning with biological sequence data to support genotype–phenotype investigations.
Methodology:
The method trains a support vector machine on known nsSNPs (21,185 single-point mutations across 3,587 proteins). For each mutation, it uses the local sequence context of the substituted residue and, as the cited publication's title indicates, evolutionary information to classify the substitution as disease-related or neutral.
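To make the idea of sequence-derived SVM input concrete, here is a minimal sketch of one way a single-point mutation could be turned into a fixed-length numeric vector. It is an illustration under stated assumptions, not the tool's published feature set: the function name `encode_mutation`, the 40-element layout (20 values for the substitution, 20 for residue counts around the site), and the default window of 19 residues are all hypothetical choices made for this example.

```python
# Hypothetical encoding of a single-point mutation as SVM input.
# Layout, names, and window size are assumptions made for illustration.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_mutation(sequence, position, mutant, window=19):
    """Return a 40-element vector for a substitution at a 1-based position.

    Elements 0-19: -1 at the wild-type residue, +1 at the mutant residue.
    Elements 20-39: counts of each residue type in a window centred on
    the mutated position (truncated at the sequence ends).
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    wild_type = sequence[position - 1]
    if wild_type not in INDEX or mutant not in INDEX:
        raise ValueError("only the 20 standard amino acids are supported")
    if wild_type == mutant:
        raise ValueError("mutant residue equals the wild-type residue")

    # Part 1: which residue is lost and which is gained.
    substitution = [0] * 20
    substitution[INDEX[wild_type]] = -1
    substitution[INDEX[mutant]] = 1

    # Part 2: composition of the local sequence environment.
    half = window // 2
    start = max(0, position - 1 - half)
    end = min(len(sequence), position + half)
    environment = [0] * 20
    for residue in sequence[start:end]:
        if residue in INDEX:  # skip non-standard symbols such as 'X'
            environment[INDEX[residue]] += 1

    return substitution + environment


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    vector = encode_mutation("MKTAYIAKQR", position=4, mutant="V", window=5)
    print(len(vector), vector[:20])
```

Vectors of this kind, paired with disease-related or neutral labels, could then be passed to any SVM implementation. An evolutionary profile would add further features and is not covered by this sketch.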
Details:
- Tool Type: web application
- Operating Systems: Linux, Windows, Mac
- Added: 1/22/2015
- Last Updated: 11/24/2024
Publications:
Capriotti E, Calabrese R, Casadio R. Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics. 2006;22(22):2729-2734. doi:10.1093/bioinformatics/btl423. PMID:16895930.