SpliceFinder
SpliceFinder uses a convolutional neural network to predict canonical (GT/AG) and non-canonical splice sites in genomic sequences, supporting gene structure analysis.
Key Features:
- CNN-based prediction: Employs a convolutional neural network architecture trained on human genomic data for ab initio splice site prediction.
- Iterative dataset reconstruction: Uses an iterative reconstruction approach to address class imbalance during training.
- Dinucleotide discrimination: Accounts for frequent GT and AG dinucleotide occurrences in non-splicing regions to reduce false positives.
- High accuracy: Achieves a classification accuracy of 90.25%, approximately 10% higher than existing algorithms.
- Reduced false positives: Produces about half as many false positives as state-of-the-art tools.
- High recall: Maintains recall above 0.8 for splice site detection.
- Non-canonical site detection: Identifies non-canonical splice sites in addition to canonical GT/AG junctions.
- Sliding-window localization: Localizes exact splice site positions within long genomic sequences using a sliding window technique.
- Cross-species robustness: Generalizes without retraining to Drosophila melanogaster, Mus musculus, Rattus, and Danio rerio.
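The dinucleotide point in the list above can be illustrated with a small sketch. It is not SpliceFinder code: the function name `find_dinucleotide_positions` and the example sequence are hypothetical, chosen only to show that GT and AG occur often enough in arbitrary sequence that matching the dinucleotide alone cannot identify a splice site.

```python
from typing import List


def find_dinucleotide_positions(seq: str, dinuc: str) -> List[int]:
    """Return the 0-based start index of every occurrence of `dinuc` in `seq`."""
    seq = seq.upper()
    dinuc = dinuc.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]


# Hypothetical toy sequence, not taken from any genome.
example = "AAGGTAAGTCCAGGT"

gt_sites = find_dinucleotide_positions(example, "GT")  # donor-like dinucleotides
ag_sites = find_dinucleotide_positions(example, "AG")  # acceptor-like dinucleotides

print("GT at:", gt_sites)
print("AG at:", ag_sites)
```

Even this 15-base toy sequence contains three GT and three AG occurrences, so a predictor has to use the surrounding sequence context to separate true sites from background matches.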
Scientific Applications:
- Gene structure analysis: Infers splice junctions to support understanding of gene location and structure.
- Accurate splice site cataloging: Produces more reliable splice site predictions by reducing false positives while maintaining high recall.
- Cross-species analysis: Enables splice site identification across multiple species without retraining for comparative sequence analyses.
Methodology:
SpliceFinder trains a convolutional neural network on human genomic data, applies iterative dataset reconstruction to mitigate class imbalance, and scans long sequences with a sliding window to localize splice sites.
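The sliding-window step can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation: the window length, the threshold, the one-hot layout, and the names `one_hot`, `scan_sequence`, and `toy_score` are all hypothetical, and `toy_score` is a stand-in for a trained CNN that simply checks for GT at the window center.

```python
from typing import Callable, List, Tuple

Encoded = List[Tuple[int, int, int, int]]

# Assumed one-hot layout (A, C, G, T); unknown bases map to all zeros.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def one_hot(window: str) -> Encoded:
    """Encode a nucleotide string as a list of 4-element one-hot tuples."""
    return [ONE_HOT.get(base, (0, 0, 0, 0)) for base in window.upper()]


def scan_sequence(
    seq: str,
    window: int,
    score_fn: Callable[[Encoded], float],
    threshold: float = 0.5,
) -> List[Tuple[int, float]]:
    """Slide a fixed-size window over `seq` and keep the centers that score >= threshold.

    `window` is assumed to be even; the reported position is the 0-based index
    of the base at offset window // 2 inside each window.
    """
    half = window // 2
    hits = []
    for center in range(half, len(seq) - half):
        encoded = one_hot(seq[center - half:center + half])
        score = score_fn(encoded)
        if score >= threshold:
            hits.append((center, score))
    return hits


def toy_score(encoded: Encoded) -> float:
    """Placeholder for a trained model: 1.0 if the window center starts with GT."""
    mid = len(encoded) // 2
    is_gt = encoded[mid] == ONE_HOT["G"] and encoded[mid + 1] == ONE_HOT["T"]
    return 1.0 if is_gt else 0.0


hits = scan_sequence("AAGGTAAGTCCAGGT", window=6, score_fn=toy_score)
print(hits)
```

In practice `score_fn` would wrap the trained network's prediction for one encoded window, and positions too close to either end of the sequence are skipped because a full window cannot be centered on them.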
Details:
- Added: 1/14/2020
- Last Updated: 12/24/2020
Publications:
Wang R, Wang Z, Wang J, Li S. SpliceFinder: ab initio prediction of splice sites using convolutional neural network. BMC Bioinformatics. 2019;20(S23). doi:10.1186/s12859-019-3306-3. PMID:31881982. PMCID:PMC6933889.