CGR
CGR transforms DNA sequences into numerical representations that are used with an artificial neural network to detect acceptor and donor splice sites.
Key Features:
- Chaos Game Representation (CGR): An iterative mapping technique that converts DNA sequences into numerical sequences in a one-to-one manner by assigning each nucleotide a specific position on a plane.
- Artificial Neural Network: Numerical sequences produced by CGR serve as feature vectors that are input to an artificial neural network to classify splice sites.
- Simplicity and Efficiency: The approach uses a single neural network component, and the authors' computational experiments report good accuracy on the NN269 dataset.
Scientific Applications:
- Splice Site Detection: Identification of acceptor and donor splice sites within DNA sequences to support analyses of splicing mechanisms and exon–intron boundaries.
- Genomic Research: Precise splice site identification for applications in studies of gene expression regulation, genetic disorder research, and evolutionary biology.
Methodology:
DNA sequences are first converted into numerical sequences using the chaos game representation, an iterative mapping that assigns each nucleotide a position on a plane in a one-to-one manner. The resulting numerical sequences serve as feature vectors for an artificial neural network that classifies acceptor and donor splice sites.
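The encoding step can be illustrated with a minimal sketch of the classic chaos game iteration. It assumes one common corner layout (A, C, G, T at the corners of the unit square) and a start point at the centre; the function names `cgr_points` and `cgr_features` are hypothetical and are not taken from the CGR tool, whose exact encoding may differ.

```python
# Corner assignment is one common convention. It is an assumption for this
# sketch, not necessarily the layout used by the CGR tool.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(seq):
    """Return one (x, y) point per nucleotide using the chaos game iteration.

    Starting from the centre of the unit square, each new point is the
    midpoint between the previous point and the corner of the current base.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        # Raises KeyError for symbols outside A/C/G/T (for example N).
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_features(seq):
    """Flatten the CGR points into a feature vector of length 2 * len(seq)."""
    return [coord for point in cgr_points(seq) for coord in point]


if __name__ == "__main__":
    # Short illustrative sequence, not taken from any dataset.
    print(cgr_features("ACGT"))
```

In exact arithmetic each point determines the entire prefix of the sequence that produced it, which is what makes the mapping one-to-one; with floating-point numbers this holds only up to the available precision, so very long sequences would need care. Fixed-length vectors produced this way could then be passed to a neural network classifier of the reader's choice.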
Details
- Tool Type: command-line tool
- Programming Languages: Python
- Added: 1/14/2020
- Last Updated: 1/7/2021
Publications
Hoang T, Yin C, Yau SS. Splice sites detection using chaos game representation and neural network. Genomics. 2020;112(2):1847-1852. doi:10.1016/j.ygeno.2019.10.018. PMID:31704313.