CGR

CGR transforms DNA sequences into numerical representations that are used with an artificial neural network to detect acceptor and donor splice sites.


Key Features:

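  • Chaos Game Representation (CGR): An iterative mapping technique that converts DNA sequences into numerical sequences in a one-to-one manner. Each nucleotide type is tied to a corner of a square, and each sequence position is plotted halfway between the previous point and the corner of the current nucleotide.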
  • Artificial Neural Network: Numerical sequences produced by CGR serve as feature vectors that are input to an artificial neural network to classify splice sites.
  • Simplicity and Efficiency: The approach uses a single neural network, and the authors' computational experiments report good accuracy on the NN269 dataset.
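
The one-to-one property of CGR can be illustrated with a minimal sketch. The corner layout (A, C, G, T on the corners of the unit square), the starting point at the center of the square, and the function names `cgr_points` and `decode_last_point` are assumptions made for this illustration, not details taken from the tool itself. Exact fractions are used so that decoding is not affected by floating-point rounding.

```python
from fractions import Fraction

# Assumed corner layout (a conventional CGR choice); the tool may use another.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
HALF = Fraction(1, 2)


def cgr_points(seq):
    """Return the CGR point reached after each nucleotide of seq."""
    x, y = HALF, HALF  # assumed start: center of the unit square
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def decode_last_point(point, length):
    """Recover a sequence of the given length from its final CGR point."""
    corner_to_base = {corner: base for base, corner in CORNERS.items()}
    x, y = point
    bases = []
    for _ in range(length):
        # The quadrant containing the point identifies the last nucleotide.
        cx, cy = int(x > HALF), int(y > HALF)
        bases.append(corner_to_base[(cx, cy)])
        # Undo the halving step to obtain the previous point.
        x, y = 2 * x - cx, 2 * y - cy
    return "".join(reversed(bases))


seq = "GTAAGT"
points = cgr_points(seq)
recovered = decode_last_point(points[-1], len(seq))
```

Because the final point alone is enough to recover the whole sequence, no information is lost when a DNA window is replaced by its CGR coordinates.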

Scientific Applications:

  • Splice Site Detection: Identification of acceptor and donor splice sites within DNA sequences to support analyses of splicing mechanisms and exon–intron boundaries.
  • Genomic Research: Splice site identification in support of studies of gene expression regulation, genetic disorders, and evolutionary biology.

Methodology:

DNA sequences are converted into numerical sequences using the chaos game representation, an iterative mapping that places each nucleotide at a position on a plane in a one-to-one manner. The resulting numerical sequences serve as feature vectors for an artificial neural network, which classifies candidate sites as acceptor or donor splice sites.
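
The two stages described above can be sketched end to end as follows. This is only an illustration under stated assumptions: the corner layout, the use of concatenated x and y coordinates as the feature vector, the single hidden layer of 8 tanh units, and the names `cgr_features`, `init_network`, and `predict` are all hypothetical and are not taken from the tool or the publication. The weights are random and untrained, so the resulting score has no biological meaning; the sketch shows only how data flows from sequence to features to a network output.

```python
import math
import random

# Assumed corner layout (a conventional CGR choice); the tool may use another.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_features(seq):
    """Flatten the CGR points of seq into one feature vector [x1, y1, x2, y2, ...]."""
    x, y = 0.5, 0.5  # assumed start: center of the unit square
    features = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        features.extend((x, y))
    return features


def init_network(n_inputs, n_hidden, seed=0):
    """Create random, untrained weights for a one-hidden-layer network (hypothetical)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict(features, network):
    """Forward pass: tanh hidden layer followed by a sigmoid output in (0, 1)."""
    w1, b1, w2, b2 = network
    hidden = [
        math.tanh(sum(w * f for w, f in zip(row, features)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


window = "AAGGTAAGTA"  # illustrative sequence window around a candidate site
features = cgr_features(window)
network = init_network(len(features), n_hidden=8)
score = predict(features, network)
```

In practice the weights would be fitted on labeled acceptor and donor windows, for example from the NN269 dataset, before the score could be interpreted as a splice site prediction.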

Details

Tool Type: command-line tool
Programming Languages: Python
Added: 1/14/2020
Last Updated: 1/7/2021

Publications

Hoang T, Yin C, Yau SS. Splice sites detection using chaos game representation and neural network. Genomics. 2020;112(2):1847-1852. doi:10.1016/j.ygeno.2019.10.018. PMID:31704313.