CRAIG

CRAIG is an ab initio gene predictor that uses a conditional random field (CRF) to predict gene structures in genomic sequences.


Key Features:

  • Conditional Random Field Gene Modeling: Uses a semi-Markov conditional random field (CRF) model to represent gene structures and dependencies among genomic features.
  • Large-Margin Training Algorithm: Applies an online large-margin learning algorithm related to multiclass support vector machines (SVMs) to train prediction models.
  • Integration of Genomic Evidence: Incorporates multiple genomic signals and sequence features to improve prediction of gene structures.
  • Improved Long Intron Prediction: Reports higher accuracy than earlier sequence-only gene predictors, most notably for genes containing long introns.
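
Illustrative sketch of the large-margin training idea: the block below shows a generic passive-aggressive style online update, one member of the large-margin family related to multiclass SVMs. It is not taken from CRAIG and may differ from CRAIG's actual update rule. The function name large_margin_update, the feature names, and the aggressiveness cap C are assumptions made for illustration only.

```python
from typing import Dict

Vector = Dict[str, float]  # sparse feature vector: feature name -> value


def dot(a: Vector, b: Vector) -> float:
    """Sparse dot product."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def large_margin_update(w: Vector, phi_gold: Vector, phi_pred: Vector,
                        loss: float, C: float = 1.0) -> Vector:
    """One online update (hypothetical helper, not CRAIG's API).

    Moves w just far enough that the gold structure outscores the
    predicted one by a margin of at least `loss`, with step size capped at C.
    """
    keys = set(phi_gold) | set(phi_pred)
    diff = {k: phi_gold.get(k, 0.0) - phi_pred.get(k, 0.0) for k in keys}
    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return dict(w)  # identical feature vectors: nothing to learn
    violation = loss - dot(w, diff)  # how far the margin constraint is violated
    tau = min(C, max(0.0, violation) / norm_sq)
    new_w = dict(w)
    for k, v in diff.items():
        new_w[k] = new_w.get(k, 0.0) + tau * v
    return new_w


# Toy example with made-up features for a gold and a predicted annotation.
gold = {"exon_gc": 2.0, "intron_at": 1.0}
pred = {"exon_gc": 1.0, "intron_at": 2.0}
w1 = large_margin_update({}, gold, pred, loss=1.0)
w2 = large_margin_update(w1, gold, pred, loss=1.0)  # margin already met
```

After the first update the gold annotation outscores the prediction by exactly the requested margin, so the second call leaves the weights unchanged.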

Scientific Applications:

  • Genome Annotation: Identifies gene structures in genomic sequences for annotation of eukaryotic genomes.
  • Computational Genomics: Has been evaluated on benchmark vertebrate datasets and on regions from the ENCODE project.
  • Gene Structure Analysis: Enables investigation of exon–intron organization and complex gene architectures.

Methodology:

CRAIG models gene structures with a semi-Markov conditional random field that integrates genomic sequence features. The model is trained with an online large-margin algorithm related to multiclass support vector machines and is then used to predict gene annotations.
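
Illustrative sketch of semi-Markov decoding: a semi-Markov model scores whole segments (for example an entire exon or intron) rather than single positions, and the best annotation is found by dynamic programming over segment boundaries. The toy decoder below is not CRAIG's code. The function names, the two labels, the GC/AT scoring rule, and the per-segment penalty are assumptions chosen only to make the example self-contained.

```python
from typing import Callable, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_decode(
    n: int,
    labels: List[str],
    score: Callable[[int, int, str, Optional[str]], float],
    max_len: int,
) -> Tuple[float, List[Segment]]:
    """Return the highest-scoring segmentation of positions 0..n-1.

    score(start, end, label, prev_label) scores one segment [start, end);
    prev_label is None for the first segment.
    """
    neg_inf = float("-inf")
    # best[e][y]: best score of a segmentation of [0, e) ending in label y
    best = [{y: neg_inf for y in labels} for _ in range(n + 1)]
    back = [{y: None for y in labels} for _ in range(n + 1)]
    for e in range(1, n + 1):
        for y in labels:
            for s in range(max(0, e - max_len), e):
                if s == 0:
                    candidates = [(score(0, e, y, None), None)]
                else:
                    candidates = [(best[s][p] + score(s, e, y, p), p)
                                  for p in labels]
                for value, prev in candidates:
                    if value > best[e][y]:
                        best[e][y] = value
                        back[e][y] = (s, prev)
    # Trace back from the best final label to recover the segments.
    label = max(labels, key=lambda l: best[n][l])
    total, end, segments = best[n][label], n, []
    while end > 0:
        start, prev = back[end][label]
        segments.append((start, end, label))
        end, label = start, prev
    segments.reverse()
    return total, segments


seq = "ATATGCGCGCATAT"  # made-up sequence, GC-rich in the middle


def toy_score(s: int, e: int, label: str, prev: Optional[str]) -> float:
    """Hypothetical segment score: 'exon' favors G/C, 'noncoding' favors A/T."""
    gc = sum(base in "GC" for base in seq[s:e])
    at = (e - s) - gc
    content = gc - at if label == "exon" else at - gc
    return content - 1.5  # fixed per-segment penalty discourages fragmentation


total, segments = semi_markov_decode(len(seq), ["exon", "noncoding"],
                                     toy_score, max_len=len(seq))
```

Because scores attach to whole segments, segment-level properties such as length can enter the score directly, which a position-by-position model cannot express as easily.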

Details

Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: Perl
Added: 12/18/2017
Last Updated: 11/25/2024

Publications

Bernal A, Crammer K, Hatzigeorgiou A, Pereira F. Global Discriminative Learning for Higher-Accuracy Computational Gene Prediction. PLoS Computational Biology. 2007;3(3):e54. doi:10.1371/journal.pcbi.0030054. PMID:17367206. PMCID:PMC1828702.