BRAKER1

BRAKER1 performs unsupervised eukaryotic gene prediction and genome annotation by integrating RNA-Seq data with GeneMark-ET and AUGUSTUS.


Key Features:

  • Unsupervised training: Trains its gene finders directly from RNA-Seq data, so no pre-trained parameters or expert-prepared training sets are needed.
  • Iterative GeneMark-ET training: First, GeneMark-ET performs iterative training guided by the RNA-Seq alignments and generates initial gene structures.
  • AUGUSTUS refinement: Second, AUGUSTUS is trained on the genes predicted by GeneMark-ET and then integrates RNA-Seq read information into the final gene predictions.
  • Input requirements: Requires a genome assembly file and a BAM-formatted file containing spliced alignments of RNA-Seq reads to the genome.
  • Performance comparison: Has been reported to achieve higher accuracy than MAKER2 when training and prediction are performed using only RNA-Seq data.
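
The input requirements listed above can be checked before a run is launched. The following is a minimal Python sketch, not part of BRAKER1 itself: it tests that the genome file starts like FASTA, that the alignment file carries the BAM magic bytes, and then assembles an argument list for braker.pl. The helper names are hypothetical, and the `--genome`, `--bam`, and `--species` options are assumed to match the installed release; confirm them with `braker.pl --help` before relying on this.

```python
import gzip


def looks_like_bam(path):
    """Return True if the file is gzip/BGZF-compressed and starts with the BAM magic bytes."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (OSError, EOFError):
        # Missing file, unreadable file, or not gzip-compressed at all.
        return False


def looks_like_fasta(path):
    """Return True if the first non-empty line begins with '>'."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                return line.startswith(">")
    return False


def build_braker_command(genome, bam, species):
    """Validate both inputs and return an argument list for a braker.pl run.

    The option names are an assumption about the installed BRAKER release.
    """
    if not looks_like_fasta(genome):
        raise ValueError(f"{genome} does not look like a FASTA file")
    if not looks_like_bam(bam):
        raise ValueError(f"{bam} does not look like a BAM file")
    return [
        "braker.pl",
        f"--genome={genome}",
        f"--bam={bam}",
        f"--species={species}",
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where BRAKER1, GeneMark-ET, and AUGUSTUS are installed. Building the command separately from running it keeps the validation step testable without those dependencies.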

Scientific Applications:

  • Eukaryotic genome annotation: Generating gene models for annotation of diverse eukaryotic genomes using RNA-Seq evidence.
  • Comparative genomics: Providing gene predictions for comparative analyses across species.
  • Evolutionary biology and large-scale projects: Supplying automated RNA-Seq-based gene predictions for evolutionary studies and large-scale genome projects.

Methodology:

BRAKER1 runs in two stages. First, GeneMark-ET performs iterative training guided by the spliced RNA-Seq alignments and generates initial gene structures. Second, AUGUSTUS is trained on those predicted genes and then integrates RNA-Seq read information into the final gene predictions.
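
To illustrate what spliced alignments contribute as evidence, the Python sketch below derives intron intervals from SAM-style alignment records: each `N` operation in a CIGAR string marks a stretch of the reference skipped by the read, which is how a splice junction appears in an alignment. This is an illustration only and is not BRAKER1's own code; the function names and the `(chrom, pos, cigar)` tuple layout are assumptions made for the example.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference genome.
REF_CONSUMING = set("MDN=X")


def introns_from_cigar(pos, cigar):
    """Return 1-based inclusive (start, end) intron intervals for one alignment.

    pos is the 1-based leftmost reference position (the SAM POS field).
    """
    introns = []
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            # A skipped reference region: record it as an intron candidate.
            introns.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return introns


def count_introns(alignments):
    """Count how many alignments support each (chrom, start, end) intron."""
    counts = Counter()
    for chrom, pos, cigar in alignments:
        for start, end in introns_from_cigar(pos, cigar):
            counts[(chrom, start, end)] += 1
    return counts
```

For example, `introns_from_cigar(100, "50M200N50M")` returns `[(150, 349)]`: the first 50 matched bases cover positions 100 to 149, and the 200 skipped bases that follow form the intron. Counting support per intron across reads gives a simple measure of how well each junction is backed by the RNA-Seq data.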

Details

Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: Perl
Added: 8/3/2017
Last Updated: 11/25/2024

Operations

Data Inputs & Outputs

Genome annotation

Publications

Hoff KJ, Lange S, Lomsadze A, Borodovsky M, Stanke M. BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS. Bioinformatics. 2016;32(5):767-769. doi:10.1093/bioinformatics/btv661. PMID:26559507. PMCID:PMC6078167.

Funding:

  • National Institutes of Health: HG000783
  • German Research Foundation: STA 1009/10-1
