AUGUSTUS

AUGUSTUS predicts protein-coding genes in eukaryotic genomes using ab initio modeling and integration of extrinsic evidence such as RNA-Seq, expressed sequence tags (ESTs), proteomics data, and annotations from related species.


Key Features:

  • Ab Initio Gene Prediction: Predicts gene structures in genomic sequences using a generalized hidden Markov model (GHMM) with submodels for intron length, donor splice sites, and GC-content-dependent parameters.
  • Extrinsic Evidence Integration: Incorporates RNA-Seq data, ESTs, proteomics data, and annotations from related genomes to improve gene prediction accuracy.
  • Comparative Gene Prediction: Performs simultaneous gene prediction across multiple aligned genomes by exploiting evolutionary conservation and negative selection.
  • Genome-Specific Model Training: Allows parameter training tailored to specific genomes to improve prediction accuracy.
  • PPX Extension: Uses protein multiple sequence alignments to identify additional members of protein families within genomes.
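
To make the extrinsic evidence feature above more concrete, the sketch below builds a single GFF-style hint line of the kind that is typically passed to AUGUSTUS alongside the genome. It is only an illustration: the helper name format_hint, the program label "example", and the sample coordinates are hypothetical, and the attribute keys (src, mult, pri, grp) and the feature name "intron" reflect a common reading of the hints file conventions that should be checked against the AUGUSTUS documentation for the installed version.

```python
def format_hint(seqname, feature, start, end, strand, src,
                program="example", mult=None, pri=None, grp=None):
    """Build one tab-separated, GFF-style hint line.

    Coordinates are 1-based and inclusive, as in GFF. The attribute keys
    (src, mult, pri, grp) are assumptions about the hints convention.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    if strand not in {"+", "-", "."}:
        raise ValueError("strand must be '+', '-' or '.'")

    # The source key is always written; the other attributes are optional.
    attributes = [f"src={src}"]
    if mult is not None:
        attributes.append(f"mult={mult}")
    if pri is not None:
        attributes.append(f"pri={pri}")
    if grp is not None:
        attributes.append(f"grp={grp}")

    # Nine GFF columns: seqname, source, feature, start, end, score,
    # strand, frame, attributes. Score and frame are left as placeholders.
    columns = [seqname, program, feature, str(start), str(end),
               "0", strand, ".", ";".join(attributes)]
    return "\t".join(columns)


# Hypothetical intron hint supported by five spliced RNA-Seq reads.
example_line = format_hint("chr1", "intron", 1000, 2000, "+",
                           src="E", mult=5, pri=4)
print(example_line)
```

Each line describes one piece of evidence (here, an intron supported by several reads). How strongly a given source influences the prediction is configured separately and is not modeled in this sketch.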

Scientific Applications:

  • Eukaryotic Genome Annotation: Identifies protein-coding genes during structural annotation of eukaryotic genomes.
  • Comparative Genomics: Enables gene prediction across multiple aligned genomes to study evolutionary conservation of gene structures.
  • Transcriptome-Assisted Annotation: Integrates RNA-Seq and EST evidence to refine gene models in genome annotation projects.

Methodology:

AUGUSTUS applies hidden Markov models with submodels for splice sites, intron length distributions, and GC-content-dependent parameters. It integrates extrinsic evidence from transcriptomic and proteomic data. For comparative gene prediction on aligned genomes, it uses a graph-based binary labeling framework that is optimized through subgradient-based dual decomposition.
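
As a rough illustration of the hidden Markov model idea described above, the following sketch runs Viterbi decoding on a toy two-state model in which "coding" positions are assumed to be GC-rich and "noncoding" positions AT-rich. It is not the AUGUSTUS model: the states, probabilities, and sequence are invented for the example, and a plain HMM is used here, without the explicit length distributions and splice-site submodels mentioned in the text.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for a sequence under a plain HMM."""
    if not observations:
        return []

    # Work in log space so long sequences do not underflow.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    backpointers = []

    for symbol in observations[1:]:
        previous = scores[-1]
        current, pointers = {}, {}
        for s in states:
            # Best predecessor state for landing in state s at this position.
            best = max(states, key=lambda p: previous[p] + math.log(trans_p[p][s]))
            current[s] = (previous[best] + math.log(trans_p[best][s])
                          + math.log(emit_p[s][symbol]))
            pointers[s] = best
        scores.append(current)
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy parameters (assumptions for illustration only).
STATES = ["coding", "noncoding"]
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "noncoding": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}

toy_path = viterbi("GCGCGCGCATATATAT", STATES, START, TRANS, EMIT)
print(toy_path)
```

With these toy numbers the GC-rich half of the sequence is labeled "coding" and the AT-rich half "noncoding", because eight better-matching emissions outweigh the cost of one state switch. A gene finder replaces this two-state layout with many states for exons, introns, and signals, which this sketch does not attempt.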

Details

License: Artistic-1.0
Maturity: Mature
Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: C++
Added: 2/10/2017
Last Updated: 11/25/2024

Operations

Data Inputs & Outputs

Homology-based gene prediction

Other operations do not define inputs or outputs.

Publications

Hoff KJ, Lomsadze A, Borodovsky M, Stanke M. Whole-Genome Annotation with BRAKER. Methods in Molecular Biology. 2019. doi:10.1007/978-1-4939-9173-0_5. PMID:31020555. PMCID:PMC6635606.

Nachtweide S, Stanke M. Multi-Genome Annotation with AUGUSTUS. Methods in Molecular Biology. 2019. doi:10.1007/978-1-4939-9173-0_8. PMID:31020558.

Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19(suppl_2):ii215-ii225. doi:10.1093/bioinformatics/btg1080. PMID:14534192.

Stanke M, Schöffmann O, Morgenstern B, Waack S. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics. 2006;7(1). doi:10.1186/1471-2105-7-62. PMID:16469098. PMCID:PMC1409804.

Stanke M, Tzvetkova A, Morgenstern B. AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biology. 2006;7(S1). doi:10.1186/gb-2006-7-s1-s11. PMID:16925833. PMCID:PMC1810548.

Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics. 2008;24(5):637-644. doi:10.1093/bioinformatics/btn013. PMID:18218656.

Keller O, Kollmar M, Stanke M, Waack S. A novel hybrid gene prediction method employing protein multiple sequence alignments. Bioinformatics. 2011;27(6):757-763. doi:10.1093/bioinformatics/btr010. PMID:21216780.

König S, Romoth LW, Gerischer L, Stanke M. Simultaneous gene finding in multiple genomes. Bioinformatics. 2016;32(22):3388-3395. doi:10.1093/bioinformatics/btw494. PMID:27466621. PMCID:PMC5860283.

Funding: DFG Research Unit (1234); German Research Foundation (GRK 1870/01).

Hoff KJ, Stanke M. Predicting Genes in Single Genomes with AUGUSTUS. Current Protocols in Bioinformatics. 2018;65(1). doi:10.1002/cpbi.57. PMID:30466165.

Related Tools

webaugustus
Relation: usedBy