Bakta

Bakta is a command-line tool for fast, robust, and comprehensive annotation of bacterial genomes. Existing annotation tools often depend on taxon-specific databases or well-annotated reference genomes; Bakta is taxon-independent, which makes it broadly applicable to the growing number of sequenced bacterial genomes. Its annotation workflow includes the detection of small proteins and can take replicon metadata into account. Coding sequences are identified with an alignment-free method, which speeds up annotation and improves the accuracy of cross-references to public databases.
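To illustrate what alignment-free identification can look like in principle, here is a minimal sketch that identifies a protein by an exact lookup of its hash digest and length rather than by aligning it against a database. The function names, the normalization rules, and the identifier `EXAMPLE_0001` are hypothetical and chosen only for illustration; this is not Bakta's actual code or database layout.

```python
import hashlib


def normalize(protein: str) -> str:
    """Upper-case the sequence and drop whitespace and a trailing stop symbol."""
    return protein.strip().upper().rstrip("*")


def seq_digest(protein: str) -> str:
    """Return the MD5 hex digest of the normalized amino acid sequence."""
    return hashlib.md5(normalize(protein).encode("ascii")).hexdigest()


def build_index(known: dict) -> dict:
    """Map digest -> (sequence length, identifier) for all known proteins."""
    return {
        seq_digest(seq): (len(normalize(seq)), ident)
        for ident, seq in known.items()
    }


def identify(protein: str, index: dict):
    """Return the identifier of an exactly matching known protein, or None."""
    hit = index.get(seq_digest(protein))
    if hit is None:
        return None
    length, ident = hit
    # The length check guards against the (unlikely) case of a digest collision.
    return ident if length == len(normalize(protein)) else None


# Hypothetical reference table: identifier -> amino acid sequence.
index = build_index({"EXAMPLE_0001": "MKTAYIAKQR"})

print(identify("mktayiakqr*", index))  # same sequence, different formatting
print(identify("MKTAYIAKQQ", index))   # one residue differs, so no exact match
```

A lookup of this kind costs one hash computation and one dictionary access per sequence, which is why it scales well, but it only finds exact matches; sequences without an exact hit still need a similarity search.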

Bakta exports its annotation results in multiple formats, including GFF3, INSDC-compliant flat files, and comprehensive JSON files, making it highly compatible with automated downstream analysis.
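As an example of automated downstream analysis, the following sketch counts feature types in a GFF3 file using only the Python standard library. It relies on the general GFF3 layout (nine tab-separated columns, `#` comment lines, an optional `##FASTA` section); the sample records and the helper names `count_feature_types` and `parse_attributes` are invented for illustration and are not real Bakta output or API.

```python
import io
from collections import Counter


def parse_attributes(field: str) -> dict:
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for pair in field.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def count_feature_types(handle) -> Counter:
    """Count the values of the feature type column (column 3) in a GFF3 stream."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequences follow; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        counts[columns[2]] += 1
    return counts


# Invented sample records, joined with tabs to form valid GFF3 lines.
records = [
    ("contig_1", "example", "CDS", "10", "400", ".", "+", "0", "ID=cds_1;product=hypothetical protein"),
    ("contig_1", "example", "tRNA", "500", "575", ".", "-", ".", "ID=trna_1"),
    ("contig_1", "example", "CDS", "600", "900", ".", "+", "0", "ID=cds_2"),
]
sample = "##gff-version 3\n" + "\n".join("\t".join(r) for r in records) + "\n"

counts = count_feature_types(io.StringIO(sample))
print(dict(counts))
```

For a file on disk, pass an open file handle instead of the `io.StringIO` object.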

Topic

Genomics; Data submission, annotation and curation; Sequence analysis

Detail

  • Operation: Genome annotation

  • Software interface: Command-line tool, Web application

  • Language: Python

  • License: The GNU General Public License v3.0

  • Cost: Free with restrictions

  • Version name: v1.5.1

  • Credit: -

  • Input: Sequence assembly [FASTA], Sequence features metadata [TSV]

  • Output: Feature table [GenBank format] [EMBL format] [GFF3] [JSON] [TSV], Protein sequence record [FASTA], Nucleic acid sequence record [FASTA]

  • Contact: Oliver Schwengers, oliver.schwengers@cb.jlug.de

  • Collection: -

  • Maturity: Mature

Publications

  • Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification.
  • Schwengers O, et al. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microbial Genomics. 2021; 7:000685. doi: 10.1099/mgen.0.000685
  • https://doi.org/10.1099/mgen.0.000685
  • PMID: 34739369
  • PMC: PMC8743544

Download and documentation

