Bowtie

Bowtie aligns short DNA sequence reads to large genomes for high-throughput sequence mapping.


Key Features:

  • Burrows-Wheeler indexing: Indexes the reference genome with the Burrows-Wheeler transform, which keeps alignment fast and the in-memory index compact even for large genomes.
  • Quality-aware backtracking: Employs a quality-aware backtracking algorithm that permits mismatches during alignment.
  • Throughput: Aligns more than 25 million reads per CPU hour against the human genome.
  • Memory footprint: Runs with a memory footprint of approximately 1.3 gigabytes for the human genome.
  • Parallel processing: Can use multiple processor cores simultaneously for greater alignment speed.
  • Short-read optimization: Optimized for aligning short reads generated by next-generation sequencing platforms.
  • Benchmark evaluations: Compared with Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST) on synthetic data and real RNA-Seq data (Hatem et al., 2013).
  • Speed–accuracy trade-offs: Shows a speed–accuracy trade-off in those benchmarks, favoring throughput in many tests, while other tools may perform better on longer reads.
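
The throughput figure above lends itself to a back-of-envelope runtime estimate. The sketch below is illustrative only: the function name `estimated_wall_hours` is hypothetical, the default rate is the reported lower bound for the human genome, and perfect linear scaling across cores is an assumption that real runs will not fully achieve.

```python
def estimated_wall_hours(num_reads, reads_per_cpu_hour=25_000_000, cores=1):
    """Rough wall-clock estimate in hours.

    Assumes (hypothetically) that throughput scales linearly with core count
    and that the per-CPU rate matches the reported lower bound.
    """
    if num_reads < 0 or reads_per_cpu_hour <= 0 or cores < 1:
        raise ValueError("num_reads must be >= 0, rate > 0, cores >= 1")
    cpu_hours = num_reads / reads_per_cpu_hour  # total CPU time needed
    return cpu_hours / cores                    # spread evenly over cores


# 100 million reads on 4 cores at the lower-bound rate
print(estimated_wall_hours(100_000_000, cores=4))  # 1.0
```

Because the reported rate is a lower bound, the result is best read as an upper estimate of alignment time, excluding index loading and I/O.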

Scientific Applications:

  • Short-read alignment: Mapping short DNA sequence reads to large reference genomes such as the human genome.
  • High-throughput NGS processing: Processing large-scale next-generation sequencing datasets at high throughput.
  • RNA-Seq read alignment: Alignment of RNA-Seq reads in transcriptomic analyses (evaluated using real RNA-Seq data).
  • Comparative benchmarking: Comparative evaluation of aligner throughput and speed–accuracy trade-offs across mapping tools.

Methodology:

Bowtie indexes the reference genome with the Burrows-Wheeler transform and extends earlier Burrows-Wheeler search techniques with a quality-aware backtracking algorithm that permits mismatches. Alignment can run in parallel across multiple processor cores.
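
The idea can be illustrated with a toy example. The sketch below is not Bowtie's implementation: it builds a small FM-index by naive suffix sorting, then runs a backward search that tolerates mismatches and ranks hits by the summed base quality at mismatched positions. The function names `build_fm_index` and `align_with_mismatches`, the exhaustive (rather than greedy) search order, and the penalty definition are all assumptions made for illustration.

```python
def build_fm_index(text):
    """Build a toy FM-index (suffix array, C table, Occ table) for text."""
    text += "$"  # sentinel, sorts before A/C/G/T
    # Naive suffix sorting; fine for a demo, far too slow for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # i == 0 wraps to the sentinel
    alphabet = sorted(set(text))
    # C[c]: number of characters in text that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def align_with_mismatches(read, quals, index, max_mismatches=1):
    """Return sorted (penalty, position) hits; penalty sums the qualities
    of mismatched read positions, so mismatches at low-quality bases rank first."""
    sa, C, occ = index
    hits = []

    def extend(pos, lo, hi, mismatches, penalty):
        if pos < 0:  # whole read consumed: rows lo..hi-1 are alignments
            hits.extend((penalty, sa[row]) for row in range(lo, hi))
            return
        for c in C:
            if c == "$":
                continue
            # Backward-search step: narrow the suffix-array range by c
            new_lo = C[c] + occ[c][lo]
            new_hi = C[c] + occ[c][hi]
            if new_lo >= new_hi:
                continue  # no suffix continues with c; prune this branch
            if c == read[pos]:
                extend(pos - 1, new_lo, new_hi, mismatches, penalty)
            elif mismatches < max_mismatches:
                # Substitute c for the read base and pay its quality value
                extend(pos - 1, new_lo, new_hi,
                       mismatches + 1, penalty + quals[pos])

    extend(len(read) - 1, 0, len(sa), 0, 0)
    return sorted(hits)


index = build_fm_index("ACGTTCGA")
# The last base has low quality, so the hit that mismatches there ranks first.
print(align_with_mismatches("ACGA", [40, 40, 40, 10], index))
# [(10, 0), (40, 4)]
```

The read is matched right to left because each backward-search step prepends one character to the matched suffix. A production aligner would replace the full Occ table with sampled checkpoints and would prune the search much more aggressively than this exhaustive enumeration.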

Details

Maturity: Mature
Cost: Free of charge
Tool Type: Command-line tool
Operating Systems: Linux, Windows, Mac
Programming Languages: C++
Added: 1/13/2017
Last Updated: 11/5/2024

Publications

Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009;10(3):R25. doi:10.1186/gb-2009-10-3-r25. PMID:19261174. PMCID:PMC2690996.

Hatem A, Bozdağ D, Toland AE, Çatalyürek ÜV. Benchmarking short sequence mapping tools. BMC Bioinformatics. 2013;14(1):184. doi:10.1186/1471-2105-14-184. PMID:23758764. PMCID:PMC3694458.

Mareuil F, Doppelt-Azeroual O, Ménager H. A public Galaxy platform at Pasteur used as an execution engine for web services. F1000Research. 2017. doi:10.7490/f1000research.1114334.1.
