Bowtie
Bowtie aligns short DNA sequence reads to large genomes for high-throughput sequence mapping.
Key Features:
- Burrows-Wheeler indexing: Indexes the reference genome with the Burrows-Wheeler transform, which keeps alignment fast and memory use low even for large genomes.
- Quality-aware backtracking: Employs a quality-aware backtracking algorithm that permits mismatches during alignment.
- Throughput: Achieves alignment throughput exceeding 25 million reads per CPU hour.
- Memory footprint: Uses approximately 1.3 gigabytes of memory for the human genome.
- Parallel processing: Can run on multiple processor cores simultaneously.
- Short-read optimization: Optimized for aligning short reads generated by next-generation sequencing platforms.
- Benchmark evaluations: Has been evaluated against Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST) using synthetic data and real RNA-Seq data.
- Speed–accuracy trade-offs: Demonstrates trade-offs between speed and accuracy, favoring throughput in many tests while other tools may perform better on longer read lengths.
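The throughput figure above can be turned into a rough runtime estimate. The following is a minimal sketch, not a benchmark: the function name `estimated_wall_hours` is hypothetical, and it assumes ideal linear scaling across cores, which real runs may not reach.

```python
# Throughput quoted in the feature list: more than 25 million reads per CPU hour.
READS_PER_CPU_HOUR = 25_000_000


def estimated_wall_hours(num_reads, cores=1):
    """Rough wall-clock estimate in hours.

    Assumption (not from the source): throughput scales linearly with
    the number of cores. Because the quoted rate is a lower bound, the
    result is an upper bound under that assumption.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return num_reads / (READS_PER_CPU_HOUR * cores)


if __name__ == "__main__":
    # 100 million reads on 4 cores
    print(estimated_wall_hours(100_000_000, cores=4))
```

Actual runtimes depend on read length, mismatch settings, and hardware, so treat the result as an order-of-magnitude guide only.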
Scientific Applications:
- Short-read alignment: Mapping short DNA sequence reads to large reference genomes such as the human genome.
- High-throughput NGS processing: Processing large-scale next-generation sequencing datasets at high throughput.
- RNA-Seq read alignment: Alignment of RNA-Seq reads in transcriptomic analyses (evaluated using real RNA-Seq data).
- Comparative benchmarking: Comparative evaluation of aligner throughput and speed–accuracy trade-offs across mapping tools.
Methodology:
Bowtie indexes the reference genome with Burrows-Wheeler indexing and aligns reads using an extended, quality-aware backtracking algorithm that permits mismatches. Alignment can run in parallel across multiple processor cores.
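To illustrate the general idea, here is a toy sketch of backward search over a Burrows-Wheeler (FM) index with backtracking over mismatches. It is not Bowtie's implementation: the function names `build_fm_index` and `align` are hypothetical, the suffix array is built naively, and base qualities are ignored, so the quality-aware ordering of backtracking steps is not reproduced.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM index (suffix array, C table, Occ table) for text."""
    text += "$"  # sentinel; sorts before A, C, G, T
    # Naive suffix array: fine for a demo, far too slow for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for i == 0).
    bwt = "".join(text[i - 1] for i in sa)
    # c_table[c] = number of characters in text strictly smaller than c
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return sa, c_table, occ


def align(read, index, max_mismatches=1):
    """Return sorted (position, mismatches) pairs for read in the indexed text."""
    sa, c_table, occ = index
    hits = []

    def extend(i, lo, hi, mismatches):
        # [lo, hi) is the suffix-array range whose suffixes start with a
        # string matching read[i + 1:] within the mismatch budget.
        if i < 0:
            hits.extend((sa[row], mismatches) for row in range(lo, hi))
            return
        for ch in "ACGT":
            if ch not in c_table:
                continue
            cost = mismatches + (ch != read[i])
            if cost > max_mismatches:
                continue  # backtrack: this substitution exceeds the budget
            new_lo = c_table[ch] + occ[ch][lo]
            new_hi = c_table[ch] + occ[ch][hi]
            if new_lo < new_hi:
                extend(i - 1, new_lo, new_hi, cost)

    extend(len(read) - 1, 0, len(sa), 0)
    return sorted(hits)


if __name__ == "__main__":
    index = build_fm_index("ACGTACGTTACG")
    print(align("ACG", index, max_mismatches=0))
    print(align("ACGA", index, max_mismatches=1))
```

The read is matched from its last base to its first, narrowing a range of the suffix array at each step. When the exact base yields no match or a substitution is allowed, the search tries the other bases and charges one mismatch, abandoning any branch that exceeds the budget.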
Details
- Maturity: Mature
- Cost: Free of charge
- Tool Type: Command-line tool
- Operating Systems: Linux, Windows, Mac
- Programming Languages: C++
- Added: 1/13/2017
- Last Updated: 11/5/2024
Publications
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 2009;10(3). doi:10.1186/gb-2009-10-3-r25. PMID:19261174. PMCID:PMC2690996.
Hatem A, Bozdağ D, Toland AE, Çatalyürek ÜV. Benchmarking short sequence mapping tools. BMC Bioinformatics. 2013;14(1). doi:10.1186/1471-2105-14-184. PMID:23758764. PMCID:PMC3694458.
Mareuil F, Doppelt-Azeroual O, Ménager H. A public Galaxy platform at Pasteur used as an execution engine for web services. F1000Research. 2017. doi:10.7490/f1000research.1114334.1.