Centrifuger
Centrifuger classifies metagenomic sequencing reads taxonomically by matching them against comprehensive microbial genome databases such as RefSeq.
Key Features:
- Taxonomic classification: Assigns taxonomic labels to metagenomic sequencing reads by matching them against microbial genome databases such as RefSeq.
- Run-block compression (lossless): Compresses Burrows-Wheeler transformed (BWT) genome sequences without loss of information, yielding a representation with sublinear space complexity.
- FM-index compaction: Integrates strategies to compact the Ferragina-Manzini (FM) index, halving memory usage compared to other FM-index-based approaches.
- Rapid rank queries: Facilitates rapid rank queries on the compressed index to support sequence matching and lookup.
- Unconstrained match length: Places no limit on match length, which improves the precision of taxonomic assignments, particularly at lower taxonomic levels.
- Reduced memory footprint: Reduces memory requirements for processing microbial genomic data without sacrificing classification accuracy.
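To make the role of rank queries concrete, the following is a minimal textbook sketch of an uncompressed FM-index with backward search, written in Python for illustration. It is not Centrifuger's C++ implementation and does not include its compression or compaction; the function names (`bwt`, `build_fm`, `count_matches`) are hypothetical. Each step of the search uses two rank lookups, and the match can keep extending for as long as the interval stays non-empty.

```python
def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (only suitable for tiny inputs)."""
    s = text + "$"  # '$' is a sentinel that sorts before the letters used here
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def build_fm(bwt_seq: str):
    """Build the two FM-index tables: C (cumulative counts) and occ (rank table)."""
    alphabet = sorted(set(bwt_seq))
    # C[c]: number of characters in the sequence that are strictly smaller than c
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt_seq.count(c)
    # occ[c][i]: rank of c at position i, i.e. occurrences of c in bwt_seq[:i]
    occ = {c: [0] * (len(bwt_seq) + 1) for c in alphabet}
    for i, ch in enumerate(bwt_seq):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def count_matches(pattern: str, C, occ, n: int) -> int:
    """Count occurrences of pattern using backward search over the BWT."""
    lo, hi = 0, n  # half-open interval of sorted rotations matching the suffix so far
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]  # rank query at the lower bound
        hi = C[ch] + occ[ch][hi]  # rank query at the upper bound
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    seq = bwt("ACGTACGTAC")
    C, occ = build_fm(seq)
    print(count_matches("ACG", C, occ, len(seq)))
```

The rank table here stores one counter per position and character, which is exactly the kind of memory cost that a compressed index is meant to avoid.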
Scientific Applications:
- Metagenomic taxonomic profiling: Assigns taxonomy to reads in metagenomic studies using comparisons to microbial genome databases.
- High-resolution classification: Provides improved classification accuracy at lower taxonomic levels where precision is critical.
- Large-scale metagenomic analyses: Enables analyses of large microbial databases by reducing storage and memory requirements.
- Sequence classification tasks: Performs sequence-level classification for microbial genomic data.
Methodology:
Centrifuger compares sequencing reads to microbial genome databases (e.g., RefSeq). It applies lossless run-block compression to the Burrows-Wheeler transformed (BWT) genome sequences and compacts the Ferragina-Manzini (FM) index to reduce memory. Rapid rank queries on the compressed index and unconstrained match length support taxonomic assignment.
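The entry does not spell out how run-block compression encodes a sequence, so the sketch below is only one simple block-based scheme assumed for illustration: the sequence is cut into fixed-size blocks, a block made of a single repeated character is stored as one symbol, and every other block is stored verbatim. The function names and the `block_size` parameter are hypothetical, and the actual Centrifuger encoding may differ. The round trip shows why such a scheme is lossless.

```python
def run_block_compress(seq: str, block_size: int = 4):
    """Split seq into fixed-size blocks; keep one symbol for each single-character block."""
    is_run = []     # one flag per block: True if the block is a single repeated character
    run_chars = []  # the repeated character of each run block, in order
    plain = []      # verbatim content of all remaining blocks, in order
    for start in range(0, len(seq), block_size):
        block = seq[start:start + block_size]
        if len(block) == block_size and len(set(block)) == 1:
            is_run.append(True)
            run_chars.append(block[0])
        else:
            # Mixed blocks (and a short final block) are stored unchanged
            is_run.append(False)
            plain.append(block)
    return is_run, "".join(run_chars), "".join(plain)


def run_block_decompress(is_run, run_chars: str, plain: str, block_size: int = 4) -> str:
    """Rebuild the original sequence from the flags and the two symbol streams."""
    out = []
    r = p = 0  # read positions in run_chars and plain
    for flag in is_run:
        if flag:
            out.append(run_chars[r] * block_size)
            r += 1
        else:
            out.append(plain[p:p + block_size])
            p += block_size
    return "".join(out)


if __name__ == "__main__":
    original = "AAAAAAAA" + "CGTT" + "TTTT" + "GGGG" + "A"
    flags, runs, rest = run_block_compress(original)
    assert run_block_decompress(flags, runs, rest) == original
    print(len(original), "symbols stored as", len(runs) + len(rest), "plus", len(flags), "flags")
```

BWT sequences of similar genomes tend to contain long runs of identical characters, which is the property a run-based scheme like this one exploits.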
Details
- License: MIT
- Cost: Free of charge
- Tool Type: command-line tool
- Programming Languages: C++
- Added: 6/18/2024
- Last Updated: 11/24/2024
Publications
Song L, Langmead B. Centrifuger: lossless compression of microbial genomes for efficient and accurate metagenomic sequence classification. Genome Biology. 2024;25(1). doi:10.1186/s13059-024-03244-4. PMID:38664753. PMCID:PMC11046777.