MEGAHIT
MEGAHIT is a de novo assembler for next-generation sequencing (NGS) metagenomic reads. It uses a succinct de Bruijn graph to assemble large and complex metagenomic datasets.
Key Features:
- De novo NGS metagenomic assembly: Performs de novo assembly of next-generation sequencing (NGS) metagenomic reads.
- Succinct de Bruijn graph: Uses a succinct de Bruijn graph approach to reduce memory footprint during assembly.
- Low memory operation: Operates with low memory usage while maintaining high performance.
- No preprocessing required: Assembles large datasets without requiring partitioning or normalization preprocessing.
- GPU-accelerated and CPU-only modes: Runs with GPU acceleration or on CPU alone; timings are reported for both.
- Scalability demonstration: Assembled a 252 gigabase-pair (Gbp) soil metagenomics dataset on a single computing node in 44.1 hours with a GPU and in 99.6 hours without one.
- Improved assembly quality: Reported assemblies three times larger than those from previous methods, with higher contig N50 and longer average contig length.
- Increased read alignment: 55.8% of reads aligned to the assembly for the demonstrated dataset, a fourfold improvement over earlier techniques.
- Environmental sample suitability: Effective for extensive metagenomic data from environments such as soil.
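The timing figures above imply a GPU speedup and an average throughput that are easy to work out. The short calculation below is only a back-of-the-envelope sketch: the three constants are the numbers quoted in the feature list, while the function name and the idea of reporting throughput in Gbp per hour are illustrative choices, not something the tool or its publication defines.

```python
# Figures quoted in the feature list (soil metagenome, single computing node).
DATASET_GBP = 252.0      # dataset size in gigabase pairs
HOURS_WITH_GPU = 44.1    # reported wall-clock time with a GPU
HOURS_CPU_ONLY = 99.6    # reported wall-clock time without a GPU


def throughput_gbp_per_hour(size_gbp: float, hours: float) -> float:
    """Average amount of input processed per hour, in Gbp/h (illustrative metric)."""
    return size_gbp / hours


gpu_speedup = HOURS_CPU_ONLY / HOURS_WITH_GPU
gpu_rate = throughput_gbp_per_hour(DATASET_GBP, HOURS_WITH_GPU)
cpu_rate = throughput_gbp_per_hour(DATASET_GBP, HOURS_CPU_ONLY)

print(f"GPU speedup over CPU-only: {gpu_speedup:.2f}x")   # 2.26x
print(f"Throughput with GPU: {gpu_rate:.2f} Gbp/h")        # 5.71 Gbp/h
print(f"Throughput CPU-only: {cpu_rate:.2f} Gbp/h")        # 2.53 Gbp/h
```

On these numbers the GPU run is roughly 2.3 times faster than the CPU-only run for this one dataset; other datasets and hardware may behave differently.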
Scientific Applications:
- Large-scale metagenome assembly: Assembly of large and complex metagenomic datasets, including soil metagenomes.
- Comparative assembly benchmarking: Comparing assembly size, contig N50, and average contig length against previous methods.
- Read recruitment enhancement: Increasing the fraction of reads that align to assembled metagenomes (e.g., 55.8% in the demonstrated dataset).
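The benchmarking metrics named above (contig N50, average contig length, fraction of reads aligned) have standard definitions that can be sketched in a few lines. The function names and the toy contig lengths and read counts below are hypothetical, chosen only to illustrate the definitions; they are not MEGAHIT output.

```python
def n50(contig_lengths):
    """Return the N50: the length L of the contig at which, going from longest
    to shortest, the running total first reaches half the assembly size."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def mean_contig_length(contig_lengths):
    """Arithmetic mean of the contig lengths."""
    return sum(contig_lengths) / len(contig_lengths)


def aligned_fraction(aligned_reads, total_reads):
    """Fraction of reads that align back to the assembly."""
    return aligned_reads / total_reads


# Toy data (hypothetical): five contigs, and 558 of 1000 reads aligned.
lengths = [100, 200, 300, 400, 500]
print(n50(lengths))                   # 400
print(mean_contig_length(lengths))    # 300.0
print(aligned_fraction(558, 1000))    # 0.558
```

In practice these values are usually taken from an assembly-statistics tool and a read aligner rather than computed by hand; the sketch only makes the definitions concrete.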
Methodology:
Performs de novo assembly of NGS metagenomic reads using a succinct de Bruijn graph approach and can run with GPU acceleration or in CPU-only mode.
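To make the de Bruijn graph idea concrete, here is a minimal sketch of graph construction and contig extension on toy reads. It uses an ordinary in-memory dictionary, so it does not reproduce the succinct representation that gives MEGAHIT its low memory footprint, and it ignores sequencing errors, reverse complements, and multiple k-mer sizes. The function names, the reads, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, and each k-mer in a read
    adds an edge from its prefix (k-1)-mer to its suffix (k-1)-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def extend_contig(graph, start):
    """Walk forward from `start` while the path is unambiguous, i.e. the current
    node has exactly one successor and that successor has exactly one predecessor."""
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1

    contig = start
    node = start
    visited = {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if indegree[nxt] != 1 or nxt in visited:
            break  # branch point or cycle: stop extending
        contig += nxt[-1]  # each step adds one new base
        visited.add(nxt)
        node = nxt
    return contig


# Two overlapping toy reads (hypothetical) drawn from the sequence ATGGCGTGCA.
reads = ["ATGGCGT", "GCGTGCA"]
graph = build_de_bruijn(reads, k=4)
print(extend_contig(graph, "ATG"))  # ATGGCGTGCA
```

The two reads share the k-mer GCGT, so their paths merge in the graph and the walk recovers the full underlying sequence from the overlap.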
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: C++
- Added: 8/3/2017
- Last Updated: 11/24/2024
Publications
Li D, Liu C, Luo R, Sadakane K, Lam T. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31(10):1674-1676. doi:10.1093/bioinformatics/btv033. PMID:25609793.