wtdbg2
wtdbg2 is a de novo assembler for long, noisy PacBio and Oxford Nanopore Technologies (ONT) reads. It assembles raw reads and then builds consensus sequences, and it scales to very large genomes.
Key Features:
- De novo assembly: Performs de novo sequence assembly specifically for long noisy reads.
- Raw-read processing: Processes raw reads directly without prior error correction.
- Consensus construction: Constructs consensus sequences from intermediate assembly outputs.
- Supported technologies: Handles long-read data from PacBio and Oxford Nanopore Technologies (ONT).
- Speed: Reported to run 2–17× faster than previously published long-read assemblers such as Canu and FALCON.
- Accuracy and contiguity: Maintains comparable contiguity and base accuracy relative to other long-read assemblers.
- Large-genome capability: Scales to very large genomes, including the human genome and the ~32 Gb axolotl genome.
- High-throughput scaling: Supports population-scale long-read assembly projects by scaling to high-throughput sequencing datasets.
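Contiguity of an assembly is commonly summarized with the N50 statistic: the contig length at which contigs of that length or longer cover at least half of the total assembled bases. This entry does not say which metric was used for the comparison above, so the following is only a generic illustration. The function name `n50` and the sample lengths are made up for the example.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length

    # Unreachable for non-empty input: the last contig always reaches the total.
    raise AssertionError("unreachable")


if __name__ == "__main__":
    # Hypothetical contig lengths; total = 54, half = 27, reached at 10 + 9 + 8.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```

A higher N50 at a similar total assembly size indicates a more contiguous assembly, but it says nothing about base accuracy, which has to be assessed separately.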
Scientific Applications:
- Large-genome assembly: Assembly of large and complex genomes such as the human genome and the ~32 Gb axolotl genome.
- Population-scale assembly: Generation of assemblies for population-scale studies using high-throughput PacBio and ONT datasets.
- Genomic research: Rapid production of genome assemblies to support comparative genomics and other genomic analyses.
Methodology:
wtdbg2 works in two stages. It first assembles raw long, noisy reads directly, without a prior error-correction step, and then constructs consensus sequences from the intermediate assembly output.
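The two stages map onto two command-line programs in the project repository, `wtdbg2` (assembly) and `wtpoa-cns` (consensus). The sketch below only builds the two command lines so they can be inspected before being run. The flags (`-x`, `-g`, `-i`, `-t`, `-fo`) and the `.ctg.lay.gz` layout suffix follow the usage shown in the project's README as best recalled here, so check them against `wtdbg2 -h` for the installed version. The helper name `build_commands`, the input file name, the genome size and the thread count are assumptions for illustration.

```python
import shlex
from typing import List, Tuple

# Presets as listed in the project README (assumed; confirm with `wtdbg2 -h`):
# rs = PacBio RSII, sq = PacBio Sequel, ccs = PacBio CCS, ont = Oxford Nanopore.
PRESETS = {"rs", "sq", "ccs", "ont"}


def build_commands(
    reads: str,
    genome_size: str,
    preset: str,
    prefix: str = "dbg",
    threads: int = 16,
) -> Tuple[List[str], List[str]]:
    """Build the assembly and consensus command lines without running them."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")

    # Stage 1: assemble raw reads; writes the contig layout to <prefix>.ctg.lay.gz.
    assemble = [
        "wtdbg2", "-x", preset, "-g", genome_size,
        "-i", reads, "-t", str(threads), "-fo", prefix,
    ]
    # Stage 2: derive consensus sequences from the intermediate layout file.
    consensus = [
        "wtpoa-cns", "-t", str(threads),
        "-i", f"{prefix}.ctg.lay.gz", "-fo", f"{prefix}.raw.fa",
    ]
    return assemble, consensus


if __name__ == "__main__":
    # Hypothetical input: PacBio RSII reads from a ~4.6 Mb genome.
    for cmd in build_commands("reads.fa.gz", "4.6m", "rs"):
        print(shlex.join(cmd))
```

To execute the pipeline, each list could be passed to `subprocess.run(cmd, check=True)` in order. The consensus step has to run second because it reads the layout file that the assembly step writes.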
Details
- License: GPL-3.0
- Maturity: Mature
- Cost: Free of charge
- Programming Languages: C
- Added: 1/14/2020
- Last Updated: 11/24/2024
Operations
De-novo assembly
Publications
Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nature Methods. 2020;17(2):155-158. doi:10.1038/s41592-019-0669-3. PMID:31819265. PMCID:PMC7004874.
Funding:
- U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute: R01HG010040
- National Natural Science Foundation of China: 31571353, 31822029
Links
Repository
https://github.com/ruanjue/wtdbg2