wtdbg2

wtdbg2 is a de novo assembler for long, noisy reads from PacBio and Oxford Nanopore Technologies (ONT) sequencers. It assembles raw reads, then builds consensus sequences from the assembly, and scales to large genomes.


Key Features:

  • De novo assembly: Performs de novo sequence assembly specifically for long noisy reads.
  • Raw-read processing: Processes raw reads directly without prior error correction.
  • Consensus construction: Constructs consensus sequences from intermediate assembly outputs.
  • Supported technologies: Handles long-read data from PacBio and Oxford Nanopore Technologies (ONT).
  • Speed: Reported to run 2–17× as fast as previously published long-read assemblers such as Canu and FALCON.
  • Accuracy and contiguity: Achieves contiguity and base accuracy comparable to those of other long-read assemblers.
  • Large-genome capability: Scales to very large genomes, including the human genome and the ~32 Gb axolotl genome.
  • High-throughput scaling: Scales to high-throughput sequencing datasets, which makes population-scale long-read assembly projects practical.

Scientific Applications:

  • Large-genome assembly: Assembly of large and complex genomes such as the human genome and the ~32 Gb axolotl genome.
  • Population-scale assembly: Generation of assemblies for population-scale studies using high-throughput PacBio and ONT datasets.
  • Genomic research: Rapid production of genome assemblies to support comparative genomics and other genomic analyses.

Methodology:

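wtdbg2 works in two stages. It first assembles raw long noisy reads without prior error correction: reads are chopped into 1,024 bp segments, similar segments are merged into a vertex, and vertices are connected according to segment adjacency on the reads. The resulting graph, called a fuzzy Bruijn graph, resembles a de Bruijn graph but tolerates mismatches and gaps. A separate consensus step then constructs consensus sequences from the intermediate assembly output.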
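
The two stages described above (assembly of raw reads, then consensus from the intermediate output) correspond to two command-line programs, wtdbg2 and wtpoa-cns. The Python sketch below only builds the two argument lists; it does not run anything. The flags (-x, -g, -i, -t, -fo), the preset names, and the "<prefix>.ctg.lay.gz" layout file name follow the example in the project README as best recalled, so check them against `wtdbg2 -h` for your installed version. The function name, the platform keys, and the default values are hypothetical and chosen for illustration only.

```python
import shlex

# Preset names for the -x option as given in the wtdbg2 README (assumed
# current; verify with `wtdbg2 -h`). The dictionary keys are hypothetical
# labels invented for this sketch.
PRESETS = {
    "pacbio_rsii": "rs",
    "pacbio_sequel": "sq",
    "pacbio_ccs": "ccs",
    "nanopore": "ont",
}


def build_commands(reads, platform, genome_size, prefix="dbg", threads=16):
    """Return (assemble, consensus) argument lists for a two-stage run.

    Nothing is executed here; pass each list to subprocess.run() to run it.
    """
    if platform not in PRESETS:
        raise ValueError(f"unknown platform: {platform!r}")

    # Stage 1: assemble raw reads into a contig layout file.
    assemble = [
        "wtdbg2",
        "-x", PRESETS[platform],
        "-g", genome_size,        # approximate genome size, e.g. "4.6m"
        "-i", str(reads),
        "-t", str(threads),
        "-fo", prefix,            # output prefix, overwriting existing files
    ]

    # Stage 2: derive consensus from the layout written by stage 1.
    consensus = [
        "wtpoa-cns",
        "-t", str(threads),
        "-i", f"{prefix}.ctg.lay.gz",
        "-fo", f"{prefix}.raw.fa",
    ]
    return assemble, consensus


if __name__ == "__main__":
    for cmd in build_commands("reads.fa.gz", "pacbio_rsii", "4.6m"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists rather than shell strings avoids quoting problems with unusual file names, and the consensus step can be rerun on its own as long as the layout file from the first stage is still present.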

Details

License: GPL-3.0
Maturity: Mature
Cost: Free of charge
Programming Languages: C
Added: 1/14/2020
Last Updated: 11/24/2024

Publications

Ruan J, Li H. Fast and accurate long-read assembly with wtdbg2. Nature Methods. 2019;17(2):155-158. doi:10.1038/s41592-019-0669-3. PMID:31819265. PMCID:PMC7004874.

Funding:

  • U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute: R01HG010040
  • National Natural Science Foundation of China: 31571353, 31822029