SOAPdenovo

SOAPdenovo is a de novo genome assembler that builds contigs and scaffolds from short reads generated by next-generation massively parallel DNA sequencing technologies.


Key Features:

  • De novo assembly from short reads: Assembles genomes without a reference sequence, using short reads generated by next-generation massively parallel DNA sequencing technologies.
  • Contig and scaffold construction: Constructs contiguous sequences (contigs) and larger genomic scaffolds from short-read data.
  • Performance metrics: In the original publication, the assemblies reached N50 contig sizes of 7.4 kb for the Asian human genome and 5.9 kb for the African human genome, with scaffold N50 sizes of 446.3 kb and 61.9 kb, respectively.
  • Short-read data management: Handles ultrahigh-throughput data whose very short read lengths complicate genome assembly.
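
The N50 values quoted above summarize assembly contiguity: N50 is the length L such that sequences of length L or longer contain at least half of the total assembled bases. The following is a minimal Python sketch of that calculation. It is not part of SOAPdenovo, and the function name `n50` and the example lengths are hypothetical, chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum covers half of
        # all assembled bases is the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (total = 54, so half = 27).
contig_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(contig_lengths))  # prints 8, because 10 + 9 + 8 = 27
```

A larger N50 indicates that more of the assembly sits in long sequences, which is why the scaffold values above are much larger than the contig values.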

Scientific Applications:

  • Reference genome construction: Enables construction of reference genomes for previously unexplored or poorly characterized organisms.
  • Cost-effective genome assembly: Offers a cost-effective way to assemble genomes when long-read sequencing is impractical because of cost or technical limitations.
  • Large-genome assembly from short reads: Facilitates assembly of large genomes using short-read sequencing data for genomic studies.
  • Downstream genomic analyses: Produces assemblies that support downstream genomic analyses and comparative studies.

Methodology:

Assembles short reads generated by next-generation massively parallel DNA sequencing technologies into contigs, then joins the contigs into larger scaffolds.
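
The cited publication describes a de Bruijn graph approach to contig construction. As a rough illustration of that general idea only, the toy Python sketch below splits reads into k-mers, links overlapping (k-1)-mers, and merges unbranched paths into contigs. It is not SOAPdenovo's implementation: the function names `build_graph` and `assemble_contigs` are hypothetical, and the sketch assumes error-free, single-strand reads with no scaffolding step.

```python
from collections import defaultdict


def build_graph(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in a read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def assemble_contigs(reads, k):
    """Merge unbranched paths of the toy de Bruijn graph into contigs.

    Simplifications (assumptions of this sketch): no error correction,
    no reverse complements, and isolated cycles are ignored.
    """
    graph = build_graph(reads, k)
    indegree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            indegree[node] += 1
    nodes = set(graph) | set(indegree)

    def is_simple(node):
        # Exactly one way in and one way out: safe to merge through.
        return indegree[node] == 1 and len(graph.get(node, ())) == 1

    contigs = []
    for node in sorted(nodes):
        if is_simple(node):
            continue  # contigs start only at path ends or branch points
        for nxt in sorted(graph.get(node, ())):
            sequence = node + nxt[-1]
            while is_simple(nxt):
                nxt = next(iter(graph[nxt]))
                sequence += nxt[-1]
            contigs.append(sequence)
    return sorted(contigs)


# Hypothetical overlapping reads drawn from the sequence ATGGCGTGCAAT.
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
print(assemble_contigs(reads, k=4))  # prints ['ATGGCGTGCAAT']
```

Where the graph branches, for example at a repeat or a sequence variant, the walk stops and separate contigs are emitted, which is one reason short reads tend to yield fragmented assemblies.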

Details

License: GPL-3.0
Maturity: Mature
Tool Type: workflow
Operating Systems: Linux, Mac
Programming Languages: C++, C
Added: 1/13/2017
Last Updated: 11/25/2024

Publications

Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, Li S, Yang H, Wang J, Wang J. De novo assembly of human genomes with massively parallel short read sequencing. Genome Research. 2009;20(2):265-272. doi:10.1101/gr.097261.109. PMID:20019144. PMCID:PMC2813482.
