dnaasm

dnaasm is a de novo DNA assembler that restores long tandem repeats so that repetitive genomic regions are reconstructed accurately.


Key Features:

  • Tandem Repeat Restoration: Restores long tandem repeats that exceed maximum read lengths and the insert size of paired-end tags.
  • Relative Frequency Utilization: Leverages the relative frequency of reads instead of relying solely on mapping paired-end tags to unitigs and estimating distances.
  • Single-Read Repeat Reconstruction: Enables reconstruction of repetitive regions covered only by single-read sequencing data.
  • De Bruijn Graph Construction: Constructs a de Bruijn graph to represent overlaps between sequencing reads.
  • Graph Correction: Applies graph correction to resolve sequencing errors and graph ambiguities.
  • Edge Weight Normalization: Normalizes edge weights to reflect read frequencies for improved handling of repeats.
  • Output Generation: Produces assembled DNA sequences from the processed graph.
  • Validation on Bacterial Data: Has been tested on real bacterial datasets to assess performance on repetitive genomic regions.
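
The graph construction step listed above can be illustrated with a minimal Python sketch. It is not dnaasm's implementation: the function name `build_de_bruijn`, the toy reads, and the choice of k = 3 are assumptions made for illustration. Each k-mer becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix, and the edge weight is the number of times that k-mer occurs in the reads.

```python
from collections import Counter


def build_de_bruijn(reads, k):
    """Return a mapping {(prefix, suffix): count} over all k-mers in the reads.

    Hypothetical helper for illustration only; not part of dnaasm.
    """
    edges = Counter()
    for read in reads:
        # Slide a window of length k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the (k-1)-mer prefix to the (k-1)-mer suffix.
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges


# Toy input: two overlapping reads (assumed data).
reads = ["ACGTAC", "CGTACG"]
graph = build_de_bruijn(reads, k=3)

for (src, dst), weight in sorted(graph.items()):
    print(f"{src} -> {dst}  weight={weight}")
```

In this toy input every 3-mer occurs in both reads, so each edge carries the same weight. In real data, edges inside a repeat would be expected to carry higher weights than edges in unique sequence.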

Scientific Applications:

  • Microbial Genomics: Enables assembly of bacterial genomes containing long tandem repeats.
  • Repetitive Region Analysis: Facilitates characterization and reconstruction of tandem repeats in genomes.
  • Evolutionary Biology: Supports studies of repeat-driven genome evolution by providing accurate repeat sequences.
  • Genetic Engineering: Assists design and verification of constructs where repetitive sequences are relevant.

Methodology:

dnaasm builds a de Bruijn graph from the sequencing reads, corrects the graph, normalizes edge weights according to read frequency, and generates assembled DNA sequences from the processed graph. To resolve repeats, it relies on relative read frequency rather than on distance estimates derived from paired-end tags.
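
One way to picture normalizing edge weights by read frequency is the Python sketch below. It is a simplified illustration under stated assumptions, not the procedure dnaasm uses: it assumes most edges come from single-copy sequence, so the median edge weight approximates single-copy coverage, and it treats each weight divided by that baseline as a rough copy-number estimate. The function name `normalize_edge_weights` and the example weights are hypothetical.

```python
from statistics import median


def normalize_edge_weights(edges):
    """Map each edge to an estimated copy number.

    Assumption: the median weight reflects single-copy coverage.
    Hypothetical helper for illustration only; not part of dnaasm.
    """
    baseline = median(edges.values())
    # Round the coverage ratio and never report fewer than one copy.
    return {edge: max(1, round(weight / baseline))
            for edge, weight in edges.items()}


# Assumed raw weights: two edges inside a repeat, three in unique sequence.
raw_weights = {
    ("AC", "CA"): 30,
    ("CA", "AC"): 29,
    ("GG", "GA"): 10,
    ("GA", "AC"): 11,
    ("CA", "AT"): 10,
}
copies = normalize_edge_weights(raw_weights)

for edge, n in copies.items():
    print(edge, n)
```

With these assumed weights, the two high-coverage edges come out at roughly three copies and the rest at one. An assembler could use such estimates to decide how many times to traverse a repeat cycle, without needing paired-end distance information.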

Details

License: LGPL-3.0
Tool Type: web application
Operating Systems: Linux, Windows
Programming Languages: C++, Python
Added: 7/28/2018
Last Updated: 11/25/2024

Publications

Kuśmirek W, Nowak R. De novo assembly of bacterial genomes with repetitive DNA regions by dnaasm application. BMC Bioinformatics. 2018;19(1). doi:10.1186/s12859-018-2281-4. PMID:30021513. PMCID:PMC6052550.

Funding: Polish National Science Centre (2014/13/B/NZ6/00881)