dnaasm
dnaasm is a de novo DNA assembler with enhanced restoration of long tandem repeats, designed to reconstruct repetitive genomic regions accurately.
Key Features:
- Tandem Repeat Restoration: Restores long tandem repeats that exceed maximum read lengths and the insert size of paired-end tags.
- Relative Frequency Utilization: Leverages the relative frequency of reads instead of relying solely on mapping paired-end tags to unitigs and estimating distances.
- Single-Read Repeat Reconstruction: Enables reconstruction of repetitive regions covered only by single-read sequencing data.
- De Bruijn Graph Construction: Constructs a de Bruijn graph to represent overlaps between sequencing reads.
- Graph Correction: Applies graph correction to resolve sequencing errors and graph ambiguities.
- Edge Weight Normalization: Normalizes edge weights to reflect read frequencies for improved handling of repeats.
- Output Generation: Produces assembled DNA sequences from the processed graph.
- Validation on Bacterial Data: Has been tested on real bacterial datasets to assess performance on repetitive genomic regions.
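The graph construction and correction features listed above can be illustrated with a minimal Python sketch. It is not dnaasm's own code: the function names `build_de_bruijn` and `drop_weak_edges`, the toy reads, and the `min_weight` threshold are assumptions made for illustration, and pruning low-weight edges is only one simple, commonly used correction heuristic.

```python
from collections import Counter


def build_de_bruijn(reads, k):
    """Count every k-mer in the reads as a weighted edge (prefix -> suffix).

    Nodes are (k-1)-mers; the edge weight is the number of times the
    k-mer occurs across all reads.
    """
    edges = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return edges


def drop_weak_edges(edges, min_weight):
    """Keep only edges seen at least min_weight times (hypothetical threshold)."""
    return {edge: w for edge, w in edges.items() if w >= min_weight}


# Toy input: the third read carries a single-base error (GTT instead of GTA).
reads = ["ACGTAC", "CGTACG", "ACGTTC"]
edges = build_de_bruijn(reads, k=3)
pruned = drop_weak_edges(edges, min_weight=2)

for (src, dst), weight in sorted(pruned.items()):
    print(f"{src} -> {dst}  weight={weight}")
```

In this toy input the two edges introduced by the erroneous read each occur once, so a threshold of 2 removes them while keeping the four edges supported by several reads.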
Scientific Applications:
- Microbial Genomics: Enables assembly of bacterial genomes containing long tandem repeats.
- Repetitive Region Analysis: Facilitates characterization and reconstruction of tandem repeats in genomes.
- Evolutionary Biology: Supports studies of repeat-driven genome evolution by providing accurate repeat sequences.
- Genetic Engineering: Assists design and verification of constructs where repetitive sequences are relevant.
Methodology:
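dnaasm builds a de Bruijn graph from the sequencing reads, corrects the graph to remove sequencing errors and ambiguities, and normalizes edge weights by read frequency. It then uses this relative read-frequency information, rather than relying solely on paired-end distance estimates, to restore tandem repeats, and generates the assembled DNA sequences from the processed graph.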
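As a rough illustration of how normalized edge weights can hint at repeat copy numbers, the sketch below divides each edge weight by the median weight and rounds the result. This is an assumption for illustration, not the formula dnaasm uses: the names `normalize_weights` and `estimate_copies` are hypothetical, the weights are invented toy values, and taking the median as single-copy coverage only holds when most edges lie outside repeats.

```python
from statistics import median


def normalize_weights(edges):
    """Divide each edge weight by the median weight (assumed single-copy coverage)."""
    base = median(edges.values())
    return {edge: weight / base for edge, weight in edges.items()}


def estimate_copies(normalized):
    """Round normalized weights to integer copy-number estimates (at least 1)."""
    return {edge: max(1, round(w)) for edge, w in normalized.items()}


# Toy weights: three edges at baseline coverage, two edges inside a repeat.
edges = {
    ("AC", "CG"): 10,
    ("CG", "GT"): 10,
    ("GT", "TA"): 10,
    ("TA", "AT"): 30,
    ("AT", "TA"): 30,
}

normalized = normalize_weights(edges)
copies = estimate_copies(normalized)

for (src, dst), n in copies.items():
    print(f"{src} -> {dst}  estimated copies={n}")
```

Here the two edges with three times the baseline weight are estimated to be traversed three times, which is the kind of signal that allows a repeat longer than any single read to be unrolled.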
Details
- License: LGPL-3.0
- Tool Type: web application
- Operating Systems: Linux, Windows
- Programming Languages: C++, Python
- Added: 7/28/2018
- Last Updated: 11/25/2024
Publications
Kuśmirek W, Nowak R. De novo assembly of bacterial genomes with repetitive DNA regions by dnaasm application. BMC Bioinformatics. 2018;19(1). doi:10.1186/s12859-018-2281-4. PMID:30021513. PMCID:PMC6052550.