VCAKE
VCAKE is a de novo assembler for short reads. It builds contigs by k-mer extension and verifies each consensus base against high-depth coverage, which lets it correct sequencing errors during assembly.
Key Features:
- Error correction via high-depth coverage: Leverages high sequencing depth to identify and correct errors characteristic of short-read technologies such as Illumina's Solexa Sequencing (≈30 bp reads).
- K-mer extension-based consensus verification: Constructs assemblies by extending k-mers and verifying consensus sequences through repeated observations across high-coverage reads.
- Modification of prior k-mer extension algorithms: Implements a simple modification of earlier k-mer extension approaches (e.g., SSAKE) to improve assembly accuracy in the presence of errors.
- Validated on diverse datasets: Demonstrates improved assembly performance on both simulated and experimental datasets containing sequencing errors.
- Targeted for small-genome assembly: Optimized for de novo assembly of organisms with small genomes using large numbers of short reads.
Scientific Applications:
- De novo genome assembly (small genomes): Produces accurate assemblies from short, high-coverage reads for small-genome sequencing projects.
- Evolutionary biology: Enables generation of reliable genome assemblies for comparative and evolutionary analyses.
- Microbiology: Supports assembly of microbial genomes from short-read sequencing data with high error rates.
- Genetics: Provides accurate sequence reconstructions for downstream genetic analyses and variant discovery in small-genome contexts.
Methodology:
Uses a modified k-mer extension algorithm (building on SSAKE) with consensus verification by repeated observation across high-depth reads to identify and correct sequencing errors.
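The extension-with-voting idea described above can be illustrated with a minimal Python sketch. This is not VCAKE's actual implementation (the tool itself is written in Perl and C), and it is only one plausible reading of "consensus verification by repeated observation". The function name `extend_contig` and its parameters and thresholds (`min_overlap`, `min_votes`, `min_fraction`, `max_len`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def extend_contig(seed, reads, min_overlap=4, min_votes=2,
                  min_fraction=0.6, max_len=1000):
    """Greedily extend `seed` to the right, one base per step, by majority vote.

    Illustrative sketch only: names and thresholds are assumptions,
    not VCAKE's real parameters.
    """
    contig = seed
    while len(contig) < max_len:
        votes = Counter()
        for read in reads:
            # Try the longest possible overlap first: the read's prefix must
            # equal the contig's suffix, leaving at least one unread base.
            longest = min(len(read) - 1, len(contig))
            for olap in range(longest, min_overlap - 1, -1):
                if contig.endswith(read[:olap]):
                    votes[read[olap]] += 1  # this read votes for the next base
                    break
        if not votes:
            break  # no read overlaps the contig end
        base, count = votes.most_common(1)[0]
        total = sum(votes.values())
        # Accept the base only if it was observed often enough and clearly
        # outvotes the alternatives; a lone erroneous read is outvoted.
        if count < min_votes or count / total < min_fraction:
            break
        contig += base
    return contig


# Toy data: every 8-mer of a short sequence at 3x depth, plus one read
# whose final base is a simulated sequencing error.
genome = "ACGTTGCAAGGCTTACGGATC"
reads = [genome[i:i + 8] for i in range(len(genome) - 7)] * 3
reads.append("TGCAAGGA")  # correct read would be "TGCAAGGC"
print(extend_contig(genome[:8], reads))
```

In this toy run the erroneous read casts a single vote for the wrong base at one position, while the correct reads covering that position outvote it. That is the sense in which high coverage allows errors to be identified during extension.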
Details
- License: GPL-3.0
- Maturity: Legacy
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: Perl, C
- Added: 1/13/2017
- Last Updated: 11/25/2024
Publications
Jeck WR, Reinhardt JA, Baltrus DA, Hickenbotham MT, Magrini V, Mardis ER, Dangl JL, Jones CD. Extending assembly of short DNA sequences to handle error. Bioinformatics. 2007;23(21):2942-2944. doi:10.1093/bioinformatics/btm451. PMID:17893086.