KmerGenie
KmerGenie estimates the optimal k-mer length for de Bruijn graph-based genome assemblers from sequencing reads to improve assembly quality and accuracy.
Key Features:
- k-mer abundance histograms: Computes approximate abundance histograms for multiple values of k using a fast sampling method.
- Optimization heuristic: Applies a heuristic that predicts the number of distinct genomic k-mers for each candidate k and selects the k that maximizes this predicted count.
- Performance and validation: Tested on diverse sequencing datasets, where the k values it selected led to improved genome assemblies.
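The histogram feature above can be illustrated with a minimal Python sketch. It is not KmerGenie's implementation: the function names, the CRC32-based sampling rule, and the `one_in` parameter are assumptions made for illustration. The idea shown is that a k-mer is kept or discarded based on a hash of its sequence, so every occurrence of a kept k-mer is counted and the histogram keeps its shape while only a fraction of distinct k-mers is stored.

```python
from collections import Counter
import zlib

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)

def sampled_abundance_histogram(reads, k, one_in=1):
    """Map abundance -> number of distinct sampled k-mers seen that many times.

    one_in is a hypothetical sampling parameter: roughly one in `one_in`
    distinct k-mers is kept (one_in=1 keeps all of them).
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = canonical(read[i:i + k])
            # Sampling by hash keeps or drops all occurrences of a k-mer together.
            if zlib.crc32(kmer.encode()) % one_in == 0:
                counts[kmer] += 1
    return Counter(counts.values())

# Toy reads, invented for illustration only.
reads = ["ACGTACGTAC", "CGTACGTACG"]
histograms = {k: sampled_abundance_histogram(reads, k) for k in (3, 4, 5)}
```

With sampling enabled (`one_in` greater than 1), the per-abundance counts would need to be multiplied by roughly `one_in` to approximate the full histogram.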
Scientific Applications:
- De novo genome assembly: Optimizes k for de Bruijn graph-based assemblers to improve the quality and accuracy of assembled genomes from sequencing reads.
Methodology:
KmerGenie computes approximate k-mer abundance histograms for multiple values of k by fast sampling, uses a heuristic to predict the number of distinct genomic k-mers for each k, and selects the k that maximizes this predicted count.
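The selection step can be sketched as follows, under loud assumptions. The predictor used here is a crude stand-in (count distinct k-mers whose abundance reaches a fixed cutoff, treating rarer ones as likely sequencing errors), not the heuristic KmerGenie itself applies. The function names, the `min_abundance` cutoff, and the histogram numbers are all hypothetical.

```python
def estimate_genomic_kmers(histogram, min_abundance=2):
    """Crude stand-in predictor: distinct k-mers with abundance >= min_abundance."""
    return sum(n for abundance, n in histogram.items() if abundance >= min_abundance)

def choose_k(histograms, min_abundance=2):
    """Return the k with the largest predicted count, plus all per-k estimates."""
    estimates = {
        k: estimate_genomic_kmers(hist, min_abundance)
        for k, hist in histograms.items()
    }
    best_k = max(estimates, key=estimates.get)
    return best_k, estimates

# Made-up abundance histograms (abundance -> number of distinct k-mers),
# not derived from any real dataset.
toy_histograms = {
    21: {1: 900, 10: 400, 20: 50},
    31: {1: 1200, 8: 480, 16: 30},
    41: {1: 1600, 5: 430, 10: 10},
}

best_k, estimates = choose_k(toy_histograms)
print(best_k, estimates)
```

In this toy input the middle value of k wins because it has the most distinct k-mers above the cutoff; any predictor that returns one number per k could be swapped in without changing the argmax step.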
Details
- Maturity: Mature
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: R, C++, Python
- Added: 1/13/2017
- Last Updated: 11/25/2024
Publications
Chikhi R, Medvedev P. Informed and automated k-mer size selection for genome assembly. Bioinformatics. 2014;30(1):31-37. doi:10.1093/bioinformatics/btt310. PMID: 23732276.