SSAKE

SSAKE assembles short DNA sequence reads from high-throughput sequencing technologies such as Solexa into longer contiguous sequences to enable de novo genome reconstruction and characterization.

Key Features:

Aggressive Assembly Strategy: Progressively searches to identify and extend the longest possible k-mer overlaps between reads to maximize use of short sequences (e.g., 25-nucleotide Solexa reads).
Prefix Tree and Hash Table: Stores sequence data in a hash table and leverages a prefix tree data structure to efficiently manage and search overlaps among reads.
High-Throughput Compatibility: Designed to process millions of short reads simultaneously to accommodate high-throughput sequencing datasets.
Stringent Assembly for Identical Sequences: Emphasizes stringent assembly of highly identical sequences to mitigate ambiguity introduced by ubiquitous genomic repeats.
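
To make the hash table and prefix tree pairing more concrete, here is a minimal Python sketch. It is an illustration only, not SSAKE's Perl code: the names `PrefixTree`, `reads_with_prefix`, and `build_index` are hypothetical, and the layout (a read-count hash table plus a character-level trie) is an assumed simplification of the design described above.

```python
from collections import Counter


class PrefixTree:
    """Toy prefix tree (trie) over reads; a hypothetical stand-in, not SSAKE's structure."""

    def __init__(self):
        self.root = {}

    def insert(self, read):
        node = self.root
        for base in read:
            node = node.setdefault(base, {})
        # "$" marks the node where a complete read ends.
        node["$"] = read

    def reads_with_prefix(self, prefix):
        """Return every stored read that starts with `prefix`, sorted."""
        node = self.root
        for base in prefix:
            if base not in node:
                return []
            node = node[base]
        # Walk the subtree below the prefix and collect all complete reads.
        found, stack = [], [node]
        while stack:
            current = stack.pop()
            for key, child in current.items():
                if key == "$":
                    found.append(child)
                else:
                    stack.append(child)
        return sorted(found)


def build_index(reads):
    """Build the hash table (read -> occurrence count) and the prefix tree."""
    counts = Counter(reads)
    tree = PrefixTree()
    for read in counts:  # each distinct read is inserted once
        tree.insert(read)
    return counts, tree
```

With such an index, finding the reads that could extend a sequence reduces to one prefix lookup, for example `tree.reads_with_prefix("ACG")`.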

Scientific Applications:

De novo sequencing projects: Assembles short reads into longer contigs to characterize novel genomic targets.
Genome assembly: Facilitates construction of larger contiguous genomic sequences from fragmented short-read data.
Variant detection: Improves resolution of genomic regions to aid identification of genetic variants.
Structural genomics: Assists analysis of structural variation within genomes by producing longer assembled sequences.

Methodology:

Sequence reads are stored in a hash table, a prefix tree is used to index and search reads, and the algorithm progressively extends the longest k-mer overlaps between sequences.
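
The extension step can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published implementation: `extend_contig` and `min_overlap` are hypothetical names, a plain dictionary keyed by read prefixes stands in for the prefix tree, only 3' extension on one strand is shown, and extension simply stops when more than one read matches the same overlap.

```python
def extend_contig(seed, reads, min_overlap=4):
    """Greedily extend `seed` at its 3' end, trying the longest overlap first."""
    distinct = set(reads)
    max_len = max((len(r) for r in distinct), default=0)

    # Hash table: prefix -> reads starting with it (stands in for the prefix tree).
    by_prefix = {}
    for read in distinct:
        for k in range(min_overlap, len(read)):
            by_prefix.setdefault(read[:k], set()).add(read)

    contig = seed
    used = {seed}
    extended = True
    while extended:
        extended = False
        # Search progressively shorter suffixes of the contig.
        max_k = min(len(contig), max_len - 1)
        for k in range(max_k, min_overlap - 1, -1):
            candidates = by_prefix.get(contig[-k:], set()) - used
            if not candidates:
                continue
            if len(candidates) > 1:
                # Assumed stringency rule: ambiguous overlap, stop extending.
                return contig
            read = candidates.pop()
            contig += read[k:]  # append only the non-overlapping tail
            used.add(read)
            extended = True
            break
    return contig
```

For example, `extend_contig("ACGTAC", ["ACGTAC", "GTACGG", "ACGGTT"])` chains the three reads through two 4-base overlaps, while two reads that both start with `GTAC` but continue differently leave the seed unextended.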


Topics

Sequencing, Sequence assembly

Details

License: GPL-2.0
Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: Perl
Added: 1/13/2017
Last Updated: 11/25/2024

Publications

Warren RL, Sutton GG, Jones SJM, Holt RA. Assembling millions of short DNA sequences using SSAKE. Bioinformatics. 2007;23(4):500-501. doi:10.1093/bioinformatics/btl629. PMID:17158514. PMCID:PMC7109930.

