G-SQZ
G-SQZ is a software tool for compressing high-throughput sequencing data without altering the relative order of the reads, which makes selective access faster and easier. It uses a Huffman coding-based representation scheme that has achieved 65% to 81% compression on benchmark datasets. By shrinking stored files, G-SQZ reduces the infrastructure and informatics costs of managing and analyzing large sequencing datasets. It is free to download and use for academic/non-profit purposes; a license can be requested for for-profit use.
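The idea of a Huffman coding-based scheme that keeps reads in order and allows selective access can be illustrated with a minimal sketch. This is not G-SQZ's actual implementation or file format: it assumes, for illustration only, that each symbol is a (base, quality) pair and that each read is encoded as its own bit string. The function names build_huffman_codes, encode_reads, and decode_read are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_codes(symbols):
    """Return a dict mapping each symbol to a prefix-free bit string."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(n, next(tiebreak), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        n1, _, codes1 = heapq.heappop(heap)
        n2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (n1 + n2, next(tiebreak), merged))
    return heap[0][2]


def encode_reads(reads):
    """Encode (bases, qualities) reads; assumption: symbols are (base, quality) pairs."""
    symbols = [pair for bases, quals in reads for pair in zip(bases, quals)]
    codes = build_huffman_codes(symbols)
    # One bit string per read, kept in the original read order.
    encoded = ["".join(codes[p] for p in zip(b, q)) for b, q in reads]
    return codes, encoded


def decode_read(bits, codes):
    """Decode a single read without touching any other read."""
    inverse = {code: sym for sym, code in codes.items()}
    bases, quals, buf = [], [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:  # prefix-free codes make this match unambiguous
            base, qual = inverse[buf]
            bases.append(base)
            quals.append(qual)
            buf = ""
    return "".join(bases), "".join(quals)


# Hypothetical toy data: (bases, quality string) per read.
reads = [("ACGT", "IIII"), ("AACC", "II#I"), ("GGTA", "#III")]
codes, encoded = encode_reads(reads)
second_read = decode_read(encoded[1], codes)  # selective access to read 1 only
```

Because each read is stored as a separate bit string in its original position, any one read can be decoded on its own, which mirrors the selective-access property described above. A real tool would also need to pack the bits into bytes and store an index of read offsets, both of which are omitted here.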
Topic
Data management;Bioinformatics;Applied mathematics
Detail
Operation: Optimisation and refinement;Formatting
Software interface: Web user interface
Language: Huffman coding-based, sequencing-read-specific representation scheme that compresses data without altering the relative order. It allows selective access without scanning and decoding from the start;Web application;Python
License: -
Cost: Free for academic/non-profit purposes
Version name: -
Credit: -
Input: -
Output: -
Contact: wtembe@tgen.org
Collection: -
Maturity: -
Publications
- G-SQZ: compact encoding of genomic sequence and quality data.
- Tembe W, et al. G-SQZ: compact encoding of genomic sequence and quality data. Bioinformatics. 2010; 26:2192-4. doi: 10.1093/bioinformatics/btq346
- https://doi.org/10.1093/bioinformatics/btq346
- PMID: 20605925
- PMC: -
Download and documentation
Currently not available or not maintained.