Boiler

Boiler is a software tool that compresses large RNA-seq alignment datasets and supports queries on the compressed data. It discards most per-read data and retains only essential information, such as genomic coverage vectors and a few empirical distributions, which gives it a significantly smaller storage footprint than other compression tools. Although the compression is lossy, Boiler recovers the most relevant per-read data, so the results of downstream tools for isoform assembly and quantification are only minimally affected. Users can also run fast queries on the compressed files without fully decompressing them.
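
To make the idea of a coverage vector concrete, here is a minimal Python sketch. It is an illustration only and not Boiler's actual code or file format: the function names (coverage_vector, run_length_encode, mean_coverage), the half-open (start, end) read intervals, and the run-length encoding step are all assumptions chosen for this example. It shows how per-read intervals can be collapsed into a per-base coverage vector, stored compactly as runs, and queried without expanding the runs again.

```python
from itertools import groupby
from typing import List, Tuple


def coverage_vector(reads: List[Tuple[int, int]], length: int) -> List[int]:
    """Per-base coverage from half-open (start, end) read intervals.

    Assumes 0 <= start < end <= length for every read.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    # A running sum of the differences gives the depth at each position.
    cov, depth = [], 0
    for delta in diff[:length]:
        depth += delta
        cov.append(depth)
    return cov


def run_length_encode(cov: List[int]) -> List[Tuple[int, int]]:
    """Collapse the vector into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(cov)]


def mean_coverage(runs: List[Tuple[int, int]], start: int, end: int) -> float:
    """Mean coverage over [start, end), computed directly on the runs."""
    total, pos = 0, 0
    for value, count in runs:
        # Number of bases this run shares with the query interval.
        overlap = min(end, pos + count) - max(start, pos)
        if overlap > 0:
            total += value * overlap
        pos += count
    return total / (end - start)


# Hypothetical toy data: four reads on a 10-base region.
reads = [(0, 4), (2, 6), (2, 6), (8, 10)]
cov = coverage_vector(reads, 10)
runs = run_length_encode(cov)
print(cov)
print(runs)
print(mean_coverage(runs, 2, 6))
```

Note that the individual read boundaries are no longer present in the runs. That is the lossy part: in this sketch nothing is stored to recover them, whereas the description above says Boiler keeps a few empirical distributions alongside the coverage vectors for that purpose.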

Topic

RNA-seq;Nucleic acid structure analysis;Data management

Detail

  • Operation: Sequence alignment;Formatting

  • Software interface: Library

  • Language: Python

  • License: MIT License

  • Cost: Free of charge with restrictions

  • Version name: v1.0.0

  • Credit: Sloan Research Fellowship, NSF, National Institute of General Medical Sciences.

  • Input: -

  • Output: -

  • Contact: Langmead B langmea@cs.jhu.edu

  • Collection: -

  • Maturity: -

Publications

  • Boiler: lossy compression of RNA-seq alignments using coverage vectors.
  • Pritt J and Langmead B. Boiler: lossy compression of RNA-seq alignments using coverage vectors. Nucleic Acids Research. 2016; 44:e133. doi: 10.1093/nar/gkw540
  • https://doi.org/10.1093/nar/gkw540
  • PMID: 27298258
  • PMC: PMC5027496

Download and documentation
