

CoMSA

CoMSA is a compression and decompression tool for multiple sequence alignment (MSA) files in FASTA and Stockholm formats. Its algorithm relies on a generalization of the positional Burrows-Wheeler transform to non-binary alphabets. The authors report that it is significantly faster than gzip and produces smaller output: for example, it compresses a 41.6 GB Stockholm file to 1.74 GB, compared with 5.6 GB for gzip. Besides the source code, CoMSA is available as binaries for Windows and Linux.

Topic

Data management

Details

  • Operation: Multiple sequence alignment; Formatting
  • Input: FASTA; Stockholm
  • Output: -
  • Software interface: Command-line user interface
  • Language: C++
  • Operating system: Linux; Microsoft Windows
  • License: -
  • Cost: Free
  • Version name: 1.1
  • Maturity: Stable
  • Credit: -
  • Contact: sebastian.deorowicz _at_ polsl.pl
  • Collection: -

Publications

Deorowicz S, Walczyszyn J, Debudaj-Grabysz A "CoMSA: compression of protein multiple sequence alignment files." Bioinformatics. 2019 Jan 15;35(2):227-234. https://doi.org/10.1093/bioinformatics/bty619
PMID: 30010777


Download and documentation







