SFA-SPA

SFA-SPA reconstructs protein sequences from short peptides identified on nucleotide reads in a metagenomic dataset. The algorithm has been improved for computational efficiency, so it can reconstruct proteins from large metagenomic datasets containing several hundred million reads while maintaining accuracy. The improvements come from two changes: a suffix array data structure for fast querying during assembly, and a redesign of the assembly steps for multi-threaded execution.
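
To illustrate why a suffix array makes querying fast, here is a minimal sketch in Python. It is not SFA-SPA's implementation (the tool itself is written in C++ and Perl). The function names, the `$` separator, and the example peptides are all assumptions made for illustration, and the naive construction shown would not scale to real metagenomic data. The sketch indexes a set of short peptides once, then finds every occurrence of a query substring by binary search instead of scanning each peptide.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order.

    Naive construction by sorting suffixes directly; illustration only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def _bound(text, suffix_array, pattern, upper):
    """Binary search over suffixes, comparing only their first len(pattern) chars.

    With upper=False, returns the first index whose prefix is >= pattern.
    With upper=True, returns the first index whose prefix is > pattern.
    """
    m = len(pattern)
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        prefix = text[start:start + m]
        if prefix < pattern or (upper and prefix == pattern):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_occurrences(text, suffix_array, pattern):
    """Return the sorted start positions at which pattern occurs in text."""
    first = _bound(text, suffix_array, pattern, upper=False)
    last = _bound(text, suffix_array, pattern, upper=True)
    return sorted(suffix_array[first:last])


# Hypothetical short peptides, joined with a separator that sorts
# before every amino acid letter so matches cannot span two peptides.
peptides = ["MKVLA", "KVLAG", "VLAGT"]
text = "$".join(peptides) + "$"
suffix_array = build_suffix_array(text)

print(find_occurrences(text, suffix_array, "VLA"))  # [2, 7, 12]
```

Each query costs a logarithmic number of string comparisons in the size of the index, which is what makes repeated overlap lookups during assembly cheap once the index is built.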

Topic

Metagenomics;Microbial ecology;Gene and protein families;Molecular interactions, pathways and networks

Detail

  • Operation: Sequence assembly

  • Software interface: Command-line user interface

  • Language: C++;Perl

  • License: GNU General Public License v3

  • Cost: Free

  • Version name: -

  • Credit: National Science Foundation

  • Input: -

  • Output: -

  • Contact: yyang.czhong.syooseph@jcvi.org

  • Collection: -

  • Maturity: -

Publications

  • Yang Y, et al. SFA-SPA: a suffix array based short peptide assembler for metagenomic data. Bioinformatics. 2015; 31:1833-5. doi: 10.1093/bioinformatics/btv052
  • https://doi.org/10.1093/bioinformatics/btv052
  • PMID: 25637561
  • PMC: -
