SFA-SPA
The software tool SFA-SPA reconstructs protein sequences from short peptides identified in the nucleotide reads of a metagenomic dataset. The algorithm has been improved for computational efficiency, so that proteins can be reconstructed from large metagenomic datasets containing several hundred million reads without loss of accuracy. The improvements come from two changes: a suffix array data structure for fast querying during assembly, and a redesign of the assembly steps for multi-threaded execution.
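To illustrate why a suffix array makes querying fast, here is a minimal sketch in Python. It is not SFA-SPA's implementation (the tool itself is written in C++ and Perl). The function names, the `$` separator, and the toy peptides are assumptions made for illustration only. The sketch sorts all suffixes of the concatenated peptides once, after which every occurrence of a query substring can be located by binary search.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in lexicographic order."""
    # Naive construction by sorting suffix slices; acceptable for a toy input only.
    # Real assemblers use far more efficient construction algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions at which pattern occurs in text."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with pattern.
    return sorted(sa[start:lo])


# Hypothetical toy peptides, joined with '$' so matches cannot span two peptides.
peptides = ["MKVLA", "KVLAG", "VLAGT"]
text = "".join(p + "$" for p in peptides)
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "VLA"))
```

Each query costs a logarithmic number of string comparisons in the length of the indexed text, which is what makes repeated overlap lookups during assembly cheap once the index has been built.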
Topic
Metagenomics;Microbial ecology;Gene and protein families;Molecular interactions, pathways and networks
Detail
Operation: Sequence assembly
Software interface: Command-line user interface
Language: C++;Perl
License: GNU General Public License v3
Cost: Free
Version name: -
Credit: National Science Foundation
Input: -
Output: -
Contact: yyang.czhong.syooseph@jcvi.org
Collection: -
Maturity: -
Publications
- Yang Y, et al. SFA-SPA: a suffix array based short peptide assembler for metagenomic data. Bioinformatics. 2015; 31:1833-5. doi: 10.1093/bioinformatics/btv052
- https://doi.org/10.1093/bioinformatics/btv052
- PMID: 25637561
- PMC: -
Download and documentation