SANDPUMA
SANDPUMA is an ensemble algorithm that predicts the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences. It is based on prediCAT, a phylogenetics-inspired algorithm that quantitatively estimates how predictable each A-domain is. SANDPUMA was benchmarked on a newly gathered, independent test set of 434 A-domain sequences; the benchmark showed that active-site-motif-based algorithms outperform whole-domain-based methods. SANDPUMA is integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool.
Topic
Genomics;Microbiology;Protein folds and structural domains
Detail
Operation: Protein domain recognition
Software interface: Command-line user interface
Language: Perl;Python
License: GNU General Public License v3
Cost: Free
Version name: -
Credit: National Institutes of Health National Research Service Award, VENI grant from The Netherlands Organization for Scientific Research (NWO).
Input: faa (protein FASTA), fna (nucleotide FASTA), gbk (GenBank)
Output: -
Contact: chevrette@wisc.edu;marnix.medema@wur.nl
Collection: -
Maturity: -
Publications
- SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria.
- Chevrette MG, et al. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria. Bioinformatics. 2017; 33:3202-3210. doi: 10.1093/bioinformatics/btx400
- https://doi.org/10.1093/bioinformatics/btx400
- PMID: 28633438
- PMC: PMC5860034
Download and documentation
Documentation: https://bitbucket.org/chevrm/sandpuma/src/master/
Home page: https://bitbucket.org/chevrm/sandpuma
Links: https://bitbucket.org/chevrm/sandpuma/src/master/examples/