SANDPUMA

SANDPUMA is an ensemble algorithm for predicting the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences. It incorporates prediCAT, a phylogenetics-inspired algorithm that quantitatively estimates how predictable each A-domain is. Benchmarking on a newly gathered, independent test set of 434 A-domain sequences showed that active-site-motif-based algorithms outperform whole-domain-based methods. SANDPUMA is integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool.

Topic

Genomics;Microbiology;Protein folds and structural domains

Detail

  • Operation: Protein domain recognition

  • Software interface: Command-line user interface

  • Language: Perl;Python

  • License: GNU General Public License v3

  • Cost: Free

  • Version name: -

  • Credit: National Institutes of Health National Research Service Award, VENI grant from The Netherlands Organization for Scientific Research (NWO).

  • Input: faa, fna, gbk

  • Output: -

  • Contact: chevrette@wisc.edu;marnix.medema@wur.nl

  • Collection: -

  • Maturity: -

Publications

  • Chevrette MG, et al. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria. Bioinformatics. 2017; 33:3202-3210. doi: 10.1093/bioinformatics/btx400
  • https://doi.org/10.1093/bioinformatics/btx400
  • PMID: 28633438
  • PMC: PMC5860034

