dSPRINT

dSPRINT predicts DNA-, RNA-, ion-, peptide-, and small-molecule-binding sites within protein domains in order to characterize ligand-binding properties across human protein domain families.


Key Features:

  • Comprehensive Ligand Prediction: Predicts whether a protein domain binds DNA, RNA, small molecules, ions, or peptides and identifies specific domain positions likely involved in these interactions.
  • Ensemble Machine Learning: Employs an ensemble machine learning approach to enhance predictive accuracy and robustness.
  • Validation: Validated through stringent cross-validation testing, demonstrating strong predictive performance in identifying ligand-binding sites.
  • Training Data: Trains models on known co-crystal structures to infer binding sites in uncharacterized domains.
  • Application to DUFs: Applies predictions to Domains of Unknown Function (DUFs) to aid characterization of their molecular roles.
  • Transferability to Gene Sequences: Transfers domain-level predictions to sequences, enabling extrapolation of ligand-binding properties across 95% of human genes.
  • Coverage: Targets the approximately two-thirds of human protein domain families that lack co-crystal structure data, and provides predictions for 6,503 human protein domains.
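
The domain-to-sequence transfer listed above can be illustrated with a minimal Python sketch. It assumes that a domain's predictions are available as per-position scores and that an alignment between a protein sequence and the domain is already known. The function name `transfer_scores`, the data layout, and the example values are hypothetical and chosen only for illustration; they are not part of dSPRINT's actual interface.

```python
from typing import Dict, List, Optional


def transfer_scores(
    domain_scores: Dict[int, float],
    residue_to_domain_pos: List[Optional[int]],
) -> List[Optional[float]]:
    """Map per-domain-position scores onto the residues of one sequence.

    residue_to_domain_pos[i] is the domain position aligned to residue i,
    or None when residue i lies outside the domain or in an insertion.
    Residues without an aligned, scored position receive None.
    """
    return [
        domain_scores.get(pos) if pos is not None else None
        for pos in residue_to_domain_pos
    ]


# Hypothetical binding scores for a 4-position domain (illustrative values).
scores = {1: 0.10, 2: 0.85, 3: 0.40, 4: 0.05}
# Hypothetical alignment of a 6-residue sequence to that domain.
alignment = [None, 1, 2, None, 3, 4]

per_residue = transfer_scores(scores, alignment)
print(per_residue)
```

Keeping unaligned residues as `None` rather than 0 separates "no prediction available" from "predicted not to bind", which matters when scores from several domains in one protein are combined.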

Scientific Applications:

  • Protein function annotation: Uses predicted ligand-binding sites to annotate molecular functions of protein domains.
  • Characterization of DUFs: Facilitates assignment of molecular roles to domains of unknown or poorly characterized function.
  • Mechanistic studies: Supports investigation of molecular mechanisms by identifying putative interaction sites within domains.
  • Genome-scale functional inference: Enables broad extrapolation of ligand-binding properties across human genes to inform functional genomics analyses.

Methodology:

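dSPRINT uses an ensemble machine learning approach. Models are trained on known co-crystal structures and evaluated with stringent cross-validation. For each ligand type (DNA, RNA, ions, peptides, small molecules), they predict whether a domain binds the ligand and which domain positions participate in the interaction.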
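
The two ideas in this paragraph, combining several models and evaluating them without leakage, can be sketched in plain Python. This entry does not specify how dSPRINT combines its base models or builds its folds, so the sketch makes two labeled assumptions: base-model scores are combined by simple averaging, and cross-validation folds are built so that all positions of one domain stay in the same fold. The names `grouped_folds` and `ensemble_average` and all example values are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Hashable, List, Sequence


def grouped_folds(groups: Sequence[Hashable], k: int) -> List[List[int]]:
    """Assign sample indices to k folds so that each group stays in one fold."""
    by_group: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    folds: List[List[int]] = [[] for _ in range(k)]
    # Place the largest groups first, each into the currently smallest fold.
    for members in sorted(by_group.values(), key=len, reverse=True):
        min(folds, key=len).extend(members)
    return folds


def ensemble_average(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-position scores across base models (assumed combiner)."""
    return [mean(column) for column in zip(*model_scores)]


# Hypothetical data: six domain positions drawn from three domains.
groups = ["domA", "domA", "domB", "domB", "domB", "domC"]
folds = grouped_folds(groups, k=2)
print(folds)

# Hypothetical scores from two base models for two positions.
combined = ensemble_average([[0.25, 0.5], [0.75, 1.0]])
print(combined)
```

Grouping by domain keeps positions from one domain out of both the training and test sides of a split, which is one common way to make cross-validation more stringent. A stacked ensemble, in which a second-level model learns how to weight the base models, would replace `ensemble_average` in a fuller implementation.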

Details

Tool Type: web application, workflow
Programming Languages: Python
Added: 9/8/2021
Last Updated: 11/24/2024

Publications

Etzion-Fuchs A, Todd DA, Singh M. dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains. Nucleic Acids Research. 2021;49(13):e78. doi:10.1093/nar/gkab356. PMID:33999210. PMCID:PMC8287948.

Funding: National Science Foundation (ABI-1458457); National Institutes of Health (R01-GM076275, T32 HG003284)
