dSPRINT
dSPRINT predicts ligand-binding sites within protein domains for DNA, RNA, ions, peptides, and small molecules to characterize ligand-binding properties across human protein domain families.
Key Features:
- Comprehensive Ligand Prediction: Predicts whether a protein domain binds DNA, RNA, small molecules, ions, or peptides and identifies specific domain positions likely involved in these interactions.
- Ensemble Machine Learning: Employs an ensemble machine learning approach to enhance predictive accuracy and robustness.
- Validation: Evaluated with stringent cross-validation, showing strong predictive performance in identifying ligand-binding sites.
- Training Data: Trains models on known co-crystal structures to infer binding sites in uncharacterized domains.
- Application to DUFs: Applies predictions to Domains of Unknown Function (DUFs) to aid characterization of their molecular roles.
- Transferability to Gene Sequences: Transfers domain-level predictions to sequences, enabling extrapolation of ligand-binding properties across 95% of human genes.
- Coverage: Addresses approximately two-thirds of human protein domain families that lack detailed co-crystal structure data and provides predictions for 6,503 human protein domains.
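As a rough illustration of the transfer step listed above, the sketch below copies per-position domain scores onto the residues of a sequence that contains the domain. The function name `transfer_scores`, the alignment format, the example scores, and the 0.5 cutoff are all assumptions made for illustration; none of them are taken from the dSPRINT code base.

```python
def transfer_scores(domain_scores, alignment):
    """Map per-domain-position scores onto sequence residue indices.

    domain_scores: {domain_position: binding_score} (hypothetical values)
    alignment: list of (domain_position, sequence_position) pairs, as a
               domain hit on the sequence might provide (assumed format)
    """
    return {
        seq_pos: domain_scores[dom_pos]
        for dom_pos, seq_pos in alignment
        if dom_pos in domain_scores
    }


# Hypothetical scores for three positions of one domain.
domain_scores = {1: 0.1, 2: 0.9, 3: 0.7}

# Hypothetical domain hit: residue 42 is an insertion with no domain position.
alignment = [(1, 40), (2, 41), (3, 43)]

seq_scores = transfer_scores(domain_scores, alignment)

# Residues whose transferred score passes an illustrative cutoff of 0.5.
binding_residues = sorted(pos for pos, s in seq_scores.items() if s >= 0.5)
```

Because the scores are attached to domain positions rather than to one protein, the same lookup could in principle be reused for every sequence in which the domain occurs.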
Scientific Applications:
- Protein function annotation: Uses predicted ligand-binding sites to annotate molecular functions of protein domains.
- Characterization of DUFs: Facilitates assignment of molecular roles to domains of unknown or poorly characterized function.
- Mechanistic studies: Supports investigation of molecular mechanisms by identifying putative interaction sites within domains.
- Genome-scale functional inference: Enables broad extrapolation of ligand-binding properties across human genes to inform functional genomics analyses.
Methodology:
dSPRINT uses an ensemble machine learning approach, trained on known co-crystal structures and evaluated with stringent cross-validation, to predict which ligand types a domain binds and which positions within the domain are likely involved in binding.
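The general pattern of an ensemble evaluated by cross-validation can be sketched as follows. This is a toy example under stated assumptions, not dSPRINT's actual models or features: the two base scorers (`fit_centroid`, `fit_stump`), the two-feature positions, the labels, and the 3-fold split are all hypothetical and chosen only to keep the example self-contained.

```python
from statistics import mean


def fit_centroid(X, y):
    """Base model 1 (hypothetical): score by closeness to the positive centroid."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    c_pos = [mean(col) for col in zip(*pos)]
    c_neg = [mean(col) for col in zip(*neg)]

    def score(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
        total = d_pos + d_neg
        return d_neg / total if total else 0.5

    return score


def fit_stump(X, y, feature=0):
    """Base model 2 (hypothetical): threshold one feature between class means."""
    pos = mean(x[feature] for x, label in zip(X, y) if label == 1)
    neg = mean(x[feature] for x, label in zip(X, y) if label == 0)
    cut = (pos + neg) / 2
    sign = 1 if pos >= neg else -1
    return lambda x: 1.0 if sign * (x[feature] - cut) > 0 else 0.0


def fit_ensemble(X, y):
    """Ensemble: average the scores of the base models."""
    models = [fit_centroid(X, y), fit_stump(X, y)]
    return lambda x: mean(m(x) for m in models)


def cross_validate(X, y, k=3):
    """Hold out each fold once, train on the rest, return per-fold accuracy."""
    accuracies = []
    for fold in range(k):
        test = [i for i in range(len(X)) if i % k == fold]
        train = [i for i in range(len(X)) if i % k != fold]
        model = fit_ensemble([X[i] for i in train], [y[i] for i in train])
        correct = sum((model(X[i]) >= 0.5) == (y[i] == 1) for i in test)
        accuracies.append(correct / len(test))
    return accuracies


# Toy per-position feature vectors and binding labels (made up for illustration).
X = [(0.9, 0.8), (0.1, 0.2), (0.8, 0.9), (0.2, 0.1), (0.85, 0.7), (0.15, 0.3)]
y = [1, 0, 1, 0, 1, 0]

fold_accuracy = cross_validate(X, y, k=3)
```

Averaging scores from differently built models is one simple way to form an ensemble; held-out folds keep the accuracy estimate from being computed on positions the models were trained on.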
Details
- Tool Type: web application, workflow
- Programming Languages: Python
- Added: 9/8/2021
- Last Updated: 11/24/2024
Publications
Etzion-Fuchs A, Todd DA, Singh M. dSPRINT: predicting DNA, RNA, ion, peptide and small molecule interaction sites within protein domains. Nucleic Acids Research. 2021;49(13):e78-e78. doi:10.1093/nar/gkab356. PMID:33999210. PMCID:PMC8287948.
Funding:
- National Science Foundation: ABI-1458457
- National Institutes of Health: R01-GM076275, T32 HG003284
Links
Repository
http://github.com/Singh-Lab/dSPRINT