DOMAC

DOMAC predicts protein domains and fold assignments by integrating template-based fold recognition with ab initio domain prediction. The aim is more accurate domain boundary identification and more structurally relevant templates for downstream structure and function analysis.


Key Features:

  • Integration of Methods: Combines homology modeling, domain parsing, and ab initio prediction, using known templates when they are available and falling back to ab initio prediction otherwise.
  • Machine Learning Approach: Uses statistical machine learning with support vector machines (SVMs) trained on pairwise similarity features from alignment methods and structural compatibility features including predicted secondary structure, relative solvent accessibility, contact maps, and beta-strand pairing.
  • Profile-Profile Alignments: Extracts structural compatibility features via global profile-profile alignments combined with predicted secondary structure and other structural elements.
  • Query-Template Scoring: Produces continuous relevance scores for query-template pairs used to rank templates for fold recognition.
  • Scalability and Modularity: Implements a modular architecture intended to handle large datasets and adapt to different analysis contexts.
  • Performance: For fold recognition, Cheng and Baldi (2006) report sensitivities of ~85% at the family level, ~56% at the superfamily level, and ~27% at the fold level using the top-ranked template, with higher sensitivity when the top five ranked templates are considered.
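
The sensitivity figures in the Performance item depend on how many ranked templates are counted per query. The sketch below shows one common way to compute such a top-k sensitivity. It is an illustration only: the function name, the query and template identifiers, and the toy data are all hypothetical, and the exact evaluation protocol used in the cited publications may differ.

```python
from typing import Dict, List, Set


def top_k_sensitivity(
    ranked: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int,
) -> float:
    """Fraction of queries with at least one relevant template in the top k."""
    # Only queries that have at least one relevant template can be recovered.
    scorable = [q for q in ranked if relevant.get(q)]
    if not scorable:
        return 0.0
    hits = sum(
        1 for q in scorable
        if any(t in relevant[q] for t in ranked[q][:k])
    )
    return hits / len(scorable)


# Hypothetical toy data: templates per query, best-scoring first.
ranked = {
    "q1": ["t3", "t1", "t7"],
    "q2": ["t2", "t9", "t4"],
    "q3": ["t5", "t6", "t8"],
    "q4": ["t1", "t2", "t3"],
}
# Hypothetical ground truth: templates sharing the query's family,
# superfamily, or fold, depending on the level being evaluated.
relevant = {
    "q1": {"t3"},
    "q2": {"t4"},
    "q3": {"t0"},
    "q4": {"t1", "t2"},
}

top1 = top_k_sensitivity(ranked, relevant, 1)
top5 = top_k_sensitivity(ranked, relevant, 5)
```

With this toy data, widening the cutoff from one template to five recovers query q2, whose relevant template is ranked third, which is why top-five sensitivity can only match or exceed top-one sensitivity.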

Scientific Applications:

  • Protein Structure Prediction: Supports determination of protein structures by identifying domain boundaries and plausible folds for modeling.
  • Function Annotation: Enables annotation of protein function by identifying domain architectures that correlate with biochemical activities and cellular roles.
  • Mutagenesis Analysis and Protein Engineering: Informs mutagenesis studies and protein engineering by delineating domain regions relevant to stability, interactions, and function.
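
For the mutagenesis use case, predicted domain boundaries can be turned into residue ranges so that a mutation site can be assigned to a domain. The sketch below assumes contiguous domains, 1-based residue numbering, and boundaries given as the last residue of each domain. These conventions, the function names, and the example values are assumptions for illustration and do not describe DOMAC's actual output format.

```python
from typing import List, Tuple


def domain_segments(length: int, boundaries: List[int]) -> List[Tuple[int, int]]:
    """Split residues 1..length into inclusive (start, end) domain segments.

    Each boundary is taken to be the last residue of a domain
    (an assumed convention, not a documented one).
    """
    cuts = sorted(set(boundaries))
    if any(b < 1 or b >= length for b in cuts):
        raise ValueError("boundaries must lie in the range 1..length-1")
    starts = [1] + [b + 1 for b in cuts]
    ends = cuts + [length]
    return list(zip(starts, ends))


def domain_of(position: int, segments: List[Tuple[int, int]]) -> int:
    """Return the 1-based index of the domain containing a residue position."""
    for index, (start, end) in enumerate(segments, start=1):
        if start <= position <= end:
            return index
    raise ValueError("position is outside the sequence")


# Hypothetical 300-residue protein with predicted boundaries at 120 and 210.
segments = domain_segments(300, [120, 210])
mutation_domain = domain_of(150, segments)
```

Discontinuous domains, where one domain is built from several separate sequence segments, would need a richer representation than a single range per domain.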

Methodology:

Applies a two-stage machine learning approach. In the first stage, features are computed for each query-template pair: pairwise similarity features derived from alignment methods, and structural compatibility features extracted via global profile-profile alignments combined with predicted secondary structure and other structural elements. In the second stage, SVMs process these features to predict the structural relevance of each query-template pair, and the resulting continuous relevance scores are used to rank templates for fold recognition.
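
The scoring and ranking step can be sketched as follows. This is a minimal illustration, not the published implementation: a linear decision function with made-up weights stands in for the trained SVM, and the class name, function names, feature values, and template identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class PairFeatures:
    """Features for one query-template pair (hypothetical layout)."""
    template_id: str
    similarity: Tuple[float, ...]     # alignment-derived similarity features
    compatibility: Tuple[float, ...]  # structural compatibility features


def decision_score(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Linear decision value w.x + b, standing in for a trained SVM."""
    if len(features) != len(weights):
        raise ValueError("feature and weight vectors differ in length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def rank_templates(
    pairs: Sequence[PairFeatures], weights: Sequence[float], bias: float
) -> List[Tuple[str, float]]:
    """Score every query-template pair and sort by descending relevance."""
    scored = [
        (p.template_id,
         decision_score(p.similarity + p.compatibility, weights, bias))
        for p in pairs
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical feature values for three candidate templates of one query.
pairs = [
    PairFeatures("tB", (0.1, 0.4), (0.3,)),
    PairFeatures("tA", (0.9, 0.2), (0.8,)),
    PairFeatures("tC", (0.5, 0.6), (0.5,)),
]
weights = (1.0, 0.5, 2.0)  # made-up values, not learned parameters
bias = -1.0

ranking = rank_templates(pairs, weights, bias)
```

Keeping the score continuous rather than thresholding it into a yes/no label is what allows templates to be ordered, so the top one or top five can be passed on to later modeling steps.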

Details

Tool Type: web application
Added: 2/10/2017
Last Updated: 11/25/2024

Publications

Cheng J. DOMAC: an accurate, hybrid protein domain prediction server. Nucleic Acids Research. 2007;35(Web Server):W354-W356. doi:10.1093/nar/gkm390. PMID:17553833. PMCID:PMC1933197.

Cheng J, Baldi P. A machine learning information retrieval approach to protein fold recognition. Bioinformatics. 2006;22(12):1456-1463. doi:10.1093/bioinformatics/btl102. PMID:16547073.