PFP-Pred

PFP-Pred predicts protein fold patterns to support protein structure and function analysis.


Key Features:

  • Ensemble Approach: Combines a set of basic classifiers, each trained on a different feature set: predicted secondary structure, hydrophobicity, van der Waals volume, polarity, polarizability, or pseudo-amino acid composition at one of several dimensions.
  • Optimized Evidence-Theoretic k-Nearest Neighbors (OET-KNN): Uses the OET-KNN rule as the core operation engine for each constituent classifier, leveraging evidence theory to optimize k-nearest neighbors decision-making.
  • Weighted Voting Mechanism: Integrates outputs from individual classifiers through a weighted voting system to determine the final fold classification.
  • Recognition of 27 Fold Patterns: Assigns each query protein to one of 27 possible fold patterns.
  • Performance Metrics: Achieves an overall success rate of 62% on a testing dataset in which most proteins have less than 25% sequence identity to the training proteins, a rate 6-21% higher than those reported for neural network (NN) and support vector machine (SVM) approaches.
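
The evidence-theoretic k-nearest-neighbors idea behind the OET-KNN feature can be sketched as follows. This is a minimal illustration of the general evidence-theoretic k-NN rule, not the PFP-Pred implementation: the function name `et_knn_predict`, the fixed parameters `alpha` and `gamma`, and the toy data are all assumptions, and the parameter optimization that makes the rule "optimized" is omitted.

```python
import math
from collections import defaultdict

def et_knn_predict(train, query, k=3, alpha=0.95, gamma=1.0):
    """Classify `query` with a simplified evidence-theoretic k-NN rule.

    train: list of (feature_vector, label) pairs.
    alpha (must be < 1) and gamma are hypothetical fixed values; an
    optimized variant would tune them on training data instead.
    """
    # Squared Euclidean distance from the query to every training sample.
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in train
    )
    neighbors = scored[:k]

    # Each neighbor commits mass alpha * exp(-gamma * d^2) to its own label
    # and leaves the rest uncommitted. Evidence from neighbors sharing a
    # label is pooled by multiplying their uncommitted masses.
    uncommitted = defaultdict(lambda: 1.0)
    for d2, label in neighbors:
        uncommitted[label] *= 1.0 - alpha * math.exp(-gamma * d2)

    # Dempster's rule across labels: a label's score is its committed mass
    # times the uncommitted mass of every other label.
    scores = {}
    for q, u in uncommitted.items():
        others = math.prod(v for r, v in uncommitted.items() if r != q)
        scores[q] = (1.0 - u) * others

    # Normalize, counting the mass that stays uncommitted to any label.
    norm = sum(scores.values()) + math.prod(uncommitted.values())
    beliefs = {q: s / norm for q, s in scores.items()}
    return max(beliefs, key=beliefs.get), beliefs

# Toy two-class data (hypothetical feature vectors and labels).
train = [([0.0, 0.0], "a"), ([0.0, 1.0], "a"), ([5.0, 5.0], "b"), ([5.0, 6.0], "b")]
label, beliefs = et_knn_predict(train, [0.2, 0.1], k=3)
print(label, beliefs)
```

Unlike a plain majority vote among neighbors, this rule lets distant neighbors contribute almost no evidence, so a far-away neighbor of another class barely affects the decision.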

Scientific Applications:

  • Protein structure and function analysis: Provides fold predictions that aid interpretation of protein structure and inference of function.
  • Proteomics and bioinformatics: Supports large-scale fold pattern annotation in proteomics and bioinformatics studies.
  • Drug discovery: Facilitates target characterization and structural assessment relevant to drug discovery.
  • Molecular biology research: Assists experimental design and hypothesis generation by predicting protein fold patterns.
  • Biomaterials development: Informs development of novel biomaterials by providing structural predictions for protein components.

Methodology:

PFP-Pred builds an ensemble of basic classifiers, each using the OET-KNN rule as its engine and each trained on a different feature set: predicted secondary structure, hydrophobicity, van der Waals volume, polarity, polarizability, or pseudo-amino acid composition at one of several dimensions. The outputs of the individual classifiers are combined by weighted voting to assign the query protein to one of the 27 fold patterns.
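
The weighted-voting step can be illustrated with a short sketch. It assumes each basic classifier returns a single fold label and carries one scalar weight; the function name `weighted_vote`, the fold labels, and the weights are hypothetical, and how PFP-Pred actually derives its weights is not specified here.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label with the largest summed classifier weight.

    predictions: one predicted fold label per basic classifier.
    weights: one non-negative weight per classifier (hypothetical values).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    tally = defaultdict(float)
    for label, weight in zip(predictions, weights):
        tally[label] += weight  # each classifier adds its weight to its label
    return max(tally, key=tally.get)

# Five hypothetical classifiers, e.g. one per feature set.
predictions = ["fold_1", "fold_7", "fold_7", "fold_1", "fold_3"]
weights = [0.9, 0.4, 0.3, 0.2, 0.5]
print(weighted_vote(predictions, weights))  # fold_1 (1.1 vs 0.7 and 0.5)
```

With equal weights this reduces to a simple majority vote; unequal weights let classifiers trained on more informative feature sets count for more.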

Details

Tool Type:
command-line tool
Operating Systems:
Linux, Windows, Mac
Added:
12/18/2017
Last Updated:
11/25/2024

Publications

Shen H, Chou K. Ensemble classifier for protein fold pattern recognition. Bioinformatics. 2006;22(14):1717-1722. doi:10.1093/bioinformatics/btl170. PMID:16672258.
