DeepFRI

DeepFRI predicts protein function and identifies functional residues by integrating protein structural data with sequence features from a protein language model using Graph Convolutional Networks (GCNs).


Key Features:

  • Structure-based prediction: Performs protein function prediction using protein structural data.
  • Graph Convolutional Networks (GCNs): Implements GCNs to learn from protein structural graphs.
  • Protein language model features: Utilizes sequence features extracted by a protein language model to capture sequence-level signals.
  • Sequence–structure integration: Combines protein language model-derived sequence features with structural information for prediction.
  • Class activation mapping: Applies class activation mapping techniques to localize predicted functions to specific residues.
  • Residue-level annotations: Produces site-specific, residue-resolution functional annotations.
  • Robustness to homology models: Shows only a minor drop in predictive accuracy when experimental structures are replaced with homology models, which suggests the model tolerates noise in the input structure.
  • Performance versus CNNs: Demonstrates improved predictive performance compared with sequence-based Convolutional Neural Networks (CNNs).
  • Scalability: Scales to large protein sequence repositories for high-throughput annotation.
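
As an illustration of the "protein structural graph" that the GCN features refer to, the sketch below builds a residue contact map from 3D coordinates. It is a minimal example and not DeepFRI's own code: the function name `contact_map`, the 10 Å distance cutoff, and the toy coordinates are all assumptions made for illustration.

```python
import math


def contact_map(coords, threshold=10.0):
    """Return a binary adjacency matrix over residues.

    Two distinct residues are connected when their coordinates lie
    closer than `threshold` (assumed here to be in angstroms).
    """
    n = len(coords)
    return [
        [
            1 if i != j and math.dist(coords[i], coords[j]) < threshold else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical coordinates: four residues spaced 3.8 units apart on the x axis.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
adj = contact_map(coords)

for row in adj:
    print(row)
```

Each row of `adj` lists the neighbors of one residue. A matrix of this kind, together with per-residue features, is the usual input to a graph convolutional layer.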

Scientific Applications:

  • PDB annotation: Annotates structures from the Protein Data Bank (PDB) to generate new functional predictions.
  • SWISS-MODEL annotation: Annotates homology models from SWISS-MODEL to extend functional coverage.
  • Site-specific functional mapping: Identifies residue-level functional sites for structure–function studies.
  • Experimental versus modeled structure analysis: Enables analyses using experimental structures and homology models with only minor drops in accuracy.
  • Large-scale annotation: Facilitates high-throughput annotation of protein sequences and structures.

Methodology:

DeepFRI applies Graph Convolutional Networks (GCNs) to protein structural data, using sequence features extracted from a protein language model as per-residue inputs. It then uses class activation mapping to localize each predicted function to individual residues.
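
The following sketch shows, under simplifying assumptions, how a single graph convolution and a class activation map fit together. It is not DeepFRI's implementation: it uses one GCN layer with symmetric adjacency normalization, sum pooling, a linear classifier, and plain class activation mapping. All weights, features, and function names are hypothetical toy values, and the per-residue features stand in for language-model embeddings.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric normalization D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_hat, feats, weights):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, feats), weights)
    return [[max(0.0, v) for v in row] for row in z]


def residue_activation_map(hidden, class_weights):
    """Per-residue contribution to one class score (class activation map)."""
    return [sum(w * h for w, h in zip(class_weights, row)) for row in hidden]


# Toy inputs (all hypothetical): a 3-residue chain with 2 features per residue.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[0.5, -0.2], [0.1, 0.3]]   # GCN layer weights
class_weights = [1.0, 0.5]            # linear classifier for one function class

a_hat = normalize_adjacency(adj)
hidden = gcn_layer(a_hat, feats, weights)
cam = residue_activation_map(hidden, class_weights)
score = sum(cam)  # with sum pooling, the class score is the sum of residue contributions

print("per-residue contributions:", cam)
print("class score:", score)
```

Because the classifier is linear and the pooling is a sum, the class score decomposes exactly into per-residue terms, which is what allows a prediction to be traced back to specific residues.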

Details

License: BSD-3-Clause
Tool Type: web application
Programming Languages: Python
Added: 9/8/2021
Last Updated: 9/12/2021

Publications

Gligorijević V, Renfrew PD, Kosciolek T, Leman JK, Berenberg D, Vatanen T, Chandler C, Taylor BC, Fisk IM, Vlamakis H, Xavier RJ, Knight R, Cho K, Bonneau R. Structure-based protein function prediction using graph convolutional networks. Nature Communications. 2021;12(1). doi:10.1038/s41467-021-23303-9. PMID:34039967. PMCID:PMC8155034.

Funding: Polska Akademia Nauk, grant PPN/PPO/2018/1/00014