OpenFold
OpenFold is a fast, memory-efficient, trainable reimplementation of AlphaFold2 for predicting protein structures. It also supports retraining and analysis of how the model learns.
Key Features:
- AlphaFold2 reimplementation: Provides a codebase that reproduces AlphaFold2's architecture and inference behavior.
- Fast and memory-efficient: Offers performance optimizations targeting reduced runtime and memory usage during model execution.
- Trainable framework: Permits training models from scratch with access to the training code and workflows.
- Accessible training data and code: Fills the gap left by the absence of publicly available training code and data needed to develop new models.
- Comparable accuracy: Achieves accuracy levels reported to be similar to those of AlphaFold2 for protein structure prediction.
- Robust generalization: Generalizes to protein structures even when trained on limited datasets or when entire classes of secondary structure elements are omitted from the training data.
- Hierarchical learning insights: Exposes intermediate structures generated during training to analyze the progressive learning of complex folds.
- Supports novel tasks and evaluation: Enables exploration of tasks such as protein-ligand complex prediction and evaluation of generalization across uncharted regions of fold space.
Scientific Applications:
- Protein structure prediction: Predicts tertiary structures of proteins using an AlphaFold2-derived model implementation.
- Retraining for novel tasks: Facilitates retraining to adapt models for tasks such as protein-ligand complex structure prediction.
- Model learning analysis: Allows study of hierarchical and intermediate learning behavior by inspecting structures generated during training.
- Generalization assessment: Enables evaluation of model generalization across fold space and under limited or biased training data.
- Investigating secondary structure effects: Supports experiments that omit entire classes of secondary structure elements to assess their impact on learning and prediction.
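As a rough illustration of the secondary-structure omission experiments listed above, the sketch below filters a toy training set so that chains containing a chosen class are dropped. It is not OpenFold code: the chain IDs, the DSSP-style per-residue strings, the grouping of codes into helix and strand classes, and the `max_fraction` parameter are all assumptions made for this example, and the actual experiments may select training data differently.

```python
from typing import Dict, List, Set

# DSSP-style one-letter codes grouped into coarse classes.
# This grouping is an assumption of the sketch, not taken from OpenFold.
HELIX_CODES: Set[str] = {"H", "G", "I"}
STRAND_CODES: Set[str] = {"E", "B"}


def class_fraction(ss: str, codes: Set[str]) -> float:
    """Fraction of residues whose secondary-structure code is in `codes`."""
    if not ss:
        return 0.0
    return sum(1 for c in ss if c in codes) / len(ss)


def omit_class(chains: Dict[str, str], codes: Set[str],
               max_fraction: float = 0.0) -> List[str]:
    """Return sorted IDs of chains whose content of the class is <= max_fraction."""
    return sorted(
        cid for cid, ss in chains.items()
        if class_fraction(ss, codes) <= max_fraction
    )


# Hypothetical chain IDs and secondary-structure strings, for illustration only.
toy_chains = {
    "chainA": "--HHHHHHHH--HHHHHH--",
    "chainB": "--EEEEE--EEEEE--EEE-",
    "chainC": "--HHHH---EEEE--HHHH-",
}

# Keep only chains with no helical residues at all.
no_helix_ids = omit_class(toy_chains, HELIX_CODES)
```

A strict threshold of zero removes every chain with any residue of the omitted class; raising `max_fraction` would give a softer filter, at the cost of leaving some examples of that class in the training set.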
Methodology:
OpenFold reimplements the AlphaFold2 architecture in a trainable framework, which allows models to be trained from scratch and the intermediate structures produced during training to be analyzed.
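One way to analyze intermediate structures from training is to score each checkpoint's prediction against a reference structure. The sketch below uses a simplified, C-alpha-only variant of the lDDT score, which needs no superposition. It is an assumption of this example rather than OpenFold's own evaluation code: the function name `simplified_lddt`, the checkpoint labels, and the toy coordinates are hypothetical, and the full lDDT definition includes details (per-residue scores, all-atom mode, stereochemistry checks) that are left out here.

```python
import math
from typing import Dict, Sequence, Tuple

Coord = Tuple[float, float, float]


def simplified_lddt(pred: Sequence[Coord], ref: Sequence[Coord],
                    cutoff: float = 15.0,
                    thresholds: Sequence[float] = (0.5, 1.0, 2.0, 4.0)) -> float:
    """Fraction of reference distances (< cutoff) preserved in the prediction,
    averaged over the distance-difference thresholds. Simplified global score."""
    if len(pred) != len(ref):
        raise ValueError("pred and ref must have the same number of residues")
    preserved = 0
    total = 0
    n = len(ref)
    for i in range(n):
        for j in range(i + 1, n):
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= cutoff:
                continue  # only local contacts in the reference are scored
            diff = abs(math.dist(pred[i], pred[j]) - d_ref)
            preserved += sum(1 for t in thresholds if diff < t)
            total += len(thresholds)
    return preserved / total if total else 0.0


# Toy reference: four C-alpha atoms on a line, 3.8 angstroms apart (hypothetical data).
reference = [(i * 3.8, 0.0, 0.0) for i in range(4)]

# Hypothetical predictions saved at two points in training.
checkpoints: Dict[str, Sequence[Coord]] = {
    "early": [(0.0, 0.0, 0.0)] * 4,  # collapsed structure
    "late": list(reference),          # matches the reference exactly
}

scores = {name: simplified_lddt(coords, reference)
          for name, coords in checkpoints.items()}
```

Plotting such scores against the training step, for many proteins, gives a simple view of when different folds become well predicted.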
Details
- License: Apache-2.0
- Cost: Free of charge
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: Python
- Added: 6/19/2024
- Last Updated: 11/24/2024
Publications
Ahdritz G, Bouatta N, Floristean C, Kadyan S, Xia Q, Gerecke W, O’Donnell TJ, Berenberg D, Fisk I, Zanichelli N, Zhang B, Nowaczynski A, Wang B, Stepniewska-Dziubinska MM, Zhang S, Ojewole A, Guney ME, Biderman S, Watkins AM, Ra S, Lorenzo PR, Nivon L, Weitzner B, Ban YA, Chen S, Zhang M, Li C, Song SL, He Y, Sorger PK, Mostaque E, Zhang Z, Bonneau R, AlQuraishi M. OpenFold: retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization. Nature Methods. 2024;21(8):1514-1524. doi:10.1038/s41592-024-02272-z. PMID:38744917. PMCID:PMC11645889.