RCSB Protein Data Bank

RCSB Protein Data Bank curates and distributes archival, annotated structural data for biological macromolecules and Computed Structure Models to support structural biology research.


Key Features:

  • Archive stewardship and curation: Serves as the U.S. data center for the Worldwide Protein Data Bank (wwPDB) and maintains curated structural records ensuring data integrity.
  • Data holdings: Hosts over 200,000 experimentally determined macromolecular structures and more than 1 million Computed Structure Models (CSMs) generated using artificial intelligence and machine learning.
  • Annotation updates: Integrates weekly updates with functional annotations sourced from external biodata resources.
  • Membrane protein annotation: Incorporates annotations from OPM, PDBTM, MemProtMD, and mpstruc, increasing the number of annotated membrane proteins by approximately 80%.
  • Search and visualization for membrane proteins: Supports exploration of membrane protein tree hierarchies and visualization at the 1D amino acid sequence level and in 3D, with predicted membrane-layer locations displayed using the Mol* viewer.
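A membrane-annotation search of the kind described above might be expressed as a JSON query. The sketch below only builds the request body and sends nothing. The endpoint URL, the attribute name `rcsb_polymer_entity_annotation.type`, and the helper `membrane_query` are assumptions made for illustration and are not taken from this listing; verify them against the RCSB search documentation before use.

```python
import json

# Assumed search endpoint; not stated in this listing.
SEARCH_API_URL = "https://search.rcsb.org/rcsbsearch/v2/query"

# The four external membrane-protein resources named in this listing.
MEMBRANE_RESOURCES = ("OPM", "PDBTM", "MemProtMD", "mpstruc")


def membrane_query(resource: str, rows: int = 10) -> dict:
    """Build a (hypothetical) query for polymer entities annotated by one resource."""
    if resource not in MEMBRANE_RESOURCES:
        raise ValueError(f"unknown membrane resource: {resource!r}")
    return {
        "query": {
            "type": "terminal",
            "service": "text",
            "parameters": {
                # Assumed attribute name for the annotation source.
                "attribute": "rcsb_polymer_entity_annotation.type",
                "operator": "exact_match",
                "value": resource,
            },
        },
        "return_type": "polymer_entity",
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


if __name__ == "__main__":
    # Print the request body that would be POSTed to SEARCH_API_URL.
    print(json.dumps(membrane_query("OPM"), indent=2))
```

Keeping payload construction separate from the network call makes the query easy to inspect or adjust before it is submitted.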

Scientific Applications:

  • Structural biology research: Provides curated experimental structures and AI/ML-generated CSMs for structural analysis and comparative studies.
  • Membrane protein localization and identification: Helps identify membrane proteins and their predicted membrane locations through integrated annotations from OPM, PDBTM, MemProtMD, and mpstruc.
  • Drug target investigation: Supports research on membrane proteins, which represent a substantial fraction of FDA-approved drug targets.

Methodology:

RCSB PDB aggregates functional annotations from external biodata resources on a weekly basis and distributes AI/ML-generated Computed Structure Models alongside experimental structures. Membrane annotations are integrated from OPM, PDBTM, MemProtMD, and mpstruc, and predicted membrane-layer locations are displayed in the Mol* viewer.

Details

Cost: Free of charge
Tool Type: Web application
Operating Systems: Mac, Linux, Windows
Added: 5/17/2022
Last Updated: 1/30/2023

Publications

Bittrich S, Rose Y, Segura J, Lowe R, Westbrook JD, Duarte JM, Burley SK. RCSB Protein Data Bank: improved annotation, search and visualization of membrane protein structures archived in the PDB. Bioinformatics. 2021;38(5):1452-1454. doi:10.1093/bioinformatics/btab813. PMID:34864908. PMCID:PMC8826025.

Funding:
  • National Science Foundation: DBI-1832184
  • U.S. Department of Energy: DE-SC0019749
  • National Institute of General Medical Sciences: R01GM133198

Burley SK, Bhikadiya C, Bi C, Bittrich S, Chao H, Chen L, Craig PA, Crichlow GV, Dalenberg K, Duarte JM, Dutta S, Fayazi M, Feng Z, Flatt JW, Ganesan S, Ghosh S, Goodsell DS, Green RK, Guranovic V, Henry J, Hudson BP, Khokhriakov I, Lawson CL, Liang Y, Lowe R, Peisach E, Persikova I, Piehl DW, Rose Y, Sali A, Segura J, Sekharan M, Shao C, Vallat B, Voigt M, Webb B, Westbrook JD, Whetstone S, Young JY, Zalevsky A, Zardecki C. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Research. 2022;51(D1):D488-D508. doi:10.1093/nar/gkac1077. PMID:36420884. PMCID:PMC9825554.

Funding:
  • National Science Foundation: DBI-1832184
  • U.S. Department of Energy: DE-SC0019749
  • National Institutes of Health: R01GM133198
  • UK Biotechnology and Biological Research Council: DBI-2019297
  • NSF: DBI-1756248, DBI-2112966
  • NIH-NIGMS: P41GM109824, R01GM083960

Documentation

API documentation: http://data.rcsb.org
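
As a minimal sketch of how the Data API might be called from Python, the example below builds a REST URL for a single experimental PDB entry and fetches it as JSON. The `/rest/v1/core/entry/` path, the HTTPS base URL, and the helper names `entry_url` and `fetch_entry` are assumptions for illustration rather than details given in this listing; the API documentation is the authoritative source for available endpoints.

```python
import json
import re
from urllib.request import urlopen

# HTTPS form of the documented Data API address (assumed).
DATA_API_BASE = "https://data.rcsb.org"

# Classic PDB IDs are four characters: a digit followed by three alphanumerics.
_PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def entry_url(pdb_id: str) -> str:
    """Return the (assumed) REST URL for one experimental PDB entry."""
    pdb_id = pdb_id.strip()
    if not _PDB_ID.match(pdb_id):
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    return f"{DATA_API_BASE}/rest/v1/core/entry/{pdb_id.upper()}"


def fetch_entry(pdb_id: str, timeout: float = 10.0) -> dict:
    """Download and decode the JSON record for an entry (needs network access)."""
    with urlopen(entry_url(pdb_id), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Building the URL works offline; fetch_entry() would contact the server.
    print(entry_url("4hhb"))
```

The ID check here covers only four-character identifiers of experimental entries; Computed Structure Models use a different identifier format and would need separate handling.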