AbLang

AbLang is an antibody-specific language model, trained on the Observed Antibody Space (OAS) database, that restores residues missing from B-cell receptor repertoire sequences.


Key Features:

  • Specialized training on antibody data: Trained exclusively on the Observed Antibody Space (OAS) database, so the model learns antibody-specific sequence patterns rather than general protein statistics.
  • Restoration of missing residues: Restores missing residues in antibody sequences. Over 40% of sequences in OAS lack the first 15 amino acids.
  • Restoration performance: Restores missing residues better than using IMGT germlines or the general protein language model ESM-1b.
  • Independence from germline knowledge: Operates without prior knowledge of an antibody's germline.
  • Speed: Reported to be seven times faster than ESM-1b.
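
To make the restoration input concrete, the sketch below marks unknown residues with a mask character before handing sequences to the model. The helper mask_missing_nterm and the example fragment are hypothetical and exist only for illustration. The ablang.pretrained, freeze, and mode="restore" calls follow the usage shown in the AbLang project README; treat them as an assumption and check them against the installed version.

```python
def mask_missing_nterm(fragment: str, n_missing: int, mask: str = "*") -> str:
    """Prefix a truncated sequence with one mask character per missing residue."""
    if n_missing < 0:
        raise ValueError("n_missing must be non-negative")
    return mask * n_missing + fragment


def restore_heavy_chains(masked_seqs):
    """Restore masked residues in heavy-chain sequences.

    Assumption: this mirrors the usage in the AbLang README
    (pretrained model, freeze, call with mode="restore").
    """
    import ablang  # imported lazily so the helper above works without the package

    model = ablang.pretrained("heavy")  # use "light" for light chains
    model.freeze()  # switch the model to inference mode
    return model(masked_seqs, mode="restore")


# Illustrative placeholder fragment, not a real repertoire read.
fragment = "SLRLSCAASGFTFS"
masked = mask_missing_nterm(fragment, 15)
print(masked)
```

Passing [masked] to restore_heavy_chains would ask the model to fill in the 15 masked positions. That step needs the ablang package and its pretrained weights, so it is left out of the runnable part of the sketch.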

Scientific Applications:

  • Immunological research: Supports analysis of B-cell receptor repertoires and immune responses by supplying more complete sequence data.
  • Antibody engineering: Provides completed antibody sequences as input to therapeutic antibody design and optimization workflows.
  • Disease biomarker discovery: More complete sequences may aid the identification and validation of antibody biomarkers for diagnosis and prognosis.

Methodology:

AbLang is trained on antibody sequences from the Observed Antibody Space (OAS). Its ability to restore missing residues was benchmarked against two baselines: IMGT germlines and the general protein language model ESM-1b.
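
One way to picture such a comparison is to score each method on how many hidden residues it recovers. The following is a minimal sketch, assuming accuracy is the fraction of masked positions restored to the true residue. The function name and the toy sequences are hypothetical, and the published benchmark may follow a different protocol.

```python
def restoration_accuracy(true_seq, restored_seq, masked_positions):
    """Fraction of masked positions where the restored residue matches the truth."""
    if len(true_seq) != len(restored_seq):
        raise ValueError("sequences must have the same length")
    positions = list(masked_positions)
    if not positions:
        raise ValueError("at least one masked position is required")
    correct = sum(true_seq[i] == restored_seq[i] for i in positions)
    return correct / len(positions)


# Toy example: the first five residues were hidden, and one was restored incorrectly.
true_seq = "EVQLVESGGG"
restored_seq = "EVKLVESGGG"
accuracy = restoration_accuracy(true_seq, restored_seq, range(5))
print(accuracy)
```

Running the same function over the output of each method (a language model, a germline lookup) on a shared set of masked sequences gives directly comparable scores.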

Details

License: BSD-3-Clause
Cost: Free of charge
Tool Type: library
Operating Systems: Mac, Linux, Windows
Programming Languages: Python
Added: 11/21/2022
Last Updated: 11/24/2024

Publications

Olsen TH, Moal IH, Deane CM. AbLang: an antibody language model for completing antibody sequences. Bioinformatics Advances. 2022;2(1). doi:10.1093/bioadv/vbac046. PMID:36699403. PMCID:PMC9710568.

Funding: Engineering and Physical Sciences Research Council, grant EP/S024093/1