AbLang
AbLang performs antibody sequence completion by using a specialized language model trained on the Observed Antibody Space (OAS) to restore missing residues in B-cell receptor repertoire sequences.
Key Features:
- Specialized training on antibody data: Trained exclusively on the Observed Antibody Space (OAS) database to capture antibody-specific sequence semantics and structural nuances.
- Restoration of missing residues: Fills in residues absent from antibody sequences, a common problem given that over 40% of sequences in OAS lack the first 15 amino acids.
- Restoration accuracy: Restores missing residues more accurately than using IMGT germlines or the general protein language model ESM-1b.
- Independence from germline knowledge: Operates without prior knowledge of an antibody's germline.
- Efficiency and speed: Processes sequences about seven times faster than ESM-1b, as reported by the authors.
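The restoration feature above can be sketched in Python as follows. This is a minimal sketch, not an authoritative reference: the `ablang.pretrained`, `freeze`, and `mode="restore"` calls and the `*` mask character follow the usage pattern shown in the project's README and should be checked against the installed version. The helper names `mask_n_terminus` and `restore` are hypothetical and introduced only for illustration.

```python
MASK = "*"  # mask character as used in AbLang's README examples (assumption)


def mask_n_terminus(seq: str, n_missing: int) -> str:
    """Prefix a truncated sequence with n_missing mask characters.

    Hypothetical helper: it only prepares the input string and does not
    decide how many residues are actually missing.
    """
    if n_missing < 0:
        raise ValueError("n_missing must be non-negative")
    return MASK * n_missing + seq


def restore(seqs, chain="heavy"):
    """Ask AbLang to fill in the masked positions of each sequence.

    Requires `pip install ablang`; the pretrained weights are downloaded
    on first use. Imported lazily so the helper above works without it.
    """
    import ablang

    model = ablang.pretrained(chain)  # "heavy" or "light" chain model
    model.freeze()                    # switch to evaluation mode
    return model(seqs, mode="restore")
```

A typical call would be `restore([mask_n_terminus(truncated_seq, 15)])`, where `truncated_seq` is a heavy-chain sequence known to lack its first 15 residues.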
Scientific Applications:
- Immunological research: Facilitates analysis of B-cell receptor repertoires and studies of immune response mechanisms by providing more complete sequence data.
- Antibody engineering: Assists design and optimization of therapeutic antibodies by restoring and completing antibody sequences for downstream engineering workflows.
- Disease biomarker discovery: Improves identification and validation of antibody biomarkers for diagnostic and prognostic applications through enhanced sequence completeness.
Methodology:
AbLang is a language model trained exclusively on antibody sequences from the Observed Antibody Space (OAS). Its residue-restoration performance was benchmarked against IMGT germlines and the ESM-1b protein language model.
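One way to compare restoration methods of this kind is to score only the positions that were masked. The sketch below illustrates that idea; it is not the authors' evaluation code, and the function name `restoration_accuracy` and the `*` mask character are assumptions made for this example.

```python
def restoration_accuracy(true_seq: str, masked_seq: str, restored_seq: str,
                         mask: str = "*") -> float:
    """Fraction of masked positions where the restored residue matches the truth.

    All three sequences must have the same length; positions are compared
    index by index, so no alignment is performed.
    """
    if not (len(true_seq) == len(masked_seq) == len(restored_seq)):
        raise ValueError("sequences must have equal length")
    # Only positions that were hidden from the model count toward the score.
    positions = [i for i, c in enumerate(masked_seq) if c == mask]
    if not positions:
        raise ValueError("masked_seq contains no masked positions")
    correct = sum(true_seq[i] == restored_seq[i] for i in positions)
    return correct / len(positions)
```

The same function can score any restoration method (a language model or a germline lookup), as long as each method returns a sequence of the same length as the masked input.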
Details
- License: BSD-3-Clause
- Cost: Free of charge
- Tool Type: library
- Operating Systems: Mac, Linux, Windows
- Programming Languages: Python
- Added: 11/21/2022
- Last Updated: 11/24/2024
Publications
Olsen TH, Moal IH, Deane CM. AbLang: an antibody language model for completing antibody sequences. Bioinformatics Advances. 2022;2(1). doi:10.1093/bioadv/vbac046. PMID:36699403. PMCID:PMC9710568.
Funding:
- Engineering and Physical Sciences Research Council: EP/S024093/1