DeCOIL

DeCOIL optimizes degenerate codon (DC) libraries for machine learning (ML)-assisted protein engineering. It designs combinatorial mutagenesis libraries that focus mutations on specific amino acid sites, with the aim of improving both the fitness and the diversity of the protein variants sampled from the sequence search space.


Key Features:

  • Degenerate codon optimization: Optimizes DC libraries to encode targeted amino acid variability without specifying exact gene sequences.
  • ML-oriented design: Tailored for machine learning (ML)-assisted protein engineering and informed by ML-proposed variant priorities that leverage biophysical and evolutionary data.
  • Combinatorial mutagenesis: Generates combinatorial mutagenesis libraries that concentrate mutations on selected amino acid sites.
  • Fitness and diversity optimization: Directly optimizes libraries to enhance both the predicted fitness and sequence diversity of sampled variants within the search space.
  • Cost-effective synthesis strategy: Reduces the need to synthesize large numbers of exact gene sequences by using optimized degenerate codons.
  • Generalizable and scalable: Presents a generalized approach intended to scale to larger combinatorial sequence spaces.
  • Experimental validation: Demonstrated via computational simulations and wet-lab experiments across two case studies.
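
To make the degenerate codon idea above concrete, the following is a minimal sketch of how a single degenerate codon expands into the amino acids it encodes. It relies only on the IUPAC nucleotide ambiguity codes and the standard genetic code; the function names (expand_codon, amino_acid_distribution) are illustrative assumptions and are not taken from the DeCOIL code base.

```python
from collections import Counter
from itertools import product

# IUPAC nucleotide ambiguity codes (mixed bases used in degenerate codons)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons enumerated with bases in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand_codon(degenerate):
    """List every concrete codon covered by a degenerate codon such as 'NNK'."""
    pools = [IUPAC[base] for base in degenerate.upper()]
    return ["".join(codon) for codon in product(*pools)]


def amino_acid_distribution(degenerate):
    """Fraction of covered codons that encode each amino acid ('*' is stop).

    Assumes every covered codon is synthesized with equal probability.
    """
    codons = expand_codon(degenerate)
    counts = Counter(CODON_TABLE[codon] for codon in codons)
    return {aa: n / len(codons) for aa, n in counts.items()}


if __name__ == "__main__":
    dist = amino_acid_distribution("NNK")
    for aa, frac in sorted(dist.items()):
        print(aa, round(frac, 4))
```

Under the equal-probability assumption, NNK covers 32 codons that encode all 20 amino acids plus a single stop codon (TAG), which is why it is a common baseline for site-saturation libraries.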

Scientific Applications:

  • Library design for experimental screening: Produces informed DC libraries for experimental screens to explore new protein functionalities.
  • ML-guided variant exploration: Enables ML-assisted prioritization and sampling of protein variants in sequence search spaces.
  • Targeted site mutagenesis: Facilitates focused exploration of specific amino acid sites in protein engineering campaigns.
  • Cost-reduced experimental campaigns: Supports more economical generation of variant libraries for downstream wet-lab characterization.

Methodology:

DeCOIL optimizes the degenerate codons assigned to specified amino acid sites, producing combinatorial mutagenesis libraries that focus mutations on those sites. The optimization directly targets the fitness and diversity of the protein variants sampled from the sequence search space. The approach was validated using computational simulations and wet-lab experiments across two case studies.
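
As a rough illustration of what scoring a library on fitness and diversity can look like, the sketch below enumerates the variants encoded by a multi-site DC library and reports two simple summaries: the expected predicted fitness of a sampled variant and the number of distinct stop-free variants. This is not DeCOIL's objective function or search procedure. The function names, the toy fitness dictionary, the independence of sites, and the choice to score stop-containing or unlisted variants as 0 are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import product

# IUPAC ambiguity codes and the standard genetic code (T, C, A, G order)
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def site_distribution(degenerate):
    """Probability of each amino acid (or '*' stop) at one mutated site."""
    pools = [IUPAC[base] for base in degenerate.upper()]
    codons = ["".join(codon) for codon in product(*pools)]
    dist = defaultdict(float)
    for codon in codons:
        dist[CODON_TABLE[codon]] += 1 / len(codons)
    return dict(dist)


def library_distribution(codons_per_site):
    """Probability of each multi-site variant, assuming independent sites."""
    site_dists = [site_distribution(c) for c in codons_per_site]
    variants = defaultdict(float)
    for combo in product(*(d.items() for d in site_dists)):
        sequence = "".join(aa for aa, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        variants[sequence] += prob
    return dict(variants)


def summarize_library(codons_per_site, predicted_fitness):
    """Return (expected predicted fitness, number of distinct stop-free variants).

    Assumption: variants with a stop codon, or absent from
    predicted_fitness, contribute a fitness of 0.
    """
    dist = library_distribution(codons_per_site)
    expected = sum(
        prob * predicted_fitness.get(seq, 0.0)
        for seq, prob in dist.items()
        if "*" not in seq
    )
    unique = sum(1 for seq in dist if "*" not in seq)
    return expected, unique


if __name__ == "__main__":
    # Hypothetical predicted fitness values for a two-site library
    toy_fitness = {"AV": 1.0, "GV": 0.5}
    print(summarize_library(["GSC", "GTT"], toy_fitness))
    print(summarize_library(["NNK", "NNK"], toy_fitness))
```

Comparing the two calls shows the trade-off a library optimizer has to balance: a narrow library concentrates sampling on variants with high predicted fitness, while a broad library such as NNK at every site covers many more distinct variants at a lower expected fitness.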

Details

Cost: Free of charge
Tool Type: command-line tool
Operating Systems: Mac, Linux, Windows
Programming Languages: Python
Added: 12/21/2023
Last Updated: 11/24/2024

Publications

Yang J, Ducharme J, Johnston KE, Li F, Yue Y, Arnold FH. DeCOIL: Optimization of Degenerate Codon Libraries for Machine Learning-Assisted Protein Engineering. ACS Synthetic Biology. 2023;12(8):2444-2454. doi:10.1021/acssynbio.3c00301. PMID:37524064.

Funding: Basic Energy Sciences, DE-SC0022218