RanDepict
RanDepict generates diverse chemical structure depictions to create varied datasets for training deep learning-based optical chemical structure recognition (OCSR) systems.
Key Features:
- Multi-library rendering: Uses depiction functionalities from RDKit, CDK, Indigo, and PIKAChU to produce chemical structure images.
- Diverse depiction parameters: Controls parameters such as bond length, line thickness, and label font style to vary depiction appearance.
- Image augmentation capabilities: Adds features such as curved arrows, chemical labels around structures, and geometric distortions to increase image variability.
- Depiction feature fingerprints: Encodes depiction and augmentation characteristics as binary vectors summarizing image features.
- MaxMin algorithm for diversity: Applies the MaxMin algorithm to select diverse samples from valid depiction and augmentation options.
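To make the idea of a depiction feature fingerprint concrete, the following is a minimal sketch of how a set of chosen depiction and augmentation options could be encoded as a binary vector. The option names, their values, and the one-hot layout are assumptions made for illustration; they are not RanDepict's actual feature scheme or API.

```python
from typing import Dict, List, Sequence

# Hypothetical option table (illustration only, not RanDepict's real features).
# Each option contributes one bit per possible value.
OPTIONS: Dict[str, Sequence[object]] = {
    "toolkit": ("rdkit", "cdk", "indigo", "pikachu"),
    "line_thickness": ("thin", "medium", "thick"),
    "font": ("serif", "sans", "mono"),
    "curved_arrows": (False, True),
}


def encode(choice: Dict[str, object]) -> List[int]:
    """One-hot encode the chosen value of every option and concatenate the bits."""
    bits: List[int] = []
    for name, values in OPTIONS.items():
        selected = choice[name]
        if selected not in values:
            raise ValueError(f"unknown value {selected!r} for option {name!r}")
        bits.extend(1 if value == selected else 0 for value in values)
    return bits


example = encode(
    {
        "toolkit": "cdk",
        "line_thickness": "medium",
        "font": "mono",
        "curved_arrows": True,
    }
)
print(example)
```

Under this assumed layout, every fingerprint has the same length (12 bits here) and exactly one set bit per option, so two fingerprints can be compared bit by bit.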
Scientific Applications:
- OCSR dataset generation: Produces diverse and augmented chemical structure images for training optical chemical structure recognition systems.
- Model robustness and benchmarking: Supplies varied depiction styles and augmentations for testing and benchmarking deep learning-based chemical structure recognition models.
Methodology:
RanDepict leverages depiction functions from RDKit, CDK, Indigo, and PIKAChU, varies their depiction parameters, and applies image augmentations. It represents the depiction and augmentation features of each image as a binary fingerprint and uses the MaxMin algorithm to select diverse samples.
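The selection step can be sketched as a greedy MaxMin picker over binary fingerprints: starting from a seed, it repeatedly adds the candidate whose distance to its nearest already-picked neighbour is largest. This is a generic, pure-Python illustration and not RanDepict's implementation; the Tanimoto distance, the fixed seed index, and the function names `tanimoto_distance` and `maxmin_pick` are assumptions chosen for the example.

```python
from typing import List, Sequence


def tanimoto_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Return 1 - |a AND b| / |a OR b| for two equal-length bit vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def maxmin_pick(fps: Sequence[Sequence[int]], n_pick: int, first: int = 0) -> List[int]:
    """Greedily pick n_pick indices so that each new pick is as far as
    possible from the closest fingerprint that was already picked."""
    if not 0 < n_pick <= len(fps):
        raise ValueError("n_pick must be between 1 and len(fps)")
    picked = [first]
    # min_dist[i] is the distance from candidate i to its nearest picked item.
    min_dist = [tanimoto_distance(fp, fps[first]) for fp in fps]
    while len(picked) < n_pick:
        candidates = [i for i in range(len(fps)) if i not in picked]
        nxt = max(candidates, key=lambda i: min_dist[i])
        picked.append(nxt)
        # Refresh nearest-picked distances with the newly added item.
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[nxt]))
    return picked


# Toy fingerprints (made up for this example).
fps = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
selection = maxmin_pick(fps, 3)
print(selection)
```

In the toy data, fingerprint 1 differs from the seed by a single bit, so it is the one left out when three of the four are selected.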
Details
- License: MIT
- Cost: Free of charge
- Tool Type: library
- Operating Systems: Mac, Linux, Windows
- Programming Languages: Python
- Added: 9/3/2022
- Last Updated: 11/24/2024
Publications
Brinkhaus HO, Rajan K, Zielesny A, Steinbeck C. RanDepict: Random chemical structure depiction generator. Journal of Cheminformatics. 2022;14(1). doi:10.1186/s13321-022-00609-4. PMID:35668480. PMCID:PMC9169273.
Links
- Repository: https://pypi.org/project/RanDepict/