BioAutoML
BioAutoML is a software package that automates the machine learning (ML) pipeline for analyzing biological sequence data. It addresses the challenge of feature engineering, ML algorithm selection, and hyperparameter tuning, which are typically manual and time-consuming processes requiring comprehensive domain knowledge.
The package consists of two main components divided into four modules:
1. Automated feature engineering
- Feature extraction module: Extracts numerical and informative features from biological sequence databases utilizing the MathFeature package.
- Feature selection module: Automates the selection of suitable features.
2. Metalearning
- Algorithm recommendation module: Automates the recommendation of ML algorithms.
- Hyperparameter tuning module: Automates tuning the selected algorithms' hyperparameters.
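To make the first two modules more concrete, the following is a minimal sketch of feature extraction and feature selection on raw sequences. It does not use the BioAutoML or MathFeature API: the descriptor (k-mer frequencies), the crude constant-column filter, and the function names `kmer_frequencies` and `drop_constant_features` are assumptions chosen for illustration only.

```python
from itertools import product


def kmer_frequencies(sequence, k=2, alphabet="ACGT"):
    """Return the relative frequency of every possible k-mer, in a fixed order.

    Hypothetical helper for illustration; not part of BioAutoML or MathFeature.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows with ambiguous symbols such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def drop_constant_features(rows):
    """Keep only columns whose value differs across samples (a crude filter)."""
    keep = [j for j in range(len(rows[0])) if len({row[j] for row in rows}) > 1]
    return [[row[j] for j in keep] for row in rows], keep


# Toy input: three short DNA sequences.
sequences = ["ACGTACGT", "AAAAAAAA", "ACACACAC"]

# Feature extraction: one 16-dimensional dinucleotide vector per sequence.
features = [kmer_frequencies(s, k=2) for s in sequences]

# Feature selection: discard dinucleotides that never vary in this toy set.
selected, kept_columns = drop_constant_features(features)
```

In a real run, the resulting feature matrix would be passed on to algorithm recommendation and hyperparameter tuning; an automated tool searches over many descriptors and selection strategies rather than fixing one as this sketch does.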
BioAutoML was experimentally evaluated in two scenarios: predicting the three principal classes of noncoding RNAs (ncRNAs) and the eight categories of ncRNAs in bacteria, including housekeeping and regulatory types. Its predictive performance was compared with that of two other AutoML tools, RECIPE and TPOT. The results showed that BioAutoML can accelerate new studies and reduce the cost of feature engineering while maintaining or improving predictive performance.
Topic
Functional, regulatory and non-coding RNA;Machine learning;Personalised medicine;Gene transcripts;Transcription factors and regulatory sites
Detail
Operation: Feature extraction;Quantification;Editing
Software interface: Library
Language: Python
License: Not stated
Cost: Free of charge
Version name: -
Credit: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Google (LARA - 2021), Universidade de São Paulo (USP) and São Paulo Research Foundation (FAPESP).
Input: -
Output: -
Contact: Robson P Bonidia rpbonidia@gmail.com, Ulisses N da Rocha ulisses.rocha@ufz.de
Collection: -
Maturity: -
Publications
- BioAutoML: automated feature engineering and metalearning to predict noncoding RNAs in bacteria.
- Bonidia RP, et al. BioAutoML: automated feature engineering and metalearning to predict noncoding RNAs in bacteria. Briefings in Bioinformatics. 2022; 23: bbac218. doi: 10.1093/bib/bbac218
- https://doi.org/10.1093/BIB/BBAC218
- PMID: 35753697
- PMC: PMC9294424
Download and documentation
Documentation: https://bonidia.github.io/BioAutoML/
Home page: https://github.com/Bonidia/BioAutoML