MegaR

MegaR applies machine learning to taxonomic profiles derived from whole metagenome and 16S rRNA sequencing to classify metagenomic samples and predict phenotypes.


Key Features:

  • Sequencing support: Uses taxonomic profiles derived from whole metagenome sequencing and 16S rRNA sequencing data.
  • Taxonomic-profile-based modeling: Builds predictive models from taxonomic profiles to classify samples into multiple categories.
  • Machine learning techniques: Supports various machine learning techniques for model training and phenotype prediction.
  • Model development and validation: Provides data processing, model fine-tuning, selection of machine learning techniques, and model validation options.
  • Unknown sample prediction: Applies trained models to predict properties of unknown samples.
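
The data processing step listed above can be illustrated with a small sketch. This is not MegaR's API (MegaR is an R package); it is a dependency-free Python illustration under stated assumptions. The function names `to_relative_abundance` and `filter_taxa`, the threshold parameters, and the toy counts are all hypothetical. It shows one common way to prepare taxonomic profiles for modeling: scale raw counts to relative abundances, then drop taxa that are rare across samples.

```python
def to_relative_abundance(counts):
    """Scale one sample's raw taxon counts so they sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {taxon: 0.0 for taxon in counts}
    return {taxon: n / total for taxon, n in counts.items()}


def filter_taxa(profiles, min_abundance=0.01, min_prevalence=0.5):
    """Keep taxa reaching min_abundance in at least min_prevalence of samples.

    Returns a samples-by-taxa matrix (list of rows) and the kept taxon names.
    Both thresholds are illustrative defaults, not values taken from MegaR.
    """
    taxa = sorted({taxon for profile in profiles for taxon in profile})
    kept = []
    for taxon in taxa:
        hits = sum(1 for p in profiles if p.get(taxon, 0.0) >= min_abundance)
        if hits / len(profiles) >= min_prevalence:
            kept.append(taxon)
    matrix = [[p.get(taxon, 0.0) for taxon in kept] for p in profiles]
    return matrix, kept


# Toy raw counts per sample (made-up numbers, for illustration only).
raw_samples = [
    {"Bacteroides": 60, "Prevotella": 30, "Escherichia": 10},
    {"Bacteroides": 200, "Prevotella": 799, "Escherichia": 1},
    {"Bacteroides": 50, "Prevotella": 50},
]
profiles = [to_relative_abundance(sample) for sample in raw_samples]
matrix, kept_taxa = filter_taxa(profiles)
```

In this toy input, Escherichia reaches the abundance threshold in only one of three samples, so it is filtered out and the resulting matrix has one column each for Bacteroides and Prevotella.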

Scientific Applications:

  • Human health and disease characterization: Characterizes how microbiome composition affects human health and disease through sample classification and phenotype prediction.
  • Diagnostic application: Supports identification of microbe-related human diseases by classifying metagenomic samples and predicting phenotypes.
  • Ecosystem and evolutionary studies: Analyzes microbiome–environment relationships to inform the study of biogeochemical processes and evolutionary dynamics in ecosystems.

Methodology:

MegaR constructs machine learning models from taxonomic profiles obtained from whole metagenome and 16S rRNA sequencing data. It supports several machine learning techniques, together with data processing, model fine-tuning, method selection, and model validation. Trained models are then used to predict the properties of unknown samples.
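
The train, validate, and predict workflow described here can be sketched in a few lines. This is a minimal illustration, not MegaR's implementation or API: it is written in Python rather than R, and it swaps in a simple nearest-centroid classifier with leave-one-out validation in place of whatever techniques MegaR provides. The function names, the class labels, and the abundance values are all hypothetical.

```python
import math


def centroid(rows):
    """Column-wise mean of a list of equal-length abundance vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def train(X, y):
    """Fit a nearest-centroid model: one mean profile per class label."""
    by_label = {}
    for row, label in zip(X, y):
        by_label.setdefault(label, []).append(row)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(model, row):
    """Return the label whose centroid is closest (Euclidean) to the sample."""
    return min(model, key=lambda label: math.dist(model[label], row))


def leave_one_out_accuracy(X, y):
    """Validate by holding out each sample once and training on the rest."""
    correct = 0
    for i in range(len(X)):
        model = train(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)


# Toy relative-abundance profiles (rows = samples, columns = taxa).
# Values and labels are made up for illustration.
X = [
    [0.70, 0.20, 0.10],
    [0.65, 0.25, 0.10],
    [0.60, 0.30, 0.10],
    [0.20, 0.70, 0.10],
    [0.25, 0.65, 0.10],
    [0.30, 0.60, 0.10],
]
y = ["healthy"] * 3 + ["disease"] * 3

accuracy = leave_one_out_accuracy(X, y)   # model validation
model = train(X, y)                       # final model on all labeled samples
unknown_label = predict(model, [0.62, 0.28, 0.10])  # unknown sample prediction
```

The same three stages apply regardless of the classifier chosen: fit on labeled profiles, estimate performance on held-out samples, then apply the fitted model to samples whose phenotype is unknown.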

Details

License: GPL-3.0
Tool Type: library
Programming Languages: R
Added: 3/19/2021
Last Updated: 5/4/2021

Publications

Dhungel E, Mreyoud Y, Gwak H, Rajeh A, Rho M, Ahn T. MegaR: an interactive R package for rapid sample classification and phenotype prediction using metagenome profiles and machine learning. BMC Bioinformatics. 2021;22(1). doi:10.1186/s12859-020-03933-4. PMID:33461494. PMCID:PMC7814621.

Funding: National Science Foundation, grant 1564894