vRhyme
vRhyme bins viral genomes from metagenomic datasets to generate high-quality viral metagenome-assembled genomes (vMAGs) for analysis of viral diversity and function.
Key Features:
- Coverage effect size comparisons: Performs single- and multi-sample coverage effect size comparisons between scaffolds to distinguish patterns indicative of distinct viral genomes.
- Supervised machine learning and weighted networks: Uses supervised machine learning to identify nucleotide feature similarities and to construct weighted networks that are iteratively refined to form genome bins.
- Protein redundancy scoring: To improve bin accuracy, scores protein redundancy within candidate bins, based on the expectation that a single viral genome typically does not encode redundant genes.
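The coverage comparison in the first feature can be illustrated with a short sketch. It assumes Cohen's d as the effect-size measure and a simple mean across samples for the multi-sample case; this entry does not state which measure or aggregation vRhyme uses, and the function names `cohens_d` and `multi_sample_effect_size` are hypothetical, not part of the vRhyme code base.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Absolute Cohen's d between two scaffolds' coverage in one sample.

    Assumption: equal-weight pooled standard deviation.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    if pooled_sd == 0:
        # No variation in either scaffold: identical means give no effect,
        # different means give an unbounded effect.
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_a - mean_b) / pooled_sd


def multi_sample_effect_size(cov_a, cov_b):
    """Mean Cohen's d over samples.

    cov_a and cov_b are lists of (mean, sd) coverage tuples, one per sample,
    in the same sample order (hypothetical input layout).
    """
    if not cov_a or len(cov_a) != len(cov_b):
        raise ValueError("both scaffolds need coverage for the same samples")
    per_sample = [
        cohens_d(mean_a, sd_a, mean_b, sd_b)
        for (mean_a, sd_a), (mean_b, sd_b) in zip(cov_a, cov_b)
    ]
    return sum(per_sample) / len(per_sample)


# Two scaffolds measured in two samples: they differ in the first sample only.
scaffold_1 = [(10.0, 2.0), (20.0, 4.0)]
scaffold_2 = [(14.0, 2.0), (20.0, 4.0)]
print(multi_sample_effect_size(scaffold_1, scaffold_2))
```

A small effect size suggests that two scaffolds have similar coverage and could belong to the same genome, while a large one suggests that they come from distinct genomes.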
Scientific Applications:
- Benchmarking with simulated viromes: Demonstrated reconstruction of more complete and less contaminated vMAGs compared to existing binning tools on simulated virome datasets.
- Human skin virome analysis: Applied to 10,601 viral scaffolds from human skin metagenomes to bin complex viral sequence collections.
- vMAG discovery examples: Recovered a Herelleviridae vMAG composed of 22 scaffolds and a second vMAG encoding a nitrate reductase metabolic gene; after binning, both represent near-complete genomes.
Methodology:
Uses supervised machine learning to identify nucleotide feature similarities between scaffolds, builds weighted networks from those similarities, and iteratively refines the networks into genome bins. Single- and multi-sample coverage effect size comparisons and protein redundancy scoring guide the refinement.
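The network and redundancy steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not vRhyme's implementation: candidate bins are taken as connected components of edges above a weight cutoff, and redundancy is counted as the number of protein families found on more than one scaffold in a bin. The names `bins_from_edges` and `redundancy`, the cutoff, and the example data are all hypothetical.

```python
from collections import Counter


def bins_from_edges(edges, min_weight):
    """Group scaffolds into bins: connected components of retained edges.

    edges is an iterable of (scaffold_a, scaffold_b, weight). Scaffolds that
    only appear in edges below min_weight stay unbinned.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path compression.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b, weight in edges:
        if weight >= min_weight:
            parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted((sorted(group) for group in groups.values()), key=lambda g: g[0])


def redundancy(bin_scaffolds, proteins_by_scaffold):
    """Count protein families present on more than one scaffold in a bin."""
    counts = Counter()
    for scaffold in bin_scaffolds:
        # Use a set so a family is counted once per scaffold.
        counts.update(set(proteins_by_scaffold.get(scaffold, ())))
    return sum(1 for n in counts.values() if n > 1)


# Hypothetical similarity edges and protein annotations.
edges = [("s1", "s2", 0.9), ("s2", "s3", 0.8), ("s3", "s4", 0.2), ("s4", "s5", 0.95)]
proteins = {"s1": ["terL", "capsid"], "s2": ["capsid", "tail"], "s3": ["lysin"]}

for candidate in bins_from_edges(edges, min_weight=0.5):
    print(candidate, redundancy(candidate, proteins))
```

In an iterative scheme, a bin with a high redundancy count could be split or rebuilt with a stricter cutoff, since duplicated proteins suggest that scaffolds from more than one genome were merged.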
Details
- License:
- GPL-3.0
- Cost:
- Free of charge
- Tool Type:
- command-line tool
- Operating Systems:
- Mac, Linux, Windows
- Programming Languages:
- Python
- Added:
- 8/13/2022
- Last Updated:
- 11/24/2024
Publications
Kieft K, Adams A, Salamzade R, Kalan L, Anantharaman K. vRhyme enables binning of viral genomes from metagenomes. Nucleic Acids Research. 2022;50(14):e83-e83. doi:10.1093/nar/gkac341. PMID:35544285. PMCID:PMC9371927.
Funding:
- National Institutes of Health: R35GM137828, R35GM143024, U19AI142720
- National Library of Medicine: T15LM007359