MOB-suite
MOB-suite performs plasmid typing, clustering, reconstruction, and host-range prediction from whole-genome sequencing (WGS) assemblies to support plasmid epidemiology and antimicrobial resistance (AMR) surveillance.
Key Features:
- Plasmid typing and clustering: Implements a whole-sequence-based classification system that clusters complete plasmid sequences using Mash distances.
- Plasmid reconstruction: Reconstructs plasmid sequences from WGS assemblies.
- Database expansion and sequence analysis: Maintains a database of 23,671 complete sequences representing 17,779 unique plasmids for sequence comparison and analysis.
- Algorithmic comparisons for clustering: Evaluates clustering based on Mash distances and average nucleotide identity (ANI) across three clustering algorithms, finding that a Mash distance threshold of 0.06 with complete-linkage clustering produces highly homogeneous clusters.
- Host range prediction: Predicts plasmid host range from observed distributions of sequence features, including replication biomarkers and relaxase types, reporting hosts at the highest taxonomic rank that encompasses related plasmids.
- Epidemiological surveillance integration: Integrates plasmid nomenclature with whole-sequence-based clusters to enable examination of plasmid distribution and potential AMR dissemination.
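The complete-linkage clustering at a Mash distance of 0.06 described above can be illustrated with a minimal pure-Python sketch. This is not MOB-suite's own code: the function name `complete_linkage_clusters`, the plasmid labels, and the toy distances are all hypothetical, and treating the 0.06 cutoff as inclusive is an assumption. In practice the pairwise distances would come from Mash rather than being typed in by hand.

```python
from itertools import combinations

def complete_linkage_clusters(names, dist, threshold=0.06):
    """Agglomerative complete-linkage clustering, stopped at `threshold`.

    names: list of sequence identifiers (hypothetical labels here)
    dist:  dict mapping frozenset({a, b}) -> pairwise distance
    """
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        # Complete linkage: cluster distance is the LARGEST pairwise distance.
        return max(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest complete-linkage distance.
        (i, j), d = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > threshold:  # assumption: cutoff is inclusive (<= threshold merges)
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Toy pairwise distances (invented for illustration, not real Mash output).
names = ["pA", "pB", "pC", "pD"]
dist = {
    frozenset(("pA", "pB")): 0.01,
    frozenset(("pA", "pC")): 0.05,
    frozenset(("pB", "pC")): 0.08,
    frozenset(("pA", "pD")): 0.30,
    frozenset(("pB", "pD")): 0.31,
    frozenset(("pC", "pD")): 0.29,
}

clusters = complete_linkage_clusters(names, dist)
print(clusters)
```

In this toy input, pC is within 0.06 of pA but not of pB, so complete linkage keeps it out of the pA/pB cluster. Requiring every member pair to fall under the cutoff is what makes the resulting clusters homogeneous; a single-linkage scheme would have merged pC in.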
Scientific Applications:
- Plasmid epidemiology and surveillance: Enables detailed examination of plasmid distribution and tracking of AMR dissemination across bacterial populations.
- Host-range inference: Predicts the taxonomic rank of a plasmid's host range from replication biomarkers and relaxase types.
- Comparative plasmid genomics: Supports grouping and comparison of highly similar plasmids using Mash and ANI-based clustering.
- AMR trait tracking: Facilitates analysis of antimicrobial resistance trait distribution linked to plasmids in genomic surveillance studies.
Methodology:
MOB-suite classifies and clusters complete plasmid sequences on a whole-sequence basis using Mash distances and ANI. Clustering was evaluated across three clustering algorithms, with a Mash distance of 0.06 under complete-linkage identified as effective. Host range is predicted from the distributions of replication biomarkers and relaxase types among plasmids with known hosts. The reference database contains 23,671 complete sequences representing 17,779 unique plasmids.
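The host-range step can be pictured as finding a taxon that encompasses the known hosts of plasmids sharing a feature such as a replicon or relaxase type. The sketch below takes one possible reading of that idea, namely reporting the most specific taxon shared by every observed host. That reading, the function name `encompassing_taxon`, the rank list, and the example host set are all assumptions made for illustration and are not taken from MOB-suite itself.

```python
# Ranks ordered from broadest to most specific (a simplified, assumed set).
RANKS = ["phylum", "class", "order", "family", "genus", "species"]

def encompassing_taxon(lineages):
    """Return (rank, name) of the most specific taxon shared by all lineages.

    lineages: list of dicts mapping rank -> taxon name, one per known host.
    Returns None if the hosts share no taxon at any listed rank.
    """
    shared = None
    for rank in RANKS:
        names = {lineage.get(rank) for lineage in lineages}
        # Stop at the first rank where hosts disagree or data is missing.
        if len(names) != 1 or None in names:
            break
        shared = (rank, names.pop())
    return shared

# Hypothetical set of known hosts for plasmids carrying the same feature.
hosts = [
    {"phylum": "Proteobacteria", "class": "Gammaproteobacteria",
     "order": "Enterobacterales", "family": "Enterobacteriaceae",
     "genus": "Escherichia", "species": "Escherichia coli"},
    {"phylum": "Proteobacteria", "class": "Gammaproteobacteria",
     "order": "Enterobacterales", "family": "Enterobacteriaceae",
     "genus": "Salmonella", "species": "Salmonella enterica"},
    {"phylum": "Proteobacteria", "class": "Gammaproteobacteria",
     "order": "Enterobacterales", "family": "Enterobacteriaceae",
     "genus": "Klebsiella", "species": "Klebsiella pneumoniae"},
]

predicted = encompassing_taxon(hosts)
print(predicted)
```

Here the three hosts diverge at genus level, so the sketch reports the family as the encompassing taxon. A prediction based on real data would draw the host lineages from the reference plasmid database rather than a hand-written list.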
Details
- Tool Type: command-line tool
- Programming Languages: Python
- Added: 1/18/2021
- Last Updated: 2/26/2021
Publications
Robertson J, Bessonov K, Schonfeld J, Nash JHE. Universal whole-sequence-based plasmid typing and its utility to prediction of host range and epidemiological surveillance. Microbial Genomics. 2020;6(10). doi:10.1099/mgen.0.000435. PMID:32969786. PMCID:PMC7660255.