MegaPath

MegaPath detects pathogens in metagenomic next-generation sequencing (NGS) data. It combines four techniques: polishing to remove non-informative human reads and spurious alignments, global optimization with read reassignment, an enhanced maximum-exact-match prefix seeding strategy, and a SIMD-accelerated Smith-Waterman algorithm. Together these allow it to identify novel or highly divergent bacteria and viruses.


Key Features:

  • High Sensitivity in Pathogen Detection: Detects pathogens with high sensitivity even when sequence similarity to reference databases is low, enabling identification of novel or highly divergent bacteria and viruses.
  • Polishing Techniques: Removes non-informative human reads and spurious alignments to improve signal-to-noise for downstream detection.
  • Global Optimization and Read Reassignment: Uses a global optimization strategy to assign each read that aligns to multiple species to a single species, increasing correct alignments to distant pathogens and reducing incorrect assignments.
  • Enhanced Seeding and SIMD-accelerated Smith-Waterman: Implements an enhanced maximum-exact-match prefix seeding strategy combined with a SIMD-accelerated Smith-Waterman algorithm to accelerate alignment computations.
  • Benchmark Performance: Detected eight times more reads from low-similarity pathogens than existing tools, and processed a ~1 Gb dataset in approximately 20 minutes, a speed comparable to profile-based methods.
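
The read reassignment idea listed above can be illustrated with a minimal Python sketch. This is not MegaPath's actual optimization procedure, which this entry does not describe in detail. It is a simple stand-in heuristic: each multi-mapped read goes to the candidate species with the most uniquely mapped reads. The function name `reassign_reads` and the input layout are hypothetical.

```python
from collections import Counter


def reassign_reads(alignments):
    """Assign every read to exactly one species.

    alignments: {read_id: {species: alignment_score}} (hypothetical layout).
    Heuristic stand-in, NOT MegaPath's algorithm: species supported by more
    uniquely mapped reads win ambiguous reads.
    """
    # Count reads that hit exactly one species; these act as evidence.
    support = Counter()
    for hits in alignments.values():
        if len(hits) == 1:
            support[next(iter(hits))] += 1

    assignment = {}
    for read_id, hits in alignments.items():
        if not hits:
            continue  # unaligned read, nothing to assign
        # Rank candidates by unique support, then alignment score,
        # then species name so that ties resolve deterministically.
        assignment[read_id] = max(
            hits, key=lambda sp: (support[sp], hits[sp], sp)
        )
    return assignment


example = {
    "r1": {"A": 50},
    "r2": {"A": 48},
    "r3": {"A": 40, "B": 45},  # ambiguous read
    "r4": {"B": 30},
}
print(reassign_reads(example))
```

In this toy input, read `r3` scores slightly higher against species B, but species A has more unique support, so the heuristic assigns `r3` to A.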

Scientific Applications:

  • Infectious Disease Diagnostics: Identification of bacterial and viral pathogens in clinical metagenomic NGS samples.
  • Epidemiology and Public Health Response: Rapid detection of emerging pathogens to inform public health interventions.
  • Pathogen Evolution and Transmission Studies: Detection of high-mutation-load and novel sequences to support analyses of pathogen evolution and transmission dynamics.
  • Surveillance of Novel or Divergent Pathogens: Monitoring and detection of pathogens with substantial genetic divergence from known reference sequences.

Methodology:

Polishing to remove non-informative human reads and spurious alignments; global optimization with read reassignment to unique species; enhanced maximum-exact-match prefix seeding; SIMD-accelerated Smith-Waterman alignment.
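
For readers unfamiliar with the alignment step, the following is a minimal scalar sketch of Smith-Waterman local alignment scoring in Python. It is illustrative only: MegaPath uses a SIMD-accelerated implementation, and the scoring parameters here (match 2, mismatch -1, linear gap -2) and the function name `smith_waterman_score` are assumptions chosen for the example, not values taken from the tool.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Plain scalar dynamic programming with a linear gap penalty.
    Scoring values are illustrative assumptions, not MegaPath's settings.
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Only two rows of the matrix are kept in memory because the score, not the traceback, is needed. SIMD implementations compute many such cells in parallel, while the recurrence itself stays the same.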

Details

Tool Type: command-line tool
Added: 1/18/2021
Last Updated: 2/20/2021

Publications

Leung C, Li D, Xin Y, Law W, Zhang Y, Ting H, Luo R, Lam T. MegaPath: sensitive and rapid pathogen detection using metagenomic NGS data. BMC Genomics. 2020;21(S6). doi:10.1186/s12864-020-06875-6. PMID:33349238. PMCID:PMC7751095.

Funding:

  • Innovation and Technology Fund: ITS/331/17FP
  • General Research Fund: 17208019, 27204518