HierCC
HierCC assigns genome assemblies to hierarchical population clusters based on core genome Multi-Locus Sequence Typing (cgMLST), supporting population assignment and genomic surveillance of bacterial pathogens.
Key Features:
- Hierarchical clustering of core genome MLST: Groups genomes into multi-level clusters based on core genome Multi-Locus Sequence Types to represent population structure.
- Scalable multi-level assignments: Employs a scalable clustering scheme that supports both incremental and static multi-level cluster assignments across large datasets.
- Core genome MLST-based operation: Operates directly on core genome MLST data derived from genome assemblies.
- HCCeval integration: Integrates with HCCeval to determine optimal thresholds for assigning genomes to cohesive clusters within the hierarchy.
- Real-time and large-scale analysis: Enables real-time genomic analysis and is applicable to large-scale whole-genome sequencing databases.
- Empirical application to bacterial pathogens: Has been applied to the analysis of over 400,000 genomes from Salmonella, Escherichia, Yersinia, and Clostridioides.
Scientific Applications:
- Infectious disease surveillance: Supports real-time genomic surveillance of bacterial pathogens using hierarchical cgMLST clusters.
- Pathogen genotyping: Provides cgMLST-based genotype assignments for public health and epidemiological analyses.
- Population assignment and structure delineation: Delineates population structure and assigns genomes to hierarchical population groups in large WGS databases.
- Comparative analyses of specific taxa: Facilitates comparative genomic analyses of taxa such as Salmonella, Escherichia, Yersinia, and Clostridioides.
Methodology:
HierCC assigns genomes to hierarchical clusters from their core genome MLST profiles. Its clustering scheme is scalable and supports both incremental and static multi-level assignments. The companion tool HCCeval is used to determine the optimal thresholds at which clusters are cohesive.
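The idea of multi-level cluster assignment can be illustrated with a minimal sketch. It is not HierCC's implementation: it assumes single-linkage clustering on pairwise allelic distances, treats allele value 0 as a missing call, and labels each cluster by the lowest genome index it contains. The function names `allelic_distance` and `multi_level_clusters` and the toy profiles are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count loci where both profiles have a called allele and the alleles differ.

    Assumption: allele number 0 (or less) marks a missing call and is ignored.
    """
    return sum(1 for x, y in zip(a, b) if x > 0 and y > 0 and x != y)


def find(parent, x):
    """Return the root of x in the union-find forest, compressing the path."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def multi_level_clusters(profiles, thresholds):
    """Single-linkage clusters at each allelic-distance threshold.

    Returns {threshold: [cluster label per genome]}, where the label is the
    lowest genome index in the cluster.
    """
    n = len(profiles)
    # Pairwise distances are computed once and reused for every level.
    edges = [
        (allelic_distance(profiles[i], profiles[j]), i, j)
        for i, j in combinations(range(n), 2)
    ]
    levels = {}
    for t in sorted(thresholds):
        parent = list(range(n))
        for d, i, j in edges:
            if d <= t:
                ri, rj = find(parent, i), find(parent, j)
                if ri != rj:
                    # Keep the smaller index as root so it serves as the label.
                    parent[max(ri, rj)] = min(ri, rj)
        levels[t] = [find(parent, i) for i in range(n)]
    return levels


# Toy cgMLST profiles: one row per genome, one column per locus.
profiles = [
    [1, 1, 1, 1],
    [1, 1, 1, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 3],
    [3, 3, 0, 4],  # third locus is a missing call
]
levels = multi_level_clusters(profiles, thresholds=[0, 1, 3])
for t, labels in levels.items():
    print(t, labels)
```

Each genome receives one label per threshold, so reading the labels across thresholds gives its position in the hierarchy: clusters at a stricter threshold are always nested inside clusters at a looser one. This quadratic pairwise approach only suits small examples; the scalability described for HierCC is not reproduced here.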
Details
- License: GPL-3.0
- Tool Type: command-line tool
- Programming Languages: Python
- Added: 1/18/2021
- Last Updated: 3/18/2021
Publications
Zhou Z, Charlesworth J, Achtman M. HierCC: A multi-level clustering scheme for population assignments based on core genome MLST. bioRxiv. 2020. doi:10.1101/2020.11.25.397539.
Links
- Repository: https://github.com/zheminzhou/HierCC