VirBin
VirBin reconstructs viral haplotypes from contigs assembled from next-generation sequencing (NGS) data, in order to characterize intra-species genetic diversity in RNA viruses such as influenza and HIV.
Key Features:
- Contig binning for haplotype reconstruction: Clusters sequencing contigs into groups representing distinct viral haplotypes to enable genome-scale haplotype reconstruction.
- Prototype-based clustering: Groups contigs around prototypes by identifying regions likely to contain haplotype-specific mutations, rather than relying solely on sequence composition and contig coverage.
- Expectation-Maximization variant: Implements a variant of the Expectation-Maximization (EM) algorithm adapted for prototype-based clustering.
- Benchmarking and performance: Reported high sensitivity and precision, compared with other contig binning tools, on multiple simulated datasets with varying haplotype abundance distributions and contig sizes and on mock quasispecies sequencing data.
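The sensitivity and precision figures mentioned above can be illustrated with a small calculation. The sketch below uses one common contig-count definition of binning precision and sensitivity; it is an assumption for illustration and may differ from the definition used in the VirBin publication. The function name `binning_precision_sensitivity` and the use of `None` for unbinned contigs are hypothetical.

```python
from collections import Counter, defaultdict


def binning_precision_sensitivity(true_labels, predicted_bins):
    """Return (precision, sensitivity) for a contig binning result.

    Assumed definitions (illustrative, not taken from VirBin):
      precision   = contigs matching the majority haplotype of their bin,
                    divided by the number of binned contigs
      sensitivity = contigs of each haplotype captured by its best bin,
                    divided by the total number of contigs
    A predicted bin of None marks a contig that was left unbinned.
    """
    if len(true_labels) != len(predicted_bins):
        raise ValueError("true_labels and predicted_bins must have equal length")

    by_bin = defaultdict(Counter)   # bin -> counts of true haplotypes
    by_hap = defaultdict(Counter)   # haplotype -> counts of bins
    for truth, pred in zip(true_labels, predicted_bins):
        if pred is None:
            continue  # unbinned contigs lower sensitivity only
        by_bin[pred][truth] += 1
        by_hap[truth][pred] += 1

    n_binned = sum(sum(c.values()) for c in by_bin.values())
    n_total = len(true_labels)
    precision = (
        sum(max(c.values()) for c in by_bin.values()) / n_binned if n_binned else 0.0
    )
    sensitivity = (
        sum(max(c.values()) for c in by_hap.values()) / n_total if n_total else 0.0
    )
    return precision, sensitivity


# Toy example: six contigs from three haplotypes, one contig left unbinned.
truth = ["A", "A", "A", "B", "B", "C"]
bins = [0, 0, 1, 1, 1, None]
precision, sensitivity = binning_precision_sensitivity(truth, bins)
```

In this toy example, four of the five binned contigs agree with their bin's majority haplotype (precision 4/5), and four of the six contigs are captured by the best bin for their haplotype (sensitivity 4/6).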
Scientific Applications:
- Vaccine and drug design: Enables detailed characterization of viral strain diversity to inform targeted therapeutic and vaccine development.
- Viral population characterization: Reconstructs full genome-scale haplotypes from NGS data to assess intra-host viral diversity.
- Evolution and pathogenesis studies: Supports analysis of viral evolution and pathogenesis by resolving closely related haplotypes in RNA virus populations.
Methodology:
Applies prototype-based clustering with an adapted Expectation-Maximization (EM) algorithm. The method identifies regions enriched for haplotype-specific mutations, which helps it separate haplotypes despite high sequence similarity and heterogeneous sequencing coverage.
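To make the idea of EM-driven, prototype-based clustering concrete, here is a minimal sketch of a generic one-dimensional Gaussian mixture fitted with standard EM. It is not VirBin's algorithm: as a deliberate simplification it clusters contigs on a single per-contig coverage value, with each prototype being a mean coverage level, whereas VirBin works with regions enriched for haplotype-specific mutations. The function names (`em_bin_contigs`, `e_step`, `gaussian_pdf`), the parameters, and the toy coverage values are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def e_step(values, weights, means, sds):
    """Compute each value's responsibility under every prototype."""
    resp = []
    for x in values:
        dens = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
        total = sum(dens)
        if total == 0.0:
            # All densities underflowed: fall back to a uniform assignment.
            resp.append([1.0 / len(dens)] * len(dens))
        else:
            resp.append([d / total for d in dens])
    return resp


def em_bin_contigs(coverages, n_haplotypes, n_iter=50, min_sd=1.0):
    """Assign contigs to prototypes using EM on a 1-D Gaussian mixture.

    Returns (labels, means): the hard bin label per contig and the
    fitted prototype means. Illustrative only.
    """
    lo, hi = min(coverages), max(coverages)
    k = n_haplotypes
    # Initialise prototypes evenly across the observed coverage range.
    means = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    sds = [max((hi - lo) / (2.0 * k), min_sd)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        resp = e_step(coverages, weights, means, sds)
        # M-step: re-estimate each prototype from its weighted members.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty prototype unchanged
            means[j] = sum(r[j] * x for r, x in zip(resp, coverages)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, coverages)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
            weights[j] = nj / len(coverages)

    resp = e_step(coverages, weights, means, sds)
    labels = [max(range(k), key=r.__getitem__) for r in resp]
    return labels, means


# Toy coverages for nine contigs drawn from three abundance levels.
toy_coverages = [98, 102, 100, 51, 49, 50, 11, 9, 10]
labels, prototype_means = em_bin_contigs(toy_coverages, n_haplotypes=3)
```

The floor on the standard deviation (`min_sd`) keeps a prototype from collapsing onto a single contig, which is a common safeguard in mixture fitting. Coverage alone cannot separate haplotypes of similar abundance, which is the limitation the mutation-aware approach described above is meant to address.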
Details
- Programming Languages: Python
- Added: 1/14/2020
- Last Updated: 1/16/2021
Publications
Chen J, Shang J, Wang J, Sun Y. A binning tool to reconstruct viral haplotypes from assembled contigs. BMC Bioinformatics. 2019;20(1). doi:10.1186/s12859-019-3138-1. PMID:31684876. PMCID:PMC6829986.