VGA

VGA reconstructs heterogeneous viral populations from ultra-deep next-generation sequencing data by using barcode-aware assembly to distinguish rare variants from sequencing errors.


Key Features:

  • Barcode-based error elimination: Uses the individual barcodes attached to sequencing fragments to identify and remove sequencing errors, so that only error-corrected reads enter assembly.
  • Expectation-maximization abundance estimation: Estimates the relative abundance of each assembled viral variant with an expectation-maximization algorithm.
  • Assembly for heterogeneous populations: Reconstructs the individual viral variants present in a mixed population.
  • Scalability: Handles datasets of millions of sequencing reads.
  • Rare variant sensitivity: Detects rare variants that sequencing errors would otherwise obscure.
  • Empirical validation: Reported by its authors to outperform state-of-the-art methods on both synthetic and real datasets, including HIV populations.
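
The barcode-based error elimination listed above can be illustrated with a minimal sketch. This is not VGA's code: the function name `consensus_by_barcode`, the parameters `min_reads` and `min_agreement`, and the assumption that reads sharing a barcode are already aligned to equal length are all hypothetical choices made for illustration. The idea shown is that reads carrying the same barcode derive from one original fragment, so a per-position majority vote can remove isolated sequencing errors.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_reads=3, min_agreement=0.9):
    """Collapse reads that share a barcode into one consensus sequence.

    reads: iterable of (barcode, sequence) pairs. This sketch assumes the
    sequences within a barcode group are pre-aligned and of equal length.
    Returns a dict mapping barcode -> consensus sequence.
    """
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)

    consensus = {}
    for barcode, seqs in groups.items():
        # Too few copies of the fragment to vote out errors reliably.
        if len(seqs) < min_reads:
            continue
        # Hypothetical simplification: skip groups with unequal lengths.
        if len({len(s) for s in seqs}) != 1:
            continue

        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Discard the whole group if any position is ambiguous.
            if count / len(seqs) < min_agreement:
                bases = None
                break
            bases.append(base)

        if bases is not None:
            consensus[barcode] = "".join(bases)
    return consensus
```

In this sketch a barcode group is dropped entirely when it is too small or when any position lacks a clear majority, which trades read depth for confidence in the surviving sequences.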

Scientific Applications:

  • HIV population reconstruction: Reconstruction and quantitative analysis of HIV viral populations from next-generation sequencing data.
  • Rare variant detection: Sensitive identification of rare viral variants within heterogeneous viral communities.
  • Large-scale viral diversity studies: Scalable analysis of viral diversity across large sequencing datasets and different species.
  • Evolution and epidemiology: Quantitative assessment of variant abundances to inform studies of viral evolution and epidemiology.

Methodology:

VGA first uses the barcodes attached to individual sequencing fragments to eliminate sequencing errors, then assembles the error-corrected reads into viral variants, and finally estimates the abundance of each variant with an expectation-maximization algorithm. The approach scales to millions of sequencing reads.
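
The abundance-estimation step can be sketched as a generic expectation-maximization loop. This is an illustrative sketch rather than VGA's implementation: the function name `em_abundances`, the `compat` input format, and the uniform read-likelihood model (a read is equally likely under every variant it is compatible with) are assumptions made here. Reads that match several assembled variants are split among them in proportion to the current abundance estimates, and the estimates are then recomputed from those fractional assignments.

```python
def em_abundances(compat, n_variants, n_iter=100, tol=1e-9):
    """Estimate relative variant abundances with a simple EM loop.

    compat: list in which compat[r] holds the (unique) indices of the
            variants that read r is compatible with. Hypothetical format.
    n_variants: number of assembled variants.
    Returns a list of abundances that sums to 1.
    """
    theta = [1.0 / n_variants] * n_variants  # start from uniform abundances

    for _ in range(n_iter):
        # E-step: distribute each read over its compatible variants,
        # weighted by the current abundance estimates.
        counts = [0.0] * n_variants
        for variants in compat:
            total = sum(theta[v] for v in variants)
            if total == 0.0:
                continue  # read matches no variant with nonzero abundance
            for v in variants:
                counts[v] += theta[v] / total

        assigned = sum(counts)
        if assigned == 0.0:
            break  # no read could be assigned; keep current estimates

        # M-step: abundances are the normalized fractional read counts.
        new_theta = [c / assigned for c in counts]
        delta = max(abs(a - b) for a, b in zip(new_theta, theta))
        theta = new_theta
        if delta < tol:
            break

    return theta
```

For example, with three reads unique to variant 0, one read unique to variant 1, and one read compatible with both, `em_abundances([[0], [0], [0], [1], [0, 1]], 2)` converges toward abundances of 0.75 and 0.25.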

Details

Tool Type:
command-line tool
Operating Systems:
Linux, Windows, Mac
Programming Languages:
Python, C
Added:
8/3/2017
Last Updated:
11/25/2024

Publications

Mangul S, Wu NC, Mancuso N, Zelikovsky A, Sun R, Eskin E. Accurate viral population assembly from ultra-deep sequencing data. Bioinformatics. 2014;30(12):i329-i337. doi:10.1093/bioinformatics/btu295. PMID:24932001. PMCID:PMC4058922.
