QQ-SNV

QQ-SNV applies a logistic regression classifier using quality score quantiles to detect low-frequency single nucleotide variants (SNVs) and distinguish true variants from sequencing errors in Illumina deep sequencing data.


Key Features:

  • Quality Quantile-Based Classification: Uses quantiles of base quality scores from Illumina reads in a logistic regression model to estimate the probability that an observed variant is a true SNV rather than a sequencing error.
  • Flexible Sensitivity and Specificity Settings: Provides three calling modes: QQ-SNV(D), with an SNV probability cutoff of 0.5; QQ-SNV(HS), with a cutoff of 0.0001 to increase sensitivity; and QQ-SNV(HS-P80), which applies an 80th-percentile error-frequency override to improve both sensitivity and specificity.
  • Comparative Performance: In studies on HIV-1, HCV plasmid mixtures, and an influenza H1N1 clinical dataset, QQ-SNV(HS-P80) balanced sensitivity and specificity better than LoFreq, ShoRAH, and V-Phaser 2. In a paired-end HCV experiment with variants spiked in at a true frequency of 0.5%, it achieved 100% sensitivity and 100% specificity, versus 40–60% sensitivity and 98.0–99.7% specificity for the other methods.
  • Computational Efficiency: Required significantly less computation time than the other methods in those comparisons.
  • Detection of Very Low-Frequency Variants: Consistently identified four putative true variants below 0.5% frequency across clinical samples and different generations of Illumina sequencers.
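
As a rough illustration of the quality-quantile idea behind the first feature above, the sketch below decodes a FASTQ quality string and summarises it by its quartiles. The function names, the Phred+33 offset, and the choice of quartiles are assumptions made for this example; the entry does not state which quantiles QQ-SNV uses or how it aggregates them per position.

```python
from statistics import quantiles


def phred_from_ascii(qual_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    The Phred+33 offset is an assumption (it is the usual encoding
    for current Illumina FASTQ files).
    """
    return [ord(ch) - offset for ch in qual_string]


def quality_quantiles(scores, n=4):
    """Return the n-1 cut points that split the scores into n groups.

    n=4 (quartiles) is an illustrative choice, not QQ-SNV's documented one.
    """
    return quantiles(scores, n=n, method="inclusive")


# Hypothetical quality string for the bases observed at one position.
scores = phred_from_ascii("IIIIFFFF5555")
features = quality_quantiles(scores)
print(features)
```

A summary of this kind turns an arbitrary number of per-base quality scores into a fixed-length feature vector, which is what a regression model needs as input.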

Scientific Applications:

  • Viral deep sequencing analysis: Detection of low-frequency SNVs in viral populations to inform studies of viral evolution, drug resistance, and intra-host population dynamics using Illumina deep sequencing data.

Methodology:

Implements a logistic regression classifier trained on quality score quantiles from Illumina sequencing reads and applies configurable SNV probability cutoffs (e.g., 0.5 and 0.0001) with an optional 80th-percentile error-frequency override (HS-P80).
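
The cutoff logic described above can be sketched as follows. This is a minimal illustration, not the QQ-SNV implementation: the coefficients, intercept, and feature values are hypothetical placeholders, the function names are invented for the example, and treating the cutoff as inclusive (>=) is an assumption. Only the two cutoff values (0.5 and 0.0001) come from the description. The HS-P80 override is left out because its exact rule is not given here.

```python
import math

# Probability cutoffs from the description: default (D) and high sensitivity (HS).
CUTOFFS = {"D": 0.5, "HS": 0.0001}


def snv_probability(features, coefficients, intercept):
    """Standard logistic model: p = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def call_snv(probability, mode="D"):
    """Call a variant when its probability reaches the cutoff for the mode."""
    return probability >= CUTOFFS[mode]


# Hypothetical inputs: three quality-quantile features and made-up weights.
features = [20.0, 37.0, 40.0]
coefficients = [0.05, 0.02, 0.01]
intercept = -3.0

p = snv_probability(features, coefficients, intercept)
for mode in CUTOFFS:
    print(mode, round(p, 3), call_snv(p, mode))
```

With these placeholder numbers the same candidate is rejected at the 0.5 cutoff but called at 0.0001, which shows how lowering the cutoff trades specificity for sensitivity.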

Details

Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: Perl, SAS
Added: 5/19/2018
Last Updated: 12/10/2018

Publications

Van der Borght K, Thys K, Wetzels Y, Clement L, Verbist B, Reumers J, van Vlijmen H, Aerssens J. QQ-SNV: single nucleotide variant detection at low frequency by comparing the quality quantiles. BMC Bioinformatics. 2015;16(1). doi:10.1186/s12859-015-0812-9. PMID:26554718. PMCID:PMC4641353.
