CNVkit
CNVkit infers and visualizes copy number variants (CNVs) and somatic copy number alterations (SCNAs) from targeted DNA sequencing data to provide genome-wide and exon-level copy number profiles for genomic analysis.
Key Features:
- Data sources: Uses both on-target reads and nonspecifically captured off-target reads from targeted, massively parallel DNA sequencing.
- Resolution within targets: Provides exon-level copy number resolution inside targeted regions.
- Genome-wide resolution: Achieves approximately 100-kilobase resolution genome-wide from platforms targeting as few as 293 genes; off-target reads supply the signal in intronic and intergenic regions.
- Normalization: Normalizes read counts against a pooled reference to reduce sample-to-sample variability.
- Bias correction: Corrects read-depth biases associated with GC content, target footprint size and spacing, and repetitive sequences.
- Bias sources addressed: Accounts for variability introduced by target capture efficiency and library preparation that affect read depth.
- Read-depth analysis: Operates on sequencing read counts/read depth as the primary signal for copy number inference.
- Visualization and reporting: Produces visualizations and reports of identified copy number changes and significant features.
- Benchmarking: Has been evaluated against array comparative genomic hybridization (aCGH) for performance assessment.
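The use of off-target reads described above can be illustrated with a minimal sketch. It tiles the gaps between target intervals on one chromosome into bins of at most 100 kilobases, so that sparse off-target reads can be counted per bin. The function name `off_target_bins`, its parameters, and the coordinates are hypothetical and chosen only for illustration; this is not CNVkit's own binning code.

```python
def off_target_bins(chrom_length, targets, bin_size=100_000):
    """Tile the regions between targets into bins of at most bin_size bases.

    targets: list of (start, end) half-open intervals on one chromosome.
    Returns a list of (start, end) half-open off-target bins.
    Hypothetical helper for illustration, not part of CNVkit.
    """
    bins = []
    cursor = 0  # first base not yet covered by a target or a bin
    for start, end in sorted(targets):
        # Tile the gap that precedes this target.
        pos = cursor
        while pos < start:
            stop = min(pos + bin_size, start)
            bins.append((pos, stop))
            pos = stop
        # Skip over the target itself (handles overlapping targets too).
        cursor = max(cursor, end)
    # Tile the remainder of the chromosome after the last target.
    pos = cursor
    while pos < chrom_length:
        stop = min(pos + bin_size, chrom_length)
        bins.append((pos, stop))
        pos = stop
    return bins


# Two small targets on a 350 kb toy chromosome.
example_bins = off_target_bins(350_000, [(120_000, 121_000), (300_000, 302_000)])
```

A bin size near 100 kilobases matches the genome-wide resolution quoted above; a real pipeline would presumably also exclude unmappable regions, which this sketch ignores.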
Scientific Applications:
- Germline CNV detection: Identification of germline copy number variants relevant to syndromic conditions and genetic studies.
- Somatic CNV/SCNA analysis in cancer: Detection and characterization of somatic copy number alterations in cancer genomes from targeted sequencing data.
- Targeted sequencing studies: Generation of high-resolution copy number data from targeted re-sequencing efforts to support genomic research into genetic contributions to disease.
- Cross-platform comparison: Use in studies comparing sequencing-based CNV calls to array-based methods such as aCGH.
Methodology:
CNVkit leverages both on-target and off-target reads, normalizes read counts against a pooled reference, and corrects for biases associated with GC content, target footprint size and spacing, and repetitive sequences. It then infers copy number at exon level within targets and at approximately 100-kilobase resolution genome-wide.
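The normalization and bias-correction steps in the methodology can be sketched as follows. This is a simplified illustration under stated assumptions, not CNVkit's implementation: the pooled reference is taken as the per-bin median depth across normal samples, the copy ratio is a median-centered log2 ratio, and GC bias is removed by subtracting the median log2 ratio within coarse GC strata (a stand-in for whatever smoothing the tool applies). All function names, parameters, and depth values are hypothetical.

```python
import math
from statistics import median


def pooled_reference(normal_samples):
    """Per-bin median depth across several normal samples (assumed pooling rule)."""
    return [median(depths) for depths in zip(*normal_samples)]


def log2_ratios(sample_depths, reference_depths, pseudocount=0.5):
    """Median-centered log2(sample / reference) for each bin.

    The pseudocount avoids division by zero and log(0) in empty bins.
    """
    raw = [
        math.log2((s + pseudocount) / (r + pseudocount))
        for s, r in zip(sample_depths, reference_depths)
    ]
    center = median(raw)
    return [x - center for x in raw]


def gc_correct(log2_values, gc_fractions, n_strata=4):
    """Subtract the median log2 ratio of each GC stratum from its bins."""
    def stratum(gc):
        return min(int(gc * n_strata), n_strata - 1)

    by_stratum = {}
    for value, gc in zip(log2_values, gc_fractions):
        by_stratum.setdefault(stratum(gc), []).append(value)
    offsets = {key: median(values) for key, values in by_stratum.items()}
    return [v - offsets[stratum(gc)] for v, gc in zip(log2_values, gc_fractions)]


# Toy data: three normals and one sample with a doubled third bin.
normals = [[100, 200, 50, 100], [120, 180, 50, 100], [80, 220, 50, 100]]
reference = pooled_reference(normals)
ratios = log2_ratios([100, 200, 100, 100], reference, pseudocount=0.0)
```

In this toy input the third bin has twice the reference depth, so its log2 ratio is 1 while the other bins stay at 0. Corrections for target size, spacing, and repeat content would follow the same pattern, with a different covariate in place of GC fraction.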
Details
- License: BSD-3-Clause
- Maturity: Mature
- Tool Type: library
- Operating Systems: Mac
- Programming Languages: Python
- Added: 1/13/2017
- Last Updated: 11/25/2024
Publications
Talevich E, Shain AH, Botton T, Bastian BC. CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing. PLOS Computational Biology. 2016;12(4):e1004873. doi:10.1371/journal.pcbi.1004873. PMID:27100738. PMCID:PMC4839673.