VCFtools
VCFtools is a toolkit for processing and analyzing genetic variation data stored in Variant Call Format (VCF) files. It can manipulate, validate, merge, compare, and query records describing single nucleotide polymorphisms (SNPs), insertions, deletions, structural variants, and their annotations.
Key Features:
- Variant manipulation and analysis: Provides utilities for manipulating and analyzing records in VCF files, including operations on individual variants.
- Validation: Ensures VCF files conform to expected standards to maintain data integrity across studies and platforms.
- Merging: Combines multiple VCF files into a single dataset to integrate variant calls from different samples or sources.
- Comparison: Compares genetic variants between datasets to identify differences or patterns across populations or conditions.
- Perl API: Exposes a general Perl programming interface for custom processing and extension of VCF-related analyses.
- Compressed and indexed VCF support: Handles compressed VCF files and uses indexing to enable rapid retrieval of variant information at specified genome positions.
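To make the comparison feature concrete, here is a minimal Python sketch of a site-level comparison between two VCF datasets. It is not VCFtools code and does not use the VCFtools Perl API: the function names `read_sites` and `compare_sites` are hypothetical, the sample records are made up for illustration, and matching records on (CHROM, POS, REF, ALT) is an assumption of this sketch.

```python
from typing import Dict, Iterable, Set, Tuple

# A site is keyed here by (CHROM, POS, REF, ALT). This key is an assumption of
# the sketch, not necessarily how VCFtools matches records.
Site = Tuple[str, int, str, str]


def read_sites(lines: Iterable[str]) -> Set[Site]:
    """Collect variant sites from VCF text, skipping header and blank lines."""
    sites: Set[Site] = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Fixed VCF columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
        sites.add((fields[0], int(fields[1]), fields[3], fields[4]))
    return sites


def compare_sites(first: Iterable[str], second: Iterable[str]) -> Dict[str, Set[Site]]:
    """Split the sites of two datasets into unique-to-each and shared sets."""
    a, b = read_sites(first), read_sites(second)
    return {"only_first": a - b, "only_second": b - a, "shared": a & b}


# Illustrative, made-up records (not real variant calls).
HEADER = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
VCF_A = [
    "##fileformat=VCFv4.2",
    HEADER,
    "1\t100\t.\tA\tG\t.\tPASS\t.",
    "1\t200\t.\tC\tT\t.\tPASS\t.",
]
VCF_B = [
    "##fileformat=VCFv4.2",
    HEADER,
    "1\t200\t.\tC\tT\t.\tPASS\t.",
    "2\t50\t.\tG\tGA\t.\tPASS\t.",
]

result = compare_sites(VCF_A, VCF_B)
for name, sites in result.items():
    print(name, sorted(sites))
```

Set operations keep the sketch short. A real comparison would also have to handle multi-allelic records, differing sample columns, and files too large to hold in memory.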
Scientific Applications:
- Population and comparative genomics: Comparing variants between populations or conditions to identify significant differences or patterns.
- Large-scale variant dataset integration: Merging and processing VCFs for large projects and databases such as the 1000 Genomes Project, UK10K, dbSNP, and the NHLBI Exome Project.
- Data validation and standardization: Validating VCF files to ensure consistent formats and integrity across studies and platforms.
- Targeted variant retrieval: Rapidly retrieving variant information at specified genome positions using compressed/indexed VCFs.
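The targeted retrieval use case can be illustrated with a small Python sketch of a position lookup. It is a simplified stand-in, not the indexing scheme VCFtools or tabix actually use: it builds an in-memory, per-chromosome sorted list and answers range queries by binary search. The names `build_position_index` and `fetch` are hypothetical, the records are made up, and plain gzip is used only to keep the example self-contained (tabix-style indexing requires BGZF-compressed files).

```python
import bisect
import gzip
import io
from typing import Dict, Iterable, List, Tuple

Index = Dict[str, Tuple[List[int], List[str]]]


def build_position_index(handle: Iterable[str]) -> Index:
    """Map each chromosome to parallel, position-sorted lists of POS and record lines."""
    by_chrom: Dict[str, List[Tuple[int, str]]] = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        chrom, pos = line.split("\t", 2)[:2]
        by_chrom.setdefault(chrom, []).append((int(pos), line))
    index: Index = {}
    for chrom, records in by_chrom.items():
        records.sort(key=lambda rec: rec[0])
        index[chrom] = ([p for p, _ in records], [rec for _, rec in records])
    return index


def fetch(index: Index, chrom: str, start: int, end: int) -> List[str]:
    """Return records on `chrom` with start <= POS <= end (1-based, inclusive)."""
    positions, lines = index.get(chrom, ([], []))
    lo = bisect.bisect_left(positions, start)
    hi = bisect.bisect_right(positions, end)
    return lines[lo:hi]


# Illustrative, made-up records, compressed in memory.
VCF_TEXT = "\n".join([
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t.\tPASS\t.",
    "1\t200\t.\tC\tT\t.\tPASS\t.",
    "1\t300\t.\tG\tA\t.\tPASS\t.",
    "1\t400\t.\tT\tC\t.\tPASS\t.",
    "2\t50\t.\tG\tGA\t.\tPASS\t.",
]) + "\n"
compressed = gzip.compress(VCF_TEXT.encode("utf-8"))

with gzip.open(io.BytesIO(compressed), "rt", encoding="utf-8") as handle:
    index = build_position_index(handle)

hits = fetch(index, "1", 150, 300)
for record in hits:
    print(record)
```

The binary search touches only the records inside the requested window, which is the point of indexing. An on-disk index goes further by avoiding decompression of unrelated parts of the file.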
Methodology:
VCFtools validates VCF files, merges multiple VCF files into one, and compares variants between files. It also provides a Perl API for custom processing and supports retrieval of records at specified genome positions from compressed, indexed VCF files.
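As a rough illustration of what the validation step involves, the following Python sketch performs a few structural checks on VCF text. It is not the VCFtools validator and covers only a small part of the format: the `check_vcf` function is hypothetical, the records are made up, and the rules checked (a leading `##fileformat` line, the eight fixed column names, at least eight tab-separated fields per record, and an integer POS) are a deliberately small subset chosen for this example.

```python
from typing import Iterable, List

MANDATORY_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def check_vcf(lines: Iterable[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    header_seen = False
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if number == 1 and not line.startswith("##fileformat="):
            problems.append("line 1: missing ##fileformat declaration")
        if line.startswith("##"):
            continue  # meta-information line
        if line.startswith("#"):
            # Column header line: the first eight names are fixed.
            if line.split("\t")[:8] != MANDATORY_COLUMNS:
                problems.append(f"line {number}: unexpected column header")
            header_seen = True
            continue
        if not line:
            continue
        if not header_seen:
            problems.append(f"line {number}: record before #CHROM header")
        fields = line.split("\t")
        if len(fields) < 8:
            problems.append(
                f"line {number}: expected at least 8 fields, found {len(fields)}"
            )
            continue
        if not fields[1].isdigit():
            problems.append(f"line {number}: POS is not an integer: {fields[1]!r}")
    return problems


# Illustrative, made-up inputs.
COLUMNS = "\t".join(MANDATORY_COLUMNS)
GOOD = ["##fileformat=VCFv4.2", COLUMNS, "1\t100\t.\tA\tG\t.\tPASS\t."]
BAD = [COLUMNS, "1\tabc\t.\tA\tG\t.\tPASS\t.", "1\t300\t.\tA"]

good_problems = check_vcf(GOOD)
bad_problems = check_vcf(BAD)
for problem in bad_problems:
    print(problem)
```

Collecting every problem instead of stopping at the first one lets a single pass report all detected issues in a file.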
Details
- License: GPL-3.0
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: C++, Perl
- Added: 1/13/2017
- Last Updated: 2/12/2019
Publications
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156-2158. doi:10.1093/bioinformatics/btr330. PMID:21653522. PMCID:PMC3137218.