DeepVariant
DeepVariant calls genetic variants from aligned sequencing reads (BAM/CRAM). It converts read alignments into pileup image tensors and classifies them with a convolutional neural network, with the aim of improving variant detection accuracy for both individual samples and population-scale cohorts.
Key Features:
- Input formats: Accepts aligned sequencing reads in BAM and CRAM formats.
- Pileup image tensor conversion: Converts aligned reads into pileup image tensors representing local read context.
- Convolutional neural network: Uses a CNN to classify pileup tensors and call genetic variants.
- Cohort merging integration: Integrates with GLnexus for scalable merging of per-sample calls into cohort callsets.
- Callset outputs: Produces individual-level variant calls and cohort-level callsets.
- Quality evaluation metrics: Assesses callset quality using variant recall, precision, and Mendelian consistency checks.
- Benchmarking results: Demonstrated improved callset quality and imputation reference panel performance relative to GATK Best Practices on deeply sequenced 1000 Genomes Project (1KGP) samples.
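To make the pileup tensor idea above more concrete, the following is a minimal sketch of how overlapping reads might be laid out as a small numeric matrix around a candidate site. It is a toy illustration only: the `build_pileup` function, the `BASE_CODES` encoding, and the window and row sizes are all hypothetical and do not reflect DeepVariant's actual channels or image dimensions.

```python
from typing import List, Tuple

# Hypothetical single-channel base encoding, for illustration only.
# DeepVariant's real pileup images use several channels and a different layout.
BASE_CODES = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}

# A read is (0-based start position on the reference, read bases).
# Assumption: reads are gapless alignments (no indels or clipping).
Read = Tuple[int, str]


def build_pileup(reads: List[Read], center: int,
                 width: int = 5, max_rows: int = 4) -> List[List[float]]:
    """Return a max_rows x width matrix of base codes for a window around `center`."""
    left = center - width // 2  # reference position of the first column
    rows: List[List[float]] = []
    for start, bases in reads[:max_rows]:
        row = [0.0] * width  # 0.0 marks "no base observed"
        for offset, base in enumerate(bases):
            col = start + offset - left
            if 0 <= col < width:
                row[col] = BASE_CODES.get(base, 0.0)
        rows.append(row)
    # Pad with empty rows so every pileup has the same shape.
    while len(rows) < max_rows:
        rows.append([0.0] * width)
    return rows


if __name__ == "__main__":
    for row in build_pileup([(8, "ACGT"), (10, "TT")], center=10):
        print(row)
```

A fixed-shape matrix like this is what makes the data suitable as input to an image classifier; a real encoding would also carry base quality, mapping quality, strand, and similar per-read information.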
Scientific Applications:
- Cohort-level variant generation: Producing high-quality cohort callsets for large-scale genetic studies.
- Individual sample variant calling: Generating accurate per-sample variant calls from aligned reads.
- Imputation reference panel improvement: Enhancing imputation reference panel performance through higher-quality callsets.
- Benchmarking and validation: Evaluating variant caller performance using variant recall, precision, and Mendelian consistency in trios.
- Population-scale analyses: Applied to deeply sequenced 1000 Genomes Project (1KGP) samples for population genetics studies.
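The Mendelian consistency check used for trio-based validation can be sketched as follows. This is a simplified illustration, assuming diploid autosomal genotypes with no missing calls and ignoring genuine de novo mutations; the function names `is_mendelian_consistent` and `mendelian_violation_rate` are hypothetical and not part of DeepVariant or GLnexus.

```python
from typing import Iterable, Tuple

# A diploid genotype as a pair of allele indices, e.g. (0, 1) for a heterozygous call.
Genotype = Tuple[int, int]
Trio = Tuple[Genotype, Genotype, Genotype]  # (child, father, mother)


def is_mendelian_consistent(child: Genotype, father: Genotype, mother: Genotype) -> bool:
    """True if the child could have inherited one allele from each parent."""
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)


def mendelian_violation_rate(trios: Iterable[Trio]) -> float:
    """Fraction of sites whose trio genotypes violate Mendelian inheritance."""
    sites = list(trios)
    if not sites:
        return 0.0
    violations = sum(
        1 for child, father, mother in sites
        if not is_mendelian_consistent(child, father, mother)
    )
    return violations / len(sites)


if __name__ == "__main__":
    sites = [
        ((0, 1), (0, 0), (1, 1)),  # consistent: 0 from father, 1 from mother
        ((1, 1), (0, 0), (0, 1)),  # violation: father carries no 1 allele
    ]
    print(mendelian_violation_rate(sites))
```

A lower violation rate across many trio sites suggests fewer genotyping errors, which is why the metric is useful when no truth set is available for the samples.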
Methodology:
DeepVariant processes aligned reads (BAM/CRAM), converts them into pileup image tensors, and classifies the tensors with a convolutional neural network to call variants. Per-sample calls are then merged into cohort callsets with GLnexus, and callset quality is evaluated using variant recall, precision, and Mendelian consistency checks.
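The recall and precision evaluation step can be illustrated with a short sketch. It assumes variants are compared by exact match on chromosome, position, reference allele, and alternate allele, which is a simplification: dedicated benchmarking tools typically also normalize variant representation and account for genotype. The `precision_recall` function and the example variants are hypothetical.

```python
from typing import Dict, Set, Tuple

# A variant keyed by (chromosome, 1-based position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def precision_recall(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare a callset against a truth set by exact variant match."""
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but missed by the caller
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}


if __name__ == "__main__":
    truth = {
        ("chr1", 100, "A", "G"),
        ("chr1", 200, "C", "T"),
        ("chr1", 300, "G", "A"),
        ("chr1", 400, "T", "C"),
    }
    called = {
        ("chr1", 100, "A", "G"),
        ("chr1", 200, "C", "T"),
        ("chr1", 250, "G", "C"),  # not in the truth set: a false positive
    }
    print(precision_recall(called, truth))
```

Precision penalizes spurious calls while recall penalizes missed variants, so reporting both gives a fuller picture of caller behavior than either alone.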
Details
- License: BSD-3-Clause
- Programming Languages: Python, C++, Shell
- Added: 3/19/2021
- Last Updated: 11/24/2024
Publications
Yun T, Li H, Chang P, Lin MF, Carroll A, McLean CY. Accurate, scalable cohort variant calls using DeepVariant and GLnexus. Bioinformatics. 2020;36(24):5582-5589. doi:10.1093/bioinformatics/btaa1081. PMID:33399819. PMCID:PMC8023681.
Downloads
- Container file: https://hub.docker.com/r/google/deepvariant