tagIMPUTE
tagIMPUTE imputes genotypes at untyped single nucleotide polymorphisms (SNPs) from flanking typed SNPs, using linkage disequilibrium (LD) estimated in an external reference panel, so that untyped SNPs can be included in downstream genetic association analyses.
Key Features:
- Use of linkage disequilibrium and reference panels: Leverages LD structure in an external reference panel and flanking SNPs to predict untyped SNP genotypes.
- Two-stage imputation strategy: Predicts untyped genotypes either as the most likely genotype or as expected genotype counts and then uses these imputed values in downstream association analyses.
- Maximum-likelihood statistical framework: Integrates genotype prediction and estimation of association parameters within a unified maximum-likelihood approach, providing estimators for genetic effects and gene-environment interactions with variance estimation.
- Type I error control: Controls type I error in single-SNP tests, including with covariate adjustment and under reference panel misspecification; this control has limitations for multiple-SNP tests and some gene-environment interaction analyses.
- Bias and variance considerations: Addresses the bias and the generally underestimated variances that imputation introduces, applying methods intended to minimize them.
- Simulation-based evaluation: Has been evaluated using extensive simulation studies comparing bias, type I error, power, and confidence interval coverage across single-SNP, multiple-SNP, and gene-environment interaction scenarios.
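The two imputed values named in the two-stage strategy above can be illustrated with a minimal sketch. It assumes that a prediction step has already produced genotype probabilities for one individual at one untyped SNP; the function names and the probabilities are hypothetical and are not part of tagIMPUTE's interface.

```python
def expected_count(probs):
    """Expected allele count (dosage) from P(G=0), P(G=1), P(G=2)."""
    p0, p1, p2 = probs
    return 0.0 * p0 + 1.0 * p1 + 2.0 * p2


def most_likely_genotype(probs):
    """Genotype (0, 1, or 2 copies of the allele) with the highest probability."""
    return max(range(3), key=lambda g: probs[g])


# Hypothetical predicted probabilities for one individual at one untyped SNP.
probs = (0.10, 0.60, 0.30)
dosage = expected_count(probs)
best_guess = most_likely_genotype(probs)
print(dosage, best_guess)
```

Either value can then be used as a covariate in an ordinary association model. The expected count retains some of the prediction uncertainty, whereas the most likely genotype discards it.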
Scientific Applications:
- Localization of disease-causing variants: Enables analysis of untyped SNPs to facilitate localization of causal variants in genetic studies.
- Meta-analysis across genotyping platforms: Supports combining data from different genotyping arrays by imputing untyped variants to enable cross-platform meta-analysis.
- Integration into association studies and large-scale projects: Incorporates imputed genotypes into association analyses for large-scale genomic research, including genome-wide data such as those from the Wellcome Trust Case-Control Consortium.
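Once the same SNP has been typed or imputed in every study, per-study effect estimates can be combined. The sketch below uses fixed-effect inverse-variance weighting, a standard meta-analysis technique that is assumed here for illustration; the source does not state how studies are combined, and the function name and numbers are hypothetical.

```python
import math


def fixed_effect_meta(estimates, std_errors):
    """Combine per-study effect estimates by inverse-variance weighting."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log odds ratios for one SNP from two genotyping platforms.
pooled, pooled_se = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
print(pooled, pooled_se)
```

The weights are only as good as the standard errors supplied, so variances that are underestimated after imputation would give a study too much weight.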
Methodology:
Uses the LD structure of an external reference panel, together with flanking SNPs, to predict genotypes at untyped SNPs. The two-stage imputation approach predicts either the most likely genotype or the expected genotype count and then uses the imputed value in association analyses. The maximum-likelihood framework instead handles genotype prediction and the estimation of association parameters jointly, and provides variance estimates. The methods were evaluated in simulation studies comparing bias, type I error, power, and confidence interval coverage under single-SNP, multiple-SNP, and gene-environment interaction scenarios in cross-sectional and case-control designs.
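The role of the reference panel can be sketched for the simplest case: one flanking (tag) SNP and a panel of phased haplotypes. This is an illustrative simplification and not tagIMPUTE's algorithm, which may use several flanking SNPs and a different estimator; all names and counts below are hypothetical.

```python
from collections import Counter


def conditional_allele_freq(panel):
    """Estimate P(untyped allele = 1 | tag allele) from reference haplotypes.

    Each haplotype is a (tag_allele, untyped_allele) pair of 0/1 values.
    """
    counts = Counter(panel)
    freq = {}
    for tag in (0, 1):
        total = counts[(tag, 0)] + counts[(tag, 1)]
        if total == 0:
            raise ValueError(f"tag allele {tag} is absent from the panel")
        freq[tag] = counts[(tag, 1)] / total
    return freq


def expected_untyped_count(tag_genotype, freq):
    """Expected untyped allele count given the tag genotype (0, 1, or 2).

    An individual with tag genotype g carries g haplotypes with tag
    allele 1 and 2 - g haplotypes with tag allele 0.
    """
    return tag_genotype * freq[1] + (2 - tag_genotype) * freq[0]


# Hypothetical reference panel of 20 haplotypes in strong LD.
reference_panel = [(1, 1)] * 6 + [(1, 0)] * 2 + [(0, 1)] * 3 + [(0, 0)] * 9
freq = conditional_allele_freq(reference_panel)
for g in (0, 1, 2):
    print(g, expected_untyped_count(g, freq))
```

The stronger the LD between the tag SNP and the untyped SNP in the panel, the further the two conditional frequencies lie from each other and the more informative the tag genotype is.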
Details
- Tool Type: command-line tool
- Operating Systems: Linux
- Added: 8/3/2017
- Last Updated: 11/25/2024
Publications
Hu Y, Lin D. Analysis of untyped SNPs: maximum likelihood and imputation methods. Genetic Epidemiology. 2010;34(8):803-815. doi:10.1002/gepi.20527. PMID:21104886. PMCID:PMC3030127.