tagIMPUTE

tagIMPUTE imputes untyped single nucleotide polymorphisms (SNPs) by leveraging linkage disequilibrium from external reference panels and flanking SNPs to enable downstream genetic association analyses.


Key Features:

  • Use of linkage disequilibrium and reference panels: Leverages LD structure in an external reference panel and flanking SNPs to predict untyped SNP genotypes.
  • Two-stage imputation strategy: Predicts untyped genotypes either as the most likely genotype or as expected genotype counts and then uses these imputed values in downstream association analyses.
  • Maximum-likelihood statistical framework: Integrates genotype prediction and estimation of association parameters within a unified maximum-likelihood approach, providing estimators for genetic effects and gene-environment interactions with variance estimation.
  • Type I error control: Controls type I error in single-SNP tests, including with covariate adjustment and under reference panel misspecification; type I error control is limited for multiple-SNP tests and some gene-environment interaction analyses.
  • Bias and variance considerations: Accounts for the bias and the generally underestimated variances that result from imputation, and applies methods intended to minimize them.
  • Simulation-based evaluation: Has been evaluated using extensive simulation studies comparing bias, type I error, power, and confidence interval coverage across single-SNP, multiple-SNP, and gene-environment interaction scenarios.
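
The two summaries named in the two-stage imputation strategy above (most likely genotype versus expected genotype count) can be illustrated with a minimal Python sketch. It assumes the genotype probabilities for an untyped SNP are already available as a three-element list; the function names and the example probabilities are hypothetical and are not part of tagIMPUTE's interface.

```python
def most_likely_genotype(probs):
    """Return the genotype (0, 1 or 2 copies of the allele) with the highest probability."""
    return max(range(3), key=lambda g: probs[g])


def expected_genotype_count(probs):
    """Return the expected allele count: 0*P(0) + 1*P(1) + 2*P(2)."""
    return probs[1] + 2 * probs[2]


# Hypothetical predicted probabilities for genotypes 0, 1 and 2 at one untyped SNP.
genotype_probs = [0.10, 0.60, 0.30]

best_guess = most_likely_genotype(genotype_probs)   # a single hard call
dosage = expected_genotype_count(genotype_probs)    # a fractional count between 0 and 2
print(best_guess, round(dosage, 3))
```

The hard call discards the uncertainty in the prediction, while the expected count retains it as a fractional value; either one is then passed to the downstream association analysis.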

Scientific Applications:

  • Localization of disease-causing variants: Enables analysis of untyped SNPs to facilitate localization of causal variants in genetic studies.
  • Meta-analysis across genotyping platforms: Supports combining data from different genotyping arrays by imputing untyped variants to enable cross-platform meta-analysis.
  • Integration into association studies and large-scale projects: Integrates imputed genotypes into association analyses for large-scale genomic research, for example with genome-wide data from the Wellcome Trust Case-Control Consortium.
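
As a rough illustration of how imputed values enter an association analysis, the Python sketch below regresses a quantitative trait on expected genotype counts using ordinary least squares. The data and the function name are hypothetical, and this simple regression is a stand-in for the second stage of a two-stage analysis, not tagIMPUTE's own estimation procedure. As noted under Key Features, treating imputed values as if they were observed genotypes generally leads to underestimated variances.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x for a single predictor with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical expected genotype counts (dosages) and trait values for five subjects.
dosages = [0.0, 0.5, 1.0, 1.5, 2.0]
trait = [1.0, 1.5, 2.0, 2.5, 3.0]

effect_estimate = ols_slope(dosages, trait)  # estimated per-allele effect
print(effect_estimate)
```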

Methodology:

tagIMPUTE uses the LD structure of an external reference panel, together with flanking SNPs, to predict untyped SNP genotypes. It offers two approaches. The two-stage imputation approach predicts either the most likely genotype or the expected genotype count and then uses these values in association analyses. The maximum-likelihood approach integrates genotype prediction and the estimation of association parameters in a single framework, with variance estimation. Both were evaluated in simulation studies comparing bias, type I error, power, and confidence interval coverage in single-SNP, multiple-SNP, and gene-environment interaction scenarios under cross-sectional and case-control designs.
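
To make the role of the reference panel concrete, the Python sketch below estimates the probability of an untyped allele given the alleles at two flanking SNPs by counting haplotypes in a toy reference panel. This is a deliberately simplified illustration: it assumes phased haplotypes and exactly two flanking SNPs, and the panel, function names, and data are hypothetical. It is not the likelihood used by tagIMPUTE, which the source describes only at a high level.

```python
from collections import Counter

# Hypothetical reference panel. Each haplotype is
# (left flanking allele, untyped allele, right flanking allele), coded 0/1.
reference_haplotypes = [
    (0, 0, 0), (0, 0, 0), (0, 0, 0),
    (1, 1, 1), (1, 1, 1), (1, 0, 1),
    (0, 1, 1), (1, 0, 0),
]


def conditional_allele_prob(panel, left, right):
    """P(untyped allele = 1 | flanking alleles), or None if the flanking pair is unseen."""
    counts = Counter()
    for l, u, r in panel:
        if (l, r) == (left, right):
            counts[u] += 1
    total = counts[0] + counts[1]
    if total == 0:
        return None
    return counts[1] / total


def expected_count_from_haplotypes(panel, hap_a, hap_b):
    """Expected untyped allele count for a subject carrying two phased flanking haplotypes."""
    p_a = conditional_allele_prob(panel, *hap_a)
    p_b = conditional_allele_prob(panel, *hap_b)
    if p_a is None or p_b is None:
        return None
    return p_a + p_b


# A subject carrying flanking haplotypes (1, 1) and (0, 0).
subject_dosage = expected_count_from_haplotypes(reference_haplotypes, (1, 1), (0, 0))
print(subject_dosage)
```

The stronger the LD between the flanking SNPs and the untyped SNP in the panel, the closer these conditional probabilities are to 0 or 1, and the more informative the prediction becomes.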

Details

Tool Type: command-line tool
Operating Systems: Linux
Added: 8/3/2017
Last Updated: 11/25/2024

Publications

Hu Y, Lin D. Analysis of untyped SNPs: maximum likelihood and imputation methods. Genetic Epidemiology. 2010;34(8):803-815. doi:10.1002/gepi.20527. PMID:21104886. PMCID:PMC3030127.
