ImpG-Summary

ImpG-Summary imputes association statistics at untyped single nucleotide polymorphisms (SNPs) directly from GWAS summary association statistics, using external reference haplotypes such as those from the 1000 Genomes Project. This expands variant coverage for association studies and meta-analyses.


Key Features:

  • Input requirements: Takes 1000 Genomes Project reference haplotypes and summary association statistics at typed SNPs from a GWAS or meta-analysis.
  • Output generation: Produces summary association statistics for variants present in the 1000 Genomes dataset.
  • Gaussian imputation approach: Implements a Gaussian imputation method that imputes association statistics (z-scores) from summary statistics without individual-level genotype data, in contrast to hidden Markov model (HMM)-based imputation.
  • Imputation accuracy: Recovers ~84% of the effective sample size for common variants (minor allele frequency >5%) and ~54% for low-frequency variants (1–5%), improving to ~87% and ~60% respectively when summary linkage disequilibrium (LD) from the target samples is available. Accounts for the limited sample size of the reference panel to reduce false-positive associations.
  • Computational speed: Provides computationally fast imputation suitable for large-scale genetic analyses.
  • Empirical validation: Validated on Wellcome Trust Case Control Consortium (WTCCC) data and on height in the British 1958 birth cohort, recovering ~95% of the effective sample size obtained with HMM-based imputation.
  • Functional enrichment: Enables imputation at broader SNP sets to support detection of genic versus non-genic enrichment, as demonstrated for lipid traits.
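
To make the effective-sample-size figures above concrete, here is a minimal Python sketch that converts a recovered fraction into an equivalent number of directly genotyped samples. The study size of 10,000 and the names `RECOVERED_FRACTION` and `effective_sample_size` are hypothetical, chosen only for illustration; the fractions are the approximate values quoted in the list.

```python
# Approximate recovered fractions quoted in the feature list above.
RECOVERED_FRACTION = {
    "common": 0.84,                        # frequency > 5%
    "low_frequency": 0.54,                 # frequency 1-5%
    "common_with_target_ld": 0.87,         # with summary LD from target samples
    "low_frequency_with_target_ld": 0.60,
}


def effective_sample_size(n_samples, variant_class):
    """Return the approximate number of directly genotyped samples that an
    imputed statistic is equivalent to, given the actual study size."""
    return round(n_samples * RECOVERED_FRACTION[variant_class])


# Hypothetical study of 10,000 samples (illustrative value only).
for variant_class in RECOVERED_FRACTION:
    print(variant_class, effective_sample_size(10_000, variant_class))
```

Under this reading, an imputed common variant in a 10,000-sample study carries roughly the association signal it would have had if genotyped directly in about 8,400 samples.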

Scientific Applications:

  • Genome-wide association studies (GWAS): Enhances power and resolution of GWAS by imputing additional variants from summary statistics.
  • Meta-analyses: Expands meta-analytic datasets by providing imputed summary statistics across 1000 Genomes variants.
  • Functional enrichment analyses: Facilitates analyses of functional enrichment and locus annotation, including comparisons of genic versus non-genic loci.

Methodology:

Implements Gaussian imputation of GWAS summary association statistics using 1000 Genomes reference haplotypes. The method can also incorporate summary linkage disequilibrium (LD) from the target samples when it is available. Unlike hidden Markov model (HMM)-based imputation, it does not require individual-level genotype data.
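
The general idea behind Gaussian imputation can be sketched as a conditional-mean calculation: z-scores at typed SNPs are treated as jointly Gaussian with covariance given by the reference-panel LD, and the z-score at an untyped SNP is predicted as a weighted sum of the typed ones. The Python below is a minimal illustration under those assumptions, not the tool's actual code. The function names, the `ridge` parameter and its default value, and the toy LD values are all hypothetical; the ridge term merely stands in for some form of regularization against a finite reference panel.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix, copied so the inputs are not modified.
    m = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("LD matrix is singular; increase the ridge term")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def impute_z(z_typed, ld_tt, ld_ut, ridge=0.1):
    """Impute z-scores at untyped SNPs from z-scores at typed SNPs.

    z_typed: z-scores at the t typed SNPs
    ld_tt:   t x t reference-panel correlations among typed SNPs
    ld_ut:   u x t reference-panel correlations, untyped versus typed
    ridge:   hypothetical regularization constant added to the diagonal
    Returns (imputed z-scores, variance of each imputed z-score under the
    null, taking ld_tt as the covariance of the typed z-scores).
    """
    n = len(z_typed)
    reg = [[ld_tt[i][j] + (ridge if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    z_imputed, variances = [], []
    for row in ld_ut:
        # reg is symmetric, so the weights w satisfy reg w = row.
        w = solve_linear(reg, row)
        z_imputed.append(sum(wi * zi for wi, zi in zip(w, z_typed)))
        variances.append(sum(w[i] * ld_tt[i][j] * w[j]
                             for i in range(n) for j in range(n)))
    return z_imputed, variances


# Toy example: two typed SNPs and one untyped SNP (made-up LD values).
z_hat, var_hat = impute_z(
    z_typed=[3.0, 1.0],
    ld_tt=[[1.0, 0.5], [0.5, 1.0]],
    ld_ut=[[0.8, 0.4]],
)
print(z_hat, var_hat)
```

A variance well below 1 signals that the typed SNPs tag the untyped SNP poorly, so the imputed statistic carries less information than a directly measured one.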

Details

Tool Type:
command-line tool
Operating Systems:
Linux
Added:
8/3/2017
Last Updated:
11/25/2024

Publications

Pasaniuc B, Zaitlen N, Shi H, Bhatia G, Gusev A, Pickrell J, Hirschhorn J, Strachan DP, Patterson N, Price AL. Fast and accurate imputation of summary statistics enhances evidence of functional enrichment. Bioinformatics. 2014;30(20):2906-2914. doi:10.1093/bioinformatics/btu416. PMID:24990607. PMCID:PMC4184260.
