80 Free Genotype Imputation Tools - Software and Resources

What are the top 10 most popular genotype imputation tools?

The ten most popular genotype imputation tools by citation count (as of 2024-04-30):

Rank	Tool Name	Citations	Year
1	MaCH	1400	2010
2	BEAGLE	1400	2007
3	fastPHASE	1200	2006
4	SHAPEIT	854	2011
5	Sanger Imputation Service	641	2016
6	IMPUTE2	592	2011
7	Minimac3	439	2016
8	Minimac4	439	2016
9	SNP2HLA	322	2013
10	IMPUTE4	267	2018


ABHgenotypeR

Description : ABHgenotypeR is an R tool for genotype imputation, error correction, and plotting genotype data. The purpose of ABHgenotypeR is to work as an in-between tool for TASSEL GBS pipeline and qtl tool. However, ABHgenotypeR is an independent tool and can visualize genotypes using ggplot2.

Adapt-Mix

Description : A tool for genotype imputation in arbitrary population data. The Adapt-Mix algorithm computes estimates of local correlation structure by using a combination of information in all available reference panels and summary statistics-based methods.

ADDIT

Description : ADDIT (Accurate Data-Driven Imputation Technique) is a tool for genotype imputation. The ADDIT algorithm consists of two data-driven methods that can manage data from both model and non-model organisms. The model version of the algorithm uses statistical inference, and the non-model version employs supervised learning.

alleHap

Description : alleHap is a set of tools for simulation of alleles, genotype imputation, and reconstruction of non-recombinant haplotypes.

AlphaImpute

Description : A tool for genotype imputation and phasing. The AlphaImpute algorithm requires pedigree data and couples the long-range phasing with a segregation analysis and haplotype library imputation (SAHLI). The tool is available via email from the Authors.

AlphaPlantImpute

Description : A tool for phasing and genotype imputation in plant data. The AlphaPlantImpute algorithm works in and across bi-parental populations. Available from the Author through email. See 'contact'.

ALRA

Description : ALRA (Adaptively-thresholded Low Rank Approximation) is a tool for genotype imputation. The ALRA algorithm uses a low-rank approximation method to capture expressed gene dropouts. ALRA is also integrated into Seurat v3.0.

AutoImpute

Description : AutoImpute is an R tool for genotype imputation. The AutoImpute algorithm addresses dropouts with an autoencoder-based method that imputes the sparse gene expression matrix by learning the distribution of single-cell expression data. Requires: Python 2.7, numpy, scikit-learn, TensorFlow, and matplotlib.

BEAGLE

Description : Beagle is a tool for genotype calling, phasing, identity-by-descent segment detection, and genotype imputation. The Beagle algorithm uses a modified version of the Li and Stephens haplotype frequency model that reduces the space requirements, together with a pre-processing step that recomputes the original reference panel into composite reference haplotypes. These steps reduce both memory use and computing time.

BIMBAM

Description : BIMBAM (Bayesian Imputation Based Association Mapping) is a tool to impute genotypes and perform statistical tests for disease association, such as single-SNP tests and regional multi-SNP tests. The BIMBAM algorithm uses the Bayesian framework.

BLIMP

Description : BLIMP (Software for Best Linear IMPutation) is a tool for genotype imputation. The BLIMP algorithm works on pooled or summary data and uses chained equations to handle incomplete categorical variables.

BLUP

Description : A tool for genotype imputation. The BLUP algorithm aims to solve the problem of case-control association with missing data. It assumes the samples to contain related individuals.

CGDSNPdb

Description : CGDSNPdb is a web-based tool for imputed mouse single nucleotide polymorphisms (SNPs).

chipimputation

Description : chipimputation is a pipeline tool for genotype imputation. The chipimputation pipeline consists of Perl and Python scripts and incorporates the following software tools: PLINK, Shapeit, Impute2, and gtool.

DeepImpute

Description : DeepImpute is a tool to impute single-cell RNA-seq (scRNA-seq) data. The DeepImpute algorithm uses a deep neural network learning method.

DISSCO

Description : DISSCO (Direct Imputation of Summary Statistics allowing COvariates) is a tool for genotype imputation. The DISSCO algorithm uses association summary statistics and thus does not require individual-level genotype data.

DIST

Description : DIST (Direct Imputation of summary STatistics) is a tool for genotype imputation. The DIST algorithm imputes the summary statistics of untyped variants, using conditional expectation for multivariate normal variates and correlation in a reference population.
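
The conditional expectation mentioned above can be written out as a short worked formula. This is the standard multivariate normal result, not a transcription of the DIST source code. The symbols are assumptions introduced here: z_t is the vector of statistics at typed variants, z_u the vector at untyped variants, and Sigma the correlation matrix estimated from the reference population, partitioned accordingly.

```latex
\begin{pmatrix} z_t \\ z_u \end{pmatrix}
\sim \mathcal{N}\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \Sigma_{tt} & \Sigma_{tu} \\ \Sigma_{ut} & \Sigma_{uu} \end{pmatrix}
\right)
\quad\Longrightarrow\quad
\mathbb{E}[z_u \mid z_t] = \Sigma_{ut}\,\Sigma_{tt}^{-1}\,z_t,
\qquad
\operatorname{Var}[z_u \mid z_t] = \Sigma_{uu} - \Sigma_{ut}\,\Sigma_{tt}^{-1}\,\Sigma_{tu}.
```

The conditional variance shows how much information the typed variants carry about each untyped one: the closer it is to zero, the more reliable the imputed statistic.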

DISTMIX

Description : DISTMIX is a tool for genotype imputation. The DISTMIX algorithm extends the capability of DIST (see 'links') to the analysis of mixed-ethnicity cohorts. It uses a reference panel to impute missing SNPs, with ethnic proportions that are either estimated or specified by the user.

DrImpute

Description : DrImpute is a tool for the imputation of dropout events in single-cell RNA-seq (sc-RNA-seq) data.

EM-LRT

Description : A tool for genotype imputation. The EM-LRT algorithm produces imputation uncertainty.

EMINIM

Description : A tool for genotype imputation. The EMINIM algorithm is based on a hidden Markov model (HMM), estimates population parameters from data, and works on diverse model organisms.

Ezimputer

Description : EZImputer is a workflow for genotype imputation based on impute2 (see 'links'). It automates steps routinely needed in an imputation scheme.

FAPI

Description : A tool for genotype imputation. The FAPI algorithm comprises functions for p-value imputation, meta-analysis, and quality assessment. It requires neither phasing nor sampling of raw genotypes.

fastPHASE

Description : A tool for genotype imputation and estimation of missing haplotypes. The fastPHASE algorithm takes a random sample from population data, models the genealogy of chromosomes, and summarizes the haplotype variation. fastPHASE estimates and corrects genotyping errors based on linkage disequilibrium (LD) patterns, associates haplotypes with binary phenotypes, and works on low-coverage sequencing data.

FImpute

Description : FImpute is a tool for haplotype estimation or phasing and genotype imputation. The FImpute algorithm uses pedigree information and an iterative procedure and imputes missing genotypes using a sliding window method with the assumption that all subjects have some degree of relationship.

findhap.f90

Description : findhap.f90 is a tool for genotype imputation and haplotype detection. The findhap.f90 algorithm uses allele read counts to improve imputation accuracy. The Authors claim findhap.f90 to be more accurate than Beagle (v4) and up to 400 times faster.

FISH

Description : A tool for genotype imputation. The FISH algorithm uses a hidden Markov model to characterize single reference haplotypes.

GeneImp

Description : A tool for genotype imputation. The GeneImp algorithm uses a sliding window approach and does not require pre-phasing. Furthermore, the algorithm imputes genotypes to a dense reference panel by obtaining the likelihoods from ultralow sequencing coverage. Requirements: VCFtools, bcftools, and HTSlib.

genipe

Description : A pipeline tool for genotype imputation. The genipe pipeline includes imputed data indexing, data management, the Sequence Kernel Association Test, Cox proportional hazards models for survival analysis, and linear mixed models for repeated measurements in longitudinal studies. The imputation pipeline works with PLINK, SHAPEIT, and IMPUTE2.

GIGI

Description : A tool for rare variant genotype imputation. The GIGI algorithm can handle large pedigrees. Only a subset of individuals in a pedigree needs to be completely sequenced; GIGI infers the missing genotypes at untyped markers, provided the remaining individuals are sequenced at appropriate marker locations. GIGI-Quick can speed up the computation by running GIGI in parallel. See 'links'.

GIGI-Quick

Description : A tool for genotype imputation that can handle large pedigrees. The GIGI-Quick algorithm runs GIGI (see 'links') in parallel to reduce the overall run time.

GIGSEA

Description : GIGSEA (Genotype Imputed Gene Set Enrichment Analysis) is a tool to analyze imputed genotypes. The GIGSEA algorithm uses a combination of genome-wide association study (GWAS) summary statistics and eQTL to deduce differential gene expression and to examine enrichment for gene sets.

Gimpute

Description : A pipeline tool for genotype imputation. The Gimpute pipeline comprises genetic variant updating, matching, liftover, quality control, alignment of variants to references, pre-phasing, imputation, and post-imputation quality control.

GRIMM

Description : A tool for human leukocyte antigen (HLA) genotype imputation and matching. The GRIMM algorithm uses a graph-based method to store haplotype frequencies.

GTOOL

Description : A tool to transform genotype datasets for use with SNPTEST and IMPUTE.

Hap-seqX

Description : A tool for haplotype phasing and genotype imputation. The Hap-seqX algorithm uses a combination of dynamic programming and a hidden Markov model (HMM).

HIBAG

Description : A tool for imputation of human leukocyte antigen (HLA) types using single nucleotide polymorphisms (SNPs). The HIBAG algorithm combines attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types.

HLA*IMP

Description : A tool for genotype imputation of human leukocyte antigen (HLA) alleles. The HLA*IMP algorithm uses linked SNP data, prepares the data locally, performs probabilistic imputation through a remote server, and carries out quality control (QC).

HLA-IMPUTER

Description : A web-based tool for HLA allele imputation using the HIBAG algorithm (see 'links'). HLA-IMPUTER currently has the following reference panels: Han Chinese, Pan Asian, European, and multiethnic.

hsphase

Description : hsphase is a tool to detect recombination events, phasing, genotype imputation, and to reconstruct pedigrees. The hsphase algorithm uses a genetic data structure within half-sib livestock to classify recombination events. It can also run directly on sequence data.

Human Protein Variant Effect Map Imputation Toolkit

Description : A web-based pipeline tool for genotype imputation and visualization of missense variant effect maps. The algorithm imputes missing data in empirically observed effect maps.

ImpG-Summary

Description : A tool for genotype imputation. The ImpG-Summary algorithm uses Gaussian imputation with summary association statistics.

Imputability Database

Description : A web-based tool that provides information on the imputability of single nucleotide polymorphisms (SNPs) and insertions and deletions (indels). It returns the information for given variant IDs or for a specified genomic region.

IMPUTE2

Description : IMPUTE2 is a tool for genotype imputation and haplotype phasing. The IMPUTE2 algorithm uses 'pre-phasing' wherein it makes initial statistical estimates of the haplotypes. In the next step, it imputes missing genotypes given the estimated haplotypes. This approach yields reduced computing time.

IMPUTE4

Description : IMPUTE4 is a tool for genotype imputation. IMPUTE4 is an improved version of IMPUTE2 (see 'links'), developed by Jonathan Marchini and colleagues to impute genotypes for the UK Biobank data.

IMPUTOR

Description : IMPUTOR is a tool to identify miscalled bases caused by sequencing errors and to impute genotypes. The IMPUTOR algorithm imputes erroneously called bases and missing data using a parsimony approach.

Kinpute

Description : A tool to compute reference panels and genotype probabilities for specific studies. The Kinpute algorithm uses initial estimates of average identity by descent in a sample to select an optimal set of individuals to sequence for a sample-specific reference panel. The probabilities are useful as an input for genotype imputation software.

LinkImpute

Description : A tool for genotype imputation for non-model organisms. The LinkImpute algorithm works on unphased data from heterozygous species. It uses the k-nearest neighbor genotype imputation technique (LD-kNNi) and does not require physical or genetic maps. LinkImpute is available in two versions, Java and as an R package. Other names: LinkImputeR.
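
As a rough illustration of the k-nearest-neighbor idea described above, here is a minimal Python sketch. It is not the LinkImpute implementation: LD-kNNi also restricts the distance to the SNPs in strongest LD with the target SNP, which this sketch omits. The function name `knn_impute`, the `None` encoding for missing calls, and the unweighted majority vote are all assumptions made for the example.

```python
from collections import Counter

MISSING = None  # assumption: missing genotypes are encoded as None


def knn_impute(genotypes, k=3):
    """Fill missing calls (0/1/2 allele counts) by vote among the k nearest samples."""

    def distance(a, b):
        # Mean absolute difference over sites observed in both samples.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not MISSING and y is not MISSING]
        if not shared:
            return float("inf")
        return sum(abs(x - y) for x, y in shared) / len(shared)

    imputed = [row[:] for row in genotypes]  # leave the input untouched
    for i, row in enumerate(genotypes):
        for j, call in enumerate(row):
            if call is not MISSING:
                continue
            # Donors are the other samples with an observed call at site j.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(genotypes)
                      if m != i and other[j] is not MISSING]
            donors.sort(key=lambda d: d[0])
            votes = Counter(g for _, g in donors[:k])
            if votes:
                imputed[i][j] = votes.most_common(1)[0][0]
    return imputed


samples = [
    [0, 0, 1, 2],
    [0, 0, 1, None],  # last site is missing
    [2, 2, 1, 0],
    [0, 1, 1, 2],
]
filled = knn_impute(samples, k=3)
```

Because the distance uses only the sample's own observed sites, no physical or genetic map is needed, which matches the description above.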

LinkImputeR

Description : LinkImputeR is a Java tool to call and impute genotypes. The LinkImputeR algorithm uses the read count information and all other available sequence information. LinkImputeR works particularly in non-model organisms because it does not need genotype reference panels or ordered markers.

MaCH

Description : MaCH is a tool for genotype imputation and haplotyping using WGS sequence data. The MaCH algorithm uses a Markov chain approach and represents sampled chromosomes as mosaics of each other.
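
The "mosaic" idea above can be illustrated with a small hidden Markov model in which a target haplotype copies from one reference haplotype at a time and occasionally switches. The sketch below is a simplified Li and Stephens style forward-backward pass, not MaCH itself: the function name `li_stephens_impute`, the constant `switch` and `error` rates, and the haploid 0/1 encoding with `None` for untyped sites are assumptions made for the example.

```python
def li_stephens_impute(ref_haps, target, switch=0.05, error=0.01):
    """Posterior probability that the copied reference haplotype carries allele 1."""
    n, m = len(ref_haps), len(target)

    def emit(h, s):
        # Untyped sites carry no information about the hidden state.
        if target[s] is None:
            return 1.0
        return 1.0 - error if ref_haps[h][s] == target[s] else error

    def step(prev):
        # Stay on the copied haplotype, or switch to a uniformly chosen one.
        total = sum(prev)
        return [(1.0 - switch) * p + switch * total / n for p in prev]

    # Forward pass, normalised at each site to avoid underflow.
    fwd = []
    for s in range(m):
        if s == 0:
            cur = [emit(h, 0) / n for h in range(n)]
        else:
            cur = [t * emit(h, s) for h, t in enumerate(step(fwd[-1]))]
        z = sum(cur)
        fwd.append([c / z for c in cur])

    # Backward pass (the transition matrix is symmetric, so step() is reused).
    bwd = [[1.0] * n for _ in range(m)]
    for s in range(m - 2, -1, -1):
        weighted = [bwd[s + 1][h] * emit(h, s + 1) for h in range(n)]
        nxt = step(weighted)
        z = sum(nxt)
        bwd[s] = [b / z for b in nxt]

    # Combine both passes into a per-site posterior for allele 1.
    posterior = []
    for s in range(m):
        gamma = [fwd[s][h] * bwd[s][h] for h in range(n)]
        z = sum(gamma)
        carrying = sum(g for h, g in enumerate(gamma) if ref_haps[h][s] == 1)
        posterior.append(carrying / z)
    return posterior


reference = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1]]
probs = li_stephens_impute(reference, [1, 1, None, 1, 1])
```

In this toy input the target matches the second reference haplotype at every typed site, so the posterior for allele 1 at the untyped middle site comes out close to 1.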

MaCH-Admix

Description : MaCH-admix is a tool for genotype imputation with added features compared to the MaCH 1.0 tool. The MaCH-admix algorithm performs piecewise selection of reference haplotypes, tailoring the reference to each target individual. The algorithm also separates parameter estimation from the imputation itself, which allows the use of standard reference panels and independent, pre-calibrated parameters.

Mendel Impute

Description : A tool for genotype imputation. The MENDEL-IMPUTE algorithm uses matrix completion and a sliding window approach over a single nucleotide polymorphism (SNP). The download package contains documentation.

mendel-gpu

Description : A tool for genotype imputation. The mendel-gpu algorithm uses linkage disequilibrium patterns in unrelated subjects and runs on AMD and Nvidia GPUs.

Michigan Imputation Server

Description : A web-based tool for genotype imputation. The Michigan Imputation Server supports the following reference panels: 1. HapMap Release 2, 2. 1000 Genomes Phase 1, 3. 1000 Genomes Phase 3, 4. CAAPA African American, 5. Haplotype Reference Consortium, 6. Hosting your own reference panels. The Michigan Imputation Server is open source and the source code is available for download.

Minimac

Description : Minimac is a tool for genotype imputation. The Minimac algorithm is a computationally efficient implementation of the MaCH algorithm. It works on phased genotypes and can handle large reference panels of up to hundreds of thousands of haplotypes.

Minimac2

Description : Minimac2 is a tool for genotype imputation. The Minimac2 algorithm is a computationally efficient implementation of the MaCH algorithm. It works on phased genotypes and can handle large reference panels of up to hundreds of thousands of haplotypes. A multiprocessor version, minimac2-omp, is available from the download page.

Minimac3

Description : Minimac3 is a tool for genotype imputation, an improved version of Minimac2. The Minimac algorithm is a computationally efficient implementation of the MaCH algorithm. It works on phased genotypes and can handle large reference panels of up to hundreds of thousands of haplotypes. It first identifies repeated haplotype patterns and uses these to speed up the computation. Minimac3 uses less memory than its predecessors.

Minimac4

Description : Minimac4 is a tool for genotype imputation, an improved version of Minimac2. The Minimac algorithm is a computationally efficient implementation of the MaCH algorithm. It works on phased genotypes and can handle large reference panels of up to hundreds of thousands of haplotypes. It first identifies repeated haplotype patterns and uses these to speed up the computation. Main improvements over previous versions: 1. About six times faster than Minimac3 on the reference panels, 2. Decreased memory usage, 3. Support for varying ploidy in the same VCF file for imputation of sex chromosomes.

Molgenis-impute

Description : A pipeline tool for genotype imputation in grid and local cluster environments. The automation includes the following steps: genome build liftover, genotype phasing (SHAPEIT2), quality control, sample and chromosomal chunking, and genotype imputation (IMPUTE). Molgenis-impute uses MOLGENIS-compute (see 'links') for submission and monitoring of tasks. Requires: wget or curl, tar, unzip, bunzip2, g++, java 1.6 or higher, python 2.7, and numpy.

NPUTE

Description : NPUTE is a tool for genotype imputation. The NPUTE algorithm applies a K-nearest-neighbor (KNN) method over a sliding haplotype window of any size, using a mismatch accumulator array (MAA). NPUTE also estimates the imputation accuracy by inferring known SNP values that have been left out.

ParaHaplo

Description : ParaHaplo is a tool for genotype imputation and reconstruction of haplotypes. The ParaHaplo algorithm uses parallel computing.

PedImpute

Description : A tool for haplotype reconstruction and genotype imputation using whole-genome single-nucleotide reference panels.

PlantImpute

Description : PlantImpute is a tool for genotype imputation. The PlantImpute algorithm uses a hidden Markov model (HMM) to track inheritance specifically in plants. The tool is available from Carl Nettelblad by email. See 'contact'.

polyHap

Description : polyHap is a tool to phase and estimate missing genotypes in copy number variable (CNV) regions. The polyHap algorithm uses a hidden Markov model (HMM).

PRIMAL

Description : PRIMAL (PedigRee IMputation ALgorithm) is a tool for genotype imputation for founder populations having pedigree data. The PRIMAL algorithm is based on an indexing procedure of Identity-By-Descent segments using clique graphs.

PBWT

Description : A tool for genotype imputation and phasing. The PBWT algorithm uses a compressed representation of haplotypes and is based on Positional Burrows-Wheeler Transform (PBWT) compression.
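
To give a feel for why the positional Burrows-Wheeler transform compresses haplotypes well, the sketch below builds the permuted allele columns for a small binary haplotype panel. It follows the usual construction (a stable partition of the current ordering at each site) and is a toy illustration rather than the tool's own code. The function name `pbwt_columns` and the list-of-lists input format are assumptions made for the example.

```python
def pbwt_columns(haplotypes):
    """Return the permuted allele column for each site, plus the final ordering."""
    n_sites = len(haplotypes[0])
    order = list(range(len(haplotypes)))  # before any site: the input order
    columns = []
    for k in range(n_sites):
        # Alleles at site k, listed in the order induced by sites 0..k-1.
        columns.append([haplotypes[i][k] for i in order])
        # Stable partition: haplotypes carrying 0 first, then those carrying 1.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
    return columns, order


panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
columns, final_order = pbwt_columns(panel)
```

At each site, haplotypes that share a long matching stretch just before that site sit next to each other in the ordering, so in real panels the permuted columns tend to contain long runs of identical alleles that are cheap to store and fast to search.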

r2_hat

Description : r2_hat is a tool to estimate the quality of imputation based on dosage data.
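
One widely used dosage-based quality measure is the ratio of the observed variance of the imputed dosages to the variance expected under Hardy-Weinberg equilibrium. The sketch below computes that ratio for a single variant. Whether the r2_hat tool uses exactly this formula is an assumption here, and the function name `dosage_r2` is hypothetical.

```python
def dosage_r2(dosages):
    """Observed dosage variance divided by the variance expected under HWE."""
    n = len(dosages)
    mean = sum(dosages) / n
    p = mean / 2.0                      # estimated allele frequency
    expected = 2.0 * p * (1.0 - p)      # binomial variance of a 0/1/2 genotype
    if expected == 0.0:
        return 0.0                      # monomorphic variant: ratio is undefined
    observed = sum((d - mean) ** 2 for d in dosages) / n
    return observed / expected


confident = dosage_r2([0.0, 1.0, 1.0, 2.0])   # dosages look like hard calls
uncertain = dosage_r2([1.0, 1.0, 1.0, 1.0])   # every sample shrunk to the mean
```

Values near 1 indicate dosages that vary as much as true genotypes would, while values near 0 indicate dosages that have collapsed toward the allele-frequency mean and carry little information.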

RegionalHapMapExtractor

Description : Software to extract a region from hapMapII for MaCH imputation.

Sanger Imputation Service

Description : Sanger genotype imputation and phasing service is a web-based tool at Wellcome Sanger Institute. The service pipeline uses EAGLE2 or SHAPEIT2 for pre-phasing, EAGLE2 for phasing, and PBWT (Positional Burrows-Wheeler Transform) for genotype imputation. The service currently offers the following reference panels: 1. Haplotype Reference Consortium, 2. African Genome Resources, 3. 1000 Genomes Phase 3, 4. UK10K, 5. UK10K + 1000 Genomes Phase 3.

SHAPEIT

Description : SHAPEIT (Segmented HAPlotype Estimation and Imputation Tool) is a tool to estimate haplotypes. The SHAPEIT algorithm uses data from unrelated individuals or small families and scales linearly with the number of single-nucleotide polymorphisms (SNPs) and haplotypes.

Simpute

Description : Simpute is a tool for genotype imputation. The Simpute algorithm does not require reference panels and works by evaluating the two neighboring SNP loci around a missing target. It combines the estimated haplotype probabilities with LD data to predict the missing SNP genotype.

simuRare

Description : A tool for genotype imputation. The simuRare algorithm combines logistic regression-based imputation with resampling to simulate rare and common single nucleotide polymorphisms (SNPs).

SNP2HLA

Description : A tool to impute amino acid polymorphisms and single nucleotide polymorphisms in human leukocyte antigens (HLA) within the major histocompatibility complex (MHC) region on chromosome 6.

SparRec

Description : A tool for genotype imputation. The SparRec (SPARse RECovery) algorithm uses low-rank matrix completion with a novel co-clustering factorization.

STITCH

Description : A tool for genotype imputation. The STITCH algorithm works without reference panels and models chromosomes as a mosaic of unknown founders or ancestral haplotypes.

tagIMPUTE

Description : tagIMPUTE is a tool to impute untyped single-nucleotide polymorphisms (SNPs). The tagIMPUTE algorithm uses flanking SNPs that can predict the SNP to be imputed.

TagIt

Description : TagIt is a tool to select single nucleotide polymorphisms (SNPs) from 26 population reference panels to increase the accuracy of genotype imputation.

TIGAR

Description : TIGAR (Transcriptome-Integrated Genetic Association Resource) is a tool for genotype imputation of transcriptome data. The TIGAR algorithm uses a nonparametric Bayesian method.

trio

Description : The trio package contains the following functions: 1. Identification of linkage disequilibrium (LD) blocks, 2. Computation of pair-wise LD values, 3. Genotype imputation, 4. Simulation of case-parent trios with disease risk vs. SNP interaction, 5. Computation of trio logic regression on matched case pseudo-control genotype data for case-parent trios, 6. Calculation of power and sample size.

YHap

Description : YHap is a command-line tool for prediction of Y-chromosome genotypes and assignment of haplogroups. The YHap algorithm uses an imputation framework and works with low coverage sequence data, less than 2X coverage.
