fastPHASE

fastPHASE imputes missing genotypes and estimates haplotypes (phase) from population genotype data. It does so by clustering haplotypes to model local linkage disequilibrium (LD).


Key Features:

  • Clustered haplotype model: Clusters haplotypes into groups of similar sequences over short chromosomal regions to reflect the local nature of recombination.
  • Hidden Markov model: Uses a hidden Markov model that allows cluster memberships to change continuously along the chromosome.
  • Genotype imputation: Imputes missing genotypes with accuracy reported to match or exceed that of existing methods (Scheet and Stephens, 2006).
  • Haplotype estimation (phasing): Produces haplotype phase estimates with a reported switch error rate of 0.055 on unrelated HapMap individuals, slightly higher than the 0.051 reported for PHASE.
  • Uncertainty calibration: Computes well-calibrated probabilities that reflect uncertainty in genotype and haplotype estimates.
  • Scalability and efficiency: Designed for large datasets (for example, thousands of individuals genotyped at hundreds of thousands of markers), with computational cost reported to be substantially lower than that of more intensive methods such as PHASE.
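The switch error rate quoted above can be illustrated with a short sketch. It is not taken from fastPHASE; the function name and the string encoding of haplotypes are assumptions made for illustration. It counts, over consecutive heterozygous sites of one individual, how often the estimated phase flips relative to the true phase, and assumes both haplotype pairs imply the same genotypes.

```python
def switch_error_rate(true_pair, est_pair):
    """Fraction of consecutive heterozygous-site pairs phased incorrectly.

    true_pair, est_pair: two equal-length sequences of alleles each
    (hypothetical encoding: strings of '0' and '1').
    """
    t1, t2 = true_pair
    e1, _ = est_pair
    # Heterozygous sites are those where the two true haplotypes differ.
    het = [i for i in range(len(t1)) if t1[i] != t2[i]]
    if len(het) < 2:
        return 0.0  # with fewer than two heterozygous sites there is nothing to phase
    # At each heterozygous site, record whether estimated haplotype 1
    # carries the allele of true haplotype 1 (True) or of true haplotype 2 (False).
    orientation = [e1[i] == t1[i] for i in het]
    # A switch is a change of orientation between neighbouring heterozygous sites.
    switches = sum(a != b for a, b in zip(orientation, orientation[1:]))
    return switches / (len(het) - 1)


truth = ("0101", "1010")
estimate = ("0110", "1001")  # phase flips once, between the 2nd and 3rd site
rate = switch_error_rate(truth, estimate)  # one switch over three intervals
```

Swapping the labels of the two estimated haplotypes leaves the result unchanged, because only changes in orientation are counted.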

Scientific Applications:

  • Genotype imputation: Imputes missing SNP genotypes in large-scale datasets to support association studies and downstream analyses.
  • Haplotype phasing: Estimates haplotypic phase in unrelated individuals for linkage disequilibrium and haplotype-based analyses.
  • Population genetics: Characterizes local LD patterns and recombination structure in population-genetic studies.

Methodology:

Haplotypes are grouped into clusters, and a hidden Markov model lets each haplotype's cluster membership change along the chromosome. This accommodates both block-like LD patterns and gradual declines in LD with distance. The fitted model is also used to compute probabilities that quantify the uncertainty of each genotype and haplotype estimate.
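As a rough illustration of the idea, the sketch below runs a forward-backward pass over a cluster-based hidden Markov model for a single haplotype and returns the posterior probability of allele 1 at each marker. It is a simplified, haploid toy version written under stated assumptions, not the fastPHASE implementation: the parameter names (theta, alpha, jump) are hypothetical, the parameters are supplied directly rather than fitted, and no rescaling is done, so it is only suitable for short sequences.

```python
def impute_allele_probs(hap, theta, alpha, jump):
    """Posterior P(allele = 1) at each marker of one haplotype.

    hap   : list with 0, 1, or None (missing) per marker
    theta : theta[m][k] = P(allele 1 | cluster k) at marker m
    alpha : alpha[m][k] = P(entering cluster k) when a jump occurs at marker m;
            alpha[0] is the initial cluster distribution
    jump  : jump[m] = P(cluster jump between markers m-1 and m); jump[0] is unused
    """
    M, K = len(hap), len(theta[0])

    def emit(m, k):
        # A missing allele carries no information about the cluster.
        if hap[m] is None:
            return 1.0
        return theta[m][k] if hap[m] == 1 else 1.0 - theta[m][k]

    # Forward pass: fwd[m][k] = P(alleles up to m, cluster k at m).
    fwd = [[alpha[0][k] * emit(0, k) for k in range(K)]]
    for m in range(1, M):
        total = sum(fwd[-1])
        fwd.append([
            ((1.0 - jump[m]) * fwd[-1][k] + jump[m] * alpha[m][k] * total) * emit(m, k)
            for k in range(K)
        ])

    # Backward pass: bwd[m][k] = P(alleles after m | cluster k at m).
    bwd = [[1.0] * K for _ in range(M)]
    for m in range(M - 2, -1, -1):
        nxt = [emit(m + 1, j) * bwd[m + 1][j] for j in range(K)]
        jump_term = jump[m + 1] * sum(alpha[m + 1][j] * nxt[j] for j in range(K))
        bwd[m] = [(1.0 - jump[m + 1]) * nxt[k] + jump_term for k in range(K)]

    probs = []
    for m in range(M):
        if hap[m] is not None:
            probs.append(float(hap[m]))  # observed alleles are returned as-is
            continue
        post = [fwd[m][k] * bwd[m][k] for k in range(K)]
        z = sum(post)
        # Average the cluster-specific allele frequencies over the cluster posterior.
        probs.append(sum(post[k] / z * theta[m][k] for k in range(K)))
    return probs


# Toy setup: cluster 0 mostly carries allele 0, cluster 1 mostly allele 1.
theta = [[0.1, 0.9]] * 3
alpha = [[0.5, 0.5]] * 3
jump = [0.0, 0.1, 0.1]
probs = impute_allele_probs([1, None, 1], theta, alpha, jump)
```

Because both flanking alleles are 1 and jumps are rare in this toy setup, the missing middle allele receives a high posterior probability of being 1. The returned value is a probability rather than a hard call, which is what makes a calibration check possible.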

Details

License: Not licensed
Tool Type: command-line tool
Operating Systems: Linux, Mac
Added: 8/20/2017
Last Updated: 11/25/2024

Publications

Scheet P, Stephens M. A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase. The American Journal of Human Genetics. 2006;78(4):629-644. doi:10.1086/502802. PMID:16532393. PMCID:PMC1424677.