fastPHASE
fastPHASE imputes missing genotypes and estimates haplotypic phase in population genotype data, using a model that clusters haplotypes locally to capture patterns of linkage disequilibrium (LD).
Key Features:
- Clustered haplotype model: Clusters haplotypes into groups of similar sequences over short chromosomal regions to reflect the local nature of recombination.
- Hidden Markov model: Uses a hidden Markov model that allows cluster memberships to change continuously along the chromosome.
- Genotype imputation: Imputes missing genotypes with accuracy reported as equal to or better than that of existing methods.
- Haplotype estimation (phasing): Produces haplotype phase estimates reported as slightly less accurate than those from PHASE, with a switch error rate of 0.055 for fastPHASE versus 0.051 for PHASE on unrelated HapMap individuals.
- Uncertainty calibration: Computes well-calibrated probabilities that reflect uncertainty in genotype and haplotype estimates.
- Scalability and efficiency: Designed for large-scale datasets, e.g., thousands of individuals genotyped at hundreds of thousands of markers, with substantially reduced computational demands relative to some methods.
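The switch error rate quoted above is, under the usual definition, the fraction of consecutive heterozygous sites at which the estimated phase flips relative to the true phase. The function below is a minimal sketch of that calculation and is not part of fastPHASE; the name switch_error_rate and the string encoding of haplotypes are assumptions made for illustration.

```python
def switch_error_rate(true_pair, est_pair):
    """Switch error rate for one individual (illustrative helper, not fastPHASE code).

    true_pair, est_pair: two equal-length haplotype sequences each,
    e.g. ("0101", "1010"). Both pairs must encode the same genotypes.
    """
    t1, t2 = true_pair
    e1, _ = est_pair
    # Only heterozygous sites carry phase information.
    het = [i for i in range(len(t1)) if t1[i] != t2[i]]
    if len(het) < 2:
        return 0.0
    # At each heterozygous site, does estimated haplotype 1 follow true haplotype 1?
    same = [e1[i] == t1[i] for i in het]
    # A switch is a change of orientation between neighbouring heterozygous sites.
    switches = sum(a != b for a, b in zip(same, same[1:]))
    return switches / (len(het) - 1)


rate = switch_error_rate(("0101", "1010"), ("0110", "1001"))
```

Swapping the order of the two estimated haplotypes does not change the result, because only changes in orientation between neighbouring heterozygous sites are counted.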
Scientific Applications:
- Genotype imputation: Imputes missing SNP genotypes in large-scale datasets to support association studies and downstream analyses.
- Haplotype phasing: Estimates haplotypic phase in unrelated individuals for linkage disequilibrium and haplotype-based analyses.
- Population genetics: Characterizes local LD patterns and recombination structure in population-genetic studies.
Methodology:
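Haplotypes are assumed to fall into a small number of clusters of similar sequences over short regions. A hidden Markov model lets each haplotype's cluster membership change along the chromosome, which accommodates both block-like LD patterns and gradual decline in LD with distance. Probabilities computed from the fitted model are used to impute genotypes, estimate phase, and quantify the uncertainty in those estimates.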
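As a rough illustration of the cluster-based hidden Markov model described above, the sketch below runs the forward algorithm for a single haplotype. It is a simplified teaching example, not fastPHASE's implementation: fastPHASE estimates its parameters from data and works with unphased diploid genotypes, whereas here the parameters are given and the input is one haplotype. The names theta, alpha and jump are illustrative, and holding the cluster weights alpha constant across markers is an assumption made for brevity.

```python
def haplotype_likelihood(hap, theta, alpha, jump):
    """Likelihood of one haplotype under a K-cluster HMM (illustrative sketch).

    hap[m]      : allele at marker m, 0 or 1, or None if missing
    theta[m][k] : probability of allele 1 at marker m given cluster k
    alpha[k]    : weight of cluster k (assumed constant along the chromosome)
    jump[m]     : probability that the cluster is re-drawn between
                  markers m-1 and m (jump[0] is unused)
    """
    K = len(alpha)

    def emit(m, k):
        # A missing allele contributes no information, so it has probability 1.
        if hap[m] is None:
            return 1.0
        return theta[m][k] if hap[m] == 1 else 1.0 - theta[m][k]

    # Initialise with the cluster weights at the first marker.
    fwd = [alpha[k] * emit(0, k) for k in range(K)]
    for m in range(1, len(hap)):
        total = sum(fwd)
        # Either stay in the same cluster, or jump and re-draw a cluster from alpha.
        fwd = [
            ((1.0 - jump[m]) * fwd[k] + jump[m] * alpha[k] * total) * emit(m, k)
            for k in range(K)
        ]
    return sum(fwd)


# Two markers, two clusters, made-up parameter values.
theta = [[0.9, 0.1], [0.2, 0.6]]
alpha = [0.3, 0.7]
jump = [0.0, 0.4]
lik = haplotype_likelihood([1, 0], theta, alpha, jump)
```

A small jump probability keeps a haplotype in the same cluster across neighbouring markers, giving block-like LD, while larger values let LD decay more gradually.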
Details
- License: Not licensed
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Added: 8/20/2017
- Last Updated: 11/25/2024
Publications
Scheet P, Stephens M. A Fast and Flexible Statistical Model for Large-Scale Population Genotype Data: Applications to Inferring Missing Genotypes and Haplotypic Phase. The American Journal of Human Genetics. 2006;78(4):629-644. doi:10.1086/502802. PMID:16532393. PMCID:PMC1424677.