GPHMM
GPHMM models chromosomal aberrations and genotyping signal biases in whole genome SNP array data from tumor samples to detect copy number alterations and loss of heterozygosity (LOH).
Key Features:
- Global parameter integration: Incorporates global parameters into a Hidden Markov Model to account for sample-wide effects on SNP array signals.
- Hidden Markov Model framework: Uses an HMM to segment genomic states relevant to copy number and LOH.
- Baseline shift modeling: Quantitatively models signal baseline shifts arising from aneuploidy.
- Normal cell contamination modeling: Explicitly models contamination from normal cells (tumor purity effects) in genotyping signals.
- GC content bias correction: Accounts for GC content bias that distorts SNP array signal intensities.
- Expectation-Maximization estimation: Employs an Expectation-Maximization (EM) algorithm for parameter estimation.
- Low-purity sensitivity: Capable of identifying chromosomal rearrangements in samples with tumor cell content as low as 10%.
- SNP array compatibility: Operates on whole genome SNP array genotyping data for genome-wide analysis.
- Quality-control outputs: Produces global parameter estimates useful for data quality control and outlier detection in cohorts.
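To illustrate the HMM segmentation idea in the list above, here is a minimal sketch of Viterbi decoding over log R ratio (LRR) values. It is a toy model, not GPHMM's implementation: the three copy-number states, the shared Gaussian emission width, the symmetric transition matrix, and every name in the code (`viterbi`, `STATE_MEANS`, `stay_prob`) are assumptions made for illustration. It also needs at least two states.

```python
import math

def gaussian_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def viterbi(observations, state_means, sd, stay_prob):
    """Most likely state path for a toy HMM with Gaussian emissions.

    Assumptions (illustrative only): all states share one standard
    deviation, the initial distribution is uniform, and a state is kept
    with probability stay_prob or left uniformly at random otherwise.
    """
    n_states = len(state_means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1 - stay_prob) / (n_states - 1))
    scores = [-math.log(n_states) + gaussian_logpdf(observations[0], m, sd)
              for m in state_means]
    backpointers = []
    for x in observations[1:]:
        new_scores, pointers = [], []
        for j, mean in enumerate(state_means):
            # Best predecessor for state j at this position.
            candidates = [scores[i] + (log_stay if i == j else log_move)
                          for i in range(n_states)]
            best = max(range(n_states), key=candidates.__getitem__)
            pointers.append(best)
            new_scores.append(candidates[best] + gaussian_logpdf(x, mean, sd))
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path back from the final position.
    state = max(range(n_states), key=scores.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical states: copy numbers 1, 2, 3 with idealized LRR means.
COPY_NUMBERS = [1, 2, 3]
STATE_MEANS = [math.log2(c / 2) for c in COPY_NUMBERS]

# Made-up LRR values: normal, then a loss, normal again, then a gain.
lrr = [0.02, -0.05, 0.01, -0.95, -1.05, -1.0, -0.9, 0.03, 0.0, 0.6, 0.55, 0.62]
path = viterbi(lrr, STATE_MEANS, sd=0.15, stay_prob=0.99)
calls = [COPY_NUMBERS[s] for s in path]
print(calls)
```

The high `stay_prob` discourages state changes, so isolated noisy probes do not split a segment. That is the usual reason to prefer an HMM over per-probe thresholding.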
Scientific Applications:
- Chromosomal aberration detection: Detection and characterization of chromosomal rearrangements and copy number alterations from SNP arrays.
- LOH analysis: Identification and mapping of loss of heterozygosity regions in tumor genomes.
- Tumor purity and aneuploidy assessment: Estimation of tumor cell content and aneuploidy-related baseline shifts from genotyping signals.
- Cohort data quality control: Quality-control assessment and outlier detection in SNP array cohort studies using global parameter estimates.
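As a hedged illustration of the cohort quality-control use case, the sketch below flags samples whose value for one global parameter lies far from the cohort median, using the modified z-score based on the median absolute deviation (MAD). This is a generic outlier rule, not a procedure taken from GPHMM. The function name `mad_outliers`, the threshold of 3.5, and the example values are all assumptions.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold.

    Uses the median absolute deviation (MAD), which is robust to the
    outliers being searched for. Returns an empty list when MAD is zero.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - center) / mad > threshold]

# Made-up per-sample estimates of one global parameter (for example a
# GC-bias coefficient); these are not real GPHMM outputs.
gc_coefficients = [0.02, 0.03, 0.01, 0.025, 0.02, 0.35, 0.015]
flagged = mad_outliers(gc_coefficients)
print(flagged)
```

A median-based rule is used here because a mean and standard deviation would themselves be distorted by the very samples the check is meant to catch.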
Methodology:
GPHMM integrates global parameters into a Hidden Markov Model and estimates them with an Expectation-Maximization (EM) algorithm. These parameters quantitatively model baseline shifts caused by aneuploidy, normal-cell contamination, and GC content bias, which allows chromosomal rearrangements to be identified from whole genome SNP array genotyping signals.
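To make the effect of normal-cell contamination and baseline shift concrete, the following sketch computes the expected log R ratio (LRR) and B allele frequency (BAF) at a germline-heterozygous SNP in a tumor/normal mixture. It is a generic linear mixture model under stated assumptions and may differ from the emission model in the GPHMM paper: normal cells are assumed diploid with one B allele, the baseline shift is treated as an additive LRR offset, and the function and parameter names are hypothetical.

```python
import math

def expected_signals(total_copies, b_copies, normal_fraction, baseline_shift=0.0):
    """Expected (LRR, BAF) at a heterozygous SNP in a tumor/normal mixture.

    Assumptions (illustrative only): normal cells carry two copies with
    one B allele, signal is proportional to average copy number, and the
    baseline shift is a constant added to LRR.
    """
    if not 0.0 <= normal_fraction <= 1.0:
        raise ValueError("normal_fraction must be between 0 and 1")
    if not 0 <= b_copies <= total_copies:
        raise ValueError("b_copies must be between 0 and total_copies")
    tumor_fraction = 1.0 - normal_fraction
    mean_copies = normal_fraction * 2 + tumor_fraction * total_copies
    mean_b = normal_fraction * 1 + tumor_fraction * b_copies
    if mean_copies == 0:
        raise ValueError("average copy number is zero; LRR and BAF are undefined")
    lrr = math.log2(mean_copies / 2) + baseline_shift
    baf = mean_b / mean_copies
    return lrr, baf

# Hemizygous deletion (one copy left, B allele lost) in a pure tumor
# and in a sample with 10% tumor cells (90% normal cells).
pure_lrr, pure_baf = expected_signals(1, 0, normal_fraction=0.0)
mixed_lrr, mixed_baf = expected_signals(1, 0, normal_fraction=0.9)
print(pure_lrr, pure_baf)
print(mixed_lrr, mixed_baf)
```

In the low-purity case both signals move only slightly away from the normal diploid values (LRR 0, BAF 0.5). This is why contamination has to be modeled explicitly rather than handled with fixed thresholds.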
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Windows, Mac
- Programming Languages: Java, MATLAB
- Added: 12/18/2017
- Last Updated: 11/25/2024
Publications
Li A, Liu Z, Lezon-Geyda K, Sarkar S, Lannin D, Schulz V, Krop I, Winer E, Harris L, Tuck D. GPHMM: an integrated hidden Markov model for identification of copy number alteration and loss of heterozygosity in complex tumor samples using whole genome SNP arrays. Nucleic Acids Research. 2011;39(12):4928-4941. doi:10.1093/nar/gkr014. PMID:21398628. PMCID:PMC3130254.