GLM-PCA
GLM-PCA performs dimension reduction and deviance-based feature selection on single-cell RNA-Seq (scRNA-Seq) UMI count data. It models the counts with a multinomial generalized linear model, which extends PCA to non-normally distributed count and binary matrices.
Key Features:
- Multinomial model for UMI counts: Models UMI counts with a multinomial sampling framework that assumes no zero inflation, capturing the sampling variability inherent in single-cell data.
- Generalized PCA for non-normal data: Implements a generalized version of principal component analysis tailored for non-normally distributed count or binary matrices.
- Deviance-based feature selection: Ranks genes by deviance to select informative features, providing an alternative to variance-based selection of highly variable genes.
- Mitigation of normalization artifacts: Avoids the false variability that common normalization (such as log counts per million) and highly variable gene selection can introduce into dimension reduction.
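The deviance-based feature selection listed above can be illustrated with a minimal NumPy sketch. It assumes a binomial approximation to the multinomial model and a null hypothesis that each gene takes a constant share of every cell's total UMIs. The function name `binomial_deviance` and the cells-by-genes layout are choices made for this illustration and are not the library's API.

```python
import numpy as np

def binomial_deviance(counts):
    """Per-gene binomial deviance against a constant-proportion null.

    counts: array of shape (n_cells, n_genes) holding raw UMI counts
    (this orientation is an assumption of the sketch).
    """
    y = np.asarray(counts, dtype=float)
    n = y.sum(axis=1, keepdims=True)             # total UMIs per cell
    pi = y.sum(axis=0, keepdims=True) / n.sum()  # pooled gene proportions
    mu = n * pi                                  # expected counts under the null

    def a_log_a_over_b(a, b):
        # a * log(a / b), using the convention 0 * log(0) = 0
        a, b = np.broadcast_arrays(a, b)
        out = np.zeros(a.shape)
        pos = a > 0
        out[pos] = a[pos] * np.log(a[pos] / b[pos])
        return out

    observed_term = a_log_a_over_b(y, mu)
    remainder_term = a_log_a_over_b(n - y, n - mu)
    return 2.0 * (observed_term + remainder_term).sum(axis=0)

# Rank genes from most to least informative and keep the top ones.
counts = np.array([[5, 5, 0], [5, 5, 10], [5, 5, 0], [5, 5, 10]])
deviance = binomial_deviance(counts)
top_genes = np.argsort(deviance)[::-1][:2]
```

Genes whose counts track the per-cell totals score near zero, while genes concentrated in a subset of cells score high, so sorting by deviance gives a ranking without any log transform or pseudocount.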
Scientific Applications:
- Single-cell RNA-Seq dimensionality reduction: Provides an approach for reducing dimensionality of scRNA-Seq UMI count data while preserving biologically relevant variability.
- Clustering and cell population identification: Improves downstream clustering assessments and the identification of cell populations in single-cell studies.
- Analysis of large-scale single-cell studies: Supports analyses of large-scale scRNA-Seq datasets that require methods accounting for count-based distributions.
Methodology:
Fits a generalized linear model with a multinomial model for UMI counts to perform principal component analysis directly on raw counts, and selects features by deviance.
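As a rough illustration of the idea, the sketch below does not fit the full GLM. It uses the simpler residual-based approximation that the cited publication also discusses: compute Pearson residuals under a null model with constant gene proportions, then apply ordinary PCA to those residuals. The Poisson variance, the function name `null_residual_pca`, and the cells-by-genes layout are assumptions of this sketch, not the library's interface.

```python
import numpy as np

def null_residual_pca(counts, n_components=2):
    """PCA on Pearson residuals from a constant-proportion null model.

    counts: array of shape (n_cells, n_genes) holding raw UMI counts.
    Returns (factors, loadings) with shapes (n_cells, k) and (n_genes, k).
    """
    y = np.asarray(counts, dtype=float)
    n = y.sum(axis=1, keepdims=True)    # total UMIs per cell
    gene_totals = y.sum(axis=0, keepdims=True)
    if np.any(n == 0) or np.any(gene_totals == 0):
        raise ValueError("remove all-zero cells and genes first")

    mu = n * gene_totals / n.sum()      # expected counts under the null
    resid = (y - mu) / np.sqrt(mu)      # Pearson residuals, Poisson variance
    resid -= resid.mean(axis=0, keepdims=True)  # center each gene

    # Ordinary PCA via the singular value decomposition.
    u, s, vt = np.linalg.svd(resid, full_matrices=False)
    factors = u[:, :n_components] * s[:n_components]
    loadings = vt[:n_components].T
    return factors, loadings

# Usage on simulated counts (the +1 only guarantees no all-zero rows or columns).
rng = np.random.default_rng(0)
counts = rng.poisson(5.0, size=(20, 10)) + 1
factors, loadings = null_residual_pca(counts, n_components=2)
```

The residuals replace log normalization: each count is compared with what the cell's sequencing depth alone would predict, so depth differences do not dominate the leading components.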
Details
- License: Artistic-2.0
- Tool Type: library
- Added: 1/14/2020
- Last Updated: 12/3/2020
Publications
Townes FW, Hicks SC, Aryee MJ, Irizarry RA. Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biology. 2019;20(1). doi:10.1186/s13059-019-1861-6. PMID:31870412. PMCID:PMC6927135.