GLM-PCA

GLM-PCA performs dimension reduction and deviance-based feature selection on single-cell RNA-Seq (scRNA-Seq) unique molecular identifier (UMI) count data. It models counts with a multinomial generalized linear model, which makes it suitable for non-normally distributed count and binary matrices.


Key Features:

  • Multinomial model for UMI counts: Treats the UMI counts in each cell as a multinomial sample. The framework assumes no zero inflation and captures the sampling variability inherent in single-cell data.
  • Generalized PCA for non-normal data: Implements a generalized version of principal component analysis tailored for non-normally distributed count or binary matrices.
  • Deviance-based feature selection: Ranks genes by deviance to identify highly variable genes, as an alternative to selection based on log counts per million or on variance.
  • Mitigation of normalization artifacts: Works on the counts themselves, addressing limitations of current normalization and feature selection techniques that can introduce false variability.
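
To make the deviance-based feature selection concrete, the following is a minimal NumPy sketch, not the library's own code. It assumes a binomial approximation to the multinomial model in which every gene has a constant relative abundance across cells under the null. The function names `binomial_deviance` and `top_genes_by_deviance` are hypothetical and chosen only for illustration.

```python
import numpy as np

def binomial_deviance(counts):
    """Per-gene binomial deviance against a constant-abundance null model.

    counts: array of shape (cells, genes) holding non-negative UMI counts.
    This is an illustrative sketch, not the GLM-PCA library API.
    """
    y = np.asarray(counts, dtype=float)
    n = y.sum(axis=1, keepdims=True)              # total UMIs per cell
    pi = y.sum(axis=0, keepdims=True) / n.sum()   # pooled relative abundance per gene
    mu = n * pi                                   # expected count under the null
    rest = n - y                                  # UMIs in the cell from other genes
    with np.errstate(divide="ignore", invalid="ignore"):
        # Terms with y == 0 (or rest == 0) contribute zero by convention.
        t1 = np.where(y > 0, y * np.log(y / mu), 0.0)
        t2 = np.where(rest > 0, rest * np.log(rest / (n - mu)), 0.0)
    return 2.0 * (t1 + t2).sum(axis=0)

def top_genes_by_deviance(counts, k):
    """Indices of the k genes with the largest deviance (hypothetical helper)."""
    dev = binomial_deviance(counts)
    return np.argsort(dev)[::-1][:k]
```

A gene whose share of each cell's UMIs is the same in every cell gets a deviance of zero, while a gene expressed in only some cells gets a large deviance and ranks near the top.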

Scientific Applications:

  • Single-cell RNA-Seq dimensionality reduction: Provides an approach for reducing dimensionality of scRNA-Seq UMI count data while preserving biologically relevant variability.
  • Clustering and cell population identification: Improves downstream clustering assessments and the identification of cell populations in single-cell studies.
  • Analysis of large-scale single-cell studies: Supports analyses of large-scale scRNA-Seq datasets that require methods accounting for count-based distributions.

Methodology:

Fits a generalized linear model with a multinomial model for UMI counts to obtain a low-dimensional representation analogous to principal component analysis, and ranks genes by deviance for feature selection.
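
As a rough illustration of the idea, the sketch below does not fit GLM-PCA itself. It swaps in a simpler approximation discussed in the cited publication: compute Pearson residuals under a null model with constant gene abundance, then run ordinary PCA on those residuals. The function names `null_pearson_residuals` and `residual_pca` are hypothetical, and the binomial variance used here is an assumption of this sketch.

```python
import numpy as np

def null_pearson_residuals(counts):
    """Pearson residuals under a constant-abundance (binomial) null model.

    counts: array of shape (cells, genes) holding non-negative UMI counts.
    """
    y = np.asarray(counts, dtype=float)
    n = y.sum(axis=1, keepdims=True)              # total UMIs per cell
    pi = y.sum(axis=0, keepdims=True) / n.sum()   # pooled relative abundance per gene
    mu = n * pi                                   # expected count under the null
    var = mu * (1.0 - pi)                         # binomial variance n * pi * (1 - pi)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Genes with zero variance under the null get a residual of zero.
        return np.where(var > 0, (y - mu) / np.sqrt(var), 0.0)

def residual_pca(counts, n_components=2):
    """Standard PCA (via SVD) applied to the null residuals; returns cell scores."""
    r = null_pearson_residuals(counts)
    r = r - r.mean(axis=0, keepdims=True)         # center each gene
    u, s, _ = np.linalg.svd(r, full_matrices=False)
    return u[:, :n_components] * s[:n_components]
```

The returned array has one row per cell and one column per component, so it can be passed to a clustering routine in place of scores from PCA on log-transformed counts.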

Details

License: Artistic-2.0
Tool Type: library
Added: 1/14/2020
Last Updated: 12/3/2020

Publications

Townes FW, Hicks SC, Aryee MJ, Irizarry RA. Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model. Genome Biology. 2019;20(1). doi:10.1186/s13059-019-1861-6. PMID:31870412. PMCID:PMC6927135.