scPCA

scPCA combines sparse principal component analysis (sPCA) with contrastive principal component analysis (cPCA) to extract sparse, biologically relevant signals from high-dimensional biological data, such as data generated by high-throughput sequencing. It uses control data to separate technical noise from biological variation.


Key Features:

  • Sparse Principal Component Analysis (sPCA): Implements sPCA to produce sparse components that enhance interpretability of high-dimensional biological data.
  • Contrastive PCA (cPCA): Uses control data to contrast target and control samples, separating technical noise from genuine biological signal.
  • Stability and Interpretability: Emphasizes stability of extracted components to improve reproducibility and biological interpretability.
  • Relevance and Signal Recovery: Recovers features that capture biologically relevant variation while suppressing unwanted technical variation.

Scientific Applications:

  • Protein expression datasets: Applied to protein expression datasets to identify salient expression patterns amid high dimensionality.
  • Microarray gene expression profiles: Applied to microarray gene expression profiles for dimensionality reduction and extraction of relevant signals.
  • Single-cell transcriptome sequencing data: Applied to single-cell transcriptome sequencing data to extract sparse signals from noisy single-cell measurements.

Methodology:

scPCA combines sPCA and cPCA: control data are used to remove unwanted variation from the target data, and sparse components are then extracted from what remains.
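
The idea can be illustrated with a small, self-contained sketch. It is written in plain Python for illustration only and is not the scPCA package's R interface. It assumes the usual cPCA formulation, in which the contrastive covariance is the target covariance minus a contrastive parameter (here `gamma`) times the control covariance. The sparsity step is a simple soft-threshold on the leading direction, used as a stand-in for a real sparse PCA solver. All function names and the toy data are hypothetical.

```python
import math

def covariance(rows):
    """Sample covariance matrix of a dataset given as a list of rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def contrastive_covariance(target, background, gamma):
    """Target covariance minus gamma times control (background) covariance."""
    ct, cb = covariance(target), covariance(background)
    p = len(ct)
    return [[ct[i][j] - gamma * cb[i][j] for j in range(p)] for i in range(p)]

def leading_direction(matrix, iterations=500):
    """Power iteration on (matrix + shift * I).

    The contrastive covariance can be indefinite, so a Gershgorin-style
    shift is added to make the largest algebraic eigenvalue dominant.
    """
    p = len(matrix)
    shift = max(sum(abs(x) for x in row) for row in matrix)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) + shift * v[i]
             for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all-zero matrix: nothing to iterate on
            return v
        v = [x / norm for x in w]
    return v

def soft_threshold(v, penalty):
    """Shrink loadings toward zero and renormalize (stand-in for sparse PCA)."""
    shrunk = [math.copysign(max(abs(x) - penalty, 0.0), x) for x in v]
    norm = math.sqrt(sum(x * x for x in shrunk))
    return [x / norm for x in shrunk] if norm > 0 else shrunk

# Toy data with three features: feature 0 varies strongly in both datasets
# (shared technical variation), feature 1 varies only in the target data
# (signal of interest), feature 2 is small noise in both.
background = [[-4, 0.1, 0.1], [4, 0.1, -0.1], [-4, -0.1, -0.1], [4, -0.1, 0.1]]
target = [[-4, 2, 0.1], [4, 2, -0.1], [-4, -2, -0.1], [4, -2, 0.1]]

plain = leading_direction(contrastive_covariance(target, background, 0.0))
contrast = leading_direction(contrastive_covariance(target, background, 1.0))
sparse = soft_threshold(contrast, 0.1)
```

With `gamma = 0` the sketch reduces to ordinary PCA on the target data, and the leading direction loads on feature 0, the shared technical variation. With `gamma = 1` that variation cancels and the leading direction loads on feature 1, the feature that varies only in the target data. The soft-threshold then sets the negligible loadings to exactly zero.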

Details

License: MIT
Programming Languages: R
Added: 1/14/2020
Last Updated: 12/18/2020

Publications

Boileau P, Hejazi NS, Dudoit S. Exploring High-Dimensional Biological Data with Sparse Contrastive Principal Component Analysis. bioRxiv. 2019. doi:10.1101/836650.