projectR

projectR implements transfer learning in R/Bioconductor to project and interpret features learned by dimension-reduction methods (PCA, NMF), correlation analysis, and clustering across high-dimensional genomics datasets.


Key Features:

  • Transfer learning framework: Relates features learned in a source dataset to a target dataset, enabling projection when ground truth is limited or missing (e.g., large single-cell datasets).
  • Dimension reduction techniques: Supports PCA, NMF, and related approaches to extract interpretable factors that can capture technical variation as well as biological signal.
  • Integrated data analysis: Enables cross-dataset interpretation and biologically driven validation by assessing projected features against independent sample annotations, including in spatial single-cell analyses.
  • Correlation, clustering, and factorization: Combines these three kinds of analysis to integrate projected features with target data and contextual annotations.

Scientific Applications:

  • Single-cell genomic analysis: Supports analysis of large-scale single-cell genomic datasets where conventional validation strategies are limited.
  • Spatial and integrative analyses: Facilitates spatial and integrative analyses to detect biologically meaningful patterns not readily recovered by standard workflows.
  • Feature transfer and discovery: Enables transfer of features across datasets to support discovery and interpretation of emergent biological phenomena.

Methodology:

The workflow has three steps:

  • Extract gene-weight or feature-loading representations from high-dimensional data using dimension-reduction methods.
  • Construct projection matrices to map learned features from a source dataset onto a target dataset.
  • Apply correlation, clustering, and factorization-based analyses to integrate projected features with target data and contextual annotations.
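
The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it does not call any projectR function, the expression matrices are simulated, and the choices to keep three principal components, to center the target with the source gene means, and to use an ordinary least-squares fit are all illustrative. The names source_expr, target_expr, and annotation are hypothetical.

```r
set.seed(1)

# Simulated expression matrices (genes x samples); stand-ins for real data.
genes <- paste0("gene", 1:50)
source_expr <- matrix(rnorm(50 * 20), nrow = 50,
                      dimnames = list(genes, paste0("src", 1:20)))
target_expr <- matrix(rnorm(50 * 12), nrow = 50,
                      dimnames = list(genes, paste0("tgt", 1:12)))

# Step 1: learn gene loadings in the source dataset with PCA.
# prcomp expects samples in rows, so the matrix is transposed.
pca <- prcomp(t(source_expr), center = TRUE, scale. = FALSE)
loadings <- pca$rotation[, 1:3]   # genes x 3 components (3 is an assumption)

# Step 2: project the target dataset onto the source loadings.
# Match genes by name, center with the source gene means (an assumption),
# then solve loadings %*% projection ~ target in the least-squares sense.
shared <- intersect(rownames(loadings), rownames(target_expr))
target_centered <- sweep(target_expr[shared, ], 1, pca$center[shared])
projection <- qr.solve(loadings[shared, ], target_centered)  # components x samples

# Step 3: relate projected features to an independent sample annotation.
# The annotation here is a made-up two-group label for the target samples.
annotation <- rep(c(0, 1), each = 6)
assoc <- apply(projection, 1, function(p) cor(p, annotation))
print(round(assoc, 3))
```

Each row of projection holds the weight of one source-derived feature across the target samples, so any per-sample annotation can be tested against it. With real data the shared gene set is usually smaller than either dataset, which is why genes are matched by name before solving.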

Details

License: GPL-2.0
Programming Languages: R
Added: 11/14/2019
Last Updated: 12/6/2020

Publications

Sharma G, Colantuoni C, Goff LA, Fertig EJ, Stein-O’Brien G. projectR: An R/Bioconductor package for transfer learning via PCA, NMF, correlation, and clustering. bioRxiv. 2019. doi:10.1101/726547.

Documentation

Training material
https://github.com/fertigLab/projectRSpatialExample