GSEApy

GSEApy performs gene set enrichment analysis in Python, providing enrichment and over-representation analyses for bulk and single-cell RNA sequencing gene expression datasets.


Key Features:

  • Efficient Large Dataset Analysis: Optimized for large-scale datasets including single-cell RNA sequencing to address computational demands of high-dimensional expression data.
  • Rust Implementation: Core computations are implemented in Rust, making enrichment-statistic calculation up to three times faster than the NumPy-based version (v0.10.8) and cutting memory usage by more than a factor of four.
  • Versatile Environment Compatibility: Runs either from the command line or within a Python environment.
  • Integration with Enrichr and BioMart: Uses the Enrichr API for over-representation analysis and queries BioMart to retrieve gene annotations.
  • Comprehensive Toolset for Enrichment Analysis: Includes a suite of functions tailored to different types of enrichment analyses, including standard GSEA and over-representation analyses.
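
As an illustration of what over-representation analysis computes, the sketch below implements a one-sided hypergeometric test using only the Python standard library. It is a minimal sketch, not GSEApy's or Enrichr's own code: the function name hypergeom_pvalue and its parameters are hypothetical, and Enrichr applies its own scoring on the server side.

```python
from math import comb


def hypergeom_pvalue(overlap, set_size, list_size, background_size):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    Hypothetical helper for illustration only.
    overlap         -- genes shared by the query list and the gene set
    set_size        -- genes in the gene set (within the background)
    list_size       -- genes in the query list (within the background)
    background_size -- total genes in the background
    """
    if not 0 <= overlap <= min(set_size, list_size):
        raise ValueError("overlap must lie between 0 and min(set_size, list_size)")
    if max(set_size, list_size) > background_size:
        raise ValueError("set_size and list_size cannot exceed background_size")

    # Number of ways to draw the query list from the background.
    total = comb(background_size, list_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, list_size - k)
        for k in range(overlap, min(set_size, list_size) + 1)
    )
    return tail / total


# Toy numbers: 10 background genes, a 3-gene set, a 3-gene query list.
p_value = hypergeom_pvalue(overlap=3, set_size=3, list_size=3, background_size=10)
```

A small p-value suggests that the query list shares more genes with the set than would be expected by chance. In practice the p-values for many gene sets are then corrected for multiple testing.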

Scientific Applications:

  • Genomics and transcriptomics: Identification of enriched pathways and gene sets from bulk and single-cell expression experiments.
  • Single-cell RNA-seq analysis: Analysis of cellular heterogeneity and cell-type–specific expression programs in large single-cell datasets.
  • Disease mechanism and developmental studies: Discovery of pathway-level changes relevant to disease mechanisms or developmental processes.

Methodology:

GSEApy calculates enrichment statistics consistent with traditional GSEA approaches and performs over-representation analysis through the Enrichr API. Core computations are implemented in Rust for speed and memory efficiency, and gene annotations can be retrieved by querying BioMart.
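
To make the enrichment statistic concrete, the following is a minimal pure-Python sketch of the weighted running-sum enrichment score used in traditional GSEA. It is not GSEApy's Rust implementation. The function name enrichment_score, its parameters, and the toy genes are assumptions made for illustration, and permutation-based significance testing is omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Return the running-sum enrichment score (ES) for one gene set.

    Hypothetical helper for illustration only.
    ranked_genes -- gene names sorted by ranking metric, highest first
    scores       -- ranking metric for each gene, in the same order
    gene_set     -- collection of gene names to test
    weight       -- exponent applied to |score| for genes in the set
    """
    members = set(gene_set)
    hits = [gene in members for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    # Genes in the set step the sum up in proportion to |score| ** weight.
    hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    hit_total = sum(hit_weights)
    if hit_total == 0:
        raise ValueError("all genes in the set have a zero ranking score")

    # Genes outside the set step the sum down by a constant amount.
    miss_step = 1.0 / n_miss

    running = 0.0
    best = 0.0
    for w, is_hit in zip(hit_weights, hits):
        running += w / hit_total if is_hit else -miss_step
        # ES is the running-sum value with the largest deviation from zero.
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking: six genes ordered from most up- to most down-regulated.
genes = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, metric, {"A", "B"})
es_bottom = enrichment_score(genes, metric, {"E", "F"})
```

In this toy ranking, a gene set concentrated at the top of the list receives a positive score and one concentrated at the bottom receives a negative score.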

Details

License: MIT
Cost: Free of charge
Tool Type: library
Operating Systems: Mac, Windows
Programming Languages: Python
Added: 1/28/2023
Last Updated: 1/28/2023

Publications

Fang Z, Liu X, Peltz G. GSEApy: a comprehensive package for performing gene set enrichment analysis in Python. Bioinformatics. 2022;39(1). doi:10.1093/bioinformatics/btac757. PMID:36426870. PMCID:PMC9805564.

Funding: National Institute on Drug Abuse: 5U01DA04439902