EPIG-Seq

EPIG-Seq analyzes count-based RNA sequencing (RNA-Seq) data to extract pattern profiles and cluster co-expressed genes. It combines a count-based correlation measure with quasi-Poisson modeling to assess dispersion and differential expression, and it is designed to handle datasets with inflated zeros and small sample sizes.


Key Features:

  • Count-Based Correlation: Measures similarity between genes using raw count-level data and accommodates inflated zeros in RNA-Seq datasets.
  • Quasi-Poisson Modeling: Estimates dispersion in replicate data and incorporates a location parameter to indicate the magnitude of differential expression.
  • Pattern Profile Extraction: Identifies pattern profiles from RNA-Seq counts to serve as seeds for downstream clustering of co-expressed genes.
  • Clustering and Statistical Significance: Clusters genes based on extracted patterns and computes statistical significance for co-expression.
  • Handling Small Sample Sizes: Provides analysis methods intended to remain reliable with limited sample numbers.
  • Bootstrapped p-values: Produces bootstrapped p-values for genes to validate identified patterns statistically.
  • Profile Plots: Generates profile plots to visualize co-expressed gene patterns across conditions.
  • Heat Maps and PCA: Produces heat maps and principal component analysis (PCA) plots to assist interpretation of gene clustering across experimental conditions.
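
As an illustration of the dispersion idea in the quasi-Poisson feature above, the following minimal sketch computes the standard Pearson-based dispersion estimate for one gene's replicate counts under an intercept-only quasi-Poisson model. It is not EPIG-Seq's own code, and the function name `quasi_poisson_dispersion` is hypothetical. A value near 1 is consistent with Poisson variation; values well above 1 suggest overdispersion.

```python
import math


def quasi_poisson_dispersion(counts):
    """Pearson-based dispersion estimate for one gene in one condition.

    Assumes an intercept-only quasi-Poisson model, so the fitted mean is the
    sample mean and the estimate is sum((y - mu)^2 / mu) / (n - 1).
    This is a textbook estimator, not necessarily the one EPIG-Seq uses.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two replicates")
    mu = sum(counts) / n
    if mu == 0:
        # All-zero gene: the estimate is undefined, so report NaN.
        return math.nan
    pearson_chi2 = sum((y - mu) ** 2 / mu for y in counts)
    return pearson_chi2 / (n - 1)


# Three replicates with little spread around a mean of 10.
tight = quasi_poisson_dispersion([8, 10, 12])
# Three replicates with the same mean but much larger spread.
wide = quasi_poisson_dispersion([0, 10, 20])
```

With only a few replicates such an estimate is noisy, which is one reason tools in this area pay special attention to small sample sizes.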

Scientific Applications:

  • Co-expressed gene cluster identification: Identifies biologically relevant clusters of co-expressed genes to investigate coordinated gene regulation.
  • Toxicogenomics: Detects expression patterns and co-expression clusters relevant to toxicological responses.
  • Cancer research: Identifies expression patterns and co-expressed gene clusters that can provide insights into cancer-related biological processes.

Methodology:

EPIG-Seq operates on raw RNA-Seq counts. It measures similarity between genes with a count-based correlation that accommodates inflated zeros, and it applies quasi-Poisson modeling to estimate dispersion and a location parameter that indicates the magnitude of differential expression. It then extracts pattern profiles to use as clustering seeds, clusters genes across experimental conditions, and computes statistical significance, including bootstrapped p-values. Outputs include profile plots, heat maps, and PCA plots.
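
The seed-based clustering step can be sketched as follows. This is a simplified illustration under stated assumptions, not the tool's implementation: Pearson correlation on log2(count + 1) values stands in for EPIG-Seq's count-based correlation, the seed profiles are supplied by hand rather than extracted from the data, and the names `assign_to_seeds` and `min_r` are hypothetical.

```python
import math


def log_profile(counts):
    """Transform raw counts to log2(count + 1); zeros map to 0."""
    return [math.log2(c + 1) for c in counts]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is flat."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def assign_to_seeds(genes, seeds, min_r=0.8):
    """Assign each gene to its best-correlated seed profile.

    Genes whose best correlation is below min_r stay unassigned.
    """
    clusters = {name: [] for name in seeds}
    for gene, counts in genes.items():
        profile = log_profile(counts)
        best_name, best_r = None, min_r
        for name, seed in seeds.items():
            r = pearson(profile, seed)
            if r >= best_r:
                best_name, best_r = name, r
        if best_name is not None:
            clusters[best_name].append(gene)
    return clusters


# Hypothetical seed patterns across four conditions.
seeds = {"up": [0, 1, 2, 3], "down": [3, 2, 1, 0]}
# Hypothetical per-condition counts for three genes.
genes = {"g1": [1, 3, 7, 15], "g2": [15, 7, 3, 1], "g3": [0, 0, 0, 0]}
clusters = assign_to_seeds(genes, seeds)
```

Here `g1` follows the rising seed, `g2` the falling one, and the all-zero gene `g3` is left unassigned because a flat profile carries no pattern information.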

Details

License: GPL-3.0
Tool Type: desktop application
Operating Systems: Linux, Windows
Programming Languages: R, Java, C
Added: 4/29/2018
Last Updated: 12/10/2018

Publications

Li J, Bushel PR. EPIG-Seq: extracting patterns and identifying co-expressed genes from RNA-Seq data. BMC Genomics. 2016;17(1). doi:10.1186/s12864-016-2584-7. PMID:27004791. PMCID:PMC4804494.
