EPIG-Seq
EPIG-Seq analyzes count-based RNA sequencing (RNA-Seq) data to extract pattern profiles and cluster co-expressed genes. It measures similarity between genes with a count-based correlation and uses quasi-Poisson modeling to assess dispersion and differential expression. It is designed to handle datasets with inflated zeros and small sample sizes.
Key Features:
- Count-Based Correlation: Measures similarity between genes using raw count-level data and accommodates inflated zeros in RNA-Seq datasets.
- Quasi-Poisson Modeling: Estimates dispersion in replicate data and incorporates a location parameter to indicate the magnitude of differential expression.
- Pattern Profile Extraction: Identifies pattern profiles from RNA-Seq counts to serve as seeds for downstream clustering of co-expressed genes.
- Clustering and Statistical Significance: Clusters genes based on extracted patterns and computes statistical significance for co-expression.
- Handling Small Sample Sizes: Provides analysis methods intended to remain reliable with limited sample numbers.
- Bootstrapped p-values: Produces bootstrapped p-values for genes to validate identified patterns statistically.
- Profile Plots: Generates profile plots to visualize co-expressed gene patterns across conditions.
- Heat Maps and PCA: Produces heat maps and principal component analysis (PCA) to assist interpretation of gene clustering across experimental conditions.
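The quasi-Poisson feature listed above can be illustrated with a Pearson-based dispersion estimate for a single gene. This is a minimal sketch and not EPIG-Seq's own code: the function name `quasi_poisson_dispersion`, the one-mean-per-condition model, and the choice to skip all-zero conditions are assumptions made for illustration.

```python
from statistics import mean


def quasi_poisson_dispersion(groups):
    """Pearson-based dispersion estimate for one gene (illustrative only).

    groups: list of lists of replicate counts, one inner list per condition.
    Each condition is modeled with its own mean (an assumption of this sketch).
    """
    chi_sq = 0.0
    n_obs = 0
    n_groups = 0
    for counts in groups:
        mu = mean(counts)
        n_obs += len(counts)
        n_groups += 1
        if mu == 0:
            # An all-zero condition adds nothing to the Pearson statistic.
            continue
        chi_sq += sum((y - mu) ** 2 / mu for y in counts)
    dof = n_obs - n_groups  # observations minus fitted condition means
    if dof <= 0:
        raise ValueError("need at least one condition with replicates")
    return chi_sq / dof


if __name__ == "__main__":
    # Hypothetical counts: three replicates in each of two conditions.
    print(quasi_poisson_dispersion([[10, 12, 14], [0, 0, 0]]))
    print(quasi_poisson_dispersion([[1, 20, 60], [3, 40, 90]]))
```

A value near 1 is what a plain Poisson model would predict; values well above 1 indicate overdispersion among replicates, which is the situation a quasi-Poisson model is meant to absorb.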
Scientific Applications:
- Co-expressed gene cluster identification: Identifies biologically relevant clusters of co-expressed genes to investigate coordinated gene regulation.
- Toxicogenomics: Detects expression patterns and co-expression clusters relevant to toxicological responses.
- Cancer research: Identifies expression patterns and co-expressed gene clusters that can provide insights into cancer-related biological processes.
Methodology:
EPIG-Seq operates on raw RNA-Seq counts. It measures similarity between genes with a count-based correlation that accommodates inflated zeros, and it fits a quasi-Poisson model to estimate dispersion and a location parameter for differential expression. It then extracts pattern profiles as clustering seeds, clusters genes across experimental conditions, and computes statistical significance, including bootstrapped p-values. Outputs include profile plots, heat maps, and PCA.
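The seed-then-cluster flow described in this section can be sketched as follows. This is a hedged illustration only: it substitutes ordinary Pearson correlation for EPIG-Seq's count-based correlation (whose exact form is not given here), and the names `pearson`, `assign_to_seed`, `resampled_p_value`, the `min_r` threshold, and the resampling scheme are all hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation; a stand-in for a count-based correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no pattern information
    return sxy / sqrt(sxx * syy)


def assign_to_seed(gene, seeds, min_r=0.8):
    """Return the name of the best-matching seed pattern, or None.

    seeds: dict mapping a pattern name to its profile across conditions.
    min_r: hypothetical correlation cutoff for cluster membership.
    """
    best_name, best_r = None, min_r
    for name, pattern in seeds.items():
        r = pearson(gene, pattern)
        if r >= best_r:
            best_name, best_r = name, r
    return best_name


def resampled_p_value(gene, pattern, n_resamples=2000, seed=0):
    """Resampling-based p-value for the gene-to-pattern correlation.

    The gene's values are resampled with replacement, which breaks their
    link to the conditions; the p-value is the share of resamples that
    correlate with the pattern at least as strongly as the observed data.
    """
    rng = random.Random(seed)
    observed = pearson(gene, pattern)
    hits = 0
    for _ in range(n_resamples):
        resampled = rng.choices(gene, k=len(gene))
        if pearson(resampled, pattern) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Hypothetical seed patterns and gene profile over four conditions.
    seeds = {"up": [1, 2, 3, 4], "down": [4, 3, 2, 1]}
    gene = [0, 5, 9, 20]
    cluster = assign_to_seed(gene, seeds)
    print(cluster, resampled_p_value(gene, seeds[cluster]))
```

Adding one to both the hit count and the number of resamples keeps the estimated p-value strictly above zero, which is a common convention for resampling tests with a finite number of draws.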
Details
- License: GPL-3.0
- Tool Type: desktop application
- Operating Systems: Linux, Windows
- Programming Languages: R, Java, C
- Added: 4/29/2018
- Last Updated: 12/10/2018
Publications
Li J, Bushel PR. EPIG-Seq: extracting patterns and identifying co-expressed genes from RNA-Seq data. BMC Genomics. 2016;17(1). doi:10.1186/s12864-016-2584-7. PMID:27004791. PMCID:PMC4804494.