tweeDEseq
tweeDEseq models RNA-seq count data using the Poisson-Tweedie family to perform differential expression analysis that accommodates heavy tails and zero-inflation not captured by Poisson or negative binomial models.
Key Features:
- Poisson-Tweedie family: Uses the Poisson-Tweedie family of count distributions to capture heavy tails and zero-inflation within a single model family, without changing configuration parameters.
- Flexibility and accuracy: Handles large-scale RNA-seq datasets with extensive biological replication by accommodating diverse count distributions to better fit underlying biological variability.
- Improved performance: Simulations and analyses of real RNA-seq data show that it yields P-values that are as accurate as, or more accurate than, those of competing methods under different configurations.
- Reproducibility: Calls differentially expressed genes with higher precision and reproducibility, as shown by a survey of sex-specific gene expression changes in human lymphoblastoid cell lines.
- Comparison with microarrays: Has been validated against microarray results to confirm reproducibility of differential expression findings.
Scientific Applications:
- Differential expression with many replicates: Differential expression analysis of RNA-seq datasets containing numerous biological replicates.
- Modeling complex count distributions: Modeling RNA-seq count data exhibiting heavy tails and zero-inflation where Poisson or negative binomial assumptions fail.
- Sex-specific expression studies: Investigating subtle sex-specific gene expression changes in human lymphoblastoid cell lines.
- Cross-platform validation: Benchmarking and validating RNA-seq differential expression results against microarray data.
Methodology:
Fits the Poisson-Tweedie family of distributions to RNA-seq count data to estimate each gene's count distribution and compare it between groups of samples.
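As a rough illustration of why a more flexible count family can be needed, the sketch below summarizes one gene's counts in two groups by mean, dispersion index (variance divided by mean), and observed versus Poisson-expected zero fraction. It is written in plain Python and does not use the tweeDEseq R package or its API; the function name `count_summary` and the count values are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def count_summary(counts):
    """Summarize one gene's counts: mean, dispersion index, and zero fractions."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    return {
        "mean": m,
        # A Poisson model implies variance == mean, i.e. a dispersion index near 1.
        "dispersion_index": v / m if m > 0 else float("nan"),
        "observed_zero_fraction": sum(1 for c in counts if c == 0) / len(counts),
        # Under a Poisson model with this mean, P(count = 0) = exp(-mean).
        "poisson_zero_fraction": math.exp(-m),
    }


# Hypothetical counts for one gene in two sample groups (illustrative values only).
group_a = [0, 0, 3, 1, 0, 12, 0, 2, 0, 45]
group_b = [5, 7, 6, 4, 8, 5, 6, 7, 5, 6]

summary_a = count_summary(group_a)
summary_b = count_summary(group_b)
for name, s in (("group_a", summary_a), ("group_b", summary_b)):
    print(name, {key: round(val, 3) for key, val in s.items()})
```

In this toy input, group_a has far more zeros than a Poisson model with the same mean would predict and a dispersion index well above 1, which is the kind of profile that motivates fitting a wider family before comparing groups. Fitting the Poisson-Tweedie distribution itself and testing for differential expression are left to the package.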
Details
- License: GPL-2.0
- Tool Type: command-line tool, library
- Operating Systems: Linux, Windows, Mac
- Programming Languages: R
- Added: 1/17/2017
- Last Updated: 1/10/2019
Publications
Esnaola M, Puig P, Gonzalez D, Castelo R, Gonzalez JR. A flexible count data model to fit the wide diversity of expression profiles arising from extensively replicated RNA-seq experiments. BMC Bioinformatics. 2013;14(1). doi:10.1186/1471-2105-14-254. PMID:23965047. PMCID:PMC3849762.