tweeDEseq

tweeDEseq is an R package for differential expression analysis of RNA-seq count data. It models counts with the Poisson-Tweedie family, which accommodates heavy tails and zero-inflation that Poisson and negative binomial models do not capture.


Key Features:

  • Poisson-Tweedie family: Uses the Poisson-Tweedie family of count distributions to capture heavy tails and zero-inflation without altering configuration parameters.
  • Flexibility and accuracy: Handles large-scale RNA-seq datasets with extensive biological replication by accommodating diverse count distributions to better fit underlying biological variability.
  • Improved performance: Evaluations on simulated and real RNA-seq datasets show that it yields P-values that are as accurate as, or more accurate than, those produced by competing methods across the configurations tested.
  • Reproducibility: Detects differentially expressed genes more reproducibly, as evidenced by studies of sex-specific gene expression changes in human lymphoblastoid cell lines.
  • Comparison with microarrays: Has been validated against microarray results to confirm reproducibility of differential expression findings.

Scientific Applications:

  • Differential expression with many replicates: Differential expression analysis of RNA-seq datasets containing numerous biological replicates.
  • Modeling complex count distributions: Modeling RNA-seq count data exhibiting heavy tails and zero-inflation where Poisson or negative binomial assumptions fail.
  • Sex-specific expression studies: Investigating subtle sex-specific gene expression changes in human lymphoblastoid cell lines.
  • Cross-platform validation: Benchmarking and validating RNA-seq differential expression results against microarray data.

Methodology:

Fits Poisson-Tweedie distributions to RNA-seq count data to estimate the count distribution in each group of samples and to compare those distributions between groups.
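
The motivation for a more flexible count family can be illustrated with a small diagnostic. The sketch below is written in Python with the standard library only and is not part of tweeDEseq or its API. The function name poisson_diagnostics and the example counts are hypothetical, chosen for illustration. It compares a gene's variance-to-mean ratio and observed fraction of zeros with what a Poisson model with the same mean would imply.

```python
import math
from statistics import mean, variance


def poisson_diagnostics(counts):
    """Summarize how far a vector of counts departs from a Poisson model.

    Illustrative helper (hypothetical name), not a tweeDEseq function.
    """
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    observed_zero = sum(1 for c in counts if c == 0) / len(counts)
    expected_zero = math.exp(-m)  # P(X = 0) for a Poisson with mean m
    return {
        "mean": m,
        "variance": v,
        # A Poisson model implies variance == mean, i.e. an index near 1.
        "dispersion_index": v / m if m > 0 else float("nan"),
        "observed_zero_fraction": observed_zero,
        "poisson_expected_zero_fraction": expected_zero,
    }


# Hypothetical counts for one gene across 12 samples:
# mostly zeros, with a few very large values (heavy tail).
gene_counts = [0, 0, 0, 1, 0, 2, 0, 0, 35, 0, 1, 120]
summary = poisson_diagnostics(gene_counts)

for name, value in summary.items():
    print(f"{name}: {value:.6g}")
```

A dispersion index far above 1, together with many more zeros than the Poisson model predicts, is the kind of pattern that a family with additional shape flexibility is intended to accommodate. This check only flags the departure; it does not fit a Poisson-Tweedie model or test for differential expression.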

Details

License:
GPL-2.0
Tool Type:
command-line tool, library
Operating Systems:
Linux, Windows, Mac
Programming Languages:
R
Added:
1/17/2017
Last Updated:
1/10/2019

Publications

Esnaola M, Puig P, Gonzalez D, Castelo R, Gonzalez JR. A flexible count data model to fit the wide diversity of expression profiles arising from extensively replicated RNA-seq experiments. BMC Bioinformatics. 2013;14(1). doi:10.1186/1471-2105-14-254. PMID:23965047. PMCID:PMC3849762.
