RnaSeqSampleSize
RnaSeqSampleSize: Negative Binomial-Based Sample Size Estimation for RNA-Seq
RnaSeqSampleSize estimates sample size and statistical power for differential gene expression analysis in RNA-Seq experiments using negative binomial models parameterized by gene-specific read counts and dispersion estimates.
Key Features:
- Negative Binomial Model-Based Estimation: Applies a negative binomial distribution to model RNA-Seq count data, accounting for variability and overdispersion in gene expression measurements.
- Multiple Testing Control: Incorporates false discovery rate (FDR) control to support power calculations across thousands of simultaneously tested genes.
- Distribution-Based Estimation: Derives gene-specific read count and dispersion parameters from empirical RNA-Seq datasets, including The Cancer Genome Atlas (TCGA).
- Gene and Pathway-Specific Analysis: Enables targeted sample size and power estimation for selected genes or biological pathways.
- Parameter Optimization: Optimizes model parameters to improve precision and reliability of power and sample size estimates.
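The overdispersion mentioned in the feature list can be illustrated with a minimal Python sketch of the negative binomial distribution in its mean and dispersion form, where the variance is mu + phi * mu^2 and therefore exceeds the Poisson variance mu whenever phi > 0. The function names (`nb_log_pmf`, `nb_variance`, `nb_moments`) and the example values are hypothetical and are not part of RnaSeqSampleSize; this is an illustration of the distribution, not the package's code.

```python
import math


def nb_log_pmf(k, mu, phi):
    """Log-probability of observing k reads under a negative binomial
    with mean mu and dispersion phi (size r = 1/phi, p = r / (r + mu))."""
    r = 1.0 / phi
    p = r / (r + mu)
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log1p(-p))


def nb_variance(mu, phi):
    """Closed-form variance; the phi * mu^2 term is the overdispersion."""
    return mu + phi * mu ** 2


def nb_moments(mu, phi, k_max=5000):
    """Numerically sum the pmf up to k_max and return (total, mean, variance).
    k_max is an illustrative truncation point, adequate for small mu."""
    probs = [math.exp(nb_log_pmf(k, mu, phi)) for k in range(k_max + 1)]
    total = sum(probs)
    mean = sum(k * q for k, q in enumerate(probs))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(probs))
    return total, mean, var


# Illustrative values only: mean read count 5, dispersion 0.5.
total, mean, var = nb_moments(5.0, 0.5)
print(total, mean, var, nb_variance(5.0, 0.5))
```

The numerically summed variance should agree with the closed form, which makes explicit how a larger dispersion inflates variability at a fixed mean read count.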
Scientific Applications:
- Differential Expression Study Design: Determines sample sizes required to achieve specified statistical power for detecting differential gene expression in RNA-Seq experiments while controlling FDR.
Methodology:
Gene-specific read counts and dispersion parameters are estimated from reference RNA-Seq datasets such as TCGA. Read counts are then modeled with a negative binomial distribution using these parameters, and statistical power and required sample sizes are computed under multiple hypothesis testing with FDR control.
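To make the role of FDR control in this calculation concrete, the sketch below uses an approximation that is common in FDR-based sample size calculations: the target FDR is converted into a per-gene significance level from the total number of genes, the number of truly differentially expressed genes, and the desired power. Whether RnaSeqSampleSize applies exactly this formula is an assumption here, and the function names and example numbers are hypothetical.

```python
def per_gene_alpha(m, m1, power, fdr):
    """Per-gene type I error rate that yields the target FDR.

    m     : total number of genes tested
    m1    : number of truly differentially expressed genes (assumed known)
    power : desired per-gene power
    fdr   : target false discovery rate
    """
    if not 0 < m1 < m:
        raise ValueError("m1 must be between 0 and m")
    if not 0 < fdr < 1 or not 0 < power <= 1:
        raise ValueError("fdr must be in (0, 1) and power in (0, 1]")
    r1 = power * m1          # expected number of true discoveries
    m0 = m - m1              # number of true null genes
    return r1 * fdr / (m0 * (1.0 - fdr))


def implied_fdr(m, m1, power, alpha):
    """Expected false discoveries divided by expected total discoveries."""
    false_pos = (m - m1) * alpha
    true_pos = power * m1
    return false_pos / (false_pos + true_pos)


# Illustrative values only: 10,000 genes, 100 truly differential,
# 80% power, 5% FDR.
alpha = per_gene_alpha(10000, 100, 0.8, 0.05)
print(alpha, implied_fdr(10000, 100, 0.8, alpha))
```

The per-gene level is far stricter than the nominal 0.05, which is why sample sizes computed under FDR control are larger than those from a single-gene calculation at the same power.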
Details
- License: GPL-2.0
- Tool Type: command-line tool, library
- Operating Systems: Linux, Windows, Mac
- Programming Languages: R
- Added: 1/17/2017
- Last Updated: 11/25/2024
Publications
Zhao S, Li C, Guo Y, Sheng Q, Shyr Y. RnaSeqSampleSize: real data based sample size estimation for RNA sequencing. BMC Bioinformatics. 2018;19(1). doi:10.1186/s12859-018-2191-5. PMID:29843589. PMCID:PMC5975570.
Huber W, Carey VJ, Gentleman R, Anders S, Carlson M, Carvalho BS, Bravo HC, Davis S, Gatto L, Girke T, Gottardo R, Hahne F, Hansen KD, Irizarry RA, Lawrence M, Love MI, MacDonald J, Obenchain V, Oleś AK, Pagès H, Reyes A, Shannon P, Smyth GK, Tenenbaum D, Waldron L, Morgan M. Orchestrating high-throughput genomic analysis with Bioconductor. Nature Methods. 2015;12(2):115-121. doi:10.1038/nmeth.3252. PMID:25633503. PMCID:PMC4509590.