PennSeq
PennSeq estimates isoform-specific gene expression from RNA sequencing (RNA-Seq) data using a non-parametric statistical model that accounts for isoform-specific, non-uniform read distributions and sequencing biases.
Key Features:
- Non-parametric modeling: Makes no parametric assumptions about the shape of read distributions.
- Isoform-specific read distributions: Allows each isoform to have its own non-uniform read distribution.
- Bias accommodation: Accounts for hexamer priming bias, local sequence bias, positional bias, RNA degradation, mapping bias, and other unknown factors.
- Empirical fragment sampling probabilities: Estimates the probability that a fragment is sampled from particular regions based on aligned RNA-Seq data.
- Validation datasets: Evaluated on simulated datasets with known ground truth and on real Illumina RNA-Seq datasets with accompanying qRT-PCR measurements.
- Performance advantage: Reported to estimate isoform expression more accurately than existing methods, particularly when read distributions are severely non-uniform.
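The idea of empirical fragment sampling probabilities can be illustrated with a minimal sketch. This is not PennSeq's code (PennSeq is written in Perl) and does not reproduce its model. It assumes fragment start positions have already been converted to isoform coordinates, and the function name, the binning scheme, and the pseudocount are illustrative choices only.

```python
from collections import Counter


def empirical_sampling_probs(fragment_starts, isoform_length, n_bins=10, pseudocount=1.0):
    """Estimate per-bin fragment sampling probabilities along one isoform.

    fragment_starts: 0-based fragment start positions in isoform coordinates
                     (hypothetical input; not a PennSeq file format).
    Returns a list of n_bins probabilities that sum to 1.
    """
    if isoform_length <= 0 or n_bins <= 0:
        raise ValueError("isoform_length and n_bins must be positive")

    counts = Counter()
    for pos in fragment_starts:
        if 0 <= pos < isoform_length:  # ignore positions outside the isoform
            counts[pos * n_bins // isoform_length] += 1

    # The pseudocount keeps bins without observed fragments at non-zero probability.
    total = sum(counts.values()) + pseudocount * n_bins
    if total == 0:
        raise ValueError("no fragments and no pseudocount: probabilities undefined")
    return [(counts[b] + pseudocount) / total for b in range(n_bins)]
```

A uniform-coverage model would assign every bin the same probability; here the probabilities follow the observed data, which is the sense in which the estimate is empirical.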
Scientific Applications:
- Isoform-level quantification: Estimating isoform-specific gene expression from RNA-Seq data.
- Bias-sensitive analyses: Improving accuracy of expression estimates in studies affected by sequencing and sample biases.
- Disease-associated gene studies: Supporting more accurate identification of disease-susceptibility genes through better isoform quantification.
- Method benchmarking: Benchmarking expression-estimation methods using simulated ground truth and qRT-PCR validation.
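For the benchmarking use case, agreement between estimated and reference expression values is commonly summarized with a correlation and an error measure. The sketch below shows one way to compute both in plain Python. The function names and the choice of metrics are assumptions made for illustration; they are not taken from the PennSeq publication.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def mean_relative_error(estimates, truth):
    """Mean of |estimate - truth| / truth over entries with truth > 0."""
    pairs = [(e, t) for e, t in zip(estimates, truth) if t > 0]
    if not pairs:
        raise ValueError("no entries with positive reference value")
    return sum(abs(e - t) / t for e, t in pairs) / len(pairs)
```

With simulated data, `truth` would hold the known isoform abundances; with real data, it would hold the qRT-PCR measurements.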
Methodology:
PennSeq uses a non-parametric statistical approach to model isoform-specific, non-uniform read distributions, and it estimates empirical fragment sampling probabilities from aligned RNA-Seq data. Its performance was evaluated on simulated datasets and on Illumina RNA-Seq datasets with qRT-PCR measurements.
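To show how isoform-specific sampling probabilities can feed into abundance estimation, the following sketch runs a generic expectation-maximization (EM) loop over a fragment-by-isoform probability matrix. It is a textbook mixture-model EM offered under stated assumptions, not PennSeq's algorithm: the matrix `compat`, the function name, and the stopping rule are hypothetical, and the sampling probabilities are held fixed rather than re-estimated.

```python
def em_isoform_abundance(compat, n_iter=200, tol=1e-9):
    """Estimate the fraction of fragments originating from each isoform.

    compat[r][k]: assumed probability of observing fragment r given that it
                  came from isoform k (0.0 if the fragment is incompatible).
    Returns a list of proportions, one per isoform, summing to 1.
    """
    if not compat:
        raise ValueError("compat must contain at least one fragment")
    n_iso = len(compat[0])
    theta = [1.0 / n_iso] * n_iso  # start from equal proportions

    for _ in range(n_iter):
        expected = [0.0] * n_iso
        # E-step: split each fragment across isoforms by posterior weight.
        for row in compat:
            weights = [theta[k] * row[k] for k in range(n_iso)]
            z = sum(weights)
            if z == 0.0:
                continue  # fragment compatible with no isoform; skip it
            for k in range(n_iso):
                expected[k] += weights[k] / z

        total = sum(expected)
        if total == 0.0:
            break  # nothing assignable; keep the current estimate

        # M-step: renormalize the expected fragment counts.
        new_theta = [c / total for c in expected]
        converged = max(abs(a - b) for a, b in zip(theta, new_theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta
```

The returned values are fragment proportions. Turning them into expression levels would additionally require normalizing by isoform length, which this sketch leaves out.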
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Windows
- Programming Languages: Perl
- Added: 12/18/2017
- Last Updated: 11/24/2024
Publications
Hu Y, Liu Y, Mao X, Jia C, Ferguson JF, Xue C, Reilly MP, Li H, Li M. PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution. Nucleic Acids Research. 2013;42(3):e20. doi:10.1093/nar/gkt1304. PMID:24362841. PMCID:PMC3919567.