RSeQC
RSeQC performs comprehensive quality control and evaluation of RNA sequencing (RNA-seq) data to assess sequence quality, mapping distribution, coverage uniformity, strand specificity, nucleotide composition bias, PCR and GC bias, and transcript integrity.
Key Features:
- Quality Control Modules: Evaluates sequence quality, nucleotide composition bias, PCR bias, and GC content bias.
- Sequencing Saturation: Assesses whether sequencing depth has reached saturation for transcript discovery and quantification.
- Mapped Reads Distribution: Analyzes how mapped reads are distributed across genomic features such as exons, introns, and intergenic regions.
- Coverage Uniformity: Measures evenness of coverage across transcripts to inform expression quantification.
- Strand Specificity: Validates strand-specific nature of RNA-seq data to distinguish sense and antisense transcripts.
- Input Compatibility: Processes SAM/BAM formats from RNA-seq mappers and BED files for gene models.
- Efficiency and Visualization: Generates visualizations with R scripts and handles large datasets with hundreds of millions of alignments.
- Programming Languages: Implemented in Python and C.
- Transcript Integrity Number (TIN): Computes TIN to measure RNA degradation at transcript and sample levels and supports correction of gene expression counts to mitigate degradation effects.
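As a rough illustration of the coverage-uniformity idea behind TIN, the sketch below scores a transcript's per-position coverage with a Shannon-entropy measure scaled to 0-100. It is a simplified approximation, not RSeQC's implementation: the function name `tin_like_score` is hypothetical, and the position sampling and minimum-coverage filtering a real tool would apply are omitted.

```python
import math


def tin_like_score(coverage):
    """Entropy-based uniformity score in [0, 100] for per-position coverage.

    Hypothetical helper for illustration only; not part of RSeQC.
    """
    total = sum(coverage)
    k = len(coverage)
    if k == 0 or total == 0:
        return 0.0
    # Shannon entropy of the normalized coverage profile.
    entropy = 0.0
    for c in coverage:
        if c > 0:
            p = c / total
            entropy -= p * math.log(p)
    # exp(entropy) is the effective number of evenly covered positions;
    # dividing by k expresses it as a percentage of the positions examined.
    return 100.0 * math.exp(entropy) / k


uniform = [10] * 50               # even coverage along the transcript
degraded = [0] * 40 + [50] * 10   # coverage piled up at one end

print(round(tin_like_score(uniform), 1))   # 100.0
print(round(tin_like_score(degraded), 1))  # 20.0
```

Evenly covered transcripts score near 100, while transcripts whose reads concentrate in a small region, as expected after degradation, score much lower.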
Scientific Applications:
- Transcriptome profiling: Provides QC metrics required for reliable transcriptome profiling.
- Gene expression analysis: Supplies coverage, saturation, and bias assessments used in expression quantification and differential expression.
- RNA degradation studies: Quantifies degradation effects using TIN and enables correction of expression counts for degraded samples.
- Translational and clinical research: Calibrates gene expression data with TIN to reduce degradation-related artifacts in archived clinical tissues.
- Cancer research, genomics, and personalized medicine: Improves the reliability of RNA-seq-based findings in these fields.
Methodology:
RSeQC parses SAM/BAM alignments and BED gene models. From these inputs it evaluates sequence quality, nucleotide composition, PCR bias, and GC bias; computes sequencing saturation, mapped reads distribution, coverage uniformity, and strand specificity; and calculates the Transcript Integrity Number (TIN) at transcript and sample levels. R scripts generate the visualizations, and the package itself is implemented in Python and C.
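The strand-specificity step can be pictured with a minimal sketch that compares each read's mapped strand, taken from the SAM FLAG field, with the strand of the gene it overlaps, taken from the BED gene model. It assumes single-end reads that each overlap exactly one gene. The function names are hypothetical and this is not RSeQC's own code; only the SAM flag bits 0x4 (unmapped) and 0x10 (reverse strand) come from the SAM specification.

```python
from collections import Counter


def read_strand(sam_flag):
    """Strand of a mapped read: bit 0x10 of the SAM FLAG marks the reverse strand."""
    return "-" if sam_flag & 0x10 else "+"


def strand_fractions(records):
    """Fraction of reads on the same (sense) or opposite (antisense) strand as their gene.

    records: iterable of (sam_flag, gene_strand) pairs, where gene_strand is
    "+" or "-" as given in the strand column of a BED gene model.
    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for sam_flag, gene_strand in records:
        if sam_flag & 0x4:  # skip unmapped reads
            continue
        label = "sense" if read_strand(sam_flag) == gene_strand else "antisense"
        counts[label] += 1
    total = sum(counts.values())
    if total == 0:
        return {"sense": 0.0, "antisense": 0.0}
    return {label: counts[label] / total for label in ("sense", "antisense")}


# Toy input: three sense reads, one antisense read, one unmapped read.
records = [(0, "+"), (16, "-"), (0, "+"), (16, "+"), (4, "+")]
print(strand_fractions(records))  # {'sense': 0.75, 'antisense': 0.25}
```

Fractions near 0.5 for both classes suggest an unstranded library, while a fraction near 1.0 for one class suggests a strand-specific protocol.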
Details
- Tool Type: workflow
- Operating Systems: Linux, Mac
- Added: 1/17/2017
- Last Updated: 11/24/2024
Publications
Wang L, Wang S, Li W. RSeQC: quality control of RNA-seq experiments. Bioinformatics. 2012;28(16):2184-2185. doi:10.1093/bioinformatics/bts356. PMID:22743226.
Wang L, Nie J, Sicotte H, Li Y, Eckel-Passow JE, Dasari S, Vedell PT, Barman P, Wang L, Weinshilboum R, Jen J, Huang H, Kohli M, Kocher JA. Measure transcript integrity using RNA-seq data. BMC Bioinformatics. 2016;17:58. doi:10.1186/s12859-016-0922-z. PMID:26842848. PMCID:PMC4739097.