RSeQC

RSeQC performs comprehensive quality control and evaluation of RNA sequencing (RNA-seq) data to assess sequence quality, mapping distribution, coverage uniformity, strand specificity, nucleotide composition bias, PCR and GC bias, and transcript integrity.


Key Features:

  • Quality Control Modules: Evaluates sequence quality, nucleotide composition bias, PCR bias, and GC content bias.
  • Sequencing Saturation: Assesses whether sequencing depth has reached saturation for transcript discovery and quantification.
  • Mapped Reads Distribution: Analyzes how mapped reads are distributed across genomic features such as exons, introns, and intergenic regions.
  • Coverage Uniformity: Measures evenness of coverage across transcripts to inform expression quantification.
  • Strand Specificity: Validates strand-specific nature of RNA-seq data to distinguish sense and antisense transcripts.
  • Input Compatibility: Processes SAM/BAM formats from RNA-seq mappers and BED files for gene models.
  • Efficiency and Visualization: Generates visualizations with R scripts and scales to large datasets with hundreds of millions of alignments.
  • Programming Languages: Implemented in Python and C.
  • Transcript Integrity Number (TIN): Computes TIN to measure RNA degradation at transcript and sample levels and supports correction of gene expression counts to mitigate degradation effects.
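
The modules listed above are typically run as separate command-line scripts over the same BAM file and BED gene model. The following is a minimal sketch of how such a run could be assembled; `build_commands` is a hypothetical helper (not part of RSeQC), and the script names and the `-i`/`-r`/`-o` options are assumptions based on commonly documented RSeQC usage that should be checked against the installed version.

```python
import shlex


def build_commands(bam, bed, out_prefix):
    """Return one argv list per QC module.

    Hypothetical helper for illustration only. Script names and flags are
    assumed from typical RSeQC usage; confirm them with each script's --help.
    """
    return [
        # Basic mapping statistics (needs only the alignments).
        ["bam_stat.py", "-i", bam],
        # Distribution of mapped reads over genomic features.
        ["read_distribution.py", "-i", bam, "-r", bed],
        # Strand-specificity check.
        ["infer_experiment.py", "-i", bam, "-r", bed],
        # Coverage uniformity along gene bodies.
        ["geneBody_coverage.py", "-i", bam, "-r", bed, "-o", out_prefix],
        # Sequencing saturation for expression estimates.
        ["RPKM_saturation.py", "-i", bam, "-r", bed, "-o", out_prefix],
        # Transcript Integrity Number.
        ["tin.py", "-i", bam, "-r", bed],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # without RSeQC installed. Pass each list to subprocess.run to execute.
    for argv in build_commands("sample.bam", "genes.bed", "sample_qc"):
        print(shlex.join(argv))
```

Building argument lists rather than shell strings keeps file names with spaces safe and makes it straightforward to hand each command to `subprocess.run`.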

Scientific Applications:

  • Transcriptome profiling: Provides QC metrics required for reliable transcriptome profiling.
  • Gene expression analysis: Supplies coverage, saturation, and bias assessments used in expression quantification and differential expression.
  • RNA degradation studies: Quantifies degradation effects using TIN and enables correction of expression counts for degraded samples.
  • Translational and clinical research: Calibrates gene expression data with TIN to reduce degradation-related artifacts in archived clinical tissues.
  • Cancer research, genomics, and personalized medicine: Improves the reliability of RNA-seq–based findings by identifying low-quality or degraded samples before downstream analysis.

Methodology:

RSeQC parses SAM/BAM alignments and BED gene models. From these inputs it evaluates sequence quality, nucleotide composition, PCR bias, and GC bias, and it computes sequencing saturation, mapped reads distribution, coverage uniformity, and strand specificity. It also computes the Transcript Integrity Number (TIN) at the transcript and sample levels. Visualizations are produced by R scripts, and the tool itself is implemented in Python and C.
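
To make the TIN step more concrete, here is a minimal sketch that assumes the entropy-based definition described in the TIN publication listed below: coverage evenness along a transcript is summarized by Shannon entropy and rescaled to 0-100, and the sample-level value is taken as the median across transcripts. The function names and inputs are hypothetical, and the actual RSeQC implementation differs in detail (for example in how positions are sampled and which transcripts are filtered out).

```python
import math
import statistics


def transcript_tin(coverage):
    """TIN-style score in [0, 100] from per-position read coverage.

    Assumed formula: 100 * exp(H) / k, where H is the Shannon entropy of the
    normalized coverage and k is the number of positions considered.
    """
    k = len(coverage)
    total = sum(coverage)
    if k == 0 or total == 0:
        return 0.0  # no positions or no reads: treat as no measurable integrity
    entropy = 0.0
    for c in coverage:
        if c > 0:
            p = c / total
            entropy -= p * math.log(p)
    return 100.0 * math.exp(entropy) / k


def sample_tin(transcript_scores):
    """Sample-level summary: median of the transcript-level scores."""
    return statistics.median(transcript_scores)


if __name__ == "__main__":
    even = transcript_tin([5, 5, 5, 5])    # perfectly uniform coverage
    skewed = transcript_tin([8, 0, 0, 0])  # all reads at one position
    print(even, skewed, sample_tin([even, skewed, 60.0]))
```

Under this definition, perfectly even coverage scores close to 100, while coverage concentrated at a few positions (as expected for degraded RNA) scores much lower.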

Details

Tool Type: workflow
Operating Systems: Linux, Mac
Added: 1/17/2017
Last Updated: 11/24/2024

Publications

Wang L, Wang S, Li W. RSeQC: quality control of RNA-seq experiments. Bioinformatics. 2012;28(16):2184-2185. doi:10.1093/bioinformatics/bts356. PMID:22743226.

Wang L, Nie J, Sicotte H, Li Y, Eckel-Passow JE, Dasari S, Vedell PT, Barman P, Wang L, Weinshiboum R, Jen J, Huang H, Kohli M, Kocher JA. Measure transcript integrity using RNA-seq data. BMC Bioinformatics. 2016;17(1). doi:10.1186/s12859-016-0922-z. PMID:26842848. PMCID:PMC4739097.
