Rcorrector

Rcorrector: K-mer–based error correction for RNA-seq reads

Rcorrector corrects random sequencing errors in RNA-seq reads using a k-mer–based approach optimized for non-uniform transcript coverage.


Key Features:

  • K-mer Based Methodology: Constructs a de Bruijn graph from the k-mers in the reads and marks k-mers as trusted according to their frequency and distribution within the dataset.
  • Local Thresholding: Computes position-specific thresholds for k-mer trust evaluation to accommodate variable gene expression levels and alternative splicing in RNA-seq data.
  • Computational Efficiency: Processes 100 million reads with approximately 5 GB memory usage.
  • Technology Adaptability: Supports Illumina RNA-seq data and is applicable to sequencing datasets with non-uniform coverage, including single-cell sequencing.
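
The local thresholding idea above can be illustrated with a minimal Python sketch. This is not Rcorrector's implementation: the function names, the `alpha` fraction, and the rule "weak if below a fraction of the highest k-mer count in the same read" are assumptions chosen only to show why a per-read threshold behaves differently from a single global cutoff.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def weak_positions(read, counts, k, alpha=0.25):
    """Return start positions of k-mers that look untrusted in this read.

    Assumed rule (illustrative only): a k-mer is weak when its count is
    below alpha times the highest k-mer count observed in the same read.
    """
    kmer_counts = [counts[read[i:i + k]] for i in range(len(read) - k + 1)]
    if not kmer_counts:  # read shorter than k
        return []
    threshold = alpha * max(kmer_counts)
    return [i for i, c in enumerate(kmer_counts) if c < threshold]


# Toy data: a highly expressed transcript (20 copies), one copy of it
# with a single substitution, and a lowly expressed transcript (2 copies).
HIGH = "ACGTACGGTCA"
HIGH_WITH_ERROR = "ACGTACTGTCA"
LOW = "TTGCAAGCTTG"
READS = [HIGH] * 20 + [HIGH_WITH_ERROR] + [LOW] * 2
COUNTS = count_kmers(READS, k=4)
```

In this toy dataset a global cutoff of, say, 5 would mark every k-mer of the lowly expressed transcript as an error, whereas the per-read rule flags only the k-mers that overlap the substitution in the erroneous read.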

Scientific Applications:

  • Transcriptomic Analysis: Improves read quality for downstream alignment and assembly, enabling more accurate gene expression profiling, variant detection, and alternative splicing analysis.

Methodology:

Rcorrector builds a de Bruijn graph from the k-mers in the sequencing reads and classifies k-mers as trusted using local, frequency-based thresholds rather than a single global cutoff. It then corrects errors by replacing low-frequency k-mers with supported alternatives, which accounts for coverage that varies from transcript to transcript.
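
As a rough illustration of the correction step, the following Python sketch replaces the last base of a weak k-mer with the base that yields the best-supported k-mer. It is a simplified, greedy stand-in under stated assumptions, not the algorithm as implemented in Rcorrector: the names `count_kmers` and `correct_read`, the `alpha` fraction, and the per-read threshold are all hypothetical, and the sketch only handles substitutions that occur after the first k-1 bases of a read.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def correct_read(read, counts, k, alpha=0.25):
    """Greedily repair weak k-mers by substituting their last base.

    Assumption: the threshold is alpha times the highest k-mer count
    in the read, computed once before any correction is applied.
    """
    bases = list(read)
    n_kmers = len(bases) - k + 1
    if n_kmers <= 0:  # read shorter than k: nothing to evaluate
        return read
    threshold = alpha * max(counts[read[i:i + k]] for i in range(n_kmers))
    for i in range(n_kmers):
        kmer = "".join(bases[i:i + k])
        if counts[kmer] >= threshold:
            continue  # k-mer is trusted, move on
        # Scanning left to right, the base that just entered the window
        # is the most likely culprit; try each alternative for it.
        prefix = kmer[:-1]
        best = max("ACGT", key=lambda b: counts[prefix + b])
        if counts[prefix + best] >= threshold:
            bases[i + k - 1] = best  # later k-mers see the corrected base
    return "".join(bases)


# Toy data: 20 correct copies, one copy with a G->T substitution,
# and two copies of a lowly expressed transcript.
READS = ["ACGTACGGTCA"] * 20 + ["ACGTACTGTCA"] + ["TTGCAAGCTTG"] * 2
COUNTS = count_kmers(READS, k=4)
CORRECTED = correct_read("ACGTACTGTCA", COUNTS, k=4)
```

Because the threshold is derived from the read itself, the two low-coverage reads pass through unchanged while the substitution in the high-coverage read is repaired.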

Details

Tool Type:
command-line tool
Operating Systems:
Linux, Mac
Programming Languages:
C++, Perl, C
Added:
8/3/2017
Last Updated:
11/25/2024

Publications

Song L, Florea L. Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads. Gigascience. 2015;4(1). doi:10.1186/s13742-015-0089-y. PMID:26500767. PMCID:PMC4615873.