SGSeq

SGSeq predicts splice junctions and exons from RNA-seq data, assembles them into genome-wide splice graphs, and identifies and quantifies splice events, including unannotated and complex splicing patterns, to measure splice variant usage.


Key Features:

  • Genome-Guided Prediction: Predicts splice junctions and exons from RNA-seq reads aligned to a reference genome, for downstream assembly into splice graphs.
  • Splice Graph Construction: Generates genome-wide splice graphs from existing annotations or predicts them de novo directly from RNA-seq data.
  • Recursive Event Identification: Identifies splice events recursively within the constructed splice graph to capture complex splicing patterns.
  • Local Quantification: Quantifies splice event usage locally by analyzing reads that extend across the start or end points of each splice variant.
  • Aligner Impact Assessment: Evaluates the influence of read aligners GSNAP, HISAT2, STAR, and TopHat2 on prediction accuracy.
  • Comparative Quantification Validation: Validates quantification using simulated data and compares results with methods such as MISO and Cufflinks.
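The splice graph and event features above can be illustrated with a toy example. The following is a minimal sketch, not SGSeq's implementation: the function names (`build_splice_graph`, `enumerate_paths`) and the exon labels are hypothetical, and plain recursive path enumeration stands in for the package's own event identification. It shows how two exons joined by more than one path in a splice graph define a splice event with several variants.

```python
from collections import defaultdict


def build_splice_graph(edges):
    """Build an adjacency list from (source, target) pairs.

    Nodes are exon labels; each edge is a splice junction (hypothetical toy data).
    """
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    return graph


def enumerate_paths(graph, start, end):
    """Recursively list every path from start to end in an acyclic graph."""
    if start == end:
        return [[end]]
    paths = []
    for nxt in graph.get(start, []):
        for tail in enumerate_paths(graph, nxt, end):
            paths.append([start] + tail)
    return paths


# Toy gene with a skipped exon: E2 is either included (E1-E2-E3) or skipped (E1-E3).
edges = [("E1", "E2"), ("E2", "E3"), ("E1", "E3")]
graph = build_splice_graph(edges)

# Each path between the shared start and end node is one splice variant of the event.
variants = enumerate_paths(graph, "E1", "E3")
for path in variants:
    print("-".join(path))
```

In this sketch the event is bounded by E1 and E3, and its two variants are the exon-inclusion and exon-skipping paths.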

Scientific Applications:

  • Analysis of Complex Splicing Events: Detects complex and unannotated splice events for detailed characterization of alternative splicing.
  • Validation Across Datasets and Tissues: Validated using simulated data and RNA-seq from 16 normal human tissues to assess prediction and quantification performance.
  • Discovery of Novel Internal Exons: Identified 249 previously unannotated internal exons within known genes in the Illumina Body Map 2.0 dataset, with validation by RT-PCR and paired-end RNA-seq in independent samples.

Methodology:

RNA-seq reads aligned to a reference genome are used to predict junctions and exons, which are assembled into splice graphs (from annotations or de novo). Splice events are identified recursively within the graph and quantified using reads that span the start or end of each splice variant. Prediction and quantification performance is assessed with different aligners (GSNAP, HISAT2, STAR, TopHat2) and compared with MISO and Cufflinks using simulated data.
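The local quantification step can be sketched numerically. This is a hedged illustration under stated assumptions, not the package's documented estimator: the function name `variant_frequency`, its argument names, and the read counts are hypothetical, and the pooled ratio is one plausible way to turn boundary-spanning read counts into a relative usage estimate.

```python
def variant_frequency(x5p, x3p, n5p, n3p):
    """Estimate relative usage of one splice variant within an event.

    x5p, x3p: reads compatible with the variant at its start and end
    n5p, n3p: reads compatible with any variant of the event at the
              event start and end
    Returns None when no reads span either boundary (assumed convention).
    """
    total = n5p + n3p
    if total == 0:
        return None
    return (x5p + x3p) / total


# Hypothetical exon-skipping event with 40 reads spanning each event boundary.
inclusion = variant_frequency(x5p=30, x3p=34, n5p=40, n3p=40)
skipping = variant_frequency(x5p=10, x3p=6, n5p=40, n3p=40)
print(inclusion, skipping)
```

Because only reads at the variant boundaries are counted, the estimate does not depend on coverage elsewhere in the gene, which is the sense in which the quantification is local.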

Details

License: Artistic-2.0
Tool Type: command-line tool, library
Operating Systems: Linux, Windows, Mac
Programming Languages: R
Added: 1/17/2017
Last Updated: 1/13/2019

Publications

Goldstein LD, Cao Y, Pau G, Lawrence M, Wu TD, Seshagiri S, Gentleman R. Prediction and Quantification of Splice Events from RNA-Seq Data. PLOS ONE. 2016;11(5):e0156132. doi:10.1371/journal.pone.0156132. PMID:27218464. PMCID:PMC4878813.