Supersplat

Supersplat identifies splice junctions from RNA sequencing (RNA-seq) reads by aligning them against a genomic reference, supporting genome annotation and transcript structure discovery.


Key Features:

  • Empirical splice-junction discovery: Identifies splice junctions directly from RNA-seq data without relying on existing splice annotations and reports intron/exon boundaries.
  • Processing throughput: Processes RNA-seq datasets at a reported rate of approximately 11.4 million reads per hour.
  • High-throughput scaling: Handles large RNA-seq datasets, making it suitable for large-scale genomic studies.
  • Benchmarking and validation: Benchmarked using Illumina RNA-seq reads mapped against the Arabidopsis thaliana genome.
  • De novo annotation utility: Applied to de novo annotation, including analysis of Brachypodium distachyon.
  • Implementation: Implemented in C++ and optimized for performance.
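
As a rough planning aid, the reported rate can be turned into a run-time estimate. The sketch below assumes the throughput stays constant at the reported figure, which will not hold exactly across hardware, read lengths, or parameter settings; the function name estimated_hours is hypothetical and not part of Supersplat.

```python
# Back-of-the-envelope run-time estimate from the reported throughput.
# Assumption: the rate is constant; real performance depends on hardware,
# read length, and search parameters.

READS_PER_HOUR = 11.4e6  # reported rate, approximately 11.4 million reads/hour


def estimated_hours(num_reads, reads_per_hour=READS_PER_HOUR):
    """Return the estimated hours needed to process num_reads reads."""
    if num_reads < 0:
        raise ValueError("num_reads must be non-negative")
    return num_reads / reads_per_hour


if __name__ == "__main__":
    for n in (11_400_000, 57_000_000):
        print(f"{n:,} reads: about {estimated_hours(n):.1f} h")
```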

Scientific Applications:

  • Genome annotation: Delineates intron/exon boundaries and splice junctions to improve structural genome annotations.
  • Transcriptome characterization: Detects splice junctions from RNA-seq data in species ranging from established models to newly sequenced genomes; reported applications include Arabidopsis thaliana and Brachypodium distachyon.
  • De novo gene model annotation: Supports generation of gene models in species lacking comprehensive reference annotations.

Methodology:

Supersplat maps RNA-seq reads to genomic reference sequences and empirically identifies splice junctions from the mapped reads. It is implemented in C++, has a reported throughput of approximately 11.4 million reads per hour, and was validated using Illumina RNA-seq reads mapped to the Arabidopsis thaliana genome.
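
The general idea of empirical junction discovery can be illustrated with a small split-read search. This is a simplified sketch, not Supersplat's actual C++ implementation: it assumes exact matching, a single forward-strand reference held in memory, and a naive string scan instead of an index. The function names and the parameters min_chunk, min_intron, and max_intron are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_all(text, pattern):
    """Yield every start index at which pattern occurs in text (overlaps allowed)."""
    start = text.find(pattern)
    while start != -1:
        yield start
        start = text.find(pattern, start + 1)


def candidate_junctions(read, reference, min_chunk=4, min_intron=5, max_intron=1000):
    """Return the set of (intron_start, intron_end) pairs implied by a read.

    Coordinates are 0-based and half-open. The read is split at every position
    that leaves at least min_chunk bases on each side; a junction is reported
    when both chunks match the reference exactly and the gap between them is
    within [min_intron, max_intron].
    """
    found = set()
    for split in range(min_chunk, len(read) - min_chunk + 1):
        left, right = read[:split], read[split:]
        for left_pos in find_all(reference, left):
            intron_start = left_pos + len(left)
            for right_pos in find_all(reference, right):
                intron_len = right_pos - intron_start
                if min_intron <= intron_len <= max_intron:
                    found.add((intron_start, right_pos))
    return found


def junction_support(reads, reference, **kwargs):
    """Count how many reads support each candidate junction."""
    counts = Counter()
    for read in reads:
        counts.update(candidate_junctions(read, reference, **kwargs))
    return counts


if __name__ == "__main__":
    # Toy reference: exon, intron, exon (made-up sequences).
    reference = "ACGTACCTGA" + "GTTTTTTTTAG" + "CCATGGCATT"
    reads = ["ACCTGACCATGG", "CTGACCATGGCA", "ACGTACCTGA"]
    for junction, n in junction_support(reads, reference).items():
        print(junction, n)
```

Counting supporting reads per junction, as junction_support does, gives a simple way to rank candidates; when bases at the exon and intron boundaries are identical, a search like this can report several shifted positions for the same splice event, so real pipelines need additional filtering.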

Details

Tool Type:
command-line tool
Operating Systems:
Linux, Mac
Added:
1/13/2017
Last Updated:
11/24/2024

Publications

Bryant DW, Shen R, Priest HD, Wong W, Mockler TC. Supersplat—spliced RNA-seq alignment. Bioinformatics. 2010;26(12):1500-1505. doi:10.1093/bioinformatics/btq206. PMID:20410051. PMCID:PMC2881391.