STAR

STAR (Spliced Transcripts Alignment to a Reference) aligns high-throughput RNA sequencing (RNA-seq) reads to a reference genome, identifying exon–exon junctions, including non-canonical splices, and chimeric (fusion) transcripts.


Key Features:

  • Algorithmic Innovation: Employs sequential maximum mappable seed search within uncompressed suffix arrays followed by seed clustering and stitching to enable fast and accurate spliced alignment.
  • Performance: The original publication reports mapping speeds more than 50-fold higher than other RNA-seq aligners available at the time, aligning 550 million 2 × 76 bp paired-end reads per hour to the human genome on a 12-core server while improving sensitivity and precision.
  • Comprehensive Mapping Capabilities: Detects canonical and non-canonical splice junctions, identifies chimeric (fusion) transcripts, and supports mapping of full-length RNA sequences.
  • Validation and Precision: In the original publication, Roche 454 sequencing of RT-PCR amplicons experimentally validated 1,960 novel intergenic splice junctions with an 80–90% success rate.
  • Implementation: Standalone command-line program written in C++.

Scientific Applications:

  • Alignment Accuracy: Serves as the primary alignment step in RNA-seq pipelines, providing reliable base-, read-, and exon-junction-level mapping across genomes of varying complexity.
  • Benchmarking Insights: Benchmarking studies such as the simulation-based comparison by Baruzzo et al. (2016) report that STAR maintains high accuracy when its parameters are tuned.
  • Integration with Other Tools: Reads that STAR leaves unmapped can be passed to tools such as segemehl and lack (Otto et al., 2014), which attempt to rescue them.
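
As a rough illustration of the alignment step inside a pipeline, the Python sketch below assembles a STAR command line without running it. The helper name, the paths, and the value passed to --chimSegmentMin are hypothetical; the flags themselves are taken from the STAR manual, but check them against the manual for the installed version before use.

```python
import shlex


def star_align_command(genome_dir, reads, out_prefix, threads=12,
                       detect_chimeras=False):
    """Build (but do not run) a STAR alignment command as an argument list.

    Hypothetical helper for illustration only; it is not part of STAR.
    """
    cmd = [
        "STAR",
        "--runThreadN", str(threads),        # number of worker threads
        "--genomeDir", genome_dir,           # directory holding a pre-built STAR index
        "--readFilesIn", *reads,             # one file (single-end) or two (paired-end)
        "--outFileNamePrefix", out_prefix,   # prefix for all output files
        "--outSAMtype", "BAM", "SortedByCoordinate",
    ]
    if detect_chimeras:
        # A non-zero minimum chimeric segment length switches on chimeric
        # (fusion) detection; 20 is an illustrative value, not a recommendation.
        cmd += ["--chimSegmentMin", "20"]
    return cmd


# Hypothetical paths; replace with real locations.
cmd = star_align_command(
    "ref/star_index",
    ["sample_R1.fastq", "sample_R2.fastq"],
    "out/sample_",
    detect_chimeras=True,
)
print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) once a genome index has been generated and the paths point at real files.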

Methodology:

STAR performs a sequential maximum mappable seed search in uncompressed suffix arrays, then clusters and stitches the resulting seeds to produce spliced alignments and to detect splice junctions and chimeric transcripts.
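
The seed search can be illustrated with a toy Python sketch. This is a simplified model and not STAR's implementation: it uses a naively built suffix array, searches only the forward strand, requires exact matches, and omits the clustering, stitching, and scoring stages. All function names and sequences below are invented for the example.

```python
def build_suffix_array(genome):
    """Naive suffix array: start positions sorted by suffix (toy sizes only)."""
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def _char_at(genome, pos, offset):
    """Character at genome[pos + offset], or '' past the end (sorts first)."""
    i = pos + offset
    return genome[i] if i < len(genome) else ""


def _narrow(genome, sa, lo, hi, offset, c):
    """Shrink the interval [lo, hi) to suffixes whose character at `offset` is c."""
    a, b = lo, hi
    while a < b:                      # lower bound
        m = (a + b) // 2
        if _char_at(genome, sa[m], offset) < c:
            a = m + 1
        else:
            b = m
    new_lo, b = a, hi
    while a < b:                      # upper bound
        m = (a + b) // 2
        if _char_at(genome, sa[m], offset) <= c:
            a = m + 1
        else:
            b = m
    return new_lo, a


def maximal_mappable_prefix(read, start, genome, sa):
    """Longest exact match of read[start:] in genome: (length, positions)."""
    lo, hi, length = 0, len(sa), 0
    while start + length < len(read):
        new_lo, new_hi = _narrow(genome, sa, lo, hi, length, read[start + length])
        if new_lo == new_hi:          # no suffix extends the match further
            break
        lo, hi, length = new_lo, new_hi, length + 1
    return length, (sorted(sa[lo:hi]) if length else [])


def sequential_seeds(read, genome, sa):
    """Repeat the prefix search on the unmapped remainder of the read."""
    seeds, start = [], 0
    while start < len(read):
        length, positions = maximal_mappable_prefix(read, start, genome, sa)
        if length == 0:
            start += 1                # skip a base that maps nowhere
            continue
        seeds.append((start, length, positions))
        start += length
    return seeds


# Toy genome: exon 1, an intron with GT...AG boundaries, exon 2.
genome = "ACGTTGCA" + "GTAAGTTTTCAG" + "CCATGGAC"
read = "TTGCA" + "CCATG"              # spans the exon 1 / exon 2 junction
sa = build_suffix_array(genome)
seeds = sequential_seeds(read, genome, sa)
print(seeds)                          # [(0, 5, [3]), (5, 5, [20])]
```

The first seed ends at genome position 8 and the second begins at position 20, so the gap between them (GTAAGTTTTCAG) is the candidate intron that a stitching step would report as a splice junction.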

Details

License: GPL-3.0
Maturity: Mature
Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: C++
Added: 1/13/2017
Last Updated: 11/24/2024

Publications

Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2012;29(1):15-21. doi:10.1093/bioinformatics/bts635. PMID:23104886. PMCID:PMC3530905.

Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. 2016;14(2):135-139. doi:10.1038/nmeth.4106. PMID:27941783. PMCID:PMC5792058.

Otto C, Stadler PF, Hoffmann S. Lacking alignments? The next-generation sequencing mapper segemehl revisited. Bioinformatics. 2014;30(13):1837-1843. doi:10.1093/bioinformatics/btu146. PMID:24626854.
