TopHat

TopHat aligns RNA-Seq reads from mRNA sequencing to a reference genome and discovers exon-exon splice junctions to identify known and novel splice variants.


Key Features:

  • Alignment without known splice sites: Performs alignment without depending on pre-existing splice junction annotations, enabling ab initio discovery of novel splice variants.
  • Efficient read mapping (Bowtie): Built on the ultrafast short-read mapper Bowtie; the authors report mapping nearly 2.2 million reads per CPU hour on a large RNA-Seq dataset.
  • Discovery of novel splice junctions: In a mammalian RNA-Seq experiment, recovered over 72% of the junctions identified by an annotation-based approach and reported nearly 20,000 previously unreported splice junctions.
  • Short fragment handling: Maps short sequence fragments generated from mRNA sequencing to support transcript-level analyses.
  • Limits of ab initio discovery: Highlights the algorithmic challenges of accurately predicting novel junctions in complex genomic contexts.
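
To put the throughput figure in the list above into perspective, the following minimal sketch converts a read count into an estimated number of CPU hours. It assumes a constant rate of 2.2 million reads per CPU hour (the figure quoted above); the function name and the 44-million-read dataset are hypothetical and chosen only for illustration. Real run times depend on hardware, read length, and genome size.

```python
# Throughput quoted in the feature list (reads mapped per CPU hour).
READS_PER_CPU_HOUR = 2_200_000


def estimated_cpu_hours(num_reads: int,
                        reads_per_cpu_hour: int = READS_PER_CPU_HOUR) -> float:
    """Return the CPU hours needed to map num_reads at a fixed throughput."""
    if num_reads < 0:
        raise ValueError("num_reads must be non-negative")
    return num_reads / reads_per_cpu_hour


if __name__ == "__main__":
    # Hypothetical dataset size, not taken from the publication.
    hypothetical_reads = 44_000_000
    print(f"{estimated_cpu_hours(hypothetical_reads):.1f} CPU hours")  # 20.0 CPU hours
```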

Scientific Applications:

  • Gene expression and transcriptomics: Characterizes gene expression and transcript structure in genomics and transcriptomics studies.
  • Alternative splicing analysis: Identifies alternative splicing events and novel splice variants for studies of gene regulation in health and disease.
  • Annotation-independent junction discovery: Enables annotation-independent discovery of splice junctions in mammalian RNA-Seq experiments.

Methodology:

TopHat maps RNA-Seq reads to a reference genome with the ultrafast short-read mapper Bowtie and identifies exon-exon splice junctions from the alignments, without relying on known splice site annotations.
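
The idea of finding a junction without annotations can be illustrated with a toy example. The sketch below is not TopHat's algorithm and does not use Bowtie: it simply splits a read at every possible position, looks for exact matches of both halves in a reference string, and reports a junction when the gap between them starts with GT and ends with AG (the canonical intron boundary dinucleotides, assumed here as the only accepted signal). All function names, parameters, and sequences are hypothetical.

```python
from typing import List, Tuple


def find_all(text: str, pattern: str) -> List[int]:
    """Return every start index at which pattern occurs in text."""
    hits = []
    start = text.find(pattern)
    while start != -1:
        hits.append(start)
        start = text.find(pattern, start + 1)
    return hits


def candidate_junctions(genome: str, read: str,
                        min_anchor: int = 4,
                        min_intron: int = 4) -> List[Tuple[int, int]]:
    """Return (intron_start, intron_end) pairs (0-based, half-open) where the
    read matches exactly on both sides of a GT...AG gap in the genome."""
    junctions = set()
    # Try every split that leaves at least min_anchor bases on each side.
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        for left_pos in find_all(genome, left):
            intron_start = left_pos + split
            for right_pos in find_all(genome, right):
                # Empty when the right half lies upstream of the left half.
                intron = genome[intron_start:right_pos]
                if (len(intron) >= min_intron
                        and intron.startswith("GT")
                        and intron.endswith("AG")):
                    junctions.add((intron_start, right_pos))
    return sorted(junctions)


if __name__ == "__main__":
    # Toy reference: exon, intron (GT...AG), exon. Sequences are made up.
    exon1, intron, exon2 = "ACGTACCA", "GTAAGTTTCAG", "TTGGCATC"
    genome = exon1 + intron + exon2
    spliced_read = "GTACCA" + "TTGGCA"  # spans the exon1/exon2 boundary
    print(candidate_junctions(genome, spliced_read))  # [(8, 19)]
```

The reported pair (8, 19) marks the intron's start and end in the toy reference. A production aligner would additionally tolerate mismatches, use an index rather than repeated string searches, and score competing candidates.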

Details

Maturity: Mature
Cost: Free of charge
Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: C++
Added: 1/13/2017
Last Updated: 11/24/2024

Publications

Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105-1111. doi:10.1093/bioinformatics/btp120. PMID:19289445. PMCID:PMC2672628.

Mareuil F, Doppelt-Azeroual O, Ménager H. A public Galaxy platform at Pasteur used as an execution engine for web services. F1000Research. 2017. doi:10.7490/f1000research.1114334.1.
