HISAT2

HISAT2 is a splice-aware aligner that maps next-generation RNA sequencing (RNA-seq) reads to one or more reference genomes, producing alignments for downstream gene-expression analysis.


Key Features:

  • Efficient Indexing Scheme: Uses a hierarchical indexing approach based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index with a whole-genome FM index plus numerous local FM indexes.
  • Human-genome locality: The human genome hierarchical index comprises approximately 48,000 local FM indexes, each covering about 64,000 base pairs.
  • Performance: Reported as one of the fastest RNA-seq aligners, with accuracy equal or superior to that of other methods and a reported memory footprint of 4.3 gigabytes.
  • Scalability: Supports genomes of any size, including genomes exceeding 4 billion bases.
  • Integration with analysis tools: Integrates with StringTie and Ballgown to enable read alignment, transcript assembly (including novel splice variants), computation of transcript abundance per sample, and comparison across experiments for differential expression.
  • Benchmarking and optimization: Benchmarked against other splice-aware aligners using simulated data, showing superior base-, read-, and exon junction-level accuracy and demonstrating the importance of parameter optimization.
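
The index figures quoted above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes the local indexes tile the genome in fixed, non-overlapping windows (a simplification), and it reads the "4 billion bases" threshold as the range of an unsigned 32-bit offset, which is an interpretation rather than something stated in this listing. The function names are hypothetical and are not part of HISAT2.

```python
# Approximate figures quoted in the feature list above.
LOCAL_INDEX_COUNT = 48_000
LOCAL_INDEX_SPAN_BP = 64_000

# Bases covered if the local indexes tile the genome without overlap
# (assumption made for illustration).
total_covered_bp = LOCAL_INDEX_COUNT * LOCAL_INDEX_SPAN_BP


def local_index_for(position: int, span: int = LOCAL_INDEX_SPAN_BP) -> int:
    """Return the number of the window that contains a 0-based genome position."""
    if position < 0:
        raise ValueError("position must be non-negative")
    return position // span


def needs_wide_offsets(genome_length: int) -> bool:
    """True when 0-based positions no longer fit in an unsigned 32-bit integer."""
    return genome_length > 2**32


if __name__ == "__main__":
    print(f"covered: {total_covered_bp:,} bp")
    print("window for position 150,000:", local_index_for(150_000))
    print("wide offsets for 5 Gb genome:", needs_wide_offsets(5_000_000_000))
```

The product, roughly 3.07 billion bases, is in line with the size of the human genome, which is why about 48,000 windows of about 64,000 bp are enough to cover it.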

Scientific Applications:

  • Transcript assembly and quantification: Alignments produced by HISAT2 feed transcript assemblers (e.g., StringTie) to assemble transcripts and quantify transcript abundance, including novel splice variants.
  • Differential gene and transcript expression: Provides aligned reads for identification of differentially expressed genes and transcripts across samples and conditions.
  • Comparative gene expression across species and conditions: Enables analysis of gene expression patterns by aligning RNA-seq reads to single or multiple reference genomes for cross-sample and cross-species studies.
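
As a rough illustration of how the alignment and assembly steps described above chain together, the sketch below builds (but does not run) the command lines for one paired-end sample. It assumes `hisat2`, `samtools`, and `stringtie` are installed; the use of `samtools` for sorting is an assumption not mentioned in this listing, and all file names and the `build_pipeline` helper are hypothetical placeholders.

```python
import shlex


def build_pipeline(index_prefix, reads_1, reads_2, annotation_gtf, sample, threads=4):
    """Return the command lines for aligning and assembling one paired-end sample."""
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    gtf = f"{sample}.gtf"
    return [
        # Spliced alignment; --dta tailors the output for transcript assemblers.
        ["hisat2", "-p", str(threads), "--dta", "-x", index_prefix,
         "-1", reads_1, "-2", reads_2, "-S", sam],
        # Coordinate-sort the alignments into a BAM file.
        ["samtools", "sort", "-@", str(threads), "-o", bam, sam],
        # Assemble transcripts, guided by a reference annotation.
        ["stringtie", bam, "-p", str(threads), "-G", annotation_gtf, "-o", gtf],
    ]


if __name__ == "__main__":
    # Placeholder inputs; print the commands instead of executing them.
    commands = build_pipeline("genome_index", "sample_1.fastq.gz",
                              "sample_2.fastq.gz", "genes.gtf", "sample")
    for command in commands:
        print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` once the placeholder paths are replaced with real files. Differential expression analysis would follow as a separate step on the assembled output.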

Methodology:

HISAT2 builds a hierarchical index based on the Burrows-Wheeler transform and the FM index: one whole-genome FM index plus numerous local FM indexes (about 48,000 local indexes of about 64,000 bp each for the human genome). It uses this index to perform splice-aware alignment. It has been benchmarked against other splice-aware aligners on simulated data at the base, read, and exon-junction levels, and that benchmarking highlighted the importance of parameter optimization.
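
To make the FM-index idea concrete, here is a minimal, self-contained sketch of exact matching by backward search over a Burrows-Wheeler transform. It is a toy, not HISAT2's implementation: it builds a single index over a short string, stores the full suffix array and occurrence table in memory, and handles neither mismatches nor splicing. The function names `build_fm_index` and `find` are hypothetical.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM index (suffix array, first-column offsets, occurrence table)."""
    text += "$"  # sentinel; assumed absent from the text and smaller than any base
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to "$" for suffix 0).
    bwt = "".join(text[i - 1] for i in suffix_array)
    counts = Counter(text)
    first = {}  # first[c] = number of characters in text strictly smaller than c
    total = 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    # occ[c][i] = number of occurrences of c in bwt[:i]
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return suffix_array, first, occ


def find(pattern, index):
    """Return the sorted 0-based start positions of exact matches of pattern."""
    suffix_array, first, occ = index
    if not pattern:
        return []
    lo, hi = 0, len(suffix_array)  # half-open range of matching suffixes
    for c in reversed(pattern):  # extend the match one character to the left
        if c not in first:
            return []
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return []
    return sorted(suffix_array[lo:hi])


if __name__ == "__main__":
    index = build_fm_index("GATTACAGATTACA")
    print(find("ATTA", index))
```

In a hierarchical scheme, the same kind of search would be run first against a genome-wide index and then against a small local index near an anchored position. The sketch shows only the single-index case.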

Details

License:
GPL-3.0
Tool Type:
command-line tool
Operating Systems:
Linux, Windows, Mac
Programming Languages:
Python
Added:
8/20/2017
Last Updated:
9/4/2019

Publications

Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods. 2015;12(4):357-360. doi:10.1038/nmeth.3317. PMID:25751142. PMCID:PMC4655817.

Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols. 2016;11(9):1650-1667. doi:10.1038/nprot.2016.095. PMID:27560171. PMCID:PMC5032908.

Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. 2016;14(2):135-139. doi:10.1038/nmeth.4106. PMID:27941783. PMCID:PMC5792058.
