HISAT2
HISAT2 is a splice-aware aligner that maps next-generation RNA sequencing (RNA-seq) reads to one or more reference genomes, producing spliced alignments for downstream gene-expression analysis.
Key Features:
- Efficient Indexing Scheme: Uses a hierarchical index built on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index, combining one whole-genome FM index with numerous local FM indexes.
- Human-genome locality: The human genome hierarchical index comprises approximately 48,000 local FM indexes, each covering about 64,000 base pairs.
- Performance: Reported as the fastest RNA-seq aligner available at the time of publication, with accuracy equal or superior to other methods and a memory footprint of 4.3 gigabytes.
- Scalability: Supports genomes of any size, including genomes exceeding 4 billion bases.
- Integration with analysis tools: Integrates with StringTie and Ballgown to enable read alignment, transcript assembly (including novel splice variants), computation of transcript abundance per sample, and comparison across experiments for differential expression.
- Benchmarking and optimization: Benchmarked against other splice-aware aligners using simulated data, showing superior base-, read-, and exon junction-level accuracy and demonstrating the importance of parameter optimization.
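The index figures quoted above can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that the local indexes tile the genome in non-overlapping 64,000 bp windows; the real index layout may differ (for example, adjacent windows may overlap), and the function names are hypothetical rather than part of HISAT2.

```python
# Approximate figures taken from the feature list above.
LOCAL_INDEX_SPAN_BP = 64_000   # bases covered by one local FM index
HUMAN_LOCAL_INDEXES = 48_000   # local FM indexes in the human genome index


def approx_genome_coverage(n_indexes: int = HUMAN_LOCAL_INDEXES,
                           span_bp: int = LOCAL_INDEX_SPAN_BP) -> int:
    """Bases covered if the local indexes tiled the genome without overlap."""
    return n_indexes * span_bp


def local_index_for(position: int, span_bp: int = LOCAL_INDEX_SPAN_BP) -> int:
    """Zero-based number of the (assumed non-overlapping) window holding a position."""
    return position // span_bp


def tiles_needed(genome_length: int, span_bp: int = LOCAL_INDEX_SPAN_BP) -> int:
    """Windows needed to cover a genome, using integer ceiling division."""
    return (genome_length + span_bp - 1) // span_bp


if __name__ == "__main__":
    print(approx_genome_coverage())    # 3072000000, close to the human genome size
    print(local_index_for(1_000_000))  # 15
    print(tiles_needed(3_072_000_000)) # 48000
```

Under this simplification, 48,000 windows of 64,000 bp cover about 3.07 billion bases, which is consistent with the size of the human genome.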
Scientific Applications:
- Transcript assembly and quantification: Alignments produced by HISAT2 feed transcript assemblers (e.g., StringTie) to assemble transcripts and quantify transcript abundance, including novel splice variants.
- Differential gene and transcript expression: Provides aligned reads for identification of differentially expressed genes and transcripts across samples and conditions.
- Comparative gene expression across species and conditions: Enables analysis of gene expression patterns by aligning RNA-seq reads to single or multiple reference genomes for cross-sample and cross-species studies.
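The alignment-to-assembly workflow described above can be sketched as a sequence of command lines. The helper below only builds the argument lists and does not run anything. The flags shown (`hisat2 -p/--dta/-x/-1/-2/-S`, `samtools sort -@/-o`, `stringtie -p/-G/-o`) follow the tools' published usage, but all file names and the index prefix are hypothetical placeholders, and options should be checked against the installed versions.

```python
import shlex
from typing import List


def hisat2_command(index_prefix: str, reads_1: str, reads_2: str,
                   out_sam: str, threads: int = 4) -> List[str]:
    """Paired-end alignment; --dta tailors output for transcript assemblers."""
    return ["hisat2", "-p", str(threads), "--dta", "-x", index_prefix,
            "-1", reads_1, "-2", reads_2, "-S", out_sam]


def sort_command(in_sam: str, out_bam: str, threads: int = 4) -> List[str]:
    """Coordinate-sort the alignments into BAM, as StringTie expects sorted input."""
    return ["samtools", "sort", "-@", str(threads), "-o", out_bam, in_sam]


def stringtie_command(in_bam: str, annotation_gtf: str, out_gtf: str,
                      threads: int = 4) -> List[str]:
    """Assemble transcripts, guided by a reference annotation."""
    return ["stringtie", "-p", str(threads), "-G", annotation_gtf,
            "-o", out_gtf, in_bam]


if __name__ == "__main__":
    # Hypothetical sample and file names, for illustration only.
    steps = [
        hisat2_command("index/genome", "s1_1.fastq.gz", "s1_2.fastq.gz", "s1.sam"),
        sort_command("s1.sam", "s1.bam"),
        stringtie_command("s1.bam", "genes.gtf", "s1.gtf"),
    ]
    for argv in steps:
        print(shlex.join(argv))
```

Building argument lists rather than shell strings keeps each step easy to inspect and can be passed directly to `subprocess.run` once the paths are real.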
Methodology:
HISAT2 employs hierarchical indexing based on the Burrows-Wheeler transform and the FM index: one whole-genome FM index plus numerous local FM indexes (about 48,000 local indexes of about 64,000 bp each for the human genome). It uses these indexes to perform splice-aware alignment. It has been benchmarked against other splice-aware aligners on simulated data to assess base-, read-, and exon junction-level accuracy, and those benchmarks highlight the importance of parameter optimization.
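To illustrate the FM-index idea that this methodology builds on, the following toy sketch counts exact occurrences of a pattern using backward search over a Burrows-Wheeler transform. It is a teaching example under simplifying assumptions, not HISAT2's implementation: it builds a single index by sorting all rotations (impractical for real genomes), has no local indexes, and handles neither mismatches nor splicing. The class and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (toy inputs only)."""
    if "$" in text:
        raise ValueError("text must not contain the sentinel '$'")
    s = text + "$"  # '$' sorts before A, C, G, T
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


class ToyFMIndex:
    def __init__(self, text: str) -> None:
        self.bwt = bwt(text)
        counts = Counter(self.bwt)
        # first[c]: number of characters in the text strictly smaller than c
        self.first: Dict[str, int] = {}
        total = 0
        for c in sorted(counts):
            self.first[c] = total
            total += counts[c]
        # occ[c][i]: occurrences of c in bwt[:i]
        self.occ: Dict[str, List[int]] = {c: [0] for c in counts}
        for ch in self.bwt:
            for c, column in self.occ.items():
                column.append(column[-1] + (1 if c == ch else 0))

    def count(self, pattern: str) -> int:
        """Number of exact occurrences of pattern, found by backward search."""
        lo, hi = 0, len(self.bwt)
        for c in reversed(pattern):
            if c not in self.first:
                return 0
            lo = self.first[c] + self.occ[c][lo]
            hi = self.first[c] + self.occ[c][hi]
            if lo >= hi:
                return 0
        return hi - lo


if __name__ == "__main__":
    index = ToyFMIndex("GATTACAGATTACA")
    print(index.count("GATTACA"))  # 2
    print(index.count("CCC"))      # 0
```

Each pattern character narrows a range of sorted suffixes, so the search cost depends on the pattern length rather than the text length, which is the property that makes FM indexes attractive for read alignment.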
Details
- License: GPL-3.0
- Tool Type: command-line tool
- Operating Systems: Linux, Windows, Mac
- Programming Languages: Python
- Added: 8/20/2017
- Last Updated: 9/4/2019
Publications
Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods. 2015;12(4):357-360. doi:10.1038/nmeth.3317. PMID:25751142. PMCID:PMC4655817.
Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols. 2016;11(9):1650-1667. doi:10.1038/nprot.2016.095. PMID:27560171. PMCID:PMC5032908.
Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. 2017;14(2):135-139. doi:10.1038/nmeth.4106. PMID:27941783. PMCID:PMC5792058.