GSNAP

GSNAP aligns short reads from next-generation sequencing to a reference genome or transcriptome. It detects splicing events and complex variants, supports SNP-tolerant mapping, and aligns bisulfite-treated reads for methylation analysis.


Key Features:

  • Read support: Aligns single-end and paired-end reads as short as 14 nucleotides, with no upper limit on read length.
  • Splice detection: Detects complex splicing events, including interchromosomal splicing, using probabilistic models or a database of known splice sites.
  • SNP-tolerant alignment: Aligns reads against a reference space of all possible combinations of major and minor alleles, which can reveal alternate genomic mappings.
  • Bisulfite alignment: Aligns bisulfite-treated DNA reads to enable analysis of methylation states.
  • Search strategy: Employs a successively constrained search that merges and filters position lists from a genomic index to enhance detection of complex variants and splicing.
  • Complex variant detection: Efficiently identifies variants with multiple mismatches and indels, including cases with four or more mismatches, insertions of 1–9 nucleotides, and deletions up to 30 nucleotides.
  • Performance: Runs at speeds comparable to other aligners, particularly for reads of 70 nucleotides or longer.
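
The SNP-tolerant idea in the feature list can be illustrated with a minimal Python sketch. This is not GSNAP's implementation: the function name `snp_tolerant_mismatches` and the `snps` dictionary layout are assumptions made for illustration. It only shows the principle that a read base matching either the major (reference) or minor allele at a known SNP position is not counted as a mismatch.

```python
def snp_tolerant_mismatches(read, ref_segment, snps):
    """Count mismatches, accepting either allele at known SNP positions.

    `snps` (hypothetical layout) maps a 0-based offset within `ref_segment`
    to the minor allele at that position; the reference base is treated
    as the major allele.
    """
    if len(read) != len(ref_segment):
        raise ValueError("read and reference segment must have equal length")

    mismatches = 0
    for i, (read_base, ref_base) in enumerate(zip(read, ref_segment)):
        allowed = {ref_base}
        if i in snps:
            allowed.add(snps[i])  # minor allele is also an acceptable match
        if read_base not in allowed:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    ref = "ACGTACGT"
    known_snps = {3: "C"}  # position 3 is T (major) or C (minor)
    print(snp_tolerant_mismatches("ACGCACGT", ref, known_snps))
    print(snp_tolerant_mismatches("ACGCACGT", ref, {}))
```

The same read scores differently depending on whether the SNP is known, which is why SNP-tolerant alignment can surface mappings that a reference-only comparison would penalize.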

Scientific Applications:

  • RNA-seq splicing analysis: Detection and characterization of complex and interchromosomal splicing in transcriptomic data.
  • Variant discovery: Identification of complex variants with multiple mismatches and indels from short-read data.
  • Allele-aware mapping: Analysis of alternate genomic mappings arising from SNPs in transcriptomic studies.
  • Methylation analysis: Alignment of bisulfite-treated reads to infer changes in DNA methylation states.
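
To make the methylation use case concrete, here is a hedged Python sketch of bisulfite-tolerant comparison. It assumes the standard chemistry (unmethylated cytosines are read as T, methylated cytosines remain C) and considers only reads from the forward strand; reverse-strand reads, which show G-to-A changes, are left out. The function names are hypothetical and do not correspond to GSNAP internals or options.

```python
def bisulfite_mismatches(read, ref_segment):
    """Count mismatches while tolerating C (genome) -> T (read) conversions."""
    mismatches = 0
    for read_base, ref_base in zip(read, ref_segment):
        if read_base == ref_base:
            continue
        if ref_base == "C" and read_base == "T":
            continue  # consistent with bisulfite conversion, not an error
        mismatches += 1
    return mismatches


def methylation_calls(read, ref_segment):
    """Label each genomic C covered by the read, keyed by 0-based offset."""
    calls = {}
    for i, (read_base, ref_base) in enumerate(zip(read, ref_segment)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = "methylated"      # protected from conversion
        elif read_base == "T":
            calls[i] = "unmethylated"    # converted by bisulfite
    return calls


if __name__ == "__main__":
    ref = "ACGTCCGA"
    read = "ATGTCTGA"
    print(bisulfite_mismatches(read, ref))
    print(methylation_calls(read, ref))
```

The tolerance is deliberately asymmetric: a read C over a genomic T is still a mismatch, since bisulfite treatment cannot produce it.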

Methodology:

GSNAP performs a successively constrained search that merges and filters position lists from a genomic index. It detects splicing using probabilistic models or a database of known splice sites. SNP-tolerant alignment is implemented by accommodating all combinations of major and minor alleles, and a dedicated mode supports alignment of bisulfite-treated DNA reads.
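
As a rough illustration of merging and filtering position lists from a genomic index, the Python sketch below builds a toy k-mer index and votes on candidate read start positions. It is a generic seed-and-vote simplification, not GSNAP's actual search strategy; the names `build_index`, `candidate_starts`, and the `min_support` parameter are assumptions introduced here.

```python
from collections import Counter, defaultdict


def build_index(genome, k):
    """Map every k-mer in the genome to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos in range(len(genome) - k + 1):
        index[genome[pos:pos + k]].append(pos)
    return index


def candidate_starts(read, index, k, min_support):
    """Merge position lists of the read's k-mers and filter by support.

    Each genomic hit of a k-mer taken at read offset `offset` implies the
    read would start at `pos - offset`. Starts implied by fewer than
    `min_support` k-mers are discarded.
    """
    votes = Counter()
    for offset in range(0, len(read) - k + 1, k):  # non-overlapping k-mers
        kmer = read[offset:offset + k]
        for pos in index.get(kmer, ()):
            start = pos - offset
            if start >= 0:
                votes[start] += 1
    return sorted(start for start, n in votes.items() if n >= min_support)


if __name__ == "__main__":
    genome = "TTGACCGATAGGCTAACGTTAGC"
    index = build_index(genome, 4)
    read = "CCGATACGCTAA"  # genome[4:16] with one substitution in the middle
    print(candidate_starts(read, index, 4, min_support=2))
```

Raising `min_support` tightens the filter: a read with one mismatch loses one of its three k-mers, so requiring all three would reject the true location, while requiring two retains it.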

Details

Maturity: Mature
Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: Perl, C
Added: 1/13/2017
Last Updated: 6/16/2020

Publications

Wu TD, Nacu S. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics. 2010;26(7):873-881. doi:10.1093/bioinformatics/btq057. PMID:20147302. PMCID:PMC2844994.

Hatem A, Bozdağ D, Toland AE, Çatalyürek ÜV. Benchmarking short sequence mapping tools. BMC Bioinformatics. 2013;14(1):184. doi:10.1186/1471-2105-14-184. PMID:23758764. PMCID:PMC3694458.

Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. 2016;14(2):135-139. doi:10.1038/nmeth.4106. PMID:27941783. PMCID:PMC5792058.

Caboche S, Audebert C, Lemoine Y, Hot D. Comparison of mapping algorithms used in high-throughput sequencing: application to Ion Torrent data. BMC Genomics. 2014;15(1):264. doi:10.1186/1471-2164-15-264. PMID:24708189. PMCID:PMC4051166.

Mareuil F, Doppelt-Azeroual O, Ménager H. A public Galaxy platform at Pasteur used as an execution engine for web services. F1000Research. 2017. doi:10.7490/f1000research.1114334.1.
