OLego

OLego maps RNA-seq reads to genomes to enable sensitive detection of splice junctions and micro-exons in mammalian transcriptomes.


Key Features:

  • Multiple-Seed-and-Extend Scheme: Uses small seeds (~14 nt) tailored for mammalian genomes to increase sensitivity for splice junction detection.
  • Built-in Statistical Model: Scores exon junctions based on splice-site strength and intron size to improve mapping accuracy and resolve junction ambiguities.
  • Efficient Mapping with BWT: Employs the Burrows-Wheeler Transform (BWT) in multiple algorithmic steps to map seeds, locate junctions, and identify small exons.
  • Multithreaded Execution: Implemented in C++ with full multithreading to enable rapid processing of large RNA-seq datasets.
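
The seed-mapping idea in the features above can be illustrated with a toy example. The sketch below is not OLego's code: it builds a naive FM-index (the BWT-based structure commonly used for exact matching) over a short made-up genome and looks up fixed-length seeds cut from a read. The function names, the non-overlapping seed layout, and the tiny sequences are all assumptions made for illustration.

```python
from collections import Counter

def build_fm_index(text):
    """Build a toy FM-index: suffix array, C table, and Occ table.

    Naive construction for illustration only; real aligners use
    compressed, sampled structures.
    """
    text += "$"  # sentinel, sorts before A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # char preceding each suffix
    counts = Counter(text)
    # c_table[ch]: number of characters in text strictly smaller than ch
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[ch][i]: occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == b else 0)
    return sa, c_table, occ

def backward_search(pattern, sa, c_table, occ):
    """Return sorted genome positions where pattern occurs exactly."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])

def seeds(read, k=14, step=14):
    """Cut a read into (offset, k-mer) seeds; the step size is an assumption."""
    return [(i, read[i:i + k]) for i in range(0, len(read) - k + 1, step)]

if __name__ == "__main__":
    # Hypothetical genome: exon (8 nt) + intron (12 nt) + exon (8 nt).
    genome = "ACGTTGCA" + "GTAAGTCCCCAG" + "TTGACCGA"
    read = "ACGTTGCA" + "TTGACCGA"  # spliced read spanning the intron
    sa, c_table, occ = build_fm_index(genome)
    for offset, seed in seeds(read, k=8, step=8):
        print(offset, seed, backward_search(seed, sa, c_table, occ))
```

In this toy case the two seeds land on either side of the intron, which is the situation a seed-and-extend aligner then has to resolve into a junction. The 8 nt seeds are used only because the example genome is tiny; the entry above describes seeds of about 14 nt.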

Scientific Applications:

  • Benchmarking: On simulated and real RNA-seq datasets, OLego showed improved sensitivity, comparable or higher accuracy, and significantly increased speed compared with other published spliced-read aligners.
  • Micro-exon discovery: Identified hundreds of novel micro-exons (less than 30 nt) in the mouse transcriptome, many of which are phylogenetically conserved and experimentally validated in vivo.
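
To make the micro-exon definition (less than 30 nt) concrete, the following sketch flags internal exon blocks shorter than a threshold in a spliced alignment. It assumes the alignment is described by a SAM-style CIGAR string in which `N` marks an intron; the helper names are hypothetical, and this is only a post-hoc filter on alignments, not the discovery procedure used by OLego.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def exon_blocks(cigar):
    """Return the reference length of each aligned block, split at N ops."""
    blocks, current = [], 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":          # intron: close the current exon block
            blocks.append(current)
            current = 0
        elif op in "MD=X":     # operations that consume reference bases
            current += n
    blocks.append(current)
    return blocks

def internal_micro_exons(cigar, max_len=30):
    """Lengths of internal blocks (junctions on both sides) under max_len."""
    blocks = exon_blocks(cigar)
    return [b for b in blocks[1:-1] if 0 < b < max_len]

if __name__ == "__main__":
    # Hypothetical read: 40 nt, intron, 12 nt micro-exon, intron, 48 nt.
    print(internal_micro_exons("40M1200N12M3500N48M"))
```

Only internal blocks are reported because the first and last blocks of a read are usually partial exons, so their length says little about the true exon size.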

Methodology:

OLego uses small (~14 nt) seeds and a statistical model that scores junctions by splice-site strength and intron size. It applies the Burrows-Wheeler Transform across the seed-mapping, junction-identification, and small-exon identification steps, and it is implemented as a multithreaded C++ program.
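
As a rough illustration of how a junction score could combine splice-site strength with intron size, the sketch below uses a logistic function. The logistic form, the weights, and the input scales are assumptions chosen for the example, not OLego's fitted model or parameters.

```python
import math

def junction_score(donor_strength, acceptor_strength, intron_size,
                   w_site=1.0, w_size=-0.5, bias=0.0):
    """Toy logistic score in (0, 1); higher means a more plausible junction.

    All weights are hypothetical. Stronger splice sites raise the score,
    and longer introns lower it on a log10 scale. intron_size must be > 0.
    """
    z = (bias
         + w_site * (donor_strength + acceptor_strength)
         + w_size * math.log10(intron_size))
    return 1.0 / (1.0 + math.exp(-z))

def pick_junction(candidates):
    """Return the candidate junction with the highest toy score."""
    return max(
        candidates,
        key=lambda c: junction_score(c["donor"], c["acceptor"], c["intron"]),
    )

if __name__ == "__main__":
    # Two hypothetical placements of the same read that differ in
    # splice-site strength and intron length.
    candidates = [
        {"name": "A", "donor": 2.0, "acceptor": 1.5, "intron": 1_000},
        {"name": "B", "donor": 0.2, "acceptor": 0.1, "intron": 100_000},
    ]
    print(pick_junction(candidates)["name"])
```

A score of this kind gives the aligner a principled way to choose among ambiguous junction placements: with these made-up weights, the candidate with strong splice sites and a short intron is preferred.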

Details

License: GPL-3.0
Maturity: Mature
Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: C++
Added: 1/13/2017
Last Updated: 4/26/2021

Publications

Wu J, Anczuków O, Krainer AR, Zhang MQ, Zhang C. OLego: fast and sensitive mapping of spliced mRNA-Seq reads using small seeds. Nucleic Acids Research. 2013;41(10):5149-5163. doi:10.1093/nar/gkt216. PMID:23571760. PMCID:PMC3664805.

Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. 2017;14(2):135-139. doi:10.1038/nmeth.4106. PMID:27941783. PMCID:PMC5792058.
