OLego
OLego maps RNA-seq reads to genomes to enable sensitive detection of splice junctions and micro-exons in mammalian transcriptomes.
Key Features:
- Multiple-Seed-and-Extend Scheme: Uses small seeds (~14 nt) tailored for mammalian genomes to increase sensitivity for splice junction detection.
- Built-in Statistical Model: Scores exon junctions based on splice-site strength and intron size to improve mapping accuracy and resolve junction ambiguities.
- Efficient Mapping with BWT: Employs the Burrows-Wheeler Transform (BWT) in multiple algorithmic steps to map seeds, locate junctions, and identify small exons.
- Multithreaded Execution: Implemented in C++ with full multithreading to enable rapid processing of large RNA-seq datasets.
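The seeding and BWT lookup steps listed above can be illustrated with a minimal Python sketch. It is not OLego's C++ implementation: the FM-index is built naively (sorted suffixes and a full occurrence table), seeds are non-overlapping and matched exactly, and the function names (`build_fm_index`, `backward_search`, `split_into_seeds`) and the toy sequences are hypothetical.

```python
def build_fm_index(text):
    """Build a naive FM-index (suffix array, C table, Occ table) for a short text."""
    text += "$"  # sentinel; sorts before A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # Burrows-Wheeler Transform
    alphabet = sorted(set(text))
    # c_table[ch]: number of characters in text strictly smaller than ch
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += text.count(ch)
    # occ[ch][i]: occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (b == ch)
    return sa, c_table, occ


def backward_search(pattern, fm):
    """Return sorted start positions of exact matches of pattern."""
    sa, c_table, occ = fm
    lo, hi = 0, len(sa)
    for ch in reversed(pattern):  # extend the match one base at a time, right to left
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


def split_into_seeds(read, seed_len=14):
    """Cut a read into non-overlapping seeds; a shorter trailing piece is dropped."""
    return [(i, read[i:i + seed_len])
            for i in range(0, len(read) - seed_len + 1, seed_len)]


# Toy genome (hypothetical sequences): exon, intron (GT...AG), exon.
exon1 = "ACGTTGCATCGGATCCAGTA"
intron = "GTAAGTCCTTTTGGGCCCAAATTTCCCTTTTCTTCCAG"
exon2 = "GCTAGCTTAGGCATCGATTC"
genome = exon1 + intron + exon2

# A spliced read: last 14 nt of exon1 joined to first 14 nt of exon2.
read = exon1[-14:] + exon2[:14]

fm = build_fm_index(genome)
seed_hits = [(offset, seed, backward_search(seed, fm))
             for offset, seed in split_into_seeds(read)]
for offset, seed, hits in seed_hits:
    print(offset, seed, hits)
```

When two seeds from the same read map farther apart on the genome than they are in the read, the extra distance suggests an intron between them, which is the cue for a junction search.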
Scientific Applications:
- Benchmarking: On simulated and real RNA-seq datasets, showed improved sensitivity, comparable or higher accuracy, and significantly faster mapping than other spliced-read mappers.
- Micro-exon discovery: Identified hundreds of novel micro-exons (less than 30 nt) in the mouse transcriptome, many of which are phylogenetically conserved and experimentally validated in vivo.
Methodology:
OLego maps reads with small (~14 nt) seeds, scores candidate junctions with a statistical model based on splice-site strength and intron size, and applies the Burrows-Wheeler Transform in both the mapping and junction-identification steps. It is implemented as a multithreaded C++ program.
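How a junction score can resolve an ambiguous split point is sketched below in Python. This is an illustration under stated assumptions, not OLego's model: the dinucleotide scores, the "typical" intron size, the size limits, and all function names are hypothetical placeholders rather than trained parameters.

```python
import math

# Hypothetical log-scores for (donor, acceptor) dinucleotide pairs.
# These are illustrative values, NOT OLego's trained parameters.
SPLICE_SITE_SCORE = {("GT", "AG"): 0.0, ("GC", "AG"): -3.0, ("AT", "AC"): -5.0}
NONCANONICAL_SCORE = -10.0


def intron_size_score(size, typical=1000, min_size=20, max_size=500_000):
    """Toy penalty that grows with the log-distance from an assumed typical size."""
    if size < min_size or size > max_size:
        return float("-inf")
    return -abs(math.log10(size / typical))


def junction_score(genome, donor, acceptor):
    """donor: index of the first intron base; acceptor: index of the first base after the intron."""
    motif = (genome[donor:donor + 2], genome[acceptor - 2:acceptor])
    site = SPLICE_SITE_SCORE.get(motif, NONCANONICAL_SCORE)
    return site + intron_size_score(acceptor - donor)


def candidate_splits(genome, read, start, end):
    """List every (donor, acceptor) split of the read that matches the genome exactly,
    given that the read's first base aligns to genome[start] and its last to genome[end - 1]."""
    n = len(read)
    splits = []
    for k in range(1, n):
        left_ok = genome[start:start + k] == read[:k]
        right_ok = genome[end - (n - k):end] == read[k:]
        if left_ok and right_ok:
            splits.append((start + k, end - (n - k)))
    return splits


# Toy genome (hypothetical sequences): exon, intron (GT...AG), exon.
exon1 = "ACGTTGCATCGGATCCAGTA"
intron = "GTAAGTCCTTTTGGGCCCAAATTTCCCTTTTCTTCCAG"
exon2 = "GCTAGCTTAGGCATCGATTC"
genome = exon1 + intron + exon2

# Read spanning the junction: 10 nt from each exon.
read = exon1[-10:] + exon2[:10]
start = len(exon1) - 10
end = len(exon1) + len(intron) + 10

candidates = candidate_splits(genome, read, start, end)
best = max(candidates, key=lambda j: junction_score(genome, *j))
for donor, acceptor in candidates:
    print(donor, acceptor, round(junction_score(genome, donor, acceptor), 2))
print("best:", best)
```

Because the first intron base here equals the first base of the downstream exon, the read fits two split points equally well at the sequence level. Only the splice-site term separates them, since both candidates imply the same intron size. The size term matters when candidate junctions differ in where the downstream anchor lands.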
Details
- License: GPL-3.0
- Maturity: Mature
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: C++
- Added: 1/13/2017
- Last Updated: 4/26/2021
Publications
Wu J, Anczuków O, Krainer AR, Zhang MQ, Zhang C. OLego: fast and sensitive mapping of spliced mRNA-Seq reads using small seeds. Nucleic Acids Research. 2013;41(10):5149-5163. doi:10.1093/nar/gkt216. PMID:23571760. PMCID:PMC3664805.
Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. 2017;14(2):135-139. doi:10.1038/nmeth.4106. PMID:27941783. PMCID:PMC5792058.