SPARSE
SPARSE implements a Sankoff-style algorithm for RNA structure-based alignment and joint folding of non-coding RNAs to enable accurate alignment at low sequence identity while reducing computational complexity.
Key Features:
- Sankoff-style algorithm: Implements simultaneous alignment and folding using a Sankoff-style formulation.
- Quadratic-time complexity: Runs in quadratic time, whereas earlier Sankoff-style approaches that avoid sequence-based heuristics (such as LocARNA) require quartic time or more.
- No sequence-based heuristics: Does not rely on sequence-based heuristics for pruning or sparsification.
- Strong sparsification: Applies strong sparsification based on structural properties of RNA ensembles.
- Lightweight energy integration: Fully integrates Sankoff's original model with a lightweight energy computation framework.
- Loop deletions and insertions: Incorporates loop deletions and insertions within the lightweight Sankoff-style model.
- Maintains accuracy at low identity: Preserves alignment and folding accuracy for cases with sequence identity below 60%.
- Performance vs LocARNA: Achieves a reported 3.7× speedup over LocARNA with similar alignment quality and better folding quality.
- Performance vs RAF: Aligns low-sequence-identity instances substantially more accurately than RAF, which uses sequence-based heuristics, at comparable run time.
- Targeted to ncRNAs from RNA-Seq: Applicable to alignment and folding of novel non-coding RNAs identified by RNA-Seq.
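To give a feel for the quadratic versus quartic claim, the following is a rough back-of-the-envelope sketch, not a model of SPARSE itself. It assumes idealized step counts of n^2 and n^4 for sequences of length n and ignores all constant factors; the function name `dp_steps` is hypothetical.

```python
def dp_steps(n: int, exponent: int) -> int:
    """Idealized number of dynamic-programming steps for sequence length n.

    Assumption: cost grows exactly as n**exponent, with no constant factors.
    """
    return n ** exponent


for n in (50, 100, 200):
    quartic = dp_steps(n, 4)
    quadratic = dp_steps(n, 2)
    # The ratio grows as n**2, so the gap widens quickly with sequence length.
    print(f"n={n}: quartic/quadratic step ratio = {quartic // quadratic}")
```

Measured speedups are much smaller than these idealized ratios because constant factors, sparsification in the competing tools, and preprocessing costs all matter in practice.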
Scientific Applications:
- Structure-based alignment and joint folding: Simultaneous secondary-structure-aware alignment and folding of RNA sequences.
- ncRNA analysis from RNA-Seq: Analysis and alignment of novel non-coding RNAs discovered in RNA-Seq experiments.
- Low sequence identity alignment: Alignment of RNA sequences with sequence identity below 60%, where the accuracy of sequence-based methods breaks down.
- Benchmarking and comparison: Comparative evaluation of folding quality, runtime, and alignment accuracy against tools such as LocARNA and RAF.
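To decide whether a pair of sequences falls into the low-identity regime mentioned above, one can compute the percent identity of an existing pairwise alignment. The sketch below is a minimal illustration; the function name `percent_identity` is hypothetical, and it assumes one particular definition (identical columns divided by all columns that are not gap-only). Other definitions use a different denominator and give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of alignment columns
    in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if a == b:
            matches += 1  # a gap can never match here, since both-gap columns were skipped

    return 100.0 * matches / compared if compared else 0.0


identity = percent_identity("GGGAAAUCC-", "GCGA-AUCCU")
print(f"{identity:.1f}% identity, low-identity case: {identity < 60.0}")
```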
Methodology:
Implements a quadratic-time Sankoff-style algorithm with strong sparsification based on structural properties of RNA ensembles, integrates Sankoff's model with a lightweight energy computation framework, and explicitly includes loop deletions and insertions without using sequence-based heuristics.
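The general idea of sparsifying by structural ensemble properties can be illustrated with a simple probability cutoff on base pairs. This is only a sketch under stated assumptions: the threshold value, the function names `sparsify` and `candidate_pair_matches`, and the toy probabilities are all hypothetical, and SPARSE's actual sparsification criteria are not reproduced here. In practice the base-pair probabilities would come from a partition-function folding computation.

```python
from typing import Dict, Tuple

BasePair = Tuple[int, int]  # (i, j) positions of a base pair, with i < j


def sparsify(pair_probs: Dict[BasePair, float], min_prob: float = 0.01) -> Dict[BasePair, float]:
    """Keep only base pairs whose ensemble probability reaches min_prob.

    Assumption: a single fixed cutoff; the default 0.01 is illustrative only.
    """
    return {bp: p for bp, p in pair_probs.items() if p >= min_prob}


def candidate_pair_matches(pairs_a: Dict[BasePair, float], pairs_b: Dict[BasePair, float]) -> int:
    """Number of base-pair combinations an alignment would have to consider."""
    return len(pairs_a) * len(pairs_b)


# Toy base-pair probabilities for two RNAs (made-up values for illustration).
pair_probs_a = {(1, 20): 0.95, (2, 19): 0.90, (5, 12): 0.004, (7, 15): 0.02}
pair_probs_b = {(1, 18): 0.97, (3, 16): 0.001, (4, 15): 0.60}

kept_a = sparsify(pair_probs_a)
kept_b = sparsify(pair_probs_b)

print("combinations before filtering:", candidate_pair_matches(pair_probs_a, pair_probs_b))
print("combinations after filtering: ", candidate_pair_matches(kept_a, kept_b))
```

Dropping improbable base pairs shrinks the set of structure combinations the alignment has to score, which is the intuition behind ensemble-based sparsification; it does not rely on any sequence-alignment heuristic.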
Details
- License: GPL-3.0
- Maturity: Mature
- Cost: Free of charge
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: Python
- Added: 4/29/2016
- Last Updated: 1/11/2019
Operations
- Structure-based sequence alignment
Publications
Will S, Otto C, Miladi M, Möhl M, Backofen R. SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics. Bioinformatics. 2015;31(15):2489-2496. doi:10.1093/bioinformatics/btv185. PMID:25838465. PMCID:PMC4514930.