SPARSE

SPARSE (sparsified prediction and alignment of RNAs based on their structure ensembles) implements a Sankoff-style algorithm for structure-based alignment and joint folding of non-coding RNAs. It aims for accurate alignment at low sequence identity while keeping computational complexity low.


Key Features:

  • Sankoff-style algorithm: Implements simultaneous alignment and folding using a Sankoff-style formulation.
  • Quadratic-time complexity: Runs in quadratic time, whereas earlier approaches that avoid sequence-based heuristics, such as LocARNA, require at least quartic time.
  • No sequence-based heuristics: Does not rely on sequence-based heuristics for pruning or sparsification.
  • Strong sparsification: Applies strong sparsification based on structural properties of RNA ensembles.
  • Lightweight energy integration: Fully integrates Sankoff's original model with a lightweight energy computation framework.
  • Loop deletions and insertions: Incorporates loop deletions and insertions within the lightweight Sankoff-style model.
  • Maintains accuracy at low identity: Preserves alignment and folding accuracy for sequence identities below 60%, where sequence-based methods break down.
  • Performance vs LocARNA: Achieves similar alignment quality and better folding quality than LocARNA in significantly less time (reported speedup: 3.7×).
  • Performance vs RAF: Aligns low sequence identity instances substantially more accurately than RAF, which uses sequence-based heuristics, at comparable run-time.
  • Targeted at ncRNAs from RNA-Seq: Applicable to alignment and folding of novel non-coding RNAs identified by RNA-Seq.
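
To give a rough sense of what moving from quartic to quadratic time means, the sketch below compares idealized step counts n^4 and n^2 for a few sequence lengths. Constant factors are ignored, and the function name and chosen lengths are illustrative assumptions, not measurements of SPARSE or LocARNA.

```python
def growth_ratio(n: int) -> int:
    """Ratio of idealized quartic to quadratic step counts for length n."""
    if n <= 0:
        raise ValueError("sequence length must be positive")
    # n**4 / n**2 simplifies to n**2; computed explicitly for clarity.
    return n**4 // n**2


if __name__ == "__main__":
    # Hypothetical sequence lengths, chosen only for illustration.
    for n in (100, 200, 400):
        print(f"n={n}: n^2={n**2:,}  n^4={n**4:,}  ratio={growth_ratio(n):,}")
```

Under this idealization the gap itself grows quadratically with sequence length, so the benefit of a quadratic-time method increases for longer RNAs.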

Scientific Applications:

  • Structure-based alignment and joint folding: Simultaneous secondary-structure-aware alignment and folding of RNA sequences.
  • ncRNA analysis from RNA-Seq: Analysis and alignment of novel non-coding RNAs discovered in RNA-Seq experiments.
  • Low sequence identity alignment: Alignment of RNA sequences with sequence identity below 60% where sequence-based methods fail.
  • Benchmarking and comparison: Comparative evaluation of folding quality, runtime, and alignment accuracy against tools such as LocARNA and RAF.
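
The 60% sequence identity threshold mentioned above can be checked on an existing pairwise alignment with a few lines of code. This is a minimal sketch under one particular definition of identity (identical columns divided by all columns that are not gap-only, so a gap opposite a base counts as a mismatch); other definitions exist and give different values. The function names and the example rows are hypothetical.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity of two aligned rows (gapped strings of equal length)."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    # Drop columns in which both rows contain a gap.
    columns = [(a, b) for a, b in zip(row_a.upper(), row_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    # A column matches only if both symbols are the same non-gap character.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def is_low_identity(row_a: str, row_b: str, threshold: float = 60.0) -> bool:
    """True if the pair falls below the given identity threshold (in percent)."""
    return percent_identity(row_a, row_b) < threshold


if __name__ == "__main__":
    # Toy alignment rows, not taken from any benchmark.
    a, b = "ACGU", "AC-A"
    print(percent_identity(a, b), is_low_identity(a, b))
```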

Methodology:

SPARSE implements a quadratic-time Sankoff-style algorithm. It applies strong sparsification based on structural properties of RNA ensembles, combines Sankoff's model with a lightweight energy computation framework, and explicitly includes loop deletions and insertions, all without using sequence-based heuristics.
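
Ensemble-based sparsification can be pictured as discarding structural elements that are unlikely in the folding ensemble, so that later computation only considers the remaining ones. The following is a minimal sketch, assuming base-pair probabilities are already available as a dictionary and assuming a simple probability cutoff. The function name, the cutoff value, and the toy probabilities are hypothetical and do not reflect SPARSE's actual parameters or data structures.

```python
from typing import Dict, Tuple

Pair = Tuple[int, int]  # (i, j) positions of a candidate base pair


def sparsify_pairs(pair_probs: Dict[Pair, float], cutoff: float) -> Dict[Pair, float]:
    """Keep only base pairs whose ensemble probability reaches the cutoff."""
    if not 0.0 <= cutoff <= 1.0:
        raise ValueError("cutoff must lie in [0, 1]")
    return {pair: p for pair, p in pair_probs.items() if p >= cutoff}


if __name__ == "__main__":
    # Toy probabilities, invented for illustration only.
    toy_probs = {(1, 20): 0.92, (2, 19): 0.88, (3, 18): 0.004, (5, 12): 0.0003}
    kept = sparsify_pairs(toy_probs, cutoff=0.01)
    print(sorted(kept))  # [(1, 20), (2, 19)]
```

The fewer candidates survive such a filter, the fewer combinations an alignment algorithm has to evaluate, which is the general intuition behind sparsification.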

Details

License: GPL-3.0
Maturity: Mature
Cost: Free of charge
Tool Type: command-line tool
Operating Systems: Linux, Mac
Programming Languages: Python
Added: 4/29/2016
Last Updated: 1/11/2019

Operations

Data Inputs & Outputs

Structure-based sequence alignment

Publications

Will S, Otto C, Miladi M, Möhl M, Backofen R. SPARSE: quadratic time simultaneous alignment and folding of RNAs without sequence-based heuristics. Bioinformatics. 2015;31(15):2489-2496. doi:10.1093/bioinformatics/btv185. PMID:25838465. PMCID:PMC4514930.
