SEWAL

SEWAL analyzes Illumina next-generation sequencing data from in vitro selection studies of functional nucleic acids. It uses a locality-sensitive hashing algorithm to enumerate unique sequences, quantifies mutation frequencies and information content, computes sequence distance metrics, performs multivariate statistics including principal component analysis, and generates three-dimensional visualizations of adaptive/fitness landscapes.


Key Features:

  • Locality-Sensitive Hashing Algorithm: Enumerates all unique sequences from an entire Illumina sequencing run (≈10^8 sequences). Quasilinear-time scaling allows datasets of ≈10^7 sequences to be processed in minutes on a standard desktop.
  • Visualization Tools: Generates three-dimensional scatter plots that represent adaptive/fitness landscape concepts to aid interpretation of sequence-function relationships.
  • Specialized Functions for Doped Selections: Provides mutation frequency analysis, information content calculation, and multivariate statistical functions including principal component analysis.
  • Sequence Analysis Tools: Computes distance metrics and performs sequence searches and comparisons across multiple Illumina datasets for comparative analyses.
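
To illustrate the general idea behind hashing-based enumeration, here is a minimal Python sketch. It is not SEWAL's implementation (SEWAL is written in C++ and its hashing scheme is not described in this entry). The sketch assumes fixed-length reads and uses bit-sampling, a standard locality-sensitive hash for Hamming distance; the function names `count_unique` and `lsh_buckets` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def count_unique(reads):
    """Exact enumeration: map each distinct sequence to its read count."""
    return Counter(reads)


def lsh_buckets(sequences, num_positions=4, seed=0):
    """Bit-sampling LSH for Hamming distance (illustrative assumption).

    Sequences that agree at a randomly chosen subset of positions get the
    same bucket key, so near-identical sequences tend to land together.
    """
    seqs = list(sequences)
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    rng = random.Random(seed)
    positions = sorted(rng.sample(range(length), num_positions))
    buckets = defaultdict(list)
    for s in seqs:
        key = "".join(s[i] for i in positions)
        buckets[key].append(s)
    return positions, dict(buckets)


# Toy data: five reads, three distinct sequences.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "TTGTACGT", "ACGTACGT"]
counts = count_unique(reads)
positions, buckets = lsh_buckets(counts, num_positions=3)

for seq, n in counts.most_common():
    print(seq, n)
```

Counting with a hash table takes expected linear time in the number of reads. The bucketing step only narrows down candidate neighbors, so exact distances still have to be computed within each bucket.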

Scientific Applications:

  • In vitro selection analysis: Analysis of large-scale sequencing data from in vitro selection experiments of functional nucleic acids.
  • Mutation and diversity profiling: Quantification of mutation frequencies and sequence diversity within sequencing datasets.
  • Evolutionary dynamics and fitness landscape inference: Visualization and multivariate analysis to investigate evolutionary trajectories and adaptive/fitness landscape structure.

Methodology:

Locality-sensitive hashing for enumeration of unique sequences with quasilinear-time scaling; computation of mutation frequencies, information content, and distance metrics; principal component analysis and generation of three-dimensional scatter-plot visualizations.
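
The per-position statistics named above can be sketched in Python as follows. This is an illustration under stated assumptions, not SEWAL's code: sequences are aligned and of equal length, the alphabet is the four nucleotides, information content is taken as log2(4) minus the Shannon entropy of each column with no small-sample correction, and Hamming distance stands in for the distance metric. All function names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumed four-letter nucleotide alphabet


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mutation_frequencies(counts, parent):
    """Fraction of reads that differ from the parent at each position."""
    total = sum(counts.values())
    mismatches = [0] * len(parent)
    for seq, n in counts.items():
        for i, (x, y) in enumerate(zip(seq, parent)):
            if x != y:
                mismatches[i] += n
    return [m / total for m in mismatches]


def information_content(counts, length):
    """Per-position information in bits: log2(4) minus column entropy."""
    total = sum(counts.values())
    info = []
    for i in range(length):
        column = Counter()
        for seq, n in counts.items():
            column[seq[i]] += n
        entropy = -sum((c / total) * math.log2(c / total)
                       for c in column.values())
        info.append(math.log2(len(ALPHABET)) - entropy)
    return info


# Toy data: read counts per unique sequence from a hypothetical doped pool.
parent = "ACGT"
counts = Counter({"ACGT": 6, "ACGA": 2, "TCGT": 2})
freqs = mutation_frequencies(counts, parent)
info = information_content(counts, len(parent))
print(freqs)
print([round(x, 3) for x in info])
```

In this toy pool, positions 2 and 3 never vary and so carry the full 2 bits, while positions 1 and 4 are mutated in 20% of reads and carry less.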

Details

Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: C++
Added: 1/13/2017
Last Updated: 11/25/2024

Publications

Pitt JN, Rajapakse I, Ferré-D'Amaré AR. SEWAL: an open-source platform for next-generation sequence analysis and visualization. Nucleic Acids Research. 2010;38(22):7908-7915. doi:10.1093/nar/gkq661. PMID:20693400. PMCID:PMC3001052.