MOODS

MOODS matches position weight matrices (PWMs) to DNA sequences to detect motif occurrences and assess how sequence variants affect motif sites.


Key Features:

  • Advanced Matrix Matching Algorithms: MOODS implements online matching algorithms in C++ that outperform brute-force scanning and can match hundreds of PWMs against chromosome-sized sequences (e.g., 123 JASPAR matrices across the human genome in ~18 minutes on a standard PC).
  • High-Order PWM Support: The software supports first-order Markov and other high-order motif models that capture dependencies between adjacent positions, including dinucleotide and general q-mer dependencies.
  • Sequence Variant Integration: MOODS integrates single nucleotide polymorphisms (SNPs), insertions, and deletions into PWM matching workflows to evaluate variant effects on motif sites.
  • Multiple-Matrix Filtration: MOODS searches for multiple matrices simultaneously using a multiple-matrix filtration algorithm that outperforms basic sliding-window scanning.
  • Implementation and Bindings: The package is implemented in C++, provides bindings for BioPerl and Biopython, and can be used as a standalone C++ library for integration into analysis pipelines.
  • Algorithmic Optimizations: The software generalizes classical multipattern matching and applies filtering and superalphabet methods adapted for high-order PWMs to reduce computational complexity.
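
For orientation, the brute-force sliding-window baseline that these algorithms improve on can be sketched in a few lines of plain Python. This is a minimal illustration, not the MOODS API: the function names, the toy count matrix, the uniform background, the pseudocount, and the threshold are all assumptions chosen for the example.

```python
import math

BASES = "ACGT"
INDEX = {base: i for i, base in enumerate(BASES)}

def log_odds_matrix(counts, background=(0.25, 0.25, 0.25, 0.25), pseudocount=1.0):
    """Convert a 4 x m count matrix (rows A, C, G, T) into log-odds scores."""
    m = len(counts[0])
    pwm = [[0.0] * m for _ in BASES]
    for j in range(m):
        total = sum(counts[b][j] for b in range(4)) + pseudocount
        for b in range(4):
            freq = (counts[b][j] + pseudocount * background[b]) / total
            pwm[b][j] = math.log(freq / background[b])
    return pwm

def scan(sequence, pwm, threshold):
    """Score every window of the sequence; return (position, score) hits."""
    m = len(pwm[0])
    hits = []
    for i in range(len(sequence) - m + 1):
        window = sequence[i:i + m]
        if any(c not in INDEX for c in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[INDEX[c]][j] for j, c in enumerate(window))
        if score >= threshold:
            hits.append((i, score))
    return hits

# Hypothetical count matrix for a TATA-like motif (rows A, C, G, T).
counts = [
    [0, 10, 0, 10],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [10, 0, 10, 0],
]
pwm = log_odds_matrix(counts)
hits = scan("GGTATACCTATAGG", pwm, threshold=5.0)
```

Every window costs one table lookup per matrix column, so the running time grows with sequence length times total matrix length. That cost is what makes this approach slow for hundreds of matrices on chromosome-sized input.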

Scientific Applications:

  • Genome-scale motif scanning: Rapid identification of motif occurrences across entire genomes and chromosome-sized sequences.
  • Motif discovery and analysis: Prediction of putative sites for learned motifs and analysis of motif models.
  • Variant impact analysis: Assessment of how SNPs, insertions, and deletions affect PWM matches and potential regulatory sites.
  • Modeling position dependencies: Exploration of dinucleotide, q-mer, and other high-order dependencies within biological sequence motifs.
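
As an illustration of the variant impact use case, the sketch below compares the best PWM score over all windows covering a SNP position for the reference and the alternative allele. It is a simplified stand-in for what a variant-aware scanner does, not MOODS code: the matrix values, the function names, and the example sequence are hypothetical.

```python
BASES = "ACGT"

# Hypothetical log-odds matrix (rows A, C, G, T) favouring the site "TATA".
PWM = [
    [-2.0, 1.0, -2.0, 1.0],
    [-2.0, -2.0, -2.0, -2.0],
    [-2.0, -2.0, -2.0, -2.0],
    [1.0, -2.0, 1.0, -2.0],
]

def window_score(window, pwm):
    """Sum the matrix entries selected by each base of the window."""
    return sum(pwm[BASES.index(c)][j] for j, c in enumerate(window))

def best_overlapping_score(sequence, pos, pwm):
    """Best score among all windows that cover position pos."""
    m = len(pwm[0])
    first = max(0, pos - m + 1)
    last = min(pos, len(sequence) - m)
    return max(window_score(sequence[i:i + m], pwm) for i in range(first, last + 1))

def snp_effect(sequence, pos, alt, pwm):
    """Return (reference best, alternative best, difference) for a SNP."""
    ref_best = best_overlapping_score(sequence, pos, pwm)
    mutated = sequence[:pos] + alt + sequence[pos + 1:]
    alt_best = best_overlapping_score(mutated, pos, pwm)
    return ref_best, alt_best, alt_best - ref_best

# Replace the A at index 3 (inside the TATA site) with C.
ref_best, alt_best, delta = snp_effect("GGTATACC", 3, "C", PWM)
```

A negative difference suggests that the variant weakens the best site overlapping that position. Insertions and deletions could be handled the same way by splicing the sequence before rescoring, although the set of affected windows then has to account for the length change.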

Methodology:

MOODS generalizes classical multipattern matching to weight matrix matching. It combines online matching algorithms with a multiple-matrix filtration approach, and adapts filtering and superalphabet methods to high-order PWMs. The implementation is in C++.
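
The general idea behind filtering can be illustrated with a simple lookahead bound: a window is abandoned as soon as its partial score plus the best score still achievable from the remaining columns falls below the threshold. The sketch below is an assumption-laden simplification for intuition only; the algorithms published for MOODS are more elaborate, and the matrix, names, and threshold here are hypothetical.

```python
BASES = "ACGT"

# Hypothetical log-odds matrix (rows A, C, G, T) favouring the site "TATA".
PWM = [
    [-2.0, 1.0, -2.0, 1.0],
    [-2.0, -2.0, -2.0, -2.0],
    [-2.0, -2.0, -2.0, -2.0],
    [1.0, -2.0, 1.0, -2.0],
]

def max_suffix_scores(pwm):
    """suffix[j] is the best achievable score over columns j..m-1."""
    m = len(pwm[0])
    suffix = [0.0] * (m + 1)
    for j in range(m - 1, -1, -1):
        suffix[j] = suffix[j + 1] + max(row[j] for row in pwm)
    return suffix

def scan_naive(sequence, pwm, threshold):
    """Reference implementation: score every window in full."""
    m = len(pwm[0])
    hits = []
    for i in range(len(sequence) - m + 1):
        score = 0.0
        for j in range(m):
            score += pwm[BASES.index(sequence[i + j])][j]
        if score >= threshold:
            hits.append((i, score))
    return hits

def scan_lookahead(sequence, pwm, threshold):
    """Same hits as scan_naive, but stop a window once it cannot succeed."""
    m = len(pwm[0])
    suffix = max_suffix_scores(pwm)
    hits = []
    for i in range(len(sequence) - m + 1):
        score = 0.0
        for j in range(m):
            score += pwm[BASES.index(sequence[i + j])][j]
            if score + suffix[j + 1] < threshold:
                break  # even a perfect remainder cannot reach the threshold
        else:
            hits.append((i, score))  # loop finished, so score >= threshold
    return hits

sequence = "GGTATACCTATAGG"
naive_hits = scan_naive(sequence, PWM, 4.0)
pruned_hits = scan_lookahead(sequence, PWM, 4.0)
```

Both functions report the same hits. The pruned version inspects fewer columns on windows that fail early, which matters most when the threshold is strict and most windows are non-matches.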

Details

License: GPL-3.0
Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: C++, Python
Added: 8/20/2017
Last Updated: 11/25/2024

Publications

Pizzi C, Rastas P, Ukkonen E. Finding Significant Matches of Position Weight Matrices in Linear Time. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2011;8(1):69-79. doi:10.1109/tcbb.2009.35. PMID:21071798.

Korhonen JH, Palin K, Taipale J, Ukkonen E. Fast motif matching revisited: high-order PWMs, SNPs and indels. Bioinformatics. 2016;33(4):514-521. doi:10.1093/bioinformatics/btw683. PMID:28011774.

Funding: SYSCOL (258236); Academy of Finland CoE in Cancer Genetics Research (250345); NIASC (62721); Icelandic Research Fund (152679-051, VP12014044).

Korhonen J, Martinmäki P, Pizzi C, Rastas P, Ukkonen E. MOODS: fast search for position weight matrix matches in DNA sequences. Bioinformatics. 2009;25(23):3181-3182. doi:10.1093/bioinformatics/btp554. PMID:19773334. PMCID:PMC2778336.
