MSAProbs

MSAProbs performs protein multiple sequence alignment (MSA) using hidden Markov models, producing high-accuracy alignments for phylogenetic analysis, structure prediction, and functional annotation of proteins.


Key Features:

  • Hidden Markov models: Uses hidden Markov models (HMMs) to compute protein multiple sequence alignments.
  • High alignment accuracy: Produces highly accurate alignments, which matters for downstream phylogenetic, structural, and functional analyses, at the cost of longer runtimes on large datasets.
  • Parallelized implementation (MSAProbs-MPI): Provides a distributed-memory parallel version that leverages MPI on multicore CPU clusters to reduce execution times.
  • Cluster performance: Demonstrated runtime reductions of over one order of magnitude on a 32-node cluster with two Intel Haswell processors per node.
  • Comparison to GPU methods: When run on eight nodes, it is faster than the GPU-accelerated QuickProbs running on a Tesla K20.
  • Scalability: Can handle large datasets that exceed the time and memory constraints of both serial MSAProbs and QuickProbs.
  • Implementation: Implemented in C++ with MPI and compatible with Linux systems.
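
The cluster figure above can be read as a bound on per-node efficiency. The following is a minimal arithmetic sketch, not part of MSAProbs: the function names are hypothetical, and the value 10 is only the lower bound implied by "over one order of magnitude" on 32 nodes, not a measured timing.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup is the serial runtime divided by the parallel runtime."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return serial_seconds / parallel_seconds


def node_efficiency(speedup_factor, num_nodes):
    """Fraction of ideal linear scaling achieved per node."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return speedup_factor / num_nodes


if __name__ == "__main__":
    # Assumption: a speedup of exactly 10 is the minimum consistent with
    # "over one order of magnitude"; 32 is the node count quoted above.
    print(node_efficiency(10.0, 32))  # 0.3125
```

Under that assumption the per-node efficiency is at least about 31%. Each node has two multicore processors, so per-core efficiency cannot be derived from these numbers alone.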

Scientific Applications:

  • Phylogenetic analysis: Produces MSAs suitable for phylogenetic tree inference and evolutionary studies.
  • Structural prediction: Provides alignments to support protein secondary and tertiary structure prediction and comparative modeling.
  • Functional annotation: Generates alignments useful for transferring functional annotations and identifying conserved residues in proteins.
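
As an illustration of the conserved-residue use case, the sketch below scores each column of a finished alignment by the fraction of sequences that share its most common residue. It is a simple post-processing example and not MSAProbs functionality: the function names, the gap character "-", and the toy alignment are all assumptions made for illustration.

```python
from collections import Counter


def column_conservation(alignment, gap="-"):
    """Return, per column, the share of rows holding the most common residue.

    Gaps never count as a residue, but they stay in the denominator, so
    gap-rich columns score low.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def conserved_columns(alignment, threshold=1.0, gap="-"):
    """Indices of columns whose conservation score meets the threshold."""
    scores = column_conservation(alignment, gap)
    return [i for i, score in enumerate(scores) if score >= threshold]


if __name__ == "__main__":
    toy = ["MKV-L", "MKI-L", "MRVAL"]  # hypothetical aligned sequences
    print(conserved_columns(toy))  # [0, 4]
```

Production analyses usually rely on substitution-aware or entropy-based scores; the majority fraction is used here only because it is easy to verify by hand.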

Methodology:

Computes alignments using hidden Markov models. The parallel version, MSAProbs-MPI, is written in C++ and uses distributed-memory MPI parallelization to run on multicore CPU clusters.
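
To make the distributed-memory idea concrete, the sketch below splits a set of independent pairwise sequence comparisons across processes. It assumes the workload is dominated by all-against-all pairwise computations and uses a plain round-robin assignment. Both are illustrative assumptions, not the documented MSAProbs-MPI scheme, and `assign_pairs` is a hypothetical helper. It is written in plain Python without MPI so it runs anywhere.

```python
from itertools import combinations


def assign_pairs(num_sequences, num_ranks):
    """Round-robin split of all unordered sequence pairs across ranks.

    Returns one list of (i, j) index pairs per rank, with i < j.
    """
    if num_ranks <= 0:
        raise ValueError("num_ranks must be positive")
    buckets = [[] for _ in range(num_ranks)]
    for task_id, pair in enumerate(combinations(range(num_sequences), 2)):
        buckets[task_id % num_ranks].append(pair)
    return buckets


if __name__ == "__main__":
    # Hypothetical job: 5 sequences (10 pairs) spread over 3 processes.
    for rank, pairs in enumerate(assign_pairs(5, 3)):
        print(rank, pairs)
```

Round-robin keeps the number of tasks per rank within one of each other, but it ignores sequence length. A real aligner would likely need to balance by estimated cost per pair as well.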

Details

Tool Type:
api
Operating Systems:
Linux, Windows, Mac
Added:
8/3/2015
Last Updated:
11/25/2024

Publications

González-Domínguez J, Liu Y, Touriño J, Schmidt B. MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems. Bioinformatics. 2016;32(24):3826-3828. doi:10.1093/bioinformatics/btw558. PMID:27638400.

Funding: Ministry of Economy and Competitiveness of Spain and FEDER funds of the EU (grant TIN2013-42148-P)