MSAProbs
MSAProbs performs protein multiple sequence alignment (MSA) using hidden Markov models, producing high-accuracy alignments for phylogenetic analysis, structure prediction, and functional annotation of proteins.
Key Features:
- Hidden Markov models: Uses hidden Markov models (HMMs) to compute protein multiple sequence alignments.
- High alignment accuracy: Produces highly accurate alignments, which matters for downstream phylogenetic, structural, and functional analyses, at the cost of longer runtimes on large datasets.
- Parallelized implementation (MSAProbs-MPI): Provides a distributed-memory parallel version that leverages MPI on multicore CPU clusters to reduce execution times.
- Cluster performance: Demonstrated runtime reductions of over one order of magnitude on a 32-node cluster with two Intel Haswell processors per node.
- Comparison to GPU methods: When run on eight nodes, is faster than the GPU-accelerated QuickProbs running on a Tesla K20.
- Scalability: Can handle large datasets that exceed the time and memory constraints of both serial MSAProbs and QuickProbs.
- Implementation: Implemented in C++ with MPI and compatible with Linux systems.
Scientific Applications:
- Phylogenetic analysis: Produces MSAs suitable for phylogenetic tree inference and evolutionary studies.
- Structural prediction: Provides alignments to support protein secondary and tertiary structure prediction and comparative modeling.
- Functional annotation: Generates alignments useful for transferring functional annotations and identifying conserved residues in proteins.
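As a small illustration of the conserved-residue use case, the sketch below scores each column of an existing alignment by the fraction of sequences that share its most common residue. It does not call MSAProbs; the function name `column_conservation` and the toy alignment are hypothetical, and real analyses typically use more refined conservation measures.

```python
from collections import Counter


def column_conservation(alignment):
    """Score each column by the fraction of rows sharing its most common residue.

    Gaps ('-') are never counted as a residue, so a gapped column cannot
    reach a score of 1.0.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    scores = []
    for col in range(length):
        residues = [row[col] for row in alignment if row[col] != "-"]
        if not residues:
            scores.append(0.0)  # column contains only gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(alignment))
    return scores


# Hypothetical toy alignment, not output from MSAProbs.
toy_alignment = ["MKV-LA", "MKVQLA", "MRV-LG"]
scores = column_conservation(toy_alignment)
fully_conserved = [i for i, s in enumerate(scores) if s == 1.0]
```

Columns listed in `fully_conserved` are zero-based positions in the alignment, not positions in any single unaligned sequence.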
Methodology:
MSAProbs computes alignments using hidden Markov models. MSAProbs-MPI adds distributed-memory parallelization with MPI, implemented in C++ for multicore CPU clusters.
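To make the idea of distributed-memory parallelization concrete, here is a minimal sketch of one way independent work units could be split across processes. It assumes that the workload contains a stage that handles every pair of input sequences independently, and it uses a simple round-robin assignment; the function name `assign_pairs` is hypothetical, and this is not a description of how MSAProbs-MPI actually distributes its work. The sketch uses plain Python rather than MPI so that it runs anywhere.

```python
from itertools import combinations


def assign_pairs(num_sequences, num_ranks):
    """Assign every sequence pair (i, j) with i < j to a rank, round-robin.

    Returns a list with one bucket of pairs per rank. This is an
    illustrative scheme only, not the one used by MSAProbs-MPI.
    """
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    buckets = [[] for _ in range(num_ranks)]
    for k, pair in enumerate(combinations(range(num_sequences), 2)):
        buckets[k % num_ranks].append(pair)
    return buckets


# Hypothetical example: 5 sequences give 10 pairs, spread over 3 ranks.
buckets = assign_pairs(5, 3)
sizes = [len(b) for b in buckets]
```

Round-robin keeps the number of pairs per rank within one of each other, but it ignores sequence length. Since the cost of aligning a pair grows with the lengths involved, a production scheme would likely balance by estimated cost rather than by count.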
Details
- Tool Type: api
- Operating Systems: Linux, Windows, Mac
- Added: 8/3/2015
- Last Updated: 11/25/2024
Publications
González-Domínguez J, Liu Y, Touriño J, Schmidt B. MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems. Bioinformatics. 2016;32(24):3826-3828. doi:10.1093/bioinformatics/btw558. PMID:27638400.
Funding: Ministry of Economy and Competitiveness of Spain and FEDER funds of the EU (grant TIN2013-42148-P)
Links
Software catalogue
https://www.biocatalogue.org/services/3709