MSAProbs-MPI

MSAProbs-MPI performs parallel multiple sequence alignment of protein sequences on distributed-memory multicore clusters. It uses the Message Passing Interface (MPI) to distribute the work, with the goal of producing accurate alignments that scale to large inputs.


Key Features:

  • Parallelization with MPI: Distributes computational tasks across the processors of a cluster using MPI for distributed-memory parallelism.
  • High accuracy: Retains the probabilistic/statistical models of MSAProbs to produce reliable alignments of protein sequences.
  • Scalability and efficiency: Optimized for high-performance computing environments to handle large datasets with reduced runtime on multicore clusters.
  • Based on MSAProbs v0.9.7: Implements the algorithms and statistical frameworks of MSAProbs version 0.9.7 in a parallelized architecture.
  • Configurable performance parameters: Exposes parameters to adjust computation and resource usage according to hardware configurations and dataset characteristics.

Scientific Applications:

  • Phylogenetic analysis: Produces multiple sequence alignments used to infer evolutionary relationships and build phylogenetic trees.
  • Protein structure prediction: Provides accurate alignments of homologous protein sequences that inform comparative modelling and structure prediction.
  • Functional annotation of genes: Generates alignments that support transfer of functional annotations between conserved sequences.
  • Evolutionary studies: Enables analysis of sequence conservation, divergence, and evolutionary patterns across large protein datasets.

Methodology:

MSAProbs-MPI applies the probabilistic/statistical alignment models of MSAProbs and distributes the computation across cluster nodes using MPI. Configurable parameters allow performance to be tuned to the hardware and the dataset.
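
As an illustration of what distributing an alignment workload across MPI processes can look like, the sketch below splits the set of sequence pairs evenly among ranks. It is a minimal example under stated assumptions, not the scheme MSAProbs-MPI actually uses: it assumes that some stage of the alignment handles each pair of sequences independently, and it uses a simple round-robin assignment. The function name `assign_pairs` and its parameters are hypothetical, and plain Python stands in for real MPI calls.

```python
from itertools import combinations


def assign_pairs(num_sequences, num_ranks):
    """Assign every sequence pair (i, j), i < j, to a rank in round-robin order.

    Hypothetical helper for illustration only; it does not reproduce the
    work distribution implemented in MSAProbs-MPI.
    """
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    assignment = {rank: [] for rank in range(num_ranks)}
    for index, pair in enumerate(combinations(range(num_sequences), 2)):
        # Consecutive pairs go to consecutive ranks, so per-rank counts
        # differ by at most one.
        assignment[index % num_ranks].append(pair)
    return assignment


if __name__ == "__main__":
    work = assign_pairs(num_sequences=6, num_ranks=4)
    for rank, pairs in work.items():
        print(f"rank {rank}: {len(pairs)} pairs {pairs}")
```

In a real MPI program each process would compute only its own share, selected by its rank, and the partial results would then be gathered. Round-robin balances the number of pairs per rank but not their cost, which depends on sequence lengths.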

Details

Tool Type: command-line tool
Added: 1/18/2021
Last Updated: 3/1/2021

Publications

González-Domínguez J. Fast and Accurate Multiple Sequence Alignment with MSAProbs-MPI. Methods in Molecular Biology. 2020. doi:10.1007/978-1-0716-1036-7_3. PMID:33289885.