103 Free Multiple Sequence Alignment (MSA) Tools - Software and Resources

Graph: Occurrences of the word "MSA" in scientific articles stored in PubMed from 1990 to June 2019.

Database of Bioinformatics Software Tools and Resources.

MSA Tools General Summary

In bioinformatics, multiple sequence alignment means an alignment of more than two DNA, RNA, or protein sequences and is one of the oldest problems in computational biology.

One often-used strategy is to minimize the number of mismatches, insertions, and deletions in the alignment; dynamic programming (DP) can compute an alignment that is optimal under this criterion.

Unfortunately, dynamic programming is computationally feasible only for a small number of sequences, so in practice DP is used mainly to compute pairwise alignments. See our online tool that computes the number of possible alignments between two sequences. The computational complexity of a pairwise alignment by DP is O(n²) for two sequences of length n, so an optimal pairwise alignment can still be computed, although it becomes expensive for long sequences.
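The pairwise case can be illustrated with a minimal Python sketch. It assumes unit costs for mismatches and gaps (the function name `align_cost` and its parameters are chosen here for illustration and do not come from any listed tool) and returns only the optimal cost, not the alignment itself.

```python
def align_cost(a: str, b: str, mismatch: int = 1, gap: int = 1) -> int:
    """Minimum total cost of globally aligning a and b by dynamic programming.

    Assumption: a match costs 0, a mismatch costs `mismatch`, and every
    inserted or deleted character costs `gap`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # cost[i][j] holds the best cost of aligning a[:i] with b[:j].
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        cost[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            substitute = cost[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            delete = cost[i - 1][j] + gap
            insert = cost[i][j - 1] + gap
            cost[i][j] = min(substitute, delete, insert)
    return cost[-1][-1]


print(align_cost("ACGT", "AGT"))
```

The two nested loops fill a table of (n+1)×(m+1) cells, which is where the quadratic running time for two sequences comes from.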

Exact dynamic programming over k sequences of length n has a computational complexity of O(2^k n^k), so constructing multiple sequence alignments in practice requires heuristic methods. For example, aligning eight DNA sequences of 100 bases each exactly would take about 2⁸×100⁸ ≈ 2.6×10¹⁸ operations. At one operation per second, that is roughly six times the estimated age of the universe (about 4.4×10¹⁷ seconds).
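The arithmetic can be checked with a few lines of Python. The sketch assumes one operation per second and an age of the universe of about 13.8 billion years; both are assumptions made for the comparison, and the names are illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9  # assumed estimate, used only for scale


def exact_msa_operations(k: int, n: int) -> int:
    """Operation count 2^k * n^k for exact DP over k sequences of length n."""
    return (2 ** k) * (n ** k)


ops = exact_msa_operations(8, 100)
age_seconds = AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR
ratio = ops / age_seconds  # how many "ages of the universe" at 1 op/second

print(f"{ops:.2e} operations, {ratio:.1f} times the age of the universe")
```

Under these assumptions the count is 2.56×10¹⁸ operations, several times the age of the universe in seconds.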

Multiple sequence alignments are used for sequence comparison, assessment of data quality, prediction of protein and RNA structures, database searching, and phylogenetic analysis. Different methods are therefore used depending on the purpose. We will have a more in-depth treatment of this topic in our upcoming tutorial.

3DCoffee@igs

Description : 3DCoffee@igs is a web server for the computation of multiple sequence alignments (MSAs) that can mix protein sequences and 3D structures to increase the accuracy of the alignments. It first aligns 3D structures and sequences with structures and then uses T-Coffee to construct the multiple sequence alignments.

ALICO

Description : A tool intended to aid the development of multiple sequence alignment methods. It generates randomized versions of input sequences while preserving their essential features.

Anchored DIALIGN

Description : A web server for multiple protein and DNA sequence alignment. The tool allows a user to specify sequence segments as anchor points. The algorithm then aligns sequences using the anchor points as constraints. Alternatively, multiple sequence alignments can be done autonomously. This version of DIALIGN can only align sequences of up to a few thousand residues.

ANTICALIgN

Description : A tool specifically designed for combinatorial protein engineering. ANTICALIgN can construct multiple sequence alignments (MSAs) based on a template reference sequence and global sequence alignments. Available from the Authors upon request.

BAliBASE

Description : BAliBASE (Benchmark Alignment dataBASE) is a multiple sequence alignment (MSA) benchmarking reference set. It contains reference alignments based on three-dimensional structures and particular reference sets that contain various linear motifs. The Authors also provide a program that can compare a test alignment with the BAliBASE reference alignment.

BAliBASE 4

Description : BAliBASE (Benchmark Alignment dataBASE) is a multiple sequence alignment (MSA) benchmarking reference set. It contains reference alignments based on three-dimensional structures and particular reference sets that contain various linear motifs. The Authors also provide a program that can compare a test alignment with the BAliBASE reference alignment. See also "links" for BAliBASE.

BARCOD

Description : BARCOD makes a character matrix using Véronique Barriel's method, coding each insertion/deletion event as a single event regardless of its length and retaining common indels.

biojs-io-clustal

Description : A tool for parsing Clustal files in your web browser.

CHAOS and DIALIGN web server

Description : A web-based application that uses the CHAOS database search tool to find a list of local sequence similarities. DIALIGN uses these similarities as anchor points to construct multiple sequence alignments.

Clustal Omega

Description : The original Clustal Omega tool for multiple protein sequence alignment. Clustal Omega is capable of aligning thousands of sequences and improves on the previous versions of Clustal, ClustalW and ClustalX, by using HMMs, based on HHalign from Johannes Soeding. Clustal Omega also makes use of precomputed alignment information found in public databases.

Clustal Omega (EBI)

Description : EBI has several interfaces for Clustal Omega: Web interface, REST API, SOAP API, Open API Interface, and Common Workflow Language.

Clustal WS (jabaws)

Description : Finds the best global alignment for a set of input sequences (nucleic acid or protein).

ClustalO (EBI)

Description : A UniProt web server for multiple alignment of protein sequences using Clustal Omega. Capable of aligning up to 4,000 protein sequences. Also available as REST API, SOAP API, Open API Interface, and Common Workflow Language.

ClustalO (Galaxy Pasteur)

Description : The public Galaxy web interface at Institut Pasteur includes a Clustal Omega wrapper. Institut Pasteur gives external users free access to over 280 tools.

ClustalO WS (jabaws)

Description : Clustal Omega multiple sequence alignment program at JABAWS web-services. You can access JABAWS by Jalview, a command-line user interface, or install JABAWS and run it on your own computer.

ClustalW

Description : There are two versions of Clustal 2 multiple sequence alignment software: 1) Clustal W command-line tool and 2) Clustal X with graphical user interface.

ClustalW (PRABI)

Description : A web-based implementation at BCP - CNRS Université Lyon of Clustal W multiple sequence alignment software for protein and DNA sequences.

ClustalW (SIB)

Description : A web interface at Expasy for the ClustalW multiple sequence alignment (MSA) tool. Works with both nucleic acid and protein sequences.

ClustalW2

Description : Tools for multiple protein sequence alignment. The algorithm uses a guide tree in alignment creation. There are two separate flavors of Clustal 2: Clustal W, the command-line version and Clustal X, the graphical version.

ClustalX

Description : A version of the Clustal 2 multiple sequence alignment program with a graphical interface.

ClustalW2 (Galaxy Pasteur)

Description : The public Galaxy web interface at Institut Pasteur includes a ClustalW2 wrapper. Institut Pasteur gives external users free access to over 280 tools.

CMSA

Description : A command-line tool for construction of multiple sequence alignments. It can utilize both CPUs and GPUs.

CoMSA

Description : CoMSA is a compression and decompression tool for FASTA and Stockholm format multiple sequence alignment (MSA) files. The algorithm in CoMSA relies on a generalization of the positional Burrows-Wheeler transform of non-binary characters. The Authors claim it to be significantly faster than gzip and it can, for example, compress a Stockholm file of size 41.6 Gb into 1.74 Gb, compared to gzip file size of 5.6 Gb. Apart from source code, CoMSA is also available with binaries for Windows and Linux.

CRASP

Description : The tool analyses multiple protein sequence alignments to find correlated residues. The algorithm assumes that functionally related residues are due to dependent evolution. The calculations are based on physicochemical properties.

CUDA ClustalW

Description : CUDA ClustalW v1.0 is a GPU version of the ClustalW v2.0.1 multiple sequence alignment (MSA) tool, using synchronous diagonal multiple threads and parallelization of internal tasks. The Authors report a speed-up of about 22 times compared to running on a single CPU.

DCA

Description : Divide-and-Conquer Multiple Sequence Alignment (DCA). This tool uses a divide-and-conquer method to construct multiple sequence alignments (MSA) heuristically. It can align amino acid, DNA, and RNA sequences. The web site also provides REST services.

DIALIGN

Description : DIALIGN is a multiple sequence alignment (MSA) tool. An improved version, DIALIGN-TX, is in turn an improvement over DIALIGN-T and combines greedy and progressive methods. See also 'LINKS'.

DIALIGN 2

Description : DIALIGN 2 is an improved version of the original multiple sequence alignment tool DIALIGN from the year 1997. This tool uses sequence segments that don't contain indels for alignment construction. The most recent version of DIALIGN is called DIALIGN-TX. See the 'LINKS'.

DIALIGN-TX

Description : DIALIGN-TX is the latest version of the DIALIGN multiple sequence alignment (MSA) tool. The main algorithmic addition is the usage of a guide tree.

edialign

Description : edialign is an EMBOSS version of the DIALIGN 2 multiple sequence alignment (MSA) tool.

EMBL-Align

Description : EMBL-Align is a publicly available database of multiple sequence alignments (MSAs). An associated tool, Webin-Align, is used for submission of alignments. The EBI SRS (Sequence Retrieval System) server is used to query all the multiple sequence alignments.

FAMSA

Description : FAMSA is designed to produce rapid multiple sequence alignment of large protein families. It first determines the longest common subsequences and has a unique way to compute gap costs. It proceeds progressively to add sequences into the alignments using a novel iterative approach. The Authors claim FAMSA to be superior to Clustal Omega and MAFFT. A GPU version is available.

GISMO

Description : Bayesian Markov chain Monte Carlo (MCMC) sampler for protein multiple sequence alignment (MSA).

HAlign-II

Description : HAlign-II is a tool for multiple sequence alignment of amino acid and nucleotide sequences and phylogenetic tree construction, aimed at sequence files bigger than one Gb. The software can be used in standalone or in Hadoop cluster mode. HAlign-II contains three types of sequence alignment methods and a large-scale phylogenetic tree construction method based on the Apache Spark platform. You can also run HAlign-II on the web server on the clusters in Tianjin University (Spark & Hadoop cluster and NVIDIA K80 GPU cluster). The web server is accessible from the HAlign-II web page.

HandAlign

Description : HandAlign is a part of the DART package and a tool for reconstruction of multiple sequence alignment (MSA) and phylogenetic history. HandAlign includes several Metropolis-Hastings Markov chain Monte Carlo (MCMC) methods for sampling of any target distribution.

HmmCleaner

Description : HmmCleaner is a tool to remove segments containing alignment and sequencing errors from multiple sequence alignments (MSA) using profile hidden Markov models (pHMM). This tool is based on Bio::MUST modules and integrates into the MUST environment.

ISPAlign

Description : ISPAlign (Intermediate Sequence Profile Alignment) is a multiple sequence alignment program that incorporates an improvement to ProbCons’ HMM algorithm by extending it to use intermediate sequence profiles and structure predictions.

Kalign

Description : Kalign is a fast and accurate multiple sequence alignment algorithm for protein, RNA, and DNA sequences.

Kalign (EBI)

Description : An implementation of the Kalign multiple sequence alignment (MSA) tool at EBI. Web form and web services are available.

Kalignvu

Description : Kalignvu is a viewer for multiple sequence alignments and phylogenetic trees.

KMAD

Description : KMAD is a software package specifically designed to construct multiple sequence alignments (MSAs) of so-called intrinsically disordered proteins (IDPs). IDPs differ from globular proteins by lacking tertiary structure and by having lower sequence conservation. The Authors provide both stand-alone and web server versions.

M-Coffee

Description : M-Coffee is a particular mode of T-Coffee software.

MAFCO

Description : Lossless compression tool specifically designed to compress MAF (Multiple Alignment Format) files.

MAFFT (CBRC)

Description : MAFFT (Multiple Alignment using Fast Fourier Transform) is a multiple sequence alignment program for nucleotide and protein sequences. It supports interactive sequence selection and visualization.

MAFFT (EBI)

Description : A web interface for the MAFFT (Multiple Alignment using Fast Fourier Transform) multiple sequence alignment (MSA) tool at EBI. You can align up to 500 sequences, with a file size of up to one MB.

MAFFT (REST)

Description : A REST interface for MAFFT (Multiple Alignment using Fast Fourier Transform) multiple sequence alignment (MSA) tool at EBI.

MAFFT parallel

Description : A parallelized version of MAFFT multiple sequence alignment (MSA) tool. The parallelization is based on the POSIX Threads library with two approaches: best-first and simple hill-climbing in the alignment refinement stage.

Malakite

Description : A web-based tool, Malakite (Multiple Alignment Automatic Kinship Tiling Engine), is for analysis of aligned blocks in multiple protein sequence alignments.

MARS

Description : MARS is a multiple sequence alignment (MSA) tool specifically designed for the alignment of sequences from circular genomes, such as mitochondrial and viral genome sequences.

mbed

Description : A command-line-based tool for computation of guide trees for multiple sequence alignments (MSAs).

MSA at BYU

Description : Enhanced multiple sequence alignment (MSA) software at Brigham Young University, Computer Science Department. This software uses hardware acceleration: GPU, FPGA, and Cell BE.

msa-edna

Description : EDNA (Energy Based Multiple Sequence Alignment) is a multiple sequence alignment (MSA) program for aligning transcription factor binding site sequences (TFBSs). The novelty of this software is the scoring using a thermodynamically generated null hypothesis. The method is well suited for aligning sequences that are often not related. Alternative names: Energy Based Multiple Sequence Alignment, EDNA

MSA-PAD 2.0

Description : A web-based tool for multiple sequence alignment (MSA) of DNA. The algorithm uses PFAM or profiles provided by a user. The web interface requires registration and login.

MSACompro

Description : MSACompro is a tool that integrates predicted secondary structure, residue contact information, and relative solvent accessibility into a posterior probability for multiple sequence alignment (MSA) software, such as MSAProbs, ProbCons, Probalign, T-Coffee, MAFFT, and MUSCLE.

MSAprobs

Description : A tool for multiple sequence alignment (MSA) for protein sequences. Features: uses a combination of hidden Markov models and partition functions, weighted probabilistic consistency transformation, weighted profile to profile alignments. The Authors claim MSAprobs to have better accuracy than ClustalW, MAFFT, MUSCLE, ProbCons, and Probalign. A multicore version is available. See "LINKS."

MSAProbs-MPI

Description : A parallel version of the MSAProbs multiple sequence alignment (MSA) tool. The method is based on hidden Markov models. See LINKS on this page for more information about MSAProbs.

MSARC

Description : MSARC is a multiple sequence alignment (MSA) tool that constructs alignments without guide trees. The Authors claim that, in tests on the BAliBASE benchmark, their method produces better alignments for "sequence sets whose evolutionary distances are difficult to represent by a phylogenetic tree."

Multi-LAGAN

Description : Multi-LAGAN is a tool for multiple global alignments of genomic sequences. It is a part of the LAGAN Toolkit and is based on the CHAOS local alignment tool.
Alternative name: MLAGAN

Mumsa

Description : Mumsa is a tool for the assessment of the quality of multiple sequence alignments (MSAs).

MUSCLE

Description : A program to create multiple sequence alignments of a large number of sequences. Prominent features are rapid sequence distance computation using k-mer counting, a profile function computing log-expectation scores, and tree-dependent partitioning of the sequences.
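The idea behind distance computation by k-mer counting can be sketched in Python without aligning the sequences at all. This is an illustrative sketch only: the function names and the normalization by the shorter sequence are assumptions made here and are not taken from the MUSCLE source code.

```python
from collections import Counter


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a: str, b: str, k: int = 3) -> float:
    """Alignment-free distance: 1 minus the fraction of shared k-mers.

    Assumption: the shared count is normalized by the number of k-mers
    in the shorter sequence, so identical sequences get distance 0.
    """
    shortest = min(len(a), len(b)) - k + 1
    if shortest <= 0:
        raise ValueError("both sequences must contain at least k characters")
    # Counter intersection keeps the minimum count of each common k-mer.
    shared = sum((kmer_counts(a, k) & kmer_counts(b, k)).values())
    return 1.0 - shared / shortest


print(kmer_distance("ACGTAC", "ACGTTT"))
```

Because each sequence is scanned only once, a full distance matrix can be filled much faster than with pairwise dynamic programming, which is why such distances are attractive for building guide trees.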

MUSCLE (BioConductor)

Description : An R package of multiple sequence alignment with MUSCLE.

MUSCLE (EBI)

Description : A web-based multiple sequence alignment with MUSCLE. RESTful and SOAP services are also available.

Muscle WS (jabaws)

Description : MUSCLE multiple sequence alignment program at JABAWS web-services. You can access JABAWS by Jalview, a command-line user interface, or install JABAWS and run it on your own computer.

Mustguseal

Description : A web application for multiple sequence alignments of protein families. The application constructs the alignments based on structural and other information in public databases.

MView (EBI)

Description : MView is a tool for reformatting multiple alignments or the results of BLAST, FASTA, or other database searches by adding optional HTML markup for coloring and web page layout.

OD-seq

Description : OD-seq is a software program for detecting outliers in multiple sequence alignments (MSA). It works by finding sequences with an inconsistent average distance to sequences present in the multiple alignment.
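The average-distance idea can be sketched in Python as follows. The sketch makes two simplifying assumptions that are not taken from OD-seq itself: distances are plain mismatch fractions between aligned rows, and a sequence is flagged when its average distance exceeds the mean by a chosen number of standard deviations. All names are illustrative.

```python
from statistics import mean, pstdev


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of alignment columns in which two aligned rows differ."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_distances(rows: list[str]) -> list[float]:
    """Average distance from each row to all other rows of the alignment."""
    return [
        mean(mismatch_fraction(row, other) for j, other in enumerate(rows) if j != i)
        for i, row in enumerate(rows)
    ]


def find_outliers(rows: list[str], threshold: float = 1.5) -> list[int]:
    """Indices of rows whose average distance is unusually high.

    Assumption: "unusually high" means more than `threshold` population
    standard deviations above the mean of all average distances.
    """
    averages = average_distances(rows)
    centre, spread = mean(averages), pstdev(averages)
    if spread == 0:
        return []
    return [i for i, value in enumerate(averages) if value > centre + threshold * spread]


alignment = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGT", "TTTTTTTT"]
print(find_outliers(alignment))
```

With very few sequences a standard-deviation cutoff is crude, so the threshold here should be treated as a tunable illustration rather than a recommended value.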

OD-seq (bioconductor)

Description : OD-seq is a software program for detecting outliers in multiple sequence alignments (MSA). It works by finding sequences with an inconsistent average distance to sequences present in the multiple alignment. Requirement: R >= 3.2.3.

OPAL

Description : A tool for multiple sequence alignment (MSA) using "form-and-polish strategy." The Authors claim OPAL to be more accurate than Muscle and similar to Muscle on protein sequence alignment and have similar accuracy as MAFFT and Muscle on DNA sequence alignments.

OXBench

Description : OXBench consists of a set of programs to perform accuracy assessment of multiple sequence alignment methods, aimed at software developers.

PASTA

Description : PASTA (Practical Alignment using SATé and TrAnsitivity) is a multiple sequence alignment tool that uses a guide tree.

PnpProbs

Description : PnpProbs is a multiple sequence alignment (MSA) tool. It operates by assigning sequences into two, distantly and "normally" related, groups and uses a guide tree solely for "normally" related sequences. For distantly related sequences, it applies a non-progressive approach to generate a multiple sequence alignment (MSA).

PRALINE

Description : PRALINE is a multiple sequence alignment program that provides several different alignment strategies, e.g., integration of structural information in the alignment process. It also provides a comprehensive visualization of the multiple sequence alignments. A SOAP service is available.

PRANK

Description : Creating multiple alignments that represent structural homology and evolutionary homology requires separate approaches. PRANK is designed for the construction of multiple alignments that reflect evolutionary homology and uses phylogenetic information to handle insertions and deletions.

PRANK API

Description : A web API at EBI for PRANK, a multiple sequence alignment (MSA) tool for nucleic acid and amino acid sequences. The core algorithm differs from 'traditional' ones by avoiding overestimation of insertion/deletion events and by accounting for the evolutionary distance between the sequences. Available upon request from Ari Löytynoja.

Pro-Coffee

Description : Pro-Coffee is a part of the T-Coffee package and implemented for multiple sequence alignment of promoter regions.

Probalign

Description : Probalign is a multiple sequence alignment (MSA) program that uses a partition function to estimate posterior alignment probabilities. The Authors claim Probalign to be more accurate than ProbCons, MAFFT, and MUSCLE.

ProbCons

Description : Probabilistic Consistency-based Multiple Alignment of Amino Acid Sequences. It uses probabilistic modeling and consistency-based techniques in the alignment construction. The Authors claim this tool to have improved alignments compared to T-Coffee, Clustal W, and Dialign.

ProDA

Description : ProDA is a tool that constructs local multiple sequence alignments (MSAs) by first identifying repeated homologous regions in a collection of protein sequences.

PROMALS3D

Description : PROMALS (Profile Multiple Alignment with Local Structure) is a web-based tool for the construction of multiple sequence alignments (MSA). It searches both sequence and structure databases and uses that information together with user-defined constraints.

PSAlign

Description : PSAlign is a multiple sequence alignment tool. The algorithm constructs pairwise sequence alignments that are represented as a graph and finds the shortest path to create a multiple sequence alignment without heuristics.

PSAR-align

Description : A tool for improving multiple sequence alignments using probabilistic sampling.

PSI-Coffee

Description : PSI-Coffee is a part of the T-Coffee distribution and specifically designed for making multiple sequence alignments (MSAs) of alpha-helical transmembrane protein sequences. The Authors claim PSI-Coffee to be more accurate than MSAProbs, Kalign, PROMALS, MAFFT, ProbCons, and PRALINE.

PVS

Description : The Protein Variability Server (PVS) web server uses several variability metrics to calculate sequence variability within a multiple protein sequence alignment. The tool can map the sequence variability to a supplied 3D structure, plot the variability, mask the variability in a sequence, predict T-cell epitopes, locate conserved sequences in 3D structures, and return conserved sequence fragments.

QOMA

Description : Multiple sequence alignment of protein sequences, using a k-partite graph. The algorithm is independent of the order of sequences.

QuickAlign

Description : A tool for the editing of sequence alignments, and the making of multiple sequence alignments (MSAs) with ClustalX.

QuickProbs

Description : A multiple protein sequence alignment tool that is based on probabilistic models, employing a column-oriented and selective consistency in alignment refinement instead of commonly used strategies of increased alignment quality. The Authors claim QuickProbs 2 to be "noticeably" better than ClustalΩ and MAFFT.

R-Coffee

Description : R-Coffee is a package for multiple RNA sequence alignments, derived from the T-Coffee package. It uses structural information in the construction of the sequence alignments, and a special version of T-Coffee constructs the multiple sequence alignments incorporating the structural information. Requirements: RNAplfold from the Vienna package, Mafft, Muscle, ProbCons, and ConSan.

R3D-2-MSA

Description : The R3D-2-MSA (RNA 3D Structure-to-Multiple Sequence Alignment) is a web-based tool for linking 3D structures to multiple RNA sequence alignments.

SARA-Coffee

Description : A web server for multiple sequence alignments (MSAs) of RNA sequences based on 3D structures. SARA combines pair-wise structural alignments with R-Coffee multiple RNA alignments. It also allows alignment without 3D structures.

showalign

Description : Display a multiple sequence alignment in pretty format.

SINA

Description : SINA (SILVA Incremental Aligner) is a web tool for multiple sequence alignment (MSA) specifically designed for the multiple alignment of ribosomal RNA genes (rRNA). SINA is also able to taxonomically classify the sequences.

STACCATO

Description : STACCATO is a multiple sequence alignment (MSA) tool that combines the use of three-dimensional structure alignment probabilities and standard amino acid substitution probabilities. Available from the Authors.

SuiteMSA

Description : A Java-based application that provides unique MSA viewers. Users can directly compare multiple MSAs and evaluate where the MSAs agree (are consistent) or disagree (are inconsistent).

T-Coffee

Description : T-Coffee is a multiple sequence alignment (MSA) program. It preprocesses the data by making pair-wise alignments between all sequences and this information is incorporated in the progressive alignment procedure. The structural sequence information may be obtained from various different sources. It can align amino acid and nucleotide sequences.

T-Coffee (CRG)

Description : A web server at The Centre for Genomic Regulation (CRG) provides T-Coffee tools for assessing and handling multiple sequence alignments (MSAs) of nucleotide and amino acid sequences and related structures. The web server provides the following methods: M-Coffee, R-Coffee, Expresso, PSI-Coffee, and iRMSD-APDB.

T-Coffee (EBI)

Description : T-Coffee is a multiple sequence alignment (MSA) program. The web user interface and web services interfaces at EBI provide Simple Object Access Protocol (SOAP), Representational State Transfer (REST), Open API Interface, and Common Workflow Language (CWL) services. The T-Coffee package pre-processes the data by making pair-wise alignments between all sequences and this information is incorporated in the progressive alignment procedure. The structural sequence information may be obtained from various different sources. It can align amino acid and nucleotide sequences. The package combines several alignment methods.

TM-Aligner

Description : A web-based tool to align transmembrane proteins using the Wu-Manber string matching algorithm. The tool can visualize multiple sequence alignments in varied color schemes.

TM-Coffee

Description : A web-based tool for PSI/TM-Coffee at The Centre for Genomic Regulation (CRG). The tool is specifically designed to construct multiple sequence alignments (MSAs) of transmembrane proteins. It can use transmembrane databases for fast extension of homology.

trimAl

Description : Tool for the removal of spurious sequences or poorly aligned regions from multiple sequence alignments. It can automatically detect and select various parameters to optimize the signal-to-noise ratio.

UniProt Align

Description : A web interface at UniProt for multiple sequence alignment using Clustal Omega.

VerAlign

Description : VerAlign is a web-based tool to compare two multiple sequence alignments (MSAs). It uses SPdist scoring scheme which measures a distance between mismatched amino acid pairs. Available from the Authors upon request.

webPRANK

Description : webPRANK is a multiple sequence alignment (MSA) tool for DNA, protein, cDNA, and codon sequences at Goldman Group (EBI). It has structure models built-in and includes a web-based visualization of multiple alignments.
