

# 103 Free Multiple Sequence Alignment (MSA) Tools - Software and Resources

## MSA Tools General Summary

In bioinformatics, multiple sequence alignment means an alignment of more than two DNA, RNA, or protein sequences and is one of the oldest problems in computational biology.

One often used strategy is to minimize the number of mismatches, insertions, and deletions in the alignment, and we can use the Dynamic Programming (DP) algorithm to compute an optimal alignment.
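As a minimal sketch of this strategy, the following Python function fills a Dynamic Programming table in which every mismatch, insertion, and deletion costs 1, then traces back one optimal pairwise alignment. The unit costs and the function name `edit_alignment` are illustrative assumptions, not part of any tool listed on this page.

```python
def edit_alignment(a: str, b: str):
    """Return (cost, aligned_a, aligned_b) for one minimum-cost alignment.

    Assumption for illustration: mismatch, insertion, and deletion each cost 1.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimum cost of aligning the prefixes a[:i] and b[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # a[:i] against an empty prefix: i deletions
    for j in range(1, m + 1):
        cost[0][j] = j  # empty prefix against b[:j]: j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            cost[i][j] = min(diagonal, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (a[i - 1] != b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return cost[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = edit_alignment("ACGT", "AGT")
print(score)
print(top)
print(bottom)
```

Real alignment tools generally use substitution matrices and more elaborate gap penalties instead of unit costs, but the table-filling idea is the same.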

Unfortunately, exact Dynamic Programming is computationally feasible only for a small number of sequences; in practice, DP is therefore used mainly to compute pairwise alignments. See our online tool that computes the number of possible alignments between two sequences. The computational complexity of aligning two sequences of length n is $O(n^2)$, so an optimal pairwise alignment can still be computed, although it becomes expensive for long sequences.
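To give a feel for the size of the search space, here is a small sketch that counts pairwise alignments under one common convention: every distinct path through the DP grid (diagonal step, gap in the first sequence, or gap in the second) counts as a separate alignment. This convention and the function name `count_alignments` are assumptions for illustration; other counting conventions give different numbers.

```python
def count_alignments(m: int, n: int) -> int:
    """Count grid paths (Delannoy numbers) for sequences of lengths m and n."""
    # counts[i][j] = number of alignments of prefixes with lengths i and j.
    # With an empty prefix on one side there is exactly one alignment (all gaps).
    counts = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            counts[i][j] = (
                counts[i - 1][j]        # last column: residue over gap
                + counts[i][j - 1]      # last column: gap over residue
                + counts[i - 1][j - 1]  # last column: residue over residue
            )
    return counts[m][n]


for length in (1, 2, 3, 10):
    print(length, count_alignments(length, length))
```

The count grows exponentially with sequence length, while the DP table that finds the best of these alignments has only (m + 1) × (n + 1) cells.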

To construct multiple sequence alignments, we therefore need heuristic methods. The computational complexity of an exact DP alignment is $O(2^k n^k)$, where k is the number of sequences and n is their length. In other words, aligning eight DNA sequences of 100 bases each takes about $2^8 \times 100^8 \approx 3 \times 10^{18}$ steps; at one step per second, that is about $3 \times 10^{18}$ seconds, several times the estimated age of the universe.
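The estimate can be checked with a few lines of arithmetic. The sketch below assumes one DP step per second and an age of the universe of roughly 13.8 billion years; both figures are assumptions added here for the comparison.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9  # assumed estimate, used only for comparison


def exact_msa_steps(k: int, n: int) -> int:
    """Step count 2^k * n^k for exact DP over k sequences of length n."""
    return (2 ** k) * (n ** k)


steps = exact_msa_steps(8, 100)
age_seconds = AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR
ratio = steps / age_seconds  # assumes one step per second

print(f"{steps:.2e} steps")
print(f"about {ratio:.1f} times the age of the universe at one step per second")
```

Faster hardware shrinks the constant, but adding a single sequence multiplies the step count by 2n, which is why exact methods do not scale.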

The purpose of multiple sequence alignments can be sequence comparison, assessment of data quality, prediction of protein and RNA structures, database searching, and phylogenetic analysis. For this reason, varied methods are used depending on the purpose. We will have a more in-depth treatment of this topic in our upcoming tutorial.

1. ##### 3DCoffee@igs
• Description : 3DCoffee@igs is a web server for computing multiple sequence alignments (MSAs) that can mix protein sequences and 3D structures to increase the accuracy of the alignments. It first aligns 3D structures and sequences with structures and then uses T-Coffee to construct the multiple sequence alignments.
2. ##### ALICO
• Description : A tool to aid the development of multiple sequence alignment methods. It generates randomized versions of input sequences while preserving their essential features.
3. ##### Anchored DIALIGN
• Description : A web server for multiple protein and DNA sequence alignment. The tool allows a user to specify sequence segments as anchor points. The algorithm then aligns sequences using the anchor points as constraints. Alternatively, multiple sequence alignments can be done autonomously. This version of DIALIGN can only align sequences of up to a few thousand residues.
4. ##### ANTICALIgN
• Description : A tool specifically designed for combinatorial protein engineering. ANTICALIgN can construct multiple sequence alignments (MSAs) based on a template reference sequence and global sequence alignments. Available from the Authors upon request.
5. ##### BAliBASE
• Description : BAliBASE (Benchmark Alignment dataBASE) is a multiple sequence alignment (MSA) benchmarking reference set. It contains reference alignments based on three-dimensional structures and particular reference sets that contain various linear motifs. The Authors also provide a program that can compare a test alignment with the BAliBASE reference alignment.
6. ##### BAliBASE 4
• Description : BAliBASE (Benchmark Alignment dataBASE) is a multiple sequence alignment (MSA) benchmarking reference set. It contains reference alignments based on three-dimensional structures and particular reference sets that contain various linear motifs. The Authors also provide a program that can compare a test alignment with the BAliBASE reference alignment. See also "links" for BAliBASE.
7. ##### BARCOD
• Description : BARCOD makes a character matrix using Véronique Barriel’s method, coding each insertion/deletion event, regardless of its length, as a single event and retaining common indels.
9. ##### CHAOS and DIALIGN web server
• Description : A web-based application that uses the CHAOS database search tool to find a list of local sequence similarities. DIALIGN uses these similarities as anchor points to construct multiple sequence alignments.
10. ##### Clustal Omega
• Description : The original Clustal Omega tool for multiple protein sequence alignment. Clustal Omega is capable of aligning thousands of sequences and is an improvement over the previous versions of Clustal, ClustalW and ClustalX; it uses HMMs, based on HHalign from Johannes Soeding. Clustal Omega also makes use of precomputed alignment information found in public databases.
11. ##### Clustal Omega (EBI)
• Description : EBI has several interfaces for Clustal Omega: Web interface, REST API, SOAP API, Open API Interface, and Common Workflow Language.
12. ##### Clustal WS (jabaws)
• Description : Finds the best global alignment for a set of input sequences (nucleic acid or protein).
13. ##### ClustalO (EBI)
• Description : A UniProt web server for multiple alignment of protein sequences using Clustal Omega. Capable of aligning up to 4,000 protein sequences. Also available as REST API, SOAP API, Open API Interface, and Common Workflow Language.
14. ##### ClustalO (Galaxy Pasteur)
• Description : A public Galaxy web interface at Institut Pasteur that includes a Clustal Omega wrapper. Institut Pasteur gives external users free access to over 280 tools.
15. ##### ClustalO WS (jabaws)
• Description : Clustal Omega multiple sequence alignment program at JABAWS web-services. You can access JABAWS through Jalview or a command-line user interface, or install JABAWS and run it on your own computer.
16. ##### ClustalW
• Description : There are two versions of Clustal 2 multiple sequence alignment software: 1) Clustal W command-line tool and 2) Clustal X with graphical user interface.
17. ##### ClustalW (PRABI)
• Description : A web-based implementation at BCP - CNRS Université Lyon of Clustal W multiple sequence alignment software for protein and DNA sequences.
18. ##### ClustalW (SIB)
• Description : A web interface at Expasy for ClustalW multiple sequence alignment (MSA) tool. Works with both nucleic acid and protein sequences.
19. ##### ClustalW2
• Description : Tools for multiple protein sequence alignment. The algorithm uses a guide tree in alignment creation. There are two separate flavors of Clustal 2: Clustal W, the command-line version and Clustal X, the graphical version.
20. ##### ClustalX
• Description : A version of the Clustal 2 multiple sequence alignment program with a graphical interface.
21. ##### ClustalW2 (Galaxy Pasteur)
• Description : A public Galaxy web interface at Institut Pasteur that includes a ClustalW2 wrapper. Institut Pasteur gives external users free access to over 280 tools.
22. ##### CMSA
• Description : A command-line tool for construction of multiple sequence alignments. It can utilize both CPUs and GPUs.
23. ##### CoMSA
• Description : CoMSA is a compression and decompression tool for FASTA and Stockholm format multiple sequence alignment (MSA) files. The algorithm in CoMSA relies on a generalization of the positional Burrows-Wheeler transform of non-binary characters. The Authors claim it to be significantly faster than gzip and it can, for example, compress a Stockholm file of size 41.6 Gb into 1.74 Gb, compared to gzip file size of 5.6 Gb. Apart from source code, CoMSA is also available with binaries for Windows and Linux.
24. ##### CRASP
• Description : The tool analyses multiple protein sequence alignments to find correlated residues. The algorithm assumes that functionally related residues are due to dependent evolution. The calculations are based on physicochemical properties.
25. ##### CUDA ClustalW
• Description : The multiple sequence alignment (MSA) tool CUDA ClustalW v1.0 is a GPU version of ClustalW v2.0.1 that uses synchronous diagonal multiple threads and parallelization of internal tasks. The Authors report a speedup of about 22 times compared to running on a single CPU.
26. ##### DCA
• Description : Divide-and-Conquer Multiple Sequence Alignment (DCA). This tool uses a divide-and-conquer method to construct multiple sequence alignments (MSA) heuristically. It can align amino acid, DNA, and RNA sequences. The web site also provides REST services.
27. ##### DIALIGN
• Description : DIALIGN is a multiple sequence alignment (MSA) tool. An improved version is called DIALIGN-TX, which in turn is an improvement over DIALIGN-T and combines greedy and progressive methods. See also 'LINKS'.
28. ##### DIALIGN 2
• Description : DIALIGN 2 is an improved version of the original multiple sequence alignment tool DIALIGN from the year 1997. This tool uses sequence segments that don't contain indels for alignment construction. The most recent version of DIALIGN is called DIALIGN-TX. See the 'LINKS'.
29. ##### DIALIGN-TX
• Description : DIALIGN-TX is the latest version of the DIALIGN multiple sequence alignment (MSA) tool. The main algorithmic addition is the usage of a guide tree.
30. ##### edialign
• Description : edialign is an EMBOSS version of the DIALIGN 2 multiple sequence alignment (MSA) tool.
31. ##### EMBL-Align
• Description : EMBL-Align is a publicly available database of multiple sequence alignments (MSAs). An associated tool, Webin-Align, is used for submission of alignments. The EBI SRS (Sequence Retrieval System) server is used to query all the multiple sequence alignments.
32. ##### FAMSA
• Description : FAMSA is designed to produce rapid multiple sequence alignment of large protein families. It first determines the longest common subsequences and has a unique way to compute gap costs. It proceeds progressively to add sequences into the alignments using a novel iterative approach. The Authors claim FAMSA to be superior to Clustal Omega and MAFFT. A GPU version is available.
33. ##### GISMO
• Description : Bayesian Markov chain Monte Carlo (MCMC) sampler for protein multiple sequence alignment (MSA).
34. ##### HAlign-II
• Description : HAlign-II is a tool for multiple sequence alignment of amino acid and nucleotide sequences and for phylogenetic tree construction, aimed at sequence files bigger than one Gb. The software can be used in standalone or in Hadoop cluster mode. HAlign-II contains three types of sequence alignment methods and a large-scale phylogenetic tree construction method based on the Apache Spark platform. You can also run HAlign-II on the web server on the clusters in Tianjin University (Spark & Hadoop cluster and NVIDIA K80 GPU cluster). The web server is accessible from the HAlign-II web page.
35. ##### HandAlign
• Description : HandAlign is a part of the DART package and a tool for reconstruction of multiple sequence alignment (MSA) and phylogenetic history. HandAlign includes several Metropolis-Hastings Markov chain Monte Carlo (MCMC) methods for sampling of any target distribution.
36. ##### HmmCleaner
• Description : HmmCleaner is a tool that uses profile hidden Markov models (pHMM) to remove segments containing alignment and sequencing errors from multiple sequence alignments (MSA). This tool is based on Bio::MUST modules and integrates into the MUST environment.
37. ##### ISPAlign
• Description : ISPAlign (Intermediate Sequence Profile Alignment) is a multiple sequence alignment program that incorporates an improvement to ProbCons’ HMM algorithm by extending it to use intermediate sequence profiles and structure predictions.
38. ##### Kalign
• Description : Kalign is a fast and accurate multiple sequence alignment algorithm for protein, RNA, and DNA sequences.
39. ##### Kalign (EBI)
• Description : An implementation of the Kalign multiple sequence alignment (MSA) tool at EBI. A web form and web services are available.
40. ##### Kalignvu
• Description : Kalignvu is a viewer for multiple sequence alignments and phylogenetic trees.
41. ##### KMAD
• Description : KMAD is a software package specifically designed to construct multiple sequence alignments (MSAs) of so-called intrinsically disordered proteins (IDPs). IDPs differ from globular proteins by lacking tertiary structure and by having lower sequence conservation. The Authors provide both stand-alone and web server versions.
42. ##### M-Coffee
• Description : M-Coffee is a particular mode of T-Coffee software.
43. ##### MAFCO
• Description : Lossless compression tool specifically designed to compress MAF (Multiple Alignment Format) files.
44. ##### MAFFT (CBRC)
• Description : MAFFT (Multiple Alignment using Fast Fourier Transform) is a multiple sequence alignment program for nucleotide and protein sequences. It allows users to select sequences interactively and to visualize the results.
45. ##### MAFFT (EBI)
• Description : A web interface for the MAFFT (Multiple Alignment using Fast Fourier Transform) multiple sequence alignment (MSA) tool at EBI. You can align up to 500 sequences, with a file size of up to one MB.
46. ##### MAFFT (REST)
• Description : A REST interface for the MAFFT (Multiple Alignment using Fast Fourier Transform) multiple sequence alignment (MSA) tool at EBI.
47. ##### MAFFT parallel
• Description : A parallelized version of the MAFFT multiple sequence alignment (MSA) tool. The parallelization is based on the POSIX Threads library and offers two approaches in the alignment refinement stage: best-first and simple hill-climbing.
48. ##### Malakite
• Description : A web-based tool, Malakite (Multiple Alignment Automatic Kinship Tiling Engine), is for analysis of aligned blocks in multiple protein sequence alignments.
49. ##### MARS
• Description : MARS is a multiple sequence alignment (MSA) tool specifically designed for the alignment of sequences from circular genomes, such as mitochondria and viral genome sequences.
50. ##### mbed
• Description : A command-line-based tool for computation of guide trees for multiple sequence alignments (MSAs).
51. ##### MSA at BYU
• Description : Enhanced multiple sequence alignment (MSA) software at Brigham Young University, Computer Science Department. This software uses hardware acceleration: GPU, FPGA, and Cell BE.
52. ##### msa-edna
• Description : EDNA (Energy Based Multiple Sequence Alignment) is a multiple sequence alignment (MSA) program for aligning transcription factor binding site sequences (TFBSs). The novelty of this software is the scoring using a thermodynamically generated null hypothesis. The method is well suited for aligning sequences that are often not related. Alternative names: Energy Based Multiple Sequence Alignment, EDNA
• Description : A web-based tool for multiple sequence alignment (MSA) of DNA. The algorithm uses PFAM or profiles provided by a user. The web interface requires registration and login.
54. ##### MSACompro
• Description : MSACompro is a tool that integrates predicted secondary structure, residue contact information, and relative solvent accessibility into a posterior probability for multiple sequence alignment (MSA) software, such as MSAProbs, ProbCons, Probalign, T-Coffee, MAFFT, and MUSCLE.
55. ##### MSAprobs
• Description : A tool for multiple sequence alignment (MSA) of protein sequences. Features: a combination of hidden Markov models and partition functions, a weighted probabilistic consistency transformation, and weighted profile-to-profile alignments. The Authors claim MSAprobs to have better accuracy than ClustalW, MAFFT, MUSCLE, ProbCons, and Probalign. A multicore version is available. See "LINKS."
57. ##### MSARC
• Description : MSARC is a multiple sequence alignment (MSA) tool that constructs alignments without guide trees. The Authors claim that, in tests on the BAliBASE benchmark, their method outperforms other methods on "sequence sets whose evolutionary distances are difficult to represent by a phylogenetic tree."
58. ##### Multi-LAGAN
• Description : Multi-LAGAN is a tool for multiple global alignments of genomic sequences. It is a part of the LAGAN Toolkit and is based on the CHAOS local alignment tool.
Alternative name: MLAGAN
59. ##### Mumsa
• Description : Mumsa is a tool for the assessment of the quality of multiple sequence alignments (MSAs).
60. ##### MUSCLE
• Description : A program to create multiple sequence alignments of a large number of sequences. Prominent features are rapid sequence distance computation using k-mer counting, a profile function that computes log-expectation scores, and tree-dependent partitioning of the sequences.
61. ##### MUSCLE (BioConductor)
• Description : An R package of multiple sequence alignment with MUSCLE.
62. ##### MUSCLE (EBI)
• Description : A web-based multiple sequence alignment with MUSCLE. RESTful and SOAP services are also available.
63. ##### Muscle WS (jabaws)
• Description : MUSCLE multiple sequence alignment program at JABAWS web-services. You can access JABAWS through Jalview or a command-line user interface, or install JABAWS and run it on your own computer.
64. ##### Mustguseal
• Description : A web application for multiple sequence alignments of protein families. The application constructs the alignments based on structural and other information in public databases.
65. ##### MView (EBI)
• Description : MView is a tool for reformatting multiple alignments or the results of BLAST, FASTA, and other database searches by adding optional HTML markup for coloring and web page layout.
66. ##### OD-seq
• Description : OD-seq is a software program for detecting outliers in multiple sequence alignments (MSA). It works by finding sequences with an inconsistent average distance to sequences present in the multiple alignment.
67. ##### OD-seq (bioconductor)
• Description : OD-seq is a software program for detecting outliers in multiple sequence alignments (MSA). It works by finding sequences with an inconsistent average distance to sequences present in the multiple alignment. Requirement: R >= 3.2.3.
68. ##### OPAL
• Description : A tool for multiple sequence alignment (MSA) using "form-and-polish strategy." The Authors claim OPAL to be more accurate than Muscle and similar to Muscle on protein sequence alignment and have similar accuracy as MAFFT and Muscle on DNA sequence alignments.
69. ##### OXBench
• Description : OXBench consists of a set of programs to perform accuracy assessment of multiple sequence alignment methods, aimed at software developers.
70. ##### PASTA
• Description : PASTA (Practical Alignment using SATé and TrAnsitivity) is a multiple sequence alignment tool that uses a guide tree.
71. ##### PnpProbs
• Description : PnpProbs is a multiple sequence alignment (MSA) tool. It operates by assigning sequences into two, distantly and "normally" related, groups and uses a guide tree solely for "normally" related sequences. For distantly related sequences, it applies a non-progressive approach to generate a multiple sequence alignment (MSA).
72. ##### PRALINE
• Description : PRALINE is a multiple sequence alignment program that provides several different alignment strategies, e.g., integration of structural information in the alignment process. It also provides a comprehensive visualization of the multiple sequence alignments. A SOAP service is available.
73. ##### PRANK
• Description : Creating multiple alignments that represent structural homology and alignments that represent evolutionary homology requires separate approaches. PRANK is designed to construct multiple alignments that reflect evolutionary homology, and it uses phylogenetic information to handle insertions and deletions.
74. ##### PRANK API
• Description : A web API at EBI for PRANK, a multiple sequence alignment (MSA) tool for nucleic acid and amino acid sequences. The core algorithm differs from 'traditional' ones by avoiding overestimation of insertion/deletion events and by accounting for the evolutionary distance between the sequences. Available upon request from Ari Löytynoja.
75. ##### Pro-Coffee
• Description : Pro-Coffee is a part of the T-Coffee package and is implemented for multiple sequence alignment of promoter regions.
76. ##### Probalign
• Description : Probalign is a multiple sequence alignment (MSA) software that uses a partition function to estimate posterior alignment probabilities. The Authors claim Probalign to be more accurate than Probcons, MAFFT, and MUSCLE.
77. ##### ProbCons
• Description : Probabilistic Consistency-based Multiple Alignment of Amino Acid Sequences. It uses probabilistic modeling and consistency-based techniques in the alignment construction. The Authors claim this tool to have improved alignments compared to T-Coffee, Clustal W, and Dialign.
78. ##### ProDA
• Description : ProDA is a tool that constructs local multiple sequence alignments (MSAs) by first identifying repeated homologous regions in a collection of protein sequences.
79. ##### PROMALS3D
• Description : PROMALS (Profile Multiple Alignment with Local Structure) is a web-based tool for the construction of multiple sequence alignments (MSA). It searches both sequence and structure databases and uses that information together with user-defined constraints.
80. ##### PSAlign
• Description : PSAlign is a multiple sequence alignment tool. The algorithm constructs pairwise sequence alignments that are represented as a graph and finds the shortest path to create a multiple sequence alignment without heuristics.
81. ##### PSAR-align
• Description : A tool for improving multiple sequence alignments using probabilistic sampling.
82. ##### PSI-Coffee
• Description : PSI-Coffee is a part of the T-Coffee distribution and is specifically designed for making multiple sequence alignments (MSAs) of alpha-helical transmembrane protein sequences. The Authors claim PSI-Coffee to be more accurate than MSAProbs, Kalign, PROMALS, MAFFT, ProbCons, and PRALINE.
83. ##### PVS
• Description : The Protein Variability Server (PVS) is a web server that uses several variability metrics to calculate the sequence variability within a multiple protein sequence alignment. The tool can map the sequence variability to a supplied 3D structure, plot the variability, mask the variability in a sequence, predict T-cell epitopes, locate conserved sequences in 3D structures, and return conserved sequence fragments.
84. ##### QOMA
• Description : Multiple sequence alignment of protein sequences using a k-partite graph. The algorithm is independent of the order of sequences.
85. ##### QuickAlign
• Description : A tool for the editing of sequence alignments, and the making of multiple sequence alignments (MSAs) with ClustalX.
86. ##### QuickProbs
• Description : A multiple protein sequence alignment tool based on probabilistic models. It employs column-oriented refinement and selective consistency in alignment refinement instead of the strategies commonly used to increase alignment quality. The Authors claim Quick-Probs 2 to be "noticeably" better than ClustalΩ and MAFFT.
87. ##### R-Coffee
• Description : R-Coffee is a package for multiple RNA sequence alignments, derived from the T-Coffee package. A special version of T-Coffee constructs the multiple sequence alignments by incorporating structural information. Requirements: RNAplfold from the Vienna package, Mafft, Muscle, ProbCons, and ConSan.
88. ##### R3D-2-MSA
• Description : The R3D-2-MSA (RNA 3D Structure-to-Multiple Sequence Alignment) is a web-based tool for linking 3D structures to multiple RNA sequence alignments.
89. ##### SARA-Coffee
• Description : A web server for multiple sequence alignments (MSAs) of RNA sequences based on 3D structures. SARA combines pair-wise structural alignments with R-Coffee multiple RNA alignments. It also allows alignment without 3D structures.
90. ##### showalign
• Description : Display a multiple sequence alignment in pretty format.
91. ##### SINA
• Description : SINA (SILVA Incremental Aligner) is a web tool for multiple sequence alignment (MSA) specifically designed for the multiple alignment of ribosomal RNA genes (rRNA). SINA is also able to taxonomically classify the sequences.
92. ##### STACCATO
• Description : STACCATO is a multiple sequence alignment (MSA) tool that combines the use of three-dimensional structure alignment probabilities and standard amino acid substitution probabilities. Available from the Authors.
93. ##### SuiteMSA
• Description : A Java-based application that provides unique MSA viewers. Users can directly compare multiple MSAs and evaluate where the MSAs agree (are consistent) or disagree (are inconsistent).
94. ##### T-Coffee
• Description : T-Coffee is a multiple sequence alignment (MSA) program. It preprocesses the data by making pair-wise alignments between all sequences and this information is incorporated in the progressive alignment procedure. The structural sequence information may be obtained from various different sources. It can align amino acid and nucleotide sequences.
95. ##### T-Coffee (CRG)
• Description : A web server at The Centre for Genomic Regulation (CRG) provides T-Coffee tools for assessing and handling multiple sequence alignments (MSAs) of nucleotide and amino acid sequences and related structures. The web server provides the following methods: M-Coffee, R-Coffee, Expresso, PSI-Coffee, and iRMSD-APDB.
96. ##### T-Coffee (EBI)
• Description : T-Coffee is a multiple sequence alignment (MSA) program. The web user interface and web services interfaces at EBI provide Simple Object Access Protocol (SOAP), Representational State Transfer (REST), Open API Interface, and Common Workflow Language (CWL) services. The T-Coffee package pre-processes the data by making pair-wise alignments between all sequences, and this information is incorporated in the progressive alignment procedure. The structural sequence information may be obtained from various sources. It can align amino acid and nucleotide sequences. The package combines several alignment methods.
97. ##### TM-Aligner
• Description : A web-based tool to align transmembrane proteins using the Wu-Manber string matching algorithm. The tool can visualize multiple sequence alignments in varied color schemes.
98. ##### TM-Coffee
• Description : A web-based tool for PSI/TM-Coffee at The Centre for Genomic Regulation (CRG). The tool is specifically designed to construct multiple sequence alignments (MSAs) of transmembrane proteins. It can use transmembrane databases for fast extension of homology.
99. ##### trimAl
• Description : Tool for the removal of poorly aligned sequences from multiple sequence alignments. It can automatically detect and select various parameters to optimize the signal-to-noise ratio.
100. ##### UniProt Align
• Description : A web interface at UniProt for multiple sequence alignment using Clustal Omega.
101. ##### VerAlign
• Description : VerAlign is a web-based tool to compare two multiple sequence alignments (MSAs). It uses SPdist scoring scheme which measures a distance between mismatched amino acid pairs. Available from the Authors upon request.
102. ##### webPRANK
• Description : webPRANK is a multiple sequence alignment (MSA) tool for DNA, protein, cDNA, and codon sequences at Goldman Group (EBI). It has structure models built-in and includes a web-based visualization of multiple alignments.
