In bioinformatics, multiple sequence alignment means an alignment of more than two DNA, RNA, or protein sequences and is one of the oldest problems in computational biology.
One commonly used strategy is to minimize the number of mismatches, insertions, and deletions in the alignment, and we can use a dynamic programming (DP) algorithm to compute an optimal alignment.
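As a minimal sketch of this strategy for two sequences, the function below fills a DP table with the smallest number of mismatches, insertions, and deletions needed to align each pair of prefixes. The function name `alignment_cost` and the unit cost of 1 per mismatch or gap are assumptions made for illustration; real alignment tools typically use substitution matrices and separate gap penalties.

```python
def alignment_cost(a: str, b: str) -> int:
    """Minimum number of mismatches, insertions, and deletions
    needed to align sequence a with sequence b (unit costs assumed)."""
    rows, cols = len(a) + 1, len(b) + 1
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i  # a[:i] aligned against gaps only
    for j in range(1, cols):
        cost[0][j] = j  # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            # Align a[i-1] with b[j-1]; costs 1 only if they differ.
            substitution = cost[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            # Align a[i-1] with a gap in b.
            deletion = cost[i - 1][j] + 1
            # Align b[j-1] with a gap in a.
            insertion = cost[i][j - 1] + 1
            cost[i][j] = min(substitution, deletion, insertion)
    return cost[-1][-1]


print(alignment_cost("GATTACA", "GATCACA"))  # one mismatch
print(alignment_cost("ACGT", "AGT"))         # one deletion
```

The two nested loops visit every cell of a table whose size is the product of the two sequence lengths, which is where the quadratic cost of pairwise alignment comes from.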
Unfortunately, the DP algorithm is computationally feasible only for a small number of sequences, so in practice it is used only to compute pairwise alignments. The computational complexity of a pairwise alignment is $O(n^2)$, where $n$ is the sequence length, so an optimal alignment can still be computed, although it becomes expensive for long sequences. See our online tool that computes the number of possible alignments between two sequences.
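To give a feel for how quickly the number of possible pairwise alignments grows, here is a small sketch that counts them. It assumes that every alignment column pairs two residues or pairs one residue with a gap, and that alignments differing only in the order of adjacent gaps count as distinct; under those assumptions the count is a Delannoy number. The function name `count_alignments` is hypothetical, and the online tool mentioned above may use a different definition.

```python
def count_alignments(m: int, n: int) -> int:
    """Number of global alignments of two sequences of lengths m and n,
    counted as described in the assumptions above."""
    # row[j] = number of alignments of a prefix of length i with one of length j.
    row = [1] * (n + 1)  # i = 0: only gaps can be placed, so one alignment
    for _ in range(m):
        new_row = [1]  # j = 0: only gaps can be placed
        for j in range(1, n + 1):
            # Last column is a gap in the second sequence, a gap in the
            # first sequence, or a residue pair, respectively.
            new_row.append(row[j] + new_row[j - 1] + row[j - 1])
        row = new_row
    return row[n]


for length in (1, 2, 3, 10, 100):
    print(length, count_alignments(length, length))
```

DP avoids enumerating all of these alignments because it reuses the best result for each pair of prefixes instead of scoring every alignment separately.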
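Exact DP does not scale to many sequences: its computational complexity is $O(2^k n^k)$, where $k$ is the number of sequences and $n$ is their length. For example, aligning eight DNA sequences of 100 bases each requires about $2^8 \times 100^8 \approx 3 \times 10^{18}$ operations. At one operation per second, that is about $3 \times 10^{18}$ seconds, several times the estimated age of the universe (about $4 \times 10^{17}$ seconds). To construct multiple sequence alignments, we therefore need to use heuristic methods.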
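The estimate above can be checked with a few lines of arithmetic. The sketch assumes one elementary operation per second and a commonly cited age of the universe of about 13.8 billion years; both values, and the function name `msa_dp_operations`, are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9  # commonly cited estimate (assumption)


def msa_dp_operations(k: int, n: int) -> int:
    """Operation count 2^k * n^k for exact DP on k sequences of length n."""
    return (2 ** k) * (n ** k)


ops = msa_dp_operations(8, 100)
years = ops / SECONDS_PER_YEAR  # assuming one operation per second
ratio = years / AGE_OF_UNIVERSE_YEARS

print(f"operations: {ops:.2e}")
print(f"years at one operation per second: {years:.2e}")
print(f"multiples of the age of the universe: {ratio:.1f}")
```

A faster machine shrinks the constant but not the exponent: each additional sequence multiplies the work by a factor of $2n$, which is why exact DP stops being practical after a handful of sequences.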
Multiple sequence alignments serve many purposes, including sequence comparison, assessment of data quality, prediction of protein and RNA structures, database searching, and phylogenetic analysis. For this reason, different methods are used depending on the purpose. We will have a more in-depth treatment of this topic in our upcoming tutorial.