MUSCLE
MUSCLE produces multiple sequence alignments of protein sequences for comparative sequence analysis, evolutionary studies, and protein structure prediction.
Key Features:
- Fast distance estimation: Utilizes k-mer counting to rapidly estimate distances between sequences.
- Progressive alignment scoring: Uses a log-expectation profile score for progressive alignment to improve accuracy.
- Refinement: Incorporates tree-dependent restricted partitioning during refinement stages to improve alignment quality.
- Benchmark performance: Tested against T-Coffee, MAFFT, CLUSTALW, Progressive POA, and the MAFFT FFTNS1 script, achieving the highest or joint-highest accuracy on BAliBASE, SABmark, SMART, and PREFAB.
- Speed: MUSCLE aligns 5,000 sequences of average length 350 in ~7 minutes, and the MUSCLE-fast variant aligns 1,000 sequences of average length 282 in ~21 seconds.
- Variants: Offers MUSCLE (default) for highest accuracy, MUSCLE-fast for high-throughput use, and MUSCLE-prog as a compromise between speed and accuracy.
- Objective-function evaluation protocol: Implements a protocol for evaluating objective functions when aligning two profiles.
- Unpublished algorithmic techniques: Includes additional unpublished techniques aimed at improving biological accuracy and computational efficiency.
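
The k-mer distance estimate listed under Key Features can be illustrated with a short sketch. It assumes one simple definition: the fraction of k-mers two sequences share, normalized by the number of k-mers in the shorter sequence, then subtracted from 1. The function names (`kmer_counts`, `kmer_distance`) and the default `k=3` are hypothetical choices for illustration. This is not MUSCLE's own code, and it does not reproduce any alphabet compression or distance correction the tool may apply.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(x: str, y: str, k: int = 3) -> float:
    """Alignment-free distance: 1 minus the fraction of shared k-mers.

    Shared k-mers are counted with multiplicity (multiset intersection)
    and normalized by the k-mer count of the shorter sequence.
    """
    n_kmers = min(len(x), len(y)) - k + 1
    if n_kmers <= 0:
        raise ValueError("both sequences must have length >= k")
    shared = sum((kmer_counts(x, k) & kmer_counts(y, k)).values())
    return 1.0 - shared / n_kmers


if __name__ == "__main__":
    # Three of the four 3-mers are shared, so the distance is 0.25.
    print(kmer_distance("ACDEFG", "ACDEFH"))
```

Because no alignment is computed, a full distance matrix for many sequences can be filled quickly and then used to build a guide tree.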
Scientific Applications:
- Multiple sequence alignment tasks: Produces precise alignments for comparative sequence analysis and downstream analyses.
- Genomic studies: Handles large sequence datasets for genomic-scale analyses.
- Evolutionary biology: Supports evolutionary and phylogenetic analyses through accurate alignments.
- Protein structure prediction: Provides alignments that aid protein structure prediction and comparative modeling.
Methodology:
MUSCLE uses k-mer counting for rapid distance estimation, progressive alignment with a log-expectation profile score, tree-dependent restricted partitioning for refinement, and a protocol for evaluating objective functions when aligning two profiles.
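
As a rough illustration of the log-expectation profile score, the sketch below scores one pair of profile columns. It assumes the form given in the 2004 Nucleic Acids Research paper: the log of the expected ratio p_ij / (p_i p_j) over the residue frequencies of the two columns, scaled by each column's non-gap fraction. The two-letter alphabet, the probability values, and the function name `log_expectation` are hypothetical toy choices, not MUSCLE's actual parameters or code.

```python
import math


def log_expectation(fx, fy, gap_x, gap_y, joint, background):
    """Log-expectation score for two profile columns.

    fx, fy       -- dicts mapping residue -> observed frequency in each column
    gap_x, gap_y -- observed gap fraction in each column
    joint        -- dict mapping (i, j) -> joint probability p_ij
    background   -- dict mapping residue -> background probability p_i
    """
    expectation = 0.0
    for i, fi in fx.items():
        for j, fj in fy.items():
            expectation += fi * fj * joint[(i, j)] / (background[i] * background[j])
    if expectation <= 0.0:
        raise ValueError("columns must contain at least one residue")
    # Gap-rich columns are down-weighted by their non-gap fractions.
    return (1.0 - gap_x) * (1.0 - gap_y) * math.log(expectation)


# Toy two-letter alphabet (assumption, for illustration only).
BACKGROUND = {"A": 0.5, "B": 0.5}
JOINT = {("A", "A"): 0.4, ("B", "B"): 0.4, ("A", "B"): 0.1, ("B", "A"): 0.1}

if __name__ == "__main__":
    match = log_expectation({"A": 1.0}, {"A": 1.0}, 0.0, 0.0, JOINT, BACKGROUND)
    mismatch = log_expectation({"A": 1.0}, {"B": 1.0}, 0.0, 0.0, JOINT, BACKGROUND)
    print(match > 0 > mismatch)
```

In this toy setup, columns dominated by the same residue score above zero, columns dominated by different residues score below zero, and a column that is half gaps contributes half the score it otherwise would.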
Details
- License: Other
- Maturity: Mature
- Cost: Free of charge
- Tool Type: API, command-line tool
- Operating Systems: Linux, Windows, Mac
- Added: 1/17/2017
- Last Updated: 11/24/2024
Operations
Data Inputs & Outputs
Multiple sequence alignment
Publications
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research. 2004;32(5):1792-1797. doi:10.1093/nar/gkh340. PMID:15034147. PMCID:PMC390337.
Mareuil F, Doppelt-Azeroual O, Ménager H. A public Galaxy platform at Pasteur used as an execution engine for web services. F1000Research. 2017. doi:10.7490/f1000research.1114334.1.
Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5(1). doi:10.1186/1471-2105-5-113. PMID:15318951. PMCID:PMC517706.
Edgar RC. High-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny. bioRxiv (preprint). 2021. doi:10.1101/2021.06.20.449169.
Documentation
Downloads
- API specification: https://www.ebi.ac.uk/seqdb/confluence/display/WEBSERVICES/muscle_rest (EBI MUSCLE Web Service)
- Downloads page: https://www.drive5.com/muscle/downloads.htm
- Source code: http://bioconductor/packages/release/bioc/src/contrib/muscle_3.16.0.tar.gz (BioConductor package)