CMSA

CMSA performs multiple sequence alignment of large-scale RNA and DNA datasets using a heterogeneous CPU/GPU co-run computing model to improve alignment efficiency and scalability.


Key Features:

  • Heterogeneous Computing: Leverages CPUs and GPUs via a co-run computation model to enable simultaneous workload processing and maximize resource utilization.
  • Automatic Optimization: Automatically performs and optimizes MSA without requiring prior knowledge of dataset size or sequence length.
  • Improved Center Star Strategy: Implements an optimized center star strategy that reduces center sequence selection time complexity from O(mn^2) to O(mn) for aligning similar sequences.
  • Bitmap-Based Algorithm: Employs a bitmap-based algorithm tailored for multiple similar RNA/DNA sequence alignment to enhance efficiency of the center star strategy.
  • Performance: Demonstrates experimental speedups of up to 11× compared to existing MSA software.
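
The linear-time center sequence selection named in the features above can be illustrated with a minimal C++ sketch. It is an illustration under stated assumptions, not CMSA's actual code: it packs fixed-length segments into 2-bit codes, counts how often each segment occurs across all sequences, and picks the sequence whose segments are most common overall. The names `encodeBase`, `segmentCodes`, and `pickCenter`, the segment length `k`, and the plain count table used here in place of CMSA's bitmap structure are all hypothetical choices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: map a nucleotide to a 2-bit code
// (A=0, C=1, G=2, T/U=3). Returns -1 for any other character.
inline int encodeBase(char c) {
    switch (c) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': case 'U': case 'u': return 3;
        default: return -1;
    }
}

// Pack each non-overlapping segment of length k into an integer code.
// Segments containing an unrecognized character are skipped.
std::vector<std::uint32_t> segmentCodes(const std::string& seq, std::size_t k) {
    assert(k >= 1 && k <= 8);  // assumption: keeps the table at 4^8 entries or fewer
    std::vector<std::uint32_t> codes;
    for (std::size_t start = 0; start + k <= seq.size(); start += k) {
        std::uint32_t code = 0;
        bool valid = true;
        for (std::size_t i = 0; i < k; ++i) {
            const int b = encodeBase(seq[start + i]);
            if (b < 0) { valid = false; break; }
            code = (code << 2) | static_cast<std::uint32_t>(b);
        }
        if (valid) codes.push_back(code);
    }
    return codes;
}

// Return the index of the sequence whose segments occur most often
// across the whole set. Ties go to the earliest sequence.
std::size_t pickCenter(const std::vector<std::string>& seqs, std::size_t k) {
    assert(!seqs.empty());
    assert(k >= 1 && k <= 8);
    std::vector<std::uint32_t> counts(std::size_t{1} << (2 * k), 0);
    std::vector<std::vector<std::uint32_t>> all;
    all.reserve(seqs.size());
    // Pass 1: count every segment occurrence across all sequences.
    for (const auto& s : seqs) {
        all.push_back(segmentCodes(s, k));
        for (auto c : all.back()) ++counts[c];
    }
    // Pass 2: score each sequence by the total count of its segments.
    std::size_t best = 0;
    std::uint64_t bestScore = 0;
    for (std::size_t i = 0; i < all.size(); ++i) {
        std::uint64_t score = 0;
        for (auto c : all[i]) score += counts[c];
        if (score > bestScore) { bestScore = score; best = i; }
    }
    return best;
}
```

Each sequence is scanned once to fill the table and once to be scored, so the work in this sketch grows linearly with the total input length instead of requiring an all-pairs comparison.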

Scientific Applications:

  • Large-scale MSA research: Enables processing of extensive RNA and DNA sequence datasets for high-throughput alignment tasks.
  • Evolutionary biology studies: Facilitates large-scale sequence comparisons to investigate evolutionary relationships.
  • Phylogenetic analysis: Supports construction and refinement of phylogenies by providing efficient alignment of many homologous sequences.
  • Functional genomics: Provides rapid alignments to aid downstream functional inference and comparative analyses.

Methodology:

CMSA uses a co-run computation model to balance workloads between the CPU and GPU, applies an optimized center star strategy that reduces the complexity of center sequence selection from O(mn^2) to O(mn), and incorporates a bitmap-based algorithm to accelerate the alignment of similar RNA/DNA sequences.
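
One simple way to picture the workload balancing is a proportional split of alignment tasks by device throughput. The sketch below is an assumption made for illustration, not CMSA's actual scheduler: `splitWorkload` and its parameters `cpuRate` and `gpuRate` (tasks per second, measured however the caller sees fit) are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: divide `total` alignment tasks between CPU and GPU
// in proportion to their measured throughput. Returns {cpuTasks, gpuTasks}.
std::pair<std::size_t, std::size_t> splitWorkload(std::size_t total,
                                                  double cpuRate,
                                                  double gpuRate) {
    assert(cpuRate >= 0.0 && gpuRate >= 0.0);
    assert(cpuRate + gpuRate > 0.0);  // at least one device must make progress
    const double cpuShare = cpuRate / (cpuRate + gpuRate);
    // Round the CPU portion down; the GPU takes the remainder.
    std::size_t cpuTasks =
        static_cast<std::size_t>(static_cast<double>(total) * cpuShare);
    if (cpuTasks > total) cpuTasks = total;  // guard against rounding overshoot
    return {cpuTasks, total - cpuTasks};
}
```

With this split, both devices would finish at roughly the same time if the measured rates hold, which is the goal of running them concurrently instead of one after the other.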

Details

Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: C++
Added: 5/17/2018
Last Updated: 12/10/2018

Publications

Chen X, Wang C, Tang S, Yu C, Zou Q. CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment. BMC Bioinformatics. 2017;18(1). doi:10.1186/s12859-017-1725-6. PMID:28646874. PMCID:PMC5483318.

Funding:

  • National Natural Science Foundation of China: 11573019, 61602336
  • Joint Research Fund in Astronomy: U1531111
