CUDA-EC
CUDA-EC corrects sequencing errors in high-throughput short-read DNA data so that the corrected reads can be used for de novo genome assembly and downstream genomic analyses.
Key Features:
- Scalable parallel algorithm: Uses the Compute Unified Device Architecture (CUDA) programming model to parallelize error correction across large short-read data sets.
- Spectral alignment: Identifies and corrects errors by comparing the k-mers of each read against a spectrum of trusted k-mers.
- CUDA texture memory utilization: Leverages CUDA texture memory to improve computational efficiency during error correction.
- Space-efficient Bloom filter: Incorporates a space-efficient Bloom filter data structure for spectrum membership queries.
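As an illustration of the spectrum membership queries mentioned above, the following is a minimal host-side sketch of a Bloom filter over k-mer strings in C. It is not the CUDA-EC implementation: the type `bloom_t`, the functions `bloom_init`, `bloom_add`, `bloom_query`, and `bloom_free`, the FNV-1a hash, and the double-hashing scheme are all assumptions chosen for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical Bloom filter (illustrative names, not CUDA-EC's API). */
typedef struct {
    uint8_t *bits;   /* bit array, nbits bits long */
    size_t   nbits;  /* number of bits in the filter */
    unsigned nhash;  /* number of hash probes per k-mer */
} bloom_t;

/* FNV-1a over the k-mer bytes; the seed gives a second, different hash. */
static uint64_t fnv1a(const char *s, size_t len, uint64_t seed) {
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Allocate a zeroed bit array. Returns 0 on success, -1 on failure. */
int bloom_init(bloom_t *b, size_t nbits, unsigned nhash) {
    assert(nbits > 0 && nhash > 0);
    b->bits = (uint8_t *)calloc((nbits + 7) / 8, 1);
    b->nbits = nbits;
    b->nhash = nhash;
    return b->bits ? 0 : -1;
}

/* Position of the i-th probe, derived from two hashes (double hashing). */
static size_t probe(const bloom_t *b, const char *kmer, size_t k, unsigned i) {
    uint64_t h1 = fnv1a(kmer, k, 0);
    uint64_t h2 = fnv1a(kmer, k, 0x9e3779b97f4a7c15ULL) | 1ULL;
    return (size_t)((h1 + (uint64_t)i * h2) % b->nbits);
}

/* Insert a k-mer by setting all of its probe bits. */
void bloom_add(bloom_t *b, const char *kmer, size_t k) {
    for (unsigned i = 0; i < b->nhash; i++) {
        size_t pos = probe(b, kmer, k, i);
        b->bits[pos / 8] |= (uint8_t)(1u << (pos % 8));
    }
}

/* Returns 1 if the k-mer may be in the spectrum, 0 if it is definitely not. */
int bloom_query(const bloom_t *b, const char *kmer, size_t k) {
    for (unsigned i = 0; i < b->nhash; i++) {
        size_t pos = probe(b, kmer, k, i);
        if (!(b->bits[pos / 8] & (1u << (pos % 8))))
            return 0;
    }
    return 1;
}

void bloom_free(bloom_t *b) {
    free(b->bits);
    b->bits = NULL;
}
```

A caller would insert every trusted k-mer with `bloom_add` and then test read k-mers with `bloom_query`. The filter can report false positives but never false negatives, which is the trade-off that makes it space-efficient.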
Scientific Applications:
- Graph-based short-read assembly support: Provides corrected reads to graph-based short-read assembly tools to improve the accuracy of de novo genome assembly.
- Illumina sequencing data processing: Applicable to real and simulated Illumina sequencing data across varying read lengths, error rates, and input sizes.
Methodology:
CUDA-EC implements a scalable parallel algorithm based on the CUDA programming model and spectral alignment. It uses CUDA texture memory and a space-efficient Bloom filter for spectrum membership queries. The tool was tested on real and simulated Illumina data sets. Reported speedups are 12-84× for the parallelized error correction and 3-63× for sequential preprocessing plus parallelized error correction, compared with the Euler-SR program.
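To make the idea of spectral alignment concrete, here is a deliberately simplified, sequential C sketch: a read is considered consistent when all of its k-mers occur in a spectrum of trusted k-mers, and a read with missing k-mers is repaired by trying single-base substitutions. This is an assumption-laden illustration rather than the algorithm used by CUDA-EC: the names `spectrum_t`, `count_weak_kmers`, and `correct_single_error` are hypothetical, the spectrum is a plain array with linear lookup instead of a Bloom filter, and the first substitution that fixes the read is accepted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical spectrum: a plain array of trusted k-mers (linear lookup). */
typedef struct {
    const char *const *kmers;  /* trusted k-mers, each at least k chars long */
    size_t count;              /* number of trusted k-mers */
    size_t k;                  /* k-mer length */
} spectrum_t;

/* Returns 1 if the k characters starting at kmer are in the spectrum. */
static int in_spectrum(const spectrum_t *s, const char *kmer) {
    for (size_t i = 0; i < s->count; i++)
        if (strncmp(s->kmers[i], kmer, s->k) == 0)
            return 1;
    return 0;
}

/* Count the k-mers of the read that are missing from the spectrum. */
size_t count_weak_kmers(const spectrum_t *s, const char *read, size_t len) {
    assert(s->k > 0);
    size_t weak = 0;
    for (size_t i = 0; i + s->k <= len; i++)
        if (!in_spectrum(s, read + i))
            weak++;
    return weak;
}

/*
 * Try every single-base substitution and keep the first one that leaves
 * no missing k-mers. Returns 1 if the read was corrected in place,
 * 0 if it was already consistent or no single substitution fixes it.
 */
int correct_single_error(const spectrum_t *s, char *read, size_t len) {
    static const char bases[] = "ACGT";
    if (count_weak_kmers(s, read, len) == 0)
        return 0;
    for (size_t pos = 0; pos < len; pos++) {
        char original = read[pos];
        for (int b = 0; b < 4; b++) {
            if (bases[b] == original)
                continue;
            read[pos] = bases[b];
            if (count_weak_kmers(s, read, len) == 0)
                return 1;  /* keep the substitution */
        }
        read[pos] = original;  /* no base worked here; undo */
    }
    return 0;
}
```

For example, with the 4-mers of `ACGTTGCA` as the spectrum, the read `ACGTAGCA` has four missing k-mers, and substituting `T` at index 4 restores it. Each read is processed independently of the others, which is the property that makes this kind of correction a natural fit for one-thread-per-read parallelization.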
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Windows, Mac
- Programming Languages: C
- Added: 1/13/2017
- Last Updated: 11/25/2024
Publications
Shi H, Schmidt B, Liu W, Müller-Wittig W. A Parallel Algorithm for Error Correction in High-Throughput Short-Read Data on CUDA-Enabled Graphics Hardware. Journal of Computational Biology. 2010;17(4):603-615. doi:10.1089/cmb.2009.0062. PMID:20426693.