ParLECH

ParLECH (Parallel Long-read Error Correction using Hybrid methodology) is a software tool that corrects errors in long-read sequencing data using short-read sequencing data. It targets the higher error rates and costs associated with long-read sequencing technologies such as PacBio.

Key features and functionality of ParLECH:

1. Hybrid approach: ParLECH utilizes high-throughput Illumina short-read sequences to correct errors in PacBio long-read sequences.

2. Distributed error correction: The error correction algorithm is distributed, allowing for efficient processing of large-scale datasets.

3. De Bruijn graph construction: ParLECH builds a de Bruijn graph from the short reads and uses it to correct indel errors in the long reads. Each error region is replaced with the corresponding widest path in the graph, that is, the path whose minimum k-mer coverage is highest (the maximum min-coverage path).

4. K-mer coverage-based correction: The tool uses the k-mer coverage of the short reads to split each long read into a sequence of low-coverage and high-coverage regions. It then applies majority voting to correct each substituted base.

5. Scalability: ParLECH can handle terabytes of sequencing data using hundreds of computing nodes, making it suitable for large-scale datasets.
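
The widest-path idea in feature 3 can be illustrated with a small single-machine sketch. This is not ParLECH's distributed Java implementation. It assumes a toy de Bruijn graph whose nodes are k-mers carrying a short-read coverage count, and it uses a Dijkstra-style search that maximizes the smallest coverage along the path. The names `widest_path`, `spell`, `graph`, and `coverage` are hypothetical and chosen for illustration only.

```python
import heapq


def widest_path(graph, coverage, source, target):
    """Return (width, path) for the source-to-target path whose minimum
    k-mer coverage is largest, or (0, []) if target is unreachable."""
    best = {source: coverage[source]}   # best known width per node
    parent = {source: None}
    heap = [(-coverage[source], source)]  # max-heap via negated widths
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if width < best.get(node, 0):
            continue  # stale heap entry
        if node == target:
            break
        for nxt in graph.get(node, ()):
            # Width of the extended path is limited by its weakest k-mer.
            w = min(width, coverage[nxt])
            if w > best.get(nxt, 0):
                best[nxt] = w
                parent[nxt] = node
                heapq.heappush(heap, (-w, nxt))
    if target not in best:
        return 0, []
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return best[target], path


def spell(path):
    """Join consecutive k-mers (overlapping by k-1) into one sequence."""
    return path[0] + "".join(kmer[-1] for kmer in path[1:])


# Toy graph (assumed data): two branches between ACG and GTA.
graph = {"ACG": ["CGT", "CGG"], "CGT": ["GTA"], "CGG": ["GGT"], "GGT": ["GTA"]}
coverage = {"ACG": 30, "CGT": 2, "CGG": 25, "GGT": 27, "GTA": 28}

width, path = widest_path(graph, coverage, "ACG", "GTA")
print(width, spell(path))
```

In this toy graph the shorter branch passes through a k-mer with coverage 2, so the search prefers the longer branch whose weakest k-mer still has coverage 25. That is the sense in which a well-supported path replaces an error region even when it differs in length from the original, which is what makes the approach suitable for indels.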

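The k-mer coverage step in feature 4 can be sketched in the same spirit. The code below is a simplified illustration under stated assumptions, not the published algorithm: a k-mer counts as "solid" when its short-read count reaches a hypothetical `threshold`, and a base is chosen by counting how many of the k-mer windows covering it become solid for each candidate nucleotide. The function names and example reads are invented for illustration.

```python
from collections import Counter
from itertools import groupby


def count_kmers(reads, k):
    """Count every k-mer occurring in the short reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def coverage_regions(long_read, counts, k, threshold):
    """Group k-mer start positions of the long read into runs labelled
    'high' or 'low'; each run is (label, start, end) with end exclusive."""
    labels = ["high" if counts[long_read[i:i + k]] >= threshold else "low"
              for i in range(len(long_read) - k + 1)]
    regions, pos = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        regions.append((label, pos, pos + n))
        pos += n
    return regions


def vote_base(long_read, pos, counts, k, threshold):
    """Return the base at pos supported by the most solid k-mer windows;
    keep the original base if no candidate gets a vote."""
    votes = Counter()
    first = max(0, pos - k + 1)            # first window covering pos
    last = min(pos, len(long_read) - k)    # last window covering pos
    for base in "ACGT":
        candidate = long_read[:pos] + base + long_read[pos + 1:]
        for i in range(first, last + 1):
            if counts[candidate[i:i + k]] >= threshold:
                votes[base] += 1
    if not votes:
        return long_read[pos]
    return votes.most_common(1)[0][0]


# Assumed toy data: three identical short reads, and a long read with
# one substitution (A replaced by T at index 4).
counts = count_kmers(["ACGTACGGA"] * 3, k=4)
long_read = "ACGTTCGGA"

regions = coverage_regions(long_read, counts, k=4, threshold=2)
corrected = vote_base(long_read, 4, counts, k=4, threshold=2)
print(regions, corrected)
```

A single substituted base lowers the coverage of every k-mer window that overlaps it, which is why the error shows up as a contiguous low-coverage run flanked by high-coverage k-mers.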
Topic

Sequence assembly; Sequencing; Mapping

Detail

  • Operation: De-novo assembly; Sequencing error detection; k-mer counting

  • Software interface: Command-line interface

  • Language: Java

  • License: Not stated

  • Cost: Free of charge

  • Version name: -

  • Credit: NSF, NIH, LA Board of Regents, IBM.

  • Input: -

  • Output: -

  • Contact: Arghya Kusum Das dasa@uwplatt.edu

  • Collection: -

  • Maturity: -

Publications

  • A hybrid and scalable error correction algorithm for indel and substitution errors of long reads.
  • Das AK, et al. A hybrid and scalable error correction algorithm for indel and substitution errors of long reads. BMC Genomics. 2019; 20:948. doi: 10.1186/s12864-019-6286-9
  • https://doi.org/10.1186/S12864-019-6286-9
  • PMID: 31856721
  • PMC: PMC6923905

Download and documentation

