AlignGraph2

AlignGraph2 refines and extends contigs assembled from PacBio long reads by aligning the contigs and reads to a closely related reference genome, improving assembly contiguity and accuracy.


Key Features:

  • Genome-Assisted Reassembly: Employs a genome-assisted reassembly approach that aligns contigs and reads to a similar genome to extend and refine assemblies.
  • Novel Algorithms: Includes the Similarity-Aware Alignment Algorithm, Alignment Filtration Algorithm, Reassembly Algorithm, and Weight-Adjusted Consensus Algorithm to improve alignment, filtration, reassembly, and consensus accuracy.
  • Long-Read Compatibility: Supports PacBio error-prone and HiFi (High-Fidelity) long reads for assembly refinement.
  • Performance Improvements: In the reported comparisons, aligns 5.7% to 27.2% more long reads and 7.3% to 56.0% more bases, extends 8.7% to 94.7% of aligned contigs, increases N50 by 7.0% to 249.6%, and reduces indels per 100 kbp by 5.2% to 87.7%.
  • Stability Across Genomic Similarities: Maintains stable performance as the similarity between the reference genome and the target genome decreases.
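
The N50 figures above are easier to interpret with the metric spelled out. The following is a minimal sketch of the standard N50 definition and of a relative increase; the contig lengths are hypothetical illustration values, not AlignGraph2 results, and the function names are assumptions of this example rather than part of the tool.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Hypothetical contig lengths (bp) before and after reassembly.
contigs_before = [100, 80, 60, 40, 20]
contigs_after = [180, 100, 20]

n50_before = n50(contigs_before)  # 80
n50_after = n50(contigs_after)    # 180
print(n50_before, n50_after, percent_increase(n50_before, n50_after))
```

In this toy case, merging contigs raises N50 from 80 to 180 without changing the total assembly length, a 125% increase.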

Scientific Applications:

  • De novo genome assembly: Refines and extends contigs assembled from PacBio long reads to improve contiguity and accuracy in de novo assembly projects.
  • Variant calling: Produces assemblies with higher N50 and lower indel rates to support more accurate variant calling.
  • Comparative genomics and functional annotation: Generates higher-quality assemblies suitable for comparative genomics and functional annotation analyses.

Methodology:

Aligns long reads and preassembled contigs to a similar reference genome, then extends and refines the assembly using four algorithms: the Similarity-Aware Alignment Algorithm, the Alignment Filtration Algorithm, the Reassembly Algorithm, and the Weight-Adjusted Consensus Algorithm.
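
This entry does not describe how the Weight-Adjusted Consensus Algorithm assigns weights or builds its consensus. As a generic illustration of the underlying idea only, the sketch below performs a column-wise weighted majority vote over sequences that are already aligned. The function name, the weights, and the input sequences are assumptions of this example; it is not AlignGraph2's implementation.

```python
from collections import defaultdict


def weighted_consensus(aligned_seqs, weights, gap="-"):
    """Column-wise weighted majority vote over pre-aligned sequences.

    Generic illustration only; not the AlignGraph2 algorithm.
    All sequences must have the same length, with `gap` marking gaps.
    """
    if len(aligned_seqs) != len(weights):
        raise ValueError("one weight per sequence is required")
    if len({len(s) for s in aligned_seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")

    consensus = []
    for column in zip(*aligned_seqs):
        votes = defaultdict(float)
        for symbol, weight in zip(column, weights):
            votes[symbol] += weight
        best = max(votes, key=votes.get)
        if best != gap:  # a winning gap means no base is emitted here
            consensus.append(best)
    return "".join(consensus)


# Hypothetical aligned reads; the third read is trusted more in one call.
reads = ["ACGT-A", "ACCT-A", "ACGTTA"]
print(weighted_consensus(reads, [1.0, 1.0, 1.0]))  # ACGTA
print(weighted_consensus(reads, [1.0, 1.0, 3.0]))  # ACGTTA
```

With equal weights the gap wins the fifth column and the base is dropped; giving the third read a higher weight keeps the inserted T, which shows how weighting can change indel calls in a consensus.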

Details

Tool Type: command-line tool
Programming Languages: C++, Python
Added: 3/19/2021
Last Updated: 4/11/2021

Publications

Huang S, He X, Wang G, Bao E. AlignGraph2: similar genome-assisted reassembly pipeline for PacBio long reads. Briefings in Bioinformatics. 2021;22(5). doi:10.1093/bib/bbab022. PMID:33621981.

Funding:

  • Beijing Natural Science Foundation: 4192044
  • Fundamental Research Funds for the Central Universities: 2019JBM073

Related Tools

aligngraph
Relation: isNewVersionOf