AlignGraph
AlignGraph improves de novo genome assemblies by using paired-end (PE) reads, preassembled contigs or scaffolds and alignments to closely related reference genomes to extend and join fragmented assemblies.
Key Features:
- Guided Assembly Enhancement: AlignGraph aligns paired-end (PE) reads and preassembled contigs or scaffolds to closely related reference genomes to identify potential extensions and connections within assemblies.
- PE Multipositional de Bruijn Graph: It constructs a PE multipositional de Bruijn graph that integrates positional information from alignments and PE reads to guide accurate extension of assemblies and reduce incorrect extensions or premature terminations.
- Performance Metrics Improvement: In evaluations, AlignGraph extended 28.7-62.3% of contigs for Arabidopsis thaliana and human, increasing N50 by 89.9-94.5% for A. thaliana and 80.3-165.8% for human assemblies.
- Application to Published Genomes: It increased the N50 of extendable scaffolds by 86.6% for the Arabidopsis strain Landsberg.
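The N50 figures above follow the standard definition: N50 is the largest length L such that contigs (or scaffolds) of length at least L together contain at least half of the assembled bases. The sketch below shows how such a value and its percent increase could be computed. The function names `n50` and `percent_increase` and the toy lengths are illustrative assumptions, not part of AlignGraph or its evaluation data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Toy example (made-up lengths): joining the 5 and 6 unit contigs into one.
before = [2, 3, 4, 5, 6]
after = [2, 3, 4, 11]
print(n50(before), n50(after))                    # 5 11
print(percent_increase(n50(before), n50(after)))  # 120.0
```

Because N50 depends only on the length distribution, joining two contigs can raise it sharply even when the total number of assembled bases is unchanged.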
Scientific Applications:
- Genome assembly refinement: Improves completeness and continuity of de novo genome assemblies across organisms such as Arabidopsis thaliana and human.
- Comparative genomics and evolutionary analysis: Uses related reference genomes to enable more accurate reconstruction of genomic sequences for studying genetic variation and evolutionary relationships among species.
Methodology:
AlignGraph aligns paired-end reads and preassembled contigs or scaffolds to a reference genome and constructs a PE multipositional de Bruijn graph that incorporates positional information from alignments and PE reads to guide extension and joining of contigs or scaffolds.
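To illustrate the idea of attaching positional information to de Bruijn graph nodes, here is a deliberately simplified sketch. It is not AlignGraph's implementation: it ignores paired-end constraints and contig alignments entirely, and the function name `build_positional_graph`, the `tolerance` parameter, and the input format (read sequence plus aligned reference start) are assumptions made for this example. The only point it shows is that identical k-mers are merged into one node only when their aligned positions are close, so repeats at distant loci stay separate.

```python
from collections import defaultdict


def build_positional_graph(aligned_reads, k, tolerance):
    """Build a toy positional de Bruijn graph.

    aligned_reads: iterable of (sequence, reference_start) pairs (assumed format).
    Returns (nodes, edges): nodes are (kmer, anchor_position) tuples and
    edges maps (node, next_node) to the number of reads supporting it.
    """
    anchors = defaultdict(list)  # kmer -> anchor positions of existing nodes
    edges = defaultdict(int)

    def node_for(kmer, pos):
        # Reuse an existing node only if the same k-mer was seen nearby.
        for anchor in anchors[kmer]:
            if abs(anchor - pos) <= tolerance:
                return (kmer, anchor)
        anchors[kmer].append(pos)
        return (kmer, pos)

    for seq, start in aligned_reads:
        previous = None
        for offset in range(len(seq) - k + 1):
            node = node_for(seq[offset:offset + k], start + offset)
            if previous is not None:
                edges[(previous, node)] += 1
            previous = node

    nodes = {(kmer, p) for kmer, ps in anchors.items() for p in ps}
    return nodes, dict(edges)


# The same sequence aligned at two distant loci (a repeat).
reads = [("ACGTAC", 0), ("ACGTAC", 100)]
nodes, edges = build_positional_graph(reads, k=3, tolerance=5)
print(len(nodes), len(edges))  # 8 6: the two copies stay separate
```

With a very large `tolerance` the structure degenerates to an ordinary de Bruijn graph, in which both copies of the repeat collapse onto the same four nodes.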
Details
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: C++
- Added: 8/3/2017
- Last Updated: 11/24/2024
Publications
Bao E, Jiang T, Girke T. AlignGraph: algorithm for secondary de novo genome assembly guided by closely related references. Bioinformatics. 2014;30(12):i319-i328. doi:10.1093/bioinformatics/btu291. PMID:24932000. PMCID:PMC4058956.