EDENA

EDENA is a de novo assembler for bacterial genomes sequenced as very short Illumina reads (approximately 35 bases). It generates accurate contigs for genome characterization and for detection of clonal polymorphisms.


Key Features:

  • Overlap Graph Representation: EDENA employs a classical overlap graph approach that identifies overlaps between short reads to construct longer contigs.
  • Detection of Spurious Reads: The software detects and excludes potentially erroneous or spurious reads to improve assembly accuracy.
  • Generation of Accurate Contigs: EDENA produces accurate contigs of several kilobases that together cover most of a bacterial genome.
  • Validation Against Published Genomes: Assemblies of experimental datasets for Staphylococcus aureus strain MW2 and Helicobacter acinonychis strain Sheeba were validated by comparison with the published genomes of those strains, which had been obtained by conventional sequencing of 1.5 to 3.0 kilobase fragments.

Scientific Applications:

  • Genome Characterization: Enables characterization of bacterial genomes from high-throughput Illumina short-read sequencing experiments.
  • Detection of Clonal Polymorphisms: The broad coverage achieved by high-throughput sequencing may allow detection of clonal polymorphisms within the sequenced DNA population.
  • Comparative Genomics and Evolutionary Studies: Provides assembled contigs suitable for downstream comparative genomics and evolutionary analyses.

Methodology:

EDENA processes millions of short Illumina reads. It builds an overlap graph in which reads are linked by their sequence overlaps, detects and filters potentially spurious reads, and then assembles the remaining reads into contigs.
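
The overlap graph idea can be illustrated with a small Python sketch. This is not EDENA's implementation: the function names (build_overlap_graph, assemble), the min_overlap parameter, the exact-match overlap rule, and the "mutual best overlap" simplification are all assumptions made for illustration, and no spurious-read filtering is attempted beyond collapsing duplicate reads.

```python
from collections import defaultdict


def build_overlap_graph(reads, min_overlap):
    """Toy overlap graph (illustrative only, not EDENA's algorithm).

    Returns the de-duplicated reads and a mapping read -> {successor: k},
    where k is the longest exact suffix/prefix overlap of at least
    min_overlap bases.
    """
    reads = sorted(set(reads))  # collapse duplicate reads, fix the order
    graph = defaultdict(dict)
    for a in reads:
        for b in reads:
            if a == b:
                continue
            # Try the longest candidate overlap first, stop at the first hit.
            for k in range(min(len(a), len(b)) - 1, min_overlap - 1, -1):
                if a.endswith(b[:k]):
                    graph[a][b] = k
                    break
    return reads, graph


def assemble(reads, min_overlap):
    """Merge reads along mutual best overlaps into contigs (hypothetical helper)."""
    reads, graph = build_overlap_graph(reads, min_overlap)

    # Best successor of each read: the neighbour with the longest overlap.
    best_next = {a: max(succ.items(), key=lambda kv: kv[1])
                 for a, succ in graph.items()}

    # Best predecessor of each read among those best-successor edges.
    best_prev = {}
    for a, (b, k) in best_next.items():
        if b not in best_prev or k > best_prev[b][1]:
            best_prev[b] = (a, k)

    # Keep an edge only if both ends agree on it, so paths are unambiguous.
    links = {a: (b, k) for a, (b, k) in best_next.items()
             if best_prev[b][0] == a}
    has_predecessor = {b for b, _ in links.values()}

    contigs = []
    for start in reads:
        if start in has_predecessor:
            continue  # not the beginning of a path
        contig, current = start, start
        while current in links:
            nxt, k = links[current]
            contig += nxt[k:]  # append only the non-overlapping tail
            current = nxt
        contigs.append(contig)
    # Note: reads that form a pure cycle have no start and are skipped here.
    return contigs


if __name__ == "__main__":
    genome = "ATGGCGTACCTTAGGACTCA"  # made-up sequence for the demo
    reads = [genome[i:i + 8] for i in range(len(genome) - 7)]
    print(assemble(reads, min_overlap=5))
```

The all-against-all comparison is quadratic in the number of reads, so the sketch is only suitable for a handful of sequences; an assembler handling millions of reads needs an indexed overlap search.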

Details

License: GPL-3.0
Cost: Free of charge
Tool Type: command-line tool
Operating Systems: Linux
Added: 1/13/2017
Last Updated: 11/25/2024

Publications

Hernandez D, François P, Farinelli L, Østerås M, Schrenzel J. De novo bacterial genome sequencing: Millions of very short reads assembled on a desktop computer. Genome Research. 2008;18(5):802-809. doi:10.1101/gr.072033.107. PMID:18332092. PMCID:PMC2336802.