EndHiC

EndHiC scaffolds large contigs into chromosomal-level assemblies by using Hi-C links from contig end regions to improve signal-to-noise discrimination between adjacent and non-adjacent contigs.


Key Features:

  • Contig end region focus: Uses Hi-C links specifically from contig end regions to enhance differentiation between adjacent (signal) and non-adjacent (noise) linkages.
  • Reciprocal best requirement: Applies a reciprocal-best criterion to candidate scaffold joins.
  • Robustness evaluation: Performs robustness evaluation of proposed joins to reduce mis-assembly.
  • Accuracy for large contigs: Improves scaffolding accuracy for large contigs by concentrating on end-region Hi-C signals.
  • Efficiency: Runs with low time and memory requirements.
  • Broad applicability: Demonstrated across species including human, rice, Arabidopsis, great burdock, water spinach, chicory, endive, yacon, and Ipomoea cairica.

Scientific Applications:

  • Chromosomal-level scaffolding of long-contig assemblies: Suited to assemblies with contig N50 near or above 10 Mb and N90 near or above 1 Mb, which can be scaffolded to chromosome level using Hi-C contacts.
  • Cross-species genome assembly: Applicable to both plant and animal genomes as demonstrated in human, rice, Arabidopsis, great burdock, water spinach, chicory, endive, yacon, and Ipomoea cairica.
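The N50 and N90 guidance above can be checked before running a scaffolder. The following is a minimal sketch, not part of EndHiC: the function names `nx` and `meets_length_guidance` are hypothetical, and the 10 Mb and 1 Mb defaults treat the approximate guidance ("near or above") as hard cutoffs purely for illustration.

```python
def nx(lengths, percent):
    """Return the Nx value: the contig length at which contigs sorted from
    longest to shortest first cover at least `percent` % of the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point edge cases at the boundary.
        if running * 100 >= percent * total:
            return length
    return 0  # empty assembly


def meets_length_guidance(lengths, n50_min=10_000_000, n90_min=1_000_000):
    """Hypothetical helper: True if N50 and N90 reach the illustrative cutoffs."""
    return nx(lengths, 50) >= n50_min and nx(lengths, 90) >= n90_min


# Toy contig lengths in bp (made-up values, not from any real assembly).
contigs = [40_000_000, 30_000_000, 20_000_000, 5_000_000, 3_000_000, 2_000_000]
print(nx(contigs, 50), nx(contigs, 90), meets_length_guidance(contigs))
```

Because the stated thresholds are approximate, an assembly that falls slightly short of them is not necessarily unsuitable.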

Methodology:

EndHiC extracts Hi-C links from contig end regions, keeps only candidate joins that satisfy a reciprocal-best requirement, and evaluates the robustness of each proposed join.
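The reciprocal-best idea can be illustrated with a toy example. This is a sketch under stated assumptions, not EndHiC's implementation: the function name `reciprocal_best_joins`, the `(contig, "head" | "tail")` representation of a contig end, and the link counts are all hypothetical, and the robustness evaluation step is omitted.

```python
def reciprocal_best_joins(link_counts):
    """Keep only joins where each contig end is the other's best-linked partner.

    `link_counts` maps a pair of contig ends, each written as
    (contig_name, "head" | "tail"), to the number of Hi-C links between them.
    """
    best = {}  # contig end -> (link count, best partner end)
    for (a, b), n in link_counts.items():
        if a[0] == b[0]:
            continue  # ignore links between the two ends of one contig
        for end, partner in ((a, b), (b, a)):
            # Strict ">" means the first partner seen wins a tie.
            if end not in best or n > best[end][0]:
                best[end] = (n, partner)

    joins = set()
    for end, (n, partner) in best.items():
        if best[partner][1] == end:  # the preference is mutual
            joins.add((min(end, partner), max(end, partner), n))
    return sorted(joins)


# Made-up link counts for three contigs.
links = {
    (("ctg1", "tail"), ("ctg2", "head")): 120,
    (("ctg1", "tail"), ("ctg3", "head")): 15,
    (("ctg2", "tail"), ("ctg3", "head")): 95,
    (("ctg2", "head"), ("ctg3", "tail")): 10,
}
for end_a, end_b, n in reciprocal_best_joins(links):
    print(end_a, end_b, n)
```

In this toy input the tail of ctg3 links most strongly to the head of ctg2, but that end prefers the tail of ctg1, so the weaker link is discarded as noise rather than reported as a join.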

Details

License: GPL-1.0
Tool Type: command-line tool, workflow
Operating Systems: Linux, Mac
Programming Languages: Perl, Python
Added: 2/20/2023
Last Updated: 11/24/2024

Publications

Wang S, Wang H, Jiang F, Wang A, Liu H, Zhao H, Yang B, Xu D, Zhang Y, Fan W. EndHiC: assemble large contigs into chromosome-level scaffolds using the Hi-C links from contig ends. BMC Bioinformatics. 2022;23(1). doi:10.1186/s12859-022-05087-x. PMID:36482318. PMCID:PMC9730666.

Funding: National Natural Science Foundation of China (Grant No. 32000408)