EndHiC
EndHiC scaffolds large contigs into chromosomal-level assemblies by using Hi-C links from contig end regions to improve signal-to-noise discrimination between adjacent and non-adjacent contigs.
Key Features:
- Contig end region focus: Uses Hi-C links specifically from contig end regions to enhance differentiation between adjacent (signal) and non-adjacent (noise) linkages.
- Reciprocal best requirement: Applies a reciprocal-best criterion to candidate scaffold joins.
- Robustness evaluation: Performs robustness evaluation of proposed joins to reduce mis-assembly.
- Accuracy for large contigs: Improves scaffolding accuracy for large contigs by concentrating on end-region Hi-C signals.
- Efficiency: Runs with low time and memory requirements.
- Broad applicability: Demonstrated across species including human, rice, Arabidopsis, great burdock, water spinach, chicory, endive, yacon, and Ipomoea cairica.
Scientific Applications:
- Chromosomal-level scaffolding of long-contig assemblies: Suited to assemblies whose N50 is near or above 10 Mb and whose N90 is near or above 1 Mb, which it scaffolds to chromosomal level using Hi-C contacts.
- Cross-species genome assembly: Applicable to both animal and plant genomes, as demonstrated on human and the eight plant species listed under Key Features.
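To check whether an assembly falls in the size range described above, the N50 and N90 statistics can be computed from the contig lengths. The following is a minimal sketch, not part of EndHiC itself: the function names `nxx` and `in_suggested_range` are hypothetical, and the hard cutoffs of 10 Mb and 1 Mb are an assumption made for illustration, since the text only says "near or above".

```python
def nxx(lengths, percent):
    """Return the Nxx statistic for a list of contig lengths.

    Nxx is the length L such that contigs of length >= L together
    cover at least `percent` percent of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if running * 100 >= percent * total:
            return length


def in_suggested_range(lengths, n50_min=10_000_000, n90_min=1_000_000):
    """Hypothetical helper: True if N50 and N90 meet the illustrative cutoffs."""
    return nxx(lengths, 50) >= n50_min and nxx(lengths, 90) >= n90_min


contigs = [40_000_000, 30_000_000, 20_000_000, 10_000_000]
n50 = nxx(contigs, 50)
n90 = nxx(contigs, 90)
suitable = in_suggested_range(contigs)
```

Because the stated thresholds are approximate, an assembly slightly below these cutoffs may still be a reasonable candidate; the boolean here is only a rough guide.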
Methodology:
Extracts Hi-C links from contig end regions, applies a reciprocal-best requirement to candidate joins, and conducts robustness evaluation of proposed joins.
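The reciprocal-best requirement can be illustrated with a small sketch. This is not EndHiC's actual code: the input format (a dictionary of Hi-C link counts between contig ends), the `head`/`tail` labels, and the function name `reciprocal_best_joins` are all assumptions made for illustration. The sketch omits any link-count cutoffs and the robustness evaluation step.

```python
from collections import defaultdict


def reciprocal_best_joins(link_counts):
    """Return candidate joins between contig ends that are reciprocal best.

    `link_counts` maps (end_a, end_b) -> number of Hi-C links, where each
    end is a (contig_name, "head" | "tail") tuple (an assumed format).
    """
    # Build a symmetric table: for each end, link counts to every partner end.
    partners = defaultdict(dict)
    for (end_a, end_b), count in link_counts.items():
        if end_a[0] == end_b[0]:
            continue  # ignore links between the two ends of one contig
        partners[end_a][end_b] = partners[end_a].get(end_b, 0) + count
        partners[end_b][end_a] = partners[end_b].get(end_a, 0) + count

    # Best partner per end; sorting first makes tie-breaking deterministic.
    best = {end: max(sorted(table), key=table.get)
            for end, table in partners.items()}

    # Keep a pair only when each end is the other's best partner.
    joins = set()
    for end, partner in best.items():
        if best.get(partner) == end:
            joins.add(tuple(sorted((end, partner))))
    return sorted(joins)


links = {
    (("ctg1", "tail"), ("ctg2", "head")): 120,  # adjacent (signal)
    (("ctg2", "tail"), ("ctg3", "head")): 95,   # adjacent (signal)
    (("ctg1", "tail"), ("ctg3", "head")): 8,    # non-adjacent (noise)
    (("ctg2", "head"), ("ctg3", "head")): 6,    # non-adjacent (noise)
}
joins = reciprocal_best_joins(links)
```

In this toy input, the two weakly linked pairs are discarded because each of their ends has a stronger partner elsewhere, leaving the joins ctg1 tail to ctg2 head and ctg2 tail to ctg3 head.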
Details
- License: GPL-1.0
- Tool Type: command-line tool, workflow
- Operating Systems: Linux, Mac
- Programming Languages: Perl, Python
- Added: 2/20/2023
- Last Updated: 11/24/2024
Publications
Wang S, Wang H, Jiang F, Wang A, Liu H, Zhao H, Yang B, Xu D, Zhang Y, Fan W. EndHiC: assemble large contigs into chromosome-level scaffolds using the Hi-C links from contig ends. BMC Bioinformatics. 2022;23(1). doi:10.1186/s12859-022-05087-x. PMID:36482318. PMCID:PMC9730666.
Funding:
- National Natural Science Foundation of China: Grant No. 32000408