Hifiasm
Hifiasm is a de novo assembler that builds haplotype-resolved genome assemblies from long high-fidelity (HiFi) reads, producing phased assemblies suited to the analysis of allelic variation.
Key Features:
- HiFi read support: Uses long high-fidelity (HiFi) sequence reads to capture haplotype information in assemblies.
- Phased assembly graph: Constructs a phased assembly graph that encodes genomic variations between haplotypes.
- Haplotype contiguity preservation: Preserves contiguity across all haplotypes rather than collapsing heterozygous alleles into a single consensus.
- Graph trio binning algorithm: Implements a graph trio binning algorithm that, according to the authors, separates haplotypes better than standard trio binning.
- Complex genome handling: Assembles large and polyploid genomes; the authors demonstrate this on the ~30-Gb hexaploid California redwood.
- Empirical performance: In the authors' benchmarks on three human and five nonhuman datasets, hifiasm produced better haplotype-resolved assemblies than the other tools tested.
Scientific Applications:
- Haplotype-resolved genome assembly: Reconstruction of phased haplotypes for studies of allelic variation.
- Human genetics: Assembly and phasing of human genomes for variant analysis.
- Plant and large-genome genomics: Assembly of large or polyploid plant genomes such as California redwood.
- Comparative genomics and diversity studies: Investigation of genomic diversity and sequence variation across individuals and species.
Methodology:
Hifiasm builds a phased assembly graph from long HiFi reads, then applies a graph trio binning algorithm to separate haplotypes while preserving the contiguity of each haplotype.
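The k-mer idea that trio binning rests on can be illustrated with a toy sketch. This is not hifiasm's implementation: it shows only the read-level principle of standard trio binning (k-mers seen in one parent but not the other mark which haplotype a child read came from), whereas hifiasm's graph trio binning works on the phased assembly graph. The function names, the tiny sequences, and `k=4` are all hypothetical, and reverse complements and sequencing errors are ignored.

```python
def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def parent_specific_kmers(paternal_reads, maternal_reads, k):
    """Return (paternal-only, maternal-only) k-mer sets (toy version)."""
    pat = set().union(*(kmers(r, k) for r in paternal_reads))
    mat = set().union(*(kmers(r, k) for r in maternal_reads))
    return pat - mat, mat - pat


def bin_read(read, pat_only, mat_only, k):
    """Assign a child read to a parent by counting parent-specific k-mers."""
    read_kmers = kmers(read, k)
    pat_hits = len(read_kmers & pat_only)
    mat_hits = len(read_kmers & mat_only)
    if pat_hits > mat_hits:
        return "paternal"
    if mat_hits > pat_hits:
        return "maternal"
    return "ambiguous"  # no informative k-mers, or a tie


# Hypothetical toy data: the parents differ only at the 3' end.
pat_only, mat_only = parent_specific_kmers(["ACGTACGTTT"], ["ACGTACGAAA"], k=4)

for child_read in ["TACGTTT", "TACGAAA", "ACGTAC"]:
    print(child_read, bin_read(child_read, pat_only, mat_only, k=4))
```

Reads that carry no parent-specific k-mers stay ambiguous in this toy version. Reads from regions where the two haplotypes are identical always fall into that group, which is one reason to resolve phasing with more context than a single read provides.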
Details
- License: MIT
- Tool Type: command-line tool
- Programming Languages: C, C++
- Added: 7/4/2022
- Last Updated: 11/24/2024
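Because hifiasm is a command-line tool, a typical invocation can be sketched as a small helper that assembles the argument list. The sketch assumes the `-o` (output prefix), `-t` (threads), and `-1`/`-2` (paternal/maternal k-mer index) options from hifiasm's own usage documentation; the helper name and all file names are hypothetical, and nothing is executed here.

```python
import shlex


def hifiasm_command(reads, prefix, threads=16, paternal_yak=None, maternal_yak=None):
    """Build (but do not run) a hifiasm argument list.

    Assumed options: -o output prefix, -t thread count,
    -1 / -2 paternal / maternal k-mer indexes for trio binning.
    """
    cmd = ["hifiasm", "-o", prefix, "-t", str(threads)]
    if paternal_yak and maternal_yak:
        # Trio mode needs both parental indexes.
        cmd += ["-1", paternal_yak, "-2", maternal_yak]
    elif paternal_yak or maternal_yak:
        raise ValueError("trio binning needs both parental k-mer indexes")
    cmd.append(reads)
    return cmd


# Hypothetical file names, shown as shell-quoted command lines.
print(shlex.join(hifiasm_command("reads.fq.gz", "sample.asm", threads=32)))
print(shlex.join(hifiasm_command("reads.fq.gz", "sample.asm", threads=32,
                                 paternal_yak="pat.yak", maternal_yak="mat.yak")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once hifiasm is installed and the input files exist.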
Publications
Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods. 2021;18(2):170-175. doi:10.1038/s41592-020-01056-5. PMID:33526886. PMCID:PMC7961889.
Funding: U.S. Department of Health & Human Services, NIH, National Human Genome Research Institute (grants R01HG010040, U01HG010971, U41HG010972)