Hifiasm

Hifiasm is a de novo assembler that builds haplotype-resolved (phased) genome assemblies from long high-fidelity (HiFi) reads, supporting analysis of allelic variation.


Key Features:

  • HiFi read support: Uses long high-fidelity (HiFi) sequence reads to capture haplotype information in assemblies.
  • Phased assembly graph: Constructs a phased assembly graph that encodes genomic variations between haplotypes.
  • Haplotype contiguity preservation: Preserves contiguity across all haplotypes rather than collapsing heterozygous alleles into a single consensus.
  • Graph trio binning algorithm: Implements a graph trio binning algorithm that separates haplotypes better than standard trio binning.
  • Complex genome handling: Assembles large and polyploid genomes, including the ~30-Gb hexaploid California redwood.
  • Empirical performance: In the authors' evaluation on three human and five nonhuman datasets, produced better haplotype-resolved assemblies than the existing tools it was compared against.
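
To make the trio binning idea concrete, here is a toy sketch of standard, read-level trio binning, the baseline that the graph trio binning feature is contrasted with. It is not hifiasm's algorithm, which works on the phased assembly graph and is not specified in this entry. The function names, the tiny sequences, and the k-mer size are all illustrative assumptions.

```python
# Toy illustration of standard (read-level) trio binning.
# NOT hifiasm's graph trio binning; all names and data here are hypothetical.

def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def bin_read(read, pat_only, mat_only, k):
    """Assign a read to a parent by counting parent-specific k-mers."""
    read_kmers = kmers(read, k)
    pat_hits = len(read_kmers & pat_only)
    mat_hits = len(read_kmers & mat_only)
    if pat_hits > mat_hits:
        return "paternal"
    if mat_hits > pat_hits:
        return "maternal"
    return "ambiguous"  # no informative k-mers, or a tie

k = 5
paternal = "ACGTACGTTAGCCGATAGGCTA"
maternal = "ACGTACGTTAGACGATAGGCTA"  # differs from paternal at one base

# k-mers seen in exactly one parent are the haplotype markers.
pat_only = kmers(paternal, k) - kmers(maternal, k)
mat_only = kmers(maternal, k) - kmers(paternal, k)

for read in ("GTTAGCCGATAG", "GTTAGACGATAG", "ACGTACGTT"):
    print(read, bin_read(read, pat_only, mat_only, k))
```

Reads that cover no heterozygous site carry no parent-specific k-mers and stay ambiguous, which is one limitation of binning reads one at a time.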

Scientific Applications:

  • Haplotype-resolved genome assembly: Reconstruction of phased haplotypes for studies of allelic variation.
  • Human genetics: Assembly and phasing of human genomes for variant analysis.
  • Plant and large-genome genomics: Assembly of large or polyploid plant genomes such as California redwood.
  • Comparative genomics and diversity studies: Investigation of genomic diversity and sequence variation across individuals and species.

Methodology:

Builds a phased assembly graph from long HiFi reads and applies a graph trio binning algorithm to separate haplotypes while preserving haplotype contiguity.
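
As a rough sketch of how a trio-binned run might be driven, the snippet below only builds and prints the command lines; it does not execute anything. It assumes the invocation pattern recalled from the hifiasm and yak READMEs (parental k-mer tables built with `yak count`, then passed to `hifiasm` via `-1` and `-2`). The yak tool is not mentioned in this entry, and the file names, output prefix, thread count, and the helper function name are hypothetical, so check the flags against the installed versions before use.

```python
import shlex

def hifiasm_trio_commands(child_reads, paternal_reads, maternal_reads,
                          prefix="asm", threads=16):
    """Build argv lists for a trio-binned hifiasm run (hypothetical helper).

    Assumed flags: `yak count -b37 -t<N> -o <out.yak> <reads>` and
    `hifiasm -o <prefix> -t<N> -1 <pat.yak> -2 <mat.yak> <child reads>`.
    """
    pat_yak = "pat.yak"  # paternal k-mer table (illustrative file name)
    mat_yak = "mat.yak"  # maternal k-mer table (illustrative file name)
    return [
        ["yak", "count", "-b37", f"-t{threads}", "-o", pat_yak, paternal_reads],
        ["yak", "count", "-b37", f"-t{threads}", "-o", mat_yak, maternal_reads],
        ["hifiasm", "-o", prefix, f"-t{threads}",
         "-1", pat_yak, "-2", mat_yak, child_reads],
    ]

commands = hifiasm_trio_commands("child.hifi.fq.gz", "pat.fq.gz", "mat.fq.gz")
for argv in commands:
    # shlex.join quotes arguments safely for display or logging.
    print(shlex.join(argv))
```

Keeping the commands as argument lists rather than shell strings lets them be passed directly to `subprocess.run` without quoting problems.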

Details

License: MIT
Tool Type: command-line tool
Programming Languages: C, C++
Added: 7/4/2022
Last Updated: 11/24/2024

Publications

Cheng H, Concepcion GT, Feng X, Zhang H, Li H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nature Methods. 2021;18(2):170-175. doi:10.1038/s41592-020-01056-5. PMID:33526886. PMCID:PMC7961889.

Funding: U.S. Department of Health & Human Services | NIH | National Human Genome Research Institute: R01HG010040, U01HG010971, U41HG010972
