HapCol
HapCol reconstructs the two haplotypes of a diploid genome from sequencing reads, supporting diploid genome assembly and the evaluation of how single-nucleotide polymorphisms (SNPs) affect phenotypic traits.
Key Features:
- Long Read Compatibility: Optimized for long, gapless reads from sequencing technologies including PacBio RS II (SMRT sequencing) and Oxford Nanopore MinION.
- Error Correction Strategy: Exploits the uniform distribution of sequencing errors by bounding the number of corrections allowed at each SNP position. The algorithm is exact, minimizes an overall error-correction score, and runs in time exponential in that bound.
- Computational Efficiency: Requires less memory and fewer computing resources than existing combinatorial methods, enabling processing of higher-coverage datasets without relying on restrictive assumptions such as the all-heterozygous model.
- Performance and Accuracy: Achieves accuracy competitive with existing methods and phases more positions, with improvements also observed on real datasets.
- Scalability: Overcomes limitations related to read length and sequencing coverage to handle larger datasets effectively.
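The link between a uniform error distribution and a bound on corrections per SNP position can be illustrated with a short calculation. The Python sketch below is not taken from HapCol. It assumes that each read covering a position is wrong independently with the same probability, so the number of erroneous reads at a position is binomial, and it picks the smallest bound whose chance of being exceeded stays under a chosen threshold. The function names and the parameters `error_rate` and `alpha` are hypothetical.

```python
from math import comb


def binomial_tail(coverage: int, error_rate: float, k: int) -> float:
    """Probability of more than k erroneous reads among `coverage` reads,
    assuming independent errors with a common rate (an assumption)."""
    return sum(
        comb(coverage, j) * error_rate**j * (1 - error_rate) ** (coverage - j)
        for j in range(k + 1, coverage + 1)
    )


def smallest_correction_bound(coverage: int, error_rate: float, alpha: float) -> int:
    """Smallest k such that a position needs more than k corrections
    with probability at most alpha."""
    for k in range(coverage + 1):
        if binomial_tail(coverage, error_rate, k) <= alpha:
            return k
    return coverage


if __name__ == "__main__":
    # Illustrative values only: coverage 20, 5% error rate, 1% threshold.
    print(smallest_correction_bound(20, 0.05, 0.01))
```

Under these assumptions the bound grows slowly with coverage, which is consistent with an algorithm that is exponential only in the bound remaining practical on higher-coverage data.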
Scientific Applications:
- Genetic analysis of phenotypic traits: Enables reconstruction of haplotypes to study the effects of SNPs on phenotype.
- Personalized medicine: Supports haplotype-resolved variant interpretation relevant to individual genotype-informed treatment strategies.
- Evolutionary biology: Facilitates haplotype-based analyses for studying evolutionary relationships and allele histories.
- Population genetics: Allows phasing of variants at population scale to investigate genetic diversity and structure.
Methodology:
Operates on long, gapless reads under a uniform sequencing error model. An exact algorithm minimizes an overall error-correction score; its running time is exponential in the maximum number of corrections allowed per SNP position. The method does not assume that all SNP positions are heterozygous.
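The objective described above can be made concrete on a toy input. The sketch below is not HapCol's algorithm: it replaces it with a plain exhaustive search over all ways of splitting the reads into two groups, which is only feasible for a handful of reads. It assumes reads are given as a start position plus a gapless list of 0/1 alleles, counts for each SNP position the fewest allele flips that make each group internally consistent, rejects splits that exceed the per-position bound `k`, and keeps the split with the lowest total. Because each group picks its own consensus allele, both groups may agree at a position, so no all-heterozygous assumption is made. All names are hypothetical.

```python
from itertools import product


def column_cost(group_a, group_b):
    """Fewest flips that make each group agree with itself at one SNP position."""
    cost = 0
    for alleles in (group_a, group_b):
        ones = sum(alleles)
        cost += min(ones, len(alleles) - ones)  # flip the minority allele
    return cost


def phase_exhaustively(reads, n_snps, k):
    """reads: list of (start, alleles), alleles being a gapless list of 0/1.

    Returns (score, assignment) for the best split, or None when every split
    needs more than k corrections at some position.
    """
    if not reads:
        return (0, ())
    best = None
    # Read 0 is fixed in group 0 so mirror-image splits are not tried twice.
    for rest in product((0, 1), repeat=len(reads) - 1):
        assignment = (0,) + rest
        total = 0
        feasible = True
        for snp in range(n_snps):
            groups = ([], [])
            for (start, alleles), group in zip(reads, assignment):
                if start <= snp < start + len(alleles):
                    groups[group].append(alleles[snp - start])
            cost = column_cost(*groups)
            if cost > k:  # per-position bound violated
                feasible = False
                break
            total += cost
        if feasible and (best is None or total < best[0]):
            best = (total, assignment)
    return best


if __name__ == "__main__":
    toy_reads = [
        (0, [0, 0, 0]),
        (0, [1, 1, 1]),
        (1, [0, 0, 0]),
        (1, [1, 1, 0]),
        (1, [1, 1, 1]),
    ]
    print(phase_exhaustively(toy_reads, n_snps=4, k=1))
```

The search visits 2^(n-1) splits for n reads, which is why a practical tool needs an algorithm whose exponential factor depends on a small parameter instead of the number of reads.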
Details
- License: GPL-2.0
- Maturity: Emerging
- Cost: Free of charge
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: C++
- Added: 3/13/2016
- Last Updated: 11/25/2024
Publications
Pirola Y, Zaccaria S, Dondi R, Klau GW, Pisanti N, Bonizzoni P. HapCol: accurate and memory-efficient haplotype assembly from long reads. Bioinformatics. 2015;32(11):1610-1617. doi:10.1093/bioinformatics/btv495. PMID:26315913.
Documentation
General
http://hapcol.algolab.eu/