fastbaps
Fastbaps is a fast method for genetic clustering of multilocus genotype data. It implements an approximate fit to a Dirichlet process mixture (DPM) model, which lets it handle datasets 10 to 100 times larger than those manageable by existing model-based clustering methods. Its scalability is demonstrated on an alignment of over 110,000 HIV-1 pol gene sequences.
Fastbaps can also partition an existing hierarchy so as to maximize the marginal likelihood of the DPM model. This allows a phylogenetic tree to be split into clades and subclades on the basis of a population genomic model.
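As a rough illustration of what "partitioning a hierarchy to maximize the marginal likelihood" means, the sketch below scores a few candidate cuts of a small hierarchy and keeps the best one. It is a toy under stated assumptions, not the fastbaps implementation or its API: the function name `log_marginal_likelihood`, the six example sequences, the candidate cuts, and the symmetric Dirichlet prior with `alpha = 1.0` are all hypothetical, and a plain Dirichlet-multinomial score per cluster and site stands in for the model described above.

```python
from math import lgamma

def log_marginal_likelihood(clusters, sequences, alphabet="ACGT", alpha=1.0):
    """Log marginal likelihood of a partition (toy Dirichlet-multinomial model).

    Assumption: every cluster and every site has an independent symmetric
    Dirichlet(alpha) prior over the alphabet. Illustrative only.
    """
    n_sites = len(next(iter(sequences.values())))
    k = len(alphabet)
    total = 0.0
    for members in clusters:
        for site in range(n_sites):
            # Count each allele at this site within the cluster.
            counts = [sum(sequences[m][site] == a for m in members)
                      for a in alphabet]
            n = sum(counts)
            total += lgamma(k * alpha) - lgamma(k * alpha + n)
            total += sum(lgamma(alpha + c) - lgamma(alpha) for c in counts)
    return total

# Hypothetical aligned sequences forming two visibly distinct groups.
sequences = {
    "s1": "AAAA", "s2": "AAAA", "s3": "AAAT",
    "s4": "GGCC", "s5": "GGCC", "s6": "GGCT",
}

# Candidate cuts of a hypothetical hierarchy ((s1,s2),s3),((s4,s5),s6).
cuts = {
    "one cluster": [["s1", "s2", "s3", "s4", "s5", "s6"]],
    "two clades": [["s1", "s2", "s3"], ["s4", "s5", "s6"]],
    "singletons": [[name] for name in sequences],
}

scores = {name: log_marginal_likelihood(parts, sequences)
          for name, parts in cuts.items()}
best = max(scores, key=scores.get)
print(best)  # two clades
```

Restricting the search to cuts of a given tree keeps the number of candidate partitions small, which is why scoring them by marginal likelihood stays tractable even when the tree is large.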
Topic
Genotype and phenotype;Phylogenetics;Mapping
Detail
Operation: Essential dynamics;Clustering;Genotyping
Software interface: Library
Language: R,C++,C
License: The MIT License
Cost: Free with restrictions
Version name: -
Credit: Wellcome Trust, The Alan Turing Institute via an Engineering and Physical Sciences Research Council, U.S. National Institutes of Health.
Input: -
Output: -
Contact: Gerry Tonkin-Hill gqt20@cam.ac.uk
Collection: -
Maturity: Mature
Publications
- Fast hierarchical Bayesian analysis of population structure.
- Tonkin-Hill G, et al. Fast hierarchical Bayesian analysis of population structure. Nucleic Acids Research. 2019; 47:5539-5549. doi: 10.1093/nar/gkz361
- https://doi.org/10.1093/nar/gkz361
- PMID: 31076776
- PMC: PMC6582336
Download and documentation
Documentation: https://github.com/gtonkinhill/fastbaps/blob/master/README.md
Home page: https://github.com/gtonkinhill/fastbaps