fastbaps

Fastbaps is a fast solution to the genetic clustering problem for multilocus genotype data. It implements an approximate fit to a Dirichlet process mixture (DPM) model, which lets it handle datasets 10 to 100 times larger than those manageable by existing model-based clustering methods. Its scalability is demonstrated by the analysis of an alignment of over 110,000 HIV-1 pol gene sequences.

Beyond clustering, Fastbaps provides a method for partitioning an existing hierarchy so as to maximize the marginal likelihood of the DPM model. This allows a phylogenetic tree to be split into clades and subclades on the basis of a population genomic model rather than an arbitrary distance threshold.
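The idea of cutting a hierarchy to maximize a marginal likelihood can be illustrated with a small Python sketch. This is not the fastbaps implementation or its API: the function names `log_marginal` and `best_partition`, the nested-tuple tree encoding, and the symmetric Dirichlet parameter `alpha` are all assumptions made for illustration, and the sketch scores partitions with a per-locus Dirichlet-multinomial likelihood only, omitting the DPM prior over partitions.

```python
from math import lgamma

ALPHABET = "ACGT"  # assumed allele alphabet for this toy example


def log_marginal(seqs, alpha=0.5):
    """Log Dirichlet-multinomial marginal likelihood of one cluster.

    Loci are treated as independent, each with a symmetric Dirichlet(alpha)
    prior over allele frequencies (an illustrative assumption).
    """
    k = len(ALPHABET)
    total = 0.0
    for column in zip(*seqs):  # iterate over alignment columns (loci)
        counts = [column.count(a) for a in ALPHABET]
        n = sum(counts)
        total += lgamma(k * alpha) - lgamma(k * alpha + n)
        total += sum(lgamma(alpha + c) - lgamma(alpha) for c in counts)
    return total


def best_partition(tree, seqs, alpha=0.5):
    """Return (score, clusters) for the best cut of a binary tree.

    A leaf is a sequence name (str); an internal node is a (left, right)
    tuple. At each node the whole clade is either kept as one cluster or
    replaced by the best partitions of its two children, whichever has
    the higher summed log marginal likelihood.
    """
    if isinstance(tree, str):
        return log_marginal([seqs[tree]], alpha), [[tree]]
    left_score, left_clusters = best_partition(tree[0], seqs, alpha)
    right_score, right_clusters = best_partition(tree[1], seqs, alpha)
    leaves = [name for cluster in left_clusters + right_clusters for name in cluster]
    merged_score = log_marginal([seqs[name] for name in leaves], alpha)
    split_score = left_score + right_score
    if merged_score >= split_score:
        return merged_score, [leaves]
    return split_score, left_clusters + right_clusters


# Toy data: two clades with distinct alleles at every locus.
toy_seqs = {"a": "AAAAAAAA", "b": "AAAAAAAA", "c": "CCCCCCCC", "d": "CCCCCCCC"}
toy_tree = (("a", "b"), ("c", "d"))
toy_score, toy_clusters = best_partition(toy_tree, toy_seqs)
print(toy_clusters)  # [['a', 'b'], ['c', 'd']]
```

In this toy case the two identical pairs are each merged into one cluster, while the root is left split because pooling the two clades lowers the total marginal likelihood. Restricting clusters to clades of the given tree is what keeps the search tractable: each node is visited once instead of enumerating all partitions.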

Topic

Genotype and phenotype;Phylogenetics;Mapping

Detail

  • Operation: Essential dynamics;Clustering;Genotyping

  • Software interface: Library

  • Language: R,C++,C

  • License: The MIT License

  • Cost: Free with restrictions

  • Version name: -

  • Credit: Wellcome Trust, The Alan Turing Institute via an Engineering and Physical Sciences Research Council, U.S. National Institutes of Health.

  • Input: -

  • Output: -

  • Contact: Gerry Tonkin-Hill gqt20@cam.ac.uk

  • Collection: -

  • Maturity: Mature

Publications

  • Fast hierarchical Bayesian analysis of population structure.
  • Tonkin-Hill G, et al. Fast hierarchical Bayesian analysis of population structure. Nucleic Acids Research. 2019; 47:5539-5549. doi: 10.1093/nar/gkz361
  • https://doi.org/10.1093/nar/gkz361
  • PMID: 31076776
  • PMC: PMC6582336
