TaxonKit
TaxonKit is a command-line toolkit for querying and manipulating NCBI Taxonomy data. It retrieves TaxIds, reconstructs and reformats complete taxonomic lineages, lists descendants, and computes lowest common ancestors for taxonomic analyses.
Key Features:
- TaxId querying: Query TaxIds by scientific names and identifiers.
- Listing and filtering TaxIds: List and filter TaxIds and enumerate descendants of specified TaxIds.
- Lineage retrieval and reformatting: Retrieve complete taxonomic lineages using TaxIds and reformat lineage outputs.
- Lowest common ancestor (LCA): Compute the lowest common ancestor of a set of taxa.
- Tracking TaxId changes: Track and report changes in TaxIds.
- Core subcommands: Provide seven core subcommands (list, lineage, reformat, name2taxid, filter, lca, taxid-changelog) covering querying, listing/filtering, lineage retrieval/reformatting, LCA computation, and change tracking.
- Performance and scalability: Run taxonomy operations efficiently on both small and large datasets.
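The lineage and LCA features listed above can be illustrated with a minimal Python sketch. It is not TaxonKit's implementation: it assumes a child-to-parent map in the spirit of NCBI's nodes.dmp, and the TaxIds, the `PARENT` table, and the function names `lineage` and `lca` are all hypothetical.

```python
from typing import Dict, List

# Toy child -> parent map; these TaxIds are made up for illustration and are
# NOT real NCBI identifiers. The root (1) is recorded as its own parent.
PARENT: Dict[int, int] = {1: 1, 10: 1, 20: 10, 21: 10, 30: 20, 31: 20, 40: 21}


def lineage(taxid: int, parent: Dict[int, int]) -> List[int]:
    """Return the path of TaxIds from the root down to `taxid`."""
    path = [taxid]
    # Walk upward until we reach the node that is its own parent (the root).
    while parent[path[-1]] != path[-1]:
        path.append(parent[path[-1]])
    return path[::-1]


def lca(taxids: List[int], parent: Dict[int, int]) -> int:
    """Return the deepest TaxId shared by the lineages of all `taxids`."""
    if not taxids:
        raise ValueError("at least one TaxId is required")
    paths = [lineage(t, parent) for t in taxids]
    common = paths[0][0]  # every lineage starts at the root
    # Compare the lineages level by level and stop at the first disagreement.
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        common = level[0]
    return common


if __name__ == "__main__":
    print(lineage(30, PARENT))
    print(lca([30, 31], PARENT))
    print(lca([30, 40], PARENT))
```

Comparing root-to-node paths level by level keeps the sketch short; a tool working on the full taxonomy would likely use a more efficient strategy.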
Scientific Applications:
- Genomics: Support genomics studies that require precise NCBI Taxonomy identifiers and lineages.
- Biodiversity assessment: Enable biodiversity and taxonomic inventory analyses through TaxId retrieval and lineage reconstruction.
- Ecological modeling: Provide taxonomic inputs such as lineages and descendant lists for ecological and community-level modeling.
Methodology:
Operating on NCBI Taxonomy data, TaxonKit queries TaxIds by name, retrieves and reformats complete taxonomic lineages from TaxIds, lists and filters TaxIds and their descendants, computes lowest common ancestors (LCA), and tracks TaxId changes.
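As a rough illustration of the name querying and descendant listing steps, the following Python sketch works on a toy taxonomy. It is an assumption-laden example rather than TaxonKit's code: the `PARENT` and `NAMES` tables, the TaxIds, the taxon names, and the helpers `name_to_taxids` and `descendants` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Toy child -> parent map and name table; all TaxIds and names are made up
# for illustration. The root (1) is recorded as its own parent.
PARENT: Dict[int, int] = {1: 1, 10: 1, 20: 10, 21: 10, 30: 20, 31: 20, 40: 21}
NAMES: Dict[int, str] = {1: "root", 10: "Toy family", 20: "Toy genus A",
                         21: "Toy genus B", 30: "Toy species A1",
                         31: "Toy species A2", 40: "Toy species B1"}


def name_to_taxids(name: str, names: Dict[int, str]) -> List[int]:
    """Return every TaxId whose name matches `name` exactly."""
    # A list is returned because one name may map to several TaxIds.
    return sorted(taxid for taxid, n in names.items() if n == name)


def descendants(taxid: int, parent: Dict[int, int]) -> List[int]:
    """Return `taxid` plus all TaxIds below it, in ascending order."""
    # Invert the child -> parent map into a parent -> children index.
    children = defaultdict(list)
    for child, par in parent.items():
        if child != par:
            children[par].append(child)
    # Iterative depth-first traversal avoids recursion limits on deep trees.
    found, stack = [], [taxid]
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(children.get(node, []))
    return sorted(found)


if __name__ == "__main__":
    for tid in name_to_taxids("Toy genus A", NAMES):
        print(tid, descendants(tid, PARENT))
```

Including the query TaxId in its own descendant list is a choice made for this sketch.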
Details
- License: MIT
- Tool Type: command-line tool
- Programming Languages: Go
- Added: 12/13/2021
- Last Updated: 12/13/2021
Publications
Shen W, Ren H. TaxonKit: A practical and efficient NCBI taxonomy toolkit. Journal of Genetics and Genomics. 2021;48(9):844-850. doi:10.1016/j.jgg.2021.03.006. PMID:34001434.
Funding:
- National Natural Science Foundation of China: 32000474
- National Major Science and Technology Projects of China: 2017ZX10202203-007-001
Documentation
User manual
https://bioinf.shenwei.me/taxonkit/
Links
Issue tracker
https://github.com/shenwei356/taxonkit/issues