phylotaR
phylotaR retrieves orthologous DNA sequences from GenBank for use in phylogenetic inference.
Key Features:
- Ortholog Identification: Uses alignment searches to identify orthologous sequences and mitigate sequence mislabeling and incorrect species identification.
- Modular Pipeline: Implements a modular pipeline building on PhyLoTa for flexible processing of large datasets.
- Handling Data Complexity: Processes large and complex GenBank datasets to select relevant orthologous sequences.
- Improved Accuracy: Reduces noise, error, and bias in phylogenetic inference by focusing on orthologs and using up-to-date GenBank data.
- Versatility Across Taxa: Has been applied to large taxonomic clades such as palms and primates, demonstrating utility across taxa.
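A minimal sketch of how the modular pipeline might be driven from R, assuming the `setup()`, `run()`, and `read_phylota()` functions exported by the phylotaR package and a local BLAST+ installation. The wrapper name `run_phylotar_sketch` and the example paths are hypothetical. Nothing is downloaded until the wrapper is called.

```r
# Hypothetical wrapper around the phylotaR pipeline; defining it has no side effects.
# wd:      working directory where the pipeline stores its files
# txid:    NCBI taxonomy ID of the clade of interest
# ncbi_dr: directory containing the BLAST+ executables (assumed to be installed)
run_phylotar_sketch <- function(wd, txid, ncbi_dr) {
  if (!requireNamespace("phylotaR", quietly = TRUE)) {
    stop("Package 'phylotaR' is not installed.")
  }
  dir.create(wd, showWarnings = FALSE, recursive = TRUE)
  # Record the parameters for this run in the working directory.
  phylotaR::setup(wd = wd, txid = txid, ncbi_dr = ncbi_dr)
  # Run the pipeline stages in sequence.
  phylotaR::run(wd = wd)
  # Load the resulting sequence clusters into an R object.
  phylotaR::read_phylota(wd)
}

# Example call (requires network access and BLAST+; path is a placeholder):
# clusters <- run_phylotar_sketch(wd = "aotus", txid = 9504,
#                                 ncbi_dr = "/path/to/blast/bin")
```

Keeping the call inside a function lets the working directory and taxon be swapped without editing the pipeline steps themselves.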
Scientific Applications:
- Phylogenetic reconstruction: Assembles orthology-aware sequence datasets from GenBank for phylogenetic tree inference.
- Error reduction in evolutionary analyses: Minimizes impacts of sequence mislabeling and paralogy on downstream evolutionary and comparative studies.
- Large-scale taxonomic analyses: Enables analyses across diverse and large taxonomic clades using comprehensive GenBank sequence data.
Methodology:
phylotaR accesses GenBank to retrieve DNA sequences, runs alignment searches to identify orthologous sequences, and identifies clusters of overlapping sequences.
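The clustering idea can be illustrated with a toy example in base R. This is a sketch under stated assumptions, not phylotaR's actual implementation: the hit table, the `coverage` column, the 0.5 threshold, and the function `cluster_by_hits` are all hypothetical, and single-linkage grouping stands in for whatever rule the package applies.

```r
# Toy pairwise hit table standing in for all-vs-all alignment search output.
hits <- data.frame(
  query    = c("s1", "s2", "s3", "s4", "s5"),
  subject  = c("s2", "s3", "s4", "s5", "s5"),
  coverage = c(0.95, 0.80, 0.30, 0.90, 1.00),
  stringsAsFactors = FALSE
)
seq_ids <- c("s1", "s2", "s3", "s4", "s5", "s6")

# Group sequences so that any two linked by a sufficiently overlapping hit
# end up in the same cluster (single linkage via a simple union-find).
cluster_by_hits <- function(seq_ids, hits, min_coverage = 0.5) {
  # Each sequence starts as the root of its own cluster.
  parent <- setNames(seq_ids, seq_ids)
  find_root <- function(id) {
    while (parent[[id]] != id) id <- parent[[id]]
    id
  }
  # Discard hits whose overlap is below the (arbitrary) threshold.
  keep <- hits[hits$coverage >= min_coverage, , drop = FALSE]
  for (i in seq_len(nrow(keep))) {
    a <- find_root(keep$query[i])
    b <- find_root(keep$subject[i])
    if (a != b) parent[[b]] <- a  # merge the two clusters
  }
  roots <- vapply(seq_ids, find_root, character(1))
  unname(split(seq_ids, roots))
}

clusters <- cluster_by_hits(seq_ids, hits)
print(clusters)
```

With this toy input the low-coverage hit between s3 and s4 is ignored, so the result is three clusters: s1 to s3, s4 and s5, and the singleton s6.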
Details
- License: MIT
- Tool Type: library
- Operating Systems: Linux, Windows, Mac
- Programming Languages: R
- Added: 7/29/2018
- Last Updated: 12/10/2018
Publications
Bennett DJ, Hettling H, Silvestro D, Zizka A, Bacon CD, Faurby S, Vos RA, Antonelli A. phylotaR: An Automated Pipeline for Retrieving Orthologous DNA Sequences from GenBank in R. Life. 2018;8(2):20. doi:10.3390/life8020020. PMID:29874797. PMCID:PMC6027284.