MLSTar

MLSTar: R-based automated MLST and cgMLST allele assignment via PubMLST


MLSTar automates allele and sequence type assignment from genome assemblies using multilocus sequence typing (MLST) and core genome MLST (cgMLST) schemes retrieved from the PubMLST database. It screens genomes against selected schemes and returns standardized typing results within the R environment.


Key Features:

  • PubMLST Integration: Connects to the PubMLST database via RESTful API to retrieve species-specific MLST and cgMLST schemes.
  • Automated Allele and Sequence Type Assignment: Screens genome datasets against selected schemes to assign alleles and sequence types programmatically.
  • R Environment Compatibility: Returns results as R objects that can be passed to other R packages for downstream statistical analysis and visualization.
  • Validated Performance: Demonstrated high concordance with established command-line tools in benchmarking using 400 Campylobacter coli genomes.
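
As a rough illustration of the PubMLST integration described above, the sketch below builds a scheme-listing URL for the PubMLST RESTful API and parses the response. It is written in Python for illustration only and is not MLSTar's R code. The database name pubmlst_campylobacter_seqdef, the helper names, and the assumed response shape (a "schemes" list whose entries carry "scheme" and "description" fields) are assumptions that should be checked against the current PubMLST API documentation.

```python
import json
from urllib.request import urlopen

# Base URL of the PubMLST RESTful API (assumed here; verify against the API docs).
BASE = "https://rest.pubmlst.org/db"


def schemes_url(database):
    """Build the URL that lists the schemes of one sequence-definition database."""
    return f"{BASE}/{database}/schemes"


def parse_schemes(payload):
    """Map scheme description -> scheme URL.

    Assumes the payload looks like
    {"schemes": [{"scheme": <url>, "description": <name>}, ...]}.
    A payload without a "schemes" key yields an empty dict.
    """
    return {s["description"]: s["scheme"] for s in payload.get("schemes", [])}


def fetch_schemes(database):
    """Query the API over the network and return the parsed scheme listing."""
    with urlopen(schemes_url(database), timeout=30) as resp:
        return parse_schemes(json.load(resp))
```

Calling fetch_schemes("pubmlst_campylobacter_seqdef") would, under these assumptions, return a dictionary from which an MLST or cgMLST scheme URL can be selected; parse_schemes can be exercised offline with a saved response.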

Scientific Applications:

  • Bacterial Epidemiology and Population Genomics: Supports MLST and cgMLST-based analysis of genetic diversity, population structure, outbreak investigation, and evolutionary relationships.

Methodology:

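MLSTar queries the PubMLST database to obtain an MLST or cgMLST scheme, retrieves the reference allele sequences for each locus in the scheme, and compares the input genome assemblies against those alleles to assign an allele number per locus. The resulting combination of allele numbers is then used to assign a sequence type. Support for cgMLST and whole-genome MLST (wgMLST) schemes depends on whether a given scheme is accessible through the RESTful API.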
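
The assignment logic described above can be sketched with toy data. The example below is written in Python for illustration and is not MLSTar's implementation: it swaps in exact substring matching on both strands for the sequence comparison step, and the loci, allele sequences, allele numbers, and sequence types are all made up.

```python
# Toy scheme (hypothetical data): locus -> {allele sequence: allele number}.
SCHEME_ALLELES = {
    "aspA": {"ATGGCA": 1, "ATGGCT": 2},
    "glnA": {"TTGACC": 1, "TTGACG": 4},
}

# Toy profile table: allele numbers in LOCI order -> sequence type (ST).
LOCI = ("aspA", "glnA")
PROFILES = {(1, 1): 10, (2, 4): 21}


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def assign_alleles(genome, scheme):
    """Return locus -> allele number, or None when no known allele matches.

    Exact substring matching stands in for a real sequence search.
    """
    # Search both strands; the "N" separator prevents matches across the join.
    both = genome + "N" + revcomp(genome)
    calls = {}
    for locus, alleles in scheme.items():
        calls[locus] = next(
            (num for seq, num in alleles.items() if seq in both), None
        )
    return calls


def assign_st(calls, loci, profiles):
    """Look up the ST for a full allele profile; None if incomplete or unknown."""
    key = tuple(calls[locus] for locus in loci)
    if None in key:
        return None
    return profiles.get(key)


# Toy assembly: aspA allele 2 on the forward strand, glnA allele 4 on the reverse.
genome = "CCCATGGCTGGG" + revcomp("TTGACG") + "AAA"
calls = assign_alleles(genome, SCHEME_ALLELES)
st = assign_st(calls, LOCI, PROFILES)
```

A locus with no matching allele is reported as None, and the sequence type is then left unassigned, which mirrors how an incomplete or novel profile cannot be mapped to a known sequence type.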

Details

License:
MIT
Tool Type:
command-line tool
Programming Languages:
R
Added:
1/9/2020
Last Updated:
12/29/2020

Publications

Ferrés I, Iraola G. MLSTar: automatic multilocus and core genome sequence typing in R. PeerJ Preprints. 2018. doi:10.7287/peerj.preprints.26630v2.