lobSTR

lobSTR profiles short tandem repeats (STRs) from next-generation sequencing (NGS) data, enabling accurate STR genotyping for applications including medical genetics, forensics, and genetic genealogy.


Key Features:

  • No gapped alignment: Implements an approach that circumvents gapped alignment for STR mapping to reduce alignment-induced bias in allele sampling.
  • Signal processing and statistical learning: Employs techniques from signal processing and statistical learning to model STR-specific sequence patterns and call alleles.
  • STR-specific noise handling: Explicitly addresses unique noise patterns associated with STR calling to improve genotyping accuracy.
  • Performance: Reported to run quickly and with high accuracy compared with other STR profiling algorithms available at the time of publication (2012).
  • Validation: Validated by consistency between whole-genome sequencing biological replicates, Mendelian inheritance tracing in a HapMap trio, and comparisons with traditional molecular techniques.
  • Large-scale survey capability: Applied to a deeply sequenced personal genome to characterize mutation dynamics at nearly 100,000 STR loci and identify over 50,000 STR variations.
  • Computational implementation and formats: Implemented in C/C++ with multi-threading support; takes raw sequencing reads as input, accepts BAM format, and outputs STR genotypes.
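
The trio-based validation listed above rests on a simple rule: at an autosomal locus, one of the child's two alleles must be traceable to the mother and the other to the father. The Python sketch below illustrates that check. It is not part of lobSTR; the function name and the representation of alleles as repeat counts are assumptions made for illustration, and the check ignores de novo mutations and genotyping errors.

```python
from typing import Tuple

Genotype = Tuple[int, int]  # two alleles, each given as a repeat count (assumed encoding)


def is_mendelian_consistent(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """Return True if one child allele can come from the mother and the other from the father."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


# Hypothetical genotypes at a single STR locus.
print(is_mendelian_consistent((12, 14), (12, 13), (14, 15)))  # True
print(is_mendelian_consistent((12, 12), (12, 13), (14, 15)))  # False
```

Under these assumptions, the fraction of loci that pass the check across a trio gives a rough measure of genotyping consistency.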

Scientific Applications:

  • Medical genetics: STR genotyping for studies of repeat-associated genetic variation and disease.
  • Forensics: STR profiling for identity and kinship analysis.
  • Genetic genealogy: STR-based lineage and relatedness inference.
  • Population and mutation studies: Large-scale surveys of STR variation and mutation dynamics across genomes.

Methodology:

lobSTR avoids gapped alignment and instead applies signal processing and statistical learning to raw sequencing reads to produce STR genotypes. It is implemented in C/C++, supports multi-threading, and accepts BAM format.
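
As a rough illustration of how a signal-processing view can reveal repeat structure in a read without aligning it, the Python sketch below scores how well a sequence matches a copy of itself shifted by each candidate period. This is a simplified stand-in, not lobSTR's actual detection algorithm; the function names and the default period range of 2 to 6 bp are assumptions made for illustration.

```python
from typing import Tuple


def lag_match_fraction(seq: str, lag: int) -> float:
    """Fraction of positions i where seq[i] equals seq[i + lag]."""
    n = len(seq) - lag
    if n <= 0:
        return 0.0
    matches = sum(1 for i in range(n) if seq[i] == seq[i + lag])
    return matches / n


def best_repeat_period(seq: str, min_period: int = 2, max_period: int = 6) -> Tuple[int, float]:
    """Return (period, score) for the lag with the highest self-match fraction.

    Ties are broken in favor of the shortest period, so a dinucleotide
    repeat is not reported with period 4 or 6.
    """
    scores = {k: lag_match_fraction(seq, k) for k in range(min_period, max_period + 1)}
    period = max(scores, key=lambda k: (scores[k], -k))
    return period, scores[period]


# Hypothetical reads consisting only of a repeat tract.
print(best_repeat_period("GATA" * 8))  # (4, 1.0)
print(best_repeat_period("CA" * 10))   # (2, 1.0)
```

A real read also contains flanking sequence and sequencing errors, so the score would fall below 1.0 and a threshold would be needed to decide whether the read carries an STR.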

Details

Maturity: Mature
Tool Type: workflow
Operating Systems: Linux
Programming Languages: R, C++, Python
Added: 1/13/2017
Last Updated: 11/25/2024

Publications

Gymrek M, Golan D, Rosset S, Erlich Y. lobSTR: A short tandem repeat profiler for personal genomes. Genome Research. 2012;22(6):1154-1162. doi:10.1101/gr.135780.111. PMID:22522390. PMCID:PMC3371701.
