Tally

Tally identifies structural tandem repeats (TRs) in protein sequences using machine learning to distinguish sequences with genuine 3D repeats from those without.


Key Features:

  • Machine Learning Approach: Tally uses a machine learning model to score protein sequences for the presence of structural tandem repeats.
  • Detection of Imperfect and Evolutionarily Obscured TRs: The method targets imperfect and evolutionarily modified repetitive patterns that are difficult to detect by sequence analysis alone.
  • Benchmark Performance: On its structural benchmark, Tally achieved a sensitivity of 81%, a specificity of 74%, and an area under the receiver operating characteristic curve (AUC) of 0.86.
  • Structural Benchmarking: Evaluation uses protein 3D structures composed of repetitive 3D blocks as a benchmark to assess detection of structural TRs.
  • Improved Separation over Traditional Scoring: Tally separates sequences with 3D TRs from those without more cleanly than traditional scoring methods.
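
The benchmark figures above (sensitivity, specificity, AUC) can be illustrated with a minimal sketch. The labels, scores, and threshold below are made-up toy values, not Tally output or the published benchmark, and the function names are hypothetical helpers introduced only for this illustration.

```python
# Toy illustration of sensitivity, specificity and ROC AUC.
# All data here are invented for the example; they are NOT Tally results.

def confusion_counts(labels, scores, threshold):
    """Return (tp, fp, tn, fn) when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y and predicted_positive:
            tp += 1
        elif y and not predicted_positive:
            fn += 1
        elif not y and predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """Fraction of true positives that are detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of true negatives that are correctly rejected."""
    return tn / (tn + fp)

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# 1 = sequence with a structural TR, 0 = sequence without (toy labels).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

tp, fp, tn, fn = confusion_counts(toy_labels, toy_scores, threshold=0.5)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("AUC:", roc_auc(toy_labels, toy_scores))
```

Sensitivity and specificity depend on the chosen score threshold, whereas AUC summarizes separation across all thresholds, which is why the three numbers are usually reported together.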

Scientific Applications:

  • Protein Structure Analysis: Identification of structurally significant TRs to inform protein architecture and function.
  • Functional Annotation: Selection of functionally meaningful TRs from proteomes to support annotation efforts.
  • Benchmarking Resource: Generation of a dataset for benchmarking detection of structural TRs in proteins.

Methodology:

Tally analyzes protein sequences with machine learning models trained on proteins of known 3D structure, and uses them to distinguish sequences that contain genuine structural TRs from those that do not.
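
The general idea of training a model on labeled examples and then classifying new sequences can be sketched as follows. This is not Tally's model or feature set: logistic regression is used here as a simple stand-in, and the two per-sequence features, the training values, and all function names are hypothetical.

```python
import math

# Generic binary-classifier sketch. Logistic regression is a stand-in
# technique, and the two features per sequence are hypothetical; neither
# is taken from Tally.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_has_structural_tr(weights, bias, x):
    """Return True when the predicted probability is at least 0.5."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z) >= 0.5

# Invented training set: each row is (hypothetical_feature_1, hypothetical_feature_2).
train_x = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9),   # labeled as having a 3D TR
           (0.2, 0.3), (0.1, 0.2), (0.3, 0.1)]   # labeled as lacking one
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(train_x, train_y)
print(predict_has_structural_tr(weights, bias, (0.85, 0.75)))
print(predict_has_structural_tr(weights, bias, (0.15, 0.25)))
```

In a real setting the labels would come from a structural benchmark such as the one described under Key Features, and the model would be evaluated on sequences held out from training.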

Details

License: GPL-3.0
Maturity: Mature
Tool Type: Web application
Operating Systems: Linux, Mac
Programming Languages: Python, C
Added: 1/13/2017
Last Updated: 11/25/2024

Publications

Richard FD, Alves R, Kajava AV. Tally: a scoring tool for boundary determination between repetitive and non-repetitive protein sequences. Bioinformatics. 2016;32(13):1952-1958. doi:10.1093/bioinformatics/btw118. PMID:27153701.