Tally
Tally identifies structural tandem repeats (TRs) in protein sequences using machine learning to distinguish sequences with genuine 3D repeats from those without.
Key Features:
- Machine Learning Approach: A trained machine learning model scores each protein sequence for the presence of structural tandem repeats.
- Detection of Imperfect and Evolutionarily Obscured TRs: The method targets imperfect and evolutionarily modified repetitive patterns that are difficult to detect by sequence analysis alone.
- Benchmark Performance: Tally achieved a sensitivity of 81%, a specificity of 74%, and an area under the receiver operating characteristic curve (AUC) of 86%, i.e. 0.86 on the 0-1 scale.
- Structural Benchmarking: Evaluation uses protein 3D structures composed of repetitive 3D blocks as a benchmark to assess detection of structural TRs.
- Improved Separation from Traditional Scoring: The approach provides superior separation between sequences with and without 3D TRs compared to traditional scoring methods.
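The benchmark figures above (sensitivity, specificity, AUC) can be illustrated with a minimal Python sketch. This is not Tally's code: the function names, the 0.5 threshold, and the toy labels and scores are all hypothetical, chosen only to show how the three metrics are derived from per-sequence scores and ground-truth labels (1 = sequence with a structural TR, 0 = sequence without).

```python
def sensitivity_specificity(labels, scores, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: four sequences with 3D repeats, four without.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
auc = roc_auc(labels, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
# prints: sensitivity=0.75 specificity=0.75 AUC=0.8125
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes separation across all thresholds, which is why it is the natural figure for comparing a scoring method against alternatives.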
Scientific Applications:
- Protein Structure Analysis: Identification of structurally significant TRs to inform protein architecture and function.
- Functional Annotation: Selection of functionally meaningful TRs from proteomes to support annotation efforts.
- Benchmarking Resource: Generation of a dataset for benchmarking detection of structural TRs in proteins.
Methodology:
Tally analyzes protein sequences with machine learning models trained on known 3D protein structures, and uses them to distinguish sequences with genuine structural TRs from those without.
Details
- License: GPL-3.0
- Maturity: Mature
- Tool Type: web application
- Operating Systems: Linux, Mac
- Programming Languages: Python, C
- Added: 1/13/2017
- Last Updated: 11/25/2024
Publications
Richard FD, Alves R, Kajava AV. Tally: a scoring tool for boundary determination between repetitive and non-repetitive protein sequences. Bioinformatics. 2016;32(13):1952-1958. doi:10.1093/bioinformatics/btw118. PMID:27153701.