REPRO

REPRO detects evolutionarily related sequence motifs within individual protein sequences, including highly diverged repeat units, to delineate homologous repeats for evolutionary, structural, and alignment analyses.

Key Features:

Sensitive repeat detection: Detects homologous repeat regions whose similarity ranges from identical to nearly undetectable by conventional alignment heuristics.
Smith–Waterman-based local alignment enumeration: Uses variations of the Smith–Waterman algorithm to enumerate high-scoring nonoverlapping fragment alignments.
Graph-based clustering: Groups alignments with shared N-terminal boundaries and iteratively subdivides alignments to define and extend repeat clusters.
Iterative multiple alignment and profile sliding: Iteratively aligns the detected repeats and "slides" the resulting profile across the query to recover weakly conserved fragments missed in the initial search.
Bootstrap-style iterative refinement: Applies a bootstrap-style iterative refinement that mimics expert manual repeat identification while remaining computationally tractable.
Performance optimizations: Incorporates optimizations that yield a speedup of at least 25-fold without compromising sensitivity.
Proteome-scale scalability: Scales to full proteomes for automated detection of diverse repeat architectures.
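
As a rough illustration of the profile "sliding" feature listed above, the following Python sketch builds a log-odds profile from a few already delineated repeat copies and scores every window of the query against it. It is not REPRO's implementation: the function names `build_profile` and `slide_profile`, the uniform background, the pseudocount, and the restriction to ungapped copies of equal length are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(repeats, pseudocount=1.0, alphabet=AMINO_ACIDS):
    """Build a log-odds profile from ungapped repeat copies of equal length.

    Assumption for this sketch: a uniform background distribution and a
    flat pseudocount per residue; REPRO's own scoring is not reproduced.
    """
    width = len(repeats[0])
    if any(len(r) != width for r in repeats):
        raise ValueError("all repeat copies must have the same length")
    background = 1.0 / len(alphabet)
    total = len(repeats) + pseudocount * len(alphabet)
    profile = []
    for col in range(width):
        counts = Counter(r[col] for r in repeats)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in alphabet
        })
    return profile


def slide_profile(profile, query):
    """Score every window of the query; return a list of (start, score).

    Residues outside the alphabet contribute 0.0 to the window score.
    """
    width = len(profile)
    return [
        (start, sum(profile[k].get(query[start + k], 0.0) for k in range(width)))
        for start in range(len(query) - width + 1)
    ]


if __name__ == "__main__":
    # Hypothetical toy data: two known copies and a query with a diverged third copy.
    profile = build_profile(["ACDEF", "ACDEY"])
    hits = slide_profile(profile, "MWACDEFPPACDQFLT")
    for start, score in sorted(hits, key=lambda h: h[1], reverse=True)[:2]:
        print(start, round(score, 2))
```

In this toy query the two highest-scoring windows are the exact copy and the copy carrying a substitution, which is the kind of weakly conserved fragment a sliding profile is meant to pick up.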

Scientific Applications:

Protein age estimation: Accurate repeat delineation facilitates estimation of protein or repeat unit ages.
Repeat-based structural modeling: Enables repeat-based fold inference and structural modeling of repeat proteins.
Improved multiple sequence alignments: Improves reliability of multiple sequence alignments by addressing misalignment caused by repeats.
Evolutionary and genomic analyses: Supports analysis of duplication, recombination, and fusion events in genomes.

Methodology:

The pipeline proceeds in three phases: (i) a comprehensive local alignment search using variations of the Smith–Waterman algorithm to enumerate high-scoring nonoverlapping fragment alignments; (ii) a graph-based clustering procedure that groups alignments with shared N-terminal boundaries and iteratively subdivides alignments to define initial repeat sets and extend clusters; and (iii) iterative multiple alignment and profile "sliding" across the query to detect additional weakly conserved fragments.
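
To make phase (i) more concrete, here is a minimal Python sketch of a plain Smith–Waterman comparison of a sequence against itself, restricted to the region above the main diagonal so that the trivial self-match is excluded. It is only an illustration under stated assumptions: the function name `self_local_alignment`, the simple match/mismatch scores, and the linear gap penalty are hypothetical, and the sketch returns a single best alignment rather than enumerating multiple nonoverlapping alignments as REPRO is described as doing.

```python
def self_local_alignment(seq, match=2, mismatch=-1, gap=-2):
    """Best local alignment of seq against itself, off the main diagonal.

    Returns (score, (start1, end1), (start2, end2)) with 0-based,
    half-open intervals for the two aligned fragments.
    Assumption for this sketch: identity scoring and a linear gap penalty.
    """
    n = len(seq)
    # H[i][j]: best score of a local alignment ending at seq[i-1] and seq[j-1].
    H = [[0] * (n + 1) for _ in range(n + 1)]
    best_score, best_end = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):  # only j > i: skip the trivial self-match
            s = match if seq[i - 1] == seq[j - 1] else mismatch
            H[i][j] = max(
                0,
                H[i - 1][j - 1] + s,  # align the two residues
                H[i - 1][j] + gap,    # gap in the second fragment
                H[i][j - 1] + gap,    # gap in the first fragment
            )
            if H[i][j] > best_score:
                best_score, best_end = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_end
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if seq[i - 1] == seq[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return best_score, (i, best_end[0]), (j, best_end[1])


if __name__ == "__main__":
    # Hypothetical toy sequence containing two copies of the same unit.
    seq = "MWACDEFGHIKPPACDEFGHIKLT"
    score, first, second = self_local_alignment(seq)
    print(score, seq[first[0]:first[1]], seq[second[0]:second[1]])
```

A fuller treatment would use a substitution matrix and affine gap penalties, and would repeat the search while masking cells already used in order to collect further alignments.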

Topics

Sequence composition, complexity and repeats
Nucleic acid structure analysis

Details

Tool Type: web application
Operating Systems: Linux, Windows, Mac
Added: 4/21/2017
Last Updated: 11/25/2024

Publications

George RA, Heringa J. The REPRO server: finding protein internal sequence repeats through the Web. Trends in Biochemical Sciences. 2000;25(10):515-517. doi:10.1016/s0968-0004(00)01643-1. PMID:11203383.

Documentation

General

http://www.ibi.vu.nl/programs/reprowww/info.php

Links

Software catalogue

http://www.mybiosoftware.com/repro-protein-repeats-analysis.html
