RepeatAnalyzer
RepeatAnalyzer tracks, manages, and analyzes short-sequence repeats (SSRs) and the genotypes built from them, quantifying SSR distribution and genetic diversity in prokaryotic and eukaryotic genomes. It was validated on Anaplasma marginale.
Key Features:
- Tracking and Management: Catalogs short-sequence repeats (SSRs) and genotypes for systematic documentation of repeat sequences.
- Analysis Capabilities: Computes metrics assessing regional genetic diversity, SSR variety, and SSR regularity within loci.
- Visualization Tools: Generates geographic maps illustrating the distribution of genotypes and SSRs across regions of interest.
- Validation and Accuracy: Repeat identification and genotyping were validated on 380 Anaplasma marginale isolates, confirming the precision of repeat calls.
- Error Detection: Detects discrepancies in published data, including misreported SSRs, duplicate names for different SSRs, and multiple names assigned to a single SSR.
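The two naming problems listed under Error Detection can be illustrated with a small sketch. This is not RepeatAnalyzer's own code: the function name `find_naming_conflicts`, the `(name, sequence)` record layout, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict


def find_naming_conflicts(records):
    """Flag naming conflicts in (name, sequence) pairs collected from publications.

    Hypothetical helper, not part of RepeatAnalyzer.
    Returns two dicts:
      reused_names: name -> sorted sequences, where one name covers different SSRs
      renamed_seqs: sequence -> sorted names, where one SSR carries several names
    """
    seqs_by_name = defaultdict(set)
    names_by_seq = defaultdict(set)
    for name, seq in records:
        seq = seq.strip().upper()  # normalize case and stray whitespace
        seqs_by_name[name].add(seq)
        names_by_seq[seq].add(name)

    reused_names = {n: sorted(s) for n, s in seqs_by_name.items() if len(s) > 1}
    renamed_seqs = {s: sorted(n) for s, n in names_by_seq.items() if len(n) > 1}
    return reused_names, renamed_seqs


# Toy records; the names and sequences are made up for this example.
records = [
    ("A", "ADSSSAGG"),
    ("B", "TDSSSAGG"),
    ("A", "ADSSSAGQ"),      # same name, different sequence
    ("alpha", "tdsssagg"),  # same sequence as "B" under another name
]
reused_names, renamed_seqs = find_naming_conflicts(records)
```

Here `reused_names` flags the name "A" and `renamed_seqs` flags the sequence shared by "B" and "alpha". Misreported SSRs (sequences that match no catalogued repeat) would need a reference catalog and are not covered by this sketch.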
Scientific Applications:
- Genotype Identification: Uses heterogeneous SSR patterns within loci to assign strain genotypes for epidemiological and pathogen-evolution studies.
- Genetic Diversity Analysis: Quantifies regional genetic diversity to inform population structure and evolutionary dynamics.
- Data Correction and Validation: Identifies and helps correct errors in published SSR data to improve reliability of genomic datasets.
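As a rough illustration of what quantifying regional genetic diversity can look like, the sketch below computes the Shannon diversity index of genotypes per region. The Shannon index is a stand-in chosen for this example, not the metric RepeatAnalyzer defines. The function names, the `(region, genotype)` layout, and the region and SSR labels are assumptions.

```python
import math
from collections import Counter, defaultdict


def shannon_diversity(genotypes):
    """Shannon index H = -sum(p * ln p) over genotype frequencies."""
    counts = Counter(genotypes)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def diversity_by_region(isolates):
    """Group (region, genotype) pairs by region and score each region.

    A genotype is represented as a tuple of SSR names (an assumption).
    """
    by_region = defaultdict(list)
    for region, genotype in isolates:
        by_region[region].append(tuple(genotype))
    return {region: shannon_diversity(g) for region, g in by_region.items()}


# Toy isolates; region and SSR names are placeholders.
isolates = [
    ("region_1", ("A", "B", "B")),
    ("region_1", ("A", "B", "B")),
    ("region_2", ("A", "B", "B")),
    ("region_2", ("C", "D")),
]
scores = diversity_by_region(isolates)
```

A region whose isolates all share one genotype scores 0, and the score rises as genotypes become more numerous and more evenly represented.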
Methodology:
Employs novel metrics to evaluate SSR distribution and validates the analyses on 380 Anaplasma marginale isolates. Genotype lengths are reported as approximately normally distributed, while SSR frequencies follow a power-law-like distribution. The edit-distance distribution shows that an edit distance of five or six is common. Over 90% of repeats are reported to be 28 or 29 amino acids long.
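The edit-distance distribution mentioned above can be sketched as follows, assuming the standard Levenshtein distance (insertions, deletions, and substitutions each cost 1). The source does not say which edit-distance variant the tool uses. The function names and toy sequences are illustrative only.

```python
from collections import Counter
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance via the usual two-row dynamic program."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def edit_distance_distribution(repeats):
    """Count how often each distance occurs over all pairs of distinct repeats."""
    unique = sorted(set(repeats))
    return Counter(edit_distance(a, b) for a, b in combinations(unique, 2))


# Toy sequences, not real SSRs.
distribution = edit_distance_distribution(["AAAA", "AAAT", "AATT", "AAAA"])
```

The mode of the resulting counter is the most common pairwise distance. The computation is quadratic in the number of distinct repeats, which is manageable for catalogs of a few hundred short sequences.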
Details
- License: GPL-3.0
- Tool Type: command-line tool
- Operating Systems: Linux, Windows, Mac
- Programming Languages: Python
- Added: 10/1/2018
- Last Updated: 12/10/2018
Publications
Catanese HN, Brayton KA, Gebremedhin AH. RepeatAnalyzer: a tool for analysing and managing short-sequence repeat data. BMC Genomics. 2016;17(1). doi:10.1186/s12864-016-2686-2. PMID:27260942. PMCID:PMC4891823.