RepeatExplorer

RepeatExplorer identifies and characterizes repetitive DNA elements in next-generation sequencing (NGS) datasets to determine repeat composition and evolutionary dynamics in plant and animal genomes.

Key Features:

Graph-Based Clustering: Employs graph-based similarity clustering of short NGS reads to partition data into clusters representing individual repeat families.
De Novo Identification: Performs de novo identification of repetitive elements without relying on reference databases of known elements.
Repeat Annotation and Quantification: Provides programs for annotation and quantification of identified repeats, including classification of repeat types.
Phylogenetic and Comparative Analyses: Supports investigation of phylogenetic relationships among retroelements and comparative analyses across multiple species.
Visualization (SeqGrapheR): Supports visual inspection of clusters with SeqGrapheR to explore cluster structure and sequence variability.
Scalability and Low-Pass Data Support: Analyzes low-pass genome sequencing data comprising several million reads, which is sufficient to detect high- and medium-copy repeats.
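
The graph-based clustering idea listed above can be illustrated with a minimal Python sketch. It is not RepeatExplorer's code: the function name `cluster_reads`, the `(read_a, read_b, identity)` input tuples, and the 90% identity threshold are assumptions made for illustration, and plain connected components stand in for whatever partitioning step the actual pipeline applies. Reads are graph vertices, sufficiently similar read pairs are edges, and each resulting group approximates one repeat family.

```python
from collections import defaultdict


def cluster_reads(similarity_hits, min_identity=90.0):
    """Group reads into clusters of mutually connected reads.

    similarity_hits: iterable of (read_a, read_b, percent_identity) tuples,
    a hypothetical stand-in for the output of an all-to-all read comparison.
    Returns a list of sets of read IDs, largest cluster first.
    """
    parent = {}

    def find(read):
        # Register unseen reads as their own cluster, then walk to the root.
        parent.setdefault(read, read)
        while parent[read] != read:
            parent[read] = parent[parent[read]]  # path halving
            read = parent[read]
        return read

    for read_a, read_b, identity in similarity_hits:
        root_a, root_b = find(read_a), find(read_b)
        # An edge exists only if the hit passes the (assumed) threshold.
        if identity >= min_identity and root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for read in list(parent):
        clusters[find(read)].add(read)
    return sorted(clusters.values(), key=len, reverse=True)


# Toy input: r1-r2-r3 are chained by strong hits, r4-r5 form a pair,
# and r6 only has a weak hit, so it stays a singleton.
example_hits = [
    ("r1", "r2", 95.0),
    ("r2", "r3", 92.0),
    ("r4", "r5", 97.0),
    ("r5", "r6", 60.0),
]
example_clusters = cluster_reads(example_hits)
print([len(c) for c in example_clusters])  # cluster sizes: [3, 2, 1]
```

Connected components are the simplest possible partitioning and tend to merge families that share only a few bridging reads; a community-detection step would separate such cases, at the cost of more code.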

Scientific Applications:

Genome Structure and Evolution Studies: Enables characterization of repetitive sequence content to study genome structure and evolutionary dynamics in plants and animals.
Repeat Family Analysis: Supports statistical analysis of cluster sizes and visual inspection of clusters to distinguish repeat types and assess intra-family sequence variability.
Novel Element Discovery: Facilitates discovery and characterization of novel repeat elements and assembly of consensus sequences to investigate repeat family divergence.

Methodology:

Genome sequence reads are partitioned into clusters by similarity-based graph partitioning, with each cluster representing a repeat family. Consensus sequences are then assembled, and classification and quantification programs are applied to the clusters. Cluster sizes are assessed by statistical analysis, and cluster structure is inspected visually with SeqGrapheR. The method operates on low-pass NGS data and does not rely on reference repeat databases.
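
The quantification step can be sketched in a few lines of Python, under the assumption that the reads are an unbiased random sample of the genome, so that the fraction of reads falling into a cluster approximates the fraction of the genome occupied by that repeat family. The function name `genome_proportions` and the cluster labels are hypothetical and are not taken from RepeatExplorer itself.

```python
def genome_proportions(cluster_sizes, total_reads):
    """Estimate the genome proportion of each cluster as reads / total reads.

    cluster_sizes: dict mapping a cluster label to its number of reads.
    total_reads: number of reads that entered the clustering, including
    reads that ended up in no cluster.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if sum(cluster_sizes.values()) > total_reads:
        raise ValueError("clusters contain more reads than were analyzed")
    return {label: size / total_reads for label, size in cluster_sizes.items()}


# Hypothetical cluster sizes from a run on one million reads.
sizes = {"CL1": 50_000, "CL2": 12_500, "CL3": 2_500}
for label, proportion in genome_proportions(sizes, 1_000_000).items():
    print(f"{label}: {proportion:.2%} of the genome")
```

With low-pass data only high- and medium-copy repeats gather enough reads to form clusters, so the proportions of small clusters carry the largest sampling error.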

Topics

DNA polymorphism, Mobile genetic elements

Collections

Czech Republic, ELIXIR-CZ

Details

License: GPL-3.0
Maturity: Mature
Cost: Free of charge
Tool Type: command-line tool, web application
Operating Systems: Linux
Programming Languages: R, Perl, Python
Added: 11/9/2015
Last Updated: 11/25/2024

Operations

Genome annotation

Outputs

Map
- Map format

Publications

Novák P, Neumann P, Macas J. Graph-based clustering and characterization of repetitive sequences in next-generation sequencing data. BMC Bioinformatics. 2010;11:378. doi:10.1186/1471-2105-11-378. PMID:20633259. PMCID:PMC2912890.

Novák P, Neumann P, Pech J, Steinhaisl J, Macas J. RepeatExplorer: a Galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads. Bioinformatics. 2013;29(6):792-793. doi:10.1093/bioinformatics/btt054. PMID:23376349.

Documentation

http://repeatexplorer.org

Downloads

Binaries: https://bitbucket.org/petrnovak/repex_tarean
Software package: https://toolshed.g2.bx.psu.edu/view/petr-novak/repeatexplorer2 (Galaxy toolshed package)
Source code: https://bitbucket.org/petrnovak/repex_tarean

Links

Repository

https://bitbucket.org/petrnovak/repex_tarean

Galaxy service

https://repeatexplorer-elixir.cerit-sc.cz/
