RSAT oligo-analysis

RSAT oligo-analysis computes k-mer frequencies and detects statistically over- and under-represented oligonucleotides and oligopeptides. It is used to identify regulatory motifs and sequence signals such as transcription factor binding sites, restriction sites, RNA termination and polyadenylation signals, and replication origins.


Key Features:

  • K-mer frequency computation: Computes and analyzes frequencies of oligonucleotides (k-mers) and oligopeptides across nucleotide and protein sequences.
  • Motif discovery: Identifies DNA binding sites for transcription factors by extracting and analyzing upstream regulatory sequences from families of coregulated genes.
  • Statistical rigor: Employs an exhaustive statistical approach in which motif significance is defined relative to oligonucleotide frequencies observed in non-coding sequences of the yeast genome.
  • Over-/under-representation detection: Detects motifs that are significantly over-represented or under-represented in input sequence sets.
  • Polyadenylation signal analysis: Analyzes oligonucleotide composition downstream of stop codons to identify over-represented words associated with polyadenylation efficiency and positioning, including positional distributions centered roughly 35 bp after the stop codon.
  • Oligopeptide frequency analysis: Computes and analyzes oligopeptide frequencies in protein sequences to reveal sequence patterns.
  • Restriction site identification: Detects motifs indicative of restriction sites within bacterial genomes by analyzing motif frequency patterns.
  • Comparative genomics and genome-scale pattern matching: Supports comparative genomics and regulatory variation analyses, and matches patterns across whole genomes.
  • Data integration: Integrates data from fully sequenced genomes with updates from GenBank for background and comparative analyses.
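
The k-mer frequency computation listed above can be illustrated with a minimal Python sketch. This is not RSAT's Perl implementation; the function names `count_kmers` and `kmer_frequencies`, the `alphabet` parameter, and the toy sequences are assumptions made for illustration only.

```python
from collections import Counter

def count_kmers(sequences, k, alphabet="ACGT"):
    """Count overlapping k-mers over all sequences.

    Words containing a letter outside `alphabet` (e.g. N) are skipped.
    Pass a 20-letter amino-acid alphabet to count oligopeptides instead.
    """
    allowed = set(alphabet)
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if set(word) <= allowed:
                counts[word] += 1
    return counts

def kmer_frequencies(counts):
    """Convert raw counts to relative frequencies (empty dict if no words)."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}

# Toy input (hypothetical sequences, not real upstream regions).
toy = ["ACGTACGT", "TTACGN"]
counts = count_kmers(toy, 3)
freqs = kmer_frequencies(counts)
```

This sketch counts a single strand only and treats overlapping occurrences as separate words; whether both strands are pooled or overlaps are discarded is a design choice that a real analysis has to make explicitly.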

Scientific Applications:

  • Transcription factor binding motif discovery: Identification of putative transcription factor binding sites from coregulated gene upstream regions.
  • Restriction site identification: Mapping and detection of restriction motifs in bacterial genomic sequences.
  • RNA processing signal detection: Detection of RNA termination and polyadenylation signals by analyzing oligonucleotide composition downstream of stop codons.
  • Replication origin analysis: Analysis of oligonucleotide frequency patterns to investigate replication origin sequences.
  • Protein sequence analysis: Examination of oligopeptide frequency distributions to study protein sequence features and evolution.
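
For the RNA processing application, what matters is where a word occurs relative to the stop codon, not only how often it occurs. The sketch below bins the start positions of one word across a set of downstream sequences. It assumes each sequence starts immediately after the stop codon; the function name `word_position_histogram`, the bin size, and the toy sequences are hypothetical and chosen only to show the idea.

```python
from collections import Counter

def word_position_histogram(downstream_seqs, word, bin_size=10):
    """Histogram of start positions of `word`, binned by `bin_size`.

    Position 0 is assumed to be the first base after the stop codon.
    Keys are the lower bound of each bin; overlapping hits are all counted.
    """
    word = word.upper()
    hist = Counter()
    for seq in downstream_seqs:
        seq = seq.upper()
        start = seq.find(word)
        while start != -1:
            hist[(start // bin_size) * bin_size] += 1
            start = seq.find(word, start + 1)
    return dict(sorted(hist.items()))

# Toy downstream sequences (hypothetical, for illustration only).
toy_downstream = ["AAAATATATA", "GGGGGGGGGGGGTATATA"]
hist = word_position_histogram(toy_downstream, "TATATA")
```

A peak in such a histogram at a particular distance would suggest a positional constraint, but judging whether the peak is significant requires a statistical test that this sketch does not include.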

Methodology:

The method computes k-mer frequencies in the input sequences and applies an exhaustive statistical analysis in which motif significance is defined from the frequencies observed in non-coding sequences of the yeast genome. Words that are significantly over- or under-represented are reported as candidate motifs. Documented uses include analyzing oligonucleotide composition downstream of stop codons (with enrichment noted about 35 bp after the stop codon), extracting and analyzing upstream regulatory sequences from coregulated genes, and analyzing oligopeptide frequencies in protein sequences. The tool also supports genome-scale pattern matching and comparative genomics analyses.
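
One way to turn an observed count and a background frequency into a significance score is a binomial upper-tail test, sketched below in Python. This is an illustration under stated assumptions, not a reproduction of the tool's code: the binomial model, the multiple-testing correction (E-value = P-value times number of words tested), the score defined as -log10 of the E-value, and all function and parameter names are assumptions introduced here.

```python
import math

def log_binom_pmf(n, i, p):
    """Log of the binomial probability P(X = i) for X ~ Binomial(n, p), 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p))

def binomial_upper_tail(n, x, p):
    """P(X >= x): probability of at least x occurrences among n positions."""
    if x <= 0:
        return 1.0
    # Summing in log space avoids overflow of the binomial coefficient.
    return min(1.0, sum(math.exp(log_binom_pmf(n, i, p)) for i in range(x, n + 1)))

def overrepresentation(observed, total_positions, expected_freq, n_words_tested):
    """Return (p_value, e_value, sig) for one word.

    expected_freq is the background frequency of the word, for example
    estimated from non-coding sequences. Assumed convention:
    sig = -log10(e_value), so sig > 0 means fewer than one false
    positive is expected among the words tested.
    """
    p_value = binomial_upper_tail(total_positions, observed, expected_freq)
    e_value = p_value * n_words_tested
    sig = -math.log10(e_value) if e_value > 0 else float("inf")
    return p_value, e_value, sig

# Toy numbers (hypothetical): a hexanucleotide seen 12 times in 4,000
# positions, with background frequency 0.0005 and 4**6 words tested.
p_value, e_value, sig = overrepresentation(12, 4000, 0.0005, 4 ** 6)
```

Under-representation can be tested the same way with the lower tail, P(X <= x). The binomial model treats positions as independent, which overlapping and self-overlapping words violate, so the scores from this sketch are approximate.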

Details

License: AFL-3.0
Maturity: Mature
Cost: Free of charge
Tool Type: API, command-line tool, web application
Operating Systems: Linux, Mac
Programming Languages: Perl
Added: 3/24/2016
Last Updated: 11/24/2024

Operations

Data Inputs & Outputs

Sequence motif discovery

Publications

van Helden J, André B, Collado-Vides J. Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. Journal of Molecular Biology. 1998;281(5):827-842. doi:10.1006/jmbi.1998.1947. PMID:9719638.

van Helden J. Statistical analysis of yeast genomic downstream sequences reveals putative polyadenylation signals. Nucleic Acids Research. 2000;28(4):1000-1010. doi:10.1093/nar/28.4.1000. PMID:10648794. PMCID:PMC102588.

Medina-Rivera A, Defrance M, Sand O, Herrmann C, Castro-Mondragon JA, Delerce J, Jaeger S, Blanchet C, Vincens P, Caron C, Staines DM, Contreras-Moreira B, Artufel M, Charbonnier-Khamvongsa L, Hernandez C, Thieffry D, Thomas-Chollier M, van Helden J. RSAT 2015: Regulatory Sequence Analysis Tools. Nucleic Acids Research. 2015;43(W1):W50-W56. doi:10.1093/nar/gkv362. PMID:25904632. PMCID:PMC4489296.

van Helden J. Regulatory Sequence Analysis Tools. Nucleic Acids Research. 2003;31(13):3593-3596. doi:10.1093/nar/gkg567. PMID:12824373. PMCID:PMC168973.

Links

Service (Metazoa-specific RSAT instance): http://metazoa.rsat.eu/oligo-analysis_form.cgi
Service (Prokaryote-specific RSAT instance, Bacteria and Archaea): http://prokaryotes.rsat.eu/oligo-analysis_form.cgi
Service (Fungi-specific RSAT instance): http://fungi.rsat.eu/oligo-analysis_form.cgi
Service (Protist-specific RSAT instance): http://protists.rsat.eu/oligo-analysis_form.cgi
Service (Plant-specific RSAT instance): http://plants.rsat.eu/oligo-analysis_form.cgi