CorGen

CorGen measures long-range correlations in DNA GC-content and generates random DNA sequences that preserve those correlation parameters for use as null models in computational genomics.


Key Features:

  • Measurement and Analysis: Quantifies long-range correlations in base composition, with explicit analysis of GC-content autocorrelation and its power-law decay behavior.
  • Random Sequence Generation: Produces random DNA sequences whose correlation parameters either match those measured in the input sequence or are specified by the user, for use as correlated null models.
  • Expansion–Randomization Dynamics: Generates sequences with a specified long-range correlation structure by repeatedly duplicating sites (expansion) and replacing sites with random bases (randomization).
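The measurement feature above can be illustrated with a minimal sketch of a GC-content autocorrelation estimate. It is not CorGen's own code: the function names `gc_indicator` and `gc_autocorrelation` are hypothetical, and the estimator (mean pair product minus squared mean GC-content) is one common choice assumed here for illustration.

```python
def gc_indicator(seq):
    """Map a DNA string to 1 (G or C) or 0 (A or T); other symbols are skipped."""
    return [1 if base in "GC" else 0 for base in seq.upper() if base in "ACGT"]


def gc_autocorrelation(seq, max_lag):
    """Return [C(1), ..., C(max_lag)] with C(r) = <x_i * x_(i+r)> - <x>^2.

    x_i is the GC indicator at position i. This estimator is an assumption
    made for illustration, not necessarily the one CorGen uses.
    """
    x = gc_indicator(seq)
    n = len(x)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    mean = sum(x) / n
    corr = []
    for r in range(1, max_lag + 1):
        # Average the product of indicators over all pairs at distance r.
        pair_mean = sum(x[i] * x[i + r] for i in range(n - r)) / (n - r)
        corr.append(pair_mean - mean * mean)
    return corr
```

For a sequence with long-range correlations, plotting C(r) against r on log-log axes should give an approximately straight line over the range where the power-law decay holds. The pure-Python loop is quadratic in the worst case, so an FFT-based estimate would be more practical for chromosome-scale input.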

Scientific Applications:

  • Sequence Alignment: Provides correlated null sequences so that alignment score statistics, and the comparative genomic analyses built on them, account for long-range correlations.
  • Motif Finding Algorithms: Incorporates long-range correlations into null models so that candidate motifs are assessed against a background with a realistic correlation structure.

Methodology:

CorGen computes the GC-content autocorrelation function of a sequence and characterizes its power-law decay. It then uses expansion–randomization dynamics to generate random sequences that preserve the specified correlation parameters.
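The generation step can be sketched as a simplified, discrete-time version of expansion–randomization dynamics. This is an illustration under stated assumptions, not CorGen's implementation: the function name `expansion_randomization` and the parameters `dup_prob`, `mut_prob`, and `gc_content` are hypothetical, and the mapping from these probabilities to a target decay exponent is not shown.

```python
import random


def expansion_randomization(target_len, dup_prob, mut_prob, gc_content=0.5, seed=None):
    """Grow a DNA sequence by repeated rounds of duplication and randomization.

    Hypothetical, simplified sketch: in each round every site is duplicated
    with probability dup_prob (expansion), then every site is replaced by a
    freshly drawn base with probability mut_prob (randomization).
    """
    if not 0.0 < dup_prob <= 1.0:
        raise ValueError("dup_prob must be in (0, 1] so the sequence can grow")
    if not 0.0 <= mut_prob <= 1.0:
        raise ValueError("mut_prob must be in [0, 1]")
    rng = random.Random(seed)

    def random_base():
        # Draw G/C with probability gc_content, otherwise A/T.
        return rng.choice("GC") if rng.random() < gc_content else rng.choice("AT")

    seq = [random_base()]
    while len(seq) < target_len:
        grown = []
        for base in seq:
            grown.append(base)
            if rng.random() < dup_prob:
                grown.append(base)  # expansion: copy the site next to itself
        # Randomization: overwrite some sites with uncorrelated bases.
        seq = [random_base() if rng.random() < mut_prob else b for b in grown]
    return "".join(seq[:target_len])
```

Duplication keeps neighboring sites identical, and later rounds stretch those identical stretches over larger distances, while randomization erodes them. The balance between the two determines how quickly correlations decay with distance. For example, `expansion_randomization(10000, dup_prob=0.1, mut_prob=0.02, seed=1)` returns a 10,000-base sequence that is reproducible for a given seed.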

Details

Tool Type: web application
Operating Systems: Linux, Windows, Mac
Added: 2/10/2017
Last Updated: 11/25/2024

Publications

Messer PW, Arndt PF. CorGen--measuring and generating long-range correlations for DNA sequence analysis. Nucleic Acids Research. 2006;34(Web Server):W692-W695. doi:10.1093/nar/gkl234. PMID:16845099. PMCID:PMC1538783.
