Pro-Coffee

Pro-Coffee aligns homologous promoter regions using a dinucleotide substitution matrix derived from TRANSFAC functional binding-site alignments to improve identification of orthologous regulatory sequences and transcription factor binding sites.


Key Features:

  • Dinucleotide Substitution Matrix: Pro-Coffee employs a dinucleotide substitution matrix estimated from alignments of functional binding sites sourced from the TRANSFAC database.
  • Tailored for Regulatory Sequences: The method is specialized for promoter and regulatory region alignment, distinguishing it from general-purpose aligners.
  • Validation Framework: The method was evaluated on a dataset comprising several thousand families of orthologous promoters to assess ortholog versus paralog identification.
  • Enhanced Ortholog Identification Accuracy: Pro-Coffee achieved 80.4% accuracy in identifying true orthologs, compared with averages of 73.5% for other methods run with default parameters and 77.6% for other methods trained on the same dataset.
  • Multi-species ChIP-seq Validation: Validation included a procedure using multi-species ChIP-seq data to test alignment of experimentally detected binding sites across species.
  • Binding Site Alignment Performance: Pro-Coffee correctly aligned 331 transcription factor binding sites, about 16.5% more than the average for default methods (284) and more than the figure for trained methods (316).
  • Correlation with Ortholog Classification: A method's ortholog classification accuracy correlated strongly with its ability to align experimentally proven binding sites, suggesting that methods trained on the ortholog dataset produce functionally more informative alignments.
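
The relative gains implied by the binding-site counts above can be checked with a few lines of arithmetic. This is only a sketch: the helper name `relative_gain` is hypothetical, and the counts are copied from this entry rather than recomputed from data.

```python
def relative_gain(value: float, baseline: float) -> float:
    """Return the percentage by which `value` exceeds `baseline`."""
    return (value - baseline) / baseline * 100.0


# Counts of correctly aligned binding sites, as reported in this entry.
pro_coffee = 331
default_avg = 284
trained_avg = 316

gain_vs_default = relative_gain(pro_coffee, default_avg)
gain_vs_trained = relative_gain(pro_coffee, trained_avg)

print(f"vs default methods: {gain_vs_default:.1f}%")  # prints 16.5%
print(f"vs trained methods: {gain_vs_trained:.1f}%")  # prints 4.7%
```

The margin over trained methods (roughly 4.7%) is much smaller than the margin over default methods, which is consistent with the observation that training on the ortholog dataset improves the other aligners.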

Scientific Applications:

  • Gene regulation studies: Generate more accurate promoter-region alignments to support analysis of transcription factor binding and regulatory mechanisms.
  • Comparative genomics: Identify orthologous promoters across species to investigate evolutionary conservation and divergence in regulatory networks.

Methodology:

Pro-Coffee uses a dinucleotide substitution matrix estimated from alignments of functional binding sites in TRANSFAC. It was evaluated on several thousand families of orthologous promoters and further validated with multi-species ChIP-seq data, comparing trained and untrained methods. Two metrics were used: ortholog classification accuracy and the number of correctly aligned binding sites.
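
To illustrate what scoring with a dinucleotide substitution matrix means in general, the sketch below scores two equal-length, ungapped sequences by comparing overlapping dinucleotides. It is not Pro-Coffee's implementation: the matrix values are placeholders (not the TRANSFAC-derived estimates), and the names `toy_matrix` and `score_ungapped` are hypothetical.

```python
from itertools import product
from typing import Dict, Tuple

BASES = "ACGT"
# All 16 possible dinucleotides: AA, AC, ..., TT.
DINUCS = ["".join(pair) for pair in product(BASES, repeat=2)]

Matrix = Dict[Tuple[str, str], int]


def toy_matrix() -> Matrix:
    """Build a placeholder 16x16 matrix (assumed values, for illustration).

    Score is 2 when both positions match, 0 when one matches, -1 when none do.
    """
    score_by_matches = {2: 2, 1: 0, 0: -1}
    matrix: Matrix = {}
    for a, b in product(DINUCS, repeat=2):
        matches = sum(x == y for x, y in zip(a, b))
        matrix[(a, b)] = score_by_matches[matches]
    return matrix


def score_ungapped(seq_a: str, seq_b: str, matrix: Matrix) -> int:
    """Sum matrix scores over overlapping dinucleotide pairs."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    total = 0
    for i in range(len(seq_a) - 1):
        total += matrix[(seq_a[i:i + 2], seq_b[i:i + 2])]
    return total


matrix = toy_matrix()
print(score_ungapped("TATAAA", "TATATA", matrix))  # prints 6
```

Because each position contributes to two overlapping dinucleotides, a matrix of this shape can reward or penalize substitutions depending on the neighboring base, which a single-nucleotide matrix cannot do.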

Details

Tool Type: web application
Operating Systems: Linux, Windows, Mac
Added: 8/3/2017
Last Updated: 11/25/2024

Publications

Erb I, González-Vallinas JR, Bussotti G, Blanco E, Eyras E, Notredame C. Use of ChIP-Seq data for the design of a multiple promoter-alignment method. Nucleic Acids Research. 2012;40(7):e52. doi:10.1093/nar/gkr1292. PMID:22230796. PMCID:PMC3326335.