Pro-Coffee
Pro-Coffee aligns homologous promoter regions using a dinucleotide substitution matrix derived from TRANSFAC functional binding-site alignments to improve identification of orthologous regulatory sequences and transcription factor binding sites.
Key Features:
- Dinucleotide Substitution Matrix: Pro-Coffee employs a dinucleotide substitution matrix estimated from alignments of functional binding sites sourced from the TRANSFAC database.
- Tailored for Regulatory Sequences: The method is specialized for promoter and regulatory region alignment, distinguishing it from general-purpose aligners.
- Validation Framework: The method was evaluated on a dataset comprising several thousand families of orthologous promoters to assess ortholog versus paralog identification.
- Enhanced Ortholog Identification Accuracy: Pro-Coffee identified true orthologs with 80.4% accuracy, compared with an average of 73.5% for other alignment methods and 77.6% when those methods were trained for the task.
- Multi-species ChIP-seq Validation: Validation included a procedure using multi-species ChIP-seq data to test alignment of experimentally detected binding sites across species.
- Binding Site Alignment Performance: Pro-Coffee correctly aligned 331 transcription factor binding sites, about 16.5% more than the average for methods run with default parameters (284) and more than the average for trained methods (316).
- Correlation with Ortholog Classification: A method's accuracy in classifying orthologs correlated strongly with its ability to align experimentally proven binding sites, suggesting that training on the ortholog dataset makes alignments more functionally informative.
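The binding-site counts quoted in the feature list can be checked with a few lines of arithmetic. The sketch below only recomputes relative gains from the three counts given above; the function name `relative_gain` is illustrative and not part of Pro-Coffee.

```python
# Counts of correctly aligned binding sites, as quoted in the feature list.
PRO_COFFEE = 331
DEFAULT_AVG = 284   # average for methods run with default parameters
TRAINED_AVG = 316   # average for trained methods


def relative_gain(new: float, baseline: float) -> float:
    """Return the percentage by which `new` exceeds `baseline`."""
    return (new - baseline) / baseline * 100.0


gain_vs_default = relative_gain(PRO_COFFEE, DEFAULT_AVG)
gain_vs_trained = relative_gain(PRO_COFFEE, TRAINED_AVG)

print(f"vs default methods: {gain_vs_default:.1f}%")  # 16.5%
print(f"vs trained methods: {gain_vs_trained:.1f}%")  # 4.7%
```

The gain over trained methods is much smaller than the gain over default methods, which fits the observation that training narrows the gap.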
Scientific Applications:
- Gene regulation studies: Generate more accurate promoter-region alignments to support analysis of transcription factor binding and regulatory mechanisms.
- Comparative genomics: Identify orthologous promoters across species to investigate evolutionary conservation and divergence in regulatory networks.
Methodology:
Pro-Coffee uses a dinucleotide substitution matrix estimated from alignments of functional binding sites from TRANSFAC. It was evaluated on several thousand families of orthologous promoters, with additional validation on multi-species ChIP-seq data. Both evaluations compared trained and untrained methods, using two metrics: ortholog classification accuracy and the number of correctly aligned binding sites.
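To make the idea of a dinucleotide substitution matrix concrete, here is a minimal sketch of a generic log-odds estimate computed from ungapped, equal-length aligned sites. It is not the estimation procedure or the matrix used by Pro-Coffee. The toy sites are made up and are not TRANSFAC data, and the function names and the pseudocount are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# All 16 dinucleotides over the DNA alphabet.
DINUCS = [a + b for a in "ACGT" for b in "ACGT"]


def dinucleotides(seq):
    """Return the overlapping dinucleotides of an ungapped sequence."""
    return [seq[i:i + 2] for i in range(len(seq) - 1)]


def log_odds_matrix(aligned_sites, pseudocount=1.0):
    """Estimate log2-odds scores for aligned dinucleotide pairs.

    Assumes ungapped sites of equal length (a simplification).
    """
    counts = Counter()
    for s, t in combinations(aligned_sites, 2):
        for x, y in zip(dinucleotides(s), dinucleotides(t)):
            counts[(x, y)] += 1
            counts[(y, x)] += 1  # count both orders so the matrix is symmetric
    total = sum(counts.values()) + pseudocount * len(DINUCS) ** 2
    # Joint frequency of observing x aligned to y.
    q = {(x, y): (counts[(x, y)] + pseudocount) / total
         for x in DINUCS for y in DINUCS}
    # Background frequency of x, taken as the marginal of q.
    p = {x: sum(q[(x, y)] for y in DINUCS) for x in DINUCS}
    return {(x, y): math.log2(q[(x, y)] / (p[x] * p[y]))
            for x in DINUCS for y in DINUCS}


def score_pair(s, t, matrix):
    """Sum dinucleotide substitution scores over two aligned sequences."""
    return sum(matrix[(x, y)]
               for x, y in zip(dinucleotides(s), dinucleotides(t)))


# Made-up toy sites for illustration only.
toy_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
matrix = log_odds_matrix(toy_sites)

same = score_pair("TGACTCA", "TGACTCA", matrix)
different = score_pair("TGACTCA", "ACGTACG", matrix)
print(same > different)
```

Scoring adjacent nucleotide pairs rather than single bases lets the matrix capture neighbor dependencies within binding sites, which a mononucleotide matrix cannot express.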
Details
- Tool Type: web application
- Operating Systems: Linux, Windows, Mac
- Added: 8/3/2017
- Last Updated: 11/25/2024
Publications
Erb I, González-Vallinas JR, Bussotti G, Blanco E, Eyras E, Notredame C. Use of ChIP-Seq data for the design of a multiple promoter-alignment method. Nucleic Acids Research. 2012;40(7):e52. doi:10.1093/nar/gkr1292. PMID:22230796. PMCID:PMC3326335.