cobind

cobind quantifies the collocation strength between two sets of genomic intervals. It reports six statistical measures, giving a graded and reproducible assessment of overlap for genomic and epigenomic analyses.


Key Features:

  • Statistical measures: Implements the Jaccard coefficient, Sørensen–Dice coefficient, Szymkiewicz–Simpson coefficient, collocation coefficient, pointwise mutual information (PMI), and normalized PMI for quantitative overlap assessment.
  • Quantitative collocation assessment: Provides a continuous evaluation of overlap strength instead of relying on binary overlap thresholds.
  • Reproducibility and comparability: Uses standardized statistical metrics to enable reproducible and comparable results across datasets and studies.
  • Application to genomic datasets: Applied to analyses of CTCF binding sites from ChIP-seq, identification of cancer-specific open-chromatin regions (OCRs) using ATAC-seq across 17 cancer types, and examination of oligodendrocyte-specific OCRs from single-cell ATAC-seq (scATAC-seq).
  • Discovery of regulatory elements: Facilitates re-discovery of CTCF cofactors and master regulators specific to cancers and oligodendrocytes.
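
Every measure listed above starts from three quantities: the number of bases covered by each interval set and the number of bases the two sets share. The following is a minimal sketch of how those quantities could be computed, assuming half-open (start, end) intervals on a single chromosome. The helper names are hypothetical and are not taken from cobind's own interface.

```python
from typing import List, Tuple

# Assumption: half-open [start, end) intervals, all on the same chromosome.
Interval = Tuple[int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_bases(intervals: List[Interval]) -> int:
    """Count distinct bases covered by an interval set."""
    return sum(end - start for start, end in merge_intervals(intervals))


def overlap_bases(a: List[Interval], b: List[Interval]) -> int:
    """Count bases covered by both interval sets (two-pointer sweep)."""
    a_m, b_m = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a_m) and j < len(b_m):
        lo = max(a_m[i][0], b_m[j][0])
        hi = min(a_m[i][1], b_m[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a_m[i][1] < b_m[j][1]:
            i += 1
        else:
            j += 1
    return total


# Illustrative (made-up) coordinates, not real peak calls.
set_a = [(100, 200), (150, 250), (400, 500)]
set_b = [(180, 300), (450, 460)]
print(covered_bases(set_a))         # 250
print(covered_bases(set_b))         # 130
print(overlap_bases(set_a, set_b))  # 80
```

Merging each set first keeps bases from being counted twice when intervals within one set overlap each other. Real data would need the same calculation repeated per chromosome.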

Scientific Applications:

  • Quantitative genomic overlap analysis: Enables detailed measurement of interval collocation strength in genomic and epigenomic studies.
  • CTCF binding analysis: Supports analysis of CTCF binding sites derived from ChIP-seq data.
  • Cancer OCR identification: Identifies cancer-specific open-chromatin regions using ATAC-seq across 17 cancer types.
  • Oligodendrocyte chromatin analysis: Examines oligodendrocyte-specific OCRs identified from single-cell ATAC-seq (scATAC-seq).
  • Regulatory factor discovery: Aids in re-discovering CTCF cofactors and master regulators relevant to cancer and oligodendrocyte biology.

Methodology:

Computes six statistical measures to quantify collocation strength between two genomic interval sets without relying on binary overlap thresholds: the Jaccard, Sørensen–Dice, and Szymkiewicz–Simpson coefficients, the collocation coefficient, pointwise mutual information (PMI), and normalized PMI.
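
As an illustration, the six measures can be sketched from four numbers: the bases covered by each set, the bases they share, and a background size. The Jaccard, Sørensen–Dice, Szymkiewicz–Simpson, PMI, and normalized PMI formulas below follow their standard definitions. The collocation coefficient is assumed here to be the overlap divided by the geometric mean of the two set sizes, and using a genome-wide background size to turn counts into probabilities is likewise an assumption. The function name and signature are hypothetical, not cobind's interface.

```python
import math


def collocation_measures(size_a: int, size_b: int, overlap: int, background: int) -> dict:
    """Return six overlap measures for two interval sets.

    size_a, size_b: bases covered by each set
    overlap:        bases covered by both sets
    background:     total bases considered (assumed: effective genome size)
    """
    if not (0 < size_a <= background and 0 < size_b <= background):
        raise ValueError("set sizes must be positive and no larger than background")
    if not (0 <= overlap <= min(size_a, size_b)):
        raise ValueError("overlap must lie between 0 and the smaller set size")

    union = size_a + size_b - overlap
    measures = {
        "jaccard": overlap / union,
        "sorensen_dice": 2 * overlap / (size_a + size_b),
        "szymkiewicz_simpson": overlap / min(size_a, size_b),
        # Assumed form: overlap over the geometric mean of the set sizes.
        "collocation": overlap / math.sqrt(size_a * size_b),
    }

    # Probabilities that a random background base falls in A, in B, or in both.
    p_a = size_a / background
    p_b = size_b / background
    p_ab = overlap / background

    if overlap == 0:
        # No shared bases: PMI diverges; NPMI takes its lower bound by convention.
        pmi, npmi = -math.inf, -1.0
    elif overlap == background:
        # Both sets cover the whole background: NPMI is 0/0, so left undefined.
        pmi, npmi = 0.0, math.nan
    else:
        pmi = math.log(p_ab / (p_a * p_b))
        npmi = pmi / -math.log(p_ab)

    measures["pmi"] = pmi
    measures["npmi"] = npmi
    return measures


# Illustrative (made-up) sizes in bases.
for name, value in collocation_measures(250, 130, 80, 1000).items():
    print(f"{name}: {value:.4f}")
```

The first four measures depend only on the two sets, while PMI and normalized PMI also depend on the chosen background, so results are comparable across datasets only when the same background is used.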

Details

License: CC-BY-4.0
Cost: Free of charge
Tool Type: library
Operating Systems: Mac, Linux, Windows
Programming Languages: Python
Added: 3/6/2024
Last Updated: 11/24/2024

Publications

Ma T, Guo L, Yan H, Wang L. Cobind: quantitative analysis of the genomic overlaps. Bioinformatics Advances. 2023;3(1). doi:10.1093/bioadv/vbad104. PMID:37600846. PMCID:PMC10438957.

Funding: US National Institutes of Health: AA27179, CA130908-12, CA180882-07, CA230712-4