cobind
cobind quantifies collocation strength between two sets of genomic intervals using six statistical measures to provide a nuanced and reproducible assessment of overlaps for genomic and epigenomic analyses.
Key Features:
- Statistical measures: Implements the Jaccard coefficient, Sørensen–Dice coefficient, Szymkiewicz–Simpson coefficient, collocation coefficient, pointwise mutual information (PMI), and normalized PMI for quantitative overlap assessment.
- Quantitative collocation assessment: Provides a continuous evaluation of overlap strength instead of relying on binary overlap thresholds.
- Reproducibility and comparability: Uses standardized statistical metrics to enable reproducible and comparable results across datasets and studies.
- Application to genomic datasets: Applied to analyses of CTCF binding sites from ChIP-seq, identification of cancer-specific open-chromatin regions (OCRs) using ATAC-seq across 17 cancer types, and examination of oligodendrocyte-specific OCRs from single-cell ATAC-seq (scATAC-seq).
- Discovery of regulatory elements: Facilitates re-discovery of CTCF cofactors and master regulators specific to cancers and oligodendrocytes.
Scientific Applications:
- Quantitative genomic overlap analysis: Enables detailed measurement of interval collocation strength in genomic and epigenomic studies.
- CTCF binding analysis: Supports analysis of CTCF binding sites derived from ChIP-seq data.
- Cancer OCR identification: Identifies cancer-specific open-chromatin regions using ATAC-seq across 17 cancer types.
- Oligodendrocyte chromatin analysis: Examines oligodendrocyte-specific OCRs identified from single-cell ATAC-seq (scATAC-seq).
- Regulatory factor discovery: Aids in re-discovering CTCF cofactors and master regulators relevant to cancer and oligodendrocyte biology.
Methodology:
Computes six statistical measures (Jaccard, Sørensen–Dice, Szymkiewicz–Simpson, collocation coefficient, pointwise mutual information (PMI), and normalized PMI) to quantify collocation strength between two genomic interval sets, yielding a continuous score rather than a binary overlap call.
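The six measures can be illustrated with a minimal Python sketch. This is not cobind's own code or API: the function names (`merge`, `total_bp`, `intersect_bp`, `overlap_measures`) and the `background_bp` parameter are hypothetical. The sketch assumes half-open (start, end) intervals on a single chromosome, sizes measured in base pairs, the textbook set-based definitions of each coefficient, a geometric-mean form for the collocation coefficient, and probabilities estimated as covered bases divided by an assumed background size. The definitions used by the tool itself may differ in detail.

```python
import math


def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_bp(intervals):
    """Total bases covered by a merged interval list."""
    return sum(end - start for start, end in intervals)


def intersect_bp(a, b):
    """Bases shared by two merged, sorted interval lists (two-pointer sweep)."""
    i = j = shared = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            shared += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def overlap_measures(a, b, background_bp):
    """Return six overlap measures for interval sets a and b.

    background_bp is an assumed background size (e.g. an effective
    genome size) used only to turn base counts into probabilities.
    """
    a, b = merge(a), merge(b)
    size_a, size_b = total_bp(a), total_bp(b)
    if size_a == 0 or size_b == 0:
        raise ValueError("both interval sets must cover at least one base")
    both = intersect_bp(a, b)
    union = size_a + size_b - both

    measures = {
        "jaccard": both / union,
        "sorensen_dice": 2 * both / (size_a + size_b),
        "szymkiewicz_simpson": both / min(size_a, size_b),
        # Assumed form: overlap over the geometric mean of the two sizes.
        "collocation": both / math.sqrt(size_a * size_b),
    }

    if both == 0:
        # No shared bases: PMI tends to -infinity, NPMI to its lower bound.
        measures["pmi"] = float("-inf")
        measures["npmi"] = -1.0
    else:
        p_a = size_a / background_bp
        p_b = size_b / background_bp
        p_ab = both / background_bp
        pmi = math.log(p_ab / (p_a * p_b))
        measures["pmi"] = pmi
        # NPMI rescales PMI into [-1, 1]; guard the p_ab == 1 corner case.
        measures["npmi"] = pmi / -math.log(p_ab) if p_ab < 1 else 1.0
    return measures


if __name__ == "__main__":
    result = overlap_measures([(0, 100)], [(50, 150)], background_bp=1000)
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

In this toy case the two 100 bp intervals share 50 bp, so the four set-based coefficients depend only on those sizes, while PMI and normalized PMI also depend on the chosen background size.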
Details
- License: CC-BY-4.0
- Cost: Free of charge
- Tool Type: library
- Operating Systems: Mac, Linux, Windows
- Programming Languages: Python
- Added: 3/6/2024
- Last Updated: 11/24/2024
Publications
Ma T, Guo L, Yan H, Wang L. Cobind: quantitative analysis of the genomic overlaps. Bioinformatics Advances. 2023;3(1). doi:10.1093/bioadv/vbad104. PMID:37600846. PMCID:PMC10438957.
Funding:
- US National Institutes of Health: AA27179, CA130908-12, CA180882-07, CA230712-4