ClusterMine
ClusterMine integrates annotated gene sets into clustering analysis to identify sample groups and prioritize gene sets that drive sample similarity in gene expression data.
Key Features:
- Knowledge-Integrated Clustering: Partitions the gene expression matrix into one subset ("subdata") per predefined annotated gene set, so that each gene set's functional contribution to sample similarity can be assessed.
- Functional Interpretation: Highlights and prioritizes the gene sets that most significantly contribute to observed clusters, linking clusters to specific biological functions or pathways.
- Comparison to Conventional Methods: Addresses limitations of hierarchical clustering (HC) and consensus clustering (CC) by incorporating gene-set-level signals rather than relying solely on holistic expression profiles.
- Input Requirements: Operates on a list of gene sets and a gene expression data matrix with genes in rows and samples in columns.
- Clustering Integration: Performs clustering on each gene-set-specific subdata and integrates per-gene-set clustering results into a final comprehensive clustering output.
- Validation: Reported to improve clustering performance and to yield biologically relevant prioritized gene sets across nine real experimental datasets.
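The input format and partitioning step listed above can be sketched as follows. This is a minimal illustration in Python, not ClusterMine's R interface: the toy data, the function name `partition_by_gene_set`, and the `min_genes` filter are all assumptions introduced for the example.

```python
# Toy expression matrix: genes in rows, samples in columns (hypothetical values).
samples = ["S1", "S2", "S3"]
expr = {
    "G1": [5.1, 4.9, 0.2],
    "G2": [4.8, 5.0, 0.3],
    "G3": [0.1, 0.2, 3.1],
    "G4": [0.3, 0.1, 2.9],
}

# Annotated gene sets (hypothetical names); members may be absent from the matrix.
gene_sets = {
    "pathway_A": ["G1", "G2", "G9"],
    "pathway_B": ["G3", "G4"],
    "pathway_C": ["G4", "G7"],
}


def partition_by_gene_set(expr, gene_sets, min_genes=2):
    """Return one genes-by-samples submatrix per gene set.

    Genes missing from the expression matrix are skipped, and gene sets
    with fewer than `min_genes` measured genes are dropped. The threshold
    is an assumption of this sketch, not a documented ClusterMine setting.
    """
    subdata = {}
    for name, members in gene_sets.items():
        present = [g for g in members if g in expr]
        if len(present) >= min_genes:
            subdata[name] = {g: expr[g] for g in present}
    return subdata


subdata = partition_by_gene_set(expr, gene_sets)
for name, sub in subdata.items():
    print(name, sorted(sub))
```

Each resulting submatrix keeps all samples but only the genes of one gene set, which is what allows samples to be clustered separately per biological function.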
Scientific Applications:
- Cell subpopulation identification: Detects cell subpopulations by grouping samples according to functional gene-set signals in expression data.
- Disease subtype discovery: Identifies disease subtypes and links them to pathway- or function-level gene-set differences.
- Biological interpretation of clusters: Facilitates assigning biological functions or pathways to clusters via prioritized gene sets.
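To make the idea of prioritized gene sets concrete, the sketch below ranks gene sets by how closely their own sample grouping agrees with a final clustering. The Rand index is used here purely as an illustrative agreement score; this entry does not state which criterion ClusterMine applies, and the labels and gene-set names are hypothetical.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of sample pairs on which two labelings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Hypothetical final clustering of four samples.
final_labels = [0, 0, 1, 1]

# Hypothetical cluster labels obtained from each gene set's subdata.
per_set_labels = {
    "pathway_A": [0, 0, 1, 1],
    "pathway_B": [0, 1, 0, 1],
    "pathway_C": [1, 1, 0, 1],
}

# Score every gene set, then rank from most to least concordant.
scores = {name: rand_index(labels, final_labels)
          for name, labels in per_set_labels.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.2f}")
```

Gene sets at the top of such a ranking are candidates for explaining a cluster in terms of a pathway or function.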
Methodology:
ClusterMine takes a list of gene sets and a gene expression matrix (genes in rows, samples in columns). It partitions the matrix into one subdata matrix per gene set, clusters the samples within each subdata matrix, and integrates the per-gene-set clustering results into a final clustering output.
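The cluster-then-integrate workflow can be sketched as below. This is a minimal Python illustration under stated assumptions, not the ClusterMine implementation: average-linkage hierarchical clustering and a co-association (consensus) matrix are stand-ins, since this entry does not specify the per-gene-set clustering algorithm or the integration rule. All function names and the toy subdata are hypothetical.

```python
import math


def sample_distances(sub):
    """Euclidean distances between samples (columns) of a genes-by-samples submatrix."""
    rows = list(sub.values())
    n = len(rows[0])
    return [[math.sqrt(sum((r[i] - r[j]) ** 2 for r in rows))
             for j in range(n)] for i in range(n)]


def agglomerate(dist, k):
    """Average-linkage agglomerative clustering on a distance matrix, cut at k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge the closest pair
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def integrate(subdata, n_samples, k):
    """Cluster each gene set's subdata, then cluster the co-association matrix."""
    per_set = {name: agglomerate(sample_distances(sub), k)
               for name, sub in subdata.items()}
    # co[i][j]: fraction of gene sets that place samples i and j together.
    co = [[sum(l[i] == l[j] for l in per_set.values()) / len(per_set)
           for j in range(n_samples)] for i in range(n_samples)]
    dissim = [[1.0 - co[i][j] for j in range(n_samples)] for i in range(n_samples)]
    return per_set, agglomerate(dissim, k)


# Hypothetical subdata: one genes-by-samples submatrix per gene set, four samples.
subdata = {
    "pathway_A": {"G1": [5.0, 5.2, 1.0, 0.8], "G2": [4.8, 5.1, 0.9, 1.1]},
    "pathway_B": {"G3": [0.1, 0.3, 3.0, 3.2], "G4": [0.2, 0.1, 2.9, 3.1]},
    "pathway_C": {"G5": [2.0, 0.1, 2.1, 0.2], "G6": [1.9, 0.2, 2.0, 0.1]},
}

per_set, final = integrate(subdata, n_samples=4, k=2)
print(per_set)
print(final)
```

In this toy case two of the three gene sets group the first two samples together, so the integrated result follows that majority signal while the third gene set's conflicting grouping is outvoted.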
Details
- Programming Languages: R
- Added: 1/18/2021
- Last Updated: 11/24/2024
Publications
Li H, Xu Y, Zhu X, Liu Q, Omenn GS, Wang J. ClusterMine: A knowledge-integrated clustering approach based on expression profiles of gene sets. Journal of Bioinformatics and Computational Biology. 2020;18(03):2040009. doi:10.1142/s0219720020400090. PMID:32698720. PMCID:PMC8864677.