ECFS-DEA

ECFS-DEA applies an ensemble classifier-based feature selection strategy to improve differential expression analysis of gene expression profiles by identifying explanatory variables across heterogeneous sample distributions.


Key Features:

  • Ensemble Classifier Approach: Integrates multiple base classifiers into an ensemble framework to improve robustness of feature selection.
  • Variable Importance Calculation: Implements a variable importance measure inspired by the random forest algorithm that can be adapted to any chosen base classifier.
  • Interactive Feature Selection: Supports iterative feature selection via a sorted list of individual variables to refine selected explanatory features.
  • Visualization Tools: Generates projection heatmaps using k-means clustering and assesses feature performance with Receiver Operating Characteristic (ROC) curves.
  • Multivariate Hypothesis Testing: Applies a multivariate hypothesis testing approach to address limitations of multiple hypothesis testing in detecting collectively explanatory features.
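
The variable importance idea above can be sketched in a few lines. This is a minimal illustration, not the ECFS-DEA implementation: it assumes a nearest-centroid base classifier as a stand-in for "any chosen base classifier", trains each ensemble member on a bootstrap sample and a random feature subset, and scores a feature by the drop in out-of-bag accuracy when that feature is permuted (the random forest convention). All function names and parameters (`ensemble_importance`, `n_models`, `n_sub`, `make_toy_data`) are hypothetical.

```python
import random

def fit_centroids(X, y, feats):
    """Fit a nearest-centroid base classifier on the feature subset `feats`."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    return centroids

def predict(centroids, feats, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def dist(c):
        return sum((x[f] - cf) ** 2 for f, cf in zip(feats, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def accuracy(centroids, feats, X, y):
    return sum(predict(centroids, feats, x) == t for x, t in zip(X, y)) / len(y)

def ensemble_importance(X, y, n_models=200, n_sub=2, seed=0):
    """Mean drop in out-of-bag accuracy after permuting each feature."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    total, count = [0.0] * p, [0] * p
    for _ in range(n_models):
        boot = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]  # out-of-bag samples
        feats = rng.sample(range(p), n_sub)           # random feature subset
        yb = [y[i] for i in boot]
        if not oob or len(set(yb)) < 2:
            continue  # skip degenerate resamples
        model = fit_centroids([X[i] for i in boot], yb, feats)
        X_oob, y_oob = [X[i] for i in oob], [y[i] for i in oob]
        base = accuracy(model, feats, X_oob, y_oob)
        for f in feats:
            shuffled = [row[f] for row in X_oob]
            rng.shuffle(shuffled)
            X_perm = [row[:f] + [v] + row[f + 1:] for row, v in zip(X_oob, shuffled)]
            total[f] += base - accuracy(model, feats, X_perm, y_oob)
            count[f] += 1
    return [t / c if c else 0.0 for t, c in zip(total, count)]

def make_toy_data(n_per_class=20, n_features=4, shift=4.0, seed=1):
    """Toy two-class data: only feature 0 differs between the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += shift * label
            X.append(row)
            y.append(label)
    return X, y
```

Sorting features by the returned scores gives the kind of ranked variable list from which features could then be selected interactively; on the toy data, feature 0 should rank first because it is the only one that separates the classes.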

Scientific Applications:

  • Differential Expression Analysis: Identifying important variables in differential expression analysis of gene expression profiles.
  • Heterogeneous Sample Distributions: Detecting explanatory features across datasets with differing sample distributions.
  • Validation and Translational Studies: Validated on both simulated and realistic datasets, and intended for use from basic research through clinical studies.

Methodology:

Combines an ensemble classifier strategy with a random-forest-inspired variable importance measure and multivariate hypothesis testing, and uses k-means clustering for projection heatmaps and ROC curves for performance assessment.
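
As an illustration of the ROC-based performance assessment mentioned above, the following sketch computes ROC points and the area under the curve for a set of scores, for example the values of a single selected feature or a classifier output. It is a generic textbook computation under stated assumptions, not code from ECFS-DEA: labels are assumed to be 0/1 with at least one of each, higher scores are assumed to indicate the positive class, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping a threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

An AUC near 1.0 means the score separates the two classes almost perfectly, while a value near 0.5 means it does no better than chance.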

Details

License: GPL-3.0
Tool Type: desktop application
Operating Systems: Mac, Linux, Windows
Programming Languages: Python
Added: 1/18/2021
Last Updated: 3/5/2021

Publications

Zhao X, Jiao Q, Li H, Wu Y, Wang H, Huang S, Wang G. ECFS-DEA: an ensemble classifier-based feature selection for differential expression analysis on expression profiles. BMC Bioinformatics. 2020;21(1). doi:10.1186/s12859-020-3388-y. PMID:32024464. PMCID:PMC7003361.

Funding:

  • Natural Science Foundation of China: 61771165
  • China Postdoctoral Science Foundation Funded Project: 2014M551246, 2018T110302
  • Innovation Project of State Key Laboratory of Tree Genetics and Breeding: 2019A04
  • Fundamental Research Funds for the Central Universities: 2572018BH01
  • National Undergraduate Innovation Project: 201910225184
  • Specialized Personnel Start-up Grant: 41113237