ECFS-DEA
ECFS-DEA applies an ensemble classifier-based feature selection strategy to improve differential expression analysis of gene expression profiles by identifying explanatory variables across heterogeneous sample distributions.
Key Features:
- Ensemble Classifier Approach: Integrates multiple base classifiers into an ensemble framework to improve robustness of feature selection.
- Variable Importance Calculation: Implements a variable importance measure inspired by the random forest algorithm that can be adapted to any chosen base classifier.
- Interactive Feature Selection: Presents a sorted list of individual variables so that the selected explanatory features can be refined step by step.
- Visualization Tools: Generates projection heatmaps using k-means clustering and assesses feature performance with Receiver Operating Characteristic (ROC) curves.
- Multivariate Hypothesis Testing: Applies a multivariate hypothesis testing approach to address limitations of multiple hypothesis testing in detecting collectively explanatory features.
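The ensemble and variable importance features listed above can be illustrated with a small, self-contained sketch. It is not the ECFS-DEA implementation. It assumes a nearest-centroid base classifier, random feature subspaces, bootstrap resampling, and a permutation-style importance score (the mean drop in out-of-bag accuracy when one feature is shuffled), which is the general idea behind random forest importance. All function names, parameters, and the toy data are hypothetical.

```python
import random


def fit_centroids(X, y, features):
    """Nearest-centroid base classifier restricted to a feature subspace."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    return centroids


def predict(centroids, features, x):
    """Return the label whose centroid is closest to x within the subspace."""
    def sq_dist(center):
        return sum((x[f] - c) ** 2 for f, c in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, features, X, y):
    hits = sum(predict(centroids, features, x) == t for x, t in zip(X, y))
    return hits / len(y)


def ensemble_importance(X, y, n_models=200, subspace_size=2, seed=0):
    """Mean out-of-bag accuracy drop per feature when that feature is permuted."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    drops = [[] for _ in range(p)]
    for _ in range(n_models):
        features = rng.sample(range(p), subspace_size)  # random subspace
        boot = [rng.randrange(n) for _ in range(n)]     # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]  # held-out samples
        y_train = [y[i] for i in boot]
        if not oob or len(set(y_train)) < 2:
            continue  # skip degenerate draws
        model = fit_centroids([X[i] for i in boot], y_train, features)
        X_oob = [list(X[i]) for i in oob]
        y_oob = [y[i] for i in oob]
        baseline = accuracy(model, features, X_oob, y_oob)
        for f in features:
            shuffled = [row[f] for row in X_oob]
            rng.shuffle(shuffled)
            X_perm = [row[:f] + [v] + row[f + 1:] for row, v in zip(X_oob, shuffled)]
            drops[f].append(baseline - accuracy(model, features, X_perm, y_oob))
    return [sum(d) / len(d) if d else 0.0 for d in drops]


def make_toy_data(n=60, p=5, seed=1):
    """Hypothetical data: only feature 0 differs between the two classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(3.0 * t, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(p - 1)]
         for t in y]
    return X, y


X, y = make_toy_data()
scores = ensemble_importance(X, y)
# Sorted list of individual variables, most important first.
ranking = sorted(range(len(scores)), key=lambda f: -scores[f])
print(ranking)
```

The sorted `ranking` plays the role of the ordered variable list from which features would then be picked. Swapping `fit_centroids` and `predict` for another classifier leaves the importance calculation unchanged, which is the sense in which the measure is independent of the base classifier in this sketch.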
Scientific Applications:
- Differential Expression Analysis: Identifying important variables in differential expression analysis of gene expression profiles.
- Heterogeneous Sample Distributions: Detecting explanatory features across datasets with differing sample distributions.
- Validation and Translational Studies: Applicable from basic research to clinical studies and validated on both simulated and realistic datasets.
Methodology:
ECFS-DEA combines an ensemble classifier strategy with a random-forest-inspired variable importance measure and multivariate hypothesis testing. It uses k-means clustering to build projection heatmaps and ROC curves to assess the performance of selected features.
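The two assessment steps named in the methodology, k-means clustering and ROC curves, can be sketched in plain Python as follows. This is an illustrative sketch and not the tool's own code: the function names are hypothetical, plotting is left out, and the k-means routine is a basic Lloyd iteration with random initial centers.

```python
import random


def roc_points(scores, labels):
    """(FPR, TPR) points from sweeping a threshold over scores; labels are 0/1."""
    pos = sum(labels)
    neg = len(labels) - pos
    ordered = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ordered):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(ordered) and ordered[j][0] == ordered[i][0]:
            if ordered[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def nearest(row, centers):
    """Index of the center closest to row (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, c)) for c in centers]
    return dists.index(min(dists))


def kmeans(rows, k=2, n_iter=20, seed=0):
    """Basic Lloyd's algorithm; returns one cluster index per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    assign = [0] * len(rows)
    for _ in range(n_iter):
        assign = [nearest(r, centers) for r in rows]
        for c in range(k):
            members = [r for r, a in zip(rows, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


# Hypothetical usage: one selected feature used as a score, then clustered.
feature_values = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
class_labels = [0, 0, 0, 1, 1, 1]
print(auc(roc_points(feature_values, class_labels)))
print(kmeans([[v] for v in feature_values], k=2))
```

In a heatmap, samples would typically be ordered by the cluster indices returned from `kmeans`, so that samples with similar values for the selected features sit next to each other.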
Details
- License: GPL-3.0
- Tool Type: desktop application
- Operating Systems: Mac, Linux, Windows
- Programming Languages: Python
- Added: 1/18/2021
- Last Updated: 3/5/2021
Publications
Zhao X, Jiao Q, Li H, Wu Y, Wang H, Huang S, Wang G. ECFS-DEA: an ensemble classifier-based feature selection for differential expression analysis on expression profiles. BMC Bioinformatics. 2020;21(1). doi:10.1186/s12859-020-3388-y. PMID:32024464. PMCID:PMC7003361.