MACLEAPS
MACLEAPS applies support vector machines (SVMs) to genome-wide association study (GWAS) single-nucleotide polymorphism (SNP) datasets to predict disease risk and evaluate contributions of common, uncommon (1–5% frequency) and rare (<1% frequency) variants.
Key Features:
- Machine Learning Integration: Uses support vector machines (SVMs) to model genetic predictors from GWAS SNP datasets.
- Predictive Modeling: Constructs and evaluates predictive models for diseases including Parkinson's disease (PD) and type 1 diabetes (T1D) using area under the receiver operating characteristic curve (AUC) as a performance metric.
- Cross-Validation Techniques: Implements traditional training-validation splits and nested k-fold cross-validation for model evaluation and generalization assessment.
- Inclusion of Rare Variants: Incorporates uncommon (1–5% frequency) and rare (<1% frequency) variants in analyses to assess their impact on prediction accuracy.
- Simulation-based Evaluation: Performs simulations to investigate how effect size magnitude and heritability influence predictive performance.
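The frequency classes named in the feature list can be illustrated with a short sketch. It is written in Python for brevity and is not MACLEAPS code: the function names, the 0/1/2 genotype coding, and the treatment of values lying exactly on the 1% and 5% boundaries are assumptions made for illustration.

```python
from typing import List


def minor_allele_frequency(genotypes: List[int]) -> float:
    """Return the minor allele frequency of one SNP.

    Assumes each genotype is coded as the number of copies (0, 1 or 2)
    of one of the two alleles, with no missing values.
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    allele_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(allele_freq, 1.0 - allele_freq)


def frequency_class(maf: float) -> str:
    """Bin a minor allele frequency into the classes used in this entry.

    rare: below 1%; uncommon: 1% to 5%; common: above 5%.
    Placing exactly 1% and exactly 5% in "uncommon" is an assumption.
    """
    if maf < 0.01:
        return "rare"
    if maf <= 0.05:
        return "uncommon"
    return "common"
```

For example, a SNP with one heterozygous carrier among 100 genotyped individuals has a minor allele frequency of 1/200 = 0.5% and falls in the rare class.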
Scientific Applications:
- Type 1 diabetes (T1D): Applied to T1D datasets, achieving an AUC of approximately 0.88 and estimating heritability near 90%.
- Parkinson's disease (PD): Applied to PD datasets, yielding an AUC of approximately 0.56 and estimating heritability near 38%.
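To help read the AUC values above: the AUC equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control, so 0.5 corresponds to chance and 1.0 to perfect separation. The sketch below computes it directly from that pairwise definition. It is an illustrative Python example, not the tool's implementation; the function name and the 1/0 label coding are assumptions.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise case/control comparison.

    labels: 1 for cases, 0 for controls (assumed coding).
    scores: predicted risk scores, higher meaning more likely a case.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("both cases and controls are required")

    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # ties count as half
    return wins / (len(case_scores) * len(control_scores))
```

The pairwise loop is quadratic in sample size; it is chosen here for clarity, and a rank-based formulation would be preferable for large cohorts.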
Methodology:
The method models GWAS SNP data with SVMs and handles common, uncommon (1–5%) and rare (<1%) variants explicitly. Simulation studies probe how effect size and heritability affect prediction. Models are evaluated with training-validation splits and nested k-fold cross-validation, using AUC as the performance metric.
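Nested k-fold cross-validation can be sketched as follows. In the usual scheme, the inner folds are used to select model settings and the outer folds give a performance estimate on samples that played no part in that selection. The code only generates the index splits; it is a minimal Python illustration, not the MACLEAPS implementation, and the function names, fold counts, and seed are hypothetical.

```python
import random
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def k_fold(indices: List[int], k: int) -> Iterator[Split]:
    """Yield (train, held_out) index lists for k roughly equal folds."""
    if k < 2 or k > len(indices):
        raise ValueError("k must be between 2 and the number of samples")
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def nested_k_fold(
    n_samples: int, outer_k: int, inner_k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int], List[Split]]]:
    """Yield (outer_train, test, inner_splits) for each outer fold.

    inner_splits partitions outer_train only, so the outer test samples
    are never seen while model settings are being chosen.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    for outer_train, test in k_fold(indices, outer_k):
        yield outer_train, test, list(k_fold(outer_train, inner_k))
```

A caller would fit candidate models on each inner training set, score them on the inner validation set, refit the best setting on the full outer training set, and report AUC on the outer test set.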
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Windows, Mac
- Programming Languages: Java
- Added: 8/3/2017
- Last Updated: 11/25/2024
Publications
Mittag F, Büchel F, Saad M, Jahn A, Schulte C, Bochdanovits Z, Simón-Sánchez J, Nalls MA, Keller M, Hernandez DG, Gibbs JR, Lesage S, Brice A, Heutink P, Martinez M, Wood NW, Hardy J, Singleton AB, Zell A, Gasser T, Sharma M. Use of support vector machines for disease risk prediction in genome-wide association studies: Concerns and opportunities. Human Mutation. 2012;33(12):1708-1718. doi:10.1002/humu.22161. PMID:22777693. PMCID:PMC5968822.