BALD
BALD leverages linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs) to define LD blocks, then performs group-wise variable selection with the Group Lasso to improve genome-wide association studies (GWAS).
Key Features:
- High-dimensional regression handling: Addresses scenarios where the number of SNP predictors exceeds the number of observations and predictors are dependent due to LD.
- Hierarchical clustering with LD adjacency constraint: Performs hierarchical clustering of SNPs using LD as a similarity measure while enforcing an adjacency constraint to capture LD-induced grouping structure.
- Model selection for LD block definition: Applies model selection on the SNP hierarchy to define biologically meaningful and statistically robust LD blocks.
- Group Lasso regression on LD blocks: Executes Group Lasso regression on inferred LD blocks to enable simultaneous variable selection and regularization within groups.
- Comparative performance: Reported to outperform haplotype association tests, single marker analyses (SMA), Lasso, and Elastic-Net regressions, particularly when more than two causal SNPs reside within an LD block.
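The group-wise selection described above rests on block soft-thresholding, the proximal step commonly used to fit a Group Lasso. The following is a minimal sketch of that operator only, not BALD's implementation: the function name `group_soft_threshold`, the `threshold` parameter, and the square-root-of-group-size weighting are illustrative assumptions, and the blocks are hypothetical.

```python
import math

def group_soft_threshold(coefs, groups, threshold):
    """Apply block soft-thresholding to each group of coefficients.

    coefs: list of floats.
    groups: list of index lists (hypothetical LD blocks).
    threshold: non-negative penalty level, scaled by sqrt(group size)
               (an assumed, commonly used weighting).
    """
    result = list(coefs)
    for block in groups:
        # Euclidean norm of the coefficients in this block
        norm = math.sqrt(sum(coefs[j] ** 2 for j in block))
        level = threshold * math.sqrt(len(block))
        # Shrink the whole block; set it to zero if its norm is below the level
        scale = max(0.0, 1.0 - level / norm) if norm > 0 else 0.0
        for j in block:
            result[j] = scale * coefs[j]
    return result

coefs = [3.0, 4.0, 0.1, -0.1]
blocks = [[0, 1], [2, 3]]  # two hypothetical LD blocks
shrunk = group_soft_threshold(coefs, blocks, threshold=1.0)
```

In this toy input the first block has a large norm and is only shrunk, while the second block is removed as a whole, which is how a Group Lasso keeps or discards entire LD blocks rather than single SNPs.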
Scientific Applications:
- Genetic marker discovery: Improves detection of SNPs associated with phenotypes by exploiting LD structure among SNPs in GWAS.
- Inference of block structure and SNP significance: Infers underlying LD block architecture and identifies individual SNP associations within blocks.
- Validation and empirical application: Validated on simulated and semi-simulated data, and applied to a published HIV dataset.
Methodology:
BALD follows a three-step computational approach: (1) hierarchical clustering of SNPs using an LD measure under an adjacency constraint; (2) model selection on the resulting SNP hierarchy to define LD blocks; and (3) Group Lasso regression on the inferred LD blocks to identify significant associations.
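The first step can be illustrated with a small sketch of adjacency-constrained agglomerative clustering, in which only neighbouring clusters along the genome may merge. This is a simplified illustration under stated assumptions, not BALD's code: LD is measured here as the squared Pearson correlation between genotype vectors, average linkage is chosen for simplicity, and a fixed block count `n_blocks` stands in for the model selection step. The names `r_squared` and `adjacent_blocks` are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: treat LD as zero (assumption)
    return sxy * sxy / (sxx * syy)

def adjacent_blocks(snps, n_blocks):
    """Merge neighbouring clusters until n_blocks remain.

    snps: list of genotype vectors, ordered by genomic position.
    n_blocks: fixed number of blocks (placeholder for model selection).
    """
    p = len(snps)
    ld = [[r_squared(snps[i], snps[j]) for j in range(p)] for i in range(p)]
    clusters = [[i] for i in range(p)]
    while len(clusters) > n_blocks:
        def mean_ld(k):
            # Average pairwise LD between cluster k and its right neighbour
            left, right = clusters[k], clusters[k + 1]
            total = sum(ld[i][j] for i in left for j in right)
            return total / (len(left) * len(right))
        # Adjacency constraint: only pairs (k, k + 1) are merge candidates
        best = max(range(len(clusters) - 1), key=mean_ld)
        clusters[best:best + 2] = [clusters[best] + clusters[best + 1]]
    return clusters

# Toy genotypes (0/1/2 allele counts) for four SNPs in genomic order
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 1, 0, 2],
    [2, 0, 1, 1, 0, 2],
]
blocks = adjacent_blocks(snps, n_blocks=2)
```

Restricting merges to adjacent clusters guarantees that every block is a contiguous run of SNPs, which matches the intuition that LD blocks are physical segments of the genome. The resulting index lists could then serve as the groups for a Group Lasso fit.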
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Windows
- Programming Languages: R
- Added: 12/18/2017
- Last Updated: 11/25/2024
Publications
Dehman A, Ambroise C, Neuvial P. Performance of a blockwise approach in variable selection using linkage disequilibrium information. BMC Bioinformatics. 2015;16(1). doi:10.1186/s12859-015-0556-6. PMID:25951947. PMCID:PMC4430909.