BALD

BALD leverages linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs) to define LD blocks, then performs group-wise variable selection on those blocks with the Group Lasso to improve genome-wide association studies (GWAS).


Key Features:

  • High-dimensional regression handling: Addresses scenarios where the number of SNP predictors exceeds the number of observations and predictors are dependent due to LD.
  • Hierarchical clustering with LD adjacency constraint: Performs hierarchical clustering of SNPs using LD as a similarity measure while enforcing an adjacency constraint to capture LD-induced grouping structure.
  • Model selection for LD block definition: Applies model selection on the SNP hierarchy to define biologically meaningful and statistically robust LD blocks.
  • Group Lasso regression on LD blocks: Executes Group Lasso regression on inferred LD blocks to enable simultaneous variable selection and regularization within groups.
  • Comparative performance: Reported to outperform haplotype association tests, single marker analyses (SMA), Lasso, and Elastic-Net regressions, particularly when more than two causal SNPs reside within an LD block.
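
The adjacency-constrained clustering feature above can be illustrated with a minimal Python sketch. It is not BALD's code or API: the function names (`r2`, `ld_matrix`, `adjacent_clustering`) and the toy genotypes are hypothetical, LD is approximated by the squared Pearson correlation between allele counts, average linkage is used as a simple stand-in for whatever linkage criterion the tool applies, and the number of blocks is fixed by hand rather than chosen by model selection.

```python
from statistics import mean

def r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: LD is undefined, treated as 0 in this sketch
    return sxy * sxy / (sxx * syy)

def ld_matrix(genotypes):
    """Pairwise r^2 between SNPs; genotypes is a list of individuals (rows)."""
    cols = list(zip(*genotypes))
    p = len(cols)
    return [[r2(cols[i], cols[j]) for j in range(p)] for i in range(p)]

def average_ld(ld, left, right):
    """Average-linkage similarity between two clusters of SNP indices."""
    return mean(ld[i][j] for i in left for j in right)

def adjacent_clustering(ld, n_blocks):
    """Agglomerative clustering in which only neighbouring clusters may merge."""
    blocks = [[i] for i in range(len(ld))]
    while len(blocks) > n_blocks:
        # Candidate merges are restricted to (k, k + 1): the adjacency constraint.
        best = max(range(len(blocks) - 1),
                   key=lambda k: average_ld(ld, blocks[k], blocks[k + 1]))
        blocks[best:best + 2] = [blocks[best] + blocks[best + 1]]
    return blocks

# Toy data (hypothetical): 6 individuals x 5 SNPs, ordered by genomic position.
genotypes = [
    [0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0],
    [2, 2, 2, 1, 1],
    [0, 0, 1, 2, 2],
    [2, 2, 2, 0, 1],
    [1, 1, 1, 1, 0],
]
blocks = adjacent_clustering(ld_matrix(genotypes), n_blocks=2)
print(blocks)  # [[0, 1, 2], [3, 4]]
```

Because merges are limited to neighbouring clusters, every block is a contiguous run of SNPs along the chromosome, which is the property that lets the resulting hierarchy be read as a set of candidate LD blocks.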

Scientific Applications:

  • Genetic marker discovery: Improves detection of SNPs associated with phenotypes by exploiting LD structure among SNPs in GWAS.
  • Inference of block structure and SNP significance: Infers underlying LD block architecture and identifies individual SNP associations within blocks.
  • Validation and empirical application: Validated on simulated and semi-simulated data, and applied to a published HIV dataset.

Methodology:

BALD follows a three-step computational approach: (1) hierarchical clustering of SNPs using LD measures under an adjacency constraint; (2) model selection on the resulting SNP hierarchy to define LD blocks; and (3) Group Lasso regression on the inferred LD blocks to identify significant associations.
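
The final step, Group Lasso regression on predefined blocks, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the solver used by BALD: the function name `group_lasso` is hypothetical, the objective is assumed to be the standard least-squares Group Lasso, (1/2n)·||y − Xβ||² + λ·Σ_g √p_g·||β_g||₂, and it is minimised by proximal gradient descent with block soft-thresholding. The toy design uses orthogonal ±1 columns rather than real genotypes so that the outcome is easy to check by hand.

```python
import math

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient descent for the least-squares Group Lasso (sketch)."""
    n, p = len(X), len(X[0])
    # Conservative step size: n / ||X||_F^2 is at most 1 / L, where L is the
    # Lipschitz constant of the gradient (largest eigenvalue of X'X / n).
    step = n / sum(v * v for row in X for v in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        resid = [sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(p)]
        z = [beta[j] - step * grad[j] for j in range(p)]
        for g in groups:
            # Block soft-thresholding: shrink the whole group, or zero it out.
            norm = math.sqrt(sum(z[j] ** 2 for j in g))
            thresh = step * lam * math.sqrt(len(g))
            scale = max(0.0, 1.0 - thresh / norm) if norm > 0 else 0.0
            for j in g:
                beta[j] = scale * z[j]
    return beta

# Toy design (hypothetical): 8 observations, 4 mutually orthogonal predictors.
c0 = [1, -1, 1, -1, 1, -1, 1, -1]
c1 = [1, 1, -1, -1, 1, 1, -1, -1]
c2 = [a * b for a, b in zip(c0, c1)]
c3 = [1, 1, 1, 1, -1, -1, -1, -1]
X = [list(row) for row in zip(c0, c1, c2, c3)]

# The signal sits in the first block; the second block carries a negligible effect.
y = [2 * r[0] + r[1] + 0.1 * r[2] for r in X]
groups = [[0, 1], [2, 3]]  # e.g. two LD blocks from the clustering step

beta = group_lasso(X, y, groups, lam=0.2)
selected = [g for g in groups if any(beta[j] != 0.0 for j in g)]
print(selected)  # [[0, 1]]
```

The penalty acts on whole blocks: every coefficient in the second block is set exactly to zero, while both coefficients in the first block are kept and shrunk by a common factor. This is the group-wise selection behaviour that distinguishes the Group Lasso from the ordinary Lasso, which penalises each coefficient separately.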

Details

Tool Type: command-line tool
Operating Systems: Linux, Windows
Programming Languages: R
Added: 12/18/2017
Last Updated: 11/25/2024

Publications

Dehman A, Ambroise C, Neuvial P. Performance of a blockwise approach in variable selection using linkage disequilibrium information. BMC Bioinformatics. 2015;16(1). doi:10.1186/s12859-015-0556-6. PMID:25951947. PMCID:PMC4430909.