ALDEx2

ALDEx2 performs differential abundance analysis of high-throughput sequencing count data by modeling compositionality with a Dirichlet-multinomial framework.

Key Features:

Dirichlet-multinomial model: Transforms raw counts into relative abundances to handle compositional data and account for technical and statistical variation.
Statistical tests: Implements the Wilcoxon rank-sum test, Welch's t-test, generalized linear models (GLM), and the Kruskal-Wallis test for inference of differential abundance.
False Discovery Rate (FDR) control: Reports P-values together with Benjamini-Hochberg-adjusted values, and estimates the expected false discovery rate while accounting for biological and sampling variation.
Bayesian inference: Uses Bayesian methods to distinguish technical noise from true biological signal.
Replicate optimization: Model optimized for experiments with three or more replicates.
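
The transformation step listed above can be illustrated with a small sketch. It is not ALDEx2's own code (the package is written in R); it is a standard-library Python approximation that assumes a uniform prior of 0.5 added to each count, 128 Monte Carlo instances, and a centered log-ratio (clr) transform of each instance, in line with the compositional approach described in the cited publication. The function names are hypothetical.

```python
import math
import random


def dirichlet_instance(counts, rng, prior=0.5):
    """Draw one Dirichlet sample with parameters (count + prior).

    A Dirichlet draw is obtained by normalising independent Gamma draws.
    The prior keeps zero counts from producing zero proportions.
    """
    draws = [rng.gammavariate(c + prior, 1.0) for c in counts]
    total = sum(draws)
    return [d / total for d in draws]


def clr(proportions):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(p) for p in proportions]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def monte_carlo_clr(counts, n_instances=128, seed=0):
    """Return clr-transformed Dirichlet instances for one sample's counts."""
    rng = random.Random(seed)
    return [clr(dirichlet_instance(counts, rng)) for _ in range(n_instances)]


# Hypothetical counts for four features in a single sample.
instances = monte_carlo_clr([0, 12, 340, 5])
```

Each element of `instances` is one plausible set of log-ratio abundances for the sample. Running a statistical test on every instance and averaging the results is how sampling variation can be carried through to the reported values.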

Scientific Applications:

RNA Sequencing (RNA-seq): Identifies differentially expressed genes from count-based RNA-seq data.
16S rRNA Gene Sequencing: Analyzes microbial community taxon abundances, including distinguishing taxa between tongue dorsum and buccal mucosa in human microbiome studies.
Chromatin Immunoprecipitation Sequencing (ChIP-seq): Applicable to epigenetic studies of DNA–protein interactions using count data.
Metagenomic Analysis: Facilitates differential abundance analysis of microbial communities from environmental or clinical metagenomic samples.
Selective Growth Experiments: Assesses differential growth patterns in vitro using count-based measurements.
Human Microbiome Project 16S data: Has been applied to Human Microbiome Project 16S rRNA gene abundance datasets.

Methodology:

ALDEx2 applies a Dirichlet-multinomial framework to transform raw counts into relative abundances and uses Bayesian inference to separate technical noise from biological signal. It then performs statistical tests (Wilcoxon rank-sum test, Welch's t-test, GLM, Kruskal-Wallis test) and reports P-values alongside Benjamini-Hochberg-adjusted values, estimating the expected false discovery rate while accounting for biological and sampling variation. The model is optimized for experiments with three or more replicates.
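
As a rough illustration of the Benjamini-Hochberg adjustment mentioned above, the following standard-library Python sketch implements the textbook step-up procedure. It is not taken from ALDEx2; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values in the input order.

    Each P-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank downwards enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-feature P-values from one statistical test.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Features whose adjusted value falls below a chosen threshold (0.05 is a common choice) would be reported as differentially abundant at that FDR level.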

Topics

Gene expression, Statistics and probability

Collections

BioConductor

Details

Tool Type: command-line tool, library
Operating Systems: Linux, Windows, Mac
Programming Languages: R
Added: 1/17/2017
Last Updated: 11/24/2024

Operations

Statistical inference

Data Inputs & Outputs

Statistical inference

Inputs

Genotype/phenotype report
- FASTA

Outputs

P-value
- BCF
- VCF

Publications

Fernandes AD, Reid JN, Macklaim JM, McMurrough TA, Edgell DR, Gloor GB. Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis. Microbiome. 2014;2(1). doi:10.1186/2049-2618-2-15. PMID:24910773. PMCID:PMC4030730.

Documentation

User manual

http://bioconductor.org/packages/release/bioc/html/ALDEx2.html

Downloads

Source code
http://bioconductor.org/packages/release/bioc/src/contrib/ALDEx2_1.6.0.tar.gz