XAEM
XAEM estimates isoform-level gene expression from RNA-seq data by jointly modeling transcript abundance and read-assignment biases using a bilinear statistical framework.
Key Features:
- Bilinear Model Framework: Treats both the design matrix X and the coefficients β as unknowns, so the two can be estimated jointly.
- Empirical Bias Correction: Automatically performs empirical bias correction to account for potentially unknown biases in RNA-seq data.
- Alternating Expectation-Maximization (AEM) Algorithm: Estimates X and β iteratively, updating one while the other is held fixed.
- Quasi-mapping for Speed: Incorporates quasi-mapping for read alignment to reduce computational time while preserving accuracy.
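To make the bilinear idea above concrete, here is a minimal Python sketch. It is not XAEM's implementation. It assumes that read counts Y (equivalence classes by samples) are Poisson with mean Xβ, that the zero pattern of a starting design matrix X0 is kept, and that β and X are updated in turn with standard multiplicative EM-style updates. The function names, the toy data, and the choice to normalize the columns of X are illustrative assumptions.

```python
import numpy as np

def poisson_loglik(Y, X, B, eps=1e-12):
    """Poisson log-likelihood of Y given mean X @ B (constant term dropped)."""
    mu = X @ B + eps
    return float(np.sum(Y * np.log(mu) - mu))

def alternating_em(Y, X0, n_iter=200, eps=1e-12):
    """Alternate multiplicative updates of B (X fixed) and X (B fixed).

    Y  : (classes x samples) read counts
    X0 : (classes x isoforms) starting design matrix; its zeros stay zero
    """
    n_iso = X0.shape[1]
    X = X0 / X0.sum(axis=0, keepdims=True)        # columns sum to 1
    B = np.tile(Y.sum(axis=0) / n_iso, (n_iso, 1))  # flat starting abundances
    history = [poisson_loglik(Y, X, B)]
    for _ in range(n_iter):
        # Update the coefficients B with the design matrix X held fixed.
        R = Y / (X @ B + eps)
        B = B * (X.T @ R) / (X.sum(axis=0)[:, None] + eps)
        # Update the design matrix X with the coefficients B held fixed.
        R = Y / (X @ B + eps)
        X = X * (R @ B.T) / (B.sum(axis=1)[None, :] + eps)
        # Renormalize columns of X and rescale B so that X @ B is unchanged.
        scale = np.maximum(X.sum(axis=0, keepdims=True), eps)
        X = X / scale
        B = B * scale.T
        history.append(poisson_loglik(Y, X, B))
    return X, B, history

# Toy data (hypothetical): 4 equivalence classes, 2 isoforms, 6 samples.
rng = np.random.default_rng(0)
X_true = np.array([[0.5, 0.0],
                   [0.3, 0.4],
                   [0.0, 0.4],
                   [0.2, 0.2]])
B_true = rng.uniform(50, 500, size=(2, 6))
Y = rng.poisson(X_true @ B_true).astype(float)

X0 = (X_true > 0).astype(float)   # only the support of X is assumed known
X_hat, B_hat, history = alternating_em(Y, X0)
print(np.round(X_hat, 3))
print(np.round(B_hat, 1))
```

Because both X and β are free, several samples are needed for the shared X to be informed by the data, which is why the sketch fits all samples at once.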
Scientific Applications:
- Multi-sample Analysis: Jointly estimates the design matrix and coefficients across multi-sample RNA-seq data to improve reliability of expression estimates.
- Improved Accuracy in Simulations: Demonstrates higher accuracy in simulations involving multiple-isoform genes compared to existing methods.
- Differential Expression Analysis in scRNA-seq: Applied to single-cell RNA-seq datasets to achieve substantially better rediscovery rates in independent validation sets.
Methodology:
XAEM requires a transcript FASTA file and a GTF annotation file as input. It supports various reference genomes and annotations; UCSC hg19 is given as an example. Computationally, it combines a bilinear model that treats X and β as unknowns, an alternating expectation-maximization algorithm that estimates them iteratively, empirical bias correction, and quasi-mapping for read alignment.
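Since the transcript FASTA and the GTF annotation must describe the same transcripts, a quick consistency check before running any quantifier can save time. The following is a generic Python sketch, not part of XAEM: the helper names and the toy inputs are hypothetical, and it assumes FASTA headers begin with the transcript ID and that the GTF attribute column uses the usual `key "value";` pairs.

```python
import re

def fasta_ids(text):
    """Return transcript IDs: the first token after '>' on each header line."""
    return [line[1:].split()[0]
            for line in text.splitlines() if line.startswith(">")]

def gtf_transcript_to_gene(text):
    """Map transcript_id to gene_id using the ninth (attribute) GTF column."""
    mapping = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        attrs = dict(re.findall(r'(\S+) "([^"]*)"', fields[8]))
        if "transcript_id" in attrs and "gene_id" in attrs:
            mapping[attrs["transcript_id"]] = attrs["gene_id"]
    return mapping

# Toy inputs (hypothetical); real files would be read from disk instead.
toy_fasta = ">tx1 example\nACGTACGT\n>tx2\nGGCCTTAA\n>tx3\nTTTTAAAA\n"
toy_gtf = (
    'chr1\ttoy\texon\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "tx1";\n'
    'chr1\ttoy\texon\t50\t200\t.\t+\t.\tgene_id "G1"; transcript_id "tx2";\n'
)

mapping = gtf_transcript_to_gene(toy_gtf)
# Transcripts present in the FASTA but absent from the annotation.
missing = [t for t in fasta_ids(toy_fasta) if t not in mapping]
print("transcripts without annotation:", missing)
```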
Details
- Added: 11/14/2019
- Last Updated: 11/24/2024
Publications
Deng W, Mou T, Kalari KR, Niu N, Wang L, Pawitan Y, Vu TN. Alternating EM algorithm for a bilinear model in isoform quantification from RNA-seq data. Bioinformatics. 2019;36(3):805-812. doi:10.1093/bioinformatics/btz640. PMID:31400221. PMCID:PMC9883676.