XAEM

XAEM estimates isoform-level gene expression from RNA-seq data by jointly modeling transcript abundance and read-assignment biases using a bilinear statistical framework.


Key Features:

  • Bilinear Model Framework: Treats both the design matrix X and the coefficients β as unknowns, so that the two can be estimated simultaneously.
  • Empirical Bias Correction: Empirically and automatically corrects for potentially unknown biases in RNA-seq data.
  • Alternating Expectation-Maximization (AEM) Algorithm: Iteratively estimates X and β by alternating between the two.
  • Quasi-mapping for Speed: Incorporates quasi-mapping for read alignment to reduce computational time while preserving accuracy.
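
The alternating idea behind the bilinear model can be illustrated with a toy example. The sketch below is not XAEM's implementation: it assumes a count matrix Y (rows are read classes, columns are samples) modeled as Poisson with mean Xβ, and it uses the standard multiplicative EM-type updates for non-negative factorization under a Poisson likelihood. The function names (poisson_nll, alternating_updates), the matrix B standing in for β, the column normalization of X, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def poisson_nll(Y, X, B, eps=1e-12):
    """Poisson negative log-likelihood of Y given mean X @ B (constant term dropped)."""
    M = X @ B
    return float(np.sum(M - Y * np.log(M + eps)))

def alternating_updates(Y, X0, n_iter=200, eps=1e-12):
    """Toy alternating estimation of X and B in Y ~ Poisson(X @ B).

    Assumption: each column of X is scaled to sum to 1 so that the scale
    of the product is carried entirely by B.
    """
    X = np.asarray(X0, dtype=float).copy()
    X /= X.sum(axis=0, keepdims=True)
    n_iso = X.shape[1]
    # Start B by splitting each sample's total count evenly across isoforms.
    B = np.tile(Y.sum(axis=0) / n_iso, (n_iso, 1))
    history = [poisson_nll(Y, X, B)]
    for _ in range(n_iter):
        # Update B with X held fixed.
        R = Y / (X @ B + eps)
        B *= (X.T @ R) / (X.sum(axis=0)[:, None] + eps)
        # Update X with B held fixed.
        R = Y / (X @ B + eps)
        X *= (R @ B.T) / (B.sum(axis=1)[None, :] + eps)
        # Renormalize the columns of X and move the scale into B,
        # which leaves the product X @ B unchanged.
        scale = X.sum(axis=0, keepdims=True) + eps
        X /= scale
        B *= scale.T
        history.append(poisson_nll(Y, X, B))
    return X, B, history

# Simulated example: 8 read classes, 3 isoforms, 6 samples.
rng = np.random.default_rng(0)
X_true = rng.random((8, 3))
X_true /= X_true.sum(axis=0)
B_true = rng.uniform(50, 500, size=(3, 6))
Y = rng.poisson(X_true @ B_true)

# Start from a perturbed design matrix, as if the initial X were biased.
X0 = X_true + 0.05 * rng.random(X_true.shape)
X_hat, B_hat, history = alternating_updates(Y, X0)
print("negative log-likelihood: start", history[0], "end", history[-1])
```

Sharing X across the columns of Y is what lets several samples inform the same design matrix in this sketch; with a single sample, X and B could not be separated without further constraints.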

Scientific Applications:

  • Multi-sample Analysis: Jointly estimates the design matrix and coefficients across multi-sample RNA-seq data to improve reliability of expression estimates.
  • Improved Accuracy in Simulations: Achieves higher accuracy than existing methods in simulations involving multiple-isoform genes.
  • Differential Expression Analysis in scRNA-seq: In differential expression analysis of single-cell RNA-seq data, achieves substantially better rediscovery rates in independent validation sets.

Methodology:

XAEM requires a transcript FASTA file and a GTF annotation file as input. It supports various reference genomes and annotations; UCSC hg19 is given as an example. Computationally, it combines a bilinear model that treats X and β as unknowns, an alternating expectation-maximization algorithm that estimates them iteratively, empirical bias correction, and quasi-mapping for read alignment.

Details

Added: 11/14/2019
Last Updated: 11/24/2024

Publications

Deng W, Mou T, Kalari KR, Niu N, Wang L, Pawitan Y, Vu TN. Alternating EM algorithm for a bilinear model in isoform quantification from RNA-seq data. Bioinformatics. 2020;36(3):805-812. doi:10.1093/bioinformatics/btz640. PMID:31400221. PMCID:PMC9883676.