SEQ-EM
SEQ-EM estimates expression levels of homologous genes from short-read RNA-seq data by using a probabilistic model that integrates multimapping reads and accounts for genetic variability.
Key Features:
- Handling Multimapping Reads: Integrates reads that map to multiple genomic locations into its probabilistic model rather than discarding them.
- Probabilistic Modeling: Uses a maximum likelihood-based probabilistic framework to estimate expression parameters from RNA-seq data.
- Accounting for Genetic Variability: Considers genetic discrepancies between the sequenced DNA and the reference genome, including single nucleotide polymorphisms (SNPs), when estimating RNA concentrations.
- Expectation-Maximization and Variant Calling: Implements an Expectation-Maximization (EM) algorithm for expression estimation. The current version does not perform SNP variant calling within the EM algorithm.
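The EM idea behind these features can be written compactly. What follows is the standard mixture-model formulation for multimapping reads, offered as an illustration and not necessarily the exact model used by SEQ-EM. The notation is assumed here: $\theta_g$ is the relative abundance of gene $g$, $r_i$ is read $i$ out of $N$ reads, and $P(r_i \mid g)$ is the probability of observing the read if it originated from gene $g$ (the term through which mismatches such as SNPs would enter).

```latex
% E-step: fractional assignment of read i to gene g under the current estimate
w_{ig}^{(t)} = \frac{\theta_g^{(t)} \, P(r_i \mid g)}{\sum_{g'} \theta_{g'}^{(t)} \, P(r_i \mid g')}

% M-step: re-estimate abundances from the fractional assignments
\theta_g^{(t+1)} = \frac{1}{N} \sum_{i=1}^{N} w_{ig}^{(t)}
```

A read that maps to a single gene has $w_{ig} = 1$ for that gene, so only multimapping reads are redistributed between iterations.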
Scientific Applications:
- Precise Expression Quantification: Quantifies gene expression in RNA-seq experiments that require accurate measurement of homologous or closely related genes.
- Complex Genomes and Paralogous Genes: Analysis of expression in complex genomes or gene families where multimapping reads and sequence similarity confound read assignment.
- Improved RNA-seq Sensitivity: Enhancing accuracy and statistical power of RNA-seq studies to investigate gene expression patterns and regulatory mechanisms.
Methodology:
SEQ-EM maps short RNA-seq reads to a reference genome and uses read counts per gene as indicators of RNA concentration. It then applies a maximum likelihood probabilistic model, fitted via Expectation-Maximization, that integrates multimapping reads and accounts for genetic discrepancies between the sequenced DNA and the reference.
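The general technique described here, EM-based assignment of multimapping reads, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SEQ-EM's actual implementation or interface: the function names `alignment_likelihood` and `em_expression`, the input layout, and the single mismatch rate `eps` (which lumps sequencing error together with divergence from the reference) are all hypothetical, and gene-length normalization is omitted.

```python
def alignment_likelihood(mismatches, read_length, eps=0.01):
    """Hypothetical per-alignment likelihood: each base mismatches the
    reference independently with probability eps (assumed, not from SEQ-EM)."""
    return (eps ** mismatches) * ((1.0 - eps) ** (read_length - mismatches))


def em_expression(read_alignments, n_iter=100, tol=1e-9):
    """Estimate relative gene abundances from reads that may multimap.

    read_alignments: one dict per read, mapping gene name -> likelihood of
    the read given that gene (only genes the read aligns to are listed).
    Returns a dict mapping gene name -> estimated fraction of reads.
    """
    genes = sorted({g for aln in read_alignments for g in aln})
    # Start from a uniform abundance estimate.
    theta = {g: 1.0 / len(genes) for g in genes}

    for _ in range(n_iter):
        # E-step: split each read across its candidate genes in proportion
        # to (current abundance) * (alignment likelihood).
        expected = dict.fromkeys(genes, 0.0)
        for aln in read_alignments:
            weights = {g: theta[g] * p for g, p in aln.items()}
            total = sum(weights.values())
            if total == 0.0:
                continue  # read carries no usable information
            for g, w in weights.items():
                expected[g] += w / total

        # M-step: new abundances are the normalized expected read counts.
        n_assigned = sum(expected.values())
        new_theta = {g: expected[g] / n_assigned for g in genes}

        change = max(abs(new_theta[g] - theta[g]) for g in genes)
        theta = new_theta
        if change < tol:
            break
    return theta


# Toy data: 6 reads unique to geneA, 2 unique to geneB, and 4 reads that
# align equally well (0 mismatches, 50 bp) to both paralogs.
p = alignment_likelihood(0, 50)
reads = [{"geneA": p}] * 6 + [{"geneB": p}] * 2 + [{"geneA": p, "geneB": p}] * 4
theta = em_expression(reads)
print(theta)
```

In this toy case the ambiguous reads are shared in proportion to the evidence from the unique reads, so the estimates settle near 0.75 for `geneA` and 0.25 for `geneB` instead of the even split a naive scheme would produce. A read with fewer mismatches against one paralog would shift further toward that gene through its larger likelihood.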
Details
- Maturity: Legacy
- Tool Type: command-line tool
- Operating Systems: Linux
- Added: 8/3/2017
- Last Updated: 11/25/2024
Publications
Paşaniuc B, Zaitlen N, Halperin E. Accurate Estimation of Expression Levels of Homologous Genes in RNA-seq Experiments. Journal of Computational Biology. 2011;18(3):459-468. doi:10.1089/cmb.2010.0259. PMID:21385047.