MOIRAI

MOIRAI processes and analyzes Cap Analysis of Gene Expression (CAGE) sequencing data to identify 5' RNA ends, pinpoint transcription start sites (TSS), and quantify their expression from mapped read counts.


Key Features:

  • Mapping Workflow: Aligns CAGE reads to a reference genome to determine genomic coordinates of 5' ends.
  • Annotation Workflow: Annotates mapped peaks with genomic features to aid interpretation of TSS locations.
  • Expression Analysis Workflow: Compares expression levels across multiple samples by quantifying read counts at TSS and peaks.
  • Quality Control Indicators: Embeds graphical quality control indicators in each workflow so that data quality can be assessed as the analysis runs.
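
The core idea behind the mapping and expression workflows, locating the 5' end of each mapped read and counting reads per position, can be illustrated with a minimal Python sketch. This is not MOIRAI's own code: the function names, the tuple layout, and the BED-style coordinates (0-based start, exclusive end) are assumptions made for illustration.

```python
from collections import Counter

def five_prime_end(start, end, strand):
    """Return the 0-based position of a read's 5' end.

    Assumes BED-style coordinates (0-based start, exclusive end).
    On the minus strand the 5' end is the last base of the interval.
    """
    return start if strand == "+" else end - 1

def count_five_prime_ends(reads):
    """Count reads per (chrom, 5' position, strand)."""
    counts = Counter()
    for chrom, start, end, strand in reads:
        counts[(chrom, five_prime_end(start, end, strand), strand)] += 1
    return counts

# Hypothetical mapped reads: (chrom, start, end, strand)
reads = [
    ("chr1", 100, 127, "+"),
    ("chr1", 100, 125, "+"),
    ("chr1", 80, 105, "-"),
]
ctss = count_five_prime_ends(reads)
```

Positions with many supporting reads are candidate transcription start sites; grouping neighbouring positions into peaks is a separate step not shown here.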

Scientific Applications:

  • TSS identification and quantification: Detects transcription start sites and measures their expression using CAGE-derived 5' read counts.
  • Comparative expression analysis: Compares TSS and peak expression across samples to investigate gene regulation.
  • Genomic feature annotation: Maps CAGE peaks to genomic annotations to support interpretation of promoter activity.
  • Large-scale project analysis: Applicable to integrative analyses in projects such as FANTOM and ENCODE.
  • Protocol development support: Supports development and evaluation of new sequencing-based protocols via integrated QC and analysis workflows.
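
As a rough illustration of comparative expression analysis, the sketch below normalizes per-peak read counts to tags per million and computes a log2 fold change between two samples. The normalization choice, the pseudocount, the function names, and the example counts are all assumptions for illustration, not a description of how MOIRAI performs the comparison.

```python
import math

def tags_per_million(counts):
    """Scale raw read counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        return {peak: 0.0 for peak in counts}
    return {peak: n * 1_000_000 / total for peak, n in counts.items()}

def log2_fold_change(tpm_a, tpm_b, pseudocount=1.0):
    """log2(B / A) per peak; the pseudocount avoids division by zero."""
    peaks = set(tpm_a) | set(tpm_b)
    return {
        peak: math.log2(
            (tpm_b.get(peak, 0.0) + pseudocount)
            / (tpm_a.get(peak, 0.0) + pseudocount)
        )
        for peak in peaks
    }

# Hypothetical raw counts per peak in two samples
sample_a = {"peak1": 30, "peak2": 70}
sample_b = {"peak1": 150, "peak2": 50}

tpm_a = tags_per_million(sample_a)
tpm_b = tags_per_million(sample_b)
lfc = log2_fold_change(tpm_a, tpm_b)
```

A positive value means the peak is relatively more expressed in the second sample. A real comparison would also need replicates and a statistical test.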

Methodology:

The computational steps are: aligning CAGE reads to a reference genome, annotating mapped peaks with genomic features, counting reads to quantify TSS and peak expression, comparing expression across samples, and generating graphical quality control indicators.
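
The annotation step can be pictured as an interval-overlap lookup. The following Python sketch labels each peak with the features it overlaps on the same chromosome and strand, falling back to "intergenic". The data layout, the feature names, and the "intergenic" label are hypothetical, and the linear scan is chosen for brevity rather than speed; it is not taken from MOIRAI.

```python
def annotate_peaks(peaks, features):
    """Attach overlapping feature names to each peak.

    Peaks are (chrom, start, end, strand); features are
    (chrom, start, end, strand, name). Coordinates are assumed to be
    0-based with an exclusive end, so two intervals overlap when each
    starts before the other ends.
    """
    annotated = []
    for chrom, start, end, strand in peaks:
        hits = [
            name
            for f_chrom, f_start, f_end, f_strand, name in features
            if f_chrom == chrom
            and f_strand == strand
            and f_start < end
            and start < f_end
        ]
        annotated.append(((chrom, start, end, strand), hits or ["intergenic"]))
    return annotated

# Hypothetical annotation and peaks
features = [
    ("chr1", 90, 110, "+", "geneA_promoter"),
    ("chr1", 500, 900, "-", "geneB_exon1"),
]
peaks = [
    ("chr1", 98, 104, "+"),
    ("chr1", 2000, 2010, "+"),
]
annotated = annotate_peaks(peaks, features)
```

For genome-scale annotations an interval tree or sorted sweep would replace the nested scan.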

Details

Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: PHP, Java, C++, Perl
Added: 12/18/2017
Last Updated: 12/10/2018

Publications

Hasegawa A, Daub C, Carninci P, Hayashizaki Y, Lassmann T. MOIRAI: a compact workflow system for CAGE analysis. BMC Bioinformatics. 2014;15(1). doi:10.1186/1471-2105-15-144. PMID:24884663. PMCID:PMC4033680.