SeqTU

SeqTU predicts transcription units (TUs) in bacterial genomes from strand-specific RNA sequencing (ssRNA-seq) data, mapping TU boundaries and characterizing transcriptional organization.


Key Features:

  • Strand-Specific RNA-Seq Data Analysis: Utilizes strand-specific RNA sequencing (ssRNA-seq) to distinguish sense and antisense transcripts for accurate mapping of transcription units (TUs).
  • Machine Learning Approach: Employs a machine-learning algorithm that uses two primary parameters, expression-level continuity and variance across the genome, to determine TU start and end points.
  • High Prediction Accuracy: Applied to Clostridium thermocellum datasets, SeqTU predicted 2,590 distinct TUs, 44% of which contain multiple genes.
  • Validation and Generalization: Predictions were validated against an independent RNA-seq dataset with longer reads, and the method was also applied to Escherichia coli data to demonstrate cross-species applicability.
  • Functional Insights: Functional enrichment analyses on predicted TUs revealed biological insights into transcriptional and post-transcriptional regulatory mechanisms.

Scientific Applications:

  • Transcriptional regulation studies: Provides detailed TU maps to investigate transcriptional organization and regulatory mechanisms in prokaryotes.
  • Gene expression and regulatory network analysis: Supports analysis of co-transcribed genes and operon structures for studies of gene regulation.
  • Microbial genomics: Enables basic and applied research in bacterial genomics by delivering high-resolution TU predictions across species.
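
To illustrate the kind of co-transcription analysis described above, here is a minimal Python sketch. It assumes you already have, for each pair of adjacent genes, a yes/no call on whether they are co-transcribed. The function names, the input layout, and the gene names are hypothetical and do not reflect SeqTU's actual output format.

```python
def group_into_tus(genes, linked):
    """Group genome-ordered genes into transcription units.

    genes  : list of (name, strand) tuples in genome order (hypothetical layout).
    linked : list of booleans; linked[i] is True when genes[i] and genes[i + 1]
             were called co-transcribed by some upstream predictor.
    """
    if len(linked) != max(len(genes) - 1, 0):
        raise ValueError("linked must have one entry per adjacent gene pair")
    tus = []
    for i, (name, strand) in enumerate(genes):
        # A gene joins the previous unit only if the pair was called linked
        # and both genes lie on the same strand.
        joins_previous = i > 0 and linked[i - 1] and genes[i - 1][1] == strand
        if joins_previous:
            tus[-1].append(name)
        else:
            tus.append([name])
    return tus


def multi_gene_fraction(tus):
    """Fraction of units that contain more than one gene."""
    if not tus:
        return 0.0
    return sum(1 for tu in tus if len(tu) > 1) / len(tus)


# Toy input with made-up gene names.
example_genes = [("geneA", "+"), ("geneB", "+"), ("geneC", "+"),
                 ("geneD", "-"), ("geneE", "-")]
example_links = [True, False, True, True]
example_tus = group_into_tus(example_genes, example_links)
```

On this toy input, geneA and geneB form one unit, geneC stands alone, and geneD and geneE form a second multi-gene unit. The link between geneC and geneD is ignored because the two genes lie on opposite strands.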

Methodology:

Integrates machine learning with strand-specific RNA-seq (ssRNA-seq) data and uses expression-level continuity and variance across the genome to predict transcription unit boundaries.
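
The two signals named above can be illustrated with a small Python sketch. This is not SeqTU's implementation. The definitions below are assumptions chosen for illustration: continuity is taken as the fraction of covered intergenic positions, and variance as depth variance scaled by the squared mean. The function name and input layout are hypothetical.

```python
from statistics import mean, pvariance


def gap_features(coverage, gene_a, gene_b):
    """Compute two illustrative features for a pair of adjacent genes.

    coverage : list of per-base read depths on one strand (0-based).
    gene_a, gene_b : (start, end) half-open intervals, gene_a upstream of gene_b.
    Returns (continuity, relative_variance).
    """
    a_start, a_end = gene_a
    b_start, b_end = gene_b

    # Continuity (assumed definition): share of intergenic positions
    # covered by at least one read. An empty gap counts as fully continuous.
    gap = coverage[a_end:b_start]
    continuity = sum(1 for depth in gap if depth > 0) / len(gap) if gap else 1.0

    # Variance (assumed definition): population variance of depth over the
    # whole span from the start of gene_a to the end of gene_b, divided by
    # the squared mean so that highly and lowly expressed regions compare.
    span = coverage[a_start:b_end]
    mu = float(mean(span))
    relative_variance = pvariance(span) / (mu * mu) if mu > 0 else float("inf")
    return continuity, relative_variance


# Toy coverage tracks: one with reads across the gap, one with an empty gap.
covered = [10] * 100 + [9] * 20 + [11] * 100
broken = [10] * 100 + [0] * 20 + [11] * 100
features_covered = gap_features(covered, (0, 100), (120, 220))
features_broken = gap_features(broken, (0, 100), (120, 220))
```

In this toy case the covered track gives a continuity of 1.0 with low relative variance, while the broken track gives a continuity of 0.0 with higher relative variance. A classifier trained on features of this kind could then separate gene pairs that share a TU from pairs that do not.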

Details

Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: R
Added: 12/18/2017
Last Updated: 11/25/2024

Publications

Chou W, Ma Q, Yang S, Cao S, Klingeman DM, Brown SD, Xu Y. Analysis of strand-specific RNA-seq data using machine learning reveals the structures of transcription units in Clostridium thermocellum. Nucleic Acids Research. 2015;43(10):e67. doi:10.1093/nar/gkv177. PMID:25765651. PMCID:PMC4446414.
