SeqTU
SeqTU predicts transcription units from strand-specific RNA sequencing (ssRNA-seq) data in bacterial genomes to map TU boundaries and characterize transcriptional organization.
Key Features:
- Strand-Specific RNA-Seq Data Analysis: Utilizes strand-specific RNA sequencing (ssRNA-seq) to distinguish sense and antisense transcripts for accurate mapping of transcription units (TUs).
- Machine Learning Approach: Employs a machine-learning algorithm using two primary parameters—expression-level continuity and variance across the genome—to determine TU start and end points.
- High Prediction Accuracy: Applied to Clostridium thermocellum datasets, SeqTU predicted 2,590 distinct TUs, 44% of which contain multiple genes.
- Validation and Generalization: Predictions were validated against an independent RNA-seq dataset with longer reads, and the method was also applied to Escherichia coli data to demonstrate cross-species applicability.
- Functional Insights: Functional enrichment analyses on predicted TUs revealed biological insights into transcriptional and post-transcriptional regulatory mechanisms.
Scientific Applications:
- Transcriptional regulation studies: Provides detailed TU maps to investigate transcriptional organization and regulatory mechanisms in prokaryotes.
- Gene expression and regulatory network analysis: Supports analysis of co-transcribed genes and operon structures for studies of gene regulation.
- Microbial genomics: Enables basic and applied research in bacterial genomics by delivering high-resolution TU predictions across species.
Methodology:
Integrates machine learning with strand-specific RNA-seq (ssRNA-seq) data and uses expression-level continuity and variance across the genome to predict transcription unit boundaries.
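To make the two signals concrete, the following is a minimal Python sketch, not SeqTU's actual code (the tool itself is written in R). It assumes per-base read depth on one strand is available as a list, and it computes illustrative stand-ins for "continuity" and "variance" across two adjacent genes and the gap between them. The function name `gap_features`, the half-open coordinate convention, and both feature definitions are assumptions made for this example; the published method defines its own features and feeds them to a trained classifier.

```python
from statistics import mean, pvariance


def gap_features(coverage, gene_a, gene_b):
    """Illustrative features for two adjacent same-strand genes.

    coverage: per-base read depth on one strand (0-based list).
    gene_a, gene_b: (start, end) half-open intervals, gene_a upstream.
    Both feature definitions below are hypothetical, not SeqTU's own.
    """
    a_start, a_end = gene_a
    b_start, b_end = gene_b

    # Intergenic bases; empty when the genes abut or overlap.
    gap = coverage[a_end:b_start]
    if gap:
        # Continuity: fraction of intergenic bases covered by >= 1 read.
        continuity = sum(1 for depth in gap if depth > 0) / len(gap)
    else:
        continuity = 1.0

    # Variance: population variance of depth over both genes plus the gap,
    # scaled by the squared mean so highly and weakly expressed regions
    # are comparable.
    span = coverage[a_start:b_end]
    span_mean = mean(span) if span else 0
    if span_mean == 0:
        rel_variance = 0.0
    else:
        rel_variance = pvariance(span) / (span_mean * span_mean)

    return {"continuity": continuity, "rel_variance": rel_variance}


# Uniform depth across both genes and the gap: consistent with one TU.
same_tu = gap_features([10] * 20, (0, 8), (12, 20))

# Depth drops to zero between the genes: suggests a TU boundary.
split_tu = gap_features([10] * 8 + [0] * 4 + [10] * 8, (0, 8), (12, 20))
```

In this toy input, unbroken and even coverage yields high continuity and low relative variance, while a coverage gap yields the opposite. A real pipeline would pass such features to a classifier trained on known examples rather than reading them by eye.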
Details
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: R
- Added: 12/18/2017
- Last Updated: 11/25/2024
Operations
RNA-Seq analysis
Publications
Chou W, Ma Q, Yang S, Cao S, Klingeman DM, Brown SD, Xu Y. Analysis of strand-specific RNA-seq data using machine learning reveals the structures of transcription units in Clostridium thermocellum. Nucleic Acids Research. 2015;43(10):e67. doi:10.1093/nar/gkv177. PMID:25765651. PMCID:PMC4446414.