DeepBound
DeepBound identifies transcript boundaries from RNA sequencing (RNA-seq) read alignments using deep convolutional neural fields to improve transcript assembly and the discovery of novel genes and transcripts.
Key Features:
- Deep convolutional neural fields: Uses deep convolutional neural fields to learn hidden distributions and patterns of transcript boundaries.
- AUC-based optimization: Integrates the AUC (area under the curve) score into the optimization objective to enhance boundary prediction accuracy.
- Label-imbalance handling: Addresses label-imbalance issues inherent in boundary training data through the AUC-based objective.
- Simulated training data: Utilizes simulated RNA-seq datasets for training to provide large labeled sample sets for the deep probabilistic graphical model.
- Noisy/weak signal targeting: Specifically tackles noisy and weak signals at transcript boundaries that complicate accurate determination from RNA-seq reads.
- Complementary to spliced-read junction detection: Targets transcript boundaries, a problem distinct from splice-junction detection, for which spliced reads already provide direct evidence.
- Empirical performance: Outperformed existing methods on simulated datasets from two species and on biological datasets.
Scientific Applications:
- Transcript assembly: Improves full-length transcript assembly by providing more accurate boundary locations from RNA-seq data.
- Novel gene and transcript discovery: Enables discovery of novel genes and transcripts through enhanced boundary detection.
- Gene expression and functional studies: Facilitates more complete transcript models for analysis of gene expression patterns and gene function.
- Method benchmarking: Serves as a benchmark for evaluating transcript boundary identification methods on simulated and biological datasets.
Methodology:
DeepBound implements deep convolutional neural fields trained on simulated RNA-seq datasets. Its optimization objective incorporates the AUC (area under the curve) score, which helps the model learn hidden distributions of transcript boundaries and addresses label imbalance.
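To illustrate why an AUC-based objective helps with label imbalance, here is a minimal Python sketch. It is not DeepBound's code: the function names (`auc`, `soft_auc`, `accuracy`), the sigmoid surrogate, and the toy scores are all assumptions made for illustration. The sketch compares a model that scores every position identically with one that ranks the rare boundary positions above the rest.

```python
import math


def _split(scores, labels):
    """Separate scores of positive (boundary) and negative positions."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    return pos, neg


def auc(scores, labels):
    """Exact AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos, neg = _split(scores, labels)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_auc(scores, labels, scale=1.0):
    """Smooth AUC surrogate (an assumption, not the published objective).

    The step function over each pair is replaced by a sigmoid, which
    makes the quantity differentiable with respect to the scores.
    """
    pos, neg = _split(scores, labels)
    total = sum(1.0 / (1.0 + math.exp(-scale * (p - n)))
                for p in pos for n in neg)
    return total / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of positions classified correctly at a fixed threshold."""
    hits = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)


# Toy data: 98 non-boundary positions and 2 boundary positions.
labels = [0] * 98 + [1] * 2
trivial_scores = [0.1] * 100             # same score everywhere
ranked_scores = [0.1] * 98 + [0.4] * 2   # boundaries ranked higher

for name, scores in [("trivial", trivial_scores), ("ranked", ranked_scores)]:
    print(name, accuracy(scores, labels), auc(scores, labels),
          round(soft_auc(scores, labels), 3))
```

On this toy data both models reach the same accuracy, because predicting "no boundary" everywhere is right 98% of the time, while AUC separates them: 0.5 for the uniform scores and 1.0 for the correctly ranked ones. The smooth surrogate moves in the same direction, which is the property that makes an AUC-style term usable inside a training objective.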
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Mac
- Programming Languages: Shell
- Added: 6/15/2018
- Last Updated: 11/25/2024
Publications
Shao M, Ma J, Wang S. DeepBound: accurate identification of transcript boundaries via deep convolutional neural fields. Bioinformatics. 2017;33(14):i267-i273. doi:10.1093/bioinformatics/btx267. PMID:28881999. PMCID:PMC5870651.
Funding:
- National Institutes of Health: R01HG007104
- National Science Foundation: DBI-1564955