SAMSA
SAMSA is a bioinformatics pipeline designed specifically for analyzing metatranscriptome datasets, a type of data for which biologists currently have few analysis options. Metatranscriptomics, the study of the activity of diverse microbial populations based on RNA-seq data, has grown in popularity, but few approaches exist for processing this type of data. Current approaches either rely on restricted databases and a dedicated computing cluster, or are metagenome-based methods that have not been fully evaluated on metatranscriptomic datasets.
SAMSA reports metatranscriptome transcription activity levels by organism or by transcript function. The pipeline is fully open source and publicly available.
The SAMSA software package is constructed to run with Metagenome-RAST (MG-RAST) servers. It is designed for use by researchers with relatively little bioinformatics experience. SAMSA can summarize and evaluate raw annotation results, identifying abundant species and significant functional differences between metatranscriptomes.
The researchers used this new tool to evaluate best practices for sequencing stool metatranscriptomes and were able to determine experimental requirements for fecal gut metatranscriptomes. Sequences must be either long reads (longer than 100 bp) or joined paired-end reads. Each sample needs 40-50 million raw sequences, which can be expected to yield the 5-10 million annotated reads necessary for accurate abundance measures.
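The requirements above can be expressed as a simple pre-flight check. The following is a minimal sketch only, not part of SAMSA itself: the `SampleSummary` class, the `check_sample` function, and the use of a mean read length as a stand-in for "reads longer than 100 bp" are all assumptions made for illustration. Only the thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the stated requirements.
MIN_READ_LENGTH_BP = 100      # long reads must be longer than 100 bp
MIN_RAW_READS = 40_000_000    # lower end of the 40-50 million raw sequences


@dataclass
class SampleSummary:
    """Hypothetical per-sample summary; not a SAMSA data structure."""
    name: str
    raw_reads: int
    mean_read_length_bp: float   # assumption: mean length approximates read length
    paired_end_joined: bool


def check_sample(sample: SampleSummary) -> list[str]:
    """Return a list of unmet requirements (empty if the sample passes)."""
    problems = []
    # Reads must be either long (> 100 bp) or joined paired-end reads.
    long_enough = sample.mean_read_length_bp > MIN_READ_LENGTH_BP
    if not (long_enough or sample.paired_end_joined):
        problems.append("reads are neither longer than 100 bp nor joined paired-end")
    # Each sample needs roughly 40-50 million raw sequences.
    if sample.raw_reads < MIN_RAW_READS:
        problems.append("fewer than 40 million raw sequences")
    return problems


if __name__ == "__main__":
    sample = SampleSummary("stool_01", 45_000_000, 150.0, False)
    print(sample.name, check_sample(sample) or "meets the stated requirements")
```

A sample that passes this check is only expected, not guaranteed, to reach the 5-10 million annotated reads; the actual annotated count still has to be confirmed after annotation.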
The researchers also demonstrated that ribosomal RNA depletion does not deplete rRNA equally from all species within a sample, so remaining rRNA sequences should be discarded. Using publicly available metatranscriptome data in which rRNA was not depleted, they demonstrated that overall organism transcriptional activity could be measured using mRNA counts. They also detected significant differences between control and experimental groups in both organism transcriptional activity and specific cellular functions.
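The idea of discarding rRNA annotations and measuring organism activity from mRNA counts can be sketched as follows. This is an illustrative example under stated assumptions, not SAMSA's implementation: the `(organism, function)` tuple format, the `is_rrna` keyword match, and the `organism_activity` function are hypothetical and do not reflect the actual MG-RAST output format.

```python
from collections import Counter


def is_rrna(function: str) -> bool:
    """Hypothetical keyword test for rRNA annotations (an assumption)."""
    text = function.lower()
    return "ribosomal rna" in text or "rrna" in text


def organism_activity(annotations):
    """Return each organism's share of mRNA annotations.

    `annotations` is an iterable of (organism, function) pairs,
    a made-up format used here only for illustration.
    """
    counts = Counter()
    for organism, function in annotations:
        if is_rrna(function):
            continue  # discard remaining rRNA sequences
        counts[organism] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Relative activity: fraction of all mRNA annotations per organism.
    return {organism: n / total for organism, n in counts.items()}


if __name__ == "__main__":
    example = [
        ("Organism A", "16S ribosomal RNA"),
        ("Organism A", "ribosomal protein L2"),  # protein-coding, so it is kept
        ("Organism B", "ATP synthase"),
        ("Organism B", "23S rRNA"),
        ("Organism A", "beta-galactosidase"),
    ]
    for organism, share in sorted(organism_activity(example).items()):
        print(f"{organism}: {share:.2f}")
```

Normalizing by the total mRNA count makes samples with different sequencing depths comparable; testing for significant differences between control and experimental groups would require a separate statistical step not shown here.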
Topic
Transcriptomics;Metagenomics
Detail
Operation: Sequence analysis
Software interface: Workflow
Language: R;Python
License: -
Cost: Free
Version name: 1.0
Credit: University of California, Davis, the Peter J. Shields Endowed Chair in Dairy Food Science.
Input: -
Output: -
Contact: Samuel T. Westreich stwestreich@ucdavis.edu, Danielle G. Lemay dglemay@ucdavis.edu
Collection: -
Maturity: -
Publications
- SAMSA: A Comprehensive Metatranscriptome Analysis Pipeline
- Westreich ST, et al. SAMSA: a comprehensive metatranscriptome analysis pipeline. BMC Bioinformatics. 2016; 17:399. doi: 10.1186/s12859-016-1270-8
- https://doi.org/10.1186/s12859-016-1270-8
- PMID: 27687690
- PMC: PMC5041328
Download and documentation
Source: https://github.com/transcript/SAMSA/releases/tag/v1.0
Documentation: https://github.com/transcript/SAMSA#readme
Home page: https://github.com/transcript/SAMSA