249 Free RNA-seq Core Analysis Tools - Software and Resources

1. Transcriptome Profiling

1.1 Read mapping or assembly

1.1.1 De novo (reference free) transcriptome assembly

1.1.1.1 Unstranded

Trans-ABySS

Description : Trans-ABySS is a tool for de novo transcriptome assembly using short reads. The Trans-ABySS algorithm specifically addresses issues caused by local coverage variations by first computing assemblies of substrings using various stringencies. It then merges the separate assemblies into contigs. It can handle paired-end reads, multiple insert sizes, but not strandedness. This tool requires ABySS and BLAT.

SOAPdenovo-Trans

Description : SOAPdenovo-Trans is a de novo RNA-seq full-length transcriptome assembler. The SOAPdenovo-Trans algorithm adapts the SOAPdenovo framework, uses the error removal technique from Trinity and the graph traversal model from Oases, and applies a transitive reduction to simplify scaffolding graphs. It can handle paired-end reads and multiple insert sizes. The physical memory requirement is large.

IDBA-Tran

Description : IDBA-Tran is a de novo RNA-seq transcriptome assembler. The IDBA-Tran algorithm uses de Bruijn graphs, can handle paired-end reads and isoforms, and uses a probabilistic heuristic method to remove incorrect vertices.
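
The de Bruijn graph idea behind assemblers of this kind can be illustrated with a minimal Python sketch. This is not IDBA-Tran's implementation: the function names, the toy reads, and k=4 are assumptions made for illustration, and real assemblers add error removal, coverage handling, and isoform resolution on top.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the list of (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])  # edge: prefix -> suffix
    return dict(graph)


def walk_unbranched(graph, start):
    """Extend a contig from start while each node has exactly one distinct successor."""
    contig, node, seen = start, start, {start}
    while node in graph:
        successors = set(graph[node])
        if len(successors) != 1:
            break  # branch point: a real assembler would have to resolve it
        node = successors.pop()
        if node in seen:
            break  # stop on cycles
        seen.add(node)
        contig += node[-1]
    return contig


reads = ["ATGGCG", "GGCGTG", "CGTGCA"]  # toy reads, hypothetical
graph = build_de_bruijn(reads, k=4)
contig = walk_unbranched(graph, "ATG")
print(contig)  # ATGGCGTGCA
```

The three overlapping reads collapse into a single path, which is read back as one contig. The walk only checks outgoing branches, which is a simplification.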

RNAbrowse

Description : RNAbrowse is a browser for de novo RNA-seq data assembly results.

1.1.1.2 Stranded

StringTie

Description : A tool to assemble RNA-seq sequence alignments into transcripts. The StringTie algorithm can optionally make de novo assemblies and uses a novel network flow algorithm.

Cidane

Description : A tool to assemble ab initio and quantify transcripts in RNA-seq data. The Cidane algorithm also annotates known splice sites, transcription starts and ends.

Rnnotator

Description : Rnnotator is a pipeline tool for the generation of full-length transcript models by computing de novo assemblies of RNA-seq data sets. The Rnnotator algorithm specifically addresses issues arising from poor read quality and read length, and can make deep coverage assemblies. It can use paired-end and stranded reads and multiple insert sizes, and it works on multiple CPUs. Obtaining Rnnotator requires a license unless you are collaborating with the developer. Contact David Gilbert at Lawrence Berkeley National Laboratory (DEGilbert_at_lbl.gov) for more information.

ABySS

Description : ABySS is a tool for de novo genome assembly using short read data. It implements a distributed representation of de Bruijn graphs, which enables parallel computation of the assembly algorithm. ABySS stands for Assembly By Short Sequences.

Oases

Description : Oases is a tool for assembling de novo transcriptomes using short RNA-seq reads. The Oases algorithm uses dynamic error removal in the prediction of full-length transcripts, and it can handle a wide range of expression values and the absence of alternative isoforms. Requires Velvet 1.2.08 or higher (see links).

Trinity

Description : Trinity is a tool for de novo transcriptome assembly of RNA-seq data and consists of three modules: Inchworm, Chrysalis, and Butterfly. The algorithm uses de Bruijn graphs and a dynamic programming method. It can detect isoforms and handle paired-end reads, multiple insert sizes, and strandedness. The running time grows exponentially with the number of graph branches.

Scripture

Description : Scripture is a tool for de novo assembly of RNA-seq full-length gene transcriptome data. The Scripture algorithm needs both reads and a genome sequence, and can handle strandedness.

Bridger

Description : Bridger is a tool for de novo assembly of RNA-seq full-length transcriptome data. The Bridger algorithm adapts schemes used in Cufflinks and Trinity. It can handle paired-end reads and multiple insert sizes. The sensitivity and specificity are similar to those of Cufflinks. Bridger runs faster and requires less physical memory than several other assembly tools.

BinPacker

Description : BinPacker is a tool for de novo RNA-seq full-length transcriptome assembly. It can handle paired-end reads.

rnaSPAdes

Description : rnaSPAdes is a de novo RNA-seq full-length transcriptome assembly tool. rnaSPAdes extends the SPAdes genome assembler and can handle paired-end reads, isoforms, and multiple insert sizes.

Bayesembler

Description : Bayesembler is a tool

1.1.1.3 Quality Control

DETONATE

Description : DETONATE (DE novo TranscriptOme rNa-seq Assembly with or without the Truth Evaluation) is a tool to evaluate de novo RNA-seq transcriptome assemblies. The DETONATE package consists of two modules, RSEM-EVAL and REF-EVAL.

TransRate

Description : TransRate is a tool to assess and analyze de novo RNA-seq transcriptome assemblies. TransRate does not need a reference genome, and its report covers structural errors, chimeras, incorrect bases, and incomplete assembly. The algorithm uses two dedicated statistics, the TransRate contig score and the TransRate assembly score.

1.1.2 Mapping to a reference genome or transcriptome

1.1.2.1 Splice Aware

HISAT2

Description : A tool to map DNA and RNA sequences to one or more genomes. The HISAT2 algorithm uses an extension of the Burrows-Wheeler transform (BWT) to generate graphs, a new graph FM index (GFM), and a Hierarchical Graph FM index (HGFM) to index a whole-genome and population of genomes.

EventPointer

Description : EventPointer identifies alternative splicing events in simple or complex experimental designs, such as time course experiments and studies with paired samples. The algorithm can analyze data from either junction arrays or sequencing, and it can generate a series of files to visualize the detected alternative splicing events in IGV. This eases the interpretation of results and the design of primers for standard PCR validation.

OLego

Description : A tool for mapping of spliced mRNA-seq reads. The OLego algorithm uses the Burrows-Wheeler transform for the mapping of seeds, splice junctions, and detection of exons. The algorithm also allows multiple threads.

Subjunc

Description : Subjunc is a tool to align RNA-seq reads and to detect exon-exon junctions and gene fusions. Subjunc is part of the Subread package (see links). An R version is also available as Rsubread.

Subread

Description : Subread is a software package for the alignment of both DNA-seq and RNA-seq read data, quantification, and mutation detection. The Subread package consists of five separate tools: 1. Subread, a read aligner for both RNA-seq and DNA-seq data, 2. Subjunc, a read aligner for RNA-seq data that detects exon-exon junctions and gene fusion events, 3. featureCounts, a read counting tool, 4. Sublong, a long-read aligner that uses the seed-and-vote technique, and 5. exactSNP, a single-nucleotide polymorphism discovery tool. An R version of the Subread package is also available, Rsubread.

GSNAP

Description : GSNAP (Genomic Short-read Nucleotide Alignment Program) is a tool to align single- and paired-end reads to a reference genome. The GSNAP algorithm is based on the seed-and-extend method and works on reads down to 14 nucleotides of length, and computes SNP-tolerant alignments of various combinations of major and minor alleles. The algorithm can discover long-distance and interchromosomal splicing events by utilizing known splice sites data or by probabilistic models. In addition, the GSNAP algorithm can construct alignments using reads originating from bisulfite-treated DNA samples.

CRAC

Description : CRAC is a tool to map RNA-seq reads. The CRAC algorithm uses a k-mer profiling approach to identify substitutions, insertions/deletions (indels), and chimeric junctions.

STAR

Description : A tool to align RNA-seq data. The STAR algorithm uses suffix arrays, seed clustering, and stitching. It can detect non-canonical splice sites, chimeric sequences, and can also map full-length RNA sequences.
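
As a rough illustration of how a suffix array supports seed lookup, the following Python sketch finds all exact occurrences of a short seed in a toy reference with two binary searches. It is not STAR's implementation: the function names and the toy sequence are assumptions, and STAR's seed clustering, stitching, and splice handling are not shown.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_exact(text, suffix_array, pattern):
    """Return sorted start positions where pattern occurs, via two binary searches."""
    m = len(pattern)

    def bound(strict):
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[suffix_array[mid]:suffix_array[mid] + m]
            if prefix < pattern or (not strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    first = bound(strict=True)   # first suffix whose prefix is >= pattern
    last = bound(strict=False)   # first suffix whose prefix is > pattern
    return sorted(suffix_array[first:last])


genome = "GATTACATTA"  # toy reference, hypothetical
sa = build_suffix_array(genome)
hits = find_exact(genome, sa, "TTA")
print(hits)  # [2, 7]
```

Building the array by sorting full suffixes is only practical for small inputs; production aligners use dedicated construction algorithms.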

TruHmm

Description : TruHmm is a tool for assembling prokaryote RNA-seq transcriptomes based on a reference.

MaLTA

Description : A tool to assemble and quantify transcripts in Ion Torrent RNA-seq data sets. MaLTA uses the IsoEM algorithm for the estimation of expression levels. It also uses a maximum likelihood method in both the assembly and quantification steps.

Necklace

Description : A tool to assemble RNA-seq data. The Necklace algorithm can assemble transcriptomes both de novo and guided by a template. It combines an assembled transcriptome with annotations from a reference.

Rail-RNA

Description : Rail-RNA is a tool to align spliced sequences from RNA-seq data. Rail-RNA is cloud-enabled and can analyze multiple samples at a time.

TopHat

Description : TopHat is a tool for splice-aware mapping of RNA-seq reads. TopHat uses the Bowtie short read aligner (a BWT-based algorithm) for the mapping, after which it identifies intron-exon (splice) junctions. TopHat can use paired-end sequencing reads and parallel computation. (*BWT=Burrows–Wheeler transform)

MapSplice

Description : MapSplice is a tool to align RNA-seq reads to a reference sequence. The MapSplice algorithm uses the Burrows-Wheeler Transform (BWT) technique and can discover both canonical and non-canonical splice sites.

Rbowtie2

Description : Rbowtie2 is an R tool that wraps the Bowtie 2 tool and includes adapter removal, read merging and identification.

Rsubread

Description : Rsubread is an R tool for RNA-/DNA-seq data mapping, read counting, single-nucleotide polymorphism (SNP), structural variant, and gene fusion detection. The tool is also available in C language, see Subread.

Rbowtie

Description : Rbowtie is an R package that wraps the popular Bowtie short read aligner and SpliceMap, a de novo splice junction discovery and alignment tool. The package is used by the QuasR Bioconductor package. The authors recommend using QuasR instead of using this package directly.

DeepBound

Description : DeepBound is a tool to identify splicing junctions and boundaries of expressed transcript read alignments in RNA-seq data. The DeepBound algorithm uses deep convolutional neural fields.

Supersplat

Description : Supersplat is a tool to identify splice junctions in RNA-seq data.

Qpalma

Description : Qpalma is a tool to align spliced reads. The Qpalma algorithm uses quality values and information of predicted splice sites for the assessment of alignment accuracy.

tophat-IP

Description : TopHat-IP is the TopHat tool at Galaxy Pasteur. See links for TopHat. TopHat is a tool for splice-aware mapping of RNA-seq reads. TopHat uses the Bowtie short read aligner (a BWT-based algorithm) for the mapping, after which it identifies intron-exon (splice) junctions. TopHat can use paired-end sequencing reads and parallel computation. (*BWT=Burrows–Wheeler transform)

SpliceJumper

Description : SpliceJumper is a tool to identify splice junctions in RNA-seq data. The SpliceJumper algorithm uses a classification-based approach.

PALMapper

Description : PALMapper is a tool to align reads from RNA-seq data. PALMapper can compute spliced and unspliced alignments. The package combines GenomeMapper with the spliced aligner QPALMA (see links). The PALMapper tool is available as a command-line tool or via a web service (https://galaxy.inf.ethz.ch/).

SGSeq

Description : SGSeq is a tool to predict and quantify splice events in RNA-seq datasets. The SGSeq algorithm predicts splice junctions and exons by mapping reads to a reference genome.

MapPER

Description : MapPER is a tool to align paired-end reads in RNA-seq data sets. The MapPER algorithm uses an expectation-maximization method to assign likelihood values.

FusionSeq

Description : FusionSeq is a tool for the identification of fusion transcripts in RNA-seq data sets using paired-end reads. FusionSeq includes functions to filter out spurious fusions caused by misalignment artifacts or random pairing. It ranks the candidate fusions using several statistical methods.

1.1.2.2 Splice unaware

mmquant

Description : A tool to quantify gene expression. The mmquant algorithm handles multi-mapping reads (for example, reads from duplicated genes) by constructing merged genes.
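
The merged-gene idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not mmquant's code: the input format (one list of overlapping gene names per read), the function name, and the "--" separator used to label merged genes are all hypothetical.

```python
from collections import Counter


def count_with_merged_genes(read_hits):
    """Count reads per gene; reads hitting several genes go to a merged label."""
    counts = Counter()
    for genes in read_hits:
        if not genes:
            continue  # read overlaps no annotated gene
        label = "--".join(sorted(set(genes)))  # separator is an assumption
        counts[label] += 1
    return counts


# Each inner list holds the genes one read maps to (hypothetical data).
read_hits = [["geneA"], ["geneA", "geneB"], ["geneB", "geneA"], ["geneC"], []]
counts = count_with_merged_genes(read_hits)
print(dict(counts))  # {'geneA': 1, 'geneA--geneB': 2, 'geneC': 1}
```

Ambiguous reads are kept as evidence for the gene group instead of being discarded or split arbitrarily between genes.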

RNA-MATE

Description : A recursive mapping strategy for high-throughput RNA-sequencing data. The pipeline is written in Perl and uses a PBS queue manager, but it can be configured to use LSF or SGE.

NanoPARE

Description : NanoPARE is a set of tools for the analysis of 5' RNA data from nanoPARE sequencing libraries. The NanoPARE package contains (1) EndMap for aligning 5P and BODY FASTQ files to a reference genome, (2) EndGraph for identifying 5P features, (3) EndClass for classifying 5P features as capped or noncapped and labeling features according to a reference transcriptome, (4) EndMask for masking genomic regions with capped features and converting genome coordinates to transcriptome coordinates, and (5) EndCut for searching for evidence of small RNA mediated cleavage in transcript-mapping noncapped 5P reads. Requirements: STAR aligner 2.5+, Python 3.6+, Samtools 1.3+, Bedtools 2.26, and Cutadapt 1.9.

GEM Mapper

Description : GEM Mapper is a tool for aligning paired-end reads to a reference genome. The GEM algorithm uses a seed-and-extend technique and enables exhaustive searches given specific criteria.

GEM-Tools

Description : GEM-Tools is a C API and a Python module that simplify the usage of the GEM Mapper tool (see links). GEM-Tools also includes a command-line interface, gemtools, for running the RNA-seq pipeline, the indexer module, the statistics module, and various other tools.

Bowtie

Description : Bowtie is a tool for aligning short DNA sequence reads to a reference genome. The Bowtie algorithm uses the Burrows-Wheeler transform (BWT) technique and permits the use of multiple CPUs.
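
To make the BWT technique concrete, here is a minimal Python sketch of the transform and of the backward search that counts exact matches of a pattern. It is a teaching example, not Bowtie's implementation: the function names are assumptions, the transform is built naively by sorting rotations, and occurrence counts are recomputed on the fly where a real FM index stores sampled tables.

```python
def bwt(text):
    """Burrows-Wheeler transform of text, using '$' as the end marker."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_matches(bwt_text, pattern):
    """Count occurrences of pattern using backward search on the BWT."""
    first_column = sorted(bwt_text)
    first_row = {}
    for row, ch in enumerate(first_column):
        first_row.setdefault(ch, row)  # first sorted row that starts with ch

    lo, hi = 0, len(bwt_text)  # half-open range of rows matching the suffix so far
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0
        lo = first_row[ch] + bwt_text[:lo].count(ch)
        hi = first_row[ch] + bwt_text[:hi].count(ch)
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
print(transformed)                        # annb$aa
print(count_matches(transformed, "ana"))  # 2
```

The search consumes the pattern from its last character to its first, narrowing a range of rows at each step, so the work depends on the pattern length and not on the reference length once the index tables exist.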

Bowtie 2

Description : Bowtie 2 is a tool for aligning short DNA sequence reads to a reference genome. The Bowtie 2 algorithm uses a compressed full-text substring index based on the Burrows-Wheeler transform (BWT) technique and permits the use of multiple CPUs. Bowtie 2 can align reads up to thousands of nucleotides in length and supports gapped, local, and paired-end alignment modes.

GRIT

Description : GRIT (Generalized RNA Integration Tool) is a tool to assemble transcripts using RNA-seq data. The GRIT pipeline combines RNA-seq and gene-boundary data, CAGE, RAMPAGE, and poly(A)-seq data.

1.1.2.3 Quality Control

ORMAN

Description : ORMAN (Optimal Resolution of Multimapping Ambiguity of RNA-Seq Reads) is a tool to resolve transcript mappings in RNA-seq data. The ORMAN algorithm uses combinatorial optimization, integer linear programming, heuristics, and well-known approximation methods.

FastQ Screen

Description : FastQ Screen is a tool for quality control of DNA samples that screens reads against a set of reference genomes to validate their origin.

Picard

Description : Picard is a collection of command-line tools for handling high-throughput sequencing data.

RSeQC

Description : RSeQC is a set of tools for quality assessment of RNA-seq data. The RSeQC package includes functions for sequence quality, nucleotide composition, GC and PCR bias, sequencing depth, strand specificity, mapped reads distribution, coverage uniformity, RNA integrity, and genomic read distribution.

AlignerBoost

Description : AlignerBoost is a tool to analyze the mapping of high-throughput sequencing reads and increase the overall mapping precision. It works with all sequence aligners that produce SAM or BAM output and further accepts known SNPs as input to improve the quality of alignments. The authors have specifically optimized AlignerBoost for Bowtie, Bowtie2, NovoAlign, BWA-ALN/BWA-SW/BWA-MEM, SeqAlTo (DNA aligners), and Tophat, Tophat2, STAR (RNA aligners).

QoRTs

Description : QoRTs is a tool for quality control of RNA-seq data. The QoRTs package has functions for analysis, quality control, and data management, primarily for the detection and identification of errors, biases, and artifacts in paired-end sequencing data. It can also compute count data and group-summary genome track files for visualization in the UCSC genome browser.

RNA-SeQC

Description : A tool for quality control of RNA-seq data. The RNA-SeQC package has functions for computing various quality metrics, such as alignment quality, duplication rates, GC bias, rRNA content, coverage continuity, covered alignment regions, transcript count, and 3'/5' bias. It produces read counts, coverage, and correlation quality control metrics, and is also suitable for use with scRNA-seq data sets.

QuaCRS

Description : QuaCRS is a tool for integrated quality control of RNA-seq data. The QuaCRS package consolidates the FastQC and RNA-SeQC tools and a collection of functions from RSeQC.

MultiQC

Description : MultiQC is a tool that aggregates results from multiple sequence aligners, pre- and post-processing tools, and quality control tools. Version 1.8 supports 78 separate tool packages. MultiQC obtains the information by scanning the log files and produces an HTML report. This tool is also useful for single-cell sequencing data and population studies.

QualiMap

Description : QualiMap, with its later version Qualimap 2, is a tool for quality control of sequence alignments and genomic features. QualiMap can use whole-genome and exome sequencing, RNA-seq, and ChIP-seq data. It also has functions for comparison of multiple samples and clustering of epigenomic profiles.

NOISeq

Description : NOISeq is a tool for quality control of RNA-seq count data. NOISeq can evaluate, among others, count distribution, per chromosome expression, and detected features. The NOISeqBIO module in the NOISeq package assesses false positives non-parametrically.

EDASeq

Description : EDASeq is an R tool for visualization of RNA-seq data. EDASeq includes functions for within-lane and between-lane normalization.

rnaQUAST

Description : rnaQUAST is a tool to assess the quality of RNA-seq assemblies. rnaQUAST uses a reference and a gene database to compute several quality metrics for assembly correctness and completeness.

GeneScissors

Description : GeneScissors is a tool for quality control of mapped RNA-seq data. The GeneScissors algorithm combines machine learning (ML) with biological knowledge for the detection and adjustment of spurious inferences.

CADBURE

Description : CADBURE is a tool

1.2 Expression Quantification

1.2.1 Union-exon Based

Subread

Description : Subread is a software package for the alignment of both DNA-seq and RNA-seq read data, quantification, and mutation detection. The Subread package consists of five separate tools: 1. Subread, a read aligner for both RNA-seq and DNA-seq data, 2. Subjunc, a read aligner for RNA-seq data that detects exon-exon junctions and gene fusion events, 3. featureCounts, a read counting tool, 4. Sublong, a long-read aligner that uses the seed-and-vote technique, and 5. exactSNP, a single-nucleotide polymorphism discovery tool. An R version of the Subread package is also available, Rsubread.

IsoEM2

Description : A tool to estimate differential expression and confidence intervals in RNA-seq data. The IsoEM2 package integrates both IsoEM2 and IsoDE2. The IsoEM2 algorithm uses bootstrapping to evaluate expression levels and confidence intervals. It reports fragments per kilobase of transcript per million mapped fragments (FPKM) and transcripts per million (TPM) for genes and isoforms. IsoDE2 uses the data generated by IsoEM2 to analyze differential expression.
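
Several tools in this list report FPKM (or RPKM for single-end reads) and TPM. The Python sketch below shows the standard definitions of both units computed from raw counts and feature lengths. It is not the code of IsoEM2 or any other listed tool; the function names and the toy counts are assumptions, and real tools typically use effective lengths and estimated (not raw) counts.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of feature per million mapped fragments."""
    total = sum(counts.values())
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total) for g in counts}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = {g: counts[g] / lengths_bp[g] for g in counts}
    rate_sum = sum(rates.values())
    return {g: rate * 1e6 / rate_sum for g, rate in rates.items()}


# Toy data (hypothetical): fragment counts and feature lengths in base pairs.
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

fpkm_values = fpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
print({g: round(v) for g, v in fpkm_values.items()})  # {'geneA': 100000, 'geneB': 100000, 'geneC': 300000}
print({g: round(v) for g, v in tpm_values.items()})   # {'geneA': 200000, 'geneB': 200000, 'geneC': 600000}
```

TPM values always sum to one million within a sample, which makes proportions comparable across samples; FPKM values do not share a fixed sum.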

FeatureCounts

Description : featureCounts is a tool to quantify RNA-seq and gDNA-seq data as counts. It is also suitable for single-cell RNA-seq (scRNA-seq) data. It supports multi-threading. featureCounts is part of the Subread package (see links). An R version is also available as Rsubread.
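
The union-exon counting approach used in this category can be sketched as follows in Python. This is an illustration only, not featureCounts itself: the function names, the half-open coordinate convention, the toy annotation, and the rule that reads overlapping more than one gene are left unassigned are all assumptions of this sketch.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals into a union."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def count_reads(reads, gene_exons):
    """Count reads per gene against the union of that gene's exons."""
    union = {gene: merge_intervals(exons) for gene, exons in gene_exons.items()}
    counts = {gene: 0 for gene in gene_exons}
    for read_start, read_end in reads:
        hits = [
            gene for gene, intervals in union.items()
            if any(read_start < end and start < read_end for start, end in intervals)
        ]
        if len(hits) == 1:  # ambiguous or unassigned reads are skipped
            counts[hits[0]] += 1
    return counts


# Toy annotation and alignments on one chromosome (hypothetical coordinates).
gene_exons = {
    "geneA": [(100, 200), (150, 300), (500, 600)],
    "geneB": [(580, 700)],
}
reads = [(120, 170), (250, 320), (590, 640), (650, 700), (350, 400)]
read_counts = count_reads(reads, gene_exons)
print(read_counts)  # {'geneA': 2, 'geneB': 1}
```

The read at (590, 640) overlaps both genes and is left out, and the read at (350, 400) falls in an intron, so only three of the five reads are counted.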

easyRNASeq

Description : easyRNASeq is a tool to quantify RNA-seq expression data. The package also has functions for retrieving annotations and summarizing read counts by feature. It reports reads per kilobase per million mapped reads (RPKM).

HTSeq

Description : HTSeq is a tool for the analysis of high-throughput sequencing data. It processes reads aligned with HISAT or STAR and assigns expression value counts. HTSeq is also suitable for the quantification of single-cell RNA-seq data (scRNA-seq). The package also includes the htseq-count tool for pre-processing RNA-seq reads before differential expression analysis and the htseq-qa tool, which assesses read quality.

Rsubread

Description : Rsubread is an R tool for RNA-/DNA-seq data mapping, read counting, single-nucleotide polymorphism (SNP), structural variant, and gene fusion detection. The tool is also available in C language, see Subread.

PennDiff

Description : PennDiff is a tool to quantify RNA-seq data. The PennDiff algorithm uses both transcript-based and union-exon methods.

1.2.2 Transcript Based

StringTie

Description : A tool to assemble RNA-seq sequence alignments into transcripts. The StringTie algorithm can optionally make de novo assemblies and uses a novel network flow algorithm.

ISVASE

Description : A tool for the identification of splice variants in RNA-seq data. The ISVASE algorithm uses rule-based filters, identifies splicing junctions, sequence variants, and exon-exon junction shifts.

Matataki

Description : A tool to estimate gene expression levels in RNA-seq data sets. The Matataki algorithm uses unique k-mers for each gene to quickly map all the fragments to genes. According to the authors, Matataki is faster than conventional methods.

EMASE

Description : EMASE (Expectation-Maximization for Allele-Specific Expression) is a tool for estimating total gene expression, isoform usage, and allele-specific expression in RNA-seq data. The EMASE algorithm approaches the problem hierarchically by first resolving uncertainties between genes, then between isoforms, and finally between alleles. EMASE is a prototype implementation in Python, and EMASE-Zero is a C++ version.

SparseIso

Description : A tool for identification of alternatively spliced transcripts in RNA-seq data. The SparseIso algorithm uses a Gibbs sampling method for simultaneous identification and quantification of transcripts. It also estimates the joint distribution of all transcript candidates to improve the detection of transcripts and isoforms that are expressed in small quantities.

AltHapAlignR

Description : AltHapAlignR is a tool to estimate transcript abundance at the gene and haplotype levels in genomic regions.

RefBool

Description : RefBool is a tool to classify RNA-seq and microarray gene expression data into three categories: active, intermediate, and inactive. The RefBool algorithm is reference-based and provides p- and q-values for each classification.

Salmon

Description : A tool to quantify transcript expression in RNA-seq data. The Salmon algorithm can correct for GC bias, and it uses 'selective-alignment' and massively-parallel stochastic collapsed variational inference to achieve high accuracy and speed. It reports transcripts per million (TPM).

NURD

Description : NURD is a tool to estimate expression levels of isoforms in RNA-seq data. The NURD algorithm uses a binary interval search method and can correct for experimental sequencing biases both globally and locally. The home page is currently not available. You may email the authors and request the source code.

kallisto

Description : A tool to quantify RNA-seq data. The kallisto algorithm uses a pseudoalignment approach to speed up quantification. Pseudoalignment determines which transcripts a read is compatible with, so reads can be quantified without making actual alignments. kallisto can handle paired-end and single-end reads. It reports transcripts per million (TPM).
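
The core of pseudoalignment, finding the set of transcripts a read is compatible with without computing a base-level alignment, can be sketched in Python as below. This is a simplified illustration, not kallisto's implementation: the function names, the toy transcripts, and k=5 are assumptions, and the sketch uses a plain k-mer dictionary where kallisto uses a transcriptome de Bruijn graph.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map every k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, sequence in transcripts.items():
        for i in range(len(sequence) - k + 1):
            index[sequence[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Intersect the transcript sets of the read's k-mers (its compatibility class)."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if hits is None:
            continue  # k-mer not in the index, for example a sequencing error
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()


# Two toy isoforms sharing their first seven bases (hypothetical sequences).
transcripts = {"t1": "ACGTACGGTCA", "t2": "ACGTACGCTTA"}
index = build_kmer_index(transcripts, k=5)

print(sorted(pseudoalign("ACGTAC", index, k=5)))  # ['t1', 't2']
print(sorted(pseudoalign("GTACGG", index, k=5)))  # ['t1']
```

A read from the shared region stays ambiguous, while a read spanning the point where the isoforms diverge is assigned to a single transcript. Ambiguous reads are then typically resolved statistically in a later step.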

BRIE

Description : BRIE (Bayesian regression for isoform estimation) is a tool to quantify splicing from RNA-seq data. The BRIE algorithm learns a prior distribution of isoform proportions from the sequences in the samples using a Bayesian hierarchical model.

Outrigger

Description : Outrigger is a tool for the creation of de novo alternative splicing annotation for RNA-seq data. Outrigger uses junction reads and a graph database, and it quantifies percent spliced-in (Psi) values of splicing events.
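
For readers unfamiliar with Psi, the Python sketch below computes a percent spliced-in value for a single skipped-exon event from junction read counts, using a commonly used definition. It is an assumption-laden illustration, not Outrigger's code: the function name, the averaging of the two inclusion junctions, and the example counts are all hypothetical choices.

```python
def percent_spliced_in(upstream_inclusion, downstream_inclusion, exclusion):
    """Psi for a skipped exon: inclusion evidence over all junction evidence.

    The two inclusion junctions are averaged so that inclusion is not counted
    twice relative to the single exon-skipping junction (an assumption here).
    """
    inclusion = (upstream_inclusion + downstream_inclusion) / 2
    total = inclusion + exclusion
    if total == 0:
        return None  # no junction reads, Psi is undefined
    return inclusion / total


# Hypothetical junction read counts for one exon-skipping event.
psi = percent_spliced_in(upstream_inclusion=30, downstream_inclusion=50, exclusion=10)
print(psi)  # 0.8
```

A Psi near 1 means the exon is almost always included, and a Psi near 0 means it is almost always skipped. Events with few junction reads give unstable estimates, so tools usually apply a minimum read threshold.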

bonvoyage

Description : bonvoyage is a tool for the detection of alternative splicing in RNA-seq data. The bonvoyage algorithm uses the outrigger de novo splice graph method and a Bayesian approach for modality assignment. It can also show changes in modalities using non-negative matrix factorization.

EMSAR

Description : EMSAR is a tool for transcript quantification of RNA-seq data. The EMSAR algorithm can use both single- and paired-end reads, can operate in multi-thread mode, and reports fragments per kilobase per million mapped reads (FPKM).

RSEM

Description : RSEM (RNA-Seq by Expectation-Maximization) is a tool for the quantification of RNA-seq data. The RSEM algorithm uses the expectation-maximization technique, can operate with and without a reference, and reports transcripts per million (TPM). RSEM's running time scales linearly with the number of alignments, and it uses the Bowtie tool for the read alignments.
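
The expectation-maximization idea used for quantification with ambiguous reads can be shown with a small Python sketch. This is a deliberately reduced model and not RSEM's implementation: the function name and toy data are assumptions, every transcript is treated as having the same length, and RSEM's read-level error, fragment length, and positional models are left out.

```python
def em_abundance(read_compatibility, transcripts, n_iter=100):
    """Estimate relative transcript abundances from reads with ambiguous origins."""
    theta = {t: 1.0 / len(transcripts) for t in transcripts}  # uniform start
    for _ in range(n_iter):
        expected = {t: 0.0 for t in transcripts}
        # E-step: split each read across its compatible transcripts by abundance.
        for compatible in read_compatibility:
            weight_sum = sum(theta[t] for t in compatible)
            for t in compatible:
                expected[t] += theta[t] / weight_sum
        # M-step: new abundances are the normalized expected read counts.
        n_reads = sum(expected.values())
        theta = {t: count / n_reads for t, count in expected.items()}
    return theta


# Toy data (hypothetical): 3 reads unique to A, 1 unique to B, 4 compatible with both.
reads = [{"A"}] * 3 + [{"B"}] + [{"A", "B"}] * 4
theta = em_abundance(reads, ["A", "B"])
print(round(theta["A"], 3), round(theta["B"], 3))  # 0.75 0.25
```

The unique reads suggest a 3:1 ratio, and the iterations distribute the four ambiguous reads in that same proportion, which is the fixed point of the update.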

Cufflinks

Description : Cufflinks consists of a suite of tools for differential gene expression analysis of RNA-seq data. It assembles aligned reads into a set of transcripts and estimates their relative abundances. The Cufflinks suite consists of the following tools: cufflinks, cuffcompare, cuffmerge, cuffquant, cuffdiff, and cuffnorm.

eXpress

Description : eXpress is a tool to quantify RNA-seq data, but it is also applicable to ChIP-seq, metagenomics, and large-scale sequencing data in general. The eXpress streaming algorithm quantifies sequenced DNA or RNA in real time. The authors no longer maintain the software and recommend using the kallisto tool.

Sailfish

Description : Sailfish is a tool to estimate the abundance of gene isoforms using reference sequences and RNA-seq data sets. Sailfish uses an alignment-free algorithm based on k-mers.

RNA-Skim

Description : RNA-Skim is a tool to quantify transcripts in RNA-seq data. The RNA-Skim algorithm uses a concept of sig-mers, a type of k-mers, for quantification based on distinct clusters of transcripts.

SpliceTrap

Description : SpliceTrap is a tool to quantify exon inclusion ratios in paired-end RNA-seq data. The SpliceTrap algorithm quantifies the extent to which each exon is included, skipped, or subject to size variations.

PennDiff

Description : PennDiff is a tool to quantify RNA-seq data. The PennDiff algorithm uses both transcript-based and union-exon methods.

TIGAR2

Description : TIGAR2 is a tool to quantify transcript isoforms in RNA-seq data sets. The TIGAR2 algorithm applies a variational Bayesian inference, and it can also model sequencing errors. It can use Bowtie2 and BWA-MEM tools in the computational pipeline.

LocExpress

Description : LocExpress is a web-based tool to quantify the expression of novel gene transcripts in RNA-seq data. The LocExpress algorithm works with human and mouse data. For the abundance estimation, LocExpress uses a minimum spanning bundle (MSB) region to allow quantification without the need to analyze a whole genome.

EPIG-Seq

Description : EPIG-Seq is a tool to cluster co-expressed genes in RNA-seq data sets. The EPIG-Seq algorithm uses count correlation to estimate gene similarity. To estimate the differential expression level, it uses quasi-Poisson modeling and a location parameter.

PDEGEM

Description : PDEGEM (Positional Dependent Energy Guided Expression Model) is a tool to estimate transcript abundance and isoform expression in RNA-seq data. The PDEGEM algorithm uses the Positional Dependent Nearest Neighborhood (PDNN) based technique to model the distribution of reads.

SeqSaw

Description : SeqSaw is a tool for the de novo identification of splice junctions in RNA-seq data. The SeqSaw algorithm also detects splice junctions that lack GT-AG splicing signals.

IsoLasso

Description : IsoLasso is a tool

R-SAP

Description : R-SAP is an RNA-seq analysis pipeline tool. R-SAP quantifies expression and uses a hierarchical decision-making scheme to characterize various classes of transcripts. R-SAP reports expression levels as RPKM (reads per kilobase of exon model per million mapped reads).

Solas

Description : Solas is a tool to predict and quantify expressed isoforms within observed coding regions in RNA-seq data. The Solas algorithm has three separate functions: 1. detection of alternative splicing events differentiating two conditions, 2. detection of genes and exons being part of an alternative splicing event, 3. quantification of the relative proportion of isoforms.

SplitSeek

Description : SplitSeek is a tool to detect splice junctions and chimeric reads in RNA-seq data.

cufflinks-IP

Description : cufflinks-IP is the Cufflinks suite at Institut Pasteur. Cufflinks consists of a suite of tools for differential gene expression analysis of RNA-seq data. It assembles aligned reads into a set of transcripts and estimates their relative abundances. The Cufflinks suite consists of the following tools: cufflinks, cuffcompare, cuffmerge, cuffquant, cuffdiff, and cuffnorm.

Rcount

Description : Rcount is a tool to quantify the number of reads mapped to a specific gene (feature counts) in RNA-seq datasets. The Rcount algorithm specifically addresses the issue arising from reads mapping to multiple locations.

SEQ-EM

Description : SEQ-EM is a tool to estimate the expression levels of homologous genes in RNA-seq datasets. The SEQ-EM algorithm uses a maximum likelihood-based method to estimate the model parameters.

Net-RSTQ

Description : Net-RSTQ is a tool to quantify isoforms in RNA-seq data, aimed at cancer transcriptomes. The Net-RSTQ algorithm uses protein domain-domain interaction network information as prior knowledge in the abundance estimation.

SplicingTypesAnno

Description : SplicingTypesAnno is a tool for the annotation and quantification of alternative splicing in RNA-seq datasets. SplicingTypesAnno annotates major alternative splicing types at the exon/intron level, at genome scale or gene scale, and outputs a report in HTML plus additional BED files for IGV visualization.

SplicingCompass

Description : SplicingCompass is a tool to predict differentially spliced genes between two separate conditions in RNA-seq datasets. SplicingCompass computes geometric angles between the high-dimensional vectors of exon read counts.
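
The geometric intuition can be sketched in Python: treat the exon read counts of one gene in each condition as a vector and measure the angle between the two vectors. This is only an illustration of the angle computation, not SplicingCompass's statistical test; the function name and the example counts are assumptions.

```python
import math


def exon_count_angle(counts_a, counts_b):
    """Angle in degrees between two exon read-count vectors of the same gene."""
    dot = sum(a * b for a, b in zip(counts_a, counts_b))
    norm_a = math.sqrt(sum(a * a for a in counts_a))
    norm_b = math.sqrt(sum(b * b for b in counts_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a count vector with no reads has no direction")
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard against rounding
    return math.degrees(math.acos(cosine))


# Hypothetical exon counts for one gene with three exons.
same_usage = exon_count_angle([10, 20, 30], [20, 40, 60])   # expression doubled
changed_usage = exon_count_angle([10, 20, 30], [20, 2, 60])  # middle exon mostly skipped
print(same_usage < 0.001, changed_usage > 10)  # True True
```

Because the angle ignores vector length, a pure change in overall expression gives an angle near zero, while a change in relative exon usage opens the angle.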

flipflop

Description : flipflop is a tool to identify and quantify isoforms in RNA-seq data. The flipflop algorithm uses a network flow optimization technique to solve the sparse estimation problem.

QuasR

Description : QuasR is a tool to quantify and annotate reads from RNA-seq, ChIP-seq, and Bis-seq. The QuasR package has tools for all analysis steps from sequence read preprocessing, alignment, and quality control to quantification.

GPSeq

Description : GPSeq is a tool to quantify transcriptomes using RNA-seq data. The GPSeq algorithm uses a two-parameter generalized Poisson model to estimate the position-specific read counts.

MMSEQ

Description : MMSEQ is a tool to estimate isoform expression in RNA-seq data. The MMSEQ algorithm uses a new statistical method that deconvolves the mapping of reads to haplotype-specific isoforms and works with paired-end reads.

1.2.3 Bacterial genome

EDGE-pro

Description : EDGE-pro (Estimated Degree of Gene Expression in PROkaryotes) is a tool to quantify gene expression in prokaryotes and archaea. The EDGE-pro algorithm can align overlapping gene regions.

SeqTU

Description : SeqTU is a tool for the analysis of strand-specific RNA-seq data. The SeqTU algorithm uses a machine learning approach.

Parseq

Description : Parseq is a tool to estimate transcription levels of microbial genomes in RNA-seq data. The Parseq algorithm uses a particle Gibbs algorithm.

TSSer

Description : TSSer is a tool to detect transcription start sites in bacterial RNA-seq data.

2. Differential Expression Analysis

2.1 Pre-processing DEA

PoissonSeq

Description : PoissonSeq is an R library for normalization, testing, and false discovery rate (FDR) estimation of RNA-seq data sets. The PoissonSeq algorithm uses a Poisson log-linear model.

GENAVi

Description : GENAVi (Gene Expression Normalization Analysis and Visualization) is a tool to normalize, analyze, and visualize gene expression in human or mouse RNA-seq data. GENAVi provides a user-friendly GUI and does not require bioinformatics expertise to operate. GENAVi is available as a web-based tool and also installable on a local computer using docker.

TCC

Description : TCC (Tag Count Comparison) is a tool for the differential analysis of tag counts in RNA-seq datasets. The TCC algorithm uses a multi-step normalization method based on a differentially expressed gene (DEG) elimination strategy (DEGES).

alpine

Description : alpine is a tool to reduce systematic biases in the estimation of transcript abundances in RNA-seq datasets. The alpine algorithm uses sequence features in the analysis of the abundance.

2.2 Parametric

ideal

Description : ideal is a tool for differential expression analysis of RNA-seq data. ideal is a Shiny app.

CORNAS

Description : CORNAS is a tool for differential expression analysis of RNA-seq data. The CORNAS algorithm uses a Bayesian approach to compute the sequence coverage from concentrations of RNA-seq samples. It uses a posterior distribution to estimate the true gene count.

TCseq

Description : TCseq is an R tool for analysis of quantitative and differential expression of RNA-seq data. The TCseq algorithm can also do cluster analysis and has functions for visualization of time-course data. It uses the generalized linear model (GLM).

XBSeq

Description : XBSeq is an R tool for genome-wide expression analysis of RNA-seq data. The XBSeq algorithm uses a statistical approach in which observed signals are a convolution of real expression signals and sequencing noises. It assumes the reads that map on intergenic regions are distributed according to Poisson and distinguishes signals using the negative binomial distribution.

BNBR

Description : BNBR is an R tool for the analysis of differential expression in RNA-seq data. The BNBR algorithm uses a new Bayesian negative binomial regression technique (BNB-R).

IsoEM2

Description : A tool to estimate differential expression and confidence intervals in RNA-seq data. The IsoEM2 package integrates both IsoEM2 and IsoDE2. The IsoEM2 algorithm uses bootstrapping to evaluate expression levels and confidence intervals. It reports fragments per kilobase of transcript per million mapped reads (FPKM) and transcripts per million (TPM) for genes and isoforms. IsoDE2 uses the data generated by IsoEM2 to analyze differential expression.
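
For orientation, the two reporting units, FPKM and TPM, can be computed from fragment counts and transcript lengths as in the minimal sketch below. It uses the textbook definitions with hypothetical numbers and plain transcript lengths, and is not IsoEM2's implementation; tools commonly substitute effective lengths and estimated rather than raw counts.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    rate_sum = sum(rates)
    return [r * 1e6 / rate_sum for r in rates]

# Hypothetical fragment counts and transcript lengths (bp).
counts = [100, 200, 700]
lengths = [1000, 2000, 1000]

print(fpkm(counts, lengths))
print(tpm(counts, lengths))
```

TPM values always sum to one million within a sample, which makes proportions comparable across samples; FPKM values do not share that property.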

ABSSeq

Description : ABSSeq is a tool for differential gene expression analysis in RNA-seq data. The ABSSeq algorithm uses a negative binomial distribution approach to infer expression differences.

NSMAP

Description : NSMAP (Nonnegativity and Sparsity constrained Maximum A Posteriori) is a tool for quantification of expression levels and identification of isoforms in RNA-seq data. The NSMAP algorithm uses a nonnegativity- and sparsity-constrained maximum a posteriori model to identify isoform structures and estimate expression levels simultaneously.

WemIQ

Description : WemIQ is a tool for quantification of isoform expression and exon splicing ratios in RNA-seq data. The WemIQ algorithm uses the expectation-maximization (EM) approach and a Poisson model.

DSGseq

Description : This program aims to identify differentially spliced genes from two groups of RNA-seq samples.

DREAMSeq

Description : DREAMSeq is an R tool for the detection of differentially expressed genes in RNA-seq data. The DREAMSeq algorithm uses a double Poisson model to capture all data properties, such as underdispersion, overdispersion, and equidispersion.

VCNet

Description : A tool to construct co-expressed gene networks from RNA-seq data. The VCNet algorithm uses a new statistical test on the correlation of a gene pair using the Frobenius norm (Euclidean norm).

DESeq

Description : DESeq is a tool for hypothesis testing and differential gene expression analysis of RNA-seq data. The DESeq algorithm applies the negative binomial distribution and a likelihood ratio test (LRT). It normalizes data with the relative log expression (median-of-ratios) method and compensates for small sample sizes by incorporating information from all genes in a set of samples.

DESeq2

Description : DESeq2 is a tool for differential gene expression analysis of RNA-seq data. DESeq2 is a new version of DESeq and can detect more differentially expressed genes (DEGs) than DESeq. However, it also seems to allow more false positives. The DESeq2 algorithm uses the negative binomial distribution, the Wald test, and the likelihood ratio test.

edgeR

Description : edgeR is a tool for differential expression (DE) analysis of RNA-seq, ChIP-seq, CAGE, and SAGE data with biological replicates. The edgeR algorithm uses information from all the genes, computes the dispersion using a weighted likelihood and F-test techniques. For the normalization, it can use the trimmed mean of M-values, upper-quartile (UQ) procedure, Relative Log Expression (RLE), and DESeq. It can compare two groups, paired and unpaired, or use a Generalized Linear Model (GLM). The upper-quartile (UQ) procedure is also applicable to single-cell RNA-seq (scRNA-seq).
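
Of the normalization options listed, Relative Log Expression (RLE) is the simplest to sketch. The Python below follows the median-of-ratios idea as it is commonly described. The function name and the input layout (one row per gene, one column per sample) are assumptions for illustration, and the code is not taken from edgeR or DESeq.

```python
import math
from statistics import median

def rle_size_factors(counts):
    """Per-sample size factors from a genes x samples table of raw counts."""
    n_samples = len(counts[0])
    # Genes with a zero count in any sample have no finite log geometric mean.
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    if not log_rows:
        raise ValueError("no gene has nonzero counts in every sample")
    # Log geometric mean of each remaining gene across samples.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
factors = rle_size_factors(counts)
print(factors)
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
print(normalized[0])
```

Dividing each sample's counts by its size factor puts the samples on a common scale before dispersion estimation and testing.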

ImpulseDE

Description : ImpulseDE is an R tool for detecting differentially expressed genes (DEGs) in RNA-seq and scRNA-seq time-course data. ImpulseDE can report DEGs across time points in datasets with single or multiple conditions. It reports quality values for DEGs, impulse model parameters, and fitted values for genes, and can use multi-threading.

ImpulseDE2

Description : ImpulseDE2 is an R tool for detecting differentially expressed genes (DEGs) in time-course RNA-seq, ChIP-seq, ATAC-seq, and DNaseI-seq data sets. The ImpulseDE2 algorithm uses negative binomial noise and impulse models. It can also correct for batch and library construction effects.

SARTools

Description : SARTools is an R package for differential expression analysis of RNA-seq data. SARTools uses DESeq2 and edgeR. The input consists of raw count data and experimental description files. SARTools then normalizes the data, estimates dispersion, and analyzes differential gene expression. The output is a tab-delimited file and optionally a report in HTML format.

DEApp

Description : DEApp is a web-based tool for differential analysis of RNA-seq data. It uses edgeR, limma-voom, and DESeq2 for cross-validation.

Cufflinks

Description : Cufflinks is a suite of tools for differential gene expression analysis of RNA-seq data. It assembles aligned reads into a set of transcripts and estimates their relative abundances. The Cufflinks suite consists of the following tools: cufflinks, cuffcompare, cuffmerge, cuffquant, cuffdiff, and cuffnorm.

Cuffdiff 2

Description : Cuffdiff 2 is a tool to estimate differential expression at gene and transcript levels. It uses a negative binomial model, normalizes using either the relative log expression method implemented in DESeq or the inter-sample normalization method Q, and reports fragments per kilobase of transcript per million mapped reads (FPKM). Cuffdiff 2 is a part of the Cufflinks suite of tools.

MISO

Description : MISO (Mixture-of-Isoforms) is a tool for estimating expression levels of alternatively spliced genes and isoforms. The authors implemented MISO as an alternative to the Cufflinks tool. MISO is no longer maintained, but it is available for download.

rMATS

Description : rMATS is a tool to detect major differential alternative splicing types in RNA-seq data with replicates. The rMATS algorithm can use both paired and unpaired reads and computes p-values and false discovery rates based on a user-defined threshold.
Alternative name: MATS.

tweeDEseq

Description : tweeDEseq is a tool to analyze differential gene expression in RNA-seq data sets. The tweeDEseq algorithm uses the Poisson-Tweedie family of distributions.

deGPS

Description : deGPS is a tool to analyze differential gene expression in RNA-seq data sets. The deGPS algorithm uses a normalization technique based on generalized Poisson distribution and tests using permutations.

SpatialDE

Description : SpatialDE is a tool to identify spatially variable genes in data from multiplexed imaging or RNA-seq data. The SpatialDE algorithm can cluster genes for expression-based tissue histology.

PLNseq

Description : PLNseq is a tool for the differential gene expression analysis (DGE) in RNA-seq data. The PLNseq algorithm uses a multivariate Poisson lognormal distribution for modeling the read count data.

RDiff

Description : RDiff is a tool for the detection of differential RNA processing in RNA-seq data. The RDiff algorithm can identify and quantify novel and known isoforms. RDiff provides a parametric test for annotated genomes and a non-parametric version for genomes where the annotation is incomplete.

BM-DE

Description : BM-DE (Bayesian method of calling differential expression) is a tool for differential gene expression (DGE) analysis of RNA-seq data. The BM-DE algorithm models read counts at each position using Bayesian statistics and can analyze data without biological replicates.

sSeq

Description : sSeq is a tool for differential gene expression (DGE) analysis using RNA-seq data. The sSeq algorithm uses the Negative Binomial (NB) distribution, and a shrinkage technique, and outputs expression as counts.

EBSeq

Description : EBSeq is a tool to identify differentially expressed isoforms in RNA-seq data. EBSeq is based on empirical Bayesian methods.

JunctionSeq

Description : JunctionSeq is a tool for the detection of differential splice junction usage. The JunctionSeq algorithm does not need an extra isoform assembly step. The JunctionSeq tool includes visualization functions.

maSigPro

Description : maSigPro is an R tool to discover genes with sufficient differences in gene expression among experimental groups in time-course microarray and RNA-Seq experiments.

FunSys

Description : FunSys is a tool for the analysis of differential gene expression (DGE) in RNA-seq data. FunSys can associate RNA-seq data with proteomics data.

DEXUS

Description : DEXUS is a tool for the analysis of differential gene expression (DGE) in RNA-seq data where the conditions are unknown. The DEXUS algorithm uses a finite mixture of the negative binomial distribution to model read counts.

diffcoexp

Description : diffcoexp is a tool to detect differentially co-expressed genes and gene pairs (links) in microarray data.

svapls

Description : svapls is a tool for the identification of various sample-specific sources of heterogeneity of gene expression in RNA-seq datasets, producing an increasingly accurate expression pattern. The svapls algorithm uses Partial Least Squares regression statistics to obtain the hidden signals of sample-specific heterogeneity to identify phenotypes.

variancePartition

Description : variancePartition is a tool to partition and visualize gene variation diverging from a general trend in RNA-seq datasets. The variancePartition algorithm uses a linear mixed model and partitions traits with differences in, for example, disease status, sex, cell or tissue type, genetic background, experimental conditions, and technical variation.

BitSeq

Description : BitSeq is an R tool for differential gene expression (DGE) analysis in RNA-seq datasets. The BitSeq algorithm uses Bayesian inference and Markov chain Monte Carlo sampling to model the data.

gCMAP

Description : gCMAP is a tool for differential gene expression (DGE) analysis of RNA-seq data sets.

anota2seq

Description : anota2seq is a tool for the analysis of translational efficiency and differential expression analysis for polysome-profiling and ribosome-profiling studies that are quantified by RNA-seq or DNA-microarray.

ECFS-DEA

Description : ECFS-DEA (feature selection tool for differential expression analysis) is a tool to select features for differential expression analysis in RNA-seq data.

2.3 Non-parametric

PennSeq

Description : PennSeq is a tool for isoform-specific gene expression quantification in RNA-seq data. The PennSeq algorithm uses a statistical method permitting each isoform to have a separate specific distribution and uses a non-parametric procedure.

DSS

Description : DSS is an R library for differential expression analysis in RNA-seq count-based data and differential methylation (DML/DMRs) in bisulfite sequencing (BS-seq) data. For the estimation of the dispersion of the count data, the DSS algorithm uses an empirical Bayes shrinkage estimate (Gamma-Poisson or Beta-Binomial distributions). To handle varied sequencing coverages, it uses comparisons between two groups to obtain dispersion shrinkage for multiple factors. For differential expression, it uses the Wald test. DSS fits the Generalized Linear Model (GLM) using edgeR (see links). The DSS algorithm tends to under-estimate the false discovery rate (FDR).

limma

Description : The limma (limma-voom) tool is for the analysis of gene expression of microarray and RNA-seq data. The limma algorithm uses a generalized linear model (GLM), log-normal distribution, trimmed mean of M-values, t- and F-tests.

DEApp

Description : DEApp is a web-based tool for differential analysis of RNA-seq data. It uses edgeR, limma-voom, and DESeq2 for cross-validation.

NPEBseq

Description : NPEBseq is an R tool for differential gene expression analysis of RNA-seq data. The NPEBseq algorithm uses an empirical Bayesian model with the prior distribution estimation from the data itself. NPEBseq can estimate differential expression on both gene and exon levels.

FDM

Description : FDM is a tool for the analysis of differential gene expression in RNA-seq data sets.

DegPack

Description : DegPack is a web-based tool for the identification of differentially expressed genes in RNA-seq data. The DegPack algorithm uses PoissonSeq and SAMseq methods.

DiffSplice

Description : DiffSplice is a tool for analysis of differential splicing in RNA-seq data. DiffSplice uses a non-parametric permutation test to determine differences in transcription levels. The DiffSplice algorithm doesn't use annotations, and the results are viewable in the UCSC genome browser.

Guide

Description : Guide (Genome Informatics Data Explorer) is a tool for the differential expression analysis of RNA-seq and microarray data. The Guide package is aimed at wet-lab biologists. It uses the limma tool for the analyses but does not require specific programming knowledge.

RDiff

Description : RDiff is a tool for the detection of differential RNA processing in RNA-seq data. The RDiff algorithm can identify and quantify novel and known isoforms. RDiff provides a parametric test for annotated genomes and a non-parametric version for genomes where the annotation is incomplete.

GSAR

Description : GSAR (Gene Set Analysis in R) is a tool for gene set analysis (GSA). The GSAR algorithm uses multivariate non-parametric tests of a null hypothesis against specific alternative hypotheses, for example, differences in mean, variance, or net correlation structure. GSAR also includes visualization functions.

rSeqNP

Description : rSeqNP is a tool for the non-parametric detection of differential gene expression in RNA-seq datasets. The rSeqNP algorithm uses permutation tests to assess statistical significance.
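
A permutation test of this kind can be sketched in a few lines of Python. The test statistic used here (absolute difference in group means) and the example expression values are assumptions chosen for illustration; rSeqNP's actual statistics are not reproduced.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)

    def mean_difference(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    pooled = list(group_a) + list(group_b)
    observed = mean_difference(pooled)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign sample labels at random
        if mean_difference(pooled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical normalized expression values for one gene in two conditions.
treated = [101, 102, 103, 104, 105]
control = [1, 2, 3, 4, 5]
print(permutation_p_value(treated, control))
print(permutation_p_value([5, 5, 5], [5, 5, 5]))
```

Because the null distribution comes from relabeling the samples themselves, no distributional form is assumed for the counts, which is what makes the approach non-parametric.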

fishpond

Description : fishpond is a tool for the expression analysis of RNA-seq data. The fishpond algorithm uses a nonparametric method.

2.4 Power analysis

powsimR

Description : powsimR is a tool to simulate differential expression RNA-seq data. The powsimR package can simulate read counts, model the mean, dispersion, dropout distributions, and compute the power of sample sizes.

RNASeqPower

Description : RNASeqPower is an R tool for estimating power and computing RNA-seq sample sizes.

RNASeqPowerCalculator

Description : RNASeqPowerCalculator is a tool for computing the sample size and estimating the power of RNA-seq data sets.

Scotty

Description : Scotty is a web-based tool for estimating the statistical power in RNA-seq data samples. It helps to optimize the sequencing depth and the sample sizes.

RnaSeqSampleSize

Description : RnaSeqSampleSize is a tool for estimating the power and sample sizes in RNA-seq data sets.

PowerExplorer

Description : PowerExplorer is a tool for the power estimation of multiple sizes of samples using simulated data. The PowerExplorer algorithm estimates the distribution parameters from the input data.

PROPER

Description : PROPER is a tool for the assessment of power in RNA-seq data. The PROPER algorithm uses a semi-parametric simulation by a computation based on the experimental data and provides stratified power and false discovery cost.

SSPA

Description : SSPA is a tool to calculate sample size and power for RNA-seq and microarray data. The SSPA algorithm uses pilot-data for the evaluation.

samExploreR

Description : samExploreR is an R package for the analysis and exploration of sequencing experiments by simulation using subsampling. The algorithm works for data produced by Illumina GA, HiSeq, MiSeq, ABI SOLiD, Roche GS-FLX, and LifeTech Ion PGM Proton sequencing machines.

subSeq

Description : subSeq is a tool to determine a suitable coverage for RNA-seq experiments. The subSeq algorithm uses subsampling to determine a point where both the accuracy and power coincide to help in the design of the experiments.
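
The subsampling step can be illustrated with a short Python sketch in which every read is kept independently with a fixed probability, which amounts to binomial thinning of each gene's count. The function name and counts are hypothetical, and this is one plausible reading of read subsampling rather than subSeq's own code.

```python
import random

def subsample_counts(counts, proportion, seed=0):
    """Keep each read with probability `proportion`, gene by gene."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    rng = random.Random(seed)
    return [sum(1 for _ in range(c) if rng.random() < proportion)
            for c in counts]

# Hypothetical per-gene read counts from a full-depth experiment.
full_depth = [1500, 40, 0, 320]
for proportion in (1.0, 0.5, 0.1):
    print(proportion, subsample_counts(full_depth, proportion))
```

Rerunning a differential expression analysis at each proportion then shows how the number of detected genes changes with depth.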

erccdashboard

Description : erccdashboard is a tool for the assessment of differential gene expression (DGE) in RNA-seq data. The erccdashboard algorithm uses external spike-in RNA control ratio mixtures.

3. Functional Profiling

3.1 Enrichment Analysis (GSEA), annotation, other

ACTION

Description : A tool to detect a functional identity of cells from RNA-seq expression data profiles. The ACTION algorithm uses cells' expression profile to classify cells by dominant functions and reconstructs gene regulatory networks involved in mediating identities. NOTE! The source may be available from the Authors. The given GitLab repository does not exist.

ccfindR

Description : ccfindR (Cancer Clone findeR) is a tool for the analysis of cancer cells from single-cell RNA-seq data. ccfindR contains functions for quality control, unsupervised clustering, and visualization. The ccfindR algorithm uses Bayesian non-negative matrix factorization for feature selection and clustering.

RNAscClust

Description : RNAscClust is a pipeline tool for clustering RNA sequences. The RNAscClust algorithm uses minimum free energy and a graph kernel-based strategy.

CEM

Description : A tool for assembling transcriptome sequences and estimating expression levels in RNA-seq data. The CEM algorithm uses a quasi-multinomial distribution model to detect RNA-seq biases, such as mappability and positional sequencing biases.

VDJSeq-Solver

Description : VDJSeq-Solver is a tool to identify clonal lymphocyte populations from paired-end RNA-seq data from mRNA neoplastic cells. It identifies the main clone characterizing the tissue based on the most abundant V(D)J rearrangement. Note that the source (.tar.gz) is 3.9 GB.

GENIE3

Description : An R tool for prediction of gene regulatory networks from RNA-seq data. The GENIE3 algorithm uses the random forest or Extra-Trees approach.

RSVP

Description : RSVP is a tool to predict protein-coding gene isoforms in RNA-seq data. The RSVP algorithm uses ORF graphs, genomic DNA evidence, and aligned RNA-seq reads for the predictions.

Arboreto

Description : Arboreto is a tool to infer gene regulatory networks in RNA-seq data. The Arboreto framework uses GRNBoost2 and an improved GENIE3 version. GRNBoost2 uses a gradient boosting approach for gene network inference.

AUCell

Description : An R tool for identification of gene signatures and modules in RNA-seq data. The AUCell algorithm detects and ranks enriched gene sets by computing the receiver operator characteristics, i.e., the area under the curve (AUC).

ACTINN

Description : ACTINN (Automated Cell Type Identification using Neural Networks) is a tool to identify cell types in single-cell RNA-seq data. The ACTINN algorithm uses a neural network with three hidden layers. The publication describes training on a mouse cell type atlas (Tabula Muris Atlas) and a human immune cell dataset, and the results of predicting cell types for mouse leukocytes, human PBMCs, and human T cell subtypes.

LINCS_RNAseq

Description : LINCS_RNAseq is a tool for the analysis of RNA-seq data. The LINCS_RNAseq algorithm uses the unique molecular identifier data from the LINCS Drug Toxicity Signature (DToxS) Generation Center at the Icahn School of Medicine at Mount Sinai in New York. The pipeline has three steps: split, align, and merge.

ScanNeo

Description : A pipeline tool for prediction of neoepitopes derived from small to large-sized indels in RNA-seq data. The ScanNeo pipeline comprises three steps: Indel discovery, annotation and filtering, and neoantigen prediction.

ORE

Description : ORE (Outlier-RV Enrichment) is a tool for the identification of non-coding rare variants in RNA-seq data. The ORE algorithm detects biologically significant outliers having more or fewer rare variants than expected by chance. Requires: Python >= 3.5.0, bedtools >= 2.27.0, samtools >= 1.3, and bcftools >= 1.6.

SCENIC

Description : An R pipeline for inference of gene regulatory networks and identification of cell states in RNA-seq data sets. The SCENIC (Single-Cell rEgulatory Network Inference and Clustering) workflow utilizes three separate packages, GENIE3 or GRNBoost2, RcisTarget, and AUCell. A Python implementation of this workflow is faster than this initial R version. See links for pySCENIC. The current version supports human, mouse, and Drosophila melanogaster.

PRAPI

Description : PRAPI is a tool for the analysis of post-transcriptional regulation in Iso-Seq data. PRAPI can analyze alternative transcription initiation, alternative splicing, alternative cleavage and polyadenylation, natural antisense transcripts, and circular RNAs. The PRAPI algorithm can combine Iso-Seq with RNA-seq or PAS-seq.

PathwaySplice

Description : An R tool for analysis of splicing pathways. The PathwaySplice algorithm adjusts the number of exon junctions, visualizes a selection bias, supports Gene Ontology terms, user-defined gene sets, distinguishes primary genes in a pathway, and arranges pathways into an enrichment map.

Snaptron

Description : Snaptron is a tool to search compiled RNA-seq data. The Snaptron algorithm uses R-tree, B-tree, and inverted indexing. It can also score splice junctions by tissue specificity, the relative frequency of splicing patterns, and various other criteria.

BackSPIN

Description : BackSPIN is a tool for clustering RNA-seq data. The BackSPIN algorithm, based on SPIN, is a divisive biclustering method.

Wx

Description : Wx is a tool to select the optimal set of genes for gene expression. Wx uses a neural network built with Keras (a Python deep learning library) to learn features and to generate a discriminative index.

MSigDB

Description : MSigDB (The Molecular Signatures Database) is a database of annotated gene sets for use by gene set enrichment analysis (GSEA) tools. The web-based tool allows keyword search; browsing by collection, name, and annotation; computation of overlaps; categorizing; viewing of expression profiles; and downloading. Mouse and human versions are available in R format from the Walter and Eliza Hall Institute of Medical Research (see links).

goseq

Description : goseq is a tool for functional profiling by analysis of gene ontology (GO) of RNA-seq data. The goseq algorithm can also estimate the effect of bias.

SeqGSEA

Description : SeqGSEA is an R tool for gene set enrichment analysis (GSEA) of RNA-seq data. The SeqGSEA algorithm uses a negative binomial distribution model for count data, scores differential splicing and expression, accounts for biological variability, and performs gene set enrichment analysis.

GAGE

Description : GAGE (Generally Applicable Gene-set Enrichment) is a tool for gene set enrichment (GSEA) and pathway analysis of RNA-seq and microarray data. The GAGE package contains functions for GSEA, processing of results, reporting, batch analyses, and comparison between studies.

GSAASeqSP

Description : GSAASeqSP is a tool for gene set association analysis of sequence read count data. The algorithm includes functions for gene-level and gene set-level statistics.

SeqGSA

Description : SeqGSA is an R tool for the analysis of gene sets in RNA-seq data. The SeqGSA algorithm accounts for the effects of variable gene lengths.

Enrichr

Description : Enrichr is a web-based tool for gene over-representation analysis using functional annotations. Enrichr has more than 30 geneset libraries and can visualize results with JavaScript library Data-Driven Documents (D3).

clusterProfiler

Description : clusterProfiler is an R tool for the enrichment analysis and visualization of gene clusters. The clusterProfiler algorithm uses Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases.

g:Profiler

Description : g:Profiler is a web-based tool for gene enrichment analysis of gene ontologies, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, and protein-protein interaction analyses. g:Profiler updates Ensembl data quarterly. An R client is available and it includes the following four tools: 1. g:GOSt for functional enrichment analysis and visualization, 2. g:Convert for identifier conversion, 3. g:Orth for orthology searching, and 4. g:SNPense for Single nucleotide polymorphism (SNP) mapping by 'rs' identifiers.

GOEAST

Description : GOEAST (Gene Ontology Enrichment Analysis Software Toolkit) is a Gene Ontology (GO) enrichment analysis tool. It can identify over-represented GO terms and supports several different data sources and species.

GOrilla

Description : GOrilla is a web-based tool to identify and visualize enriched GO terms of a list of genes.

ToppGene Suite

Description : ToppGene Suite consists of 1. ToppFun for functional enrichment analysis based on transcriptome, ontology, proteome, phenotype, and pharmacome, 2. ToppGene for prioritization of candidate genes, 3. ToppNet for ranking genes based on topological features, and 4. ToppGenet for prioritization of neighboring genes based on protein-protein interaction networks.

WebGIVI

Description : WebGIVI is a web-based tool for gene enrichment and visualization to obtain gene symbols and iTerm pairs. It uses Cytoscape or Concept Map.

PANTHER Database and Tools

Description : Protein classification to ease high-throughput analyses. Classification is by family and subfamily, molecular function, biological process, and pathway. Updated single-nucleotide polymorphisms (SNP) scoring tool.

DAVID Bioinformatics Resources

Description : DAVID Bioinformatics Resources (Database for Annotation, Visualization, and Integrated Discovery) is a large resource for visualization, functional annotation, and clustering tools using such databases as GO, KEGG pathways, Biocarta, and UniProt/SwissProt.

GO enrichment analysis tool

Description : A tool for gene enrichment analysis to find over- and under-represented GO terms in a gene set.

DOSE

Description : DOSE is a tool for annotating genes based on Disease Ontology (DO). DOSE includes functions for enrichment analyses using a hypergeometric model.
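
The hypergeometric model behind this kind of over-representation analysis can be written down directly. The sketch below is a generic one-sided test with hypothetical argument names and gene numbers; it does not reproduce DOSE's R interface or its multiple-testing correction.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the term."""
    if not (0 <= annotated <= population and 0 <= selected <= population):
        raise ValueError("annotated and selected must fit in the population")
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(comb(annotated, k) * comb(population - annotated, selected - k)
               for k in range(overlap, upper + 1))
    return tail / total

# Hypothetical numbers: 20,000 background genes, 150 annotated to a term,
# 300 differentially expressed genes, 12 of which carry the annotation.
print(hypergeom_enrichment_p(20000, 150, 300, 12))
```

About 2.25 annotated genes would be expected by chance in this example, so an overlap of 12 gives a very small p-value. In practice one such test is run per term, and the resulting p-values are adjusted for multiple testing.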

TS-GOEA

Description : TS-GOEA is a web-based tool for enrichment analysis using data from more than 50 human tissues. The TS-GOEA algorithm uses a hypergeometric distribution for testing, annotations from Gene Ontology Resource (GO), and expression data from GTEX portal.

GSEA (UCSD)

Description : GSEA (Gene Set Enrichment Analysis) is a tool for gene enrichment analysis. The GSEA algorithm assesses differences between gene sets and uses the Molecular Signatures Database (MSigDB) for annotation.

SIGN

Description : SIGN (Similarity Identification in Gene expressioN) is a tool for the analysis of expression patterns and pathways in RNA-seq data. The SIGN algorithm identifies similarities between biological sample sets.

ToppCluster

Description : ToppCluster is a web-based tool for the enrichment and network analysis of mammalian RNA-seq data. ToppCluster allows visualization using R, TreeView, GenePattern, Cytoscape, and Gephi.

PINTA

Description : PINTA is a web-based tool to prioritize candidate genes based on their genome-wide protein-protein interaction network neighborhood and differential gene expression. PINTA works for human, mouse, rat, worm, and yeast data.

DaMiRseq

Description : DaMiRseq is an R tool for feature selection, classification, and bias removal in RNA-seq data. The input consists of raw counts. DaMiRseq has visualization functions for heatmaps, RLE, MDS, or correlation plots.

Mergeomics

Description : Mergeomics is a tool to recognize pathological pathways, regulatory pathways, and key regulators in omics data. The Mergeomics algorithm consists of two modules: 1. Marker set enrichment analysis (MSEA) and 2. Weighted Key Driver Analysis (wKDA).

goSTAG

Description : goSTAG is a tool for GO enrichment analysis (GSEA) of RNA-seq data sets or data from other high-throughput technologies. The algorithm uses Fisher's exact test and GO subtrees for annotation to describe biological themes.

MCbiclust

Description : MCbiclust (Massively Correlated Biclustering) is a tool to cluster correlated gene expression in RNA-seq data sets. The MCbiclust algorithm finds the maximum strength correlation matrix and includes visualization tools.

CrossHub

Description : CrossHub is a tool for the analysis of methylome data using The Cancer Genome Atlas (TCGA). CrossHub identifies possible transcription factor gene (TF-gene) interactions. The CrossHub algorithm has RNA-Seq analysis functions for differential expression analysis, prediction of regulatory TF and miRNA, analysis of methylation profiles, and RNA-Seq vs. clinical (TNM, stage, follow-up) correlation analysis.

TSSAR

Description : TSSAR is a web service for the identification of bacterial transcription start sites (TSS) in RNA-seq data. The TSSAR algorithm uses Skellam distribution statistics to evaluate enriched transcripts.

CracTools

Description : CracTools provides a set of command-line tools for the analysis of single nucleotide variation (SNV), insertions/deletions (indels), splice junctions, chimeric reads in RNA-seq data.

IUTA

Description : IUTA is a tool for the detection of isoform usage in RNA-seq data sets generated by Illumina paired-end sequencing. The IUTA algorithm uses two probability distributions under the Aitchison geometry to test equal mean values.

Heat seq

Description : Heat seq is a web-based tool to compare ChIP-seq, RNA-seq, and CAGE experiment data with public data.

asSeq

Description : asSeq is a tool for expression quantitative trait locus (eQTL) mapping. The asSeq algorithm combines the total read count and allele-specific expression to identify cis- and trans-eQTL.

UnoSeq

Description : UnoSeq is a tool to analyze expression profiles of organisms that lack genome and/or transcriptome information using Illumina RNA-seq data.

spliceR

Description : spliceR is a tool to classify alternative splicing and assess the coding potential of RNA-seq data. The spliceR algorithm can detect exon skipping, intron retention, alternative first or last exon usage, donor and acceptor sites, and mutually exclusive exon events. spliceR produces genomic coordinates for all differentially spliced sequences and predicts the coding potential and possible nonsense-mediated decay for each of the transcripts.

networkBMA

Description : networkBMA is a tool to infer gene regulatory networks. The networkBMA algorithm uses a Bayesian inference method combining external information to increase accuracy.

3USS

Description : 3USS is a web-based tool to detect alternative 3 prime UTRs in RNA-seq data.

CAMUR

Description : CAMUR (Classifier with Alternative and MUltiple Rule-based models) is a tool to obtain knowledge by extracting several different classification models of gene features in RNA-seq data. The CAMUR algorithm uses an iterative approach to compute a classification model and the power set of the features in its rules, and includes an ad hoc knowledge database and query tool.

MetaDiff

Description : MetaDiff is a tool to analyze isoform expression in RNA-seq data. The MetaDiff algorithm uses a random-effects meta-regression.

FlaiMapper

Description : FlaiMapper is a tool to identify and annotate small non-coding RNAs (sncRNAs) in RNA-seq datasets. The FlaiMapper algorithm uses small RNA-seq read alignments to annotate fragments by peak detection on the start and end position densities. The final steps are filtering and reconstruction.
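
Peak detection on start and end position densities can be pictured as counting how many reads begin or end at each coordinate and keeping the local maxima. The sketch below is a simplified illustration under assumed inputs (a list of hypothetical `(start, end)` read alignments and an arbitrary `min_count` threshold). It is not FlaiMapper's implementation and omits the filtering and reconstruction steps.

```python
from collections import Counter


def find_peaks(positions, min_count=3):
    """Return positions whose read count is a local maximum of at least min_count."""
    density = Counter(positions)
    peaks = []
    for pos, count in sorted(density.items()):
        left = density.get(pos - 1, 0)
        right = density.get(pos + 1, 0)
        # On a plateau of equal counts, only the rightmost position is kept.
        if count >= min_count and count >= left and count > right:
            peaks.append(pos)
    return peaks


# Hypothetical alignments on one precursor: (start, end) per read.
reads = [(10, 31), (10, 31), (10, 31), (11, 32), (10, 31),
         (40, 62), (40, 62), (40, 62)]
start_peaks = find_peaks([start for start, _ in reads])
end_peaks = find_peaks([end for _, end in reads])
```

Here the start peaks and end peaks each mark two candidate fragment boundaries; pairing starts with ends into fragments is a separate step.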

RNA-eXpress

Description : RNA-eXpress is a tool to annotate new, biologically significant transcript features in RNA-seq datasets.

WaveQTL

Description : WaveQTL is a tool for the analysis of functional phenotypes in RNA-seq, ChIP-seq, and DNase-seq data. The WaveQTL algorithm uses wavelet-based techniques for the analyses and tests for the association among genetic variants and the underlying function.

GETUTR

Description : GETUTR is a tool to estimate and quantify the usage of 3 prime UTRs in RNA-seq datasets.

ISoLDE

Description : ISoLDE is a tool to identify genes that are imprinted in RNA-seq datasets.

Transcriptator

Description : Transcriptator is a pipeline tool for GO enrichment analysis in RNA-seq datasets that lack a reference genome.

IsoSCM

Description : IsoSCM (Isoform Structural Change Model) is a tool to annotate 3 prime UTRs in RNA-seq data. The IsoSCM algorithm uses change-point analysis to improve annotation assessment.
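
Change-point analysis looks for positions where read coverage shifts abruptly, such as the end of a 3 prime UTR. As a minimal sketch of the idea, the code below finds a single change point by fitting two constant segments and minimizing the summed squared error. This is a generic least-squares illustration with hypothetical coverage values, not IsoSCM's model.

```python
def sum_squared_error(values):
    """Squared deviation of values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_change_point(coverage):
    """Index that best splits coverage into two constant-level segments.

    Returns i such that coverage[:i] and coverage[i:] are the two segments.
    """
    if len(coverage) < 2:
        raise ValueError("need at least two coverage values")
    best_index, best_cost = None, float("inf")
    for split in range(1, len(coverage)):
        cost = sum_squared_error(coverage[:split]) + sum_squared_error(coverage[split:])
        if cost < best_cost:
            best_index, best_cost = split, cost
    return best_index


# Hypothetical per-base coverage that drops partway along the region.
coverage = [20, 21, 19, 20, 5, 4, 6, 5]
boundary = best_change_point(coverage)
```

The quadratic-time loop is acceptable for short regions; longer regions and multiple change points call for a dedicated method.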

GIREMI

Description : GIREMI (genome-independent identification of RNA editing by mutual information) is a tool to predict adenosine-to-inosine editing in RNA-seq data.

GGEA

Description : GGEA (Gene Graph Enrichment Analysis) is a tool for the detection of enriched gene sets in RNA-seq datasets. The GGEA algorithm uses prior knowledge obtained from gene regulatory networks.

zFPKM

Description : zFPKM is a tool for the identification of biologically relevant genes using the zFPKM normalization method. The algorithm operates with gene-level data using FPKM or TPM.
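
The idea behind a zFPKM-style normalization is to express each gene's log2(FPKM) as a z-score relative to a Gaussian fitted to the upper (actively expressed) part of the distribution. The sketch below is a simplified illustration under stated assumptions: the mode is estimated from a crude histogram with a hypothetical `bin_width`, and the standard deviation is derived from the values above the mode using the half-normal relation. It should not be read as the package's exact procedure.

```python
import math
from collections import Counter


def zfpkm_like(fpkm_values, bin_width=0.5):
    """Return a z-score per input value; non-positive FPKM values map to None."""
    log2_values = [math.log2(v) for v in fpkm_values if v > 0]
    if not log2_values:
        raise ValueError("no positive FPKM values")
    # Crude mode estimate: center of the most populated histogram bin.
    bins = Counter(math.floor(v / bin_width) for v in log2_values)
    top_bin = max(bins, key=bins.get)
    mu = (top_bin + 0.5) * bin_width
    # For a half-normal distribution, mean offset = sigma * sqrt(2 / pi).
    offsets = [v - mu for v in log2_values if v > mu]
    if not offsets:
        raise ValueError("no values above the estimated mode")
    sigma = (sum(offsets) / len(offsets)) * math.sqrt(math.pi / 2)
    return [
        (math.log2(v) - mu) / sigma if v > 0 else None
        for v in fpkm_values
    ]


# Hypothetical gene-level FPKM values.
scores = zfpkm_like([8, 8, 8, 8, 16, 32, 1, 0.5, 0])
```

Genes with scores above a chosen cutoff would then be treated as expressed; the cutoff itself is an analysis decision and is not set here.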

Annocript

Description : Annocript is a tool for the annotation of transcriptomes. The package uses BLAST with UniProt, NCBI Conserved Domain Database and Nucleotide divisions, Gene Ontology, UniPathways, and the Enzyme Commission.

EBSeqHMM

Description : EBSeqHMM is a tool for the identification of isoforms with variable expression profiles over time. The EBSeqHMM algorithm uses empirical Bayes mixture modeling and an auto-regressive hidden Markov model to assess the dependency of gene expression and conditions.

QuSAGE

Description : QuSAGE (Quantitative Set Analysis of Gene Expression) is a tool to quantify and analyze differential gene expression (DGE) and gene-to-gene correlations in RNA-seq data sets. The QuSAGE algorithm quantifies gene-set activity with a complete probability density function and computes p-values and confidence intervals.

seq2pathway

Description : seq2pathway is a tool for the functional gene-set analysis of genomic loci in RNA-seq data sets. The seq2pathway algorithm assigns genes to the pathways and computes gene-level pathway scores.

ToPASeq

Description : ToPASeq is an interface tool for the pathway analysis of RNA-seq and microarray data sets. ToPASeq can run the following tools: SPIA, DEGraph, TopologyGSA, TAPPA, PRS, and PWEA. ToPASeq also has functions for visualization, importing, and manipulation of pathways.

GOexpress

Description : GOexpress is a tool to identify Gene Ontology (GO) terms in RNA-seq data sets. It uses a supervised clustering approach and can concurrently classify samples from many experimental groups. The GOexpress package also has functions to visualize gene expression profiles.

Genexpi

Description : Genexpi is a tool for the identification of sigma factors using a combination of data from literature and time-course gene expression data. The CyGenexpi (see links) plugin integrates Genexpi with the Cytoscape tool. The Genexpi tool is available as an R package.

KOBAS

Description : KOBAS (KEGG Orthology-Based Annotation System) is a tool for the annotation of sequences by KEGG Orthology terms. KOBAS also identifies enriched pathways and uses KEGG Pathway, PID, BioCyc, Reactome, Panther and human data from OMIM, KEGG Disease, FunDO, GAD, NHGRI, and GWAS databases.
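
Pathway enrichment of this kind is commonly scored with a hypergeometric test, which asks how likely it is to see at least the observed number of pathway genes in a gene list drawn at random. The sketch below shows that calculation with the standard library. The gene counts are hypothetical, and the example does not claim to reproduce KOBAS's statistics or multiple-testing correction.

```python
from math import comb


def hypergeom_upper_tail(hits, population, pathway_size, sample_size):
    """P(X >= hits) when sample_size genes are drawn from population genes,
    of which pathway_size belong to the pathway."""
    total = comb(population, sample_size)
    upper = min(pathway_size, sample_size)
    favorable = sum(
        comb(pathway_size, i) * comb(population - pathway_size, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return favorable / total


# Hypothetical example: 20 annotated genes, 5 in the pathway,
# and all 5 genes in the input list fall in that pathway.
p_value = hypergeom_upper_tail(5, 20, 5, 5)
```

When many pathways are tested, the resulting p-values need correction for multiple testing before interpretation.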

SplAdder

Description : SplAdder is a tool to analyze alternative splicing using RNA-seq alignment data. The SplAdder algorithm uses annotation and RNA-seq read alignments to compute a splicing graph, and quantifies splicing events that may be useful in differential analysis.

3.2 Comparison With Genome

VARUS

Description : VARUS is a tool for genome annotation using RNA-seq read data. The VARUS algorithm uses reads from NCBI's Sequence Read Archive and depends on samtools, bamtools, fastq-dump, and STAR or HISAT2.

rnaSeqMap

Description : rnaSeqMap is a tool that contains functions for the analysis of RNA-seq datasets using coverage profiles of multiple samples. It includes functions for the analysis of sequence coverage, significance, and splicing, which are useful, for example, for finding novel transcripts.
