
249 Free RNA-seq Core Analysis Tools - Software and Resources


1. Transcriptome Profiling

1.1 Read mapping or assembly

1.1.1 De novo (reference free) transcriptome assembly

1.1.1.1 Unstranded
  1. Trans-ABySS
    • Description : Trans-ABySS is a tool for de novo transcriptome assembly using short reads. The Trans-ABySS algorithm specifically addresses issues caused by local coverage variations by first computing assemblies of substrings using various stringencies. It then merges the separate assemblies into contigs. It can handle paired-end reads and multiple insert sizes, but not strandedness. This tool requires ABySS and BLAT.
  2. SOAPdenovo-Trans
    • Description : SOAPdenovo-Trans is a de novo RNA-seq full-length transcriptome assembler. The SOAPdenovo-Trans algorithm adapts the SOAPdenovo framework, uses the Trinity error removal technique and the graph traversal model from Oases, and applies a transitive reduction to simplify scaffolding graphs. It can handle paired-end reads and multiple insert sizes. The physical memory requirement is large.
  3. IDBA-Tran
    • Description : IDBA-Tran is a de novo RNA-seq transcriptome assembler. The IDBA-Tran algorithm uses de Bruijn graphs, can handle paired-end reads and isoforms, and uses a probabilistic heuristic method to remove incorrect vertices.
  4. RNAbrowse
    • Description : RNAbrowse is a browser for de novo RNA-seq data assembly results.
1.1.1.2 Stranded
  1. StringTie
    • Description : A tool to assemble RNA-seq sequence alignments into transcripts. The StringTie algorithm can optionally make de novo assemblies and uses a novel network flow algorithm.
  2. Cidane
    • Description : A tool to assemble ab initio and quantify transcripts in RNA-seq data. The Cidane algorithm also annotates known splice sites, transcription starts and ends.
  3. Rnnotator
    • Description : Rnnotator is a pipeline tool for the generation of full-length transcript models by computing de novo assemblies of RNA-seq data sets. The Rnnotator algorithm specifically addresses issues arising from poor read quality and read length, and can make deep coverage assemblies. It can use paired-end, stranded reads and multiple insert sizes, and works on multiple CPUs. Obtaining Rnnotator requires a license unless you are collaborating with the developer. Contact David Gilbert at Lawrence Berkeley National Laboratory (DEGilbert_at_lbl.gov) for more information.
  4. ABySS
    • Description : ABySS is a tool for de novo genome assembly using short read data. It implements a distributed representation of de Bruijn graphs, which enables parallel computation of the assembly algorithm. ABySS stands for Assembly By Short Sequences.
  5. Oases
    • Description : Oases is a tool for assembling de novo transcriptomes using short RNA-seq reads. The Oases algorithm uses dynamic error removal in the prediction of full-length transcripts, and it can handle a wide range of expression values and the presence of alternative isoforms. Requires Velvet 1.2.08 or higher (see links).
  6. Trinity
    • Description : Trinity is a tool for de novo transcriptome assembly of RNA-seq data and consists of three modules: Inchworm, Chrysalis, and Butterfly. The algorithm uses de Bruijn graphs and a dynamic programming method. It can detect isoforms and handle paired-end reads, multiple insert sizes, and strandedness. The running time is exponentially related to the number of graph branches.
  7. Scripture
    • Description : Scripture is a tool for de novo assembly of RNA-seq full-length gene transcriptome data. The Scripture algorithm needs both reads and a genome sequence, and can handle strandedness.
  8. Bridger
    • Description : Bridger is a tool for de novo assembly of RNA-seq full-length transcriptome data. The Bridger algorithm adapts schemes used in Cufflinks and Trinity. It can handle paired-end reads and multiple insert sizes. The sensitivity and specificity are similar to those of Cufflinks. Bridger runs faster and requires less physical memory than several other assembly tools.
  9. BinPacker
    • Description : BinPacker is a tool for de novo RNA-seq full-length transcriptome assembly. It can handle paired-end reads.
  10. rnaSPAdes
    • Description : rnaSPAdes is a de novo RNA-seq full-length transcriptome assembly tool. rnaSPAdes extends the SPAdes genome assembler and can handle paired-end reads, isoforms, and multiple insert sizes.
  11. Bayesembler
    • Description : Bayesembler is a tool
1.1.1.3 Quality Control
  1. DETONATE
    • Description : DETONATE (DE novo TranscriptOme rNa-seq Assembly with or without the Truth Evaluation) is a tool to evaluate de novo RNA-seq transcriptome assemblies. The DETONATE package consists of two modules, RSEM-EVAL and REF-EVAL.
  2. TransRate
    • Description : Transrate is a tool to assess and analyze de novo RNA-seq transcriptome assemblies. Transrate does not need a reference genome, and the report comprises analyses of structural errors, chimeras, incorrect bases, and deficient assembly. The algorithm uses unique statistics, the TransRate contig score, and the TransRate assembly score.
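Several of the assemblers listed in section 1.1.1 (for example IDBA-Tran, ABySS, and Trinity) are described as using de Bruijn graphs. The following is a minimal Python sketch of that idea only, not the implementation of any listed tool: reads are split into k-mers, each k-mer becomes an edge between its (k-1)-mer prefix and suffix, and a contig is read off by following unbranched edges. The function names, the toy reads, and the choice of k=4 are assumptions made for illustration. Real assemblers also handle sequencing errors, reverse complements, coverage, and branching caused by isoforms.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk_unbranched(graph, start):
    """Extend a contig from `start` while the current node has exactly one distinct successor."""
    contig = start
    node = start
    seen = {start}
    while True:
        successors = set(graph.get(node, []))
        if len(successors) != 1:
            break  # dead end or branch point (e.g. alternative isoforms)
        node = successors.pop()
        if node in seen:
            break  # avoid looping forever on a cycle
        seen.add(node)
        contig += node[-1]
    return contig


# Hypothetical overlapping reads from one transcript
reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unbranched(graph, "ATG")
print(contig)
```

In this toy input every (k-1)-mer has a single successor, so the walk recovers one contig spanning all three reads. With alternative isoforms the graph branches, which is where the listed tools differ in their traversal strategies.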

1.1.2 Mapping to a reference genome or transcriptome

1.1.2.1 Splice Aware
  1. HISAT2
    • Description : A tool to map DNA and RNA sequences to one or more genomes. The HISAT2 algorithm uses an extension of the Burrows-Wheeler transform (BWT) to generate graphs, a new graph FM index (GFM), and a Hierarchical Graph FM index (HGFM) to index a whole-genome and population of genomes.
  2. EventPointer
    • Description : It identifies alternative splicing events that involve either simple or complex experimental designs such as time course experiments and studies including paired-samples. The algorithm can be used to analyze data from either junction arrays or sequencing data. The algorithm can generate a series of files to visualize the detected alternative splicing events in IGV. This eases the interpretation of results and the design of primers for standard PCR validation.
  3. OLego
    • Description : A tool for mapping of spliced mRNA-seq reads. The OLego algorithm uses the Burrows-Wheeler transform for the mapping of seeds, splice junctions, and detection of exons. The algorithm also allows multiple threads.
  4. Subjunc
    • Description : Subjunc is a tool to align RNA-seq reads and to detect exon-exon junctions and gene fusions. Subjunc is part of the Subread package (see links). An R version is also available as Rsubread.
  5. Subread
    • Description : Subread is a software tool package for the alignment of both DNA-seq and RNA-seq read data, quantification, and mutation detection. The Subread package consists of five separate tools: 1. Subread, a read aligner for both RNA-seq and DNA-seq data, 2. Subjunc, a read aligner for RNA-seq data that detects exon-exon junctions and gene fusion events, 3. featureCounts, for read counting, 4. Sublong, for aligning long reads using the seed-and-vote technique, and 5. exactSNP, for single-nucleotide polymorphism (SNP) discovery. An R version of the Subread package is also available, Rsubread.
  6. GSNAP
    • Description : GSNAP (Genomic Short-read Nucleotide Alignment Program) is a tool to align single- and paired-end reads to a reference genome. The GSNAP algorithm is based on the seed-and-extend method and works on reads down to 14 nucleotides of length, and computes SNP-tolerant alignments of various combinations of major and minor alleles. The algorithm can discover long-distance and interchromosomal splicing events by utilizing known splice sites data or by probabilistic models. In addition, the GSNAP algorithm can construct alignments using reads originating from bisulfite-treated DNA samples.
  7. CRAC
    • Description : CRAC is a tool to map RNA-seq reads. The CRAC algorithm uses a k-mer profiling approach to identify substitutions, insertions/deletions (indels), and chimeric junctions.
  8. STAR
    • Description : A tool to align RNA-seq data. The STAR algorithm uses suffix arrays, seed clustering, and stitching. It can detect non-canonical splice sites, chimeric sequences, and can also map full-length RNA sequences.
  9. TruHmm
    • Description : TruHmm is a tool for assembling prokaryote RNA-seq transcriptomes based on a reference.
  10. MaLTA
    • Description : A tool to assemble and quantify transcripts in Ion Torrent RNA-seq data sets. MaLTA uses the IsoEM algorithm for the estimation of expression levels. It also uses a maximum likelihood method in both assembly and quantification steps.
  11. Necklace
    • Description : A tool to assemble RNA-seq data. The Necklace algorithm can assemble both de novo and guided by a template. It combines an assembled transcriptome with annotations from a reference.
  12. Rail-RNA
    • Description : Rail-RNA is a tool to align spliced sequences from RNA-seq data. Rail-RNA is cloud-enabled and can analyze multiple samples at a time.
  13. TopHat
    • Description : TopHat is a tool for splice-aware mapping of RNA-seq reads. TopHat uses the Bowtie short read aligner (a BWT-based algorithm) for the mapping, after which it identifies intron-exon (splice) junctions. TopHat can use paired-end sequencing reads and parallel computation. (*BWT=Burrows–Wheeler transform)
  14. MapSplice
    • Description : MapSplice is a tool to align RNA-seq reads to a reference sequence. The MapSplice algorithm uses the Burrows-Wheeler Transform (BWT) technique and can discover both canonical and non-canonical splice sites.
  15. Rbowtie2
    • Description : Rbowtie2 is an R tool that wraps the Bowtie 2 tool and includes adapter removal, read merging and identification.
  16. Rsubread
    • Description : Rsubread is an R tool for RNA-/DNA-seq data mapping, read counting, single-nucleotide polymorphism (SNP), structural variant, and gene fusion detection. The tool is also available in C language, see Subread.
  17. Rbowtie
    • Description : This package provides an R wrapper around the popular Bowtie short read aligner and around SpliceMap, a de novo splice junction discovery and alignment tool. The package is used by the QuasR Bioconductor package. The authors recommend using QuasR instead of using this package directly.
  18. DeepBound
    • Description : DeepBound is a tool to identify splicing junctions and boundaries of expressed transcript read alignments in RNA-seq data. The DeepBound algorithm uses deep convolutional neural fields.
  19. Supersplat
    • Description : Supersplat is a tool to identify splice junctions in RNA-seq data.
  20. Qpalma
    • Description : Qpalma is a tool to align spliced reads. The Qpalma algorithm uses quality values and information of predicted splice sites for the assessment of alignment accuracy.
  21. tophat-IP
    • Description : TopHat-IP is the TopHat tool at Galaxy Pasteur. See links for TopHat. TopHat is a tool for splice-aware mapping of RNA-seq reads. TopHat uses the Bowtie short read aligner (a BWT-based algorithm) for the mapping, after which it identifies intron-exon (splice) junctions. TopHat can use paired-end sequencing reads and parallel computation. (*BWT=Burrows–Wheeler transform)
  22. SpliceJumper
    • Description : SpliceJumper is a tool to identify splice junctions in RNA-seq data. The SpliceJumper algorithm uses a classification-based approach.
  23. PALMapper
    • Description : PALMapper is a tool to align reads from RNA-seq data. PALMapper can compute spliced and unspliced alignments. The package combines GenomeMapper with the spliced aligner QPALMA (see links). The PALMapper tool is available as a command-line tool or via the web service (https://galaxy.inf.ethz.ch/).
  24. SGSeq
    • Description : SGSeq is a tool to predict and quantify splice events in RNA-seq datasets. The SGSeq algorithm predicts splice junctions and exons by mapping reads to a reference genome.
  25. MapPER
    • Description : MapPER is a tool to align paired-end reads in RNA-seq data sets. The MapPER algorithm uses an expectation-maximization method to assign likelihood values.
  26. FusionSeq
    • Description : FusionSeq is a tool for the identification of fusion transcripts in RNA-seq data sets using paired-end reads. FusionSeq includes functions to filter out spurious fusions caused by misalignment artifacts or random pairing. It ranks the candidate fusions using varied statistical methods.
1.1.2.2 Splice unaware
  1. mmquant
    • Description : A tool to quantify gene expression. The mmquant algorithm handles multi-mapping reads (for example, reads from duplicated genes) by constructing merged genes.
  2. RNA-MATE
    • Description : A recursive mapping strategy for high-throughput RNA-sequencing data. The pipeline is written in Perl and makes use of a PBS queue manager; however, it can be configured to use LSF or SGE.
  3. NanoPARE
    • Description : NanoPARE is a set of tools for the analysis of 5' RNA data from nanoPARE sequencing libraries. The NanoPARE package contains (1) EndMap for aligning 5P and BODY FASTQ files to a reference genome, (2) EndGraph for identifying 5P features, (3) EndClass for classifying 5P features as capped or noncapped and labeling features according to a reference transcriptome, (4) EndMask for masking genomic regions with capped features and converting genome coordinates to transcriptome coordinates, and (5) EndCut for searching for evidence of small RNA-mediated cleavage in transcript-mapping noncapped 5P reads. Requirements: STAR aligner 2.5+, Python 3.6+, Samtools 1.3+, Bedtools 2.26, and Cutadapt 1.9.
  4. GEM Mapper
    • Description : GEM Mapper is a tool for aligning paired-end reads to a reference genome. The GEM algorithm uses seed and extend technique and enables exhaustive searches given specific criteria.
  5. GEM-Tools
    • Description : GEM-Tools is a C API and a Python module that simplify the usage of the GEM Mapper tool (see links). GEM-Tools also includes a command-line interface, gemtools, for running the RNA-seq pipeline, the indexer module, the statistics module, and various other tools.
  6. Bowtie
    • Description : Bowtie is a tool for aligning short DNA sequence reads to a reference genome. The Bowtie algorithm uses the Burrows-Wheeler transform (BWT) technique and permits the use of multiple CPUs.
  7. Bowtie 2
    • Description : Bowtie 2 is a tool for aligning short DNA sequence reads to a reference genome. The Bowtie 2 algorithm uses a compressed full-text substring index based on the Burrows-Wheeler transform (BWT) technique and permits the use of multiple CPUs. Bowtie 2 can align reads up to thousands of nucleotides in length and supports gapped, local, and paired-end alignment modes.
  8. GRIT
    • Description : GRIT (Generalized RNA Integration Tool) is a tool to assemble transcripts using RNA-seq data. The GRIT pipeline combines RNA-seq and gene-boundary data, CAGE, RAMPAGE, and poly(A)-seq data.
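Many of the aligners listed in section 1.1.2 (Bowtie, Bowtie 2, HISAT2, TopHat, MapSplice, OLego) are described as relying on the Burrows-Wheeler transform (BWT). As a rough illustration of why the BWT is useful for read mapping, the Python sketch below builds the transform naively from sorted rotations and then counts exact occurrences of a pattern with the standard backward-search procedure. It is a teaching sketch under simplifying assumptions: the function names are invented here, the rank queries are computed by scanning the string, and no mismatches, gaps, or splicing are handled. Production aligners use compressed indexes and far more elaborate search.

```python
def bwt(text):
    """Return the Burrows-Wheeler transform of `text`, using '$' as the end marker."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_string, pattern):
    """Count exact matches of `pattern` in the original text via backward search."""
    # first_index[c]: position of the first row starting with c in the sorted rotations
    first_index = {}
    for i, c in enumerate(sorted(bwt_string)):
        first_index.setdefault(c, i)

    def occ(c, i):
        # Number of occurrences of c in bwt_string[:i] (naive rank query)
        return bwt_string[:i].count(c)

    lo, hi = 0, len(bwt_string)  # half-open range of matching rows
    for c in reversed(pattern):
        if c not in first_index:
            return 0
        lo = first_index[c] + occ(c, lo)
        hi = first_index[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


# Hypothetical reference sequence and read fragments
reference = "ACGTACGTGA"
index = bwt(reference)
print(count_occurrences(index, "ACG"))
print(count_occurrences(index, "TTT"))
```

The search processes the pattern from its last character to its first and narrows a range of rows in the sorted rotation table, so the cost depends on the pattern length rather than on scanning the whole reference for each read.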
1.1.2.3 Quality Control
  1. ORMAN
    • Description : ORMAN (Optimal Resolution of Multimapping Ambiguity of RNA-Seq Reads) is a tool to resolve transcript mappings in RNA-seq data. The ORMAN algorithm uses combinatorial optimization, integer linear programming, heuristics, and well-known approximation methods.
  2. FastQ Screen
    • Description : FastQ Screen is a tool for quality control of DNA samples by screening them against a set of reference genomes to validate their origin.
  3. Picard
    • Description : Picard is a collection of command-line tools for handling high-throughput sequencing data.
  4. RSeQC
    • Description : RSeQC is a set of tools for quality assessment of RNA-seq data. The RSeQC package includes functions for sequence quality, nucleotide composition, GC and PCR bias, sequencing depth, strand specificity, mapped reads distribution, coverage uniformity, RNA integrity, and genomic read distribution.
  5. AlignerBoost
    • Description : AlignerBoost is a tool to analyze the mapping of high-throughput sequencing reads and increase the overall mapping precision. It works with all sequence aligners that produce SAM or BAM output and further accepts known SNPs as input to improve the quality of alignments. The authors have specifically optimized AlignerBoost for Bowtie, Bowtie2, NovoAlign, BWA-ALN/BWA-SW/BWA-MEM, SeqAlTo (DNA aligners), and Tophat, Tophat2, STAR (RNA aligners).
  6. QoRTs
    • Description : QoRTs is a tool for quality control of RNA-seq data. The QoRTs package has functions for analysis, quality control, and data management, primarily the detection and identification of errors, biases, and artifacts in paired-end sequencing. It can also compute count data and group-summary genome track files for visualization in the UCSC genome browser.
  7. RNA-SeQC
    • Description : A tool for quality control of RNA-seq data. The RNA-SeQC package has functions for computing various quality metrics, such as alignment quality, duplication rates, GC bias, rRNA content, coverage continuity, covered alignment regions, transcript count, and 3'/5' bias. It produces read counts, coverage, and correlation quality control metrics, and is also suitable for use with scRNA-seq data sets.
  8. QuaCRS
    • Description : QuaCRS is a tool for integrated quality control of RNA-seq data. The QuaCRS package consolidates the FastQC and RNA-SeQC tools and a collection of functions from RSeQC.
  9. MultiQC
    • Description : MultiQC is a tool that aggregates results from multiple sequence aligners, pre- and post-processing tools, and quality control tools. Version 1.8 supports 78 separate tool packages. MultiQC obtains the information by scanning the log files and produces an HTML report. This tool is also useful for single-cell sequencing data and population studies.
  10. QualiMap
    • Description : QualiMap, and its later version Qualimap 2, is a tool for quality control of sequence alignments and genomic features. QualiMap can use whole-genome and exome sequencing, RNA-seq, and ChIP-seq data. It also has functions for comparison of multiple samples and clustering of epigenomic profiles.
  11. NOISeq
    • Description : NOISeq is a tool for quality control of RNA-seq count data. NOISeq can evaluate, among others, count distribution, per-chromosome expression, and detected features. The NOISeqBIO module in the NOISeq package assesses false positives non-parametrically.
  12. EDASeq
    • Description : EDASeq is an R tool for visualization of RNA-seq data. EDASeq includes functions for within-lane and between-lane normalization.
  13. rnaQUAST
    • Description : rnaQUAST is a tool to assess the quality of RNA-seq assemblies. rnaQUAST uses a reference and a gene database to compute several quality metrics for assembly correctness and completeness.
  14. GeneScissors
    • Description : GeneScissors is a tool for quality control of mapped RNA-seq data. The GeneScissors algorithm combines machine learning (ML) with biological knowledge for the detection and correction of spurious inferences.
  15. CADBURE
    • Description : CADBURE is a tool

1.2 Expression Quantification

1.2.1 Union-exon Based

  1. Subread
    • Description : Subread is a software tool package for the alignment of both DNA-seq and RNA-seq read data, quantification, and mutation detection. The Subread package consists of five separate tools: 1. Subread, a read aligner for both RNA-seq and DNA-seq data, 2. Subjunc, a read aligner for RNA-seq data that detects exon-exon junctions and gene fusion events, 3. featureCounts, for read counting, 4. Sublong, for aligning long reads using the seed-and-vote technique, and 5. exactSNP, for single-nucleotide polymorphism (SNP) discovery. An R version of the Subread package is also available, Rsubread.
  2. IsoEM2
    • Description : A tool to estimate differential expression and confidence intervals in RNA-seq data. The IsoEM2 package integrates both IsoEM2 and IsoDE2. The IsoEM2 algorithm uses bootstrapping to evaluate expression levels and confidence intervals. It reports fragments per kilobase of transcript per million mapped reads (FPKM) and transcripts per million (TPM) for genes and isoforms. IsoDE2 uses the data generated by IsoEM2 to analyze differential expression.
  3. featureCounts
    • Description : featureCounts is a tool to quantify RNA-seq and gDNA-seq data as counts. It is also suitable for single-cell RNA-seq (scRNA-seq) data. It supports multi-threading. featureCounts is part of the Subread package (see links). An R version is also available as Rsubread.
  4. easyRNASeq
    • Description : easyRNASeq is a tool to quantify RNA-seq expression data. The package also has functions for retrieving annotations and summarizing read counts by feature. It reports reads per kilobase per million mapped reads (RPKM).
  5. HTSeq
    • Description : HTSeq is a tool for the analysis of high-throughput sequencing data. It processes reads aligned with HISAT or STAR and assigns expression value counts. HTSeq is also suitable for the quantification of single-cell RNA-seq data (scRNA-seq). The package also includes an htseq-count tool for pre-processing RNA-seq reads before differential expression analysis and an htseq-qa tool that assesses the read quality.
  6. Rsubread
    • Description : Rsubread is an R tool for RNA-/DNA-seq data mapping, read counting, single-nucleotide polymorphism (SNP), structural variant, and gene fusion detection. The tool is also available in C language, see Subread.
  7. PennDiff
    • Description : PennDiff is a tool to quantify RNA-seq data. The PennDiff algorithm uses both transcript-based and union-exon methods.
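The union-exon approach used as the heading of this section can be illustrated with a small example. In the Python sketch below, the exons of all isoforms of a gene are merged into one non-overlapping "union" model, and a read is counted for the gene if it overlaps any merged interval. This is a simplified sketch and not the counting logic of featureCounts, HTSeq, or any other listed tool: the coordinates, the half-open interval convention, and the function names are assumptions, and strand, multi-mapping reads, and reads overlapping several genes are ignored.

```python
def union_exons(exons):
    """Merge overlapping (start, end) half-open exon intervals into a union model."""
    merged = []
    for start, end in sorted(exons):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def count_reads(union, reads):
    """Count reads (start, end) that overlap at least one interval of the union model."""
    count = 0
    for r_start, r_end in reads:
        if any(r_start < e_end and e_start < r_end for e_start, e_end in union):
            count += 1
    return count


# Two hypothetical isoforms of the same gene
isoform_a = [(100, 200), (300, 400)]
isoform_b = [(150, 250), (300, 400), (500, 600)]
gene_model = union_exons(isoform_a + isoform_b)

# Hypothetical aligned reads as (start, end) on the same chromosome
reads = [(120, 170), (240, 290), (260, 295), (390, 440), (550, 600)]
gene_count = count_reads(gene_model, reads)
print(gene_model, gene_count)
```

Because all isoforms are collapsed into one model, the resulting count is a gene-level value. The transcript-based tools in the next section instead try to apportion reads among individual isoforms.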

1.2.2 Transcript Based

  1. StringTie
    • Description : A tool to assemble RNA-seq sequence alignments into transcripts. The StringTie algorithm can optionally make de novo assemblies and uses a novel network flow algorithm.
  2. ISVASE
    • Description : A tool for the identification of splice variants in RNA-seq data. The ISVASE algorithm uses rule-based filters, identifies splicing junctions, sequence variants, and exon-exon junction shifts.
  3. Matataki
    • Description : A tool to estimate gene expression levels in RNA-seq data sets. The Matataki algorithm uses unique k-mers for each gene to quickly map all the fragments to genes. According to the authors, Matataki is faster than conventional methods.
  4. EMASE
    • Description : EMASE (Expectation-Maximization for Allele-Specific Expression) is a tool for estimating total gene expression, isoform usage, and allele-specific expression in RNA-seq data. The EMASE algorithm approaches the problem hierarchically by first resolving uncertainties between genes, then between isoforms, and finally between alleles. EMASE is a prototype implementation in Python, and EMASE-Zero is a C++ version.
  5. SparseIso
    • Description : A tool for identification of alternatively spliced transcripts in RNA-seq data. The SparseIso algorithm uses a Gibbs sampling method for simultaneous identification and quantification of transcripts. It also estimates the joint distribution of all transcript candidates to improve the detection of weakly expressed transcripts and of expressed isoforms.
  6. AltHapAlignR
    • Description : AltHapAlignR is a tool to estimate transcript abundance at the gene and haplotype levels in genomic regions.
  7. RefBool
    • Description : RefBool is a tool to classify RNA-Seq and microarray gene expression data into three categories: active, intermediate, and inactive. The RefBool algorithm is reference-based and provides p- and q-values for each classification.
  8. Salmon
    • Description : A tool to quantify transcript expression in RNA-seq data. The Salmon algorithm can correct for GC bias, and it uses 'selective alignment' and massively parallel stochastic collapsed variational inference to achieve high accuracy and speed. It reports transcripts per million (TPM).
  9. NURD
    • Description : NURD is a tool to estimate expression levels of isoforms in RNA-seq data. The NURD algorithm uses a binary interval search method and can correct for experimental sequencing biases both globally and locally. The home page is currently not available. You may email the authors and request the source code.
  10. kallisto
    • Description : A tool to quantify RNA-seq data. The kallisto algorithm uses a pseudoalignment approach, which assigns reads to compatible transcripts without computing actual base-level alignments and thereby speeds up quantification. kallisto can handle paired-end and single-end reads. It reports transcripts per million (TPM).
  11. BRIE
    • Description : BRIE (Bayesian regression for isoform estimation) is a tool to quantify splicing from RNA-seq data. The BRIE algorithm learns a prior distribution over isoform proportions from the sequences in samples using a Bayesian hierarchical model.
  12. Outrigger
    • Description : Outrigger is a tool for the creation of de novo alternative splicing annotation for RNA-seq data. Outrigger uses junction reads and a graph database, and quantifies percent spliced-in (Psi) for splicing events.
  13. bonvoyage
    • Description : bonvoyage is a tool for the detection of alternative splicing in RNA-seq data. The bonvoyage algorithm uses the outrigger de novo splice graph method and a Bayesian approach for modality assignment. It can also show changes in modalities using non-negative matrix factorization.
  14. EMSAR
    • Description : EMSAR is a tool for transcript quantification of RNA-seq data. The EMSAR algorithm can use both single- and paired-end reads, can operate in multi-thread mode, and reports fragments per kilobase of transcript per million mapped reads (FPKM).
  15. RSEM
    • Description : RSEM (RNA-Seq by Expectation-Maximization) is a tool for the quantification of RNA-seq data. The RSEM algorithm uses the expectation-maximization technique, can operate with and without a reference, and reports transcripts per million (TPM). RSEM scales linearly with the number of read alignments and uses the Bowtie tool for the read alignments.
  16. Cufflinks
    • Description : Cufflinks consists of a suite of tools for differential gene expression analysis of RNA-seq data. It assembles aligned reads into a set of transcripts and estimates their relative abundances. The Cufflinks suite consists of the following tools: cufflinks, cuffcompare, cuffmerge, cuffquant, cuffdiff, and cuffnorm.
  17. eXpress
    • Description : eXpress is a tool to quantify RNA-seq data, but it is also applicable to ChIP-seq, metagenomics, and large-scale sequencing data in general. The eXpress streaming algorithm quantifies sequenced DNA or RNA in real time. The authors no longer maintain the software and recommend using the kallisto tool instead.
  18. Sailfish
    • Description : Sailfish is a tool to estimate the abundance of gene isoforms using reference sequences and RNA-seq data sets. Sailfish uses an alignment-free algorithm based on k-mers.
  19. RNA-Skim
    • Description : RNA-Skim is a tool to quantify transcripts in RNA-seq data. The RNA-Skim algorithm uses a concept of sig-mers, a type of k-mers, for quantification based on distinct clusters of transcripts.
  20. SpliceTrap
    • Description : SpliceTrap is a tool to quantify exon inclusion ratios in paired-end RNA-seq data. The SpliceTrap algorithm quantifies the extent to which each exon is included, skipped, or subject to size variations.
  21. PennDiff
    • Description : PennDiff is a tool to quantify RNA-seq data. The PennDiff algorithm uses both transcript-based and union-exon methods.
  22. TIGAR2
    • Description : TIGAR2 is a tool to quantify transcript isoforms in RNA-seq data sets. The TIGAR2 algorithm applies a variational Bayesian inference, and it can also model sequencing errors. It can use Bowtie2 and BWA-MEM tools in the computational pipeline.
  23. LocExpress
    • Description : LocExpress is a web-based tool to quantify the expression of novel gene transcripts in RNA-seq data. The LocExpress algorithm works with human and mouse data. For the abundance estimation, LocExpress uses a minimum spanning bundle (MSB) region to allow quantification without the need to analyze a whole genome.
  24. EPIG-Seq
    • Description : EPIG-Seq is a tool to cluster co-expressed genes in RNA-seq data sets. The EPIG-Seq algorithm uses count correlation to estimate gene similarity. To estimate differential expression levels, it uses quasi-Poisson modeling and a location parameter.
  25. PDEGEM
    • Description : PDEGEM (Positional Dependent Energy Guided Expression Model) is a tool to estimate transcript abundance and isoform expression in RNA-seq data. The PDEGEM algorithm uses the Positional Dependent Nearest Neighborhood (PDNN) based technique to model the distribution of reads.
  26. SeqSaw
    • Description : SeqSaw is a tool for the de novo identification of splice junctions in RNA-seq data. The SeqSaw algorithm also detects splice junctions that lack GT-AG splicing signals.
  27. IsoLasso
    • Description : IsoLasso is a tool
  28. R-SAP
    • Description : R-SAP is an RNA-seq analysis pipeline tool. R-SAP quantifies expression and uses a hierarchical decision-making scheme to characterize various classes of transcripts. R-SAP reports expression levels as RPKM (reads per kilobase of exon model per million mapped reads).
  29. Solas
    • Description : Solas is a tool to predict and quantify expressed isoforms within observed coding regions in RNA-seq data. The Solas algorithm has three separate functions: 1. detection of alternative splicing events differentiating two conditions, 2. detection of genes and exons being part of an alternative splicing event, 3. quantification of the relative proportion of isoforms.
  30. SplitSeek
    • Description : SplitSeek is a tool to detect splice junctions and chimeric reads in RNA-seq data.
  31. cufflinks-IP
    • Description : cufflinks-IP is the Cufflinks tool at Institut Pasteur. Cufflinks consists of a suite of tools for differential gene expression analysis of RNA-seq data. It assembles aligned reads into a set of transcripts and estimates their relative abundances. The Cufflinks suite consists of the following tools: cufflinks, cuffcompare, cuffmerge, cuffquant, cuffdiff, and cuffnorm.
  32. Rcount
    • Description : Rcount is a tool to quantify the number of reads mapped to a specific gene (feature counts) in RNA-seq datasets. The Rcount algorithm specifically addresses the issue arising from reads mapping to multiple locations.
  33. SEQ-EM
    • Description : SEQ-EM is a tool to estimate the expression levels of homologous genes in RNA-seq datasets. The SEQ-EM algorithm uses a maximum likelihood-based method to estimate the model parameters.
  34. Net-RSTQ
    • Description : Net-RSTQ is a tool to quantify isoforms in RNA-seq data aimed for cancer transcriptome. The Net-RSTQ algorithm uses protein domain-domain interaction network information as prior knowledge in the abundance estimation.
  35. SplicingTypesAnno
    • Description : SplicingTypesAnno is a tool for the annotation and quantification of alternative splicing in RNA-seq datasets. SplicingTypesAnno annotates major alternative splicing types at the exon/intron level, with genome-scale or gene-scale annotation, and outputs a report in HTML plus additional BED files for IGV visualization.
  36. SplicingCompass
    • Description : SplicingCompass is a tool to predict differentially spliced genes between two separate conditions in RNA-seq datasets. SplicingCompass computes geometric angles between the high-dimensional vectors of exon read counts.
  37. flipflop
    • Description : flipflop is a tool to identify and quantify isoforms in RNA-seq data. The flipflop algorithm uses a network flow optimization technique to solve the sparse estimation problem.
  38. QuasR
    • Description : QuasR is a tool to quantify and annotate reads from RNA-seq, ChIP-seq, and Bis-seq. The QuasR package has tools for all analysis steps from sequence read preprocessing, alignment, and quality control to quantification.
  39. GPSeq
    • Description : GPSeq is a tool to quantify transcriptomes using RNA-seq data. The GPSeq algorithm uses a two-parameter generalized Poisson model to estimate the position-specific read counts.
  40. MMSEQ
    • Description : MMSEQ is a tool to estimate isoform expression in RNA-seq data. The MMSEQ algorithm uses a new statistical method that deconvolves the mapping of reads to haplotype-specific isoforms and works with paired-end reads.
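
Several entries in this catalog report abundances in length-normalized units such as FPKM and TPM. The following is a minimal sketch of the two standard formulas, assuming per-transcript fragment counts and transcript lengths in base pairs are already available. The function names and the example numbers are hypothetical and are not taken from any of the listed tools, which typically use effective lengths and more elaborate models.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no mapped fragments")
    # count / (length in kb) / (total fragments in millions)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    if scale == 0:
        raise ValueError("no mapped fragments")
    return [r * 1e6 / scale for r in rates]


# Hypothetical example: three transcripts in one sample.
example_counts = [100, 200, 300]
example_lengths = [1000, 2000, 1500]
example_fpkm = fpkm(example_counts, example_lengths)
example_tpm = tpm(example_counts, example_lengths)
```

TPM values always sum to one million within a sample, which makes proportions comparable across samples. FPKM values do not share that property.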

1.2.3 Bacterial genome

  1. EDGE-pro
    • Description : EDGE-pro (Estimated Degree of Gene Expression in PROkaryotes) is a tool to quantify gene expression in prokaryotes and archaea. The EDGE-pro algorithm can align overlapping gene regions.
  2. SeqTU
    • Description : SeqTU is a tool for the analysis of strand-specific RNA-seq data. The SeqTU algorithm uses a machine learning approach.
  3. Parseq
    • Description : Parseq is a tool to estimate transcription levels of microbial genomes in RNA-seq data. The Parseq algorithm uses a particle Gibbs algorithm.
  4. TSSer
    • Description : TSSer is a tool to detect transcription start sites in bacterial RNA-seq data.

2. Differential Expression Analysis

2.1 Pre-processing DEA

  1. PoissonSeq
    • Description : PoissonSeq is an R library for normalization, testing, and false discovery rate (FDR) estimation of RNA-seq data sets. The PoissonSeq algorithm uses a Poisson log-linear model.
  2. GENAVi
    • Description : GENAVi (Gene Expression Normalization Analysis and Visualization) is a tool to normalize, analyze, and visualize gene expression in human or mouse RNA-seq data. GENAVi provides a user-friendly GUI and does not require bioinformatics expertise to operate. GENAVi is available as a web-based tool and also installable on a local computer using docker.
  3. TCC
    • Description : TCC (Tag Count Comparison) is a tool for the differential analysis of tag counts in RNA-seq datasets. The TCC algorithm uses a multi-step normalization method based on a differentially expressed gene (DEG) elimination strategy (DEGES).
  4. alpine
    • Description : alpine is a tool to reduce systematic biases in the estimation of transcript abundances in RNA-seq datasets. The alpine algorithm uses sequence features in the analysis of the abundance.

2.2 Parametric

  1. ideal
    • Description : ideal is a tool for differential expression analysis of RNA-seq data. ideal is a Shiny app.
  2. CORNAS
    • Description : CORNAS is a tool for differential expression analysis of RNA-seq data. The CORNAS algorithm uses a Bayesian approach to compute the sequence coverage from concentrations of RNA-seq samples. It uses a posterior distribution to estimate the true gene count.
  3. TCseq
    • Description : TCseq is an R tool for analysis of quantitative and differential expression of RNA-seq data. The TCseq algorithm can also do cluster analysis and has functions for visualization of time-course data. It uses the generalized linear model (GLM).
  4. XBSeq
    • Description : XBSeq is an R tool for genome-wide expression analysis of RNA-seq data. The XBSeq algorithm uses a statistical approach in which observed signals are a convolution of real expression signals and sequencing noises. It assumes the reads that map on intergenic regions are distributed according to Poisson and distinguishes signals using the negative binomial distribution.
  5. BNBR
    • Description : BNBR is an R tool for the analysis of differential expression in RNA-seq data. The BNBR algorithm uses a new Bayesian negative binomial regression technique (BNB-R).
  6. IsoEM2
    • Description : A tool to estimate differential expression and confidence intervals in RNA-seq data. The IsoEM2 package integrates both IsoEM2 and IsoDE2. The IsoEM2 algorithm uses bootstrapping to evaluate expression levels and confidence intervals. It reports fragments per kilobase of transcript per million mapped reads (FPKM) and transcripts per million (TPM) for genes and isoforms. IsoDE2 uses the data generated by IsoEM2 to analyze differential expression.
  7. ABSSeq
    • Description : ABSSeq is a tool for differential gene expression analysis in RNA-seq data. The ABSSeq algorithm uses a negative binomial distribution approach to infer expression differences.
  8. NSMAP
    • Description : NSMAP (Nonnegativity and Sparsity constrained Maximum A Posteriori) is a tool for quantification of expression levels and identification of isoforms in RNA-seq data. The NSMAP algorithm uses a nonnegativity- and sparsity-constrained maximum a posteriori model to identify isoform structures and estimate expression levels simultaneously.
  9. WemIQ
    • Description : WemIQ is a tool for quantification of isoform expression and exon splicing ratios in RNA-seq data. The WemIQ algorithm uses the expectation-maximization (EM) approach and a Poisson model.
  10. DSGseq
    • Description : This program aims to identify differentially spliced genes from two groups of RNA-seq samples.
  11. DREAMSeq
    • Description : DREAMSeq is an R tool for the detection of differentially expressed genes in RNA-seq data. The DREAMSeq algorithm uses a double Poisson model to capture all data properties, such as underdispersion, overdispersion, and equidispersion.
  12. VCNet
    • Description : A tool to construct co-expressed gene networks from RNA-seq data. The VCNet algorithm uses a new statistical test on the correlation of a gene pair using the Frobenius norm (Euclidean norm).
  13. DESeq
    • Description : DESeq is a tool for hypothesis testing and differential gene expression analysis of RNA-seq data. The DESeq algorithm applies the negative binomial distribution and a Likelihood Ratio Test (LRT). It normalizes data with the relative log expression (median-of-ratios) method and circumvents a small sample size by incorporating information from all genes in a set of samples.
  14. DESeq2
    • Description : DESeq2 is a tool for differential gene expression analysis of RNA-seq data. DESeq2 is a new version of DESeq and can detect more differentially expressed genes (DEGs) than DESeq. However, it also seems to allow more false positives. The DESeq2 algorithm uses the negative binomial distribution and the Wald and Likelihood Ratio Tests.
  15. edgeR
    • Description : edgeR is a tool for differential expression (DE) analysis of RNA-seq, ChIP-seq, CAGE, and SAGE data with biological replicates. The edgeR algorithm uses information from all the genes, computes the dispersion using a weighted likelihood and F-test techniques. For the normalization, it can use the trimmed mean of M-values, upper-quartile (UQ) procedure, Relative Log Expression (RLE), and DESeq. It can compare two groups, paired and unpaired, or use a Generalized Linear Model (GLM). The upper-quartile (UQ) procedure is also applicable to single-cell RNA-seq (scRNA-seq).
  16. ImpulseDE
    • Description : ImpulseDE is an R tool for detecting differentially expressed genes (DEGs) in RNA-seq and scRNA-seq time-course data. ImpulseDE can report DEGs across time points in datasets with single or multiple conditions. It includes quality values for DEGs, impulse model parameters, fitted values for genes, and can use multi-threading.
  17. ImpulseDE2
    • Description : ImpulseDE2 is an R tool for detecting differentially expressed genes (DEGs) in time-course RNA-seq, ChIP-seq, ATAC-seq, and DNaseI-seq data sets. The ImpulseDE2 algorithm uses negative binomial noise and impulse models. It can also correct for batch and library construction effects.
  18. SARTools
    • Description : SARTools is an R package for differential expression analysis of RNA-seq data. SARTools uses DESeq2 and edgeR. The input consists of raw count data and experimental description files. It will then normalize, estimate dispersion, and analyze differential gene expression. The output is a tab-delimited file and optionally a report in HTML format.
  19. DEApp
    • Description : DEApp is a web-based tool for differential analysis of RNA-seq data. It uses edgeR, Limma-Voom, and DESeq2 for cross-validation.
  20. Cufflinks
    • Description : Cufflinks consists of a suite of tools for differential gene expression analysis of RNA-seq data. It assembles aligned reads into a set of transcripts and estimates their relative abundances. The Cufflinks suite consists of the following tools: cufflinks, cuffcompare, cuffmerge, cuffquant, cuffdiff, and cuffnorm.
  21. Cuffdiff 2
    • Description : Cuffdiff 2 is a tool to estimate differential expression at gene and transcript levels. It uses a negative binomial model, normalizes using the relative log expression method implemented in DESeq and the inter-sample normalization method Q, and reports fragments per kilobase of transcript per million mapped reads (FPKM). Cuffdiff 2 is a part of the Cufflinks suite of tools.
  22. MISO
    • Description : MISO (Mixture-of-Isoforms) is a tool for estimating expression levels of alternatively spliced genes and isoforms. The authors have implemented MISO as an alternative to the Cufflinks tool. MISO is no longer maintained, but it is available for download.
  23. rMATS
    • Description : rMATS is a tool to detect major differential alternative splicing types in RNA-seq data with replicates. The rMATS algorithm can use both paired and unpaired reads and computes p-values and false discovery rates based on a user-defined threshold.
      Alternative name: MATS.
  24. tweeDEseq
    • Description : tweeDEseq is a tool to analyze differential gene expression in RNA-seq data sets. The tweeDEseq algorithm uses the Poisson-Tweedie family of distributions.
  25. deGPS
    • Description : deGPS is a tool to analyze differential gene expression in RNA-seq data sets. The deGPS algorithm uses a normalization technique based on generalized Poisson distribution and tests using permutations.
  26. SpatialDE
    • Description : SpatialDE is a tool to identify spatially variable genes in data from multiplexed imaging or RNA-seq data. The SpatialDE algorithm can cluster genes for expression-based tissue histology.
  27. PLNseq
    • Description : PLNseq is a tool for the differential gene expression analysis (DGE) in RNA-seq data. The PLNseq algorithm uses a multivariate Poisson lognormal distribution for modeling the read count data.
  28. RDiff
    • Description : RDiff is a tool for the detection of differential RNA processing in RNA-seq data. The RDiff algorithm can identify and quantify novel and known isoforms. The RDiff provides a parametric test for annotated genomes and a non-parametric version for genomes where the annotation is incomplete.
  29. BM-DE
    • Description : BM-DE (Bayesian method of calling differential expression) is a tool for differential gene expression (DGE) analysis of RNA-seq data. The BM-DE algorithm models read counts at each position using Bayesian statistics and can analyze data without biological replicates.
  30. sSeq
    • Description : sSeq is a tool for differential gene expression (DGE) analysis using RNA-seq data. The sSeq algorithm uses the Negative Binomial (NB) distribution, and a shrinkage technique, and outputs expression as counts.
  31. EBSeq
    • Description : EBSeq is a tool to identify differentially expressed isoforms in RNA-seq data. EBSeq is based on empirical Bayesian methods.
  32. JunctionSeq
    • Description : JunctionSeq is a tool for the detection of differential splice junction usage. The JunctionSeq algorithm does not need an extra isoform assembly step. The JunctionSeq tool includes visualization functions.
  33. maSigPro
    • Description : maSigPro is an R tool to discover genes with sufficient differences in gene expression among experimental groups in time-course microarray and RNA-Seq experiments.
  34. FunSys
    • Description : FunSys is a tool for the analysis of differential gene expression (DGE) in RNA-seq data. FunSys can associate RNA-seq data with proteomics data.
  35. DEXUS
    • Description : DEXUS is a tool for the analysis of differential gene expression (DGE) in RNA-seq data where the conditions are unknown. The DEXUS algorithm uses a finite mixture of the negative binomial distribution to model read counts.
  36. diffcoexp
    • Description : diffcoexp is a tool to detect differentially co-expressed genes and gene pairs (links) in microarray data.
  37. svapls
    • Description : svapls is a tool for the identification of various sample-specific sources of heterogeneity of gene expression in RNA-seq datasets, producing an increasingly accurate expression pattern. The svapls algorithm uses Partial Least Squares regression statistics to obtain the hidden signals of sample-specific heterogeneity to identify phenotypes.
  38. variancePartition
    • Description : variancePartition is a tool to partition and visualize gene variation diverging from a general trend in RNA-seq datasets. The variancePartition algorithm uses a linear mixed model and partitions traits with differences in, for example, disease status, sex, cell or tissue type, genetic background, experimental conditions, and technical variation.
  39. BitSeq
    • Description : BitSeq is an R tool for differential gene expression (DGE) analysis in RNA-seq datasets. The BitSeq algorithm uses Bayesian inference and Markov chain Monte Carlo sampling to model the data.
  40. gCMAP
    • Description : gCMAP is a tool for differential gene expression (DGE) analysis of RNA-seq data sets.
  41. anota2seq
    • Description : anota2seq is a tool for the analysis of translational efficiency and differential expression analysis for polysome-profiling and ribosome-profiling studies that are quantified by RNA-seq or DNA-microarray.
  42. ECFS-DEA
    • Description : ECFS-DEA (feature selection tool for differential expression analysis) is a tool to select features for differential expression analysis in RNA-seq data.
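
The relative log expression (RLE) normalization mentioned in the DESeq, edgeR, and Cuffdiff 2 entries above is commonly computed as a median of ratios. The following is a minimal pure-Python sketch of that idea, assuming a small genes-by-samples count matrix held as nested lists. The function name and the example matrix are hypothetical, and the listed packages implement this with additional safeguards.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts, one per sample.
    Returns one scaling factor per sample.
    """
    n_samples = len(counts[0])
    # Only genes with a positive count in every sample have a finite
    # geometric mean on the log scale, so the others are skipped.
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    if not log_rows:
        raise ValueError("no gene has positive counts in all samples")
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical example: sample 2 was sequenced twice as deeply as sample 1.
example_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
example_factors = size_factors(example_counts)
normalized = [[c / f for c, f in zip(row, example_factors)]
              for row in example_counts]
```

Dividing each sample's counts by its size factor puts the samples on a common scale before dispersion estimation and testing. Using the median makes the factor robust to a minority of strongly differentially expressed genes.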

2.3 Non-parametric

  1. PennSeq
    • Description : PennSeq is a tool for isoform-specific gene expression quantification in RNA-seq data. The PennSeq algorithm uses a statistical method permitting each isoform to have a separate specific distribution and uses a non-parametric procedure.
  2. DSS
    • Description : DSS is an R library for differential expression analysis in RNA-seq count-based data and differential methylation (DML/DMRs) in bisulfite sequencing (BS-seq) data. For the estimation of the dispersion of the count data, the DSS algorithm uses an empirical Bayes shrinkage estimate (Gamma-Poisson or Beta-Binomial distributions). To handle varied sequencing coverages, it uses comparisons between two groups to obtain dispersion shrinkage for multiple factors. For differential expression, it uses the Wald test. DSS fits the Generalized Linear Model (GLM) using edgeR (see links). The DSS algorithm tends to under-estimate the false discovery rate (FDR).
  3. limma
    • Description : The limma (limma-voom) tool is for the analysis of gene expression of microarray and RNA-seq data. The limma algorithm uses a generalized linear model (GLM), log-normal distribution, trimmed mean of M-values, t- and F-tests.
  4. DEApp
    • Description : DEApp is a web-based tool for differential analysis of RNA-seq data. It uses edgeR, Limma-Voom, and DESeq2 for cross-validation.
  5. NPEBseq
    • Description : NPEBseq is an R tool for differential gene expression analysis of RNA-seq data. The NPEBseq algorithm uses an empirical Bayesian model with the prior distribution estimation from the data itself. NPEBseq can estimate differential expression on both gene and exon levels.
  6. FDM
    • Description : FDM is a tool for the analysis of differential gene expression in RNA-seq data sets.
  7. DegPack
    • Description : DegPack is a web-based tool for the identification of differentially expressed genes in RNA-seq data. The DegPack algorithm uses PoissonSeq and SAMseq methods.
  8. DiffSplice
    • Description : DiffSplice is a tool for analysis of differential splicing in RNA-seq data. DiffSplice uses a non-parametric permutation test to determine differences in transcription levels. The DiffSplice algorithm doesn't use annotations, and the results are viewable in the UCSC genome browser.
  9. Guide
    • Description : Guide (Genome Informatics Data Explorer) is a tool for the differential expression analysis of RNA-seq and microarray data. The Guide package is aimed at wet-lab biologists and uses the limma tool for the analyses but does not require specific programming knowledge.
  10. RDiff
    • Description : RDiff is a tool for the detection of differential RNA processing in RNA-seq data. The RDiff algorithm can identify and quantify novel and known isoforms. The RDiff provides a parametric test for annotated genomes and a non-parametric version for genomes where the annotation is incomplete.
  11. GSAR
    • Description : GSAR (Gene Set Analysis in R) is a tool for gene set analysis (GSA). The GSAR algorithm uses multivariate non-parametric tests of a null hypothesis against specific alternative hypotheses, for example, differences in mean, variance, or net correlation structure. GSAR also includes visualization functions.
  12. rSeqNP
    • Description : rSeqNP is a tool for the non-parametric detection of differential gene expression in RNA-seq datasets. The rSeqNP algorithm uses permutation tests to assess statistical significance.
  13. fishpond
    • Description : fishpond is a tool for the expression analysis of RNA-seq data. The fishpond algorithm uses a nonparametric method.
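
Permutation tests, which the DiffSplice and rSeqNP entries above rely on, assess significance without assuming a distribution for the data. The following is a minimal sketch of an exact two-group permutation test on the difference in means for a single gene, assuming a handful of expression values per group. The function name and the example values are hypothetical, and this is not the procedure implemented by either tool.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means.

    Enumerates every way of relabelling the pooled values into groups of
    the original sizes, so it is only practical for small sample sizes.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    n_extreme = 0
    n_total = 0
    for idx in combinations(range(len(pooled)), n_a):
        stat = abs_mean_diff(sum(pooled[i] for i in idx))
        n_total += 1
        # Small tolerance so ties with the observed value are counted.
        if stat >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total


# Hypothetical expression values for one gene in two conditions.
p_separated = permutation_p_value([1, 2, 3], [10, 11, 12])
p_identical = permutation_p_value([5, 5], [5, 5])
```

With three samples per group there are only 20 relabellings, so the smallest attainable two-sided p-value is 2/20 = 0.1. This is one reason non-parametric methods generally need more replicates than parametric ones to reach conventional significance thresholds.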

2.4 Power analysis

  1. powsimR
    • Description : powsimR is a tool to simulate differential expression RNA-seq data. The powsimR package can simulate read counts, model the mean, dispersion, dropout distributions, and compute the power of sample sizes.
  2. RNASeqPower
    • Description : RNASeqPower is an R tool for estimating power and computing RNA-seq sample sizes.
  3. RNASeqPowerCalculator
    • Description : RNASeqPowerCalculator is a tool for computing the sample size and estimating the power of RNA-seq data sets.
  4. Scotty
    • Description : Scotty is a web-based tool for estimating the statistical power in RNA-seq data samples. It helps to optimize the sequencing depth and the sample sizes.
  5. RnaSeqSampleSize
    • Description : RnaSeqSampleSize is a tool for estimating the power and sample sizes in RNA-seq data sets.
  6. PowerExplorer
    • Description : PowerExplorer is a tool for the power estimation of multiple sizes of samples using simulated data. The PowerExplorer algorithm estimates the distribution parameters from the input data.
  7. PROPER
    • Description : PROPER is a tool for the assessment of power in RNA-seq data. The PROPER algorithm uses a semi-parametric simulation by a computation based on the experimental data and provides stratified power and false discovery cost.
  8. SSPA
    • Description : SSPA is a tool to calculate sample size and power for RNA-seq and microarray data. The SSPA algorithm uses pilot-data for the evaluation.
  9. samExploreR
    • Description : samExploreR is an R package for the analysis and exploration of sequencing experiments by simulation using subsampling. The algorithm works for data produced by Illumina GA, HiSeq, MiSeq, ABI SOLiD, Roche GS-FLX, and LifeTech Ion PGM Proton sequencing machines.
  10. subSeq
    • Description : subSeq is a tool to determine a suitable coverage for RNA-seq experiments. The subSeq algorithm uses subsampling to determine a point where both the accuracy and power coincide to help in the design of the experiments.
  11. erccdashboard
    • Description : erccdashboard is a tool for the assessment of differential gene expression (DGE) in RNA-seq data. The erccdashboard algorithm uses external spike-in RNA control ratio mixtures.
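
Subsampling, as described in the samExploreR and subSeq entries above, simulates a shallower sequencing run from an existing data set. The following is a minimal sketch that thins a vector of gene counts by keeping each read independently with a fixed probability. The function name, the seed parameter, and the example counts are hypothetical, and the listed packages may subsample differently.

```python
import random


def subsample_counts(counts, proportion, seed=0):
    """Keep each read independently with probability `proportion`.

    counts: raw read counts per gene.
    Returns thinned counts, simulating a run at reduced sequencing depth.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return [sum(rng.random() < proportion for _ in range(c)) for c in counts]


# Hypothetical counts for four genes, thinned to several depths.
example_counts = [500, 120, 8, 0]
depth_series = {p: subsample_counts(example_counts, p, seed=1)
                for p in (0.1, 0.5, 1.0)}
```

Rerunning a differential expression analysis on each thinned data set and tracking how many genes remain significant gives an empirical view of how power depends on depth.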

3. Functional Profiling

3.1 Enrichment Analysis (GSEA), annotation, other

  1. ACTION
    • Description : A tool to detect a functional identity of cells from RNA-seq expression data profiles. The ACTION algorithm uses cells' expression profile to classify cells by dominant functions and reconstructs gene regulatory networks involved in mediating identities. NOTE! The source may be available from the Authors. The given GitLab repository does not exist.
  2. ccfindR
    • Description : ccfindR (Cancer Clone findeR) is a tool for the analysis of cancer cells from single-cell RNA-seq data. ccfindR contains functions for quality control, unsupervised clustering, and visualization. The ccfindR algorithm uses Bayesian non-negative matrix factorization for feature selection and clustering.
  3. RNAscClust
    • Description : RNAscClust is a pipeline tool for clustering RNA sequences. The RNAscClust algorithm uses minimum free energy and a graph kernel-based strategy.
  4. CEM
    • Description : A tool for assembling transcriptome sequences and estimating expression levels in RNA-seq data. The CEM algorithm uses a quasi-multinomial distribution model to detect RNA-seq biases, such as mappability and positional sequencing biases.
  5. VDJSeq-Solver
    • Description : VDJSeq-Solver is a tool to identify clonal lymphocyte populations from paired-end RNA-seq data from mRNA neoplastic cells. It identifies the main clone characterizing the tissue based on the most abundant V(D)J rearrangement. Note that the source (.tar.gz) is 3.9 GB.
  6. GENIE3
    • Description : An R tool for prediction of gene regulatory networks from RNA-seq data. The GENIE3 algorithm uses the random forest or Extra-Trees approach.
  7. RSVP
    • Description : RSVP is a tool to predict protein-coding gene isoforms in RNA-seq data. The RSVP algorithm uses ORF graphs, genomic DNA evidence, and aligned RNA-seq reads for the predictions.
  8. Arboreto
    • Description : Arboreto is a tool to infer gene regulatory networks in RNA-seq data. The Arboreto framework uses GRNBoost2 and an improved GENIE3 version. GRNBoost2 uses a gradient boosting approach for gene network inference.
  9. AUCell
    • Description : An R tool for identification of gene signatures and modules in RNA-seq data. The AUCell algorithm detects and ranks enriched gene sets by computing the receiver operator characteristics, i.e., the area under the curve (AUC).
  10. ACTINN
    • Description : ACTINN (Automated Cell Type Identification using Neural Networks) is a tool to identify cell types in single-cell RNA-seq data. The ACTINN algorithm uses a neural network with three hidden layers. The publication describes the training on a mouse cell type atlas (Tabula Muris Atlas) and a human immune cell dataset, and the results of predicting cell types for mouse leukocytes, human PBMCs, and human T cell subtypes.
  11. LINCS_RNAseq
    • Description : LINCS_RNAseq is a tool for the analysis of RNA-seq data. The LINCS_RNAseq algorithm uses the unique molecular identifier data from the LINCS Drug Toxicity Signature (DToxS) Generation Center at the Icahn School of Medicine at Mount Sinai in New York. The pipeline has three steps: split, align, and merge.
  12. ScanNeo
    • Description : A pipeline tool for prediction of neoepitopes derived from small to large-sized indels in RNA-seq data. The ScanNeo pipeline comprises three steps: Indel discovery, annotation and filtering, and neoantigen prediction.
  13. ORE
    • Description : ORE (Outlier-RV Enrichment) is a tool for the identification of non-coding rare variants in RNA-seq data. The ORE algorithm detects biologically significant outliers having more or less rare variants than expected by chance. Requires: Python >= 3.5.0, bedtools >= 2.2.7.0, samtools >= 1.3, and bcftools >=1.6.
  14. SCENIC
    • Description : An R pipeline for inference of gene regulatory networks and identification of cell states in RNA-seq data sets. The SCENIC (Single-Cell rEgulatory Network Inference and Clustering) workflow utilizes three separate packages, GENIE3 or GRNBoost2, RcisTarget, and AUCell. A Python implementation of this workflow is faster than this initial R version. See links for pySCENIC. The current version supports human, mouse, and Drosophila melanogaster.
  15. PRAPI
    • Description : PRAPI is a tool for the analysis of post-transcriptional regulation in Iso-Seq data. PRAPI can analyze alternative splicing, alternative transcription initiation, alternative cleavage and polyadenylation, natural antisense transcripts, and circular RNAs. The PRAPI algorithm can combine Iso-Seq with RNA-seq or PAS-seq.
  16. PathwaySplice
    • Description : An R tool for analysis of splicing pathways. The PathwaySplice algorithm adjusts the number of exon junctions, visualizes a selection bias, supports Gene Ontology terms, user-defined gene sets, distinguishes primary genes in a pathway, and arranges pathways into an enrichment map.
  17. Snaptron
    • Description : Snaptron is a tool to search compiled RNA-seq data. The Snaptron algorithm uses R-tree, B-tree, and inverted indexing. It can also score splice junctions by tissue specificity, the relative frequency of splicing patterns, and various other criteria.
  18. BackSPIN
    • Description : BackSPIN is a tool for clustering RNA-seq data. The BackSPIN algorithm, based on SPIN, is a divisive biclustering method.
  19. Wx
    • Description : Wx is a tool to select the optimal set of genes for gene expression. Wx uses a Keras (Python Deep Learning library) neural network to learn features and to generate a discriminative index.
  20. MSigDB
    • Description : MSigDB (The Molecular Signatures Database) is a database comprising annotated gene sets for use by gene set enrichment analysis (GSEA) tools. The web-based tool allows keyword search, browsing by collection, name, and annotation, computation of overlaps, categorizing, and viewing of expression profiles as well as downloading. Mouse and human versions are available in R format at the Walter and Eliza Hall Institute of Medical Research (see links).
  21. goseq
    • Description : goseq is a tool for functional profiling by analysis of gene ontology (GO) of RNA-seq data. The goseq algorithm can also estimate the effect of bias.
  22. SeqGSEA
    • Description : SeqGSEA is an R tool for gene set enrichment analysis (GSEA) of RNA-seq data. The SeqGSEA algorithm uses a negative binomial distribution model for count data, computes and scores differential splicing and expression, estimates biological variability, and performs gene set enrichment analysis.
  23. GAGE
    • Description : GAGE (Generally Applicable Gene-set Enrichment) is a tool for gene set enrichment (GSEA) and pathway analysis of RNA-seq and microarray data. The GAGE package contains functions for GSEA, processing of results, reporting, batch analyses, and comparison between studies.
  24. GSAASeqSP
    • Description : GSAASeqSP is a tool for gene set association analysis of sequence read count data. The algorithm includes functions for gene-level and gene set-level statistics.
  25. SeqGSA
    • Description : SeqGSA is an R tool for the analysis of gene sets in RNA-seq data. The SeqGSA algorithm accounts for the effects of variable gene lengths.
  26. Enrichr
    • Description : Enrichr is a web-based tool for gene over-representation analysis using functional annotations. Enrichr has more than 30 geneset libraries and can visualize results with JavaScript library Data-Driven Documents (D3).
  27. clusterProfiler
  28. g:Profiler
    • Description : g:Profiler is a web-based tool for gene enrichment analysis of gene ontologies, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, and protein-protein interaction analyses. g:Profiler updates Ensembl data quarterly. An R client is available and it includes the following four tools: 1. g:GOSt for functional enrichment analysis and visualization, 2. g:Convert for identifier conversion, 3. g:Orth for orthology searching, and 4. g:SNPense for Single nucleotide polymorphism (SNP) mapping by 'rs' identifiers.
  29. GOEAST
    • Description : GOEAST (Gene Ontology Enrichment Analysis) is a Gene Ontology (GO) enrichment analysis tool. It can identify over-represented GO terms and uses several different data sources and species.
  30. GOrilla
    • Description : GOrilla is a web-based tool to identify and visualize enriched GO terms of a list of genes.
  31. ToppGene Suite
    • Description : ToppGene Suite consists of 1. ToppFun for functional enrichment analysis based on transcriptome, ontology, proteome, phenotype, and pharmacome, 2. ToppGene for prioritization of candidate genes, 3. ToppNet for ranking genes based on topological features, and 4. ToppGenet for prioritization of neighboring genes based on protein-protein interaction networks.
  32. WebGIVI
    • Description : WebGIVI is a web-based tool for gene enrichment and visualization to obtain gene symbols and iTerm pairs. It uses Cytoscape or Concept Map.
  33. PANTHER Database and Tools
    • Description : Protein classification to ease high-throughput analyses. Classification is by family and subfamily, molecular function, biological process, and pathway. Updated single-nucleotide polymorphisms (SNP) scoring tool.
  34. DAVID Bioinformatics Resources
    • Description : DAVID Bioinformatics Resources (Database for Annotation, Visualization, and Integrated Discovery) is a large resource for visualization, functional annotation, and clustering tools using such databases as GO, KEGG pathways, Biocarta, and UniProt/SwissProt.
  35. GO enrichment analysis tool
    • Description : A tool for gene enrichment analysis to find over- and under-represented GO terms in a gene set.
  36. DOSE
    • Description : DOSE is a tool for annotating genes based on Disease Ontology (DO). The DOSE includes functions for enrichment analyses using a hypergeometric model.
  37. TS-GOEA
    • Description : TS-GOEA is a web-based tool for enrichment analysis using data from more than 50 human tissues. The TS-GOEA algorithm uses a hypergeometric distribution for testing, annotations from the Gene Ontology Resource (GO), and expression data from the GTEx Portal.
  38. GSEA (UCSD)
    • Description : GSEA (Gene Set Enrichment Analysis) is a tool for gene enrichment analysis. The GSEA algorithm assesses differences between gene sets and uses the Molecular Signatures Database (MSigDB) for annotation.
  39. SIGN
    • Description : SIGN (Similarity Identification in Gene expressioN) is a tool for the analysis of expression patterns and pathways in RNA-seq data. The SIGN algorithm identifies similarities between biological sample sets.
  40. ToppCluster
    • Description : ToppCluster is a web-based tool for the enrichment and network analysis of mammalian RNA-seq data. ToppCluster allows visualization using R, TreeView, GenePattern, Cytoscape, and Gephi.
  41. PINTA
    • Description : PINTA is a web-based tool to prioritize candidate genes based on their genome-wide protein-protein interaction network neighborhood and differential gene expression. PINTA works for human, mouse, rat, worm, and yeast data.
  42. DaMiRseq
    • Description : DaMiRseq is an R tool for feature selection, classification, and bias removal in RNA-seq data. The input consists of raw counts. DaMiRseq has visualization functions for heatmaps, RLE, MDS, or correlation plots.
  43. Mergeomics
    • Description : Mergeomics is a tool to recognize pathological pathways, regulatory pathways, and key regulators in omics data. The Mergeomics algorithm consists of two modules: 1. Marker set enrichment analysis (MSEA) and 2. Weighted Key Driver Analysis (wKDA).
  44. goSTAG
    • Description : goSTAG is a tool for GO enrichment analysis of RNA-seq data sets or data from other high-throughput technologies. The algorithm uses Fisher's exact test and GO subtrees for annotation to describe biological themes.
  45. MCbiclust
    • Description : MCbiclust (Massively Correlated Biclustering) is a tool to cluster correlated gene expression in RNA-seq data sets. The MCbiclust algorithm finds the maximum strength correlation matrix and includes visualization tools.
  46. CrossHub
    • Description : CrossHub is a tool for the analysis of methylome data using The Cancer Genome Atlas (TCGA). CrossHub identifies possible transcription factor gene (TF-gene) interactions. The CrossHub algorithm has RNA-Seq analysis functions for differential expression analysis, prediction of regulatory TF and miRNA, analysis of methylation profiles, and RNA-Seq vs. clinical (TNM, stage, follow-up) correlation analysis.
  47. TSSAR
    • Description : TSSAR is a web service for the identification of bacterial transcription start sites (TSS) in RNA-seq data. The TSSAR algorithm uses Skellam distribution statistics to evaluate enriched transcripts.
  48. CracTools
    • Description : CracTools provides a set of command-line tools for the analysis of single nucleotide variations (SNVs), insertions/deletions (indels), splice junctions, and chimeric reads in RNA-seq data.
  49. IUTA
    • Description : IUTA is a tool for the detection of isoform usage in RNA-seq data sets generated by Illumina paired-end sequencing. The IUTA algorithm uses two probability distributions under the Aitchison geometry to test equal mean values.
  50. Heat*seq
    • Description : Heat*seq is a web-based tool to compare ChIP-seq, RNA-seq, and CAGE experiment data with public data.
  51. asSeq
    • Description : asSeq is a tool for expression quantitative trait locus (eQTL) mapping. The asSeq algorithm combines the total read count and allele-specific expression to identify cis- and trans-eQTL.
  52. UnoSeq
    • Description : UnoSeq is a tool to analyze expression profiles of organisms that lack genome and/or transcriptome information using Illumina RNA-seq data.
  53. spliceR
    • Description : spliceR is a tool to classify alternative splicing and assess the coding potential of RNA-seq data. The spliceR algorithm can detect exon skipping, intron retention, alternative first or last exon usage, donor and acceptor sites, and mutually exclusive exon events. spliceR produces genomic coordinates for all differentially spliced sequences and predicts the coding potential and possible nonsense-mediated decay for each of the transcripts.
  54. networkBMA
    • Description : networkBMA is a tool to infer gene regulatory networks. The networkBMA algorithm uses a Bayesian inference method combining external information to increase accuracy.
  55. 3USS
    • Description : 3USS is a web-based tool to detect alternative 3 prime UTRs in RNA-seq data.
  56. CAMUR
    • Description : CAMUR (Classifier with Alternative and MUltiple Rule-based models) is a tool to obtain knowledge by extracting several different classification models of gene features in RNA-seq data. The CAMUR algorithm uses an iterative approach to compute a classification model and the power set of its features, and includes an ad-hoc knowledge database and query tool.
  57. MetaDiff
    • Description : MetaDiff is a tool to analyze isoform expression in RNA-seq data. The MetaDiff algorithm uses a random-effects meta-regression.
  58. FlaiMapper
    • Description : FlaiMapper is a tool to identify and annotate small non-coding RNAs (sncRNAs) in RNA-seq datasets. The FlaiMapper algorithm uses small RNA-seq read alignments to annotate fragments by peak detection on the start and end position densities. The final step is filtering and reconstruction.
  59. RNA-eXpress
    • Description : RNA-eXpress is a tool to annotate new biologically significant transcript features in RNA-seq datasets.
  60. WaveQTL
    • Description : WaveQTL is a tool for the analysis of functional phenotypes in RNA-seq, ChIP-seq, and DNase-seq data. The WaveQTL algorithm uses wavelet-based techniques for the analyses and tests for the association among genetic variants and the underlying function.
  61. GETUTR
    • Description : GETUTR is a tool to estimate and quantify the usage of 3 prime UTRs in RNA-seq datasets.
  62. ISoLDE
    • Description : ISoLDE is a tool to identify genes that are imprinted in RNA-seq datasets.
  63. Transcriptator
    • Description : Transcriptator is a pipeline tool for GO enrichment analysis in RNA-seq datasets that lack a reference genome.
  64. IsoSCM
    • Description : IsoSCM (Isoform Structural Change Model) is a tool to annotate 3 prime UTRs in RNA-seq data. The IsoSCM algorithm uses change-point analysis to improve annotation assessment.
  65. GIREMI
    • Description : GIREMI (genome-independent identification of RNA editing by mutual information) is a tool to predict adenosine-to-inosine editing in RNA-seq data.
  66. GGEA
    • Description : GGEA (Gene Graph Enrichment Analysis) is a tool for the detection of enriched gene sets in RNA-seq datasets. The GGEA algorithm uses prior knowledge obtained from gene regulatory networks.
  67. zFPKM
    • Description : zFPKM is a tool for the identification of biologically relevant genes using the zFPKM normalization method. The algorithm operates with gene-level data using FPKM or TPM.
  68. Annocript
    • Description : Annocript is a tool for the annotation of transcriptomes. The package uses BLAST with UniProt, NCBI Conserved Domain Database and Nucleotide divisions, Gene Ontology, UniPathways, and the Enzyme Commission.
  69. EBSeqHMM
    • Description : EBSeqHMM is a tool for the identification of isoforms with variable expression profiles over time. The EBSeqHMM algorithm uses empirical Bayes mixture modeling and an auto-regressive hidden Markov model to assess the dependency of gene expression and conditions.
  70. QuSAGE
    • Description : QuSAGE (Quantitative Set Analysis of Gene Expression) is a tool to quantify and analyze differential gene expression (DGE) and gene-to-gene correlations in RNA-seq data sets. The QuSAGE algorithm quantifies gene-set activity with a complete probability density function and computes p-values and confidence intervals.
  71. seq2pathway
    • Description : seq2pathway is a tool for the functional gene-set analysis of genomic loci in RNA-seq data sets. The seq2pathway algorithm assigns genes to the pathways and computes gene-level pathway scores.
  72. ToPASeq
    • Description : ToPASeq is an interface tool for the pathway analysis of RNA-seq and microarray data sets. ToPASeq can run the following tools: SPIA, DEGraph, TopologyGSA, TAPPA, PRS, and PWEA. ToPASeq also has functions for visualization, importing, and manipulation of pathways.
  73. GOexpress
    • Description : GOexpress is a tool to identify Gene Ontology (GO) terms in RNA-seq data sets. It uses a supervised clustering approach and can concurrently classify samples from many experimental groups. The GOexpress package also has functions to visualize gene expression profiles.
  74. Genexpi
    • Description : Genexpi is a tool for the identification of sigma factors using a combination of data from literature and time-course gene expression data. The CyGenexpi (see links) plugin integrates Genexpi with the Cytoscape tool. The Genexpi tool is available as an R package.
  75. KOBAS
    • Description : KOBAS (KEGG Orthology-Based Annotation System) is a tool for the annotation of sequences by KEGG Orthology terms. KOBAS also identifies enriched pathways and uses KEGG Pathway, PID, BioCyc, Reactome, Panther and human data from OMIM, KEGG Disease, FunDO, GAD, NHGRI, and GWAS databases.
  76. SplAdder
    • Description : SplAdder is a tool to analyze alternative splicing using RNA-seq alignment data. The SplAdder algorithm uses annotation and RNA-Seq read alignments to compute a splicing graph, then quantifies the splicing events, which may be useful in differential analysis.
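
Several of the enrichment tools listed above (for example TS-GOEA with its hypergeometric test and goSTAG with Fisher's exact test) rest on the same over-representation calculation. The following is a minimal sketch of that calculation, not the implementation of any listed tool; the function name, parameter names, and the example numbers are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, selected, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    population: total number of genes in the background (assumed input)
    annotated:  background genes carrying the annotation term
    selected:   genes in the list being tested (e.g. differentially expressed)
    overlap:    selected genes that carry the annotation term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20 background genes, 5 annotated, 5 selected, all 5 overlap.
p = hypergeom_enrichment_pvalue(20, 5, 5, 5)
print(p)
```

The upper tail computed here is equivalent to a one-sided Fisher's exact test on the corresponding 2x2 table. In practice a result like this would be computed per annotation term and then corrected for multiple testing, which this sketch does not do.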

3.2 Comparison With Genome

  1. VARUS
    • Description : VARUS is a tool for genome annotation using RNA-seq read data. The VARUS algorithm uses reads from NCBI's Sequence Read Archive and depends on samtools, bamtools, fastq-dump, and STAR or HISAT2.
  2. rnaSeqMap
    • Description : rnaSeqMap is a tool that contains functions for the analysis of RNA-seq datasets using coverage profiles of multiple samples. The tools include functions for the analysis of sequence coverage, significance, and splicing, and are useful, for example, for finding novel transcripts.