67 Free DNA Sequence Analysis Tools - Software and Resources


General DNA, RNA sequence Analysis Tools

  1. Alfresco
    • Description : Alfresco is a software tool for comparative genome sequence analysis. The algorithm compares pairs of likely homologous sequence regions from different species and visualizes results from external analysis programs to detect functional sequence domains. External software tools may perform, for example, repeat masking, sequence alignment, database searching, protein homology detection, regulatory sequence prediction, and gene prediction.
  2. ALP
    • Description : ALP (Ascending Ladder Program) is a tool to calculate the statistical parameters in the modified Gumbel distribution for BLAST. The ALP algorithm computes E-values for random local DNA-DNA and protein-protein alignments, gap costs, and character abundances for any substitution matrix. FALP (Frameshift Ascending Ladder Program) is a tool for comparable tasks for frameshifting DNA-protein alignments. The tools are available as a library or a standalone implementation.
  3. Arioc
    • Description : Arioc is a set of tools to align short bisulfite-treated DNA sequences (BS-seq reads) to long reference DNA sequences. The Arioc algorithm runs on both GPU and CPU and uses parallel sort and reduction routines to identify the locations of likely alignments.
  4. AutoCSA
    • Description : AutoCSA (Automatic Comparative Sequence Analysis) is a software tool to detect short (approximately 1 to 50 base pairs long) heterozygous and homozygous mutations, consisting of insertions and deletions (indels), in capillary sequence traces.
  5. BaMM
    • Description : BaMM is a web-based server with four tools: 1. de novo discovery of enriched motifs in nucleotide sequences, 2. motif finding in nucleotide sequences, 3. sequence searching of the BaMM database for transcription factor motifs, and 4. keyword searching of the motif database. The BaMM algorithm uses Bayesian Markov Models (BaMMs) of order 4 together with the AvRec score, the average recall over TP-to-FP ratios between 1 and 100.
  6. BATS
    • Description : BATS (Basic Analysis Toolkit for biological Sequences) is a software tool and library for the analysis of local sequence alignments by string matching and of global sequence alignments by longest common subsequence (LCS) using affine or concave gap cost functions. BATS can also filter strings from sequences, compute statistical significance as a z-score, and generate models.
  7. Bioinformatics Toolkit
    • Description : Bioinformatics Toolkit is a web-server consisting of a collection of sequence analysis tools.
      Search tools: HHblits, HHpred, HMMER, PatternSearch, ProtBLAST, PSI-BLAST
      Alignment: Alignmentviewer, Clustal Omega, Kalign, MAFFT, MSAProbs, MUSCLE, T-Coffee
      Sequence Analysis: Aln2Plot, DeepCoil, DeepCoil2, HHrepID, MARCOIL, PCOILS, REPPER, TPRpred
      Secondary Structure: Ali2D, HHomp, Quick2D
      Tertiary Structure: MODELLER, SamCC
      Classification: ANCESCON, CLANS, MMseqs2, PhyML
      Utilities: 6FrameTranslation, BackTranslator, FormatSeq, HHfilter, RetrieveSeq, Seq2ID, Reformat
  8. BioWord
    • Description : BioWord is an add-on for the Microsoft Word 2007 and 2010 word processors to manipulate DNA sequences. BioWord editing functions include reverse-complementing, translating, sequence searching, pair-wise sequence alignment, motif discovery, generation of consensus logos representing multiple sequence alignments (MSA), and FASTA formatting.
  9. BuddySuite
    • Description : BuddySuite is a collection of four related tools:
      1. SeqBuddy is a tool to handle FASTA, GenBank, and NEXUS sequence file formats. SeqBuddy includes functions to manipulate and analyze sequence data using 50+ separate tool modules.
      2. AlignBuddy: 30 separate tool modules to read, write, analyze, and manipulate PHYLIP, Stockholm, and NEXUS sequence alignment files.
      3. PhyloBuddy: Consists of 18 tool modules to manage and manipulate phylogenetic trees in NEXUS, Newick, and NeXML formats.
      4. DatabaseBuddy: Contains functions to search the NCBI, UniProt, and Ensembl databases. The DatabaseBuddy algorithms can sort and filter the search results.
  10. CAFE
    • Description : CAFE is a software tool using alignment-free methods to compute distance or dissimilarity by k-mer counts, background-adjusted k-mer counts, measures based on the presence or absence of k-mers, and 25 additional measures. CAFE visualization includes heatmaps, two-dimensional projection by principal coordinate analysis (PCoA), dendrograms, networks, and sequence clustering by the neighbor-joining algorithm.
  11. CorGen
    • Description : CorGen is a web-based tool to measure long-range correlations in DNA sequences characterized by a power-law decay of the autocorrelation function of the GC-content. CorGen also has a function that generates random DNA sequences with user-specified parameters, alternatively by using the parameters obtained from another DNA sequence.
  12. cpgplot
    • Description : cpgplot is a tool for plotting and identifying CpG islands in nucleotide sequences. The cpgplot algorithm evaluates overlapping windows along a sequence and by default defines a CpG island where the percent G + C is more than 50% and the observed vs. expected CpG ratio is over 0.6. By default, the region must be at least 200 bases long and span at least 10 windows.
  13. cpgplot_(EBI)
    • Description : Cpgplot or "EMBOSS Cpgplot" is a web-based tool at EBI to recognize and plot CpG islands in nucleotide sequences. See also Cpgplot
  14. DAMBE7
    • Description : DAMBE7 is a tool for genomic and phylogenetic sequence data analysis. The DAMBE7 package includes functions for 1. Sequence alignment, 2. Molecular phylogenetics, 3. Position weight matrix to analyze sequence motifs, 4. Perceptron for classification of sequence motifs, 5. Gibbs sampler, 6. Hidden Markov models, 7. Secondary structure prediction, 8. tRNA anticodon identification, 9. Codon usage bias, 10. Computation of isoelectric point, and 11. Peptide mass fingerprinting. DAMBE7 works with a variety of well-known sequence formats.
  15. DIAL
    • Description : DIAL (dihedral alignment) is a web-based tool for structure-based RNA sequence alignment. The DIAL algorithm does not require a reference genome and utilizes nucleotide sequence, dihedral angle, and nucleotide base-pairing similarity. DIAL includes functions for Needleman-Wunsch (global), Smith-Waterman (local), and motif search (global-semi global) alignments, and a viewer for 3-dimensional superposition of query and target.
  16. Dotter
    • Description : DOTTER is a tool to render dot-matrix plots between biological sequences, i.e., DNA and protein sequences. The main feature of DOTTER is that it allows the user to interactively adjust the alignment stringency using a 'Greyramp' tool.
  17. eShadow
    • Description : eShadow is a web-based tool for comparative nucleotide sequence analysis. The eShadow algorithm uses two statistical methods and can train an underlying Hidden Markov Model to predict functional sequences.
  18. IgDiscover
    • Description : IgDiscover is a tool for the analysis of antibody repertoires. The IgDiscover algorithm identifies new V genes for heavy chains and for kappa and lambda light chains, i.e., VH, VK, and VL genes.
  19. Orchid
    • Description : Orchid is a machine learning framework to manage, annotate, and analyze cancer mutations to support the understanding of tumor genetic data. Example of usage: sub-typing aggressive vs. non-aggressive prostate cancer using mutational profiles in tumor sequence data. NOTE: Orchid requires code or data under separate licenses or copyrights restricting the usage to non-commercial activities.
  20. Pegasys
    • Description : Pegasys is a web-based service containing numerous tools for sequence analyses, such as Agave, Alignment Tools, BRENDA, CpGAT, Edit Tools, Emboss, Information Tools, Kegg Search, Nucleic Tools, Phylogeny Tools, Protein Tools, Sabio - Rk, and Similarity Search Tools.

      A user can create custom analysis workflows with the Pegasys system using a graphical interface.
  21. PyBamView
    • Description : PyBamView is a tool to visualize sequence alignments from BAM files, optionally with a FASTA-formatted reference genome. The PyBamView algorithm renders single-nucleotide polymorphisms (SNPs), insertions, and deletions and provides an export function for the creation of publication-ready figures.
  22. RSAT
    • Description : RSAT is a web portal for numerous software tools for the detection and analysis of regulatory signals in non-coding nucleotide sequences. The RSAT tools include sequence retrieval, pattern matching, pattern discovery, feature-map drawing, random sequence generation, and many other tools and utilities. Users can also integrate tools into specific workflows. The RSAT site provides access to six servers: RSAT Fungi, RSAT Prokaryotes, RSAT Metazoa, RSAT Protists, RSAT Plants, and RSAT Teaching.
  23. SeWeR
    • Description : SeWeR (SEquence analysis using WEb Resources) is a web-based tool for nucleic acid and protein sequence analysis. The tools include sequence retrieval, restriction analysis, translation, gene feature prediction, primary and secondary structure search, PCR primer design, sequence alignment, plasmid drawing, among several others. Users can also download SeWeR and install it locally.
  24. SPARSE
    • Description : SPARSE (Sparsified Prediction and Alignment of RNAs based on their structure Ensembles) is a tool to align RNA sequences based on structural properties of RNA ensembles. The SPARSE algorithm uses a Sankoff-style algorithm and runs in quadratic time without heuristics.
  25. supermatcher
    • Description : supermatcher is a tool to compute approximate alignments between search sequences and the target sequences, for example, sequences in a database. The supermatcher algorithm determines likely matching sequences utilizing word matches and produces the sequence alignments using the Smith-Waterman local alignment method.
  26. unitas
    • Description : unitas is a tool to annotate small non-coding RNA datasets generated by high-throughput sequencing. The unitas algorithm uses the latest reference sequences from public online databases for the annotation.
  27. VectorNTI
    • Description : Vector NTI Software is a tool for sequence analysis and biological data management, consisting of five modules: Vector NTI, AlignX, BioAnnotator, ContigExpress, and GenomBench. Operations include primer design, sequence alignment, virtual cloning, and sequence assembly. Note that this software is no longer supported.
  28. WebSat
    • Description : WebSat is a web-based tool to predict molecular markers, visualize microsatellites, and design primers for them. WebSat accepts user-defined search parameters and offers a simple way to export the results.
  29. wordmatch
    • Description : wordmatch is a tool to find all identical matches between two nucleotide sequences.
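
The default CpG island criteria given in the cpgplot entry above (percent G + C above 50% and an observed vs. expected CpG ratio above 0.6) can be illustrated for a single window with a minimal Python sketch. This is not the cpgplot implementation: the function names are hypothetical, and the expected CpG count is assumed to be (number of C × number of G) / window length. The minimum region length and window-averaging steps are left out.

```python
def cpg_window_stats(window):
    """Return (percent G+C, observed/expected CpG ratio) for one window.

    Assumption: expected CpG count = (count of C * count of G) / window length.
    """
    seq = window.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_percent = 100.0 * (c_count + g_count) / n
    expected = c_count * g_count / n
    ratio = cpg_count / expected if expected > 0 else 0.0
    return gc_percent, ratio


def passes_cpg_thresholds(window, min_gc=50.0, min_ratio=0.6):
    """True if the window exceeds both default thresholds from the description."""
    gc_percent, ratio = cpg_window_stats(window)
    return gc_percent > min_gc and ratio > min_ratio
```

For example, passes_cpg_thresholds("CGCGCGCGCG") is True, while "CCCCCGGGGG" fails: it is 100% G + C but contains only one CpG dinucleotide, giving a ratio of 0.4.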

Repeat Analysis Tools

  1. ATRHunter
    • Description : ATRHunter is a tool to find approximate tandem repeats in DNA sequences. The ATRHunter algorithm uses a statistical model allowing a variety of definitions of tandem repeats.
  2. CENSOR
    • Description : CENSOR is a tool to mask a sequence given a reference collection of sequences. CENSOR also reports all masked sequences. Note that EBI has retired this tool.
  3. CRISPRCasFinder
    • Description : CRISPRCasFinder is a tool to find CRISPR (clustered regularly interspaced short palindromic repeats) arrays and detect Cas proteins. The CRISPRCasFinder algorithm aids validation using a rating system, predicts the orientation of CRISPRs, and detects and types Cas protein based on the latest classification.
  4. CRISPRcompar
    • Description : CRISPRcompar is a web-based tool to assist biologists using CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) for comparative and evolutionary analyses of closely related bacterial strains.

      The CRISPRs web server contains several tools for CRISPR analyses: CRISPRdb, BLAST CRISPRdb, CRISPRcompar, CRISPRtionary, FlankAlign, CRISPRs finder, and CRISPRs Utilities.
  5. CRISPRFinder
    • Description : CRISPRFinder is a web-based tool for the discovery of CRISPRs, the definition of direct repeats (DR), the extraction of spacers, the retrieval of flanking sequences from the GenBank database, and the examination of DRs in prokaryotic genomes in general. See also CRISPRcompar.
  6. detectIR
    • Description : detectIR is a tool to find perfect and imperfect repeats and inverted repeats in DNA sequences. The detectIR algorithm uses vector calculation of complex numbers.
  7. Dfam
    • Description : Dfam is a web-based database containing transposable element (interspersed repeat) DNA sequence alignments, Hidden Markov Models (HMMs), consensus sequences, and genome annotations. The Dfam database represents the transposable element alignments as families together with annotations.
  8. einverted
    • Description : einverted is a tool for finding inverted repeats, or stem-loops, in nucleotide sequences. The einverted algorithm uses dynamic programming for the local alignments.
  9. equicktandem
    • Description : equicktandem is a tool to detect nucleotide sequence sections that potentially contain repeats in tandem. The equicktandem algorithm identifies segments of sequences where bases match elsewhere in the sequence without gaps. The equicktandem algorithm scores each match with +1 and each mismatch with -1.
  10. etandem
    • Description : etandem is a tool to find tandem repeats in DNA sequences. The etandem algorithm computes consensus sequences and scores them using +1 for each match and -1 for each mismatch. The tool accepts A, C, G, T, and N characters as input and uses non-overlapping windows to search for putative repeated sequence stretches.
  11. GREAM
    • Description : GREAM (Genomic Repeat Element Analyzer for Mammals) is a tool to select, screen, and analyze genomic repeats in mammals that are likely to be important.

      GREAM offers the following listings and analyses:
      1. A categorized list of a wide range of statistically over- or under-represented repeat elements and specific types, for example, transposons and retrotransposons.
      2. Enrichment within a specified region of a chromosome.
      3. Comparative distribution across the neighborhood of orthologous genes.
  12. HipSTR
    • Description : HipSTR (Haplotype inference and phasing for Short Tandem Repeats) is a tool to genotype and phase short tandem repeats (STRs) and to analyze and validate de novo STR mutations genome-wide. HipSTR also includes a function to visualize the supporting reads. The HipSTR algorithm uses an EM algorithm to learn locus-specific PCR stutter models, a hidden Markov model (HMM) to align reads to candidate alleles while avoiding STR artifacts, and phased SNP haplotypes for genotyping and phasing.
  13. Kmer-SSR
    • Description : Kmer-SSR is a tool to detect simple sequence repeats (SSRs) in genomic sequences. The Kmer-SSR algorithm has an option for an exhaustive search.
  14. lobSTR
    • Description : lobSTR is a tool to align and genotype short tandem repeat profiles from high-throughput sequencing data. The lobSTR algorithm uses concepts from signal processing and statistical learning methods to circumvent gapped alignment and to filter noise.
  15. LTR_Finder
    • Description : LTR_Finder (Long Terminal Repeat Finder) is a web-based tool to find full-length LTR retrotransposons in genome sequences.
  16. mreps
    • Description : mreps is a tool to identify tandem repeats in DNA sequences.
  17. palindrome
    • Description : palindrome is a tool to find inverted repeats (palindromes, stem-loops) in nucleotide sequences. The palindrome algorithm detects all inverted repeats given a minimum and a maximum length, a maximum gap, and a maximum number of mismatches.
  18. Pegasys
    • Description : Pegasys is a web-based service containing numerous tools for sequence analyses, such as Agave, Alignment Tools, BRENDA, CpGAT, Edit Tools, Emboss, Information Tools, Kegg Search, Nucleic Tools, Phylogeny Tools, Protein Tools, Sabio - Rk, and Similarity Search Tools.

      A user can create custom analysis workflows with the Pegasys system using a graphical interface.
  19. Phobos
    • Description : Phobos is a tool to detect tandem repeats in genomic sequences. This tool can find repeated units that are longer than 5,000 base pairs in length. Phobos is also well suited for primer design purposes using minisatellites and long sequences repeated in tandem. Phobos software is available by request from Christoph Mayer.
  20. PlotRep
    • Description : PLOTREP is a web-based tool to visualize dispersed genomic repeats. The PLOTREP algorithm merges similar repeat copies and visualizes the results in a manner similar to dot plots.
  21. RepeatAnalyzer
    • Description : RepeatAnalyzer is a tool to store, manage, and analyze short sequence repeats (SSRs) to identify strains. The RepeatAnalyzer uses Anaplasma marginale as a model species, but the tool can analyze any SSRs in any species. The RepeatAnalyzer algorithm uses regional genetic diversity as a part of analyses and has functions for visualizing genotype and SSR distributions.
  22. RepeatExplorer
    • Description : RepeatExplorer is a pipeline tool for the identification and characterization of DNA repeats in plant and animal genomes using high-throughput data sets. The RepeatExplorer algorithm utilizes graph-based clustering.
  23. RepeatMasker
    • Description : RepeatMasker is a tool to detect repeats and low-complexity DNA sequences. RepeatMasker can use nhmmer, cross_match, ABBlast, WUBlast, RMBlast, and Decypher for repeat detection, as well as the Dfam and Repbase libraries. The RepeatMasker algorithm outputs an annotation and a FASTA file with repeats masked, i.e., replaced by Ns by default.
  24. RepeatModeler
    • Description : RepeatModeler is a tool to find transposable elements and consists of three programs to compute repeat boundaries and classify family relationships from DNA sequence data sets:
      RECON, RepeatScout, and LtrHarvest/Ltr_retriever
  25. RepeatRunner
    • Description : RepeatRunner is a pipeline tool to identify repeated sequences in DNA sequences. The RepeatRunner algorithm uses RepeatMasker to search nucleotide libraries of known repeats and also runs BLASTX searches.
  26. REPuter
    • Description : REPuter is a tool to study repetitive DNA on a genomic scale. The REPuter algorithm detects various types of repeats, reports statistical significance, and has interactive visualization.
  27. ReUPred
    • Description : ReUPred (Repetitive Units Predictor) is a tool to predict and classify repeat units. The ReUPred algorithm uses Structure Repeat Unit Library (SRUL) derived from RepeatsDB.
  28. Satellog
    • Description : Satellog is a database to identify and dynamically prioritize repeats by using various characteristics, for example, repeat unit, repeat length percentile rank, class, period, total length, genomic coordinates, UniGene polymorphism profile, proximity to or presence within gene regions, such as CDS, UTR, location upstream.
  29. SBARS
    • Description : SBARS (Spectral-Based Approach for Repeats Search) is a tool to identify various types of repeats. The SBARS algorithm uses spectral methods to profile nucleotide sequences on multiple scales to decrease the running time for creating dot plots.
  30. STRScan
    • Description : STRScan is a tool to profile short tandem repeats (STRs) in high-throughput sequencing data sets and is useful for human identity testing.
  31. TRAP
    • Description : TRAP (the Tandem Repeats Analysis Program) is a tool to select, classify, quantify, and automatically annotate sequences repeated in tandem. The TRAP algorithm utilizes the results from the Tandem Repeats Finder to analyze the satellite content of DNA sequences.
  32. TRF
    • Description : Tandem Repeats Finder is a tool to find tandem repeats in DNA sequences. The Tandem Repeats Finder algorithm uses k-tuples for matching to speed up the computation and computes consensus sequences.
  33. Tuiuiu
    • Description : Tuiuiu is a software tool that filters multiple repeats using the criterion of edit distance. Tuiuiu is useful as a preprocessing step in the construction of multiple sequence alignments. Note that the home page is no longer accessible. Tuiuiu might be available from the authors of this software.
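
The +1 per match and -1 per mismatch scoring mentioned in the equicktandem and etandem entries above can be illustrated with a short Python sketch. It is not the EMBOSS implementation: the function name is hypothetical, each base is simply compared with the base one repeat unit earlier, and the running score is reset to zero when it turns negative so that the best-scoring segment is reported (that reset is an assumption of this sketch).

```python
def tandem_score(seq, period):
    """Best running score for a candidate tandem repeat of the given unit length.

    Each base is compared with the base `period` positions earlier:
    +1 for a match, -1 for a mismatch. The running score is clamped at
    zero (an assumption of this sketch) so one bad stretch does not
    penalize a later repeat.
    """
    if period <= 0:
        raise ValueError("period must be a positive integer")
    seq = seq.upper()
    best = 0
    score = 0
    for i in range(period, len(seq)):
        score += 1 if seq[i] == seq[i - period] else -1
        if score < 0:
            score = 0
        best = max(best, score)
    return best
```

For the sequence ACGACGACGACG, a unit length of 3 scores 9 (nine consecutive matches), while a unit length of 2 scores 0, which points to 3 as the likely repeat size.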






