

98 Free Whole Genome Assembly (WGA) Analysis Tools - Software and Resources


Graph: Occurrences of the word 'WGA' in scientific articles stored in PubMed from 1990 to June 2019.



  1. 3D-DNA
    • Description : 3D-DNA is a tool for scaffolding contigs using Hi-C reads. It implements iterative scaffolding and mis-assembly detection as the core of the pipeline. It can also perform genome polishing and assembly merging to produce chromosome-scale scaffolds.
  2. A5
    • Description : A5 is a tool for automating the genome assembly pipeline. It consists of 5 steps: cleaning reads, assembling error-corrected reads, scaffolding, scaffold validation, and final scaffold assembly. It has automated parameter selection, which is useful to new users.
  3. ABySS
    • Description : ABySS is a tool for de novo genome assembly using short read data. It implements a distributed representation of de Bruijn graphs, which enables parallel computation of the assembly algorithm. ABySS stands for Assembly By Short Sequences.
  4. ALE-Assembly
    • Description : Assembly Likelihood Evaluation (ALE) is a tool for evaluating the accuracy of assemblies without the need for a reference genome. It makes use of read quality, mate pair orientation, insert length, sequencing coverage, read alignment and k-mer frequency. This tool can also be used to detect errors in metagenomes by pinpointing single base errors, indels, genome rearrangements and chimeras.
  5. ALLPATHS
    • Description : ALLPATHS is a tool for genome assembly that is applicable to all types of sequences and not limited to just short reads. It performs paired-read assembly in 2 stages: (i) finding all paths across a read pair, which involves filling the gap between the two reads of each pair with other paired-end reads, and (ii) grouping together the paired ends from the previous step and assembling each of these groups separately before combining them to form the final assembly.
  6. ALLPATHS-LG
    • Description : ALLPATHS-LG is a tool for assembling both small and large genomes and is an improved version of ALLPATHS. The improvements over its predecessor include better handling of repetitive sequences, error correction, support for sequencing data from jumping libraries, more efficient memory usage and the ability to handle low-coverage regions.
  7. aTRAM
    • Description : aTRAM is a tool for assembling targeted genes from a single library of paired-end reads. It does not perform de novo genome assembly; instead, it uses an iterative approach to assemble targeted genes, which saves time for those who are interested only in gathering specific genomic sequences.
  8. AutoSeqMan
    • Description : AutoSeqMan is a tool for assembling Sanger sequences into contigs for users working with the SeqMan program. While SeqMan is an excellent GUI tool, manually clicking through its various functions can be time consuming when users have many sequences to assemble into contigs. Using the SeqMan scripting language, AutoSeqMan adds two modules to classify and assemble sequences.
  9. Bandage
    • Description : Bandage is a graphical user interface tool for visualization of connections in assembly graphs. It is not a tool for de novo genome assembly but it is very useful for users who wish to inspect nodes and extract sequences from assembly graphs.
  10. BioNanoAnalyst
    • Description : BioNanoAnalyst is a tool for evaluating potential mis-assemblies in reference genomes using optical maps. It is a cross-platform graphical user interface (GUI) program. It produces GFF3 output of potentially mis-assembled regions and also provides a zoomed-in visualization of these genomic locations.
  11. BUSCO
    • Description : BUSCO is a tool to assess the completeness of genome assemblies, gene sets and transcriptomes. It is based on the concept of single-copy orthologs that should be highly conserved among closely related species. For example, users who wish to study the completeness of a mammalian genome assembly will use single-copy orthologs discovered among mammalian species.
  12. CANU
    • Description : CANU is a tool to assemble long reads from either PacBio or Oxford Nanopore, which have higher error rates than short reads from Illumina. The tool runs much faster than its predecessor, Celera Assembler, and implements new overlapping and assembly algorithms such as an adaptive overlapping strategy and sparse assembly graph construction. It can also provide output in graphical fragment assembly (GFA) format. See also the add-on: HiCanu.
  13. CAR
    • Description : CAR is a tool to rearrange contigs based on a known reference sequence. The algorithm implemented considers permutations of the contig groups and joins them to match the reference. It is only implemented as a web application, and only small prokaryotic genomes can be scaffolded this way.
  14. Celera
    • Description : Celera is a first-generation assembler capable of assembling the genomes of multicellular organisms. It was used to assemble the genome of the model organism fruit fly and subsequently one of the first human genomes. The algorithm implemented is based on Overlap-Layout-Consensus.
  15. CUDA-EC
    • Description : CUDA-EC is a tool to parallelize error correction of short reads by leveraging the power of GPUs. The short reads corrected by this tool are ready for assembly. It implements a space-efficient Bloom filter data structure.
  16. dnaasm
    • Description : dnaasm is a tool for assembling tandem repeats. The algorithm implemented uses the relative frequency of reads to resolve tandem repeats and is able to restore tandem repeats that are longer than the sequencing read length. The software is available as console and web applications.
  17. drVM
    • Description : drVM is a tool for extracting known viral reads from metagenomics projects to automatically assemble their genomes. It is essentially a pipeline written in Python that integrates a few tools such as BLAST, SNAP, SPAdes, and khmer to reconstruct a variety of viral genomes among metagenomes. Additionally, it performs coverage profiling of the viruses.
  18. Edena
    • Description : Edena is a tool for de novo genome assembly that is based on the overlap-layout assembly framework and is applicable to very short reads from the Illumina platform (e.g. 35 bp). It implements additional features that include exact matching and spurious read removal. The use of exact matching simplifies and speeds up the overlap step.
  19. ELOPER
    • Description : ELOPER is a tool to pre-process paired-end short reads for better performance during assembly. It implements an algorithm that detects overlaps between both ends of the paired-end reads and then merges those reads with significant overlaps. The performance is superior to that of assemblers that typically consider the two ends of each paired-end read separately for overlap detection. However, this tool does not perform the assembly step itself; it only prepares the paired-end reads for assembly.
  20. Enly
    • Description : Enly is a tool for closing gaps in genome assemblies. It implements iterative mapping of reads at contig edges to extend these sequences where possible, which may lead to the closing of gaps.
  21. EULER
    • Description : EULER is a tool for de novo genome assembly. It implements a de Bruijn graph, which is a different method from Overlap-Layout-Consensus. The solution to assembling read fragments is found by finding a path that visits every edge of the graph exactly once, which is known as finding an Eulerian path.
  22. FALCON
    • Description : FALCON is a tool for de novo assembly of long PacBio reads and it is an improved version of its predecessor HGAP. Unlike HGAP, it is a diploid-aware assembler that is better suited to assemble larger genomes. Users should look into FALCON-Unzip if they wish to phase the assembly as well.
  23. FALCON-Unzip
    • Description : FALCON-Unzip is a tool for de novo assembly of long PacBio reads and it is similar to FALCON except it has the ability to phase the assembly.
  24. GAM-NGS
    • Description : GAM-NGS is a tool to merge two or more assemblies in order to improve certain assembly metrics, such as contiguity, beyond what is achievable with a single assembler. The merging process is aided by the use of a weighted graph to optimally resolve problematic regions.
  25. GAML
    • Description : GAML is a tool for genome assembly based on maximum likelihood. It implements a probabilistic model to take into account sequencing error rates, insert lengths and other characteristics to produce a final genome assembly. This tool can work on sequenced data generated from multiple sequencing platforms (e.g. Illumina, 454, PacBio).
  26. GenSeed
    • Description : GenSeed is a tool that allows for targeted assembly of specific sequences in the genome using reads relevant to the targets. This Perl program implements a recursive algorithm to find sequence similarity, select reads, and assemble them. The program should be useful for assembling particular nuclear genes, transcripts and extrachromosomal genomes.
  27. GRAbB
    • Description : GRAbB is a tool for assembling specific genomic loci by using these regions as baits to find corresponding reads (e.g. Illumina paired-end reads) prior to de novo assembly. It can handle multiple loci assemblies simultaneously and is useful for assembling mitochondrial genomes, rDNA repeats and other poorly assembled regions of the genome.
  28. HapCol
    • Description : HapCol is a tool for assembling haplotypes from long reads. It exploits the random nature of errors in long read sequencing platforms and implements an exact algorithm that scales well with increasing read coverage and minimizes the overall number of error corrections.
  29. Hapler
    • Description : Hapler is a tool for assembling sequences into haplotypes from population-sampled data. It is targeted at the low-diversity and low-coverage data that are typical of ecological samples. This tool is able to construct consensus sequences and identify chimeras.
  30. HapTree
    • Description : HapTree is a tool for haplotype reconstruction from sequencing data (e.g. Illumina) of a single individual genome that may be diploid or have higher ploidy. This tool implements a maximum-likelihood estimation framework to assemble haplotypes. HapTree has a high switch accuracy within phased haplotype blocks.
  31. HGAP
    • Description : HGAP is a tool for de novo genome assembly using PacBio reads. It implements a hierarchical assembly process that starts by using the longest reads as seed reads, onto which all other reads are recruited to construct highly accurate preassembled reads. After this step, the preassembled reads can be assembled using the overlap-layout-consensus approach.
  32. HiCANU
    • Description : HiCANU is an add-on assembly tool to the popular long read Canu assembler. The assembler is still called Canu but has the added option to read PacBio HiFi reads using ‘-pacbio-hifi’. The main advantage of this tool is the ability to use the highly accurate and long HiFi reads for genome assembly. The algorithm implements homopolymer compression, overlap-based error correction, and aggressive false overlap filtering.
  33. HIFIASM
    • Description : HIFIASM is a genome assembler that takes advantage of PacBio HiFi reads. It implements an algorithm that preserves the continuity of all haplotypes for the purpose of phasing the genome. Like the Canu assembler, it can also use parental short reads as input to perform trio-binned genome assembly. Its authors report that it can assemble a human genome in several hours on a single machine.
  34. HINGE
    • Description : HINGE is a tool for de novo genome assembly that addresses the challenge of using error-prone long reads. It combines the error tolerance of Overlap-Layout-Consensus assemblers with the repeat resolution of de Bruijn graph assemblers. Additionally, HINGE produces a visually interpretable assembly graph.
  35. HLAreporter
    • Description : HLAreporter is a tool for mapping reads to a known reference panel of HLA alleles and then using these reads for de novo assembly. The tool is reported to outperform similar tools such as HLAminer and PHLAT.
  36. Icarus
    • Description : Icarus is a tool for visualising draft genome assemblies for the purpose of exploring and evaluating potential mis-assemblies. It is integrated with the genome assembly evaluation tool, QUAST, and can be used to view contigs by alignment to a reference genome or by contig size.
  37. Juicer
    • Description : Juicer is a tool that takes Hi-C sequencing reads as input and creates normalized contact maps for visualization. The program has been used to visualize how assembly scaffolds should be reorganized following scaffolding with Hi-C data, since such scaffolds may still contain errors. Besides being used to correct Hi-C scaffolding, the program is also used to detect the 3D structure of the genome.
  38. Kermit
    • Description : Kermit is a tool for using linkage maps to guide genome assembly. It simplifies assembly and reduces assembly errors for users with long-read data for contig assembly. Linkage maps are often used to validate an assembly after contigs have been built but, in this tool, the maps are used to guide the assembly itself. It implements a coloured overlap graph strategy.
  39. KmerGenie
    • Description : KmerGenie is a tool for automatic estimation of the best k-mer size to use for de Bruijn graph based genome assembly. It implements a fast and accurate sampling method to produce k-mer abundance histograms, from which the best possible k-mer size is chosen using a heuristic method.
  40. Kollector
    • Description : Kollector is a tool for assembling gene sequences based on the assembler ABySS by using transcript sequences as baits to capture whole genome shotgun (WGS) reads. This way, the WGS reads used for assembly are specific to the genomic region. The algorithm identifies k-mers from transcripts and seeds them into a progressive Bloom filter, which is needed to gather genes among WGS reads.
  41. Kourami
    • Description : Kourami is a tool for assembling HLA haplotypes. It uses high coverage whole genome sequencing data and implements a graph-guided assembly method for classical HLA genes, which is capable of discovering new HLA alleles.
  42. LACHESIS
    • Description : LACHESIS is a tool to scaffold contigs based on Hi-C reads, which provide short to long range linkage information. It utilizes the contact probabilities of Hi-C reads to order and orient contigs. Using this tool, it is possible to generate chromosome-level scaffolds.
  43. laSV
    • Description : laSV is a tool for detecting structural variants (SVs) from paired-end sequencing data at single base pair resolution. It implements a local assembly algorithm that constructs and stores a de Bruijn graph before arriving at SVs. This tool can be useful to those who study cancers and wish to identify SVs that differ among individuals.
  44. LightAssembler
    • Description : LightAssembler is a lightweight program for genome assembly based on the use of a pair of cache-oblivious Bloom filters. The advantage of using this assembler is its more efficient memory usage while still achieving high assembly accuracy and contiguity.
  45. LoRDEC
    • Description : LoRDEC is an error correction tool for long reads (e.g. PacBio) that uses highly accurate short reads. This tool can be used prior to de novo assembly, as the indel errors in long reads complicate assembly. The algorithm builds a de Bruijn graph of the short reads and looks for a corrective sequence for each potentially erroneous region in the long reads.
  46. Mapsembler
    • Description : Mapsembler is a tool for targeted assembly of a particular genomic locus using short reads. It implements an algorithm to estimate occurrences of a sequence of interest among reads and build an extension graph from matching reads. This tool can be used to target repeats, SNPs, exon skipping, gene fusion and other structural variants without the need to assemble the entire genome.
  47. MaSuRCA
    • Description : MaSuRCA is a tool for genome assembly based on a hybrid approach that combines de Bruijn graph and overlap-based assembly strategies. It can be used for sequenced data with variable read lengths and hence it is suitable for assembling 454, Sanger and Illumina data. It implements the idea of assembling paired-end reads into longer super-reads to facilitate an easier subsequent assembly step.
  48. MECAT
    • Description : MECAT is a tool for de novo assembly of long single molecule sequencing reads (e.g. PacBio). The tool implements a pseudolinear alignment scoring algorithm that uses distance difference factors (DDFs) to score matched k-mer pairs and remove unnecessary alignments. Large genomes can be assembled on a single computer using this tool.
  49. Merqury
    • Description : Merqury is a tool for validating genomes assembled using long read sequencing. The algorithm uses k-mers to evaluate the base accuracy and completeness of a genome by comparing the de novo assembled genome with high accuracy reads that are not used in the assembly. The program does not require another reference assembly for its evaluation. It is able to evaluate the haplotype-specific accuracy, completeness, phase block continuity and switch errors in trio-binned assemblies.
  50. MindTheGap
    • Description : MindTheGap is a tool specifically designed to assemble insertion variants from re-sequencing data. It implements an algorithm that performs the following three stages: (i) creating a de Bruijn graph from short read data, (ii) detecting insertion breakpoints in a given reference genome, and (iii) locally assembling the inserts.
  51. miniasm
    • Description : miniasm is a tool for de novo assembly of long reads from either the PacBio or Oxford Nanopore platforms. It does not perform an error correction step. This tool is typically used in conjunction with minimap, which generates the all-vs-all read mappings used as input for the assembly.
  52. misFinder
    • Description : misFinder is a tool for checking assembly errors by using a reference genome and alignments of paired-end reads. Inconsistencies such as structural variant differences between draft assemblies and a reference genome (or closely related reference genome) help reveal assembly errors. In addition, unusual paired-end reads coverage and insert distance are also exploited to reveal potential assembly errors.
  53. misSEQuel
    • Description : misSEQuel is a tool for detecting and correcting errors in draft assemblies. It implements an algorithm that is able to incorporate both paired-end reads information and optical maps to correct mis-assemblies.
  54. Mix
    • Description : Mix is a tool for finishing genome assemblies by mixing multiple draft genomes with the goal to improve sequence contiguity. This tool does not require the use of a reference genome. It implements an extension graph based on contig ends and alignments among them.
  55. npScarf
    • Description : npScarf is a tool for assembly scaffolding and gap filling suitable for smaller genomes already assembled with short reads. It takes advantage of the real-time streaming of Oxford Nanopore long-read sequencing results to continuously analyse how newly generated sequences improve the assembly by monitoring key metrics. Once sufficient contiguity or other suitable metrics have been achieved, the long read sequencing can be stopped, which saves time and money.
  56. Orione
    • Description : Orione is a Galaxy-based framework that groups together workflows and tools to perform de novo genome assembly, annotation, RNA-Seq and metagenomics analysis. This user-friendly tool is suitable for those working on microbes.
  57. PAGIT
    • Description : PAGIT is a tool to improve the quality of genomes that have already been assembled. It performs the steps needed after contig assembly, which include closing gaps, correcting errors in the consensus sequence, and using closely related reference genomes to scaffold and generate annotation. PAGIT bundles the software ABACAS, IMAGE, iCORN and RATT.
  58. PASHA
    • Description : PASHA is a tool for assembling genomes from short reads using de Bruijn graphs, with its main improvement being its code for distributed computing. It is able to assemble genomes efficiently in a short amount of time, provided the users have access to high performance computing.
  59. Phusion
    • Description : Phusion is a tool for de novo genome assembly. The assembler is modular in its design and uses the concept of mathematically defined repeats to find repeats, and then uses another program, PHRAP, to perform the assembly. The clustering of reads in the pre-assembly step allows parallelization in the assembly stage.
  60. poreTally
    • Description : poreTally is a tool to benchmark a few assemblers of Oxford Nanopore reads. It can run CANU, Flye, SMARTdenovo and wtdbg2 assembly pipelines and generates a report in article style.
  61. Projector 2
    • Description : Projector 2 is a tool for mapping contigs and closing gaps in prokaryotic genomes. The method uses finished or unfinished genome assemblies as templates to infer linkage information in the sequenced organism. If there is no linkage information, the gap will be recommended for closure using a PCR strategy. The tool has a user-friendly web interface.
  62. QUAST
    • Description : QUAST is a tool for genome assembly evaluation. This tool can be used with or without a reference genome and accepts multiple genome assemblies for comparisons. Users can assemble a genome with different assemblers and compare various assembly metrics.
  63. QUAST-LG
    • Description : QUAST-LG is a tool for genome assembly evaluation and is an extension of the software QUAST. While QUAST is suitable for smaller genomes (e.g. bacteria), QUAST-LG is meant for larger eukaryotic genomes (e.g. mammals). This tool implements new quality metrics such as k-mer based completeness and correctness, BUSCO completeness, and theoretical limits of assembly completeness and contiguity.
  64. RACA
    • Description : RACA is a tool to order and orient scaffolds generated from short-read assemblies using reference genomes from closely related species. The tool takes advantage of the conservation of homologous sequences and has demonstrated good performance on simulated and real datasets. The tool is suitable for assemblies made from short reads when no linkage map is available.
  65. Racon
    • Description : Racon is a standalone consensus building tool that can be coupled with a fast assembler such as miniasm, which performs de novo assembly of error-prone long reads without error correction. This dramatically cuts down the time needed for sequence assembly and consensus generation. Racon stands for Rapid Consensus and it can be used for PacBio and Oxford Nanopore data.
  66. Ragout
    • Description : Ragout is a tool for reordering contigs to create high quality scaffolds by using a genome rearrangement approach and multiple closely related genome references as a guide. The evolutionary relationship between the genome references is used to order contigs. It uses an assembly graph and synteny blocks to minimize gaps in the assembly.
  67. Rainbow
    • Description : Rainbow is a tool to cluster and assemble short read sequences originating from restriction-site associated DNA sequencing (RAD-seq). The Rainbow algorithm discriminates repeats from heterozygous sequences by grouping the reads into haplotypes, creates a guide tree, and implements a greedy algorithm for contig assembly.
  68. rampart
    • Description : Rampart is a tool for de novo genome assembly that is implemented as a workflow management system that automatically identifies suitable assemblers given the users' sequencing data. The workflow is configurable and helps users evaluate which assemblers and settings produce the best genome according to selected assembly metrics.
  69. Ray Meta
    • Description : Ray Meta is a tool for de novo assembly of metagenomes using distributed computing to enable parallel assembly of multiple genomes. The program is connected to other useful tools in the Ray series such as Ray Communities, which performs microbiome profiling. The program can assemble and profile numerous microbiomes in a computationally efficient manner.
  70. REAPR
    • Description : REAPR is a tool for reference-free evaluation of the quality of genome assembly. This tool uses paired reads to determine base accuracy and identify mis-assemblies. It reports error free bases and a new metric known as corrected N50.
  71. RecoverY
    • Description : RecoverY is a tool for identifying Y chromosome specific reads for assembly of the Y chromosome. The approach is based on k-mer abundance of reads and uses knowledge of known Y chromosome sequences from related species or transcripts. Results were reported for the human and gorilla Y chromosomes, and users may try the approach on the Y chromosomes of other species.
  72. Redundans
    • Description : Redundans is a genome assembly pipeline that includes de novo contig assembly, removal of heterozygous contigs, scaffolding and gap closing. Its main strength is in the assembly of heterozygous genomes, whose size tends to be overestimated when heterozygous contigs are included. Some users may assemble the genome using other tools but use Redundans just to exclude heterozygous contigs.
  73. runBNG
    • Description : runBNG is a wrapper script written in Bash to automate tasks for BioNano optical map data. It is a substitute for IrysView, which only works on Windows-based platforms. It performs optical map de novo assembly, super-scaffolding, and structural variant detection based on functions implemented in IrysView.
  74. SEED
    • Description : SEED is a tool for clustering similar sequences prior to subsequent steps in genome assembly. It implements a modified spaced seed method known as block spaced seeds to efficiently cluster 100 million short reads in less than 4 hours and has linear time and memory performance.
  75. SGA (String Graph Assembler)
    • Description : SGA is a tool for de novo genome assembly and is capable of using short reads for assembly without the need for a de Bruijn graph. This tool implements memory efficient data structures and algorithms based on the FM-index derived from the compressed Burrows-Wheeler transform. This assembler can error correct, assemble and scaffold sequencing data.
  76. SHARCGS
    • Description : SHARCGS is a tool for accurate and fast assembly of short reads (e.g. 25 to 40 nucleotides). This assembler is able to assemble millions of extremely short reads and handles sequencing errors. In comparison to SSAKE, the authors claimed this tool generates almost no mis-assemblies.
  77. SHORTY
    • Description : SHORTY is a tool for de novo genome assembly of short reads, in particular reads generated from the SOLiD sequencing platform. It implements single seed reads to crystallize assemblies and estimates inter-contig distances from spanning paired-end reads.
  78. Shovill
    • Description : Shovill is a genome assembler pipeline dedicated to bacteria and microbes with smaller genome sizes. The pipeline uses SPAdes assembler at its core but also supports other assemblers such as SKESA, Velvet and Megahit. Shovill is used to assemble isolates but not metagenomes or organisms with larger genomes.
  79. SMRT
    • Description : SMRT is a tool for calling SNPs and assembling haplotypes based on long PacBio reads. It is not a de novo assembly tool and hence should not be confused with other tools offered as part of the SMRT suite of tools by PacBio developers. The name of this tool is based on Single Molecule Real Time (SMRT) sequencing, and the paper describing this tool used PacBio reads. Essentially, it is a method that is able to use the more error-prone long PacBio reads to call SNPs and haplotypes, whereas other methods may need to resort to more accurate short reads for SNP calling.
  80. SOAPdenovo
    • Description : SOAPdenovo is a tool for de novo genome assembly using only Illumina short reads. The algorithm implements error correction, de Bruijn graph construction, tip removal, repeat resolution, bubble merging, contig linkage graph construction and scaffolding. This tool has a modularized design for each assembly step and has been shown to work on human genomes.
  81. SOMA
    • Description : SOMA is a tool for scaffolding short-read based contigs of bacterial genomes using optical maps. The method implemented is robust to sequencing and assembly errors. The program is available as a web application and an open-source package.
  82. SOPRA
    • Description : SOPRA is a tool for optimising the scaffolding of contig assemblies using paired reads. This tool is able to use sequencing data from both Illumina and SOLiD. The algorithm implemented is different from greedy algorithms and uses an optimization strategy that addresses the problems of very short reads and sequencing errors.
  83. SPAdes
    • Description : SPAdes is a tool for assembling sequences from single-cell and multicell data types. It implements the following 4 stages: assembly graph construction, k-bimer adjustment, construction of the paired assembly graph and contig construction. It is reported to outperform E+V-SC for assembling single-cell data, and Velvet and SOAPdenovo for assembling multicell data.
  84. SQUAT
    • Description : SQUAT is a tool for both pre-assembly and post-assembly evaluation. The pre-assembly evaluation is based on read quality, whereas the post-assembly step takes into account how well reads are mapped onto a reference genome. The output is presented as a report in HTML format for visualization.
  85. SR-ASM
    • Description : SR-ASM is a tool for assembling short reads from the 454 sequencing platform. The algorithm implemented is a heuristic method based on a graph model and takes advantage of the way 454 sequence output is presented.
  86. SSAKE
    • Description : SSAKE is a tool for de novo assembly of massive amounts of short reads. It implements a progressive search through a prefix tree for the longest overlap between all sequences in a pairwise manner. The output of the assembler is contigs.
  87. SSPACE
    • Description : SSPACE is a tool for scaffolding contigs using paired-end reads. It is modified from the SSAKE assembler and can extend contigs using reads that were unmappable in the contig assembly step.
  88. Taipan
    • Description : Taipan is a tool for assembling short reads using a hybrid approach of greedy extension and graph methods. Contigs are constructed by greedy extension, but at each step the tool consults the corresponding read graph to make better decisions. Its assembly quality matches that of graph-based approaches while requiring fewer computing resources.
  89. Telescoper
    • Description : Telescoper is a tool for the assembly of repeats such as telomeres. It implements an iterative extension algorithm and uses both short- and long-insert libraries.
  90. VAGUE
    • Description : VAGUE is a tool that provides a graphical user interface for the assembler Velvet to perform genome assembly. It simplifies tasks such as preparing input files in the right format and retrieving output files without worrying about Velvet's command line syntax.
  91. VCAKE
    • Description : VCAKE is a tool for short read-based genome assembly using a modification of the simple k-mer extension method. It accounts for sequencing errors by taking advantage of high read coverage. It is a modification of the assembler SSAKE.
  92. VectorNTI
    • Description : Vector NTI Software is a tool for sequence analysis and biological data management, consisting of five modules: Vector NTI, AlignX, BioAnnotator, ContigExpress, and GenomBench. Operations include primer design, sequence alignment, virtual cloning, and sequence assembly. Note that this software is no longer supported.
  93. Velvet
    • Description : Velvet is a tool for de novo assembly based on de Bruijn graphs and is suitable for short read data with high coverage. The algorithm manipulates de Bruijn graphs to remove sequencing errors and resolve repeats.
  94. VGA
    • Description : VGA is a tool for assembling individual viral genomes from a sample that consists of diverse populations of viruses. It takes advantage of high sequencing depth to detect rare variants and requires a sequencing library with barcodes attached to the sequencing fragments.
  95. VirAmp
    • Description : VirAmp is a tool for assembling viral genomes using the Galaxy workflow, which lets users click through a variety of programs in a web interface and hence requires little programming experience to operate. Three assemblers (Velvet, SPAdes, and VICUNA) are available by default following installation. The program covers quality checking of raw reads, coverage reduction, de novo assembly, scaffolding, gap filling, and assembly metrics evaluation.
  96. WhatsHap
    • Description : WhatsHap is a tool for phasing long reads despite their higher sequencing error rates. It implements a fixed parameter tractable (FPT) approach to a weighted version of minimum error correction (wMEC) formulation. This tool is useful to users who want to perform haplotype assembly.
  97. wtdbg2
    • Description : wtdbg2 is a long-read assembler that is 2 to 17 times faster than some published tools (as of 2020) that perform the same task. The algorithm broadly follows the overlap-layout-consensus method but advances it with a fast all-versus-all read alignment process. It also uses a fuzzy-Bruijn graph.
  98. ZORRO
    • Description : ZORRO is a tool for hybrid assembly that can combine Illumina and 454 contigs. The pipeline consists of repeat masking in contigs, detection of overlapping regions, unmasking repeats, and finally assembly of the contigs.
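
Several assemblers in this list (for example EULER, ABySS, Velvet and SOAPdenovo) are described as using de Bruijn graphs, and the EULER entry describes assembly as finding a path that visits every edge of the graph exactly once. The sketch below is a rough illustration of that idea on toy, error-free reads. It is not the implementation of any listed tool: the function names (de_bruijn_edges, eulerian_path, assemble) and the example reads are hypothetical, and real assemblers add error correction, repeat resolution and paired-read information on top of this.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Collect the distinct k-mers in the reads.

    Each k-mer is one edge of the de Bruijn graph, going from its
    (k-1)-mer prefix to its (k-1)-mer suffix.
    """
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    return sorted(kmers)


def eulerian_path(kmers):
    """Walk every edge exactly once using Hierholzer's algorithm.

    Assumption: the graph built from the k-mers has an Eulerian path
    (true for the toy, error-free example below).
    """
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for kmer in kmers:
        left, right = kmer[:-1], kmer[1:]
        graph[left].append(right)
        balance[left] += 1
        balance[right] -= 1

    # Start at the node with one extra outgoing edge, if there is one.
    start = next((n for n, b in balance.items() if b == 1), kmers[0][:-1])

    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())         # dead end: node is finished
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn_edges(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Three overlapping toy reads drawn from the sequence ATGGCGTGCA.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
print(assemble(reads, 4))  # prints ATGGCGTGCA
```

The choice of k matters: if k is too small, repeated (k-1)-mers create ambiguous paths, which is the problem that tools such as KmerGenie aim to help with when estimating a suitable k-mer size.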






