Allpaths-LG

Allpaths-LG assembles large and complex genomes de novo from Illumina short-read, massively parallel sequencing data to produce accurate and contiguous draft assemblies.


Key Features:

  • Algorithmic Innovation: Handles repetitive sequences typical of large mammalian genomes and integrates massively parallel Illumina short-read data for assembly.
  • Base Accuracy: Produces assemblies with reported base accuracy ≥99.95%.
  • Contiguity: Delivers both short-range contiguity and long-range connectivity in assembled genomes.
  • Scaffold Size: Reports scaffold N50 values of 11.5 Mb (human) and 7.2 Mb (mouse), approaching those obtained with capillary-based sequencing.
  • Genome Coverage: Achieves good coverage of the target genome in the reported human and mouse assemblies.
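
The scaffold N50 values quoted above are a length-weighted median: sort the scaffolds from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the total assembly length. The following is a minimal Python sketch of that calculation. It is not part of Allpaths-LG; the function name `n50` and the scaffold lengths are hypothetical, chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (illustrative helper)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2             # half of the total assembled length
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # first scaffold that crosses the halfway mark
            return length


# Hypothetical scaffold lengths in base pairs (not real assembly data).
scaffold_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(scaffold_lengths))  # prints 70
```

Here the total length is 300, and the two longest scaffolds (80 + 70 = 150) already reach half of it, so N50 is 70. A larger N50 means that more of the assembly sits in long scaffolds.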

Scientific Applications:

  • De Novo Genome Assembly: Constructs draft genome assemblies for large and complex vertebrate genomes from short-read data.
  • Comparative Genomics: Provides high-quality assemblies that enable cross-species comparative analyses.
  • Genomic Research: Supports gene discovery, functional genomics, and studies of genetic variation.

Methodology:

Allpaths-LG applies assembly algorithms designed for short-read, massively parallel sequencing data, with computational methods for handling repetitive sequences and large genome sizes.

Details

License: MIT
Tool Type: command-line tool
Operating Systems: Linux
Programming Languages: C++
Added: 8/20/2017
Last Updated: 5/27/2021

Publications

Gnerre S, MacCallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, Sharpe T, Hall G, Shea TP, Sykes S, Berlin AM, Aird D, Costello M, Daza R, Williams L, Nicol R, Gnirke A, Nusbaum C, Lander ES, Jaffe DB. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proceedings of the National Academy of Sciences. 2011;108(4):1513-1518. doi:10.1073/pnas.1017351108. PMID:21187386. PMCID:PMC3029755.
