GAML
GAML assembles genomes by maximizing the likelihood of candidate assemblies using a probabilistic model that accounts for sequencing error rates and insert lengths across Illumina, 454, and PacBio datasets.
Key Features:
- Probabilistic modeling: Captures the sequencing error rates and insert lengths specific to each sequencing technology and uses them to evaluate candidate assemblies.
- Likelihood maximization: Searches for assembly configurations that maximize the likelihood under the probabilistic model.
- Integration of diverse sequencing data: Accepts and integrates Illumina and 454 reads across various insert sizes as well as PacBio reads in a single assembly framework.
- Repeat resolution and scaffolding: Targets resolution of repeats and scaffolding of shorter contigs through its likelihood-based assembly evaluation.
- Comparative assembly performance: Reported N50 sizes and error rates are comparable to those of established assemblers such as ALLPATHS-LG and Cerulean.
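For readers unfamiliar with the N50 metric mentioned in the feature list, the following is a minimal sketch of how it is commonly computed from a list of contig lengths. The function name `n50` is illustrative and is not part of GAML.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig that pushes coverage to half the total.
        if 2 * running >= total:
            return length
    return 0  # empty assembly


if __name__ == "__main__":
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why it is usually reported alongside error rates.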
Scientific Applications:
- Multi-platform genome assembly: Integrates multiple sequencing platforms (Illumina, 454, PacBio) to produce cohesive genome assemblies from heterogeneous datasets.
- Complex genome reconstruction: Supports projects requiring repeat resolution and contig scaffolding to improve assembly continuity and accuracy.
Methodology:
GAML searches for the assembly configuration that maximizes likelihood under a probabilistic model that explicitly accounts for dataset-specific error profiles and insert lengths.
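The idea of scoring candidate assemblies by likelihood can be illustrated with a toy sketch. This is not GAML's actual model or code: it assumes single-end reads, substitution errors only, a single error rate strictly between 0 and 1, forward-strand placement, and a uniform choice of read start position. Insert lengths, multiple technologies, and the search procedure are all omitted, and every function name here is hypothetical.

```python
import math


def read_log_likelihood(read, assembly, error_rate):
    """Log-probability of one read under a toy substitution-only model.

    The read is assumed to start at a uniformly chosen position; each base
    is copied correctly with probability (1 - error_rate).
    """
    m = len(read)
    positions = len(assembly) - m + 1
    if positions <= 0:
        return float("-inf")  # read cannot be placed on this assembly
    total = 0.0
    for start in range(positions):
        window = assembly[start:start + m]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        total += (error_rate ** mismatches) * ((1 - error_rate) ** (m - mismatches))
    return math.log(total / positions)


def assembly_log_likelihood(reads, assembly, error_rate=0.01):
    """Sum of per-read log-likelihoods (reads treated as independent)."""
    return sum(read_log_likelihood(r, assembly, error_rate) for r in reads)


def best_assembly(reads, candidates, error_rate=0.01):
    """Pick the candidate with the highest log-likelihood (exhaustive, toy-sized)."""
    return max(candidates, key=lambda a: assembly_log_likelihood(reads, a, error_rate))


if __name__ == "__main__":
    reads = ["ACGT", "CGTA", "GTAC"]
    candidates = ["ACGTAC", "ACGAAC"]
    print(best_assembly(reads, candidates))
```

Dividing by the number of placement positions penalizes needlessly long candidates, which is one reason a likelihood score can discourage spurious duplication of sequence. A realistic implementation would work in log space throughout to avoid underflow on long reads and would use alignment rather than exhaustive placement.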
Details
- Tool Type: command-line tool
- Operating Systems: Linux, Windows, Mac
- Programming Languages: C++
- Added: 8/3/2017
- Last Updated: 11/25/2024
Publications
Boža V, Brejová B, Vinař T. GAML: genome assembly by maximum likelihood. Algorithms for Molecular Biology. 2015;10(1). doi:10.1186/s13015-015-0052-6. PMID:26042154. PMCID:PMC4454275.