MAKER

MAKER annotates eukaryotic and prokaryotic genomes by integrating repeat identification, EST and protein alignments, ab initio gene predictions, and evidence synthesis to produce quality-indexed gene annotations for first- and second-generation sequencing projects.


Key Features:

  • Portability and Configurability: Runs in different computational environments and can be configured to fit the needs of individual projects.
  • Comprehensive Annotation Capabilities: Identifies genomic repeats, aligns expressed sequence tags (ESTs) and proteins to genomes, generates ab initio gene predictions, and synthesizes these data into evidence-based gene annotations with quality indices.
  • Iterative Training: Automatically retrains gene-prediction algorithms using outputs from preliminary runs to improve subsequent gene models.
  • Minimal Input Requirements: Operates with minimal input data to enable annotation when training or reference data are limited.
  • Integration with GMOD and Apollo Genome Browser: Produces outputs that can be loaded into a Generic Model Organism Database (GMOD) and viewed and edited in the Apollo Genome Browser.
  • MAKER2 Scalability: MAKER2 is multi-threaded and parallelized to scale for large second-generation sequencing datasets.
  • Handling Limited Training Data: Can produce accurate annotations when training data are limited or absent.
  • Utilization of mRNA-seq Data: Incorporates mRNA-seq data to improve annotation quality and to update legacy annotations.
  • Quality Evaluation and Management: Evaluates annotation quality and identifies problematic regions for manual review.
  • Legacy Annotation Updates: Revises existing genome annotations against newer evidence such as mRNA-seq data, bringing legacy datasets up to date.
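
The quality evaluation listed above can be illustrated with Annotation Edit Distance (AED), a measure used with MAKER2 to score how well a gene model agrees with its supporting evidence. The sketch below is a simplified nucleotide-level version and is not MAKER's own implementation; the function names and the exon coordinates are hypothetical and chosen only for illustration.

```python
def covered_positions(exons):
    """Return the set of positions covered by inclusive (start, end) exons."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def annotation_edit_distance(annotation_exons, evidence_exons):
    """Compute AED = 1 - (sensitivity + specificity) / 2 at the nucleotide level.

    0.0 means the annotation matches the evidence exactly;
    1.0 means the two share no bases.
    """
    annotation = covered_positions(annotation_exons)
    evidence = covered_positions(evidence_exons)
    if not annotation or not evidence:
        return 1.0
    shared = len(annotation & evidence)
    sensitivity = shared / len(evidence)    # fraction of evidence bases recovered
    specificity = shared / len(annotation)  # fraction of annotated bases supported
    return 1.0 - (sensitivity + specificity) / 2.0


# Hypothetical coordinates: the second annotated exon extends past the evidence.
gene_model = [(100, 199), (300, 399)]
aligned_est = [(100, 199), (300, 349)]
aed = annotation_edit_distance(gene_model, aligned_est)
print(f"AED = {aed:.3f}")
```

Lower values indicate closer agreement, so under this sketch a model with a high score would be a candidate for manual review.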

Scientific Applications:

  • Genome annotation of non-model organisms: Annotates genomes with limited prior genomic resources using integrated evidence and ab initio prediction.
  • Schmidtea mediterranea annotation and SmedGD: Used to annotate Schmidtea mediterranea and to create the SmedGD genome database.
  • Generation of community-accessible genome databases: Produces annotations and outputs suitable for building community genome databases from raw sequence and evidence.
  • Updating legacy annotations: Updates and refines legacy annotations using mRNA-seq data and iterative retraining.
  • Support for emerging model organism projects: Enables annotation in projects where pre-existing annotation resources are scarce.

Methodology:

MAKER integrates repeat identification, EST and protein alignment, ab initio gene prediction, evidence synthesis with quality indices, and iterative retraining of gene-prediction algorithms. MAKER2 adds multi-threaded parallel processing and incorporation of mRNA-seq data.
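
The iterative retraining step can be sketched as two rounds of configuration. MAKER reads its settings from plain-text control files of `key=value` lines; the option names below (`genome`, `est`, `protein`, `est2genome`, `protein2genome`, `snaphmm`) follow the `maker_opts.ctl` layout as best recalled and should be checked against the control file generated by your own installation. The helper functions and all file names are hypothetical, and a real control file contains many more options.

```python
def round_options(genome, est, protein, snaphmm=None):
    """Build a minimal option set for one annotation round.

    Without a trained HMM (first round), gene models are inferred directly
    from EST and protein alignments. With one (later rounds), direct inference
    is switched off and the trained ab initio predictor is used instead.
    """
    first_round = snaphmm is None
    return {
        "genome": genome,
        "est": est,
        "protein": protein,
        "est2genome": 1 if first_round else 0,
        "protein2genome": 1 if first_round else 0,
        "snaphmm": "" if first_round else snaphmm,
    }


def render_ctl(options):
    """Render options as key=value lines, one per line."""
    return "\n".join(f"{key}={value}" for key, value in options.items()) + "\n"


# Hypothetical input files; round1.hmm stands for a model trained on round-1 output.
round1 = round_options("genome.fasta", "est.fasta", "proteins.fasta")
round2 = round_options("genome.fasta", "est.fasta", "proteins.fasta",
                       snaphmm="round1.hmm")
print(render_ctl(round2))
```

Training the predictor between rounds happens outside this sketch; the point is only that the second round swaps alignment-derived models for predictions from the retrained model.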

Details

License: Artistic-2.0
Tool Type: command-line tool
Operating Systems: Linux
Added: 10/3/2016
Last Updated: 11/24/2024

Operations

Data Inputs & Outputs

Genome annotation

Publications

Cantarel BL, Korf I, Robb SM, Parra G, Ross E, Moore B, Holt C, Sánchez Alvarado A, Yandell M. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research. 2008;18(1):188-196. doi:10.1101/gr.6743907. PMID:18025269. PMCID:PMC2134774.

Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 2011;12(1):491. doi:10.1186/1471-2105-12-491. PMID:22192575. PMCID:PMC3280279.
