MAKER
MAKER is a genome annotation pipeline for eukaryotic and prokaryotic genomes. It identifies repeats, aligns ESTs and proteins to the genome, generates ab initio gene predictions, and synthesizes this evidence into gene annotations with quality indices. It targets both first- and second-generation sequencing projects.
Key Features:
- Portability and Configurability: Runs in different computational environments and can be configured to suit each project's needs.
- Comprehensive Annotation Capabilities: Identifies genomic repeats, aligns expressed sequence tags (ESTs) and proteins to genomes, generates ab initio gene predictions, and synthesizes these data into evidence-based gene annotations with quality indices.
- Iterative Training: Automatically retrains gene-prediction algorithms using outputs from preliminary runs to improve subsequent gene models.
- Minimal Input Requirements: Operates with minimal input data to enable annotation when training or reference data are limited.
- Integration with GMOD and Apollo Genome Browser: Produces output that can be loaded into a Generic Model Organism Database (GMOD) and viewed and edited in the Apollo Genome Browser.
- MAKER2 Scalability: MAKER2 is multi-threaded and parallelized to scale for large second-generation sequencing datasets.
- Handling Limited Training Data: Can produce accurate annotations when training data are limited or absent.
- Utilization of mRNA-seq Data: Incorporates mRNA-seq data to improve annotation quality and to update legacy annotations.
- Quality Evaluation and Management: Evaluates annotation quality and identifies problematic regions for manual review.
- Legacy Annotation Updates: Manages and improves existing genome annotations to update legacy datasets.
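The quality-evaluation feature above can be illustrated with a minimal sketch that flags transcripts for manual review from a GFF3 file. It assumes that each mRNA feature carries a numeric quality score in an attribute named `_AED` (lower is better) and that 0.5 is a sensible cutoff; both the attribute name and the threshold are assumptions to check against your own MAKER output, and the helper names are hypothetical.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def flag_for_review(gff3_lines, score_key="_AED", threshold=0.5):
    """Return IDs of mRNA features whose score exceeds the threshold.

    score_key and threshold are assumptions, not values taken from
    the MAKER documentation.
    """
    flagged = []
    for line in gff3_lines:
        # Skip blank lines, directives, and comments.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # GFF3 feature lines have exactly nine tab-separated columns.
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = parse_attributes(cols[8])
        if score_key not in attrs:
            continue
        try:
            score = float(attrs[score_key])
        except ValueError:
            continue
        if score > threshold:
            flagged.append(attrs.get("ID", "<no ID>"))
    return flagged
```

Passing an open file handle (for example `flag_for_review(open("annotations.gff"))`, with a hypothetical file name) returns the list of transcript IDs to inspect in a genome browser.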
Scientific Applications:
- Genome annotation of non-model organisms: Annotates genomes with limited prior genomic resources using integrated evidence and ab initio prediction.
- Schmidtea mediterranea annotation and SmedGD: Used to annotate Schmidtea mediterranea and to create the SmedGD genome database.
- Generation of community-accessible genome databases: Produces annotations and outputs suitable for building community genome databases from raw sequence and evidence.
- Updating legacy annotations: Updates and refines legacy annotations using mRNA-seq data and iterative retraining.
- Support for emerging model organism projects: Enables annotation in projects where pre-existing annotation resources are scarce.
Methodology:
MAKER integrates repeat identification, EST and protein alignment, ab initio gene prediction, and evidence synthesis with quality indices, and it iteratively retrains its gene-prediction algorithms on the output of earlier runs. MAKER2 adds multi-threaded parallel processing and incorporation of mRNA-seq data.
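The iterative retraining described above is commonly run as two or more rounds with different control-file settings. The sketch below generates a fragment of such settings for each round. It assumes the `maker_opts.ctl` keys `genome`, `est`, `protein`, `snaphmm`, `est2genome`, and `protein2genome`; the helper name and all file paths are hypothetical, and a real control file contains many more options than shown here.

```python
def control_fragment(genome, est, protein, snap_hmm=None):
    """Build a key=value fragment for one annotation round.

    Without a trained HMM (first round), gene models are inferred
    directly from EST and protein alignments. Once a retrained HMM
    is supplied, direct inference is switched off so that the ab
    initio predictor, guided by the evidence, produces the models.
    """
    infer_from_evidence = 0 if snap_hmm else 1
    settings = [
        ("genome", genome),
        ("est", est),
        ("protein", protein),
        ("snaphmm", snap_hmm or ""),
        ("est2genome", infer_from_evidence),
        ("protein2genome", infer_from_evidence),
    ]
    return "\n".join(f"{key}={value}" for key, value in settings) + "\n"


# Round 1: no trained predictor yet (hypothetical file names).
round1 = control_fragment("genome.fasta", "ests.fasta", "proteins.fasta")

# Round 2: reuse the same evidence with an HMM trained on round 1 output.
round2 = control_fragment(
    "genome.fasta", "ests.fasta", "proteins.fasta", snap_hmm="round1.hmm"
)
```

Training the predictor between rounds happens outside this sketch; the fragment only records which settings change from one round to the next.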
Details
- License: Artistic-2.0
- Tool Type: command-line tool
- Operating Systems: Linux
- Added: 10/3/2016
- Last Updated: 11/24/2024
Operations
- Genome annotation
Publications
Cantarel BL, Korf I, Robb SM, Parra G, Ross E, Moore B, Holt C, Sánchez Alvarado A, Yandell M. MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Research. 2008;18(1):188-196. doi:10.1101/gr.6743907. PMID:18025269. PMCID:PMC2134774.
Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 2011;12(1):491. doi:10.1186/1471-2105-12-491. PMID:22192575. PMCID:PMC3280279.