Minimap2
Minimap2 aligns DNA sequences, spliced long mRNA/cDNA reads, and assembly contigs against large reference genomes and databases. It maps accurate short reads, noisy long reads, full-length transcripts, and very long genomic contigs.
Key Features:
- General-purpose alignment: Maps DNA or long mRNA/cDNA sequences against large reference genomes and databases.
- Supported sequence types: Handles accurate short reads ≥100 base pairs (bp), genomic reads >1 kilobase (kb) with ~15% error, full-length noisy Direct RNA or cDNA reads, and assembly contigs or chromosomes spanning hundreds of megabases (Mb).
- Ultra-long read support: Handles reads longer than 100 kb and genomic contigs longer than 100 Mb.
- Spliced alignment: Performs spliced nucleotide sequence alignment for mRNA/cDNA mapping.
- Split-read alignment strategy: Employs split-read alignment to effectively handle long insertions and deletions.
- Concave gap costs: Uses concave gap cost functions to model long indels.
- Heuristics to reduce spurious alignments: Incorporates heuristics that minimize spurious alignments.
- Performance: Runs faster than mainstream short-read mappers at comparable accuracy, and is reported to be at least 30× faster than other long-read genomic or cDNA mappers while achieving higher accuracy.
- Scalability: Designed to process very large datasets arising from ultra-long-read sequencing and large-contig assemblies.
Scientific Applications:
- Reference mapping: Mapping DNA and long mRNA/cDNA reads to large reference genomes and databases.
- Transcriptome alignment: Spliced alignment of full-length mRNA, cDNA, and Direct RNA reads for transcript mapping.
- Long-read and assembly alignment: Aligning noisy long genomic reads (>1 kb, ~15% error) and assembly contigs or chromosomes spanning hundreds of megabases.
- Long indel analysis: Facilitating analysis of long insertions and deletions via split-read alignment and concave gap costs.
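The applications above map onto different minimap2 presets. The following is a minimal sketch, not part of the tool itself, of a helper that assembles a command line for each data type. The function name, the dictionary keys, and the file names are hypothetical; the preset names (`sr`, `map-ont`, `map-pb`, `splice`, `asm5`) and the `-a` and `-x` options are taken from the minimap2 manual as best recalled and should be checked against the installed version.

```python
# Hypothetical helper: maps a data type to a minimap2 preset and builds
# the argument list. It only constructs the command; it does not run it.
PRESETS = {
    "short_reads": "sr",            # accurate short reads
    "nanopore_genomic": "map-ont",  # noisy long genomic reads (Nanopore)
    "pacbio_genomic": "map-pb",     # noisy long genomic reads (PacBio CLR)
    "spliced_long_reads": "splice", # long mRNA/cDNA reads, spliced alignment
    "assembly": "asm5",             # assembly contigs against a reference
}


def build_minimap2_command(data_type, reference, queries, sam_output=True):
    """Return the minimap2 command as a list of arguments."""
    if data_type not in PRESETS:
        raise ValueError(f"unknown data type: {data_type!r}")
    cmd = ["minimap2"]
    if sam_output:
        cmd.append("-a")  # request SAM output instead of the default PAF
    cmd += ["-x", PRESETS[data_type], reference]
    cmd += list(queries)
    return cmd
```

The returned list can be passed to `subprocess.run` on a system where minimap2 is installed, for example `build_minimap2_command("spliced_long_reads", "ref.fa", ["cdna.fq"])`.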
Methodology:
Minimap2 performs split-read alignment, uses concave gap costs to model long insertions and deletions, and applies heuristics to reduce spurious alignments.
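To illustrate what a concave gap cost looks like, the sketch below uses a two-piece affine function, which takes the cheaper of two affine penalties so that long gaps are penalized less steeply than short ones. The function name is hypothetical, and the parameter values (4 and 24 for gap open, 2 and 1 for gap extension) are assumed defaults that should be verified against the minimap2 documentation.

```python
def two_piece_gap_cost(length, open1=4, ext1=2, open2=24, ext2=1):
    """Cost of a gap of `length` bases under a two-piece affine model.

    The first piece (open1, ext1) is cheaper for short gaps; the second
    piece (open2, ext2) has a higher opening cost but a lower extension
    cost, so it wins for long gaps. Taking the minimum of the two makes
    the overall cost concave in the gap length.
    """
    if length <= 0:
        return 0
    return min(open1 + length * ext1, open2 + length * ext2)


# With the assumed parameters the two pieces cross at a gap length of 20:
# shorter gaps are scored by the first piece, longer ones by the second.
costs = {n: two_piece_gap_cost(n) for n in (1, 20, 100)}
```

Under these assumed values, a 100-base gap costs 124 rather than the 204 that the first affine piece alone would charge, which is why a long insertion or deletion can remain part of a single alignment.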
Details
- License: MIT
- Tool Type: command-line tool
- Operating Systems: Linux
- Programming Languages: JavaScript, Python, C
- Added: 11/29/2018
- Last Updated: 12/10/2018
Publications
Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094-3100. doi:10.1093/bioinformatics/bty191. PMID:29750242. PMCID:PMC6137996.