Subread
Subread aligns genomic DNA-seq and RNA-seq reads to a reference genome, using a seed-and-vote strategy to find the best mapping location for each read.
Key Features:
- Seed-and-vote mapping paradigm: Multiple short seeds ("subreads") taken from each read vote on the best genomic location for that read.
- Subread extraction: Extracts multiple short subreads from each sequencing read to serve as the seeds for mapping.
- Overlapping subreads for short reads: Uses overlapping subreads when reads are shorter than 160 base pairs to ensure comprehensive coverage.
- Pre-alignment localization: Determines the overall genomic position of a read before performing detailed alignment to reduce computation.
- Tolerance of subread mismatches: Individual subreads need not map exactly, nor map close to one another, but the final mapping location must be supported by multiple distinct subreads.
- Automatic alignment mode selection: Automatically decides whether to apply global or local alignment for each read.
- Indel detection: Supports detection of insertions and deletions (indels) during alignment.
- Variable and fixed read length handling: Handles reads of both fixed and variable lengths.
- Exon junction identification: Identifies exon junctions by mapping sets of subreads that align to different exons within the same gene.
- Performance characteristics: Designed for speed, accuracy, and scalability in read mapping.
- Benchmarking relevance: Has been included in benchmarks of splice-aware aligners; these benchmarks show that alignment accuracy affects downstream analyses and varies with genome complexity.
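The subread extraction described in the list above can be sketched in a few lines of Python. This is an illustrative sketch only, not Subread's actual C implementation. The function name `extract_subreads` and the defaults `seed_len=16` and `n_seeds=10` are assumptions, chosen so that seeds begin to overlap when a read is shorter than 160 bases (16 × 10), which is consistent with the threshold mentioned above.

```python
def extract_subreads(read, seed_len=16, n_seeds=10):
    """Return (offset, seed) pairs spread evenly across the read.

    seed_len and n_seeds are illustrative assumptions, not values
    taken from Subread's source code.
    """
    if len(read) < seed_len:
        return []  # read too short to hold a single seed
    last_start = len(read) - seed_len
    if n_seeds == 1 or last_start == 0:
        return [(0, read[:seed_len])]
    # Evenly spaced start offsets from 0 to last_start. When the read is
    # shorter than seed_len * n_seeds, neighbouring seeds must overlap.
    starts = sorted({round(i * last_start / (n_seeds - 1)) for i in range(n_seeds)})
    return [(s, read[s:s + seed_len]) for s in starts]
```

With these assumed defaults, a 100-base read yields ten seeds whose start offsets are 9 or 10 bases apart, so neighbouring seeds overlap. A 200-base read yields seeds 20 or 21 bases apart, with no overlap.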
Scientific Applications:
- Genomic DNA-seq alignment: Aligns genomic DNA-seq reads to reference genomes for variant calling and genome analysis.
- RNA-seq alignment and splice discovery: Performs splice-aware RNA-seq alignment and exon junction discovery through subread mappings to different exons.
- Indel detection and variable-length read support: Detects insertions and deletions and accommodates reads of variable lengths.
- Benchmarking and pipeline optimization: Serves in benchmarking comparisons with other splice-aware aligners and informs parameter optimization to improve downstream analyses across genomes of varying complexity.
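The idea of finding exon junctions from subreads that map to different exons can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the algorithm Subread itself uses. The function name `find_candidate_junction`, the parameters `seed_len`, `step` and `min_votes`, and the returned dictionary keys are all hypothetical. Each exactly matching seed votes for the reference position at which the read would start. If two well-supported start positions come from different parts of the read, the distance between them suggests a skipped stretch of reference, such as an intron.

```python
from collections import defaultdict


def find_candidate_junction(read, reference, seed_len=8, step=4, min_votes=2):
    """Return a candidate junction supported by two groups of seeds, or None."""
    # Index every seed-length substring of the reference by its start position.
    index = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + seed_len]].append(pos)

    # For each implied read start (reference position minus read offset),
    # remember which read offsets supported it.
    support = defaultdict(list)
    for offset in range(0, len(read) - seed_len + 1, step):
        for ref_pos in index.get(read[offset:offset + seed_len], ()):
            support[ref_pos - offset].append(offset)

    # Keep the two best-supported starts and order them along the read.
    best = sorted(support, key=lambda d: len(support[d]), reverse=True)[:2]
    if len(best) < 2 or any(len(support[d]) < min_votes for d in best):
        return None
    left, right = sorted(best, key=lambda d: min(support[d]))
    if right <= left:
        return None  # not consistent with a skipped stretch of reference
    return {
        "left_start": left,
        "right_start": right,
        "skipped_bases": right - left,
        # Approximate window (read coordinates) containing the breakpoint:
        # end of the last left-supporting seed to the start of the first
        # right-supporting seed. Chance matches can blur this window.
        "breakpoint_window": (max(support[left]) + seed_len, min(support[right])),
    }
```

A read built from the end of one exon joined to the start of the next should produce two groups of votes whose implied start positions differ by the intron length. A read taken from a single contiguous region should return `None`.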
Methodology:
Subread extracts multiple short seeds (subreads) from each read, using overlapping subreads for reads shorter than 160 bp. The subreads then vote on the optimal genomic location, which fixes the overall position of the read before any detailed alignment is performed. For each read, Subread automatically chooses between global and local alignment and supports indel detection. Exon junctions are identified by mapping subreads to different exons.
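The voting step can be sketched as follows. This is a minimal Python illustration of the seed-and-vote idea, assuming exact seed matches against a simple hash index. It is not Subread's implementation, and the names `build_index` and `vote_on_location` and the parameters `seed_len`, `step` and `min_votes` are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(reference, seed_len):
    """Map every seed-length substring of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + seed_len]].append(pos)
    return index


def vote_on_location(read, index, seed_len, step, min_votes=2):
    """Return (read_start, votes) for the best-supported location, or None."""
    votes = Counter()
    for offset in range(0, len(read) - seed_len + 1, step):
        seed = read[offset:offset + seed_len]
        for ref_pos in index.get(seed, ()):
            # A seed found at ref_pos implies the read starts at ref_pos - offset.
            votes[ref_pos - offset] += 1
    if not votes:
        return None
    start, count = votes.most_common(1)[0]
    # Require support from several seeds before accepting the location.
    return (start, count) if count >= min_votes else None
```

Seeds that contain a mismatch simply fail to vote, so a read with a few sequencing errors can still be placed as long as enough of its other seeds agree on one start position. Detailed alignment would then only need to be carried out around that position.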
Details
- License: GPL-3.0
- Maturity: Mature
- Tool Type: workflow
- Operating Systems: Linux, Mac
- Programming Languages: C
- Added: 1/13/2017
- Last Updated: 5/19/2021
Publications
Liao Y, Smyth GK, Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Research. 2013;41(10):e108. doi:10.1093/nar/gkt214. PMID:23558742. PMCID:PMC3664803.
Baruzzo G, Hayer KE, Kim EJ, Di Camillo B, FitzGerald GA, Grant GR. Simulation-based comprehensive benchmarking of RNA-seq aligners. Nature Methods. 2016;14(2):135-139. doi:10.1038/nmeth.4106. PMID:27941783. PMCID:PMC5792058.