dada2
dada2 infers exact amplicon sequence variants from demultiplexed Illumina-sequenced fastq files to enable accurate, high-resolution characterization of microbial communities.
Key Features:
- Exact Sequence Variant (SV) inference: Resolves sequence variants that differ by as little as a single nucleotide, instead of clustering reads into Operational Taxonomic Units (OTUs).
- Processing of demultiplexed fastq: Processes demultiplexed fastq files from Illumina-sequenced datasets to derive SVs and sample-wise abundances.
- Error removal: Identifies and removes substitution errors and chimeric sequences to minimize spurious variants.
- Sample-wise abundance output: Outputs precise sequence variants together with their sample-wise abundances.
- Integrated taxonomic classification: Implements the RDP naive Bayesian classifier for taxonomic assignment of sequence variants.
- Benchmarking performance: In mock community comparisons, identified more genuine sequence variants and output fewer spurious sequences than other methods (Callahan et al., 2016).
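To make the contrast with OTU clustering concrete, the sketch below compares two hypothetical 250-nt amplicons that differ at one position. It is a toy illustration in Python, not part of dada2 (which is an R package). The sequences, the helper names, and the 97% identity threshold (a common OTU convention, not something stated on this page) are all assumptions introduced for the example.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    return 100.0 * (len(a) - hamming(a, b)) / len(a)


# Hypothetical 250-nt amplicon (62 repeats of ACGT plus AC).
seq_a = "ACGT" * 62 + "AC"
# Same amplicon with a single substitution at position 100 (A -> G).
seq_b = seq_a[:100] + "G" + seq_a[101:]

OTU_THRESHOLD = 97.0  # assumed, conventional OTU identity cutoff

identity = percent_identity(seq_a, seq_b)
same_otu = identity >= OTU_THRESHOLD   # a 97% OTU would merge the two
distinct_variants = seq_a != seq_b     # exact variants keep them apart

print(f"identity = {identity:.1f}%")   # prints: identity = 99.6%
print("merged into one OTU:", same_otu)
print("reported as distinct sequence variants:", distinct_variants)
```

A single mismatch over 250 nt leaves 99.6% identity, so threshold-based clustering cannot separate the two sequences, whereas exact variant inference can.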
Scientific Applications:
- Amplicon sequencing analysis: High-resolution characterization of microbial communities from Illumina-sequenced amplicon datasets.
- Mock community benchmarking: Comparative evaluation against other methods on mock communities, in which dada2 identified more genuine variants.
- Intra-species diversity detection: Detection of subtle intra-species variants, exemplified by uncovering diverse Lactobacillus crispatus variants in vaginal samples.
- Taxonomic profiling: Taxonomic classification of microbial community members using the integrated RDP naive Bayesian classifier.
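The naive Bayesian approach behind the taxonomic assignment can be sketched as follows. This is a simplified Python toy in the spirit of the RDP classifier, not dada2's implementation: it uses 4-mers on short made-up sequences (the RDP classifier uses 8-mers), omits the bootstrap confidence step, and the reference sequences, genus names, and function names are hypothetical.

```python
import math

K = 4  # toy word size chosen for short examples; RDP uses 8-mers


def kmers(seq, k=K):
    """Return the set of distinct k-mers (words) in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(references):
    """Index reference (taxon, sequence) pairs by the words they contain."""
    word_counts = {}  # number of reference sequences containing each word
    by_taxon = {}     # taxon -> list of word sets, one per reference sequence
    for taxon, seq in references:
        words = kmers(seq)
        by_taxon.setdefault(taxon, []).append(words)
        for w in words:
            word_counts[w] = word_counts.get(w, 0) + 1
    return len(references), word_counts, by_taxon


def classify(query, model):
    """Return the taxon with the highest summed log word probability."""
    n_total, word_counts, by_taxon = model
    best_taxon, best_score = None, -math.inf
    for taxon, members in by_taxon.items():
        m_total = len(members)
        score = 0.0
        for w in kmers(query):
            # Word-specific prior, estimated from the whole reference set.
            prior = (word_counts.get(w, 0) + 0.5) / (n_total + 1)
            # Number of this taxon's sequences that contain the word.
            m_w = sum(w in words for words in members)
            score += math.log((m_w + prior) / (m_total + 1))
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon


# Hypothetical reference set: two sequences per genus.
REFERENCES = [
    ("GenusA", "ACGTACGTACGTAAAACCCC"),
    ("GenusA", "ACGTACGTACGTAAAACCCG"),
    ("GenusB", "TTGGCCAATTGGCCAAGGTT"),
    ("GenusB", "TTGGCCAATTGGCCAAGGTA"),
]

model = train(REFERENCES)
print(classify("ACGTACGTACGTAAAACCCA", model))  # prints: GenusA
print(classify("TTGGCCAATTGGCCAAGGTC", model))  # prints: GenusB
```

Each query differs from its nearest references only in the final base, yet shares almost all of its words with one genus and none with the other, which is what drives the assignment.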
Methodology:
dada2 processes demultiplexed fastq files to infer exact sequence variants, removes substitution errors and chimeric sequences, and outputs the sequence variants with their sample-wise abundances. It also includes an implementation of the RDP naive Bayesian classifier for taxonomic assignment.
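The shape of the final output, sequence variants with sample-wise abundances, can be illustrated with a small Python sketch. It performs no error correction or chimera removal; it only tabulates reads that are assumed to be already denoised. The sample names, sequences, and the `abundance_table` helper are hypothetical.

```python
from collections import Counter


def abundance_table(samples):
    """Build a sample-by-variant count table from per-sample read lists.

    Returns the sorted list of variants (the column order) and a dict
    mapping each sample name to its row of counts.
    """
    counts = {name: Counter(reads) for name, reads in samples.items()}
    variants = sorted({seq for c in counts.values() for seq in c})
    table = {name: [c[v] for v in variants] for name, c in counts.items()}
    return variants, table


# Hypothetical, already-denoised reads for two samples.
samples = {
    "sample1": ["ACGTAC", "ACGTAC", "ACGTAT", "ACGTAC"],
    "sample2": ["ACGTAT", "ACGTAT", "GGGTAC"],
}

variants, table = abundance_table(samples)
print(variants)          # prints: ['ACGTAC', 'ACGTAT', 'GGGTAC']
print(table["sample1"])  # prints: [3, 1, 0]
print(table["sample2"])  # prints: [0, 2, 1]
```

Rows are samples and columns are exact sequences, so variants that differ by one base (here `ACGTAC` and `ACGTAT`) keep separate counts.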
Details
- License: GPL-3.0
- Tool Type: command-line tool, library
- Operating Systems: Linux, Windows, Mac
- Programming Languages: R
- Added: 1/17/2017
- Last Updated: 11/24/2024
Operations
Data Inputs & Outputs
DNA barcoding
Publications
Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods. 2016;13(7):581-583. doi:10.1038/nmeth.3869. PMID:27214047. PMCID:PMC4927377.