dada2

dada2 infers exact amplicon sequence variants from demultiplexed fastq files produced by Illumina sequencing, enabling accurate, high-resolution characterization of microbial communities.


Key Features:

  • Exact Sequence Variant (SV) inference: Resolves sequence variants that differ by as little as a single nucleotide, rather than clustering reads into Operational Taxonomic Units (OTUs).
  • Processing of demultiplexed fastq: Processes demultiplexed fastq files from Illumina-sequenced datasets to derive SVs and sample-wise abundances.
  • Error removal: Identifies and removes substitution errors and chimeric sequences to minimize spurious variants.
  • Sample-wise abundance output: Outputs precise sequence variants together with their sample-wise abundances.
  • Integrated taxonomic classification: Implements the RDP naive Bayesian classifier for taxonomic assignment of sequence variants.
  • Benchmarking performance: In mock community comparisons, recovered more genuine sequence variants than other methods.
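
To make the idea of exact variants with sample-wise abundances concrete, the following Python sketch counts identical sequences per sample and arranges them in a sample-by-sequence table. It is an illustration only, not dada2's algorithm: it performs no error correction, and the function names (`abundance_table`, `hamming`) and the toy reads are assumptions made for this example.

```python
from collections import Counter


def abundance_table(samples):
    """Count each exact sequence per sample.

    samples: dict mapping sample name -> list of read sequences.
    Returns (sequences, table), where table maps each sample name to a
    list of counts aligned with the sorted list of unique sequences.
    """
    counts = {name: Counter(reads) for name, reads in samples.items()}
    sequences = sorted({seq for c in counts.values() for seq in c})
    table = {name: [c[seq] for seq in sequences] for name, c in counts.items()}
    return sequences, table


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy reads; real input would be demultiplexed fastq files.
samples = {
    "sampleA": ["ACGTACGT", "ACGTACGT", "ACGTACGA"],
    "sampleB": ["ACGTACGA", "ACGTACGA", "TTGTACGT"],
}

sequences, table = abundance_table(samples)
for name, row in table.items():
    print(name, dict(zip(sequences, row)))
```

In this toy data the first two sequences differ at a single position yet keep separate columns, which is the level of resolution that clustering into OTUs would typically merge away.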

Scientific Applications:

  • Amplicon sequencing analysis: High-resolution characterization of microbial communities from Illumina-sequenced amplicon datasets.
  • Mock community benchmarking: Comparative evaluation on mock communities, in which dada2 identified more genuine variants than other methods.
  • Intra-species diversity detection: Detection of subtle intra-species variants, exemplified by uncovering diverse Lactobacillus crispatus variants in vaginal samples.
  • Taxonomic profiling: Comprehensive microbial community taxonomic classification using the integrated RDP naive Bayesian classifier.
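
The taxonomic assignment mentioned above rests on naive Bayesian classification over short subsequences (k-mers). The Python sketch below shows that general idea on toy data. It is a simplified illustration, not the RDP classifier or dada2's implementation: the smoothing formula, the value `k = 4`, the function names, and the reference sequences and taxon labels are all assumptions chosen for this example.

```python
import math


def kmers(seq, k):
    """Set of distinct k-mers occurring in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(references, k):
    """Build a per-taxon model from {taxon: [sequences]}.

    For each taxon, store how many of its reference sequences contain
    each k-mer, together with the number of reference sequences.
    """
    model = {}
    for taxon, seqs in references.items():
        counts = {}
        for s in seqs:
            for w in kmers(s, k):
                counts[w] = counts.get(w, 0) + 1
        model[taxon] = (counts, len(seqs))
    return model


def classify(query, model, k):
    """Return (best_taxon, log_scores) for the query sequence.

    Each k-mer contributes log((m + 0.5) / (n + 1)), where m is the number
    of the taxon's references containing it and n is the number of references
    for that taxon. This smoothing is an assumption made for the sketch.
    """
    scores = {}
    for taxon, (counts, n) in model.items():
        scores[taxon] = sum(
            math.log((counts.get(w, 0) + 0.5) / (n + 1)) for w in kmers(query, k)
        )
    return max(scores, key=scores.get), scores


# Hypothetical reference sequences and labels, for illustration only.
references = {
    "TaxonA": ["ACGTACGTAC", "ACGTACGTTC"],
    "TaxonB": ["GGCCGGCCAA", "GGCCGGCTAA"],
}
model = train(references, k=4)
best, scores = classify("ACGTACGTAC", model, k=4)
print(best)
```

A real classifier would be trained on a curated reference database and would usually report a confidence estimate alongside each assignment; neither is attempted here.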

Methodology:

dada2 processes demultiplexed fastq files to infer exact sequence variants, removes substitution errors and chimeric sequences, and outputs the remaining variants with their sample-wise abundances. It also includes an implementation of the RDP naive Bayesian classifier for taxonomic assignment.
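
The chimera-removal step can be illustrated with a toy check for two-parent chimeras (bimeras): a sequence is flagged if it equals the left part of one more abundant sequence joined to the right part of another. This Python sketch assumes equal-length sequences and exact matching, and the function names and counts are invented for the example. It is not dada2's own procedure and only conveys the idea.

```python
def is_bimera(seq, parents):
    """True if seq is a left segment of one parent plus a right segment of another."""
    for left in parents:
        for right in parents:
            if left == right or seq in (left, right):
                continue
            if len(left) != len(seq) or len(right) != len(seq):
                continue
            # Try every internal breakpoint.
            for i in range(1, len(seq)):
                if seq[:i] == left[:i] and seq[i:] == right[i:]:
                    return True
    return False


def remove_bimeras(counts):
    """Drop sequences that are bimeras of strictly more abundant kept sequences.

    counts: dict mapping sequence -> abundance.
    """
    kept = {}
    for seq, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        parents = [p for p, m in kept.items() if m > n]
        if not is_bimera(seq, parents):
            kept[seq] = n
    return kept


# Hypothetical abundances: the third sequence joins halves of the first two.
counts = {
    "AAAAAAAA": 100,
    "CCCCCCCC": 80,
    "AAAACCCC": 5,
    "AAAAGGGG": 4,
}
cleaned = remove_bimeras(counts)
print(cleaned)
```

Processing sequences from most to least abundant reflects the assumption that a chimera is rarer than the sequences it was formed from.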

Details

License: GPL-3.0
Tool Type: command-line tool, library
Operating Systems: Linux, Windows, Mac
Programming Languages: R
Added: 1/17/2017
Last Updated: 11/24/2024

Data Inputs & Outputs

DNA barcoding

Publications

Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP. DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods. 2016;13(7):581-583. doi:10.1038/nmeth.3869. PMID:27214047. PMCID:PMC4927377.
