USEARCH

USEARCH performs sensitive, high-throughput sequence searching and error reduction for next-generation sequencing (NGS) and amplicon data. It supports homology detection and functional inference from large sequence databases.


Key Features:

  • High-throughput sensitive search: Performs fast, non-profile-based homology searches against large sequence databases for functional inference.
  • Error-correction strategies: Implements three independent error-reduction approaches: filtering reads by expected number of errors, assembling overlapping read pairs, and leveraging unique sequence abundances in amplicon reads.
  • Low-coverage discrimination: Distinguishes true biological variation from sequencing errors, including under low-coverage conditions.
  • Benchmarking against other methods: Has been systematically compared to CS-BLAST, HHSEARCH, PHMMER, NCBI-BLAST, UBLAST, and FASTA using challenging datasets.
  • Domain-architecture evaluations: Evaluations used protein domain architecture schemes such as PFAM+Clan, SCOP/Superfamily, and CATH/Gene3D.
  • Speed–accuracy trade-off: Offers significant speed advantages relative to profile-based methods while exhibiting lower remote-homolog detection accuracy than top profile-based tools.
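
The expected-error filter listed above can be illustrated with a short Python sketch. It assumes quality strings in Phred+33 encoding and sums the per-base error probabilities, P = 10^(-Q/10), over a read. The function names and the threshold of 1.0 are illustrative assumptions, not USEARCH's own interface or defaults.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Expected number of errors in a read.

    Each quality character encodes a Phred score Q = ord(char) - offset
    (offset 33 is an assumption), and the error probability of that base
    is 10 ** (-Q / 10). The expected error count is the sum over all bases.
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_filter(quality: str, max_ee: float = 1.0) -> bool:
    """Keep a read only if its expected error count is at most max_ee.

    max_ee = 1.0 is a hypothetical threshold chosen for illustration.
    """
    return expected_errors(quality) <= max_ee
```

Under these assumptions, a read whose bases all have Q = 40 contributes 0.0001 expected errors per base, so long high-quality reads pass easily while reads with many low-quality bases are discarded.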

Scientific Applications:

  • Homology detection and functional annotation: Infers function from sequences deposited in public databases via non-profile homology searches.
  • NGS and amplicon error reduction: Improves accuracy of amplicon and NGS datasets by applying expected-error filtering, read-pair assembly, and abundance-based correction.
  • Low-coverage variant discrimination: Facilitates separation of sequencing errors from true biological variation in low-coverage data.
  • Method benchmarking: Serves as a comparator in systematic evaluations of profile-based and non-profile-based homology search tools.
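
As a rough illustration of the abundance-based idea, the sketch below tabulates how often each unique sequence occurs in a set of amplicon reads and separates abundant sequences from rare ones. This covers only the counting step; the correction procedure described by Edgar and Flyvbjerg (2015) is more involved. The function names and the `min_abundance` cutoff are hypothetical.

```python
from collections import Counter


def unique_abundances(reads):
    """Return (sequence, count) pairs, most abundant first."""
    return Counter(reads).most_common()


def split_by_abundance(reads, min_abundance=2):
    """Separate unique sequences into abundant and rare groups.

    min_abundance is an illustrative cutoff: sequences seen fewer times
    are treated here as candidates for being sequencing errors.
    """
    abundant, rare = [], []
    for seq, count in unique_abundances(reads):
        (abundant if count >= min_abundance else rare).append((seq, count))
    return abundant, rare
```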

Methodology:

Applies three error-reduction methods: filtering reads by expected error count, assembling overlapping read pairs, and exploiting the abundances of unique sequences in amplicon reads. Its non-profile-based homology search has been benchmarked against CS-BLAST, HHSEARCH, PHMMER, NCBI-BLAST, UBLAST, and FASTA on datasets based on the PFAM+Clan, SCOP/Superfamily, and CATH/Gene3D classifications.
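
The assembly of overlapping read pairs can be sketched as follows. This is a simplified illustration, not USEARCH's algorithm: it reverse-complements the reverse read, looks for the longest suffix/prefix overlap within a mismatch budget, and at each disagreeing position keeps the base with the higher quality character. The function names, `min_overlap`, and `max_mismatch` are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence (uppercase A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(fwd, fwd_q, rev, rev_q, min_overlap=4, max_mismatch=0):
    """Merge a forward and reverse read into one sequence, or return None.

    Tries the longest overlap first. min_overlap (at least 1) and
    max_mismatch are illustrative parameters.
    """
    r = reverse_complement(rev)
    rq = rev_q[::-1]  # qualities must follow the reversed orientation
    for ov in range(min(len(fwd), len(r)), min_overlap - 1, -1):
        a, b = fwd[-ov:], r[:ov]
        mismatches = sum(x != y for x, y in zip(a, b))
        if mismatches <= max_mismatch:
            aq, bq = fwd_q[-ov:], rq[:ov]
            # In the overlap, keep the base with the higher quality character.
            overlap = "".join(
                x if qx >= qy else y for x, qx, y, qy in zip(a, aq, b, bq)
            )
            return fwd[:-ov] + overlap + r[ov:]
    return None
```

Comparing quality characters directly works here because Phred-encoded characters sort in the same order as their scores. A fuller implementation would also recompute the quality of each merged base.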

Details

Tool Type: desktop application
Operating Systems: Linux, Windows, Mac
Added: 8/3/2017
Last Updated: 11/24/2024

Publications

Edgar RC, Flyvbjerg H. Error filtering, pair assembly and error correction for next-generation sequencing reads. Bioinformatics. 2015;31(21):3476-3482. doi:10.1093/bioinformatics/btv401. PMID:26139637.

Saripella GV, Sonnhammer ELL, Forslund K. Benchmarking the next generation of homology inference tools. Bioinformatics. 2016;32(17):2636-2641. doi:10.1093/bioinformatics/btw305. PMID:27256311. PMCID:PMC5013910.
