USEARCH

USEARCH performs sensitive, high-throughput sequence searching and error reduction for next-generation sequencing (NGS) and amplicon data. It supports homology detection and functional inference from large sequence databases.

Key Features:

High-throughput sensitive search: Performs fast, non-profile-based homology searches against large sequence databases for functional inference.
Error-correction strategies: Implements three independent error-reduction approaches: filtering reads by expected number of errors, assembling overlapping read pairs, and leveraging unique sequence abundances in amplicon reads.
Low-coverage discrimination: Distinguishes true biological variation from sequencing errors, including under low-coverage conditions.
Benchmarking against other methods: Was systematically compared with CS-BLAST, HHSEARCH, PHMMER, NCBI-BLAST, UBLAST, and FASTA in an independent benchmark of homology inference tools (Saripella et al., 2016).
Domain-architecture evaluations: The benchmark datasets were based on the protein domain architecture schemes PFAM+Clan, SCOP/Superfamily, and CATH/Gene3D.
Speed–accuracy trade-off: Offers significant speed advantages relative to profile-based methods while exhibiting lower remote-homolog detection accuracy than top profile-based tools.
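
The expected-error filter listed above can be illustrated with a minimal Python sketch. It relies on the standard Phred relation, in which a quality score Q implies an error probability of 10^(-Q/10), and sums those probabilities over a read. The function names, the Phred+33 ASCII offset, and the threshold of 1.0 are assumptions chosen for illustration and are not taken from the USEARCH source.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum the per-base error probabilities implied by a FASTQ quality string.

    Each character encodes a Phred score Q = ord(char) - offset, and a base
    with score Q has error probability 10 ** (-Q / 10).
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_filter(quality: str, max_ee: float = 1.0, offset: int = 33) -> bool:
    """Keep a read only if its expected number of errors is at most max_ee.

    max_ee is an illustrative threshold, not a recommended setting.
    """
    return expected_errors(quality, offset) <= max_ee
```

A read whose quality string is "IIII" (four bases at Q40 under Phred+33) has an expected error count of about 0.0004 and passes; a read of two Q0 bases ("!!") has an expected error count of 2 and is discarded.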

Scientific Applications:

Homology detection and functional annotation: Infers function from sequences deposited in public databases via non-profile homology searches.
NGS and amplicon error reduction: Improves accuracy of amplicon and NGS datasets by applying expected-error filtering, read-pair assembly, and abundance-based correction.
Low-coverage variant discrimination: Facilitates separation of sequencing errors from true biological variation in low-coverage data.
Method benchmarking: Serves as a comparator in systematic evaluations of profile-based and non-profile-based homology search tools.
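
Abundance-based correction starts from the abundance of each unique sequence in an amplicon dataset. The sketch below shows only that first step, counting identical reads (often called dereplication), in plain Python. It is not USEARCH's correction algorithm, and the function name and the case-insensitive comparison are assumptions made for illustration.

```python
from collections import Counter


def dereplicate(reads):
    """Return (sequence, abundance) pairs, most abundant first.

    Reads are compared case-insensitively; ties are broken alphabetically
    so that the output order is deterministic.
    """
    counts = Counter(read.upper() for read in reads)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

High-abundance unique sequences are more likely to be correct biological sequences, while low-abundance sequences that differ from them at a few positions are candidates for sequencing errors.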

Methodology:

Uses three explicit error-correction methods: filtering reads by expected error counts, assembling overlapping read pairs, and leveraging unique sequence abundances in amplicon reads. Also performs non-profile-based homology searches, which have been benchmarked against CS-BLAST, HHSEARCH, PHMMER, NCBI-BLAST, UBLAST, and FASTA using datasets based on PFAM+Clan, SCOP/Superfamily, and CATH/Gene3D.
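
The idea of assembling overlapping read pairs can be sketched in Python as follows. This is a deliberately simplified illustration: it looks for the longest suffix of the forward read that matches a prefix of the reverse-complemented reverse read, and it ignores quality scores, which a production merger would use to resolve mismatches. The function names and the `min_overlap` and `max_mismatches` parameters are hypothetical and do not correspond to USEARCH options.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str,
               min_overlap: int = 4, max_mismatches: int = 0):
    """Merge a read pair into one sequence, or return None if no overlap fits.

    The reverse read is reverse-complemented so that both reads are on the
    same strand, then overlaps are tried from longest to shortest.
    """
    rc = reverse_complement(reverse)
    for overlap in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        tail, head = forward[-overlap:], rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatches:
            # Keep the forward read whole and append the non-overlapping part.
            return forward + rc[overlap:]
    return None
```

For a 14-base fragment read as a 10-base forward read and a 10-base reverse read, the two reads overlap by 6 bases and the merged result reproduces the original fragment.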

Topics

DNA, Sequence analysis, Sequencing

Details

Tool Type: desktop application
Operating Systems: Linux, Windows, Mac
Added: 8/3/2017
Last Updated: 11/24/2024

Operations

Sequence analysis

Publications

Edgar RC, Flyvbjerg H. Error filtering, pair assembly and error correction for next-generation sequencing reads. Bioinformatics. 2015;31(21):3476-3482. doi:10.1093/bioinformatics/btv401. PMID:26139637.

Saripella GV, Sonnhammer ELL, Forslund K. Benchmarking the next generation of homology inference tools. Bioinformatics. 2016;32(17):2636-2641. doi:10.1093/bioinformatics/btw305. PMID:27256311. PMCID:PMC5013910.

Documentation

General

http://www.drive5.com/usearch/manual/

Links

Software catalogue

http://www.mybiosoftware.com/usearch-4-1-93-uclust-ublast-sequence-search-clustering.html