USEARCH
USEARCH performs fast, sensitive sequence searches against large databases for homology detection and functional inference, and provides error-reduction methods for next-generation sequencing (NGS) and amplicon reads.
Key Features:
- High-throughput sensitive search: Performs fast, non-profile-based homology searches against large sequence databases for functional inference.
- Error-correction strategies: Implements three independent error-reduction approaches: filtering reads by expected number of errors, assembling overlapping read pairs, and leveraging unique sequence abundances in amplicon reads.
- Low-coverage discrimination: Distinguishes true biological variation from sequencing errors, including under low-coverage conditions.
- Benchmarking against other methods: Has been systematically compared to CS-BLAST, HHSEARCH, PHMMER, NCBI-BLAST, UBLAST, and FASTA using challenging datasets.
- Domain-architecture evaluations: Evaluations used protein domain architecture schemes such as PFAM+Clan, SCOP/Superfamily, and CATH/Gene3D.
- Speed–accuracy trade-off: Offers significant speed advantages relative to profile-based methods while exhibiting lower remote-homolog detection accuracy than top profile-based tools.
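The expected-error filter listed above can be illustrated with a short sketch. It assumes Phred+33 quality encoding and the standard relation between a Phred score Q and error probability, p = 10^(-Q/10); the function names and the default threshold of 1.0 are hypothetical choices for illustration, not USEARCH's own interface.

```python
def expected_errors(quality: str, phred_offset: int = 33) -> float:
    """Sum the per-base error probabilities implied by a FASTQ quality string.

    Assumes each character encodes a Phred score Q as chr(Q + phred_offset),
    so the error probability of that base is 10 ** (-Q / 10).
    """
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality)


def passes_filter(quality: str, max_ee: float = 1.0, phred_offset: int = 33) -> bool:
    """Keep a read only if its expected number of errors is at most max_ee.

    max_ee is an illustrative threshold; choose it to suit the data.
    """
    return expected_errors(quality, phred_offset) <= max_ee


if __name__ == "__main__":
    # Four bases at Q40 ('I' in Phred+33) versus two bases at Q0 ('!').
    for qual in ("IIII", "!!"):
        print(qual, expected_errors(qual), passes_filter(qual))
```

Summing probabilities over the whole read, rather than averaging quality scores, means a few very poor bases are enough to reject an otherwise good read.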
Scientific Applications:
- Homology detection and functional annotation: Infers function from sequences deposited in public databases via non-profile homology searches.
- NGS and amplicon error reduction: Improves accuracy of amplicon and NGS datasets by applying expected-error filtering, read-pair assembly, and abundance-based correction.
- Low-coverage variant discrimination: Facilitates separation of sequencing errors from true biological variation in low-coverage data.
- Method benchmarking: Serves as a comparator in systematic evaluations of profile-based and non-profile-based homology search tools.
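As a rough illustration of read-pair assembly, the sketch below merges a forward read with the reverse complement of its mate by searching for the longest overlap. It is a simplified stand-in, not the USEARCH procedure: it ignores quality scores entirely, and the names `merge_pair`, `min_overlap`, and `max_mismatches` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str, min_overlap: int = 4, max_mismatches: int = 0):
    """Merge a read pair on the longest acceptable overlap, or return None.

    The reverse read is reverse-complemented, then the end of the forward
    read is compared with the start of that sequence, longest overlap first.
    Where mismatches are tolerated, the forward read's bases are kept.
    """
    rc = reverse_complement(reverse)
    for overlap in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        tail, head = forward[-overlap:], rc[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatches:
            return forward + rc[overlap:]
    return None


if __name__ == "__main__":
    # Both reads come from the 12-base fragment ACGTTGCAAGGC.
    print(merge_pair("ACGTTGCA", "GCCTTGCA"))
```

A quality-aware merger would use the two base calls in the overlap to resolve disagreements instead of always keeping the forward base.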
Methodology:
Applies three error-reduction methods: filtering reads by expected number of errors, assembling overlapping read pairs, and exploiting unique sequence abundances in amplicon reads. Homology searches are non-profile-based and were benchmarked against CS-BLAST, HHSEARCH, PHMMER, NCBI-BLAST, UBLAST, and FASTA on datasets built from the PFAM+Clan, SCOP/Superfamily, and CATH/Gene3D domain-architecture schemes.
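The role of unique sequence abundances can be sketched as follows. Reads are first collapsed into unique sequences with counts; a rare sequence that differs by a single substitution from a much more abundant one is then flagged as a probable sequencing error. This is a toy heuristic under stated assumptions (equal-length reads, substitutions only, hypothetical parameters `max_diffs` and `min_ratio`), not the algorithm implemented in USEARCH.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse reads into (sequence, abundance) pairs, most abundant first."""
    return Counter(reads).most_common()


def hamming(a: str, b: str):
    """Count substitutions between equal-length sequences; None if lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def flag_likely_errors(uniques, max_diffs: int = 1, min_ratio: int = 10):
    """Map each suspect sequence to the abundant sequence it probably derives from.

    A sequence is flagged when an earlier (more abundant) entry is within
    max_diffs substitutions and at least min_ratio times as abundant.
    """
    flagged = {}
    for i, (seq, size) in enumerate(uniques):
        for parent, parent_size in uniques[:i]:
            d = hamming(seq, parent)
            if d is not None and 0 < d <= max_diffs and parent_size >= min_ratio * size:
                flagged[seq] = parent
                break
    return flagged


if __name__ == "__main__":
    reads = ["ACGT"] * 20 + ["ACGA"] + ["TTTT"] * 5
    uniques = dereplicate(reads)
    print(uniques)
    print(flag_likely_errors(uniques))
```

The abundance ratio is what lets the heuristic spare a genuine low-abundance variant: a sequence that is rare but has no abundant near neighbor is left alone.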
Details
- Tool Type: desktop application
- Operating Systems: Linux, Windows, Mac
- Added: 8/3/2017
- Last Updated: 11/24/2024
Publications
Edgar RC, Flyvbjerg H. Error filtering, pair assembly and error correction for next-generation sequencing reads. Bioinformatics. 2015;31(21):3476-3482. doi:10.1093/bioinformatics/btv401. PMID:26139637.
Saripella GV, Sonnhammer ELL, Forslund K. Benchmarking the next generation of homology inference tools. Bioinformatics. 2016;32(17):2636-2641. doi:10.1093/bioinformatics/btw305. PMID:27256311. PMCID:PMC5013910.