UniParc
UniParc provides a comprehensive non-redundant archive of protein sequences by assigning a stable UniParc identifier (UPI) to each distinct sequence and aggregating cross-references to source databases.
Key Features:
- Non-redundant sequence archive: Stores each unique protein sequence once to eliminate redundancy across aggregated data sources.
- Stable identifiers (UPI): Assigns a stable UniParc identifier to every distinct protein sequence.
- Daily updates: Integrates new and revised protein sequences from publicly accessible source databases on a daily basis.
- Cross-references to source databases: Creates links from UniParc entries back to originating databases for traceability of sequence provenance.
- Centralized sequence-only repository: Maintains only sequences and their cross-references, with additional annotations retained in the original source databases.
- Aggregated search: Because each UniParc entry carries cross-references to every source database that holds the sequence, a single search against UniParc covers all of those databases at once.
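The aggregated search feature can be illustrated with a small sketch. It is not UniParc's implementation. The `search` function, the dictionary layout, the source names (`SourceA`, `SourceB`), the accessions and the sequences are all hypothetical, chosen only to show how one match against a non-redundant entry reports every source database that holds the sequence.

```python
def search(entries, motif):
    """Return {UPI: sorted source databases} for entries whose sequence contains motif.

    `entries` maps a UPI to (sequence, [(database, accession), ...]).
    This layout is an assumption made for illustration only.
    """
    motif = motif.upper()
    hits = {}
    for upi, (sequence, xrefs) in entries.items():
        if motif in sequence:
            # One match on the unique sequence reports every cross-referenced source.
            hits[upi] = sorted({database for database, _ in xrefs})
    return hits


# Hypothetical archive content: two unique sequences, three source records.
entries = {
    "UPI0000000001": ("MKTAYIAKQR", [("SourceA", "A0001"), ("SourceB", "B0042")]),
    "UPI0000000002": ("MSTNPKPQRK", [("SourceA", "A0002")]),
}

hits = search(entries, "tay")
print(hits)
```

Each sequence is stored once, so the query scans two sequences rather than three source records, and the hit still names both databases that contain it.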
Scientific Applications:
- Comparative genomics: Provides a consolidated set of unique protein sequences for cross-species sequence comparison.
- Evolutionary studies: Supplies non-redundant sequences and provenance links useful for tracing sequence conservation and divergence.
- Functional annotation projects: Serves as a central sequence index with UPIs and cross-references to support annotation efforts using source-database metadata.
- Protein sequence analysis: Enables comprehensive sequence searches across aggregated source databases via UniParc entries.
Methodology:
UniParc collects new and updated protein sequences from public source databases on a daily schedule, assigns a stable UniParc identifier (UPI) to each distinct sequence, and cross-references each entry to the databases the sequence came from.
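A minimal sketch of this load step follows, under stated assumptions. The `Archive` and `Entry` classes, the sequential counter and the source names are hypothetical. The identifier layout (`UPI` followed by ten hexadecimal digits) is assumed here for illustration and is not specified by this page. Real loading also involves scheduling and source-specific parsing, which this toy version omits.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    upi: str
    sequence: str
    xrefs: set = field(default_factory=set)  # (database, accession) pairs


class Archive:
    """Toy in-memory stand-in for a non-redundant sequence archive."""

    def __init__(self):
        self._by_sequence = {}
        self._next_id = 1

    def load(self, database, accession, sequence):
        """Register one source record and return the entry for its sequence."""
        key = sequence.strip().upper()
        entry = self._by_sequence.get(key)
        if entry is None:
            # New distinct sequence: mint an identifier (format is an assumption).
            upi = f"UPI{self._next_id:010X}"
            self._next_id += 1
            entry = Entry(upi, key)
            self._by_sequence[key] = entry
        # Known or new, record where the sequence was seen.
        entry.xrefs.add((database, accession))
        return entry


archive = Archive()
a = archive.load("SourceA", "A0001", "MKTAYIAKQR")
b = archive.load("SourceB", "B0042", "mktayiakqr")  # same sequence, other source
c = archive.load("SourceA", "A0002", "MSTNPKPQRK")
print(a.upi, sorted(a.xrefs), c.upi)
```

Keying the store on the sequence itself is what keeps the archive non-redundant: the second load of the same sequence reuses the existing identifier and only adds a cross-reference.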
Details
- Tool Type: web application
- Operating Systems: Linux, Windows, Mac
- Added: 6/11/2015
- Last Updated: 11/24/2024
Operations
Data Inputs & Outputs
Query and retrieval
Publications
Leinonen R, Diez FG, Binns D, Fleischmann W, Lopez R, Apweiler R. UniProt archive. Bioinformatics. 2004;20(17):3236-3237. doi:10.1093/bioinformatics/bth191. PMID:15044231.
Documentation
General
http://www.uniprot.org/help/