ORDB
ORDB (Olfactory Receptor Database) is a centralized repository of olfactory receptor and olfactory receptor-like gene and protein sequences. It automates sequence acquisition from GenBank and SWISS-PROT and structures sequence metadata for olfactory research.
Key Features:
- Automated sequence acquisition: Automates downloading of relevant sequences from web sources including GenBank and SWISS-PROT.
- HTML parsing and extraction: Uses HTML parsing techniques to extract sequence and annotation information from source pages.
- Metadata correlation: Correlates extracted data with existing metadata in the ORDB knowledge base.
- Unstructured-to-structured transformation: Transforms unstructured text into structured records aligned with the database architecture.
- EAV/CR data encoding: Encodes structured data using the Entity-Attribute-Value with Classes and Relationships (EAV/CR) model.
- Population methods: Supports batch, automatic, and semi-automatic population modes for importing data.
- XML import: Uses Extensible Markup Language (XML) to encode imported documents.
- SenseLab integration: Integrates structured data within the broader SenseLab project framework.
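The EAV/CR encoding listed above can be illustrated with a minimal in-memory sketch. It is not ORDB's actual schema: the class names (`GeneSequence`, `ProteinSequence`), the `encodes` relation, the accession values, and the `Store` helper are all hypothetical and chosen only to show how entities, attribute-value rows, and relationships fit together.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    entity_id: int
    class_name: str  # the "C" in EAV/CR: every entity belongs to a class


@dataclass
class AttributeValue:
    entity_id: int
    attribute: str
    value: str  # values are stored as text, one row per attribute


@dataclass
class Relationship:
    subject_id: int
    relation: str
    object_id: int  # the "R" in EAV/CR: a typed link between two entities


@dataclass
class Store:
    entities: list = field(default_factory=list)
    values: list = field(default_factory=list)
    relationships: list = field(default_factory=list)

    def add_entity(self, class_name, **attributes):
        """Create an entity and one attribute-value row per keyword argument."""
        entity = Entity(len(self.entities) + 1, class_name)
        self.entities.append(entity)
        for name, value in attributes.items():
            self.values.append(AttributeValue(entity.entity_id, name, str(value)))
        return entity.entity_id

    def relate(self, subject_id, relation, object_id):
        self.relationships.append(Relationship(subject_id, relation, object_id))

    def attributes_of(self, entity_id):
        """Reassemble the attribute-value rows of one entity into a dict."""
        return {row.attribute: row.value
                for row in self.values if row.entity_id == entity_id}


# Hypothetical records; the accessions are placeholders, not real entries.
store = Store()
gene = store.add_entity("GeneSequence", accession="EXAMPLE0001", organism="Mus musculus")
protein = store.add_entity("ProteinSequence", accession="EXAMPLE0002", length=312)
store.relate(gene, "encodes", protein)
```

Because each attribute is a row instead of a column, a new kind of annotation can be recorded without changing the table layout, which is the usual motivation for an EAV-style design.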
Scientific Applications:
- Sequence repository and curation: Centralized cataloging and management of olfactory receptor gene and protein sequences for research.
- Metadata structuring for olfaction research: Structuring unstructured sequence and annotation text to support downstream analyses of olfactory receptors.
- Integrated datasets for sequence analysis: Provision of integrated sequence and metadata resources to support analysis of genetic and protein sequences associated with olfaction.
Methodology:
Sequences are downloaded automatically from GenBank and SWISS-PROT, and HTML parsing extracts the sequence and annotation data. The extracted data are correlated with existing ORDB metadata, and the unstructured text is transformed into structured records encoded using the Entity-Attribute-Value with Classes and Relationships (EAV/CR) model. Records are encoded in XML and imported via batch, automatic, or semi-automatic population methods.
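The extraction and XML-encoding steps described here can be sketched with the Python standard library. This is an illustration under stated assumptions, not ORDB's implementation: the HTML fragment, its `<dt>`/`<dd>` layout, the `FieldExtractor` class, and the `<record>`/`<field>` XML vocabulary are all invented for the example, and real GenBank or SWISS-PROT pages are structured differently.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Invented page fragment standing in for a downloaded source page.
SAMPLE_HTML = """
<html><body>
<dl>
  <dt>Accession</dt><dd>EXAMPLE0001</dd>
  <dt>Organism</dt><dd>Mus musculus</dd>
  <dt>Sequence</dt><dd>ATGGCTAGC</dd>
</dl>
</body></html>
"""


class FieldExtractor(HTMLParser):
    """Collects <dt>label</dt><dd>value</dd> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._tag = None    # "dt" or "dd" while inside one of those elements
        self._label = None  # most recent <dt> text awaiting its <dd> value

    def handle_starttag(self, tag, attrs):
        if tag in ("dt", "dd"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("dt", "dd"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # skip whitespace between tags
        if self._tag == "dt":
            self._label = text.lower()
        elif self._tag == "dd" and self._label is not None:
            self.fields[self._label] = text
            self._label = None


def to_xml(fields):
    """Encode the extracted fields as a small XML record (hypothetical vocabulary)."""
    record = ET.Element("record")
    for name, value in fields.items():
        ET.SubElement(record, "field", name=name).text = value
    return ET.tostring(record, encoding="unicode")


extractor = FieldExtractor()
extractor.feed(SAMPLE_HTML)
xml_text = to_xml(extractor.fields)
print(xml_text)
```

In a semi-automatic population mode, a curator could review `extractor.fields` before the XML record is imported; in batch mode the same two steps would simply run over a list of downloaded pages.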
Details
- Tool Type: web application
- Added: 8/30/2023
- Last Updated: 11/24/2024
Publications
Crasto C. Olfactory Receptor Database: a metadata-driven automated population from sources of gene and protein sequences. Nucleic Acids Research. 2002;30(1):354-360. doi:10.1093/nar/30.1.354. PMID:11752336. PMCID:PMC99065.