PolySearch

PolySearch performs associative text-mining queries across biomedical literature and bioinformatic databases to identify and rank associations among entities such as diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs, and metabolites for genomics, proteomics, and metabolomics research.

Key Features:

Associative query formulation: Supports "Given X, find all Y" queries where X or Y can be diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs, or metabolites.
Query breadth: Supports over 50 different classes of queries across nearly a dozen types of text sources, including scientific abstracts and bioinformatic databases.
Text mining and information retrieval: Employs advanced text-mining and information-retrieval techniques to identify, highlight, and rank relevant abstracts, paragraphs, and sentences.
Output granularity: Identifies and ranks information at the abstract, paragraph, and sentence levels.
Performance benchmarking: Evaluated on gene synonym identification, protein-protein interaction identification, and disease gene identification using manually assembled gold-standard text corpora, achieving F-measures of 88%, 81%, and 79%, respectively, with reported improvements of 5%–50% over other published tools.
Omics applicability: Applicable to genomics, proteomics, and metabolomics research contexts.
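
The F-measures quoted above combine precision and recall into a single score. The following is a minimal sketch of that calculation, assuming the balanced F-measure (F1), which this entry does not explicitly specify. The function name and the counts are hypothetical and are not taken from the PolySearch evaluation.

```python
def f_measure(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Balanced F-measure (F1): the harmonic mean of precision and recall.

    Assumption: the balanced form is used; the catalog entry only says "F-measure".
    """
    predicted = true_positives + false_positives  # items the tool reported
    actual = true_positives + false_negatives     # items in the gold standard
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
score = f_measure(true_positives=80, false_positives=10, false_negatives=12)
print(f"F-measure: {score:.3f}")
```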

Scientific Applications:

Associative discovery: Uncovers associations among genes, proteins, metabolites, diseases, drugs, SNPs, and mutations from literature and databases.
Gene synonym identification: Identifies gene synonyms within text corpora.
Protein-protein interaction identification: Extracts mentions of protein-protein interactions from scientific text.
Disease gene identification: Identifies gene-disease associations from literature evidence.
Cross-entity relationship mining: Enables discovery of relationships such as drug–gene, metabolite–disease, and SNP–phenotype associations.

Methodology:

Combines associative query formulation with text-mining and information-retrieval methods to identify, highlight, and rank relevant abstracts, paragraphs, and sentences. Benchmarking was performed against manually assembled gold-standard text corpora and reported as F-measures.
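
To make the "Given X, find all Y" idea concrete, here is a minimal sketch that ranks candidate terms by how many sentences mention them together with the query term. It is a generic sentence-level co-occurrence count and not PolySearch's actual scoring method; the function name, the naive sentence splitter, and the toy abstracts are all assumptions made for illustration.

```python
import re
from collections import Counter


def rank_associations(abstracts, query_term, candidate_terms):
    """Rank candidate terms by the number of sentences shared with query_term.

    Hypothetical helper: plain case-insensitive substring matching,
    with no synonym handling or relevance weighting.
    """
    counts = Counter()
    query = query_term.lower()
    for abstract in abstracts:
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", abstract):
            text = sentence.lower()
            if query not in text:
                continue
            for term in candidate_terms:
                if term.lower() in text:
                    counts[term] += 1
    # Most frequently co-occurring candidates first.
    return counts.most_common()


# Toy abstracts, for illustration only.
abstracts = [
    "BRCA1 mutations are linked to breast cancer. TP53 is also studied.",
    "Breast cancer risk rises with BRCA1 variants. BRCA2 and breast cancer are associated.",
]
ranked = rank_associations(abstracts, "breast cancer", ["BRCA1", "BRCA2", "TP53"])
print(ranked)
```

A production system would also need synonym expansion and a more robust sentence splitter; both are omitted here to keep the sketch short.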


Topics

Small molecules; Pathology; Genetic variation; Molecular interactions, pathways and networks; Metabolomics

Details

Tool Type: web application
Operating Systems: Linux, Windows, Mac
Programming Languages: Perl
Added: 3/24/2017
Last Updated: 12/10/2018

Publications

Cheng D, et al. PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites. Nucleic Acids Res. 2008; 36:W399-405. doi: 10.1093/nar/gkn296

PMID: 18487273

Documentation

General

http://wishart.biology.ualberta.ca/polysearch/cgi-bin/help.cgi
