PROTEO: Peptide evidence for the human genome

PROTEO interrogates peptide evidence from human proteomics datasets to determine whether a single dominant protein isoform is expressed per protein-coding gene across tissues and cell types.


Key Features:

  • Large-scale data integration: Analyzes peptides from eight large-scale human proteomics experiments and databases to capture protein expression across tissues and cell types.
  • Peptide evidence filtering: Considers only peptides observed in at least two independent experiments to increase robustness.
  • Isoform dominance analysis: Identifies whether a single dominant splice isoform is observed at the protein level for each gene irrespective of tissue or cellular context.
  • Cross-validation with orthogonal references: Cross-references findings with consensus coding sequence variants curated by genome curation teams and APPRIS principal isoforms, which are predicted based on conservation of protein sequence, structure, and function.
  • Comparison with transcriptomics: Assesses concordance between protein-level dominant isoforms and the findings of recent RNA-seq studies.
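
The peptide evidence filter listed above can be illustrated with a minimal Python sketch. It assumes observations arrive as (peptide, experiment) pairs; the function name `filter_peptides`, the `min_experiments` parameter, and the input layout are hypothetical and are not taken from PROTEO itself.

```python
from collections import defaultdict


def filter_peptides(observations, min_experiments=2):
    """Return peptides seen in at least `min_experiments` distinct experiments.

    `observations` is an iterable of (peptide, experiment) pairs.
    This input layout is an assumption made for illustration only.
    """
    experiments_by_peptide = defaultdict(set)
    for peptide, experiment in observations:
        # A set ignores repeated detections within the same experiment.
        experiments_by_peptide[peptide].add(experiment)
    return {
        peptide
        for peptide, experiments in experiments_by_peptide.items()
        if len(experiments) >= min_experiments
    }


# Hypothetical observations: PEPTIDEB is seen twice, but in one experiment only.
example = [
    ("PEPTIDEA", "exp1"),
    ("PEPTIDEA", "exp2"),
    ("PEPTIDEB", "exp1"),
    ("PEPTIDEB", "exp1"),
]
kept = filter_peptides(example)
```

Counting distinct experiments rather than raw detections means that a peptide reported many times by a single experiment is still excluded.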

Scientific Applications:

  • Resolving transcript-protein discrepancies: Clarifies contradictions from previous large-scale transcript expression studies by providing protein-level evidence.
  • Alternative splicing research: Supplies proteomic evidence to identify dominant isoforms for studies of alternative splicing and its functional consequences.
  • Genome annotation and curation: Supports annotation decisions by comparing proteomic isoform dominance with consensus coding sequence variants and APPRIS principal isoforms.

Methodology:

PROTEO draws on peptides from eight large-scale human proteomics experiments and databases and retains only those detected in at least two experiments. From this filtered evidence it identifies a dominant protein isoform for each gene, then cross-references the result with consensus coding sequence variants curated by genome curation teams and with APPRIS principal isoforms.
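
The isoform selection and cross-referencing steps can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: dominance is assumed here to mean the isoform with strictly the most supporting peptides, with ties left unresolved, and the names `dominant_isoforms` and `agreement` and the input layouts are hypothetical.

```python
def dominant_isoforms(peptide_map):
    """Pick one dominant isoform per gene, or None when evidence is tied or absent.

    `peptide_map` maps gene -> isoform -> set of supporting peptides.
    Assumption: "dominant" means strictly the most supporting peptides.
    """
    result = {}
    for gene, isoforms in peptide_map.items():
        ranked = sorted(isoforms.items(), key=lambda kv: len(kv[1]), reverse=True)
        if not ranked or not ranked[0][1]:
            result[gene] = None  # no peptide evidence at all
        elif len(ranked) > 1 and len(ranked[1][1]) == len(ranked[0][1]):
            result[gene] = None  # tie between the top two isoforms
        else:
            result[gene] = ranked[0][0]
    return result


def agreement(dominant, reference):
    """Fraction of genes whose dominant isoform matches a reference isoform.

    Only genes with a resolved dominant isoform and a reference entry are compared.
    """
    shared = [g for g, iso in dominant.items() if iso is not None and g in reference]
    if not shared:
        return 0.0
    return sum(dominant[g] == reference[g] for g in shared) / len(shared)


# Hypothetical peptide evidence and reference (e.g. principal isoform) labels.
evidence = {
    "GENE1": {"iso1": {"A", "B", "C"}, "iso2": {"A"}},
    "GENE2": {"iso1": {"X"}, "iso2": {"Y"}},
}
reference = {"GENE1": "iso1", "GENE2": "iso2"}
dominant = dominant_isoforms(evidence)
score = agreement(dominant, reference)
```

Leaving tied genes unresolved keeps the comparison limited to genes where the peptide evidence actually favors one isoform.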

Details

Tool Type: API
Operating Systems: Linux, Windows, Mac
Added: 4/25/2016
Last Updated: 11/24/2024

Publications

Ezkurdia I, Rodriguez JM, Carrillo-de Santa Pau E, Vázquez J, Valencia A, Tress ML. Most Highly Expressed Protein-Coding Genes Have a Single Dominant Isoform. Journal of Proteome Research. 2015;14(4):1880-1887. doi:10.1021/pr501286b. PMID:25732134. PMCID:PMC4768900.

Funding:

  • National Human Genome Research Institute: U41 HG007234
  • Ministerio de Economía y Competitividad: BIO2012-37926, BIO2012-40205, PRB2-ProteoRed-PT13/0001/0017, RD07-0067-0014-COMBIOMED, RETICS-RD12-0042-0056
  • Seventh Framework Programme: 282510
