ms-data-core-api

ms-data-core-api is a Java library that provides a pluggable programming interface and a common data model for reading and representing proteomics spectra, peptide and protein identifications, and quantitative data. It is intended to support the development of computational proteomics tools and workflows.


Key Features:

  • Common data model: Represents raw spectra, peptide and protein identifications, and quantitative assessments for proteomics analyses.
  • Controlled vocabularies and ontologies: Constructs the data model using controlled vocabularies and ontologies to ensure consistency and interoperability.
  • Pluggable programming interface: Exposes a Java API that enables programmatic access and extensibility for proteomics applications.
  • Supported file formats: Includes native readers for the PSI standards (mzML, mzIdentML, and mzTab) and for additional formats (dta, ms2, mgf, pkl, apl, mzXML, and mzData).
  • PRIDE XML processing: Reads PRIDE XML files, so that datasets exported from the PRIDE database can be processed.
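
The idea of a pluggable interface with one reader per file format can be illustrated with a minimal sketch. Everything below is hypothetical and is not the ms-data-core-api API: the names SpectrumSource and ReaderRegistry, and the mapping from file extension to format, are assumptions made for illustration only. The sketch covers only the spectrum formats listed above whose names double as common file extensions.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical plug-in contract: one implementation per file format. */
interface SpectrumSource {
    String formatName();
}

/** Hypothetical registry that maps a file extension to a reader factory. */
final class ReaderRegistry {
    private final Map<String, Supplier<SpectrumSource>> factories = new LinkedHashMap<>();

    /** Registers a reader factory under a case-insensitive extension. */
    void register(String extension, Supplier<SpectrumSource> factory) {
        factories.put(extension.toLowerCase(Locale.ROOT), factory);
    }

    /** Returns a reader for the file, or empty if no reader is registered. */
    Optional<SpectrumSource> forFile(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty(); // no extension to dispatch on
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Supplier<SpectrumSource> factory = factories.get(ext);
        return factory == null ? Optional.empty() : Optional.of(factory.get());
    }

    /** Builds a registry with placeholder readers (assumed extension mapping). */
    static ReaderRegistry withListedFormats() {
        ReaderRegistry registry = new ReaderRegistry();
        String[] formats = {"mzML", "mzXML", "mzData", "mgf", "ms2", "dta", "pkl", "apl"};
        for (String format : formats) {
            // Placeholder reader that only reports its format name.
            registry.register(format, () -> () -> format);
        }
        return registry;
    }
}

final class ReaderRegistryDemo {
    public static void main(String[] args) {
        ReaderRegistry registry = ReaderRegistry.withListedFormats();
        for (String file : new String[] {"run01.mzML", "run01.MGF", "notes.txt"}) {
            String format = registry.forFile(file)
                    .map(SpectrumSource::formatName)
                    .orElse("unsupported");
            System.out.println(file + " -> " + format);
        }
    }
}
```

In a design like this, supporting a new format means registering one more factory; code that consumes SpectrumSource does not change.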

Scientific Applications:

  • Development of proteomics tools and pipelines: Implements core data representations and readers used to build computational proteomics applications.
  • Integration of heterogeneous proteomics data: Enables integration and analysis of spectra, identifications, and quantitative results across multiple file formats, including PSI standards and legacy formats.
  • Processing PRIDE datasets: Facilitates processing and reanalysis of datasets exported from the PRIDE database via PRIDE XML.

Methodology:

Implemented in Java. The library uses a common data model built on controlled vocabularies and ontologies, and provides native readers for mzML, mzIdentML, mzTab, dta, ms2, mgf, pkl, apl, mzXML, mzData, and PRIDE XML.
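
As a rough illustration of a data model built on controlled vocabularies, the sketch below attaches ontology terms to a spectrum and looks them up by accession. The classes CvTerm and AnnotatedSpectrum are hypothetical and do not reproduce the library's own classes. The accession MS:1000511 ("ms level") is a term from the PSI-MS controlled vocabulary; the spectrum id and peak values are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

/** Hypothetical controlled-vocabulary annotation: ontology label, accession, name, value. */
final class CvTerm {
    final String cvLabel;
    final String accession;
    final String name;
    final String value;

    CvTerm(String cvLabel, String accession, String name, String value) {
        this.cvLabel = Objects.requireNonNull(cvLabel);
        this.accession = Objects.requireNonNull(accession);
        this.name = Objects.requireNonNull(name);
        this.value = Objects.requireNonNull(value);
    }
}

/** Hypothetical format-independent spectrum: a peak list plus CV annotations. */
final class AnnotatedSpectrum {
    private final String id;
    private final double[] mz;
    private final double[] intensity;
    private final List<CvTerm> terms = new ArrayList<>();

    AnnotatedSpectrum(String id, double[] mz, double[] intensity) {
        if (mz.length != intensity.length) {
            throw new IllegalArgumentException("m/z and intensity arrays differ in length");
        }
        this.id = Objects.requireNonNull(id);
        this.mz = mz.clone();           // defensive copies keep the peak list immutable
        this.intensity = intensity.clone();
    }

    /** Adds a CV annotation and returns this spectrum for chaining. */
    AnnotatedSpectrum annotate(CvTerm term) {
        terms.add(Objects.requireNonNull(term));
        return this;
    }

    /** Looks up the value of the first annotation with the given accession. */
    Optional<String> valueOf(String accession) {
        return terms.stream()
                .filter(t -> t.accession.equals(accession))
                .map(t -> t.value)
                .findFirst();
    }

    String id() {
        return id;
    }

    int peakCount() {
        return mz.length;
    }
}

final class AnnotatedSpectrumDemo {
    public static void main(String[] args) {
        AnnotatedSpectrum spectrum = new AnnotatedSpectrum(
                "scan=1", new double[] {100.0, 200.5}, new double[] {10.0, 20.0})
                .annotate(new CvTerm("MS", "MS:1000511", "ms level", "2"));
        System.out.println(spectrum.id() + " ms level: "
                + spectrum.valueOf("MS:1000511").orElse("unknown"));
    }
}
```

Keying metadata by accession rather than by a format-specific field name is what lets the same lookup work regardless of which file format the spectrum came from.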

Details

License: Apache-2.0
Maturity: Emerging
Cost: Free of charge
Tool Type: library
Operating Systems: Linux
Programming Languages: Java
Added: 8/3/2017
Last Updated: 6/16/2020

Publications

Perez-Riverol Y, et al. ms-data-core-api: an open-source, metadata-oriented library for computational proteomics. Bioinformatics. 2015; 31:2903-5. doi: 10.1093/bioinformatics/btv250

PMID: 25910694
