ARX Data Anonymization Tool

ARX is an open-source tool for anonymizing sensitive personal data. It enables analysis and sharing of clinical, epidemiological, and research datasets while protecting the privacy of the individuals they describe.


Key Features:

  • Privacy and Risk Models: Supports diverse privacy models and risk assessments to tailor anonymization and address singling out, inference, and linkage risks.
  • Data Transformation Methods: Implements generalization, suppression, perturbation, and synthesis to reduce re-identification risk while preserving analytical value.
  • Utility Analysis: Computes data utility measures to quantify how much anonymization degrades data quality.
  • Scalability and Performance: Processes large datasets efficiently on commodity hardware to support high-volume data anonymization.
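
To make the interplay of generalization, suppression, and a privacy model concrete, the following is a minimal sketch in plain Java. It does not use the ARX API; the class name KAnonymitySketch, its methods, the 10-year age bands, and the 3-digit ZIP prefix are all hypothetical choices made for illustration. It assumes k-anonymity as the privacy model, which requires every combination of quasi-identifier values to be shared by at least k records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain Java, not the ARX API. All names are hypothetical.
class KAnonymitySketch {

    // Generalization: replace a non-negative age by its 10-year band, e.g. 34 -> "30-39".
    static String generalizeAge(int age) {
        int lower = (age / 10) * 10;
        return lower + "-" + (lower + 9);
    }

    // Generalization: keep the first `keep` characters of a ZIP code and mask the rest.
    static String generalizeZip(String zip, int keep) {
        StringBuilder sb = new StringBuilder(zip.substring(0, Math.min(keep, zip.length())));
        while (sb.length() < zip.length()) {
            sb.append('*');
        }
        return sb.toString();
    }

    // Count how many records share each combination of quasi-identifier values.
    static Map<String, Integer> classSizes(List<String[]> records) {
        Map<String, Integer> sizes = new HashMap<>();
        for (String[] r : records) {
            sizes.merge(String.join("|", r), 1, Integer::sum);
        }
        return sizes;
    }

    // k-anonymity holds if every equivalence class contains at least k records.
    static boolean isKAnonymous(List<String[]> records, int k) {
        for (int size : classSizes(records).values()) {
            if (size < k) {
                return false;
            }
        }
        return true;
    }

    // Suppression: drop records whose equivalence class is smaller than k.
    static List<String[]> suppressSmallClasses(List<String[]> records, int k) {
        Map<String, Integer> sizes = classSizes(records);
        List<String[]> kept = new ArrayList<>();
        for (String[] r : records) {
            if (sizes.get(String.join("|", r)) >= k) {
                kept.add(r);
            }
        }
        return kept;
    }

    // Generalize both quasi-identifiers, then suppress whatever still violates k.
    static List<String[]> anonymize(int[] ages, String[] zips, int k) {
        List<String[]> generalized = new ArrayList<>();
        for (int i = 0; i < ages.length; i++) {
            generalized.add(new String[] {generalizeAge(ages[i]), generalizeZip(zips[i], 3)});
        }
        return suppressSmallClasses(generalized, k);
    }

    public static void main(String[] args) {
        int[] ages = {34, 36, 41, 47, 72};
        String[] zips = {"81667", "81675", "81925", "81931", "80331"};
        for (String[] r : anonymize(ages, zips, 2)) {
            System.out.println(String.join(", ", r));
        }
    }
}
```

In this toy input, generalization alone leaves one record in a class of its own, so suppression removes it. A real tool searches over many candidate generalization levels and weighs the resulting loss of utility, which this sketch does not attempt.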

Scientific Applications:

  • Epidemiological registry anonymization (LEOSS, SARS-CoV-2/COVID-19): Quantitative anonymization procedures were applied to the Lean European Open Survey on SARS-CoV-2 Infected Patients (LEOSS) to enable public, real-time sharing of COVID-19 patient registry data.
  • Clinical trial data sharing: Enables anonymization of clinical trial datasets to permit secondary analysis while protecting participant privacy.
  • Big data analytics and research projects: Supports anonymization workflows for commercial big data analytics platforms and academic research datasets.
  • Educational and training datasets: Facilitates creation of anonymized datasets for educational and training purposes without exposing sensitive personal information.

Methodology:

ARX applies privacy models and risk assessments, then transforms the data through generalization, suppression, perturbation, and synthesis. Utility analysis measures the resulting loss of data quality. Together, these quantitative procedures protect against singling out, inference, and linkage attacks. The published evaluation of the LEOSS pipeline (Jakob et al., 2020) reports that anonymization introduced only minimal bias.
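
As a rough illustration of what a quantitative risk assessment can measure, the sketch below computes three simple re-identification risk figures from equivalence classes. It is plain Java rather than the ARX API, and the class name RiskSketch and its methods are hypothetical. It assumes an attacker who knows the target is in the dataset, so a record's risk is taken as 1 divided by the size of its equivalence class.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: plain Java, not the ARX API. All names are hypothetical.
class RiskSketch {

    // Group records by their quasi-identifier key and count each group.
    static Map<String, Integer> classSizes(List<String> keys) {
        Map<String, Integer> sizes = new HashMap<>();
        for (String key : keys) {
            sizes.merge(key, 1, Integer::sum);
        }
        return sizes;
    }

    // Highest risk: 1 / size of the smallest equivalence class (0 for empty data).
    static double highestRisk(List<String> keys) {
        int smallest = Integer.MAX_VALUE;
        for (int size : classSizes(keys).values()) {
            smallest = Math.min(smallest, size);
        }
        return keys.isEmpty() ? 0.0 : 1.0 / smallest;
    }

    // Average risk: mean of 1 / class size over all records,
    // which equals the number of classes divided by the number of records.
    static double averageRisk(List<String> keys) {
        return keys.isEmpty() ? 0.0 : (double) classSizes(keys).size() / keys.size();
    }

    // Fraction of records that are unique, i.e. can be singled out.
    static double uniqueFraction(List<String> keys) {
        if (keys.isEmpty()) {
            return 0.0;
        }
        int unique = 0;
        for (int size : classSizes(keys).values()) {
            if (size == 1) {
                unique++;
            }
        }
        return (double) unique / keys.size();
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "30-39|816**", "30-39|816**", "40-49|819**", "40-49|819**", "70-79|803**");
        System.out.println("highest risk:    " + highestRisk(keys));
        System.out.println("average risk:    " + averageRisk(keys));
        System.out.println("unique fraction: " + uniqueFraction(keys));
    }
}
```

Comparing such figures before and after transformation gives a simple view of how much protection a given generalization or suppression step adds; inference and linkage risks need further models that this sketch does not cover.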

Details

  • License: Apache-2.0
  • Maturity: Mature
  • Cost: Free of charge
  • Tool Type: desktop application, library
  • Operating Systems: Mac, Linux, Windows
  • Programming Languages: Java
  • Added: 5/5/2021
  • Last Updated: 11/24/2024

Publications

Prasser F, Eicher J, Spengler H, Bild R, Kuhn KA. Flexible data anonymization using ARX—Current status and challenges ahead. Software: Practice and Experience. 2020;50(7):1277-1304. doi:10.1002/spe.2812.

Jakob CEM, Kohlmayer F, Meurers T, Vehreschild JJ, Prasser F. Design and evaluation of a data anonymization pipeline to promote Open Science on COVID-19. Scientific Data. 2020;7(1). doi:10.1038/s41597-020-00773-y. PMID:33303746. PMCID:PMC7729909.