
Ruben Verborgh

Detailed Provenance Capture of Data Processing

Ben De Meester, Anastasia Dimou, Ruben Verborgh, and Erik Mannens

A large part of scientific output entails computational experiments, e.g., processing data to generate new data. However, this generation process is documented only in human-readable form or as a software repository. This inhibits reproducibility and comparability, as current documentation solutions do not provide detailed metadata and rely on the availability of specific software environments. This paper proposes an automatic capturing mechanism for interchangeable and implementation-independent metadata and provenance that includes data processing. Using declarative mapping documents to describe the computational experiment, term-level provenance can be captured automatically for both schema and data transformations, storing both the software tools used and the input-output pairs of the data processing executions. This approach is applied to mapping documents described using RML and FnO, and is implemented in the RMLMapper. The captured metadata can be used to share, reproduce, and compare the dataset generation process more easily across software environments.
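To make the idea of storing the software tool together with the input-output pairs of each execution concrete, here is a minimal Python sketch. It is not the RMLMapper implementation and does not use RML or FnO; the names `ProvenanceLog`, `ExecutionRecord`, `to_upper`, and the tool label are hypothetical and exist only for illustration.

```python
import functools
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class ExecutionRecord:
    """One data processing execution: which function ran, with which tool,
    on which inputs, producing which output."""
    function_name: str
    tool: str
    inputs: Tuple[Any, ...]
    output: Any


@dataclass
class ProvenanceLog:
    """Collects execution records for a single (hypothetical) software tool."""
    tool: str
    records: List[ExecutionRecord] = field(default_factory=list)

    def capture(self, func: Callable[..., Any]) -> Callable[..., Any]:
        """Wrap a data transformation so every call is recorded."""
        @functools.wraps(func)
        def wrapper(*args: Any) -> Any:
            result = func(*args)
            self.records.append(
                ExecutionRecord(func.__name__, self.tool, args, result)
            )
            return result
        return wrapper


# Assumed tool label; a real system would record actual tool metadata.
log = ProvenanceLog(tool="example-processor/0.1 (hypothetical)")


@log.capture
def to_upper(value: str) -> str:
    """A toy data transformation applied to one term."""
    return value.upper()


generated = [to_upper(name) for name in ["ben", "anastasia"]]
```

After running this, `log.records` holds one record per generated term, which is the kind of term-level information that could later be serialized to an interchangeable format.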


Published in 2017 in Proceedings of the 1st Workshop on Enabling Open Semantic Science.



Cite this article in your work

Cite this article easily using its BibTeX entry:

@inproceedings{demeester_semsci_2017,
  author = {De Meester, Ben and Dimou, Anastasia and Verborgh, Ruben and Mannens, Erik},
  title = {Detailed Provenance Capture of Data Processing},
  booktitle = {Proceedings of the 1st Workshop on Enabling Open Semantic Science},
  year = 2017,
  month = oct,
  series = {CEUR Workshop Proceedings},
  volume = 1931,
  issn = {1613-0073},
  url = {http://ceur-ws.org/Vol-1931/paper-05.pdf},
}

Alternatively, pick a reference of your choice below:

ACM
Ben De Meester, Anastasia Dimou, Ruben Verborgh, and Erik Mannens. 2017. Detailed Provenance Capture of Data Processing. In Proceedings of the 1st Workshop on Enabling Open Semantic Science (CEUR Workshop Proceedings).
APA
De Meester, B., Dimou, A., Verborgh, R., & Mannens, E. (2017). Detailed Provenance Capture of Data Processing. Proceedings of the 1st Workshop on Enabling Open Semantic Science, 1931.
IEEE
B. De Meester, A. Dimou, R. Verborgh, and E. Mannens, “Detailed Provenance Capture of Data Processing,” in Proceedings of the 1st Workshop on Enabling Open Semantic Science, 2017, vol. 1931.
LNCS
De Meester, B., Dimou, A., Verborgh, R., Mannens, E.: Detailed Provenance Capture of Data Processing. In: Proceedings of the 1st Workshop on Enabling Open Semantic Science (2017).
MLA
De Meester, Ben, et al. “Detailed Provenance Capture of Data Processing.” Proceedings of the 1st Workshop on Enabling Open Semantic Science, vol. 1931, 2017.
