Test-driven Assessment of [R2]RML Mappings to Improve Dataset Quality

by Anastasia Dimou, Dimitris Kontokostas, Markus Freudenberg, Ruben Verborgh, Jens Lehmann, Erik Mannens, Sebastian Hellmann, and Rik Van de Walle

RDF dataset quality assessment is currently performed primarily after the data is published. Its results are rarely incorporated into the dataset, and when they are, the corresponding adjustments are applied manually. In the case of (semi-)structured data (e.g., CSV, XML), the violations often originate in the mappings that specify how the RDF dataset will be generated. Thus, we suggest shifting the quality assessment from the RDF dataset to the mapping definitions that generate it. The proposed test-driven approach for assessing mappings relies on RDFUnit test cases applied over mappings specified with RML. Our evaluation covers different cases, e.g., DBpedia, and indicates that the overall quality of an RDF dataset is quickly and significantly improved.
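
The idea of testing the mapping rather than the generated dataset can be illustrated with a minimal Python sketch. It does not use RDFUnit or an RML processor. The PredicateObjectMap class, the check_datatype_ranges function, the column names, and the schema excerpt are all hypothetical stand-ins, chosen only to show that a datatype-range violation can be detected from the mapping alone, before any triple is generated.

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"
DBO = "http://dbpedia.org/ontology/"

# Hypothetical schema excerpt (an assumption for illustration):
# the datatype each predicate is expected to carry.
SCHEMA_RANGES = {
    DBO + "birthDate": XSD + "date",
    DBO + "populationTotal": XSD + "nonNegativeInteger",
}


@dataclass(frozen=True)
class PredicateObjectMap:
    """Simplified stand-in for one predicate-object rule of a mapping."""
    predicate: str                  # predicate IRI the rule will emit
    reference: str                  # source column or field (hypothetical names)
    datatype: Optional[str] = None  # datatype the rule declares for its literals


def check_datatype_ranges(mapping, schema_ranges):
    """Return (predicate, expected, declared) for every rule whose declared
    datatype disagrees with the schema. Only the mapping is inspected;
    no source data is read and no RDF is generated."""
    violations = []
    for pom in mapping:
        expected = schema_ranges.get(pom.predicate)
        if expected is not None and pom.datatype != expected:
            violations.append((pom.predicate, expected, pom.datatype))
    return violations


mapping = [
    PredicateObjectMap(DBO + "birthDate", "birth_date", XSD + "date"),
    # This rule would stamp the wrong datatype on every generated triple.
    PredicateObjectMap(DBO + "populationTotal", "population", XSD + "string"),
]

violations = check_datatype_ranges(mapping, SCHEMA_RANGES)
for predicate, expected, declared in violations:
    print(f"{predicate}: expected {expected}, mapping declares {declared}")
```

Under these assumptions, a single faulty rule is reported once at the mapping level, whereas the same fault assessed on the published dataset would surface as one violation per generated triple.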

Published in 2015 in Proceedings of the 14th International Semantic Web Conference: Posters and Demos.

Keywords: RML, XML

