Ruben Verborgh

Data Analysis of Hierarchical Data for RDF Term Identification

by Pieter Heyvaert, Anastasia Dimou, Ruben Verborgh, and Erik Mannens

Generating Linked Data from existing data sources requires modeling their information structure. This modeling involves identifying potential entities, their attributes, and the relationships between entities and attributes as well as among entities. For databases, this identification is not required, because a data schema is always available. However, for other data formats, such as hierarchical data, this is not always the case. Therefore, analysis of the data is required to support RDF term and data type identification. We introduce a tool that performs such an analysis on hierarchical data. It implements the two algorithms proposed in this paper, Daro and S-Daro. Based on our evaluation, we conclude that S-Daro scales better in run time as the dataset size grows, and provides more complete results.
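
To make the kind of analysis described above more concrete, the following is a minimal sketch in Python. It is not Daro or S-Daro, whose details are not given on this page. It only illustrates the general idea of walking hierarchical data (here JSON) to list candidate entities, candidate attributes, and the data types observed for each attribute. The function names `infer_type` and `analyze`, the path notation, the type labels, and the sample data are all assumptions made for this illustration.

```python
import json
from collections import defaultdict


def infer_type(value):
    """Map a JSON leaf value to a rough, XSD-like type label (assumed labels)."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    if value is None:
        return "null"
    return "string"


def analyze(node, path="$", entities=None, attributes=None):
    """Walk a parsed JSON tree and collect object paths and typed leaf paths."""
    if entities is None:
        entities = set()
    if attributes is None:
        attributes = defaultdict(set)
    if isinstance(node, dict):
        # Treat every object as a candidate entity.
        entities.add(path)
        for key, child in node.items():
            analyze(child, f"{path}.{key}", entities, attributes)
    elif isinstance(node, list):
        # All items of an array share one path, so their findings are merged.
        for child in node:
            analyze(child, f"{path}[*]", entities, attributes)
    else:
        # Treat every leaf as a candidate attribute and record its observed type.
        attributes[path].add(infer_type(node))
    return entities, attributes


# Invented sample data, used only to exercise the sketch.
SAMPLE = json.loads("""
{"people": [
  {"name": "Alice", "age": 30, "address": {"city": "Ghent"}},
  {"name": "Bob", "age": 41.5}
]}
""")

entities, attributes = analyze(SAMPLE)

for entity_path in sorted(entities):
    print("entity:", entity_path)
for attribute_path in sorted(attributes):
    print("attribute:", attribute_path, sorted(attributes[attribute_path]))
```

In this sketch, an attribute observed with more than one type, such as `age` in the sample, is reported with all of its types, which is the sort of evidence a data type decision could be based on. It makes a single pass over the data, so its cost grows with the number of nodes; it says nothing about how the algorithms in the paper behave.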

Published in 2016 in Proceedings of the Joint International Semantic Technology Conference.

Keywords: Linked Data
