Parallel RDF generation from heterogeneous big data
To unlock the value of data that is increasingly available in high volumes, we need flexible ways to integrate data across different sources. Although RDF generation can provide semantic integration, current generators do not scale sufficiently in terms of volume: they are limited by memory constraints. Therefore, we developed the RMLStreamer, a generator that parallelizes the ingestion and mapping tasks of RDF generation across multiple instances. In this paper, we analyze which aspects are parallelizable and introduce an approach for parallel RDF generation. We describe how we implemented the proposed approach within the RMLStreamer, and how the resulting scaling behavior compares to that of other RDF generators. Through parallel ingestion, the RMLStreamer ingests data at a 50% faster rate than existing generators.
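As a rough illustration of why the mapping task lends itself to parallelization, the sketch below maps each input record to RDF triples independently of every other record, so the work can be spread over several workers. It is a minimal sketch under stated assumptions, not the RMLStreamer's implementation: the `http://example.org/` namespace, the record fields `id` and `name`, and the functions `map_record` and `generate` are all hypothetical, and a local thread pool stands in for the multiple instances mentioned in the abstract.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/"


def map_record(record: Dict[str, str]) -> List[str]:
    """Map one input record to N-Triples lines.

    The function reads nothing but its own record, which is what
    makes it safe to run on many records at the same time.
    Literal escaping is omitted to keep the sketch short.
    """
    subject = f"<{BASE}person/{record['id']}>"
    return [f'{subject} <{BASE}name> "{record["name"]}" .']


def generate(records: Iterable[Dict[str, str]], workers: int = 4) -> List[str]:
    """Apply map_record to all records using a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the output is
        # the same regardless of how many workers are used.
        per_record = pool.map(map_record, records)
        return [triple for triples in per_record for triple in triples]


if __name__ == "__main__":
    sample = [{"id": "1", "name": "Ada"}, {"id": "2", "name": "Alan"}]
    for line in generate(sample):
        print(line)
```

Because no record depends on another, the number of workers changes only how the work is distributed, not which triples are produced.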
Published in 2019 in Proceedings of the International Workshop on Semantic Big Data.
- RDF
- constraints
- RML
Cite this article in your work
Cite this article easily using its BibTeX entry:
@inproceedings{haesendonck_sbd_2019,
author = {Haesendonck, Gerald and Maroy, Wouter and Heyvaert, Pieter and Verborgh, Ruben and Dimou, Anastasia},
title = {Parallel {RDF} generation from heterogeneous big data},
booktitle = {Proceedings of the International Workshop on Semantic Big Data},
year = 2019,
month = jul,
isbn = {978-1-4503-6766-0},
doi = {10.1145/3323878.3325802},
url = {https://dl.acm.org/authorize?N680652},
}
Alternatively, pick a reference of your choice below:
- ACM: Gerald Haesendonck, Wouter Maroy, Pieter Heyvaert, Ruben Verborgh, and Anastasia Dimou. 2019. Parallel RDF generation from heterogeneous big data. In Proceedings of the International Workshop on Semantic Big Data.
- APA: Haesendonck, G., Maroy, W., Heyvaert, P., Verborgh, R., & Dimou, A. (2019, July). Parallel RDF generation from heterogeneous big data. Proceedings of the International Workshop on Semantic Big Data.
- IEEE: G. Haesendonck, W. Maroy, P. Heyvaert, R. Verborgh, and A. Dimou, “Parallel RDF generation from heterogeneous big data,” in Proceedings of the International Workshop on Semantic Big Data, 2019.
- LNCS: Haesendonck, G., Maroy, W., Heyvaert, P., Verborgh, R., Dimou, A.: Parallel RDF generation from heterogeneous big data. In: Proceedings of the International Workshop on Semantic Big Data (2019).
- MLA: Haesendonck, Gerald, et al. “Parallel RDF Generation from Heterogeneous Big Data.” Proceedings of the International Workshop on Semantic Big Data, 2019.