Statistics about Data Shape Use in RDF Data

Sven Lieber, Anastasia Dimou, and Ruben Verborgh

Statistics about the use of constraints in RDF data provide insight into common practices for addressing data quality. However, such statistics exist only for OWL axioms, not for constraint languages such as SHACL or ShEx, which have recently become more popular. We extended previous work on axiom statistics to provide evidence of constraint type use. In this poster, we present preliminary statistics about the use of SHACL core constraints in data shapes found on GitHub. We found that class, datatype, and cardinality constraints are predominantly used, similar to the dominant use of domain and range in ontologies. Less-used constraint types need further attention in visualization or modeling tools to address data quality issues. To deepen this understanding, more SHACL constraints, as well as ShEx constraints, need to be included. Data quality researchers and tool designers can use the provided statistics to make informed decisions.

Published in 2020 in Proceedings of the 19th International Semantic Web Conference: Posters, Demos, and Industry Tracks.

Cite this article in your work

Cite this article easily using its BibTeX entry:

@inproceedings{lieber_iswc_poster_2020,
  author = {Lieber, Sven and Dimou, Anastasia and Verborgh, Ruben},
  title = {Statistics about Data Shape Use in {RDF} Data},
  booktitle = {Proceedings of the 19th International Semantic Web Conference: Posters, Demos, and Industry Tracks},
  editor = {Taylor, Kerry and Gonçalves, Rafael and Lecue, Freddy and Yan, Jun},
  year = 2020,
  month = nov,
  series = {CEUR Workshop Proceedings},
  volume = 2721,
  issn = {1613-0073},
  pages = {330--335},
  url = {http://ceur-ws.org/Vol-2721/paper584.pdf},
}

Alternatively, pick a reference of your choice below:

ACM
Sven Lieber, Anastasia Dimou, and Ruben Verborgh. 2020. Statistics about Data Shape Use in RDF Data. In Proceedings of the 19th International Semantic Web Conference: Posters, Demos, and Industry Tracks (CEUR Workshop Proceedings), 330–335.
APA
Lieber, S., Dimou, A., & Verborgh, R. (2020). Statistics about Data Shape Use in RDF Data. In K. Taylor, R. Gonçalves, F. Lecue, & J. Yan (Eds.), Proceedings of the 19th International Semantic Web Conference: Posters, Demos, and Industry Tracks (Vol. 2721, pp. 330–335).
IEEE
S. Lieber, A. Dimou, and R. Verborgh, “Statistics about Data Shape Use in RDF Data,” in Proceedings of the 19th International Semantic Web Conference: Posters, Demos, and Industry Tracks, 2020, vol. 2721, pp. 330–335.
LNCS
Lieber, S., Dimou, A., Verborgh, R.: Statistics about Data Shape Use in RDF Data. In: Taylor, K., Gonçalves, R., Lecue, F., and Yan, J. (eds.) Proceedings of the 19th International Semantic Web Conference: Posters, Demos, and Industry Tracks. pp. 330–335 (2020).
MLA
Lieber, Sven, et al. “Statistics about Data Shape Use in RDF Data.” Proceedings of the 19th International Semantic Web Conference: Posters, Demos, and Industry Tracks, edited by Kerry Taylor et al., vol. 2721, 2020, pp. 330–35.