Ruben Verborgh

Towards Multi-level Provenance Reconstruction of Information Diffusion on Social Media

by Tom De Nies, Io Taxidou, Anastasia Dimou, Ruben Verborgh, Peter Fischer, Erik Mannens, and Rik Van de Walle

To assess the trustworthiness of information on social media, a consumer needs to understand where this information comes from and which processes were involved in its creation. The entities, agents, and activities involved in the creation of a piece of information are referred to as its provenance, for which W3C PROV provides a standard model. However, current social media APIs cannot always capture the full lineage of every message, leaving the consumer with incomplete or missing provenance, even though provenance is crucial for judging how much a message can be trusted. Therefore, in this paper, we propose an approach to reconstruct the provenance of messages on social media on multiple levels. To obtain a fine-grained level of provenance, we use an existing approach to reconstruct information cascades with high certainty, and map them to PROV using the PROV-SAID extension for social media. To obtain a coarse-grained level of provenance, we adapt a similarity-based, fuzzy provenance reconstruction approach that was previously applied to news. We illustrate our approach by providing the reconstructed provenance of a limited social media dataset gathered during the 2012 Olympics, for which we were able to reconstruct a significant number of previously unidentified connections.
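As a rough illustration of what similarity-based reconstruction at the coarse-grained level could look like, the sketch below links each message to its most similar earlier message and reports the pair as a candidate derivation. It is not the method from the paper: the `Message` type, the token-level Jaccard measure, the `0.5` threshold, and the function names are all assumptions chosen for brevity, and the printed `prov:wasDerivedFrom` label only hints at how such a link might be expressed in PROV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical minimal message record (not a real social media API type)."""
    id: str
    timestamp: int  # seconds since an arbitrary epoch
    text: str


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase token sets of two texts."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def reconstruct_derivations(messages, threshold=0.5):
    """Link each message to its most similar earlier message, if any.

    Returns (derived_id, source_id, similarity) triples. The similarity
    acts as a fuzzy confidence score for the candidate derivation.
    The threshold is an assumed value, not one taken from the paper.
    """
    ordered = sorted(messages, key=lambda m: m.timestamp)
    links = []
    for i, msg in enumerate(ordered):
        best = None
        # Only messages that come earlier in time can be a source.
        for earlier in ordered[:i]:
            score = jaccard(msg.text, earlier.text)
            if score >= threshold and (best is None or score > best[1]):
                best = (earlier.id, score)
        if best is not None:
            links.append((msg.id, best[0], best[1]))
    return links


# Made-up example messages, for illustration only.
messages = [
    Message("m1", 100, "Bolt wins the 100m final"),
    Message("m2", 160, "wow Bolt wins the 100m final"),
    Message("m3", 200, "Opening ceremony starts tonight"),
]

for derived, source, score in reconstruct_derivations(messages):
    print(f"{derived} prov:wasDerivedFrom {source} (similarity {score:.2f})")
```

Keeping the similarity score alongside each link preserves the fuzzy nature of the result: a consumer can treat low-scoring links as weaker evidence than the high-certainty cascade links of the fine-grained level.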

Published in 2015 in Proceedings of the 24th ACM International Conference on Information and Knowledge Management.

Keywords: provenance, social media
