Research challenges for 2019

24 January 2019

Following the curiosity-inducing post by my colleague Pieter Colpaert, I’m listing some of the questions I’d like to learn more about in 2019. There’s much more in here than anyone can reasonably answer in a year, but these should provide food for thought for the next couple of months.

Querying the Web

This one is actually my Big Question for the years to come, but it doesn’t hurt to state it explicitly:

How can we query data on the Web?

In particular, I’m interested in small data rather than Big Data:

How can we query a large number of small data sources instead of a small number of large ones?

Querying data in decentralized networks

The following questions are inspired by the Solid project, where people store data in their personal data pods instead of inside applications:

How can we query across personal data stores, taking into account privacy?
How can nodes in a decentralized network help each other with caching and querying?

Since querying decentralized networks takes time, the following also become important:

How can we conceal latency in applications that need data from decentralized sources?
How can we approximate query results and incrementally improve them?

In particular, I am interested in reviving link-traversal-based querying. Instead of blind traversal, we should leverage knowledge about the data shape and structure.

How to perform guided link-traversal-based querying, given machine-interpretable knowledge of data shapes and linking structures?
How can additional knowledge improve the efficiency of link traversal?
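
To make the contrast between blind and guided traversal concrete, here is a minimal sketch rather than an actual query engine. Everything in it is an assumption for illustration: the `WEB` dictionary stands in for dereferencing documents over HTTP, the URLs are made up, and the "shape knowledge" is reduced to a plain set of predicates worth following.

```python
from collections import deque

# Hypothetical in-memory "Web": document URL -> triples (subject, predicate, object).
# A real engine would dereference URLs over HTTP and parse RDF instead.
WEB = {
    "https://alice.example/profile": [
        ("https://alice.example/profile", "foaf:name", "Alice"),
        ("https://alice.example/profile", "foaf:knows", "https://bob.example/profile"),
        ("https://alice.example/profile", "rdfs:seeAlso", "https://ads.example/tracker"),
    ],
    "https://bob.example/profile": [
        ("https://bob.example/profile", "foaf:name", "Bob"),
        ("https://bob.example/profile", "foaf:knows", "https://carol.example/profile"),
    ],
    "https://carol.example/profile": [
        ("https://carol.example/profile", "foaf:name", "Carol"),
    ],
    "https://ads.example/tracker": [
        ("https://ads.example/tracker", "rdfs:seeAlso", "https://ads.example/more"),
    ],
    "https://ads.example/more": [],
}


def traverse(seed, follow=None):
    """Breadth-first link traversal starting from `seed`.

    `follow` is the set of predicates whose objects may be dereferenced.
    `None` means blind traversal: every link is followed.
    Returns the collected triples and the set of documents fetched.
    """
    fetched = {seed}
    queue = deque([seed])
    triples = []
    while queue:
        url = queue.popleft()
        for s, p, o in WEB.get(url, []):
            triples.append((s, p, o))
            is_link = o.startswith("https://")
            allowed = follow is None or p in follow
            if is_link and allowed and o not in fetched:
                fetched.add(o)
                queue.append(o)
    return triples, fetched


def names(triples):
    """Answer the toy query: all foaf:name values found so far."""
    return sorted(o for _, p, o in triples if p == "foaf:name")


seed = "https://alice.example/profile"
blind_triples, blind_docs = traverse(seed)
guided_triples, guided_docs = traverse(seed, follow={"foaf:knows"})

print(names(blind_triples), len(blind_docs))
print(names(guided_triples), len(guided_docs))
```

In this toy setting, both runs find the same names, but the guided run fetches fewer documents because it never leaves the `foaf:knows` chain. Whether real-world shape descriptions allow this kind of pruning without losing results is exactly the open question.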

Facilitating Linked Data application development

In my 2018 blog post, I argue that the developer experience is crucial to accelerate the creation of user-facing apps, which are a major pain point of the Semantic Web. Rather than hand-wavingly considering this a trivial engineering matter, we should look into new abstractions that hide complexity. Think beyond JSON-LD.

How can we expose Linked Data to developers, hiding the complexities of RDF but keeping flexibility and unboundedness?
How to leverage composable data shapes that apps can bind to, as opposed to custom data models?
How to write Linked Data according to a specific shape?

Making personal data linked

I see the GDPR legislation as a godsend for innovation, and a way to break the Semantic Web’s chicken-and-egg problem for personal data. Under GDPR, we can contact any company or organization and retrieve our data in a structured format, giving us plenty of eggs to work with.

How can we extract personal data from third parties?
How can we link personal data from different third parties together?
What kind of vocabularies and shapes should we be using?
Can we easily move from one to another?

Read–write public and private Linked Data

The main Linked Data success stories are about reading public/open Linked Data. These are important stories, but the opportunities for Linked Data extend far beyond them. Tim Berners-Lee has always called for a Read–Write Web, and such a Web contains public data, private data, and everything in between.

How can we build read–write websites based on Linked Data, accessible to humans and machines?
How can we meaningfully combine public and private Linked Data?

Personalized Linked Data experiences

Finally, we need to rethink the interaction between people and data. Currently, we follow a strong question–answer paradigm, where—if lucky—we get what we ask for. I am interested in personal agent/assistant interactions, where people are given the information they need, when they need it. This is also an alternative approach to latency concealment, in that we predict needs earlier and thus have more time to find answers.

How can we continuously assist people with data needs, as opposed to answering demand-driven questions?
How can we prepare for upcoming data needs to provide answers faster?

Possible application domains

I approach the above problems from a technology perspective, so I am not bound to a specific application domain. That said, we have specific expertise and/or particular interest in these domains:

scholarly communication
libraries, archives, and museums
biomedical data
transport and mobility
personal data management

Happy to engage in new collaborations within the above domains or to learn about new ones.

Want to work on these topics?

I motivate my PhD students to work on the above topics (as well as to suggest their own), and there are plenty of other opportunities to get involved:

collaborate with us in projects, either on a bilateral basis or as part of a consortium
write a Master’s thesis or PhD thesis in my group
send me exciting ideas or make me another offer I can’t refuse

Contact me to find opportunities to get involved!