Designing a Linked Data developer experience
Making decentralized Web app development fun.
While the Semantic Web community was fighting its own internal battles, we failed to gain traction with the people who build apps that are actually used: front-end developers. Ironically, Semantic Web enthusiasts have failed to focus on the Web; whereas our technologies are delivering results in specialized back-end systems, the promised intelligent end-user apps are not being created. Within the Solid ecosystem for decentralized Web applications, Linked Data and Semantic Web technologies play a crucial role. Working intensely on Solid the past year, I realized that designing a fun developer experience will be crucial to its success. Through dialogue with front-end developers, I created a couple of JavaScript libraries for easy interaction with complex Linked Data—
In the mid-2000s, a new human species silently emerged among increasing complexity and specialization on the Web: the homo developerensis frontendicus, more commonly known as the front-end developer. They generally differ from their ancestors, the homo developerensis generalis, in two important aspects. Front-end developers tend to interact a lot more with regular humans (homo sapiens) in comparison to the stereotypically secluded back-end developers (backendicus) who prefer interacting with machines. And whereas many back-end developers will challenge you with their knowledge of obscure Perl constructs and try to impress potential mating partners with long Java Spring XML configuration files, front-end developers often pride themselves on the fact that they are not great programmers but get the job done fast and have fun while doing so. Front-end developers build what people want and like; back-end developers enable them to do so easily.
As new decentralization efforts are emerging, Semantic Web technologies can play a crucial role in addressing interoperability challenges. In particular, the Linked Data way of representing knowledge is perfectly suited to store and integrate data in decentralized networks. However, our technologies are not particularly developer-friendly. The advent of front-end developers began some time after the Semantic Web and RDF communities had started, and our mistake is that we have so far ignored them—
Guess what? If we don’t follow them, then we will be the ones whose jobs will become obsolete. Our mistake is that we haven’t followed where the Web is going. And if we don’t change course, initiatives for re-decentralizing the Web might end up lacking proper data integration, which significantly reduces their chances of success.
I saw the importance of a front-end developer focus very clearly earlier this year at the GraphQL Day in Amsterdam. Somehow, a query language had managed to gather a room full of people who were having lots of fun querying things on the Web and building nice apps on top of that. GraphQL was lauded (incorrectly) as a replacement for the REST architectural style, and some snubbed the complexities of other solutions such as the Semantic Web’s query language SPARQL. Ironically, I learned about future plans that would considerably complexify and reshape GraphQL just to be able to cover part of the decentralized ground that SPARQL has excelled in for years already.
However, it would be very wrong to blame the GraphQL or front-end development communities for that. We have had gold in our hands for many years with Linked Data, RDF, and SPARQL—
I firmly believe that we should bring the Semantic Web (back) to the Web. We need to give front-end developers tools and libraries to do so. This is why, ever since joining the Semantic Web community, I’ve spent so much time creating JavaScript libraries for the browser from scratch, so we can make the Semantic Web happen on the Web. I would’ve moved much faster if I hadn’t insisted on doing so, but the things I built would never have received the visibility the Web brings for free.
However, up to that point, I hadn’t been writing for front-end developers: my libraries provided a low-level entry point to Linked Data. They expose RDF triples, essentially individual branches instead of the JSON trees developers are familiar with. Most developers don’t want RDF—
Why Linked Data?
Decentralized Web apps have multiple back-ends
A crucial first question is whether decentralized Web apps need Linked Data at all. Why not just do like every other Web API, where the server sends custom JSON that the client can easily decipher? The idea behind decentralization in Solid is that apps do not have their own data store. Data is instead stored in a place of the user’s choice. Apps thus need to be more flexible in order to become compatible with different back-ends. Multiple back-ends might be used at the same time by multiple apps. For instance, social media apps show data of multiple profiles, and every profile in a decentralized network can be stored in a different place.
So if you want to express that you like 👍 one of my posts, your like might be stored in a different place than my post. This has the following consequences:
- You need a way of connecting your like to my post.
- Your like needs a universal meaning so different apps can use it.
Addressing these problems is not easy with the custom JSON formats that most Web APIs use today.
Linked Data makes Web apps independent of specific back-ends
Linked Data solves both problems through links. To get started, my post and your like will be given their own URL, such that others can link to them. For instance:
- my post could be https://ruben.verborgh.org/posts/1234
- your like could be https://you.example/likes/2018/12#like-on-rubens-post
So your like will connect to my post by linking to it:
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "actor": "https://you.example/profile#you",
  "type": "Like",
  "object": "https://ruben.verborgh.org/posts/1234",
  "published": "2018-12-28T10:00:00Z",
  "id": "#like-on-rubens-post"
}
Since terms like type, actor, and object will mean different things to different apps, Linked Data will also use links to establish a universal meaning for these. Through the @context key in the above snippet, terms become links as well. For instance, actor actually becomes https://www.w3.org/ns/activitystreams#actor under the hood. The @context key does this by pointing to a so-called JSON-LD context.
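To make the term-to-link idea concrete, here is a minimal sketch of how a context could expand a short term into a URL. The context below is a small hand-written excerpt for illustration only (an assumption, not the full ActivityStreams context), and expandTerm is a hypothetical helper; real JSON-LD processing involves considerably more rules.

```javascript
// Hypothetical, hand-written excerpt of a JSON-LD context:
// every short term maps to a full URL, possibly through a prefix such as "as:".
const context = {
  as: 'https://www.w3.org/ns/activitystreams#',
  actor: 'as:actor',
  object: 'as:object',
  Like: 'as:Like',
};

// Only look at the context's own keys, not inherited ones such as "toString"
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Expand a short term into a URL (simplified; not a full JSON-LD algorithm)
function expandTerm(term, ctx = context) {
  if (!has(ctx, term)) return term; // unknown terms are left untouched
  const mapped = ctx[term];
  const colon = mapped.indexOf(':');
  if (colon < 0) return mapped;
  const prefix = mapped.slice(0, colon);
  // "as:actor" uses the prefix "as"; absolute URLs such as "https://…" do not
  return has(ctx, prefix) ? ctx[prefix] + mapped.slice(colon + 1) : mapped;
}

console.log(expandTerm('actor'));
// https://www.w3.org/ns/activitystreams#actor
```

Two apps that share this context therefore agree on what actor means, even if they never heard of each other.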
With the right abstraction layer, you don’t need to know about any of the above. However, without such links, apps are confined to the single back-end they have been hard-coded for—
The next two sections dive into deep detail on React and Linked Data expressions, and mainly target JavaScript developers. Feel free to skip ahead to the lessons learned.
React components for Solid
Picking a language and framework
When designing a developer experience, the first question is what language and framework to target, and JavaScript and React come out as clear winners in the 2018 State of JavaScript. I’m not particularly excited about frameworks in general, as their popularity rises and declines so fast that none of them can be considered a safe bet. (Remember jQuery?) That said, the numbers are what they are—
Neither I nor the Solid team is married to React, however, and we should keep our eyes open for the other current and upcoming frameworks out there. Importantly, many of the lessons learned (and some of the libraries produced) can be applied directly in other frameworks.
The React components for Solid can be found on GitHub and npm.
Logging in and out
The first category of React components for Solid provides authentication functionality. Although not specific to Linked Data, authentication is crucial to fetch private data within decentralized networks. In contrast to Facebook and other social networks, there is no “Log in with Solid” button. Instead, people log in with their own data pod, which can reside anywhere on the Web. A consistent login experience is therefore crucial, but I noticed that wiring up the existing authentication library could easily take a dozen lines of code. By reducing those lines to a single component, developers can reuse a well-tested solution instead. Here’s a snippet of the resulting code:
<LoggedOut>
  <p><LoginButton popup="popup.html" /></p>
  <p>You are not logged in, and this is a members-only area!</p>
</LoggedOut>
<LoggedIn>
  <p>You are logged in and can see our special content.</p>
</LoggedIn>
Interestingly, the Solid React library is not just about providing components: it enables developers to easily build their own Solid components. For instance, the above <LoggedIn> component has a straightforward implementation: instead of having to call the authentication library itself, it is wrapped into the withWebId helper. This helper will pass the webID property to the <LoggedIn> component, containing the identity of the logged-in user. All <LoggedIn> needs to do is check whether its webID property has been set, and only in that case, render its contents. Developers building their own Solid components that involve authentication can simply reuse the withWebId higher-order component, without having to wonder how it works.
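The higher-order pattern itself can be sketched without React. In the simplified example below, a “component” is just a function from properties to a string, and session is a hypothetical stand-in for the authentication library’s state. This is an assumption-laden illustration of the idea, not the library’s actual implementation, which would also re-render when the login status changes.

```javascript
// Hypothetical stand-in for the authentication library's session state
const session = { webID: null };

// Higher-order component: wraps a component (here a plain function from
// props to output) and injects the current webID as an extra property
const withWebId = Component => props =>
  Component({ ...props, webID: session.webID });

// LoggedIn renders its children only when a webID has been set;
// LoggedOut does the opposite
const LoggedIn = withWebId(({ webID, children }) => (webID ? children : null));
const LoggedOut = withWebId(({ webID, children }) => (webID ? null : children));

console.log(LoggedIn({ children: 'special content' })); // null (nobody is logged in)
session.webID = 'https://you.example/profile#you';
console.log(LoggedIn({ children: 'special content' })); // special content
```

Neither LoggedIn nor LoggedOut touches the session directly; all authentication knowledge sits in the wrapper.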
Displaying Linked Data
The second category of React components provides what you’ve been waiting for: easy access to Linked Data. Even really simple tasks, such as showing the logged-in user’s name, required several lines of code that were not very intuitive and required developers to understand the nitty-gritty bits of RDF.
The Solid React library replaces this with a single component:
<p>Welcome, <Value src="user.name" /></p>
The <Value> component displays the value of a piece of Linked Data identified through the src property. As with authentication, this is achieved through a higher-order component called evaluateExpressions, such that developers can easily create their own Linked Data components. All you need to do is wrap your component with evaluateExpressions and indicate which properties can contain Linked Data expressions (in this case src). These expressions will then be evaluated into values, and these values are passed to your component.
For example, if we define a <Span> component as follows:
const Span = evaluateExpressions(({ src }) =>
  src ? <span>{src}</span> : <em>pending</em>);
Then we can pass it a src property:
<p>Your first name is <Span src="user.firstName" />.</p>
This src property will be translated into an actual value by evaluateExpressions, such that the rendered value will become:
<p>Your first name is Ruben.</p>
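As a rough illustration of what such a wrapper has to do, the sketch below evaluates string expressions against a data root before handing the values to a component. Everything here is hypothetical: the local data object stands in for data fetched from the Web, components are plain functions that return strings, and evaluateExpressionsSketch is deliberately named differently from the library’s helper because its signature is an assumption.

```javascript
// Hypothetical local data standing in for values fetched from the Web
const data = {
  user: { firstName: Promise.resolve('Ruben') },
};

// Follow a dotted expression such as "user.firstName" from the data root,
// waiting for any intermediate value that is still pending
async function evaluate(expression, root = data) {
  let node = root;
  for (const key of expression.split('.'))
    node = (await node)?.[key];
  return node; // async functions unwrap a returned promise
}

// Wrap a component so that the listed properties are evaluated first
const evaluateExpressionsSketch = (names, Component) => async props => {
  const values = {};
  for (const name of names)
    values[name] = await evaluate(props[name]);
  return Component({ ...props, ...values });
};

const Span = evaluateExpressionsSketch(['src'], ({ src }) => `<span>${src}</span>`);
Span({ src: 'user.firstName' }).then(html => console.log(html));
// <span>Ruben</span>
```

A real React version would render a placeholder first and update once the value arrives, which is what the pending branch in the <Span> example hints at.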
The library contains a couple of components that come in very handy to display Linked Data in various ways:
<LoggedIn>
  <p>Welcome, <Value src="user.firstName" /></p>
  <Image src="user.image" defaultSrc="profile.svg" />
  <ul>
    <li><Link href="user.inbox">Your inbox</Link></li>
    <li><Link href="user.homepage">Your homepage</Link></li>
  </ul>
  <h2>Your friends</h2>
  <List src="user.friends.firstName" />
</LoggedIn>
If you compare the above code to the RDF- and triple-based way of working with Linked Data, you’ll notice how much it simplifies things. And not just for front-end developers, but for everyone who wants to build Linked Data Web apps. As an example, consider the Solid profile viewer implemented with jQuery and rdflib.js or with the React components. The former requires knowledge of RDF and ontologies, whereas the latter only assumes React and Linked Data expressions. Furthermore, the authentication and data components of the React implementation are heavily tested, so the resulting app comes with stronger quality guarantees.
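(async () => {
  // `user` is assumed to hold the WebID URL of the logged-in user
  const FOAF = $rdf.Namespace('http://xmlns.com/foaf/0.1/');
  const store = $rdf.graph();
  const fetcher = new $rdf.Fetcher(store);
  await fetcher.load(user);
  const fullName = store.any($rdf.sym(user), FOAF('name'));
  $('#fullName').text(fullName && fullName.value);
})();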
<Value src="user.name" /> is definitely more fun and more robust. If something costs a lot of effort, it’s not hard to imagine that people might not bother at all. So this would lead to Solid apps that do not greet the user and hence are not as friendly, or in the worst case, no Solid app at all. And while someone surely could write a simple getUserName wrapper, such an approach does not scale to all of our data needs. All of this highlights the importance of a good developer experience for Linked Data.
LDflex: query the Web from within JavaScript
Simple expressions for simple data needs
As you probably have noticed, a major enabler for the React components above is the expression language for retrieving Linked Data. This is a custom language called LDflex, which I created for this purpose. LDflex is a domain-specific language (DSL) for JavaScript, meaning that all of its expressions are in fact valid JavaScript programs.
LDflex is my answer to many quick data needs developers experienced when building apps. Things such as getting the user’s name or homepage would involve so many lines of code that developers wouldn’t bother, or take hard-coded shortcuts. LDflex answers those needs with concise expressions, exposed through solid.data in the browser:
const { data } = solid;
const name = await data.user.firstName;
const email = await data.user.email;
for await (const friend of data.user.friends.firstName)
  console.log(friend);
It is very insightful to understand what is actually going on above. While it looks like traversing a local object, we are actually querying the Web every time we await an LDflex expression. Here are the steps that happen behind the scenes for the LDflex expression solid.data.user.friends.firstName:
- Obtain the WebID URL of the current user.
- Resolve the terms friends and firstName to their unique identifiers.
- Create a SPARQL query that represents the expression (example).
- Fetch the document of the root node (in this case the user’s WebID) through HTTP.
- Execute the SPARQL query on the document and return the result.
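The term resolution and query creation steps above can be sketched as follows. The term mapping is a hypothetical hard-coded table here, and pathToSparql is an illustrative helper rather than LDflex’s actual query generator, so the exact shape of the queries LDflex produces may differ.

```javascript
// Hypothetical term-to-URL mapping; in practice a JSON-LD context provides this
const terms = new Map([
  ['friends', 'http://xmlns.com/foaf/0.1/knows'],
  ['firstName', 'http://xmlns.com/foaf/0.1/givenName'],
]);

// Turn a subject URL and a list of terms into a SPARQL SELECT query in which
// every term becomes one triple pattern, chained to the previous one
function pathToSparql(subject, path) {
  const patterns = path.map((term, i) => {
    if (!terms.has(term)) throw new Error(`Unknown term: ${term}`);
    const from = i === 0 ? `<${subject}>` : `?v${i - 1}`;
    const to = i === path.length - 1 ? '?result' : `?v${i}`;
    return `  ${from} <${terms.get(term)}> ${to}.`;
  });
  return `SELECT ?result WHERE {\n${patterns.join('\n')}\n}`;
}

console.log(pathToSparql('https://you.example/profile#you', ['friends', 'firstName']));
```

The remaining steps, fetching the document and executing the query, are the job of a query engine and are left out of this sketch.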
These steps (or a variation thereof) are what you’d need to do yourself for every piece of data you need. And while abstractions such as functions could definitely facilitate all of this, it’s hard to beat the developer experience of just writing an expression. It’s much shorter than squeezing a GraphQL query into a React component; so short in fact that the expressions can just be written as inline properties.
In addition to user data, you can query any Linked Data resource on the Web:
data['https://ruben.verborgh.org/profile/#me'].firstName
data['https://ruben.verborgh.org/profile/#me'].homepage
data['https://ruben.verborgh.org/profile/#me'].friends.firstName
data['https://ruben.verborgh.org/profile/#me'].blog.schema_blogPost.label
These expressions can be used in a standalone way or, for instance, as a value in the src property of the Solid React components (where solid.data is omitted for brevity). And it’s not just React—
Getting the “feel” right
Several older libraries I had seen would provide specific object-oriented wrappers around Linked Data resources. You’d give them the URL of a document, and they would happily populate a JSON object for people, photos, or any other domain-specific concept. This approach has a couple of drawbacks:
- Such libraries are always domain-specific. If you are dealing with a different type of data, you cannot use them. This is odd, since Linked Data can model anything.
- They assume that objects have a specific set of properties. This is a major restriction, since Linked Data enables arbitrary data shapes.
- They remove links by flattening the world into a local object. However, that object cannot possibly contain all data, since Linked Data is spread across the Web.
In other words, by dumbing down Linked Data to a plain old JSON object, we lose the advantages and flexibility of Linked Data and inherit only drawbacks. This happens because JSON objects are trees, whereas Linked Data is a graph. So the pure object-oriented abstraction for Linked Data is broken by design.
When designing LDflex, I was looking for an abstraction that would provide the power and “feel” of Linked Data, while still feeling familiar to developers. This is why LDflex expressions feel like local JSON objects, whereas actually they’re not. Expressions such as the following are a hint for that behavior:
data['https://ruben.verborgh.org/profile/#me'].label
You can substitute my WebID URL by any other Linked Data resource, and it will still work. So solid.data poses as an object with an infinite number of properties, which is much closer to the true nature of Linked Data.
The magical switch from local expression to remote data source happens when we use the JavaScript await keyword:
// This line does nothing yet…
const expression = data.user.friends.name;
// …but this line fetches data from the Web
const name = await expression;
Under the hood: JavaScript Proxy and JSON-LD
LDflex works through JavaScript Proxy objects, which provide a mechanism for intercepting arbitrary properties. With Proxy, we can ensure that arbitrarily complex paths such as my.random.path.expression will actually resolve to a meaningful value, even if the my object does not really have any of these properties.
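A minimal sketch of this interception technique is shown below. The createPath helper and the special segments key are hypothetical names chosen for illustration; the example only records which properties were accessed and does not reflect how LDflex structures its own proxies.

```javascript
// Create an object on which any property path "exists": every property access
// returns a new proxy that remembers the path followed so far
function createPath(segments = []) {
  return new Proxy({}, {
    get(target, property) {
      // Expose the recorded path under a special (hypothetical) key
      if (property === 'segments') return segments;
      // Symbols are used internally by JavaScript; they are not path terms
      if (typeof property === 'symbol') return undefined;
      return createPath([...segments, property]);
    },
  });
}

const my = createPath();
console.log(my.random.path.expression.segments);
// [ 'random', 'path', 'expression' ]
```

The target object stays empty throughout; the get trap alone decides what every property access returns.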
Recall that with Linked Data, terms have a universal meaning so they can work across different back-ends. Therefore, a core task of LDflex is to translate simple terms into URLs. For example, the path user.friends.firstName on solid.data will be resolved in the following way:
- user becomes https://you.example/profile#you (the current user’s WebID)
- friends becomes http://xmlns.com/foaf/0.1/knows
- firstName becomes http://xmlns.com/foaf/0.1/givenName
Crucially, this knowledge is not hard-coded into LDflex itself. The translation from term into URL is freely configurable through a JSON-LD context. LDflex thereby applies the same mechanism for marking up a JSON object with @context to the infinite Linked Data graph on the Web. This flexibility is achieved through multiple libraries:
- The LDflex core library contains the resolution and query mechanisms without concrete implementations. It knows how to resolve paths and generate SPARQL queries, but you still have to configure it with a JSON-LD context and query engine.
- Comunica for LDflex makes the Comunica query engine work with LDflex expressions. The LDflex core library will pass it a SPARQL query for execution.
- LDflex for Solid is a configuration of LDflex that provides the user object and a JSON-LD context containing useful terms for Solid. This configuration thus defines what user, friends, and firstName mean to Solid apps.
Together, they provide the feeling of an infinite local object that accesses Linked Data on the entire Web. This final piece of magic is provided through the LDflex core library by implementing await and for await support. When await is used on an LDflex expression, the expression is treated as a Promise by calling the then method under the hood. LDflex wires then to the first result of a query execution. Similarly, for await is wired up as a method call to Symbol.asyncIterator.
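The wiring of then and Symbol.asyncIterator can be illustrated with a small Proxy-based sketch. The execute function below is a hypothetical stand-in that returns canned results instead of running a query, and createQueryPath is an illustrative name; the sketch shows the mechanism only, under those assumptions.

```javascript
// Hypothetical stand-in for query execution: yields canned results for a path
async function* execute(segments) {
  const results = { 'user.friends.firstName': ['Alice', 'Bob'] };
  yield* results[segments.join('.')] ?? [];
}

function createQueryPath(segments = []) {
  return new Proxy({}, {
    get(target, property) {
      // `await path` calls path.then(resolve, reject):
      // resolve with the first result of the execution
      if (property === 'then')
        return (resolve, reject) =>
          execute(segments).next().then(({ value }) => value).then(resolve, reject);
      // `for await (… of path)` calls path[Symbol.asyncIterator]()
      if (property === Symbol.asyncIterator)
        return () => execute(segments);
      if (typeof property === 'symbol') return undefined;
      return createQueryPath([...segments, property]);
    },
  });
}

async function demo() {
  const data = createQueryPath();
  const expression = data.user.friends.firstName; // nothing is executed yet
  const first = await expression;                 // triggers `then`
  const all = [];
  for await (const name of expression)            // triggers Symbol.asyncIterator
    all.push(name);
  return { first, all };
}

demo().then(({ first, all }) => console.log(first, all));
// Alice [ 'Alice', 'Bob' ]
```

One consequence of this design is that every path object is a thenable, so returning one from an async function would immediately resolve it; the demo therefore returns plain values.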
Explore LDflex in the Solid LDflex playground. You can find inspiration for expressions in the Solid LDflex documentation and its JSON-LD context.
The future is write
Solid aims to realize a read–write Web through Linked Data. As a technology advocate for Inrupt, I see my main role in designing new technological experiences to support that goal. In my previous blog post, I pointed to the importance of queries for decentralized applications, because apps do not (and should not) know how to retrieve data. LDflex realizes this with simple query expressions. While LDflex is not the answer for all query needs, it covers many quick cases much faster than other query languages.
In the future, we will definitely want to explore more powerful languages such as GraphQL. I’m purposely not mentioning SPARQL, as the developer tooling for GraphQL is so much better that it might make more sense to add universal meaning to GraphQL instead of building SPARQL tooling from scratch.
The next leap for LDflex is obviously write: making it as easy to add or change data as it is to read. Because of the flexibility of Linked Data, writing comes with several challenges, such as where to store that data and how. Writing Linked Data doesn’t necessarily mean writing triples, as the following exciting examples show (try them!):
// Follow me
data['https://ruben.verborgh.org/profile/#me'].follow()
// Like all of my blog posts
data['https://ruben.verborgh.org/profile/#me']
  .blog.schema_blogPost.like()
// Dislike Facebook
data['https://facebook.com/'].dislike()
When liking becomes as simple as calling a like() method, such interactions are much easier to create, and hence much more likely to be provided.
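To give an idea of what a like() call might need to produce, the following sketch builds a Like activity in the shape of the JSON-LD snippet shown earlier in this post. The createLike helper is hypothetical, and the sketch deliberately leaves open where and how the result would be stored, which is exactly one of the challenges mentioned above.

```javascript
// Build a Like activity in the shape of the earlier JSON-LD snippet.
// Storing it (where, and under which URL) is outside the scope of this sketch.
function createLike(actor, object, published = new Date()) {
  return {
    '@context': 'https://www.w3.org/ns/activitystreams',
    type: 'Like',
    actor,
    object,
    published: published.toISOString(),
  };
}

const like = createLike(
  'https://you.example/profile#you',
  'https://ruben.verborgh.org/posts/1234',
);
console.log(JSON.stringify(like, null, 2));
```

The caller only supplies who likes what; the universal meaning comes from the context, not from the app.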
Evolving the decentralized developer experience
In his 2018 tech retrospective, André Staltz remarked that, while scoring well on governance and freedom, decentralized projects still require a strong investment in user experience (UX). With this blog post, I am arguing that the decentralized community should focus on developer experience (DX) first, because front-end developers are the ones reaching end users and shaping their experience. We should trust that their talents for creating an appealing user experience far exceed ours.
The question thus is how we can enable front-end developers in the best way possible. Dan Brickley and Libby Miller hit the nail on the head when they wrote:
People think RDF is a pain because it is complicated. The truth is even worse. RDF is painfully simplistic, but it allows you to work with real-world data and problems that are horribly complicated.
However, that does not mean everyone needs to be exposed to RDF. RDF introduces a different way of thinking, on top of the horribly complicated nature of decentralized programming, and we should not force that on front-end developers. Instead, we should leverage the vast amounts of knowledge they already have, and tap into their frameworks and tools.
The only thing we can—
These developers are not burdened by the Semantic Web mistakes of the past, such as our over-reliance on XML and ontologies. We are not trying to reboot the Semantic Web, or force developers once more into our world. This is about Linked Data as a solution for building decentralized apps. The success of Schema.org shows that there is room for these solutions, and I notice a lot of enthusiasm among the young developers who are taking their first steps in the Solid world. This is the future, not the past.
Importantly, the React and LDflex libraries are not just giving front-end developers tools for building end-user apps. They also contain the base components to start creating new libraries and tools. Our goal should be to foster an ecosystem of these, instead of writing everything ourselves.
Contrary to a once dominant sentiment in the Semantic Web research community, realizing sufficiently usable abstractions for a broad developer audience is not a trivial engineering task. The simpler the abstractions, the more complex the intelligence required to realize them. We need research that is able to fill in the blanks if we no longer want developers to overspecify their interactions with data. Concretely, LDflex reduces a complex federated query execution mechanism to a couple of keywords, which means the query engine needs to figure out all of the HTTP requests developers would have written manually. Realizing this leans on many years of research; further optimizing it on many more to come.
By enabling front-end developers, we open up a highway for creativity that will make the decentralized Web reach end users much faster and better. Moreover, we can use the same libraries and tools to accelerate our own development. So we profit from a whole army of new talent, and are at the same time able to better leverage our own. Enabling front-end developers is enabling users, and ultimately enabling ourselves.