Turtles all the way down

APIs are more than just data: context and controls also belong in the message.

How can we ever talk about intelligent clients if we don’t provide them with opportunities to be intelligent? The current generation of RDF APIs is patronizing its clients by only describing its data in RDF. This contrasts with websites for humans, where data would be quite useless if it were not accompanied by context and controls. By omitting these, we withhold basic information from clients, like “what’s in this response?” and “where can I go next?”. This post proposes to extend the power of self-descriptiveness from data to API responses as a whole. Using RDF graphs, we can combine data, context, and controls in one response. RDF APIs need to become like websites, explaining to clients where they are and what they can do.

6 October 2015

Suppose you need to buy a present for a kid who likes pigs, and Peppa Pig in particular. One obvious way for humans to do that is to visit the Amazon website and to type in a relevant search term. Amazon will answer with a results page similar to this.

[A screenshot of an Amazon search page for “Peppa Pig”, showing 3 Peppa Pig articles, surrounded with navigation bars and explanatory texts.] — Everywhere on the human Web, data is combined with context and controls. ©Amazon

What makes this page so webby is that everything we need is there. Of course, we get a list of results as expected, but take the time to appreciate how much more is there. In particular, observe these:

There’s a header telling you that you’re on Amazon.
The top bar tells you that you’re looking at results for the keyword “Peppa Pig”.
In fact, you’re only seeing 16 results out of 30,862.
Each of the items in the list has links to a detail page.
There’s a menu on the left to refine the results.

These might seem like trivial things, and they are, because we use them every day. However, just imagine that any of these were not there. For instance, suppose there were no links. How could you buy the Peppa plushy? You’d need to read the Amazon manual to find out. Fair enough, but remember: no links, so where would you find that? Even worse, suppose you had none of the above: the page would then consist of only data. It would look more or less like this:

[A screenshot showing 3 Peppa Pig articles, without any headers, menus, or additional texts surrounding them.] — Just a list of data, devoid of any context or controls. ©Amazon

If this doesn’t seem inconvenient at first, note that there are no links for you to click; it’s only text and images. Try to answer the following questions:

Where are you?
Where can you go next?
What can you do with the data?

This is especially confusing if you just see this screenshot, or if you arrived at this document through a direct link. I mean, are you even on Amazon? And on which page? What’s the context of the info? I challenge you to get anything done with this data. Even though it tells you that, for example, a Peppa costs $4.32, you don’t know where and how to buy it at this price, and whether you should trust whoever is selling it.

Building webpages like this doesn’t make sense if we want clients to do interesting things with them. Yet this is exactly how typical RDF interfaces are designed nowadays: we give machines only data, but neither the context to interpret it nor the controls to act upon it. Should we be surprised they can’t do anything remotely cool?

All eyes are on the data

If we designed RDF representations for the above response with the current mindset, we would end up with something similar to this.

</items/45158567#id> a pto:Book;
  schema:name "Peppa Goes Swimming"@en;
  schema:offer [ schema:price 3.68; schema:priceCurrency "USD" ];
  schema:aggregateRating [ schema:ratingValue 4.5 ].
</items/35235179#id> a pto:Toy;
  schema:name "Peppa Pig Regular Plush"@en;
  schema:offer [ schema:price 4.32; schema:priceCurrency "USD" ];
  schema:aggregateRating [ schema:ratingValue 4.5 ].
</items/10268448#id> a pto:Dress;
  schema:name "Peppa Pig Long Sleeve Bicycle Gown"@en;
  schema:offer [ schema:price 8.99; schema:priceCurrency "USD" ];
  schema:aggregateRating [ schema:ratingValue 4 ].

All the data is there, obviously: the three articles are adequately represented. But, contrary to the real-world Amazon example, these three items would make up the full response. There wouldn’t be more in there. What we have is effectively the second screenshot: just data, no context or controls. So:

How would a client know where it is?
How would a client know where it can go next?
How would a client know what to do with this data—and how?

The client doesn’t even know what page it is on, or what site, despite its understanding of the data as such. RDF APIs seem afraid to put anything but data in a response. Yet, as the screenshots above show, meaningful data is quite useless without this context.

Example: Linked Data Platform

The need for context is so pressing that people went looking for other places to put it: if it’s not allowed in the body, you could put it in the headers. For instance, the Linked Data Platform specification (LDP) suggests stuffing Link headers into the HTTP message:

HTTP/1.1 200 OK
Content-Type: text/turtle
Link: <http://www.w3.org/ns/ldp#Resource>; rel="type",
      <http://www.w3.org/ns/ldp#Page>; rel="type"
Link: </search/?q=peppa+pig&page=2>; rel="next",
      </search/?q=peppa+pig&page=1>; rel="first"

I find using the Link header especially ironic, because it essentially just yields triples:

<> a <http://www.w3.org/ns/ldp#Resource>.
<> a <http://www.w3.org/ns/ldp#Page>.
<> iana:next </search/?q=peppa+pig&page=2>.
<> iana:first </search/?q=peppa+pig&page=1>.
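
The correspondence between Link headers and triples can be made concrete with a small Python sketch. The function name `link_header_to_triples`, the expansion chosen for the `iana:` prefix, and the simplified regular expression are assumptions for illustration only; this is not a complete Link header parser.

```python
import re

# Assumed expansion of the "iana:" prefix; the post does not spell it out.
IANA = "http://www.iana.org/assignments/relation/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Simplified pattern for <target>; rel="relation" (not a full header parser).
LINK = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')

def link_header_to_triples(header_value, subject="<>"):
    """Return (subject, predicate, object) tuples for one Link header value."""
    triples = []
    for target, rel in LINK.findall(header_value):
        # rel="type" corresponds to rdf:type ("a" in Turtle).
        predicate = RDF_TYPE if rel == "type" else IANA + rel
        triples.append((subject, predicate, target))
    return triples

header = ('</search/?q=peppa+pig&page=2>; rel="next", '
          '</search/?q=peppa+pig&page=1>; rel="first"')
for triple in link_header_to_triples(header):
    print(triple)
```

Every header value maps to exactly one statement about the response, which is the point: nothing in the header needs a channel other than RDF.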

So close, yet still so far. If they’re triples, why can’t we put them inside of the message? It’s like we’re afraid to put anything non-data in the body. Because those poor simple clients might get confused? But wasn’t RDF supposed to be self-descriptive, so how could they be confused if the response describes itself?

Imagine how messy headers would get if we put actual context in them, like the title of the page, the total number of items, the fact that orders over $35 have free shipping.

(Note: LDP has some features to represent context, but you don’t really combine them with actual data. In those responses, the context is the data.)

Example: SPARQL protocol

It’s not just LDP, or any specification in particular. It is a general pattern of how RDF APIs so far are being designed. The same happens, for example, with SPARQL endpoint results. If I issued a query for Peppa Pig articles, I’d get back data similar to the above. Inside of the message, there wouldn’t be any indication of “you’re now looking at the results of query X” or “these are items 1–16 of 30,862”.

One way the SPARQL protocol and other APIs get around this is by defining specific content types like application/sparql-results+xml. Instead of simply saying, in RDF, “these are the results to a SPARQL query” and “here is where you can ask more SPARQL queries”, clients need to be hard-coded against this specific header constraint. That makes sense for only narrow scenarios; in other words, if all Web shops everywhere all behaved exactly the same. Of course, they don’t.

Some endpoints do describe themselves, like the Turtle version of the DBpedia SPARQL endpoint. Interestingly, those only describe the “entry point” (AKA the resource that is usually called /sparql), not any of the endpoint responses. The /sparql resource is indeed the only one that does not have data, so apparently it felt okay to put some context there. But still, it’s strange. If we did this on the human Web, the starting page of Amazon would have all the context of screenshot 1, whereas all subsequent pages would look like screenshot 2. Obviously, all pages on the human Web are treated equally, because all pages should be equally usable. What if you get the link to a SPARQL response from somewhere, are you supposed to figure out the context by yourself?

Current RDF API responses are not self-descriptive

Just imagine this in HTML. Imagine that you only see the second screenshot, and that your browser has to figure out everything else from the headers. What would your Web experience look like? The context and controls belong alongside the data.

The bigger issue here is that RDF API responses are not self-descriptive. This is quite ironic, given that RDF is the Resource Description Framework. Why don’t we even describe our own messages? If anything should be described, it’s these! They’re the main inputs we generate for our own RDF clients, yet for some reason we deem clients incapable of handling actually useful RDF. Without context and controls, clients have no choice but to exhibit dumb behavior.

Responses with context and controls

Combining everything

The alternative is simple: why don’t we do what is done everywhere else on the Web? We need to stop patronizing RDF clients and stop always creating exceptions for them. On the human Web, we describe data, context, and controls alongside each other in the same language. The Peppa Pig articles were described in English, and so was all of the context: here are a book, a toy, and a dress, you’re on Amazon looking at the first 16 results, and you get free shipping with orders over $35. Let’s try that:

<> a hydra:Collection, hydra:PagedCollection;
   dc:title "Amazon results for ‘Peppa Pig’";
   hydra:totalItems 30862;
   hydra:itemsPerPage 16;
   hydra:firstPage </search/?q=peppa+pig&page=1>;
   hydra:nextPage </search/?q=peppa+pig&page=2>.
</#amazon> ex:paymentCondition [
  ex:condition [ ex:minimumTotal 35.00 ];
  ex:result [ ex:shippingCost 0.00 ]
].
</items/45158567#id> a pto:Book;
  schema:name "Peppa Goes Swimming"@en.
</items/35235179#id> a pto:Toy;
  schema:name "Peppa Pig Regular Plush"@en.
</items/10268448#id> a pto:Dress;
  schema:name "Peppa Pig Long Sleeve Bicycle Gown"@en.

Yes, I actually mixed data, context, and controls there. This response is not fully accessible for machines yet. For example, simple questions like “how many items are on this page?” are not straightforward to answer:

SELECT DISTINCT ?thing { ?thing ?p ?o. }

will now yield more than just the 3 Peppa items. And some of those, like /#amazon with its ex:paymentCondition, might not be understood by the poor client. Oh, the horror!
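
The effect of that query on the mixed response can be sketched in plain Python, with triples as tuples. The list below is a hand-written subset of the response (blank nodes left out), and `PAGE` is a hypothetical URL for the response itself; both are assumptions for illustration.

```python
# Triples as (subject, predicate, object) tuples.
PAGE = "/search/?q=peppa+pig&page=1"  # hypothetical URL of the response itself
triples = [
    (PAGE, "rdf:type", "hydra:PagedCollection"),
    (PAGE, "hydra:totalItems", 30862),
    (PAGE, "hydra:nextPage", "/search/?q=peppa+pig&page=2"),
    ("/#amazon", "ex:paymentCondition", "_:b0"),
    ("/items/45158567#id", "schema:name", "Peppa Goes Swimming"),
    ("/items/35235179#id", "schema:name", "Peppa Pig Regular Plush"),
    ("/items/10268448#id", "schema:name", "Peppa Pig Long Sleeve Bicycle Gown"),
]

def distinct_subjects(triples):
    """Mimic SELECT DISTINCT ?thing { ?thing ?p ?o. } over a list of triples."""
    return {s for s, _, _ in triples}

things = distinct_subjects(triples)
print(len(things))  # 5 for this subset: the page, /#amazon, and the 3 items
```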

Graphs let us combine data, context, and controls neatly

Fortunately, RDF 1.1 introduced the concept of named graphs to group triples together. This allows us to mimic the actual Amazon example, which puts the data in a different (visual) container than the context information. We put data in the default graph, and everything else in a dedicated about graph for the resource. Switching to TriG notation:

<#about> {
  <#about> foaf:primaryTopic <>.
  <> a hydra:Collection, hydra:PagedCollection;
     dc:title "Amazon results for ‘Peppa Pig’";
     hydra:totalItems 30862;
     hydra:itemsPerPage 16;
     hydra:firstPage </search/?q=peppa+pig&page=1>;
     hydra:nextPage </search/?q=peppa+pig&page=2>.
  </#amazon> ex:paymentCondition [
    ex:condition [ ex:minimumTotal 35.00 ];
    ex:result [ ex:shippingCost 0.00 ]
  ].
}
</items/45158567#id> a pto:Book;
  schema:name "Peppa Goes Swimming"@en.
</items/35235179#id> a pto:Toy;
  schema:name "Peppa Pig Regular Plush"@en.
</items/10268448#id> a pto:Dress;
  schema:name "Peppa Pig Long Sleeve Bicycle Gown"@en.

All of our problems disappear! Clients now do have access to context and controls within the same response, while at the same time, these are neatly separated from the main data. The response is entirely self-descriptive.

Now if clients want to count the number of items, they will correctly arrive at 3, because everything else is in a separate graph. In fact, clients can just ignore any triples they do not understand, in particular those in non-default graphs. Clients that do understand context and controls can just use them.

But what happens if the data contains quads? Then the client can no longer simply skip all non-default graphs to find the actual data. Fortunately, the answer is simple. The context, which makes the response self-descriptive, is also self-descriptive. It explicitly says “I describe the current response”:

<#about> {
  <#about> foaf:primaryTopic <>.
}

So it’s easy to filter it out. Graphs that describe themselves as being context for the current resource are not part of the data. We can’t be any more clear and complete.
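
A minimal Python sketch of that filtering rule follows, with quads as `(subject, predicate, object, graph)` tuples. The helper name `split_context`, the use of `None` for the default graph, and the URLs are all assumptions made for illustration.

```python
FOAF_PRIMARY_TOPIC = "foaf:primaryTopic"
DEFAULT_GRAPH = None  # assumption: None marks the default graph

def split_context(quads, response_url):
    """Separate data quads from quads in graphs that describe the response.

    A graph g counts as context if it contains the statement
    (g, foaf:primaryTopic, response_url).
    """
    context_graphs = {
        g for s, p, o, g in quads
        if g is not DEFAULT_GRAPH and s == g
        and p == FOAF_PRIMARY_TOPIC and o == response_url
    }
    data = [q for q in quads if q[3] not in context_graphs]
    context = [q for q in quads if q[3] in context_graphs]
    return data, context

PAGE = "/search/?q=peppa+pig&page=1"  # hypothetical response URL
ABOUT = PAGE + "#about"
quads = [
    (ABOUT, FOAF_PRIMARY_TOPIC, PAGE, ABOUT),
    (PAGE, "hydra:totalItems", 30862, ABOUT),
    ("/items/45158567#id", "schema:name", "Peppa Goes Swimming", DEFAULT_GRAPH),
    ("/items/35235179#id", "schema:name", "Peppa Pig Regular Plush", DEFAULT_GRAPH),
    # Data that itself lives in a named graph is still kept as data.
    ("/items/10268448#id", "schema:name",
     "Peppa Pig Long Sleeve Bicycle Gown", "/graphs/dresses"),
]

data, context = split_context(quads, PAGE)
print(len({s for s, _, _, _ in data}))  # 3 items, even though one is in a named graph
```

The client never has to guess which graphs are data: it only removes graphs that explicitly declare themselves to be about the current response.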

Self-descriptive API responses in practice

How does this work in practice, and why would we prefer context and controls inside of the response? Well, for instance, the Triple Pattern Fragments API already uses this principle out of the box. Each response of this API gives clients context of where they are and what they can do. Take this example:

<#about> {
  <#about> foaf:primaryTopic <>.
  <> hydra:itemsPerPage 100; hydra:totalItems 24000;
     hydra:firstPage <?page=1>; hydra:nextPage <?page=2>.

  <http://example.org/store> void:subset <>;
     hydra:search [
       hydra:template "http://example.org/store{?s,p,o}";
       hydra:mapping
         [ hydra:variable "s"; hydra:property rdf:subject ],
         [ hydra:variable "p"; hydra:property rdf:predicate ],
         [ hydra:variable "o"; hydra:property rdf:object ]
     ].
}

It explicitly tells clients that they’re looking at a 24,000-item subset of a dataset, and that this dataset can be searched by triple pattern. They can read more pages by following the links.
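
To show how a client might act on that search control, here is a Python sketch that fills in the template from the response. It handles only the form-style query expression `{?s,p,o}` used in the example, not URI templates in general, and the way the literal is written into the `o` parameter is an assumption rather than something the response above prescribes.

```python
import re
from urllib.parse import quote

def expand_query_template(template, values):
    """Expand form-style query expressions such as {?s,p,o}.

    Only the subset of URI template syntax used in the example is handled;
    variables without a value are left out of the resulting URL.
    """
    def replace(match):
        names = match.group(1).split(",")
        pairs = [f"{name}={quote(values[name], safe='')}"
                 for name in names if values.get(name) is not None]
        return "?" + "&".join(pairs) if pairs else ""
    return re.sub(r"\{\?([^}]*)\}", replace, template)

# Mapping taken from the response: property -> template variable.
mapping = {"rdf:subject": "s", "rdf:predicate": "p", "rdf:object": "o"}

# Hypothetical triple pattern: anything named "Peppa Goes Swimming".
pattern = {"rdf:predicate": "http://schema.org/name",
           "rdf:object": '"Peppa Goes Swimming"@en'}
values = {mapping[prop]: term for prop, term in pattern.items()}

url = expand_query_template("http://example.org/store{?s,p,o}", values)
print(url)
# http://example.org/store?p=http%3A%2F%2Fschema.org%2Fname&o=%22Peppa%20Goes%20Swimming%22%40en
```

Nothing about the URL structure is hard-coded in the client: the template and the meaning of each variable both come from the response.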

If the server supports other features, let’s say a SPARQL endpoint, it can just advertise that in the same response. Different pieces of context information can be combined without conflict, because everything is RDF:

<#about> {
  <#about> foaf:primaryTopic <>.
  <> hydra:itemsPerPage 100; hydra:totalItems 24000;
     hydra:firstPage <?page=1>; hydra:nextPage <?page=2>.

  <http://example.org/store> void:subset <>;
     hydra:search _:triplePatternTemplate.

  </store>  sd:endpoint </sparql>.
  </sparql> sd:feature sd:DereferencesURIs;
            sd:supportedLanguage sd:SPARQL10Query.
}

The client is now informed how to use either of these interfaces. All of this happens inside of the response. The data is not contaminated because we use a different graph, so the client can just ignore whatever it does not understand. There’s nothing hard going on here, just a server telling clients what they can do. Putting all of this in the Link header would be very impractical. But the response body has plenty of space.
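
The idea that clients use what they understand and ignore the rest can be sketched as follows. The function `discover_controls`, the absolute URLs, and the choice of which predicates each client understands are hypothetical; the triples are a hand-written subset of the context graph above.

```python
def discover_controls(context_triples, understood):
    """Group (subject, object) pairs by predicate, skipping unknown predicates."""
    controls = {}
    for s, p, o in context_triples:
        if p in understood:
            controls.setdefault(p, []).append((s, o))
    return controls

# Subset of the context graph as (s, p, o) tuples, with relative URLs
# resolved against an assumed base of http://example.org/.
PAGE = "http://example.org/store?page=1"  # hypothetical URL of the response
context = [
    (PAGE, "hydra:nextPage", "http://example.org/store?page=2"),
    ("http://example.org/store", "hydra:search", "_:triplePatternTemplate"),
    ("http://example.org/store", "sd:endpoint", "http://example.org/sparql"),
    ("http://example.org/sparql", "sd:supportedLanguage", "sd:SPARQL10Query"),
]

# A simple client only follows pagination links ...
simple = discover_controls(context, {"hydra:nextPage"})
# ... while a more capable one also knows what a SPARQL endpoint is.
capable = discover_controls(context, {"hydra:nextPage", "sd:endpoint"})

print(sorted(simple))   # ['hydra:nextPage']
print(sorted(capable))  # ['hydra:nextPage', 'sd:endpoint']
```

Both clients receive the same response; they differ only in how much of it they can put to use.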

You might wonder whether it hurts to include all of this information in each response. After all, it increases the response size. Well, size surely increases… but it’s what we do on the rest of the Web anyway. Since when would bandwidth be an excuse for not including such info on all human webpages? Any webpage would simply not be as useful without it. The same can be said about RDF responses: they are not as useful without context and controls.

Apply the RDF principles to the entire interface

The current generation of RDF clients only uses the bare minimum of what RDF offers. There is simply no other way: data is presented without context to interpret it and without controls to do things with it.

Personally, I don’t understand why specs like LDP go to great lengths to keep context and controls out of the message. At that point, the client is necessarily reduced to a data processor, while it is actually one of the two essential partners in a client/server interaction. For some reason, we’ve been downplaying the role of the client. I’m not saying that all clients are equally intelligent (for whatever definition); however, we shouldn’t necessarily assume that all clients are equally dumb either. Just like we put effort into modeling the semantics of the data, we should model the semantics of the context and controls as well. This empowers clients that know how to use (part of) the context and controls, while simpler clients can just ignore them.

It all comes down to being explicit to clients. How do you indicate the next page? Just say to the client “this is the next page”. How do you indicate the number of items? Just say to the client “this is the number of items”. If you want the client to know something, just say it. RDF gives us the power to make such statements, so it’s a pity we seem so keen on avoiding it.

The beauty of the RDF model is that we can combine this into a single response, just like we’ve been doing for ages with HTML. It’s the most natural thing on the Web: HTML describes data, context, and controls using the same set of building blocks. Thanks to RDF’s graph support, we can do the same and separate data from context and controls, simply by describing each part. We don’t even need new content types for that, just like we keep on using text/html instead of inventing text/amazon-results+html. Simply application/trig or application/ld+json is sufficient, because the interpretation is an integral part of the message.

In essence, making responses self-descriptive means applying Turtle all the way down: use triples to describe the object of the interaction (the data), the properties of the response (the context), and the affordance towards more information (the controls). Except that it’s of course really TriG/N-Quads/JSON-LD all the way down, because only these formats support multiple RDF graphs. But that didn’t sound as nice ;-)

Ruben Verborgh

Thanks to Mark Wilkinson and James Anderson for allowing me to refine these ideas during discussions.