The object-resource impedance mismatch

What happens if we model the semantics of HTTP methods in JavaScript?

Most programmers are not familiar with resource-oriented architectures, and this unfamiliarity makes them fall back on what they know. This is why we often see URLs with action names in them, even though they don’t belong there: URLs are supposed to identify resources, and HTTP defines the verbs we can use to view and manipulate the state of those resources. Evidently, there is quite a mismatch between imperative (object-oriented) languages and HTTP’s resources-and-representations model. What would happen if we thought the other way round and modelled HTTP methods in an imperative programming language?

27 September 2012

The number one source of the mismatch between HTTP and programming languages such as JavaScript is the difference between resources and objects, and specifically the lack of representations in the latter. In traditional languages, data is encapsulated in objects (or structs in C-like languages), and manipulations happen directly on the object itself. As you can see in the figure below, creating an object in JavaScript leads to a single representation of that object somewhere in memory, with a similar structure (if not identical) to that of the code. In contrast, resource retrieval and manipulation in HTTP happens indirectly through representations.

[Diagram comparing the use of internal and external representations in JavaScript and HTTP] — Unlike resource representations, object representations are unique and manipulated directly.

In the example above, you can see a piece of JavaScript code on the left with its corresponding JSON-like representation in memory. The fact that there is only one representation makes the concept actually superfluous: often, we just say that the object itself sits in memory. If you compare this to HTTP on the right, you’ll see the situation is completely different: the internal representation is data in a MySQL database, and among the many external representations are HTML, JSON, and RDF. In HTTP, these representations are important for interoperability and persistence. Offering multiple representations means clients are not all forced to understand one particular media type (such as the server’s internal format). Abstracting the server’s internal data into resources means that clients can keep addressing the same resource, even if the server’s internal format changes.
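To make the contrast concrete, here is a minimal sketch of one internal object yielding several external representations. The `represent` helper and the selection of media types are assumptions made for illustration only; a real server would also escape HTML and support more formats.

```javascript
// Internal representation: a plain object standing in for a database row.
const tom = { name: 'Tom', species: 'Cat' };

// Hypothetical helper: derive an external representation for a media type.
function represent(resource, mediaType) {
  switch (mediaType) {
    case 'application/json':
      return JSON.stringify(resource);
    case 'text/html':
      // Escaping is omitted to keep the sketch short.
      return '<dl><dt>Name</dt><dd>' + resource.name + '</dd>' +
             '<dt>Species</dt><dd>' + resource.species + '</dd></dl>';
    case 'text/plain':
      return resource.name + ' (' + resource.species + ')';
    default:
      return null; // no representation available in this format
  }
}

console.log(represent(tom, 'application/json'));
console.log(represent(tom, 'text/html'));
console.log(represent(tom, 'text/plain'));
```

The object is stored once, yet each client can receive the format it understands, and the internal structure can change without affecting the URL.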

Programmers often try to ignore or circumvent these differences, which in practice leads to URLs such as the following:

/index.php?action=display&type=pet&name=Felix
/index.php?action=display&type=pet&name=Felix&format=json
/index.php?action=changeAge&type=pet&name=Felix&newAge=8
/index.php?action=delete&type=pet&name=Felix&confirm=yes

Note that I’m not criticizing the subjective ugliness of the above URLs. What I argue is that they are tied to a specific implementation, and thereby prevent interoperability and long-term persistence. For example, do the first two URLs refer to the same resource? Right now, likes or comments on them will go to different places. The third and fourth URLs don’t identify resources but actions, and this can be dangerous: a crawler or prefetching browser that simply follows such a link with GET would change or delete data. The HTTP uniform interface was invented precisely to prevent this.

To illustrate how to bridge between HTTP and a programming language, I won’t take the traditional approach of showing how to properly design a REST interface. Instead, let’s see what the semantics of HTTP would look like when modelled in JavaScript. Be sure to try out the code on GitHub to see the examples in this article live.

The GET method

The GET method in HTTP is used to retrieve a representation of the resource identified by the specified URL. A naive implementation could be the following:

// GET /animals/tom
console.log(resources['/animals/tom']);

However, then we make the mistake of tying the external to the internal representation. Let’s assume, for the sake of simplicity, that content negotiation has already taken place and that client and server have agreed on JSON. We can make the act of creating a representation visible with JSON.stringify:

// GET /animals/tom
console.log(JSON.stringify(resources['/animals/tom']));

Furthermore, HTTP transfers a representation inside of a message. Assuming a respond method that takes care of the connections, we send the 200 OK status code to the client. This is the final code that returns a representation of Tom:

// GET /animals/tom
respond(200, JSON.stringify(resources['/animals/tom']));
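The snippets in this article assume a `resources` store and a `respond` method. A minimal sketch of both could look as follows, so that the GET example can be run in isolation. Here, `respond` only builds and logs a message object rather than writing to a connection, and the 404 branch for unknown URLs is an addition that the text above does not cover.

```javascript
// Assumed in-memory store of resources, keyed by URL path.
const resources = {
  '/animals/tom': { name: 'Tom', species: 'Cat' },
};

// Assumed stand-in for the respond method: it builds the response message
// and logs it instead of writing to a network connection.
function respond(status, body) {
  const message = { status: status, body: body };
  console.log(message);
  return message;
}

// GET <url>: send a JSON representation of the resource, if there is one.
function get(url) {
  if (!Object.prototype.hasOwnProperty.call(resources, url)) {
    return respond(404); // addition: unknown URLs yield 404 Not Found
  }
  return respond(200, JSON.stringify(resources[url]));
}

get('/animals/tom');   // status 200 with Tom's JSON representation
get('/animals/spike'); // status 404, no body
```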

The PUT method

The PUT method has often been misunderstood. Popular Web applications have used it to update parts of a resource, whereas PUT should be used to place the resource in the message’s body, as a whole, at the specified URL. This means there are two cases.

Either the resource did not exist yet on the server, in which case a new resource is created (status code 201 Created). Here, we create a Jerry resource:

// PUT /animals/jerry (with JSON body)
body = '{ "name": "Jerry", "species": "Mouse" }';
resources['/animals/jerry'] = JSON.parse(body);
respond(201);

Or the resource already existed, in which case it is replaced (status code 200 OK). Here, we replace the Tom resource:

// PUT /animals/tom (with JSON body)
body = '{ "name": "Tom", "species": "Cat" }';
resources['/animals/tom'] = JSON.parse(body);
respond(200, JSON.stringify(resources['/animals/tom']));

In the above examples, we deal with two different kinds of representations. The first is a representation made by the client, which has to be parsed by the server (with JSON.parse). The other is a representation made by the server, which doesn’t necessarily have to be the same. In fact, it will be different in most cases, but the good thing is: you don’t have to know, since HTTP deals with that for you.
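The two PUT cases can be folded into one function that checks whether the URL is already in use. This is a minimal sketch: the `put` helper is hypothetical, and `respond` is assumed to simply return the message instead of sending it.

```javascript
// Assumed in-memory store, keyed by URL path.
const resources = {
  '/animals/tom': { name: 'Tom', species: 'Cat' },
};

// Assumed stand-in for respond: returns the message instead of sending it.
function respond(status, body) {
  return { status, body };
}

// Hypothetical helper: place the parsed body at the URL chosen by the client.
function put(url, body) {
  const existed = Object.prototype.hasOwnProperty.call(resources, url);
  resources[url] = JSON.parse(body);
  return existed
    ? respond(200, JSON.stringify(resources[url])) // replaced
    : respond(201);                                // newly created
}

const jerry = '{ "name": "Jerry", "species": "Mouse" }';
const first = put('/animals/jerry', jerry);  // status 201: created
const second = put('/animals/jerry', jerry); // status 200: replaced
console.log(first.status, second.status, Object.keys(resources).length);
```

Repeating the request changes the status code but not the outcome: there is still exactly one Jerry resource with the same contents.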

The PATCH method

Although PATCH is not part of the core HTTP/1.1 standard (it is defined separately in RFC 5789), it can be used to update parts of a resource, which is not possible with PUT. Resources are modified with a patch document in a format such as JSON Patch. This makes the PATCH method more complicated, since client and server need to understand an additional representation format. Furthermore, PATCH is not necessarily idempotent, so executing the same request twice can give different results. In the example below, we use a custom patch document format, which performs string replacement.

// PATCH /animals/jerry (with JSON body)
resource = resources['/animals/jerry'];
patchDoc = '{ "prop": "name", "find": "r", "replace": "f" }';
patch = JSON.parse(patchDoc);
resource[patch.prop] = resource[patch.prop].replace(patch.find, patch.replace);
respond(200, JSON.stringify(resource));

The first request would change Jerry’s name into Jefry; a second request would further change it into Jeffy.
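The effect of repeating the request can be checked with a small sketch that applies the same patch document several times. The `patch` helper is hypothetical and wraps the custom string-replacement format used above.

```javascript
// Assumed in-memory store, keyed by URL path.
const resources = {
  '/animals/jerry': { name: 'Jerry', species: 'Mouse' },
};

// Hypothetical helper: apply the custom string-replacement patch format.
function patch(url, patchDoc) {
  const resource = resources[url];
  const change = JSON.parse(patchDoc);
  // String.prototype.replace with a string pattern replaces only the
  // first occurrence, which is why each request changes one letter.
  resource[change.prop] =
    resource[change.prop].replace(change.find, change.replace);
  return { status: 200, body: JSON.stringify(resource) };
}

const patchDoc = '{ "prop": "name", "find": "r", "replace": "f" }';
const names = [];
for (let i = 0; i < 3; i++) {
  patch('/animals/jerry', patchDoc);
  names.push(resources['/animals/jerry'].name);
}
console.log(names); // [ 'Jefry', 'Jeffy', 'Jeffy' ]
```

The first two identical requests leave the resource in different states, so this patch format is not idempotent; only once no "r" is left does repeating the request stop having an effect.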

The DELETE method

Modelling the DELETE method is quite straightforward, since JavaScript provides a specific operator for this purpose. We return a status code of 204 No Content, since there isn’t much left to report when the resource is deleted.

// DELETE /animals/jerry
delete resources['/animals/jerry'];
respond(204);
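Like PUT, DELETE is idempotent: repeating the request leaves the server in the same state, even if the status code of the repeated request differs. The sketch below illustrates this under some assumptions: the `remove` helper is hypothetical (`delete` is a reserved word), and answering 404 for an already deleted resource is one possible choice.

```javascript
// Assumed in-memory store, keyed by URL path.
const resources = {
  '/animals/jerry': { name: 'Jerry', species: 'Mouse' },
};

// Hypothetical helper: remove the resource at the given URL.
function remove(url) {
  const existed = Object.prototype.hasOwnProperty.call(resources, url);
  delete resources[url];
  // Assumption: report 404 when there was nothing to delete.
  return { status: existed ? 204 : 404 };
}

const first = remove('/animals/jerry');  // status 204
const second = remove('/animals/jerry'); // status 404, state unchanged
console.log(first.status, second.status, Object.keys(resources).length);
```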

The POST method

The POST method is the most complicated and mysterious of all, since the actual action of POST depends on the underlying resource. Broadly speaking, there are two kinds of cases, both of which are non-idempotent.

The first case is when a POST request manipulates the state of a resource. For example, a POST to the Tom resource could increase the number of mice caught:

// POST /animals/tom
resources['/animals/tom'].miceCaught += 1;
respond(200, JSON.stringify(resources['/animals/tom']));

The second case is when a subordinate resource is created. In this case, a POST to the animals resource could create a new animal resource. Note how the returned representation contains the URL of the new resource.

// POST /animals (with JSON body)
newResourceUri = '/animals/' + (Object.keys(resources).length + 1);
resources[newResourceUri] = JSON.parse('{ "name": "Pluto", "species": "Dog" }');
respond(201, JSON.stringify({ location: newResourceUri }));

The difference between creating a resource with PUT and with POST is that, in the case of PUT, the client decides what the URL of the new resource will be. With POST, the server decides the URL. Also, PUT is idempotent: issuing the same request multiple times leads to only one resource. POST, on the other hand, will create a new resource every time.
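The following sketch puts both ways of creating a resource side by side. The `put` and `post` helpers are hypothetical, and the `nextId` counter is an assumption: it replaces the key count used earlier so that URLs are never reused after a deletion.

```javascript
// Assumed in-memory store, keyed by URL path.
const resources = {};
let nextId = 1; // assumed server-side counter for generating URLs

// PUT: the client chooses the URL.
function put(url, body) {
  resources[url] = JSON.parse(body);
  return url;
}

// POST: the server chooses the URL of the new subordinate resource.
function post(collectionUrl, body) {
  const url = collectionUrl + '/' + nextId++;
  resources[url] = JSON.parse(body);
  return url;
}

const pluto = '{ "name": "Pluto", "species": "Dog" }';
put('/animals/pluto', pluto);
put('/animals/pluto', pluto); // same URL: still a single resource
post('/animals', pluto);
post('/animals', pluto);      // a second resource with a new URL

console.log(Object.keys(resources));
// [ '/animals/pluto', '/animals/1', '/animals/2' ]
```

Two identical PUT requests result in one resource, whereas two identical POST requests result in two.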

Bridge the gap: model your resources

Probably the most important lesson here is that the conversion between objects and resources is not straightforward. Since relatively few developers have been taught the semantics of HTTP resources, representations, and methods, it can be hard to point out mistakes when they expose an object model as a Web application. However, since they are familiar with programming languages, modelling HTTP in JavaScript could point them to the object-resource impedance mismatch.

Therefore, if you know programmers who need a thorough understanding of HTTP, point them to this blog post ;-)

…and if you like experimenting yourself, download the source code from GitHub.

Ruben Verborgh