BitWorking

REST Tips: Prefer following links over URI construction

When putting together a table to describe your REST service (and believe me, I've been seeing a lot of these tables recently), you need to distinguish between the server's view and the client's view.

For example, here is our table for the employee service from my worked example of how to create a RESTful protocol:

Table 1: Employee Web Service

Resource       URI                       Method  Representation   Description
Employee List  /employees/               GET     JSON (emp list)  Retrieve the list of employees
                                         POST    JSON (employee)  Create a new employee
Employee       /employees/{employee-id}  GET     JSON (employee)  Retrieve an employee
                                         PUT     JSON (employee)  Update an employee
                                         DELETE  -                Remove an employee

The important point is that this is the server's view of the service: the table is a guide for the implementor of the service on how to structure the URIs. The view that gets documented and presented to the client needs to be slightly different, the most important change being the replacement of URI construction with links.

As a reminder, here is an example of the employee list, which contains the URIs of the employee resources.


  [
    {
      "name": "Joe Gregorio",
      "href": "jcg111002222"
    },
    {
      "name": "John Q. Public",
      "href": "jqp333445555"
    },
    ...
  ]

Figure 1
Employee List JSON Representation

Here is our table as it should be presented to client implementors:

Table 2: Employee Web Service

Resource       URI                                                   Method  Representation   Description
Employee List  /employees/                                           GET     JSON (emp list)  Retrieve the list of employees
                                                                     POST    JSON (employee)  Create a new employee
Employee       Found in the 'href' of each object in Employee List.  GET     JSON (employee)  Retrieve an employee
                                                                     PUT     JSON (employee)  Update an employee
                                                                     DELETE  -                Remove an employee

Note that the client doesn't get to 'know' how to construct employee URIs from employee-ids; instead, it just follows links from the employee list to each employee resource.
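A minimal sketch of what link-following could look like on the client side, in Python. The list URI and the helper name `employee_uris` are assumptions made for illustration; the JSON is the employee list from Figure 1 with the elided entries left out.

```python
import json
from urllib.parse import urljoin

# Assumed location of the employee list; a real client would be handed this URI.
EMPLOYEE_LIST_URI = "http://example.com/employees/"

# The representation from Figure 1, as the client would receive it.
EMPLOYEE_LIST_JSON = """
[
  {"name": "Joe Gregorio", "href": "jcg111002222"},
  {"name": "John Q. Public", "href": "jqp333445555"}
]
"""

def employee_uris(list_uri, body):
    """Map each employee name to the URI found in its 'href'.

    Each href is resolved against the URI the list was retrieved from,
    so the client never builds a URI from an employee-id itself.
    """
    return {entry["name"]: urljoin(list_uri, entry["href"])
            for entry in json.loads(body)}

uris = employee_uris(EMPLOYEE_LIST_URI, EMPLOYEE_LIST_JSON)
print(uris["Joe Gregorio"])  # http://example.com/employees/jcg111002222
```

The only URI this client starts with is the one for the employee list; everything else comes out of the representation.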

Tip: Prefer following links over URI construction.

There are still cases for URI construction, where the representation given is really a recipe for how to construct a URI, such as an HTML form or an OpenSearch document. Even in these cases, the client doesn't have hard-coded knowledge of how to construct a URI; it is only following the recipe in the HTML form or OpenSearch document. If the recipe gets updated, the client will follow the new recipe without needing to be modified.
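As a rough illustration of "following the recipe", here is a sketch of filling in an OpenSearch-style URL template in Python. The template string and the function name `fill_template` are hypothetical; in practice the template would be read from the service's OpenSearch document rather than written into the client.

```python
import re
from urllib.parse import quote

def fill_template(template, values):
    """Fill an OpenSearch-style URL template such as '...?q={searchTerms}'.

    Parameters written as '{name?}' are optional and become empty when
    no value is supplied; a missing required parameter raises KeyError.
    """
    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""
        raise KeyError(name)
    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)

# Hypothetical template, as it might appear in an OpenSearch document.
template = "http://example.com/employees/?name={searchTerms}&page={startPage?}"
print(fill_template(template, {"searchTerms": "Joe Gregorio"}))
# http://example.com/employees/?name=Joe%20Gregorio&page=
```

If the server later publishes a different template, the same `fill_template` call produces URIs in the new shape with no change to the client.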

There are several advantages to keeping URI construction out of view of the client:

Simpler Client Code
Dereferencing a (possibly relative) URI is simpler than constructing a URI and then dereferencing it.
Server flexibility
The URIs on the server side can be changed as needed without having to update all the clients. Yes, you should strive for unchanging Cool URIs, but mistakes happen and this lets you fix those mistakes without updating the clients.
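To make the flexibility point concrete, here is a small Python sketch in which the server reorganizes its URIs and the client code stays the same. The hosts, paths, and the helper name `resolve` are assumptions for illustration only.

```python
from urllib.parse import urljoin

LIST_URI = "http://example.com/employees/"  # assumed location of the list

def resolve(list_uri, entries):
    """Resolve each entry's 'href' against the URI of the list it came from."""
    return [urljoin(list_uri, entry["href"]) for entry in entries]

# Original layout: hrefs are relative to the employee list.
before = [{"name": "Joe Gregorio", "href": "jcg111002222"}]

# Hypothetical reorganization: the server now hands out absolute URIs
# that point at a different host and path.
after = [{"name": "Joe Gregorio",
          "href": "http://hr.example.com/people/jcg111002222"}]

print(resolve(LIST_URI, before))  # ['http://example.com/employees/jcg111002222']
print(resolve(LIST_URI, after))   # ['http://hr.example.com/people/jcg111002222']
```

A client that instead built its URIs from "/employees/" plus an employee-id would keep requesting the old location after the move.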

This isn't just idle theory. At one point in the development of the Atom Publishing Protocol there was a push for a form of URI construction, a WebDAV-like use of the URI path for creating and manipulating resources. Luckily that never made it into the specification, because now I can create a service document for a group of APP collections that currently don't have service documents:

The resource at http://bitworking.org/projects/gdata/gdata-service.atomserv is a service document for all my APP collections on Google, and it looks like:

<?xml version="1.0" encoding='utf-8'?>
<service xmlns="http://purl.org/atom/app#"  xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace> 
    <atom:title>Google</atom:title>
    <collection href="http://www.google.com/calendar/feeds/default/private/full">
        <atom:title>Calendar</atom:title>
    </collection>
    <collection href="http://base.google.com/base/feeds/items" >
        <atom:title>Base</atom:title>
    </collection>
    <collection href="http://www.blogger.com/feeds/6464869902972579239/posts/default" >
        <atom:title>Blogger (jcgregorio)</atom:title>
    </collection>
    <collection href="http://spreadsheets.google.com/feeds/spreadsheets/private/full" >
        <atom:title>Spreadsheet</atom:title>
    </collection>
  </workspace>
</service>
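A client can discover the collections by reading the service document rather than knowing their URIs in advance. The following Python sketch uses the standard library's ElementTree on a shortened copy of the document above (two of the four collections); the function name `collections` is an assumption for illustration, and the namespace is the one used in this document.

```python
import xml.etree.ElementTree as ET

APP_NS = "http://purl.org/atom/app#"      # namespace used in the document above
ATOM_NS = "http://www.w3.org/2005/Atom"

# Shortened copy of the service document, for illustration.
SERVICE_DOC = """<service xmlns="http://purl.org/atom/app#" xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Google</atom:title>
    <collection href="http://www.google.com/calendar/feeds/default/private/full">
        <atom:title>Calendar</atom:title>
    </collection>
    <collection href="http://base.google.com/base/feeds/items">
        <atom:title>Base</atom:title>
    </collection>
  </workspace>
</service>"""

def collections(service_xml):
    """Return (title, href) pairs for every collection in a service document."""
    root = ET.fromstring(service_xml)
    found = []
    for coll in root.iter("{%s}collection" % APP_NS):
        title = coll.findtext("{%s}title" % ATOM_NS)
        found.append((title, coll.get("href")))
    return found

for title, href in collections(SERVICE_DOC):
    print(title, href)
```

The collection URIs have nothing in common structurally, which is fine: the client only ever uses the 'href' values it is given.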

I'm sure this is a small error, but your server table shows a URI /employees/{employee-id} and your client table instructs them to follow what would only be /{employee-id}

I considered making a joke about using XSLT to generate your client table from your server table, but given the nature of the 'net I'm afraid I'd be taken seriously :)

Posted by Josh Peters on 2007-03-16

Josh,

The links in the Employee List are relative URIs. They are relative to the URI from which the Employee List was retrieved. So if our service was located at http://example.com/employees/, then one employee URI would be:

http://example.com/employees/jcg111002222

A good URI parsing library makes this easy:

>>> from urllib.parse import urljoin  # Python 3; in Python 2 this lived in the urlparse module
>>> urljoin("http://example.com/employees/", "jcg111002222")
'http://example.com/employees/jcg111002222'
>>>

Posted by Joe on 2007-03-16

Thanks for pointing that out Joe. That makes a good deal of sense, but I think I would have implemented it incorrectly if I had to make an implementation.

Posted by Josh Peters on 2007-03-16

I contrasted/compared this approach a while back in an entry called Resource vs. Service Oriented Data Design. They're two very different approaches. I tend to like the linkable approach you're advocating a lot better.

Posted by Dan Diephouse on 2007-03-17

Relative links seem difficult -- moving the resources around requires understanding where the links are, and there's no standard or even convention in JSON as to where the links are located. xml:base offers a solution to this, or if you are using HTML the <base> tag, or with HTML you can more-or-less know what attributes need to be changed. But with JSON it's more difficult, and so the document isn't really self-describing -- it only is meaningful in the context of the location of the container, and you'll have to drag both around. But anyway, that's a minor detail.

What this doesn't seem to address is random access. Getting a complete employee list is difficult. You could do a query, like /employees/?id=jcg111002222, giving back the same JSON document but with only that one entry. But that's just /employees/?id={employee-id} -- how is that any more abstract than /employees/{employee-id} ? For indirection you can use redirects. This is not without its problems, but the problems don't seem any worse than the alternative.

Posted by Ian Bicking on 2007-03-19

Ian,

Links are links. The links in the employee list could be relative or absolute, and the code I supplied would work the same. True, there are no conventions in JSON; the story for XML is good, and the story for HTML is even better, being over 10 years old.

A query that only allowed searches by id seems rather useless. That's like Google only allowing searches where you put in the URI of the thing you're looking for. A more useful search would be:

/employees/?name=Joe

"This is not without its problems, but the problems don't seem any worse than the alternative."

This is true, but only for small values of 'Google'.

Posted by Joe on 2007-03-19

Joe, On a freshly initialized system, with no employees yet in the database, would an initial request of PUT /employees/mdubinko create a new record or return a 404? Or is it up to the server to decide whether or not it wants to allow clients to control URLspace mapping? -m

Posted by Micah Dubinko on 2007-03-19

Micah,

It's up to the server to decide whether it wants to allow creation via PUT or via POST to /employees/. In this particular worked example, POST is used for creation.

In general I prefer creation via POST as it leaves control of the URI space in the hands of the server, and it avoids potential race conditions.

Posted by Joe on 2007-03-19

Hi Joe,

Is there a best practice (yet) when you want to have part of your resources publicly accessible and part of it restricted to some users?

Is it better to use different bases in that case (e.g. /employees vs /private/employees)? I feel that using only /employees would be confusing, as some GETs will work and some will not (if you have not been authenticated, of course).

And speaking of authentication, is there a best practice (yet)? I've been reading a lot lately and there seems to be no consensus on that matter.

Thanks.

Posted by Avinash Meetoo on 2007-03-21

2007-03-16