Tim Bray has posted a nice summary of what he thinks a search API should look like: On Search: Interfaces
Most of it I agree with, but I think there are advantages to making it a bit more RESTful.
I agree with his query interface, though I obviously have a bias towards the result set being expressed as an Atom feed.
The API for managing postings isn't as RESTful as I think it could be. In his proposal Tim uses an 'op' attribute to indicate which action is to take place. It would be better if that verb were moved out of the body and into the HTTP method.
Initially, adding a new resource could still be done using a POST, but the response would be a 303 status code with a Location: header carrying the URI of the new entry. Here is an example 'add' request:
    POST /cgi-bin/add.cgi HTTP/1.1
    Host: 127.0.0.1:8085

    <update href="http://example.com/herman">
      <posting word="call" wnum="0" />
      <posting word="me" wnum="1" />
    </update>
And the response just refers to the URI that was just created:
    HTTP/1.1 303 See Other
    Content-Length: XXX
    Location: http://127.0.0.1:8085/index/1
    Content-type: text/plain

    Entry created in the index.
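On the client side, the exchange above might look roughly like the following Python sketch. It only builds the request body and picks the new URI out of the response. The function names build_update and created_uri are my own placeholders for illustration, not part of Tim's proposal, and the actual network call is left out.

```python
import xml.etree.ElementTree as ET


def build_update(href, words):
    """Build the <update> document for a page, one <posting> per word."""
    update = ET.Element("update", {"href": href})
    for wnum, word in enumerate(words):
        ET.SubElement(update, "posting", {"word": word, "wnum": str(wnum)})
    return ET.tostring(update, encoding="unicode")


def created_uri(status, headers):
    """Return the URI of the new index entry from a 303 response, else None."""
    if status != 303:
        return None
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("location")


# Hypothetical values, taken from the example request and response above.
body = build_update("http://example.com/herman", ["call", "me"])
entry = created_uri(303, {"Location": "http://127.0.0.1:8085/index/1"})
```

The client keeps the returned URI around, since every later operation on the entry goes through it.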
The URI returned in the Location: header
(http://127.0.0.1:8085/index/1) identifies that resource in the
search engine. Do a GET on that URI to retrieve an XML document
that describes the current state of that index entry.
Do a PUT with an updated document to update the list of postings.
And finally, an HTTP DELETE on that URI will remove the resource
from the index.
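To make the lifecycle concrete, here is a minimal in-memory sketch of how a server might dispatch on the HTTP method instead of an 'op' attribute. Everything in it is an assumption for illustration: the Index class, the status codes other than 303, and the response text are mine, and a real engine would parse the postings rather than store the raw document.

```python
class Index:
    """Toy in-memory stand-in for the search engine's index (hypothetical)."""

    def __init__(self, base="http://127.0.0.1:8085"):
        self.base = base
        self.entries = {}  # path -> XML document describing the entry
        self.next_id = 1

    def handle(self, method, path, body=None):
        """Return (status, headers, body) for a single request."""
        if method == "POST" and path == "/cgi-bin/add.cgi":
            entry_path = f"/index/{self.next_id}"
            self.next_id += 1
            self.entries[entry_path] = body
            location = {"Location": self.base + entry_path}
            return 303, location, "Entry created in the index."
        if path not in self.entries:
            return 404, {}, "No such index entry."
        if method == "GET":
            return 200, {"Content-Type": "application/xml"}, self.entries[path]
        if method == "PUT":
            self.entries[path] = body
            return 200, {}, "Entry updated."
        if method == "DELETE":
            del self.entries[path]
            return 200, {}, "Entry removed."
        return 405, {"Allow": "GET, PUT, DELETE"}, "Method not allowed."


index = Index()
doc = ('<update href="http://example.com/herman">'
       '<posting word="call" wnum="0" /></update>')

# Create the entry, then drive its whole life through the returned URI.
status, headers, _ = index.handle("POST", "/cgi-bin/add.cgi", doc)
entry = headers["Location"][len(index.base):]
fetched = index.handle("GET", entry)
replaced = index.handle("PUT", entry, doc.replace("call", "ring"))
removed = index.handle("DELETE", entry)
gone = index.handle("GET", entry)
```

Note that the request bodies never say what to do with the document. The method and the URI carry that information.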
How is this reformulation better? First, a GET is used to retrieve the current status of an index entry. Those responses can be optimized using caching, gzip, and ETags, thus reducing the bandwidth used. Secondly, it gives each entry in the index its own URI, which is a handy handle to have. Thirdly, since the state of an index entry is retrievable by a GET, it can be combined easily with other web services. Lastly, since DELETE is used to remove an entry from the index, proxies and other intermediaries along the way have an opportunity to remove the item from their caches. That last point is a nice benefit of uniform semantics: the intermediaries can take the appropriate action based on the HTTP method DELETE alone, without having to be programmed to understand the particulars of the content being passed in the request body.
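The point about intermediaries can be sketched in a few lines of Python. The CachingProxy class and the origin function below are hypothetical, and the cache is deliberately naive: it ignores freshness, Cache-Control, and ETag validation. What it does show is a cache that decides when to drop an entry by looking only at the method and the status code, never at the body.

```python
SAFE_METHODS = {"GET", "HEAD"}


class CachingProxy:
    """Naive caching intermediary that understands only HTTP methods."""

    def __init__(self, origin):
        self.origin = origin  # callable(method, path, body) -> (status, body)
        self.cache = {}       # path -> cached (status, body) response

    def request(self, method, path, body=None):
        if method == "GET" and path in self.cache:
            return self.cache[path]
        response = self.origin(method, path, body)
        status = response[0]
        if method == "GET" and status == 200:
            self.cache[path] = response
        elif method not in SAFE_METHODS and 200 <= status < 400:
            # A successful PUT or DELETE invalidates whatever was cached.
            self.cache.pop(path, None)
        return response


# Hypothetical origin server: a dict of entries plus a log of what reached it.
store = {"/index/1": '<update href="http://example.com/herman" />'}
calls = []


def origin(method, path, body=None):
    calls.append((method, path))
    if path not in store:
        return 404, "No such index entry."
    if method == "GET":
        return 200, store[path]
    if method == "PUT":
        store[path] = body
        return 200, "Entry updated."
    if method == "DELETE":
        del store[path]
        return 200, "Entry removed."
    return 405, "Method not allowed."


proxy = CachingProxy(origin)
first = proxy.request("GET", "/index/1")
second = proxy.request("GET", "/index/1")  # answered from the cache
proxy.request("DELETE", "/index/1")
after = proxy.request("GET", "/index/1")   # cache was invalidated
```

With an 'op' attribute buried in a POST body, the same proxy would have no way to know that the second request removed anything.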