BitWorking

The Atom Publishing Protocol is a failure

The Atom Publishing Protocol is a failure. Now that I've met my blogging-hyperbole-quotient for the day let's talk about standards, protocols, and technology.

This is all the fodder I was going to throw together for a presentation I proposed for OSCon. Since that proposal got rejected I'm going to post it here. On the other hand, my App Engine tutorial got accepted, so I'll still see you at the conference.

So AtomPub isn't a failure, but it hasn't seen the level of adoption I had hoped to see at this point in its life. There are still plenty of new protocols being developed on a seemingly daily basis, many of which could have used AtomPub, but don't. Also, there is a large amount of AtomPub being adopted in other areas, but that doesn't seem to be getting that much press; for example, I don't see any Atom-Powered logo on my phones like Tim Bray suggested.

So why hasn't AtomPub stormed the world to become the one true protocol? Well, there are three answers:

The world is a different place than it was when Atom and AtomPub started back in 2002: browsers are much more powerful, JavaScript compatibility is increasing among them, there are more libraries to smooth over the differences, and connectivity is on the rise. So in the face of all those changes let's see how some of the original motivations behind AtomPub are holding up.

I am so looking for an excuse to fly from NYC to SFO so I can use the in-air wifi on Virgin America, but I digress.

Thick clients, RIAs, were supposed to be a much larger component of your online life. The cliche at the time was, "you can't run Word in a browser". Well, we know how well that's held up. I expect a similar lifetime for today's equivalent cliche, "you can't do photo editing in a browser". The reality is that more and more functionality is moving into the browser and that takes away one of the driving forces for an editing protocol.

Another motivation was the "Editing on the airplane" scenario. The idea was that you wouldn't always be online and when you were offline you couldn't use your browser. The part of this cliche that wasn't put down by Virgin Atlantic and Edge cards was finished off by Gears and DVCS's.

I'm seeing a rise in DVCS-based blogging platforms. Could that trend extend beyond the highly technical crowd? Could 'hg push' be the next big thing in publishing protocols? I digress again.

The last motivation was for a common interchange format. The idea was that with a common format you could build up libraries and make it easy to move information around. The 'problem' in this case is that a better format came along in the interim: JSON. JSON, born of JavaScript, born of the browser, is the perfect 'data' interchange format, and here I am distinguishing between 'data' interchange and 'document' interchange. If all you want to do is get data from point A to B then JSON is a much easier format to generate and consume as it maps directly into data structures, as opposed to a document-oriented format like Atom, which has to be mapped manually into data structures, and that mapping will be different from library to library. The other aspect is that plain old HTML has become a lot more consumable in recent years thanks to the work on HTML5. If you need a hypertext document format you can reach for HTML these days and don't have to resort to XML-based formats. The latter is a huge shift in thinking for me personally; if you remember I own the domain "wellformedweb.org", so obviously I didn't think things would turn out this way.
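To make the 'data' versus 'document' distinction concrete, here is a minimal sketch in Python. The two payloads are hypothetical and trimmed (the Atom fragment is not a complete, valid entry), and the field names in the resulting dictionary are my own choice, which is rather the point: the JSON falls into native structures with one call, while the Atom needs a hand-written mapping that another library might do differently.

```python
import json
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical payloads describing the same blog entry.
json_payload = (
    '{"title": "Hello", "updated": "2009-04-18T00:00:00Z", '
    '"tags": ["atom", "json"]}'
)

atom_payload = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Hello</title>
  <updated>2009-04-18T00:00:00Z</updated>
  <category term="atom"/>
  <category term="json"/>
</entry>"""

# JSON: a single call yields dicts, lists, and strings.
from_json = json.loads(json_payload)


def entry_to_dict(xml_text):
    """Map an Atom entry to a dict; which elements and attributes
    become which keys is a decision made here, by hand."""
    entry = ET.fromstring(xml_text)
    return {
        "title": entry.findtext(ATOM_NS + "title"),
        "updated": entry.findtext(ATOM_NS + "updated"),
        "tags": [c.get("term") for c in entry.findall(ATOM_NS + "category")],
    }


from_atom = entry_to_dict(atom_payload)
```

Both routes end at the same dictionary, but only the Atom route required deciding that a category's term attribute should become an item in a "tags" list.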

All of the advances in browsers and connectivity have conspired to keep AtomPub from reaching the widespread adoption that I had envisioned when work started on the protocol, but that doesn't mean it's a failure. There are still plenty of uses for AtomPub and it has quietly appeared in many places. Other use cases are still holding up over time, such as migrating data from one platform to another. Probably the biggest supplier of AtomPub-based services is Google with the Google Data APIs, but it also has support from other services; just recently I noticed that Flickr offers AtomPub as a method to post images to your blog. So it's not a failure, but certainly the advancing browser platform has obviated many of the motivations behind its creation.

The reality is that more and more functionality is moving into the browser and that takes away one of the driving forces for an editing protocol.
Hm. I see the RIAs moving into the browser, not being obsoleted by it. In that sense, there is still a huge need for a widespread application protocol (or more than one) which encourages scalable systems. AtomPub does that by supplying clear collection/member boundaries, focusing on idempotent methods, encouraging good caching metadata, and defining link relations, and not by being built on XML. If JSON is more amenable to some environments/problems, then by all means let's make a JSON protocol which does the above. We may first (or simultaneously) need an application-level format based on JSON--it's natural to think that, just like XML-Atom-AtomPub, we need JSON-???-???Pub. But I'm inclined to believe that JSON, because it purposely "maps directly into data structures", doesn't need that intermediate piece. I made Shoji to be a data-catalog protocol on JSON, and it does well without that intermediate format--perhaps somebody will try to write a publishing protocol on JSON and find it needs both, but why would they when Atom/AtomPub already do that well?

Posted by Robert Brewer on 2009-04-18

Microsoft is migrating all of its Live Services APIs into a single unified AtomPub API in the Live Framework. It appears they will also open up the ability for third parties to do the same.

I don't think the JSON argument is such a big deal. Google Data APIs support output in JSON and RSS formats. Microsoft's Live Framework supports full CRUD with JSON, AtomPub, and POX, as well as output in RSS. Under the hood, they are simply offering a JSON encoding of AtomPub.

Arguing that browsers are becoming good enough for most app scenarios reminds me of when Steve Jobs initially said that their SDK solution was web apps only. Look how well that turned out compared to what happened when they released a real SDK. Rich apps are here to stay, powered of course by services such as AtomPub.

Posted by Oran on 2009-04-18

Great post -- excellent food for thought. I am especially intrigued/excited about what DVCS's offer for publishing. JSON, HTML, etc. offer great opportunities as well. Atom/AtomPub enthusiasts (like me) need to fight the urge to discount those trends. What I do really like about Atom is the constrained/minimalist data model that it offers. I honestly think that much of the promise of the Semantic Web/RDF will in fact be realized (already is in many ways) by Atom's simpler, tree-based structures (offering graph semantics a tree-at-a-time).

And what AtomPub offers that is so important is a clear example & implementation of RESTful design. Having just completed a project for interop between a Java-based Course Management System (Blackboard) and a home-grown PHP AtomPub enabled asset store, I've seen first hand the power of a really good spec & roadmap for interaction design -- the abstraction was remarkably non-leaky.

I think RESTful JSON implementations ought to look closely at what AtomPub offers. While browsers are shockingly bad at processing XML, they are really good at creating XML out of JSON. Browser-based AtomPub means JSON in the browser and convert-to-Atom before POST/PUT to the server. I'd like to start seeing link[@rel=edit]/@type='application/atom+json' in feeds. I've found that to be a useful pattern for browser based AtomPub (I realize that mime type doesn't exist, but we can hope...).
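The convert-to-Atom step described above might look roughly like the following. This is only a sketch, written in Python rather than browser JavaScript to keep it short and testable; the JSON field names ("title", "content") and the helper name json_to_atom_entry are assumptions made for illustration, and a real entry would also need the other elements Atom requires.

```python
import json
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def json_to_atom_entry(json_text):
    """Build an Atom entry document from a small JSON object.

    The JSON keys used here are hypothetical, not part of any spec.
    """
    data = json.loads(json_text)
    # Serialize Atom elements with a default namespace, not an ns0: prefix.
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element("{%s}entry" % ATOM_NS)
    ET.SubElement(entry, "{%s}title" % ATOM_NS).text = data["title"]
    content = ET.SubElement(entry, "{%s}content" % ATOM_NS, {"type": "text"})
    content.text = data["content"]
    return ET.tostring(entry, encoding="unicode")


# The client keeps working with JSON and converts only at the edge.
# The resulting string would be the body of the POST or PUT, sent with
# Content-Type: application/atom+xml;type=entry
atom_xml = json_to_atom_entry('{"title": "Hello", "content": "First post"}')
```

The server still sees ordinary AtomPub, while the client-side code never has to hold anything but plain data structures until the moment it sends.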

Posted by pkeane on 2009-04-18

"Probably the biggest supplier of AtomPub based services is Google with the Google Data APIs"
Yet, I am still waiting for the service documents of those Google services. Where are they? To me Atom and AtomPub haven't lost because they are XML and expect full support of HTTP, but purely because large services out there like Facebook, Twitter, Last.fm or MySpace do not want to expose their data in a way that makes it easy to aggregate and compute from them. If Wikipedia was even able to use Atom properly, the data exposed would properly reuse category, link and the like i.e. web semantics.

Posted by Sylvain Hellegouarch on 2009-04-18

I was very close to adopting AtomPub. I wanted to add support for it to my custom blog to be able to quickly add twitter-like updates. I even started coding and found a helpful service that stress-tested my implementation. ...but I couldn't find a single desktop client that worked! I've tried a dozen shiny Mac apps, I've even tested Windows Live Writer, I've tried to fend off dozens of Javascript/XUL errors in Flock. They all turned out to be crappy, lacking or simply not supporting AtomPub (despite sometimes claiming to do so). So I gave up. I won't write a server that has no clients.

Posted by kl on 2009-04-18

Atom is also gaining in use outside of the web mainstream. In my case I have found it to be the most widely-supported common document description format in the digital library world, where metadata standards procreate like rabbits and no one can seem to agree on any one format. But Atom is becoming the fallback format of choice for doing things like publishing a document to a digital repository like Fedora or DSpace.

Posted by alexander on 2009-04-18

The other aspect is that plain old HTML has become a lot more consumable in recent years thanks to the work on HTML5. If you need a hypertext document format you can reach for HTML these days and don't have to resort to XML based formats.

Could you expand on this? How does HTML5 remove the need for XML-based document formats? Is your point that HTML5 removes the need for XHTML, or are you saying something more than that?

Posted by James Clark on 2009-04-18

Sylvain wrote
To me Atom and AtomPub haven't lost because they are XML and expect full support of HTTP, but purely because large services out there like Facebook, Twitter, Last.fm or MySpace do not want to expose their data in a way that makes it easy to aggregate and compute from them.
I can't imagine anything further from the truth. Twitter has an extensive set of APIs at http://apiwiki.twitter.com and Facebook at http://wiki.developers.facebook.com/index.php/API that go MUCH further than the simple microcontent editing scenarios originally envisioned by AtomPub.

As Joe points out in his post, these services could have built their API sets by starting from AtomPub but they didn't. That is the key indictment of AtomPub that [to me] is the final nail in its coffin and JSON was the hand that held the gun.

PS: Even MySpace has extensive APIs at http://developer.myspace.com/community/ which have AtomPub as an option, but it is clear that the API (OpenSocial) is JSON-centric and AtomPub support is more for "checklist satisfaction" as opposed to being well supported.

Posted by Dare Obasanjo on 2009-04-19

James,

I'll point to Mark's work in Dive Into Python 3 as an example of what I am talking about.

Oran

I don't think the JSON argument is such a big deal. Google Data APIs support output in JSON and RSS formats. Microsoft's Live Framework supports full CRUD with JSON, AtomPub, and POX, as well as output in RSS.

How does that not support my point?

Posted by Joe on 2009-04-19

In the myriad APIs and API clients I've written, my reluctance to use APP is purely and simply about complexity: complexity of the Atom libraries, complexity of XML. Why in the name of whatever holy thing you choose would you want to deal with that when you can just move some data from here ... to here? And documents are data too. I guess you said that Joe, but I don't really think it has much to do with browsers, except perhaps in the sense that browsers have been the engine that granted JSON popularity. Personally I think browsers are just another client. Ones that more often than not are way too bloated. Give me PUT, POST, DELETE and GET on URIs at the OS system library level and then we'll be talking.

Posted by Chris Dent on 2009-04-19

@Dare: Much further than microblogging? What the hell Dare? AtomPub itself is so simple that it doesn't predicate you can only use it if you're going to write microblogging. I mean look at the Google API, they have little to do with microblogging and yet are built around AtomPub. Just because the idea came from the blogging world doesn't mean it was solely meant for that field and many instances in the academic world as well have shown it was quite possible to use it efficiently beyond that world.
As Joe points out in his post, these services could have built their API sets by starting from AtomPub but they didn't. That is the key indictment of AtomPub that [to me] is the final nail in its coffin and JSON was the hand that held the gun.
I'm sorry, what? So it can only be a technical issue that prevented those services from using AtomPub. Not a political (or even lack of understanding of the protocol from those services designers, hint Web3S) issue?
I keep thinking that, in an age where the size of your user base and the amount of data you hold are at a premium, companies are really careful not to open up the pipes too loosely by providing Atom support in a way that truly delivers on some of the promises of web semantics, and that there are constraints far beyond technical issues explaining why Atom/AtomPub aren't everywhere.

Posted by Sylvain Hellegouarch on 2009-04-19

Joe

How does that not support my point?

I don't think AtomPub needs to be reinvented each time some hot new encoding (JSON, M, etc.) comes along, although I can see the temptation to use that as an excuse to "get it right this time."

One of the beauties of REST is that it makes a distinction between a resource and the resource's representation. To me at least, this means there is less of a need for one encoding to "win" at the expense of others. The important (and hard) part is to fully realize the potential of REST, and this is where AtomPub shines as a standardized method for doing REST right.

I think of AtomPub similarly to the abstract XML Infoset. As evidenced by Google Data and especially by Microsoft's Live Framework, Atom is just one possible encoding of AtomPub, with JSON and "POX" being equally valid encodings. To me, this usage of JSON makes AtomPub even more relevant, not less.

Posted by Oran on 2009-04-19

Alas! my carefully considered three-paragraph response has been junked because I am a robot. In Safari 4 at least the Back button is not available so I can’t repost or even retrieve the text, which is unfortunate.

So, in brief: I implemented a massively elegant AtomPub middleware for a project and our partners then asked me to simplify it by adding a SOAP interface. I have not been able to persuade them that XML encoded as a string embedded in a request encoded as XML whose response is XML that must be parsed is in fact more complex than just POSTing the XML you started with and checking a 3-figure status code.
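The extra layer described above can be sketched as follows. The Submit and payload element names are invented for illustration and are not taken from the actual partner interface; only the SOAP 1.1 envelope namespace is real.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

entry = '<entry xmlns="http://www.w3.org/2005/Atom"><title>Hello</title></entry>'

# Plain AtomPub style: the entry itself is the request body.
plain_body = entry

# Hypothetical SOAP-style wrapper: the entry travels as an escaped string
# inside another XML document.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
soap_body = (
    '<soap:Envelope xmlns:soap="%s"><soap:Body>'
    "<Submit><payload>%s</payload></Submit>"
    "</soap:Body></soap:Envelope>" % (SOAP_NS, escape(entry))
)

# The receiver has to parse twice: first the envelope, then the string
# it finds inside the envelope.
envelope = ET.fromstring(soap_body)
inner_text = envelope.find(".//payload").text
inner_entry = ET.fromstring(inner_text)
```

In the plain case the receiver parses plain_body once and answers with a status code; in the wrapped case the same entry is only reached after unwrapping, and the response has to be unwrapped again on the way back.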

AtomPub is to an extent falling between two stools: people who just want to exchange simple data structures between JavaScript in a web page and a back-end server are better off with JSON, and people exchanging data between back-end servers will have been persuaded to simply use SOAP by the .NET wizards.

Posted by Damian Cugley on 2009-04-20

"AtomPub is to an extent falling between two stools: people who just want to exchange simple data structures between JavaScript in a web page and a back-end server are better off with JSON, and people exchanging data between back-end servers will have been persuaded to simply use SOAP by the .NET wizards."

This nails it for me.

Posted by Phil Wilson on 2009-04-21

2009-04-18