Every Echo entry needs two identifiers, which we'll call, for lack of better names, 'post-id' and 'perma-link'. They need to be separate, and they need to be required.
There is still a pretty heavy debate going on in the wiki and on Sam's blog about perma-link versus post-id. I was initially for a single URI that operated as both a perma-link and a unique id. I have since changed my mind, and I'm outlining here the compelling reasons for my change of heart. Also realize that this discussion is in the context of Echo as a syndication format. Echo will also be used as a publishing and possibly a commenting format, and the required-ness of these identifiers may be different in those contexts.
Before we start justifying we need some definitions:
- perma-link: A URI that points to the post on the web. That needs some clarification. First, URI is a big concept and subsumes many other things. For example, all URLs are URIs, which means links of the form freenet: are all URIs. URNs are also URIs. Second, the perma-link should point to the story, not the source. For example, if you write a weblog entry about a story in the NYTimes, the perma-link needs to point to that entry on your weblog and not to the story in the NYTimes. The perma-link should be resolvable (for example, an http: URI). It may be non-resolvable, though that is strongly discouraged.
- post-id: An identifier that uniquely identifies the post on the web. Again, that needs some clarification. If you write a weblog entry about a story in the NYTimes and post it to your weblog under two categories, the post-id will be the same regardless of which category it is published in. Also, the post-id is unique among all the Echo entries ever published, by anyone on the web, for all time. Once an item is published, its post-id never changes. If you edit your entry, the post-id does not change. If you re-categorize your post, it does not change. Unique across space and time. What if you want to include a link to the source material? That is another Echo tag, possibly in an optional Echo module, that allows for citing multiple sources.
A required perma-link
Perma-link should be required. This is a syndication format, and the perma-link points back to the thing you are really interested in. The only excuse for not being able to supply a perma-link is that the resource you are describing is not on the web. That's a pretty thin excuse, but for those extremely rare cases, you can stuff a URN or some other non-resolvable URI in this field. But really, if you can generate an XML Echo file that lives on the web and describes your resource, do you really have any excuse for not providing an HTML view of that same data?
A required post-id
Now that you have a required perma-link, do you really need a post-id? This is where I need to show three things.
- While a perma-link is a URI, it may not uniquely identify a weblog entry.
- A method to uniquely identify a weblog entry is necessary.
- Post-id must be required.
1. Perma-links aren't unique
The first one is easy if you consider categories. For example, I subscribe to the NYTimes RSS feeds, both the science and the technology feeds. There is overlap, and some stories appear in both feeds, which means that they show up twice in my aggregator. Similarly, MT users can turn on multiple archiving methods, which means that the same story can have multiple URIs. For each archiving method, the story is the same but sits in a different context. It can sit in a weekly archive, a monthly archive, or in multiple category archives.
But if they are the same story, won't they have the same perma-link? No, the perma-link may point back to the story based on the context. For example, if you are subscribed to an Echo feed that contains just posts from a certain category, the perma-link could bring you to a page that contains just posts from that category, and that's what you want to happen. So it is possible that the same story could have multiple perma-links and that those perma-links show up in different Echo feeds.
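To make that concrete, here is a minimal Python sketch of what an aggregator could do with one story that arrives through two category feeds. Everything in it is an assumption for illustration: the Entry class, its field names, the example.org URLs and the urn:example: post-id are made up, since Echo doesn't define any of this yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical in-memory view of one Echo entry (not a real Echo schema)."""
    post_id: str     # the same in every feed the story appears in
    perma_link: str  # may differ from feed to feed, depending on context
    title: str


def merge_feeds(*feeds):
    """Group entries from several feeds by post-id.

    Returns a dict mapping each post-id to the list of perma-links it was
    seen under, in first-seen order and without repeats.
    """
    seen = {}
    for feed in feeds:
        for entry in feed:
            links = seen.setdefault(entry.post_id, [])
            if entry.perma_link not in links:
                links.append(entry.perma_link)
    return seen


# Illustrative data: one story filed under two categories.
science = [Entry("urn:example:post:42",
                 "http://example.org/science/42.html", "Story")]
technology = [Entry("urn:example:post:42",
                    "http://example.org/technology/42.html", "Story")]

merged = merge_feeds(science, technology)
```

Here merged holds a single key, the shared post-id, with both context-specific perma-links listed under it. Keying on the perma-link instead would have produced two unrelated items.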
2. Uniqueness is required
Which brings us to the second question: do you really need a unique identifier? Yes, because this will allow aggregator builders to track posts and allow the end-user to control whether they see the same item if it appears in multiple contexts. It will also allow aggregators to implement new functionality more easily and consistently. For example, with a guaranteed unique id I can track changes to an entry, possibly highlighting differences between versions. I can also do threading more easily and consistently if each entry has a unique id. I can group Echo entries that are all about the same thing.
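As a rough sketch of the change-tracking idea, an aggregator could keep each entry's versions keyed by post-id and diff a new version against the last one it saw. The EntryHistory class and its method names are hypothetical, and the diff comes from Python's standard difflib module rather than anything Echo-specific.

```python
import difflib


class EntryHistory:
    """Keeps every version of an entry's text, keyed by post-id (illustrative)."""

    def __init__(self):
        self._versions = {}  # post-id -> list of content strings, oldest first

    def record(self, post_id, content):
        """Store a new version and return a unified diff against the previous one.

        Returns an empty list for a first sighting or an unchanged entry.
        """
        versions = self._versions.setdefault(post_id, [])
        if versions and versions[-1] == content:
            return []  # nothing changed, so nothing new to store
        previous = versions[-1] if versions else None
        versions.append(content)
        if previous is None:
            return []
        return list(difflib.unified_diff(
            previous.splitlines(), content.splitlines(),
            fromfile="previous", tofile="current", lineterm=""))

    def version_count(self, post_id):
        """Number of distinct versions stored for this post-id."""
        return len(self._versions.get(post_id, []))


history = EntryHistory()
first = history.record("urn:example:post:42", "Echo needs one identifier.")
second = history.record("urn:example:post:42", "Echo needs two identifiers.")
```

The second call returns diff lines that an aggregator could highlight. None of this works reliably if the only key available is a perma-link that varies with context.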
On the CMS vendor side, some need a unique id to track items, and the post-id, particularly in the form of a URN, gives them a place to store that information in an easy-to-parse format.
3. Benefits of being required
For the third case, a required element gives a couple of advantages. It makes the specification for Echo easier to write. There are two elements. They are both required. End of story. You don't have to worry about precedence or disambiguation, and it makes for a really simple case for simpler CMSs: just make the two elements the same. Since you are already generating a required perma-link, spitting out the same value in a different element is not a big hurdle to implementation.
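For the simple-CMS case, the whole job might look something like the following Python sketch. The element names entry, title, perma-link and post-id are placeholders I've picked for illustration, not settled Echo syntax, and build_entry is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_entry(title, perma_link, post_id=None):
    """Build a hypothetical Echo entry element with both identifiers present.

    A simple CMS with only one URL per post can omit post_id; the
    perma-link is then reused as the post-id.
    """
    if not perma_link:
        raise ValueError("perma-link is required")
    entry = ET.Element("entry")
    ET.SubElement(entry, "title").text = title
    ET.SubElement(entry, "perma-link").text = perma_link
    ET.SubElement(entry, "post-id").text = post_id or perma_link
    return entry


entry = build_entry("Hello", "http://example.org/archives/000042.html")
xml_text = ET.tostring(entry, encoding="unicode")
```

A CMS that does track its own ids would pass post_id explicitly, for instance as a URN, and the consumer still sees the same two elements either way.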
Also, if post-id and perma-link are both required this helps support a "view-source" paradigm. If you see both tags in every Echo feed then you can be sure you'll include them in your own feed. If they're optional you might not see a feed with both and subsequently miss out on the advantages.
Echo needs both a required perma-link and a required post-id. Since both are URIs, if the posts from your CMS only have one URL then just set post-id = perma-link. Sure it's a little redundant, but it's easy to implement. If you have content that isn't on the web, then use a URN for perma-link, but think long and hard about justifying what should be an extremely rare situation. Both supply potentially unique information, with the perma-link preserving the context of the weblog entry while the post-id is the same regardless of the context.