Given that Joe works for Google on GData, I have assumed that Joe's post is Google's attempt to float a trial balloon before extending AtomPub in this way.
As I explained in the comments on Dare's post, this is my personal blog and, unless otherwise stated, it contains my own thoughts and ideas. If I weren't just speaking for myself I would make that clear.
Like I am about to do now.
At Google we are considering using PATCH. One of the big open questions surrounding that decision is XML patch formats. What have you found for patch formats and associated libraries?
Posted by Stephen Bounds on 2008-02-20
Posted by Gordon Weakliem on 2008-02-21
Posted by Noah Slater on 2008-02-21
Posted by Subbu Allamaraju on 2008-02-21
Yeah, that one, like all the ones I've found so far, tries to take on too much, such as manipulating namespace declarations, comments and processing instructions. In addition, it doesn't present an algorithm for generating a patch from two documents, only a format for communicating the differences.
A good candidate would have a limited number of operations, and would specify not only the patch format and how to apply it, but also how to generate a patch document from a pair of documents. For example: flatten the DOM into a sequence of SAX-like events and then apply a line-oriented diff to the two sequences. The operations should be very limited in scope: delete element, add element, edit text node, edit element attributes. It could be that simple if I don't care about multi-master synchronization or three-way merge.
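A rough sketch of that "flatten, then diff" idea, in Python with only the standard library. The function names `flatten` and `event_diff` and the textual event format are made up for illustration; this is not a proposed patch format, and it ignores namespaces, comments and processing instructions on purpose.

```python
import difflib
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Flatten an XML document into a list of SAX-like event strings.

    The event format ("start ...", "text ...", "end ...") is an
    illustrative assumption, not part of any specification.
    """
    events = []

    def walk(elem):
        # Sort attributes so attribute order never shows up as a difference.
        attrs = " ".join(f'{k}="{v}"' for k, v in sorted(elem.attrib.items()))
        events.append(f"start {elem.tag} {attrs}".rstrip())
        # Whitespace-only text is dropped, so indentation changes are ignored.
        if elem.text and elem.text.strip():
            events.append(f"text {elem.text.strip()}")
        for child in elem:
            walk(child)
            if child.tail and child.tail.strip():
                events.append(f"text {child.tail.strip()}")
        events.append(f"end {elem.tag}")

    walk(ET.fromstring(xml_text))
    return events


def event_diff(old_xml, new_xml):
    """Return (op, event) pairs: '-' for removed events, '+' for added ones."""
    old, new = flatten(old_xml), flatten(new_xml)
    ops = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            ops.extend(("-", e) for e in old[i1:i2])
        if tag in ("insert", "replace"):
            ops.extend(("+", e) for e in new[j1:j2])
    return ops


old_doc = "<entry><title>Old</title><id>1</id></entry>"
new_doc = "<entry><title>New</title><id>1</id></entry>"
for op, event in event_diff(old_doc, new_doc):
    print(op, event)
```

Turning those added and removed events back into "delete element", "add element", "edit text node" and "edit element attributes" operations is the part a real format would still have to pin down.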
Noah, a binary diff presumes that two actors would serialize the same XML DOM in exactly the same way, which, as Subbu points out, is the realm of XML canonicalization, and not anywhere you want to go.
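To make the serialization point concrete, here is a small Python illustration. The two byte strings and the helper `same_tree` are invented for the example, and the comparison deliberately ignores tail text and namespace prefixes.

```python
import xml.etree.ElementTree as ET

# Two serializations that two different clients might plausibly produce
# for the same entry: attribute order and quote style differ.
doc_a = b'<entry id="1" lang="en"><title>Hi</title></entry>'
doc_b = b"<entry lang='en' id='1'><title>Hi</title></entry>"


def same_tree(x, y):
    """Compare two elements by tag, attributes, text and children."""
    return (
        x.tag == y.tag
        and x.attrib == y.attrib
        and (x.text or "").strip() == (y.text or "").strip()
        and len(x) == len(y)
        and all(same_tree(c, d) for c, d in zip(x, y))
    )


bytes_equal = doc_a == doc_b
trees_equal = same_tree(ET.fromstring(doc_a), ET.fromstring(doc_b))
print(bytes_equal, trees_equal)
```

A byte-level diff would report a change here even though nothing in the parsed document differs.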
Posted by Joe on 2008-02-21
Joe: "A good candidate would have a limited number of operations, and not only specify the patch format and how to apply it, but how to generate a patch document from a pair of documents. For example: flatten the DOM into a sequence of SAX-like events and then apply a line-oriented diff to the two sequences. The operations should be very limited in scope: delete element, add element, edit text node, edit element attributes. It could be that simple if I don't care about multi-master synchronization or three-way merge."
This is a big part of where I was going with this. For the most part, existing solutions try to do way too much.
Posted by James Snell on 2008-02-21
Posted by Gordon Weakliem on 2008-02-21
ISTR IBM had something in this space called TreeDiff...
(rummages about the internet)
Hmm, it looks like it's been 'retired':
http://www.alphaworks.ibm.com/tech/xmltreediff
The Wayback Machine has a bit more from before the retirement in late 2003:
http://web.archive.org/web/20031002102746/http://alphaworks.ibm.com/tech/xmltreediff
I guess that's not actually very helpful, sorry.
Posted by Michael R. Bernstein on 2008-02-22
Posted by Eric Larson on 2008-02-23
document function, limit the amount of memory and time a transform may hog, that sort of thing), that could work very well. But it does mean that the server has to lug an XSLT processor around, which is a much bigger component than, say, an XUpdate implementation. OTOH, unlike the XML patch formats, it has the upside that server and client need not agree exactly about the document’s Infoset: if elements change position or whitespace nodes shuffle around, the transform need not be affected at all.
Posted by Aristotle Pagaltzis on 2008-02-23
Posted by James Snell on 2008-02-20