Hey, folks. I'm working on some JavaScript code to parse an XML document into a Document Object Model (DOM) and later reserialize it with the user's changes. The catch is that I want to preserve the document's existing whitespace formatting, so as not to interfere with SVN diff/blame or Mercurial diff/blame.

At first you might think, "Well, whitespace is just a text node, and that's always significant." There are two problems with this. First, in many XML languages (XHTML, MathML, SVG, RDF, etc.) whitespace isn't that significant - browsers couldn't care less. Second, there's whitespace which isn't part of any DOM node: the spacing between attributes. Case in point:

<author firstName="Isaac" lastName="Asimov" title="Dr."/>

<author firstName="Isaac"
        lastName="Asimov"
        title="Dr."/>

Few XMLSerializers will care about that difference, but it's a source of pain when dealing with revision control.
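
For illustration, here is a minimal sketch of one possible way to handle the attribute-spacing case: record the raw whitespace in front of each attribute while tokenizing a start tag, then re-emit it verbatim when serializing. The names `parseStartTag`, `setAttribute`, and `serializeStartTag` are hypothetical and not part of any existing DOM or XMLSerializer API. The regular expressions assume a single, well-formed start tag with quoted attribute values and nothing else; a real parser would need to handle far more.

```javascript
// Hypothetical helper: split one start tag into its name, its attributes,
// and the raw whitespace around them. Only simple quoted attributes are
// supported; this is a sketch, not a full XML tokenizer.
function parseStartTag(source) {
  const open = /^<([^\s\/>]+)/.exec(source);
  if (!open) throw new Error("not a start tag: " + source);

  const tag = { name: open[1], attrs: [], tail: "", selfClosing: false };

  // Sticky pattern: leading whitespace, attribute name, "=", quoted value.
  const attrPattern = /(\s+)([^\s=\/>]+)(\s*=\s*)("[^"]*"|'[^']*')/y;
  let pos = open[0].length;
  attrPattern.lastIndex = pos;

  let m;
  while ((m = attrPattern.exec(source)) !== null) {
    tag.attrs.push({ before: m[1], name: m[2], eq: m[3], rawValue: m[4] });
    pos = attrPattern.lastIndex;
  }

  // Whatever is left must be optional whitespace, an optional "/", and ">".
  const close = /^(\s*)(\/?)>$/.exec(source.slice(pos));
  if (!close) throw new Error("unsupported start tag: " + source);
  tag.tail = close[1];
  tag.selfClosing = close[2] === "/";
  return tag;
}

// Hypothetical helper: change (or append) an attribute while keeping the
// original spacing and quote character of an existing attribute.
function setAttribute(tag, name, value) {
  const attr = tag.attrs.find((a) => a.name === name);
  const quote = attr ? attr.rawValue[0] : '"';
  let escaped = value.replace(/&/g, "&amp;").replace(/</g, "&lt;");
  escaped = quote === '"'
    ? escaped.replace(/"/g, "&quot;")
    : escaped.replace(/'/g, "&apos;");
  const rawValue = quote + escaped + quote;
  if (attr) {
    attr.rawValue = rawValue;
  } else {
    // New attributes have no original spacing, so default to one space.
    tag.attrs.push({ before: " ", name: name, eq: "=", rawValue: rawValue });
  }
}

// Re-emit the tag, reusing every recorded piece of whitespace verbatim.
function serializeStartTag(tag) {
  const attrs = tag.attrs
    .map((a) => a.before + a.name + a.eq + a.rawValue)
    .join("");
  return "<" + tag.name + attrs + tag.tail + (tag.selfClosing ? "/" : "") + ">";
}
```

With this, a tag whose attributes sit on separate lines should round-trip unchanged, and editing one attribute should touch only that attribute's value, so a line-based diff would show a single changed line.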

Does anyone know of any algorithms or existing APIs for dealing with this adequately? I'm likely to write my own, but I would appreciate seeing whatever others have come up with.