Hmm, some thinking led me to some testing. The HTML:
Code:
<?xml version="1.0" encoding="utf-8"?>
<!-- Just a comment -->
<!DOCTYPE html "-//W3C//DTD XHTML 1.1 Strict">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-gb">
<head>
<title> Title with extra spaces in it </title>
<script type="text/javascript" src="js.js"></script>
</head>
<body>
<pre id="prenode"></pre>
<!-- Just a puny little comment -->
<p> <span> hello
</span>
</p>
</body>
</html>
and the script:
Code:
function fnCleanTree(node){
    var
        cNodes=node.childNodes,
        i=cNodes.length,
        t,
        reS = /\S/;
    // Walk backwards so that removing a child never shifts the indexes still to be visited
    while(i-- && (t=cNodes[i]))
        switch(t.nodeType){
            case 1: // Element Node
                fnCleanTree(t);
                break;
            case 3: // Text Node
                if(reS.test(t.nodeValue))
                    break;
                // whitespace-only text: fall through and remove it
            case 8: // Comment Node (and Text Node without non-whitespace content)
                node.removeChild(t);
        }
}
window.onload=function(){
    fnCleanTree(document.documentElement);
    var
        i=0,
        sDocument='',
        sT;
    // document.all also lists the nodes that precede the root element
    while((sT=document.all[i++])!=document.documentElement)
        sDocument+=(sT.outerHTML || sT.text)+'\r\n';
    sDocument+=document.documentElement.outerHTML;
    document.getElementById('prenode').appendChild(document.createTextNode(sDocument));
};
And finally, the output (as written in the browser window):
Code:
<!DOCTYPE html "-//W3C//DTD XHTML 1.1 Strict">
<HTML xml:lang="en-gb" xmlns="http://www.w3.org/1999/xhtml"><HEAD><TITLE>Title with extra spaces in it</TITLE>
<SCRIPT src="js.js" type=text/javascript></SCRIPT>
</HEAD>
<BODY><PRE id=prenode></PRE>
<P><SPAN>hello </SPAN></P></BODY></HTML>hello
Which gave me the following results:
- You can't iterate over the document itself (of course, since it's not the root node but its container); it has to be a node like document.documentElement, which has its childNodes collection to iterate through, or a collection such as document.all.
- You can't target comment nodes outside of the root node using the DOM, which means you can't remove the doctype, either.
- You can target nodes outside the root node with document.all, however.
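For anyone who wants to poke at the pruning logic without a browser, here is a minimal sketch of the same backwards walk run against plain objects. The makeNode helper and the cleanTree name are hypothetical stand-ins invented for this illustration; they only model nodeType, nodeValue, childNodes and removeChild, and the tree is a rough approximation of the <body> of the test page as a standards-style parser might build it.

```javascript
// Hypothetical stand-in for a DOM node (an assumption for illustration only).
// It models just the members the cleaning routine touches.
function makeNode(nodeType, nodeValue, children) {
  return {
    nodeType: nodeType,
    nodeValue: nodeValue,
    childNodes: children || [],
    removeChild: function (child) {
      var idx = this.childNodes.indexOf(child);
      if (idx !== -1) this.childNodes.splice(idx, 1);
      return child;
    }
  };
}

// Remove comment nodes and whitespace-only text nodes, recursing into elements.
function cleanTree(node) {
  var kids = node.childNodes,
      i = kids.length,
      nonSpace = /\S/,
      t;
  // Iterate from the last child to the first: a removal at index i
  // leaves the indexes below i untouched.
  while (i--) {
    t = kids[i];
    if (t.nodeType === 1) {
      cleanTree(t);
    } else if (t.nodeType === 8 ||
               (t.nodeType === 3 && !nonSpace.test(t.nodeValue))) {
      node.removeChild(t);
    }
  }
}

// Approximation of the test page's <body>
var span = makeNode(1, null, [makeNode(3, ' hello\n')]);
var p = makeNode(1, null, [makeNode(3, ' '), span, makeNode(3, '\n')]);
var pre = makeNode(1, null, []);
var body = makeNode(1, null, [
  makeNode(3, '\n'),
  pre,
  makeNode(3, '\n'),
  makeNode(8, ' Just a puny little comment '),
  makeNode(3, '\n'),
  p,
  makeNode(3, '\n')
]);

cleanTree(body);
```

After the call, body should be left with only the pre and p elements, and p with only the span; the text inside the span survives because it contains non-whitespace characters.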