#1
Moderator
Join Date: May 2002
Location: San Jose, CA
Posts: 1,186
Thanks: 0
Thanked 8 Times in 7 Posts
Cleaning useless whitespace in Mozilla DOM
Code:
const notWhitespace = /\S/;

function cleanWhitespace(node) {
    for (var x = 0; x < node.childNodes.length; x++) {
        var childNode = node.childNodes[x];
        if ((childNode.nodeType == 3) && (!notWhitespace.test(childNode.nodeValue))) {
            // that is, if it's a whitespace text node
            node.removeChild(node.childNodes[x]);
            x--;
        }
        if (childNode.nodeType == 1) {
            // elements can have text child nodes of their own
            cleanWhitespace(childNode);
        }
    }
}

document.addEventListener("load", function() {
    cleanWhitespace(document);
}, true);
Make sure you use this only in documents where whitespace is expendable. XHTML documents are among these, as are MathML expressions and SVG images. Theoretically, whitespace can be significant in some XML documents.
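To see what the function does without opening a browser, here is a minimal sketch that runs it against a stand-in tree. The makeNode, element and text helpers are hypothetical and only mimic the few DOM members cleanWhitespace touches (nodeType, nodeValue, childNodes, removeChild); in a real DOM, childNodes is a live NodeList rather than an array. The const is swapped for var purely so the sketch runs in any engine.

```javascript
// Hypothetical stand-in for DOM nodes (not a real DOM): just enough surface
// for cleanWhitespace, namely nodeType, nodeValue, childNodes and removeChild.
function makeNode(nodeType, nodeValue, children) {
    return {
        nodeType: nodeType,
        nodeValue: nodeValue,
        childNodes: children || [],
        removeChild: function (child) {
            var index = this.childNodes.indexOf(child);
            if (index === -1) {
                throw new Error("not a child of this node");
            }
            this.childNodes.splice(index, 1);
            return child;
        }
    };
}
function element(children) { return makeNode(1, null, children); }
function text(value) { return makeNode(3, value, []); }

var notWhitespace = /\S/;

// Same logic as the function posted above.
function cleanWhitespace(node) {
    for (var x = 0; x < node.childNodes.length; x++) {
        var childNode = node.childNodes[x];
        if (childNode.nodeType == 3 && !notWhitespace.test(childNode.nodeValue)) {
            node.removeChild(node.childNodes[x]);
            x--;
        }
        if (childNode.nodeType == 1) {
            cleanWhitespace(childNode);
        }
    }
}

// Roughly what Gecko builds for:
// <ul>
//   <li>one</li>
//   <li>two</li>
// </ul>
var list = element([
    text("\n  "),
    element([text("one")]),
    text("\n  "),
    element([text("two")]),
    text("\n")
]);

var before = list.childNodes.length;   // whitespace text nodes included
cleanWhitespace(list);
var after = list.childNodes.length;    // only the two <li> stand-ins remain
```

After the call, list.childNodes[0] is the first item rather than a whitespace text node, which is the consistency the function is meant to give.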
__________________
"The first step to confirming there is a bug in someone else's work is confirming there are no bugs in your own."
June 30, 2001 author, Abacus MathML Editor author, JavaScript Developer's Dictionary
Last edited by Alex Vincent; 09-29-2002 at 02:56 AM.
#2
Super Moderator
Join Date: May 2002
Location: metro DC
Posts: 3,168
Thanks: 1
Thanked 18 Times in 18 Posts
Code:
document.addEventListener('load', function() {
    var filter = { acceptNode: function(node) {
        return /\S/.test(node.nodeValue) ? NodeFilter.FILTER_REJECT : NodeFilter.FILTER_ACCEPT;
    } };
    var treeWalker = document.createTreeWalker(document, NodeFilter.SHOW_TEXT, filter, false);
    // Fetch the next match before removing the current one: the walker cannot step forward from a detached node.
    for (var node = treeWalker.nextNode(), next; node; node = next) {
        next = treeWalker.nextNode();
        node.parentNode.removeChild(node);
    }
}, true);
Last edited by jkd; 10-01-2002 at 05:42 AM.
#3
Moderator
Join Date: May 2002
Location: San Jose, CA
Posts: 1,186
Thanks: 0
Thanked 8 Times in 7 Posts
I think I like mine better, as it can be modified much more easily to work in IE as well: we just change the const to a var and use window.onload instead of document.addEventListener.
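A minimal sketch of that cross-browser hookup follows. The runOnLoad helper is hypothetical, and the feature check is an assumption layered on top of the suggestion above rather than something proposed in the thread; the window and document objects are passed in as parameters so the sketch can be tried with stand-in objects outside a browser.

```javascript
// Hypothetical helper (not from the thread): registers a load callback using
// whichever mechanism the given document supports.
function runOnLoad(win, doc, callback) {
    if (typeof doc.addEventListener === "function") {
        // DOM Level 2 Events route, as used in the first post
        doc.addEventListener("load", callback, true);
    } else {
        // Fallback for browsers without addEventListener, such as IE at the time
        win.onload = callback;
    }
}
```

In a page this would be called as runOnLoad(window, document, function () { cleanWhitespace(document); }), with notWhitespace declared using var instead of const.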
#4
Super Moderator
Join Date: May 2002
Location: metro DC
Posts: 3,168
Thanks: 1
Thanked 18 Times in 18 Posts
#5
Moderator
Join Date: May 2002
Location: San Jose, CA
Posts: 1,186
Thanks: 0
Thanked 8 Times in 7 Posts
Note for all the spectators: Jason and I have been debating code for years. We love to present alternatives to each other's code and arguments. Of course, he never admits that I can occasionally code better than he can...
Last edited by Alex Vincent; 10-01-2002 at 05:05 AM.
#6
Super Moderator
Join Date: May 2002
Location: metro DC
Posts: 3,168
Thanks: 1
Thanked 18 Times in 18 Posts
I just don't like the thought of recursively calling cleanWhitespace on every node in the document. Not only do you get n steps, where n is the number of nodes, but at each of those steps you also iterate through that node's child nodes, which I'd expect to grow the number of calculations exponentially.
Using a TreeWalker (I have no idea how it works internally), it should just take a little bit longer to initialize the object and then perform n iterations, where n is the number of text nodes, which is much lower than the previous n. Of course, that assumes a static DOM; because it is all dynamic in Mozilla, I'm sure this algorithm costs slightly more than n, but I believe still less than n^(some power).
On average, I believe the TreeWalker solution would prove to be faster, but who feels like calling the two different versions a few hundred times and averaging them out?
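The step-count worry can be checked roughly without a browser. The sketch below is an assumption-laden model, not a benchmark: makeTree, countNodes and countSteps are hypothetical helpers working on plain objects, and they count loop iterations only, ignoring whatever childNodes indexing and removeChild cost inside a real engine. Under those assumptions the recursive traversal does one loop iteration per non-root node, so the count grows linearly with the size of the tree.

```javascript
// Hypothetical plain-object stand-in for a DOM tree: every node at a given
// depth gets `fanout` children until depth reaches 0.
function makeTree(depth, fanout) {
    var node = { childNodes: [] };
    if (depth > 0) {
        for (var i = 0; i < fanout; i++) {
            node.childNodes.push(makeTree(depth - 1, fanout));
        }
    }
    return node;
}

// Total number of nodes in the tree, root included.
function countNodes(node) {
    var total = 1;
    for (var x = 0; x < node.childNodes.length; x++) {
        total += countNodes(node.childNodes[x]);
    }
    return total;
}

// Same traversal shape as the recursive cleaner: loop over the children,
// recurse into each one, and count every loop iteration.
function countSteps(node) {
    var steps = 0;
    for (var x = 0; x < node.childNodes.length; x++) {
        steps += 1 + countSteps(node.childNodes[x]);
    }
    return steps;
}

var tree = makeTree(4, 3);        // 1 + 3 + 9 + 27 + 81 nodes
var nodes = countNodes(tree);
var steps = countSteps(tree);     // one iteration per non-root node
```

Whether the TreeWalker version is faster in practice is a separate question that only timing both in an actual browser would settle.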
#7
Senior Coder
Join Date: Aug 2002
Posts: 3,467
Thanks: 0
Thanked 0 Times in 0 Posts
To the contrary, IE DOES create some empty text nodes...
Ever try placing what should be two horizontally adjacent images into the HTML?
Code:
<img src="pic1.jpg"> <img src="pic2.jpg">
Code:
<img src="pic1.jpg"><img src="pic2.jpg">
In short, IE makes them too... just not nearly as many as Gecko.
P.S. I have my own whitespace cleaner that I made a while ago (very similar, but then, wouldn't it be?). I've used it quite extensively, and even on large HTML pages I've never seen it take longer than 0.5 seconds (AMD 650 MHz). I think even in today's broadband-plentiful internet world, most people are patient enough for that.
__________________
My Site | fValidate | My Brainbench | MSDN | Gecko | xBrowser DOM | PHP | Ars | PVP
“Minds are like parachutes. They don't work unless they are open”
“Maturity is simply knowing when to not be immature”
Last edited by beetle; 10-14-2002 at 08:37 PM.
#8
Super Moderator
Join Date: May 2002
Location: metro DC
Posts: 3,168
Thanks: 1
Thanked 18 Times in 18 Posts
Believe it or not, modifying the DOM of a page on the fly takes a relatively large number of operations. (Update the objects, notify the renderer that something has changed, re-render the page, etc. And none of those is necessarily efficient or simple.)
#9
Senior Coder
Join Date: Aug 2002
Posts: 3,467
Thanks: 0
Thanked 0 Times in 0 Posts
#10
Senior Coder
Join Date: Jun 2002
Location: 41° 8' 52" N -95° 53' 31" W
Posts: 3,657
Thanks: 0
Thanked 0 Times in 0 Posts
Just curious... why not:
const notWhitespace = /\S+/ ?
__________________
Former ASP Forum Moderator - I'm back! If you can teach yourself how to learn, you can learn anything. ;)
#11
Super Moderator
Join Date: May 2002
Location: metro DC
Posts: 3,168
Thanks: 1
Thanked 18 Times in 18 Posts
You know it is no longer an empty text node when it can match \S just once. Using \S+ is unnecessary, as it doesn't matter if it has more than one nonwhitespace character or not.
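A small sketch of that point: for a yes/no check with test(), /\S/ and /\S+/ always agree, because both succeed as soon as one non-whitespace character exists. The sample strings and variable names are made up for illustration.

```javascript
var single = /\S/;    // the pattern from the first post
var greedy = /\S+/;   // the proposed alternative

// Illustrative inputs: empty, whitespace-only, and mixed strings.
var samples = ["", " ", "\n\t  ", "  a  ", "abc", "\n x \n"];

// Both patterns give the same answer for every sample.
var agree = samples.every(function (s) {
    return single.test(s) === greedy.test(s);
});

// The strings the cleaner would treat as removable text nodes.
var whitespaceOnly = samples.filter(function (s) {
    return !single.test(s);
});
```

The two patterns only differ in what a match would capture, which the cleaner never looks at.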
#12
Senior Coder
Join Date: Jun 2002
Location: 41° 8' 52" N -95° 53' 31" W
Posts: 3,657
Thanks: 0
Thanked 0 Times in 0 Posts
Thanks for clarifying that! That makes sense, I think... since a space is usually if not always defined as a string... I assume from your answer that the empty text node always consists of only one or more spaces or line feed characters?
Last edited by whammy; 10-17-2002 at 04:27 AM.
#13
Super Moderator
Join Date: May 2002
Location: metro DC
Posts: 3,168
Thanks: 1
Thanked 18 Times in 18 Posts
#14
Senior Coder
Join Date: Jun 2002
Location: 41° 8' 52" N -95° 53' 31" W
Posts: 3,657
Thanks: 0
Thanked 0 Times in 0 Posts
That's about what I figured... so pretty much whatever matches /\s+/ if you were using a regular expression? Actually that's what I was trying to convey by my original post, but perhaps I wasn't very clear.
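One caveat that a short sketch makes visible: an unanchored /\s+/ matches any string that merely contains whitespace, so on its own it does not identify a whitespace-only node. Anchoring it as /^\s*$/ gives the same answer as the negated /\S/ test used in the first post. The function names below are hypothetical.

```javascript
// True if the string has whitespace anywhere in it.
function containsWhitespace(s) {
    return /\s+/.test(s);
}

// The check used by the cleaner: no non-whitespace character at all.
function isWhitespaceOnly(s) {
    return !/\S/.test(s);
}

// Equivalent check written with \s: the whole string, start to end,
// is zero or more whitespace characters.
function isWhitespaceOnlyAnchored(s) {
    return /^\s*$/.test(s);
}

var mixed = "a b";      // contains whitespace but is not whitespace-only
var blank = " \n\t";    // whitespace-only
```

Here containsWhitespace(mixed) is true while isWhitespaceOnly(mixed) is false, which is the difference that matters when deciding whether a text node can be dropped.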
#15
Administrator
Join Date: Mar 2002
Location: North America
Posts: 2,374
Thanks: 1
Thanked 12 Times in 12 Posts
Sometimes it takes a while before you realize just how useful a code snippet is. I'm currently playing around with using the DOM to retrieve an XML file, and the above really came in handy in getting a consistent document tree across browsers within the XML file. BTW, I resorted to using Alex's code, for sheer ease of legibility.
Is there a logic behind Mozilla/NS inserting whitespace into a document in such a manner? It seems to accomplish nothing but complicate matters.
__________________
- George
- JavaScript Kit - JavaScript tutorials and 400+ scripts!
- JavaScript Reference - JavaScript reference you can relate to.