Old 09-29-2002, 01:50 AM   PM User | #1
Alex Vincent
Moderator


 
Join Date: May 2002
Location: Hayward, CA
Posts: 1,427
Cleaning useless whitespace in Mozilla DOM

Code:
const notWhitespace = /\S/;

function cleanWhitespace(node) {
  for (var x = 0; x < node.childNodes.length; x++) {
    var childNode = node.childNodes[x];
    if (childNode.nodeType == 3 && !notWhitespace.test(childNode.nodeValue)) {
      // a text node with no non-whitespace characters: remove it
      node.removeChild(childNode);
      x--; // the next sibling has shifted into index x, so revisit that index
    } else if (childNode.nodeType == 1) {
      // elements can have whitespace text nodes of their own
      cleanWhitespace(childNode);
    }
  }
}

document.addEventListener("load", function() {
  cleanWhitespace(document);
}, true);
This script is intended to remove whitespace-only text nodes from a document. These nodes show up far more often than we want to admit, and they lead to a DOM that is different in Mozilla than in IE.

Make sure you use this only in documents where whitespace is expendable. XHTML documents are among these, as are MathML expressions and SVG images.

Theoretically, whitespace can be significant in some XML documents.
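
One way to respect that caveat is to skip subtrees where whitespace matters. The following is only a sketch under stated assumptions: the `PRESERVE` table, `keepsWhitespace`, `cleanWhitespaceExcept`, and the `makeText`/`makeElement` stand-ins are hypothetical names that are not part of the script above, and the stand-ins exist only so the example can run outside a browser.

```javascript
var notWhitespace = /\S/;

// Assumption: element names whose whitespace should be left untouched.
var PRESERVE = { pre: true, textarea: true };

function keepsWhitespace(node) {
  return PRESERVE[String(node.nodeName).toLowerCase()] === true;
}

// Same loop as cleanWhitespace, but it does not descend into preserved elements.
function cleanWhitespaceExcept(node) {
  for (var x = 0; x < node.childNodes.length; x++) {
    var childNode = node.childNodes[x];
    if (childNode.nodeType == 3 && !notWhitespace.test(childNode.nodeValue)) {
      node.removeChild(childNode);
      x--;
    } else if (childNode.nodeType == 1 && !keepsWhitespace(childNode)) {
      cleanWhitespaceExcept(childNode);
    }
  }
}

// Stand-in nodes (not real DOM objects) so the sketch can run without a browser.
function makeText(value) {
  return { nodeType: 3, nodeName: "#text", nodeValue: value, childNodes: [] };
}
function makeElement(name, children) {
  var el = { nodeType: 1, nodeName: name, nodeValue: null, childNodes: children };
  el.removeChild = function (child) {
    el.childNodes.splice(el.childNodes.indexOf(child), 1);
    return child;
  };
  return el;
}

var pre = makeElement("pre", [makeText("  "), makeText("code")]);
var body = makeElement("body", [
  makeText("\n"),
  makeElement("p", [makeText("hi")]),
  makeText("\n  "),
  pre
]);

cleanWhitespaceExcept(body);
console.log(body.childNodes.length); // 2: the p and pre elements remain
console.log(pre.childNodes.length);  // 2: whitespace inside pre is kept
```

In an XML document the same idea could be extended to honor an xml:space="preserve" attribute; that check is left out here.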
__________________
"The first step to confirming there is a bug in someone else's work is confirming there are no bugs in your own."
June 30, 2001
author, Verbosio prototype XML Editor
author, JavaScript Developer's Dictionary
https://alexvincent.us/blog

Last edited by Alex Vincent; 09-29-2002 at 01:56 AM..
Old 09-29-2002, 02:14 AM   PM User | #2
jkd
Senior Coder

 
Join Date: May 2002
Location: metro DC
Posts: 3,163
Code:
document.addEventListener('load', function() {
  var treeWalker = document.createTreeWalker(document, NodeFilter.SHOW_TEXT, {
    acceptNode: function(node) { return /\S/.test(node.nodeValue) ? NodeFilter.FILTER_REJECT : NodeFilter.FILTER_ACCEPT; }
  }, false);

  // Step the walker forward before removing each node: it cannot continue from a detached currentNode.
  var node = treeWalker.nextNode();
  while (node) {
    var next = treeWalker.nextNode();
    node.parentNode.removeChild(node);
    node = next;
  }
}, true);
Behold the awesomeness of DOM2 Traversal.
__________________
jasonkarldavis.com

Last edited by jkd; 10-01-2002 at 04:42 AM..
Old 10-01-2002, 02:48 AM   PM User | #3
Alex Vincent
I think I like mine better, as it can be modified much more easily to work in IE browsers as well. We just change the const to a var, and use window.onload instead of document.addEventListener.
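
A sketch of what that change might look like, assuming nothing else on the page needs `window.onload` (assigning it replaces any handler already set). The `typeof window` guard is an addition for illustration only, so the file can also be loaded outside a browser; it is not part of the original script.

```javascript
var notWhitespace = /\S/; // var rather than const

function cleanWhitespace(node) {
  for (var x = 0; x < node.childNodes.length; x++) {
    var childNode = node.childNodes[x];
    if (childNode.nodeType == 3 && !notWhitespace.test(childNode.nodeValue)) {
      // whitespace-only text node: remove it and revisit the same index
      node.removeChild(childNode);
      x--;
    } else if (childNode.nodeType == 1) {
      // descend into elements
      cleanWhitespace(childNode);
    }
  }
}

// window.onload rather than document.addEventListener.
if (typeof window !== "undefined") {
  window.onload = function () {
    cleanWhitespace(document);
  };
}
```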
Old 10-01-2002, 03:30 AM   PM User | #4
jkd
Quote:
Originally posted by Alex Vincent
I think I like mine better
Well naturally

Quote:

, as it can be much easier modified to work in IE browsers as well.
And the purpose would be...? IE only has empty text nodes when you programmatically create them through createTextNode().

Quote:

We just change the const to a var, and use window.onload instead of document.addEventListener.
i.e. make the code bad.
Old 10-01-2002, 04:03 AM   PM User | #5
Alex Vincent
Quote:
Originally posted by jkd

And the purpose would be...?
To avoid browser-sniffing code.

Note for all the spectators: Jason and I have been debating code for years. We love to present alternatives to each other's code and arguments.

Of course, he never admits that I can occasionally code better than he can...

Last edited by Alex Vincent; 10-01-2002 at 04:05 AM..
Old 10-01-2002, 04:25 AM   PM User | #6
jkd
Quote:
Originally posted by Alex Vincent
Of course, he never admits that I can occasionally code better than he can...
The exact opposite could be said of you.

I just don't like the thought of recursively calling cleanWhitespace on every node in the document. Not only do you get n steps, where n is the number of nodes, but in each step you iterate through that node's child nodes, which I'd guess means some exponential growth in the number of calculations the code needs to go through.

Using treewalker (I have no idea how it works internally), it should just take a little bit longer to initialize the object, then perform n iterations, where n is the number of text nodes - much lower than the previous n.
Of course, this is in the case of a static DOM - because it is all dynamic in Mozilla, I'm sure this algorithm is slightly more than an n one, but I believe still less than n^(some power).

On average, I believe the TreeWalker solution would prove to be faster - but who feels like calling the two different versions a few hundred times, and averaging them out?
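
Short of timing both versions, one rough check on the growth concern is to count how many child nodes the recursive version examines. The sketch below does that on stand-in objects rather than a real DOM, so it says nothing about the cost of a browser's live `childNodes` lists or of re-rendering. `makeText`, `makeElement`, `buildTree`, `countNodes`, `cleanWhitespaceCounted`, and the `visits` counter are hypothetical names introduced here.

```javascript
// Stand-in nodes (not real DOM objects).
function makeText(value) {
  return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function makeElement(children) {
  var el = { nodeType: 1, nodeValue: null, childNodes: children };
  el.removeChild = function (child) {
    el.childNodes.splice(el.childNodes.indexOf(child), 1);
    return child;
  };
  return el;
}

// Each level holds two subtrees separated by whitespace-only text nodes.
function buildTree(depth) {
  if (depth === 0) {
    return makeElement([makeText("text")]);
  }
  return makeElement([
    makeText("\n"), buildTree(depth - 1),
    makeText("\n"), buildTree(depth - 1),
    makeText("\n")
  ]);
}

function countNodes(node) {
  var total = 1;
  for (var i = 0; i < node.childNodes.length; i++) {
    total += countNodes(node.childNodes[i]);
  }
  return total;
}

var visits = 0;

// The recursive cleaner, with one counter increment per child examined.
function cleanWhitespaceCounted(node) {
  for (var x = 0; x < node.childNodes.length; x++) {
    var childNode = node.childNodes[x];
    visits++;
    if (childNode.nodeType == 3 && !/\S/.test(childNode.nodeValue)) {
      node.removeChild(childNode);
      x--;
    } else if (childNode.nodeType == 1) {
      cleanWhitespaceCounted(childNode);
    }
  }
}

var root = buildTree(4);
var totalNodes = countNodes(root); // counted before any removal
cleanWhitespaceCounted(root);
console.log("nodes:", totalNodes, "children examined:", visits);
```

On this stand-in tree the count comes out to one examination per node other than the root, which suggests the traversal itself grows linearly with document size. Actual timings in a browser could still differ.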
Old 10-14-2002, 07:34 PM   PM User | #7
beetle
Senior Coder

 
Join Date: Aug 2002
Posts: 3,467
To the contrary, IE DOES create some empty text nodes...

Ever try placing what should be two, horizontally adjacent images into the HTML?
Code:
<img src="pic1.jpg">
<img src="pic2.jpg">
IE renders a space between the two this way. You have to do this:
Code:
<img src="pic1.jpg"><img src="pic2.jpg">
To eliminate the space...
In short, IE makes them too...just not nearly as many as Gecko.

P.S. I have my own whitespace cleaner that I made a while ago (very similar, but then, wouldn't it be?). I've used it quite extensively, and even on large HTML pages I've never seen it take longer than 0.5 seconds (AMD 650 MHz). I think even in today's broadband-plentiful internet world, most people are patient enough for that.
__________________
My Site | fValidate | My Brainbench | MSDN | Gecko | xBrowser DOM | PHP | Ars | PVP
“Minds are like parachutes. They don't work unless they are open”
“Maturity is simply knowing when to not be immature”

Last edited by beetle; 10-14-2002 at 07:37 PM..
Old 10-14-2002, 07:48 PM   PM User | #8
jkd
Quote:
Originally posted by beetle
I think even in today's broadband plentiful internet world, most people are patient enough for that.
Has nothing to do with the Internet connection, rather, the computing power at its disposal.

Believe it or not, modifying the DOM of a page on the fly takes a relatively large number of operations. (Update the objects, notify the renderer that something has changed, re-render the page, etc. And none of those is necessarily efficient or simple.)
Old 10-14-2002, 07:55 PM   PM User | #9
beetle
Quote:
Originally posted by jkd
Has nothing to do with the Internet connection, rather, the computing power at its disposal.
I know that...I intended to use it as a comparison for the speed of operations versus what people are willing to wait for. Even on broadband connections quite a few pages take a second or two to load, so another 0.5 sec or less is no biggie.
Old 10-17-2002, 12:58 AM   PM User | #10
whammy
Senior Coder

 
Join Date: Jun 2002
Location: 41° 8' 52" N -95° 53' 31" W
Posts: 3,660
Just curious... why not:

const notWhitespace = /\S+/

?
__________________
Former ASP Forum Moderator - I'm back!

If you can teach yourself how to learn, you can learn anything. ;)
Old 10-17-2002, 01:11 AM   PM User | #11
jkd
You know it is no longer an empty text node when it can match \S just once. Using \S+ is unnecessary, as it doesn't matter if it has more than one nonwhitespace character or not.
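
A small check of that point, written as a sketch: for a yes/no `test()`, `/\S/` and `/\S+/` should give the same answer on any string, because one non-whitespace character is enough for either to match. The sample strings and the `sameVerdict` helper are made up for illustration.

```javascript
var single = /\S/;  // the pattern used in the script
var plus = /\S+/;   // the proposed alternative

var samples = ["", " ", "\n\t  ", " a ", "abc", "\n  x\n"];

// Returns true when both patterns give the same answer for every string.
function sameVerdict(strings) {
  for (var i = 0; i < strings.length; i++) {
    if (single.test(strings[i]) !== plus.test(strings[i])) {
      return false;
    }
  }
  return true;
}

console.log(sameVerdict(samples)); // true
```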
Old 10-17-2002, 03:23 AM   PM User | #12
whammy
Thanks for clarifying that! That makes sense, I think... since a space is usually if not always defined as a string... I assume from your answer that the empty text node always consists of only one or more spaces or line feed characters?

Last edited by whammy; 10-17-2002 at 03:27 AM..
Old 10-17-2002, 03:54 AM   PM User | #13
jkd
Quote:
Originally posted by whammy
I assume from your answer that the empty text node always consists of only one or more spaces or line feed characters?
Generally newline characters, tabs, and spaces. (Whatever you use to pretty print your markup)
Old 10-18-2002, 02:10 AM   PM User | #14
whammy
That's about what I figured... so pretty much whatever matches /\s+/ if you were using a regular expression? Actually that's what I was trying to convey by my original post, but perhaps I wasn't very clear.
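
One caveat with `/\s+/` may be worth a sketch: without anchors it matches any string that merely contains whitespace, so it cannot be dropped in as a whitespace-only test. The function names below are hypothetical and only illustrate the difference between the three checks.

```javascript
// True if the text contains at least one whitespace character anywhere.
function containsWhitespace(text) {
  return /\s+/.test(text);
}

// True only if the text holds no non-whitespace character (the script's approach).
function isBlankNegated(text) {
  return !/\S/.test(text);
}

// Anchored form of the same whitespace-only check.
function isBlankAnchored(text) {
  return /^\s*$/.test(text);
}

console.log(containsWhitespace("two words")); // true
console.log(isBlankNegated("two words"));     // false
console.log(isBlankAnchored("\n\t  "));       // true
console.log(isBlankNegated(""));              // true
```

Note that both whitespace-only checks also treat a zero-length string as blank.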
Old 10-18-2002, 10:12 AM   PM User | #15
WA
Administrator


 
Join Date: Mar 2002
Posts: 2,596
Sometimes it takes a while before you realize just how useful a code snippet is. I'm currently playing around with using the DOM to retrieve an XML file, and the above really came in handy in getting a consistent document tree across browsers within the XML file. BTW, I resorted to using Alex's code, for sheer ease of legibility.

Is there a logic behind Mozilla/NS inserting whitespaces into a document in such a manner? It seems to accomplish nothing but complicate matters.
__________________
- George
- JavaScript Kit- JavaScript tutorials and 400+ scripts!
- JavaScript Reference- JavaScript reference you can relate to.