  1. #1
    Alex Vincent
    Moderator
    Join Date
    May 2002
    Location
    Hayward, CA
    Posts
    1,459
    Thanks
    1
    Thanked 23 Times in 21 Posts

    Cleaning useless whitespace in Mozilla DOM

    Code:
    const notWhitespace = /\S/;
    
    function cleanWhitespace(node) {
      for (var x = 0; x < node.childNodes.length; x++) {
        var childNode = node.childNodes[x];
        if ((childNode.nodeType == 3) && (!notWhitespace.test(childNode.nodeValue))) {
          // that is, if it's a whitespace text node
          node.removeChild(node.childNodes[x]);
          x--; // the live list just shrank, so look at this index again
        }
        if (childNode.nodeType == 1) {
          // elements can have text child nodes of their own
          cleanWhitespace(childNode);
        }
      }
    }
    
    document.addEventListener("load", function() {
      cleanWhitespace(document);
    }, true);
    This script is intended to remove whitespace text nodes from a document. These nodes show up far more often than we want to admit, and they lead to a DOM in Mozilla that differs from the one in IE.

    Make sure you use this only in documents where whitespace is expendable. XHTML documents are among these, as are MathML expressions and SVG images.

    Theoretically, whitespace can be significant in some XML documents.
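
    For documents where only some of the whitespace is expendable, one option is to skip the elements that need it. The following is only a sketch under assumptions: the `preserve` list, the function name `cleanWhitespaceExcept`, and the mock node helpers are made up for illustration (the mocks exist only so it can be run outside a browser).

```javascript
var notWhitespace = /\S/;

// Assumption: element names whose whitespace should survive. Adjust to taste.
var preserve = { PRE: true, TEXTAREA: true, SCRIPT: true, STYLE: true };

function cleanWhitespaceExcept(node) {
  for (var x = 0; x < node.childNodes.length; x++) {
    var child = node.childNodes[x];
    if (child.nodeType == 3 && !notWhitespace.test(child.nodeValue)) {
      node.removeChild(child);
      x--; // the list shrank, so look at this index again
    } else if (child.nodeType == 1 && preserve[child.nodeName.toUpperCase()] !== true) {
      cleanWhitespaceExcept(child); // recurse only into non-preserved elements
    }
  }
}

// Hypothetical stand-ins for DOM nodes, only so the sketch runs outside a browser.
function mockText(value) {
  return { nodeType: 3, nodeValue: value, childNodes: [] };
}
function mockElement(name, children) {
  return {
    nodeType: 1,
    nodeName: name,
    childNodes: children,
    removeChild: function (child) {
      this.childNodes.splice(this.childNodes.indexOf(child), 1);
      return child;
    }
  };
}

var pre = mockElement("pre", [mockText("  "), mockText("x = 1")]);
var para = mockElement("p", [mockText("hi"), mockText(" ")]);
var body = mockElement("body", [mockText("\n  "), para, mockText("\n"), pre]);

cleanWhitespaceExcept(body);
console.log(body.childNodes.length, para.childNodes.length, pre.childNodes.length);
// prints: 2 1 2
```

    In a real page you would call it on document.documentElement (or any subtree) from the load handler instead of on the mock tree.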
    Last edited by Alex Vincent; 09-29-2002 at 01:56 AM.
    "The first step to confirming there is a bug in someone else's work is confirming there are no bugs in your own."
    June 30, 2001
    author, Verbosio prototype XML Editor
    author, JavaScript Developer's Dictionary
    https://alexvincent.us/blog

  • #2
    jkd
    Senior Coder
    Join Date
    May 2002
    Location
    metro DC
    Posts
    3,163
    Thanks
    1
    Thanked 18 Times in 18 Posts
    Code:
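    document.addEventListener('load', function() {
      var treeWalker = document.createTreeWalker(document, NodeFilter.SHOW_TEXT, { acceptNode: function(node) { return /\S/.test(node.nodeValue) ? NodeFilter.FILTER_REJECT : NodeFilter.FILTER_ACCEPT } }, false);
    
      // Collect the matches first: removing currentNode mid-walk detaches it,
      // and the walker can then no longer find its way to the next node.
      var emptyNodes = [];
      while (treeWalker.nextNode())
        emptyNodes.push(treeWalker.currentNode);
      for (var i = 0; i < emptyNodes.length; i++)
        emptyNodes[i].parentNode.removeChild(emptyNodes[i]);
    
    }, true);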
    Behold the awesomeness of DOM2 Traversal.
    Last edited by jkd; 10-01-2002 at 04:42 AM.

  • #3
    Alex Vincent
    Moderator
    I think I like mine better, as it can be much more easily modified to work in IE browsers as well. We just change the const to a var, and use window.onload instead of document.addEventListener.
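
    Roughly, that modification might look like the sketch below. The `typeof` guard is an addition so the snippet can be loaded outside a browser, and the `else if` is just a tidy-up; neither is part of the original script.

```javascript
// "const" became "var" so that IE can parse the script.
var notWhitespace = /\S/;

function cleanWhitespace(node) {
  for (var x = 0; x < node.childNodes.length; x++) {
    var childNode = node.childNodes[x];
    if (childNode.nodeType == 3 && !notWhitespace.test(childNode.nodeValue)) {
      // a whitespace-only text node
      node.removeChild(childNode);
      x--; // the list shrank, so look at this index again
    } else if (childNode.nodeType == 1) {
      // elements can have text child nodes of their own
      cleanWhitespace(childNode);
    }
  }
}

// window.onload replaces document.addEventListener("load", ...).
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.onload = function () {
    cleanWhitespace(document);
  };
}
```

    Note that assigning window.onload overwrites any handler that was already set there, so this suits pages where the script owns the load event.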

  • #4
    jkd
    Senior Coder
    Originally posted by Alex Vincent:
    "I think I like mine better"
    Well, naturally.

    "...as it can be much more easily modified to work in IE browsers as well."
    And the purpose would be...? IE only has empty text nodes when you programmatically create them through createTextNode().

    "We just change the const to a var, and use window.onload instead of document.addEventListener."
    I.e., make the code bad.

  • #5
    Alex Vincent
    Moderator
    Originally posted by jkd:
    "And the purpose would be...?"
    To avoid browser-sniffing code.

    Note for all the spectators: Jason and I have been debating code for years. We love to present alternatives to each other's code and arguments.

    Of course, he never admits that I can occasionally code better than he can...
    Last edited by Alex Vincent; 10-01-2002 at 04:05 AM.

  • #6
    jkd
    Senior Coder
    Originally posted by Alex Vincent:
    "Of course, he never admits that I can occasionally code better than he can..."
    The exact opposite could be said of you.

    I just don't like the thought of recursively calling cleanWhitespace on every node in the document. Not only do you make n calls, where n is the number of nodes, but within each call you also iterate through that node's child nodes, and all of that bookkeeping runs in script.

    Using a TreeWalker (I have no idea how it works internally), it should just take a little bit longer to initialize the object, then perform n iterations, where n is the number of text nodes, which is much lower than the previous n.
    Of course, this is in the case of a static DOM. Because it is all dynamic in Mozilla, I'm sure this algorithm costs slightly more than n steps, but I believe still less than n^(some power).

    On average, I believe the TreeWalker solution would prove to be faster - but who feels like calling the two different versions a few hundred times, and averaging them out?
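
    Short of timing the two versions, a rough way to see how the recursive approach scales is to count its work on a mock tree. This is only a sketch: `buildTree` and `countWork` are made-up helpers, the tree is made of plain objects rather than real DOM nodes, and it counts a call for every node (the real script recurses only into elements), so it says nothing about actual DOM or rendering costs.

```javascript
// Build a mock tree: every node has `width` children, nested `depth` levels deep.
function buildTree(depth, width) {
  var node = { childNodes: [] };
  if (depth > 0) {
    for (var i = 0; i < width; i++) {
      node.childNodes.push(buildTree(depth - 1, width));
    }
  }
  return node;
}

// Same loop shape as the recursive cleanWhitespace, but it only counts work.
function countWork(node, stats) {
  stats.calls++;
  for (var x = 0; x < node.childNodes.length; x++) {
    stats.iterations++;
    countWork(node.childNodes[x], stats);
  }
  return stats;
}

// 1 + 4 + 16 + 64 = 85 nodes in total.
var stats = countWork(buildTree(3, 4), { calls: 0, iterations: 0 });
console.log(stats.calls, stats.iterations);
// prints: 85 84
```

    Since every node is a child of exactly one parent, the loop bodies run once per node overall, so in this count the recursion grows linearly with the size of the tree. Whether it beats a TreeWalker in wall-clock time is a separate question that only a real benchmark would settle.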

  • #7
    beetle
    Senior Coder
    Join Date
    Aug 2002
    Posts
    3,467
    Thanks
    0
    Thanked 0 Times in 0 Posts
    To the contrary, IE DOES create some empty text nodes...

    Ever try placing what should be two, horizontally adjacent images into the HTML?
    Code:
    <img src="pic1.jpg">
    <img src="pic2.jpg">
    IE renders a space between the two this way. You have to do this:
    Code:
    <img src="pic1.jpg"><img src="pic2.jpg">
    To eliminate the space...
    In short, IE makes them too...just not nearly as many as Gecko.

    P.S. I have my own whitespace cleaner that I made a while ago (very similar, but then, wouldn't it be?)...I've used it quite extensively, and even on large HTML pages I've never seen it take longer than 0.5 seconds (AMD 650 MHz). I think even in today's broadband-plentiful internet world, most people are patient enough for that.
    Last edited by beetle; 10-14-2002 at 07:37 PM.
    My Site | fValidate | My Brainbench | MSDN | Gecko | xBrowser DOM | PHP | Ars | PVP
    “Minds are like parachutes. They don't work unless they are open”
    “Maturity is simply knowing when to not be immature”

  • #8
    jkd
    Senior Coder
    Originally posted by beetle:
    "I think even in today's broadband plentiful internet world, most people are patient enough for that."
    That has nothing to do with the Internet connection; rather, it depends on the computing power at the browser's disposal.

    Believe it or not, modifying the DOM of a page on the fly takes a relatively large number of operations: update the objects, notify the renderer that something has changed, re-render the page, and so on. None of those steps is necessarily efficient or simple.

  • #9
    beetle
    Senior Coder
    Originally posted by jkd:
    "Has nothing to do with the Internet connection, rather, the computing power at its disposal."
    I know that...I intended it as a comparison between the speed of the operation and what people are willing to wait for. Even on broadband connections quite a few pages take a second or two to load, so another 0.5 sec or less is no biggie.

  • #10
    whammy
    Senior Coder
    Join Date
    Jun 2002
    Location
    41° 8' 52" N -95° 53' 31" W
    Posts
    3,660
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Just curious... why not:

    const notWhitespace = /\S+/

    ?
    Former ASP Forum Moderator - I'm back!

    If you can teach yourself how to learn, you can learn anything. ;)

  • #11
    jkd
    Senior Coder
    You know it is no longer an empty text node when it can match \S just once. Using \S+ is unnecessary, as it doesn't matter if it has more than one nonwhitespace character or not.
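
    A quick way to convince yourself, as a sketch (the sample strings are arbitrary picks): for a yes/no test(), both patterns give the same answer on every input, because each succeeds as soon as a single non-whitespace character turns up.

```javascript
var once = /\S/;
var oneOrMore = /\S+/;

// Arbitrary sample strings: empty, whitespace-only, and mixed.
var samples = ["", " ", "\n\t  ", "a", "  a  ", "abc def"];

for (var i = 0; i < samples.length; i++) {
  var s = samples[i];
  // Prints the string followed by the two results, which always agree.
  console.log(JSON.stringify(s), once.test(s), oneOrMore.test(s));
}
```

    The two would only differ if you looked at what was matched (for example with exec()), and the cleaner never does that.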

  • #12
    whammy
    Senior Coder
    Thanks for clarifying that! That makes sense, I think... since a space is usually if not always defined as a string... I assume from your answer that the empty text node always consists of only one or more spaces or line feed characters?
    Last edited by whammy; 10-17-2002 at 03:27 AM.

  • #13
    jkd
    Senior Coder
    Originally posted by whammy:
    "I assume from your answer that the empty text node always consists of only one or more spaces or line feed characters?"
    Generally newline characters, tabs, and spaces (whatever you use to pretty-print your markup).

  • #14
    whammy
    Senior Coder
    That's about what I figured... so pretty much whatever matches /\s+/ if you were using a regular expression? Actually that's what I was trying to convey by my original post, but perhaps I wasn't very clear.
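
    One caveat worth sketching: an unanchored /\s+/ also matches any string that merely contains whitespace, so by itself it does not identify a whitespace-only node. The helper names below are made up for illustration; the anchored form uses \s* rather than \s+ so that it agrees with the !/\S/ test on an empty string too.

```javascript
// The test used by the cleaner: no non-whitespace character anywhere.
function isWhitespaceOnly(text) {
  return !/\S/.test(text);
}

// An equivalent positive form: the whole string is whitespace (or empty).
function isWhitespaceOnlyAnchored(text) {
  return /^\s*$/.test(text);
}

var unanchored = /\s+/;

console.log(isWhitespaceOnly("\n\t  "));          // true
console.log(isWhitespaceOnlyAnchored("\n\t  "));  // true
console.log(unanchored.test("two words"));        // true, although the string is not whitespace-only
console.log(isWhitespaceOnly("two words"));       // false
```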

  • #15
    WA
    Administrator
    Join Date
    Mar 2002
    Posts
    2,596
    Thanks
    2
    Thanked 19 Times in 18 Posts
    Sometimes it takes a while before you realize just how useful a code snippet is. I'm currently playing around with using the DOM to retrieve an XML file, and the above really came in handy in getting a consistent document tree across browsers within the XML file. BTW, I resorted to using Alex's code, for sheer ease of legibility.

    Is there a logic behind Mozilla/NS inserting whitespace into a document in such a manner? It seems to accomplish nothing but complicating matters.
    - George
    - JavaScript Kit- JavaScript tutorials and 400+ scripts!
    - JavaScript Reference- JavaScript reference you can relate to.

