  1. #1
    Senior Coder
    Join Date
    Oct 2003
    Location
    Australia
    Posts
    1,963
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Question surround all words w/ spans

    I want to take this chunk of html:
    Code:
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    ...and enclose any word which is not already inside a span, inside one.

    I'm totally open to ideas on this one, it doesn't have to be done in javascript (the user will always have JS on). The content is starting out life as XML, so if anyone thinks this would be easier with XSL or PHP, I'm all ears.
    Here's the XML block, if anyone wants to see that:
    Code:
    <text id="t002">
        <name>The Tortoise and the Hare</name>
        <content>The <n>Hare</n> <v>ran</v> past the <adj>lazy</adj> <n>Tortoise</n>.</content>
    </text>
    Thanks in advance to anyone who can offer suggestions/advice

    I take no responsibility for the above nonsense.


    Left Justified

  • #2
    Regular Coder
    Join Date
    Aug 2004
    Location
    codegoboom@yahoo.com
    Posts
    999
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Does the lonely period at the end count as a word?

    I'd think a regular expression would be the quickest way...
    *this message will self destruct in n-seconds*

  • #3
    Senior Coder
    Join Date
    Jul 2004
    Location
    New Zealand
    Posts
    1,315
    Thanks
    0
    Thanked 2 Times in 2 Posts
    I'm attempting to do something very similar. This is the code I use to surround keywords
    Code:
    var re = new RegExp("([^A-Za-z0-9_])("+keywords[x]+")([^A-Za-z0-9_])","gi");
    txt = txt.replace(re, "$1<span class=\"keyword\">$2</span>$3");
    Now, the problem I'm having is that innerHTML does not work in XHTML (and, I'm assuming, XML). I've been trying all day to get createContextualFragment to work, but Firefox has now decided that my document no longer even has a body element.

    So do a search on that, and I'll keep you posted on any progress I make.
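
    A possible variant of the keyword replacement above, offered only as a sketch. The names wrapKeywords and escapeRegExp are hypothetical, not part of the original script, and the keyword list is invented for the example. It uses the \b word boundary instead of capturing the neighbouring characters, so a keyword at the very start or end of the text, or two occurrences separated by a single character, can still match. All keywords go into one alternation so that markup inserted for one keyword is never rescanned for another (for instance "class" inside the inserted span tag). Keywords are assumed to start and end with a word character.

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap every whole-word keyword in a span, in one pass.
function wrapKeywords(txt, keywords) {
  if (keywords.length === 0) return txt; // nothing to look for
  var parts = [];
  for (var x = 0; x < keywords.length; x++) {
    parts.push(escapeRegExp(keywords[x]));
  }
  // \b matches between a word character and a non-word character (or the
  // string edge), so no neighbouring character is consumed by the match.
  var re = new RegExp("\\b(" + parts.join("|") + ")\\b", "gi");
  return txt.replace(re, '<span class="keyword">$1</span>');
}

// Example keyword list, invented for this sketch.
var demo = wrapKeywords("int total = class_id; class Foo", ["int", "class"]);
```

    Here "class_id" is left alone because the underscore counts as a word character, so there is no boundary after "class".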

  • #4
    Senior Coder joh6nn
    Join Date
    Jun 2002
    Location
    72° W. 48' 57" , 41° N. 32' 04"
    Posts
    1,887
    Thanks
    0
    Thanked 1 Time in 1 Post
    if i were gonna do it in javascript, i'd use the String.split() method to split sentences at white space, iterate through the resulting array, and then put it back together with the Array.join() method.
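
    A minimal sketch of the split/iterate/join idea described above. The function name wrapWords is hypothetical. It assumes the input is a run of plain text: markup such as an opening span tag with a class attribute contains whitespace itself and would be cut in two, and punctuation stays attached to the neighbouring word.

```javascript
// Hypothetical helper: wrap each whitespace-separated word in a span.
function wrapWords(sentence) {
  var words = sentence.split(/\s+/);
  for (var i = 0; i < words.length; i++) {
    // split() yields empty strings for leading/trailing whitespace; skip them
    if (words[i] !== "") {
      words[i] = "<span>" + words[i] + "</span>";
    }
  }
  return words.join(" ");
}

var wrapped = wrapWords("past the");
```

    Note that runs of whitespace are collapsed to a single space by the join, which is usually harmless in HTML.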

    i'd probably end up doing the equivalent thing with php, though, because i think php would actually do it faster, and i'd also rather be able to see the result in the generated html. but that's just me.

    php equivalents:
    http://us2.php.net/manual/en/function.split.php OR http://us2.php.net/manual/en/function.explode.php
    http://us2.php.net/manual/en/function.implode.php
    bluemood | devedge | devmo | MS Dev Library | WebMonkey | the Guide

    i am a loser geek, crazy with an evil streak,
    yes i do believe there is a violent thing inside of me.

  • #5
    Senior Coder
    Join Date
    Oct 2003
    Location
    Australia
    Posts
    1,963
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Nice one hemebond - thanks for sharing
    You're way past me in the js skills department, but I'll let you know if/when I make more progress

    Edit:
    Oooh, joh6nn comin' through with the goods! I'll have to try that too.
    Last edited by mindlessLemming; 12-15-2004 at 06:51 AM.


  • #6
    Banned
    Join Date
    Sep 2003
    Posts
    3,620
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I really do not know what you are doing here when a class used on the p tags would format any text not included within a span tag... But, below is an example on how to use the nodeType to target the text not included within any other element within the p tag...

    Please note, I use a return to print it back out onto the page so you can see what is going on... You will need to work the rest of the code into your application...

    Code:
    <script type="text/javascript">
      function format(){ 
       var p = document.getElementsByTagName('p');
        for(var i=0; i<p.length; i++){
         var words = p[i].childNodes;
          for(var j=0; j<words.length; j++){
            if(words[j].nodeType == 3 && words[j].nodeValue.match(/.*[^\s]/g)){ 
               words[j].nodeValue = '<span>'+words[j].nodeValue+'</span>';
            }
          }
        }
        return words;
      }
      window.onload = format;
    
    </script>
    </head> 
    
    <body>
    <div>
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    
    </div>
    </body>
    .....Willy

  • #7
    Banned
    Join Date
    Sep 2003
    Posts
    3,620
    Thanks
    0
    Thanked 0 Times in 0 Posts
    BTW: It occurred to me that this is what you are looking for:

    Code:
    <script type="text/javascript">
      function format(){ 
       var p = document.getElementsByTagName('p');
        for(var i=0; i<p.length; i++){
         var words = p[i].childNodes;
          for(var j=0; j<words.length; j++){
            if(words[j].nodeType == 3 && words[j].nodeValue.match(/.*[^\s]/g)){
             var span = document.createElement('span');
                 span.style.color = 'red'; // TESTING PURPOSES ONLY, PLEASE REMOVE //;
                 span.appendChild(document.createTextNode(words[j].nodeValue)); 
                 words[j].parentNode.replaceChild(span,words[j]);
            }
          }
        }
        alert(document.body.innerHTML); // TESTING PURPOSES ONLY, PLEASE REMOVE //;
      }
      window.onload = format;
    
    </script>
    </head> 
    
    <body>
    <div>
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    .....Willy

    Edit: Added a style color to make it easier to see something happening...
    Last edited by Willy Duitt; 12-15-2004 at 12:04 PM.

  • #8
    Senior Coder
    Join Date
    Jul 2004
    Location
    New Zealand
    Posts
    1,315
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Thanks Willy. I don't suppose you have a variant that will replace part of a text node do you?

    Edit: Oh you piece of crap! I've spent the last 2 days trying to figure out how to use createContextualFragment in Mozilla Firefox, but was getting errors every time. I just tried it in Seamonkey and the bloody thing works fine. I'll post some code soon.
    Last edited by hemebond; 12-15-2004 at 09:52 PM.

  • #9
    Senior Coder
    Join Date
    Oct 2003
    Location
    Australia
    Posts
    1,963
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Almost...

    Awesome Willy, thank you! The only problem is that each word needs to be in its own span, with no spaces or punctuation inside the span either. I'm going to do my best to convert what you've already provided, and I'll post the final result once I get there.

    Quote Originally Posted by Willy Duitt
    I really do not know what you are doing here...
    Then I'll tell you: I'm building a web-based linguistics tool for post-graduate students at the Uni I work for. This section is where a student selects all the nouns, verbs, clauses, conjunctions, etc. within the extract. Don't ask me why it's in xhtml/javascript instead of Flash -- that decision came from above. I need to surround every word with spans so I have something to attach the onclick behaviours to. I'm probably going about it all wrong, but this is only the mockup stage and everything is working well so far.
    Last edited by mindlessLemming; 12-15-2004 at 11:17 PM.


  • #10
    Banned
    Join Date
    Sep 2003
    Posts
    3,620
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Oh, you may try to kick the tires on this example then...
    although once I wrote it I realized that I should have worked from the childNodes of the <p> tag rather than from the tag's markup as a string. Still, the loops and regexp may help show how to ignore words which are attributes within tags...

    Code:
    <script type="text/javascript"> 
        var str = '<p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>';
    
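        // Grab the contents of the <p> once (RegExp.$1 is a legacy static property).
        var inner = str.match(/^<p>(.*)<\/p>$/)[1];

        // Words that sit outside any existing element.
        var words = inner.split(/<[^>]*>\w*<\/[^>]*>/gi);
            words = words.toString().replace(/\,/g,'').split(/\s+/g);

        // Every whitespace-separated token, markup included.
        var temp = inner.replace(/\,/g,'').split(/\s+/g);

            for(var i=0; i<words.length; i++){
              for(var j=0; j<temp.length; j++){
                // String.match() takes a single argument, so build the RegExp explicitly.
                if(new RegExp('^'+words[i]+'$','i').test(temp[j])){
                   temp[j] = '<span>'+temp[j]+'</span>';
                }
              }
            }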
    
        str = '<p>'+temp.join(' ')+'</p>'; 
        alert(str)
    
    </script>
    .....Willy

    BTW: hemebond, I hope this helps you also...

  • #11
    Senior Coder
    Join Date
    Oct 2003
    Location
    Australia
    Posts
    1,963
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hot d@mn Willy, you've sure got my vote for most helpful member (that'll be the third time I've voted for you...why haven't you won yet? )

    Now I just need to get the value of 'str' from the page itself, instead of writing it in the JS.
    innerHTML works, of course, but I can't use that. (DOM scripts only)
    My next attempt was this:
    Code:
    // 'textQ' is the id of the div containing the text
        var str = "";
        var hold = document.getElementById('textQ').childNodes;
        for(var j=0; j<hold.length; j++){
        str += hold[j].nodeValue;
        }
    ...but that comes back with 'nullnull'

    I'm guessing I'm going to have to walk through each childNode, check its nodeType, loop through again if it's an element node, blah blah blah... That's gonna suck. heh.
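
    The 'nullnull' result follows from the DOM itself: nodeValue is null for element nodes, so each element child contributes the string "null" to the concatenation. A minimal sketch of the recursive walk described above is shown here. The function name collectText is hypothetical, and the sketch assumes only that each node exposes nodeType, nodeValue and childNodes, as DOM nodes do.

```javascript
// Hypothetical helper: gather the text of a node and all its descendants.
function collectText(node) {
  if (node.nodeType === 3) {      // 3 = text node
    return node.nodeValue;
  }
  var text = "";
  var kids = node.childNodes || [];
  for (var i = 0; i < kids.length; i++) {
    text += collectText(kids[i]); // recurse into element nodes
  }
  return text;                    // comments and empty elements add nothing
}
```

    In the page it would be called as collectText(document.getElementById('textQ')). Note that this returns the text only; the existing span markup is not part of the result.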
    Back to work...


  • #12
    Senior Coder
    Join Date
    Jul 2004
    Location
    New Zealand
    Posts
    1,315
    Thanks
    0
    Thanked 2 Times in 2 Posts
    Actually, my problem was getting it back into the document. I managed to get it in the end, but the methods I use are broken in Firefox, which is why I had so much trouble. Here it is for anyone who wants to see it:
    Code:
    function format()
    {
    	var code = document.getElementsByTagName("pre");
    	for(var i = 0; i < code.length; i++)
    	{
    		// I'm assuming there is no existing markup
    		var txt = code[i].childNodes[0].nodeValue;
    
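    		// encode entities again; "&" must go first, otherwise the
    		// ampersands in the freshly inserted &lt; and &gt; get double-encoded
    		txt = txt.replace(/&/g, "&amp;");
    		txt = txt.replace(/</g, "&lt;");
    		txt = txt.replace(/>/g, "&gt;");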
    
    		// basic text replacement to add surrounding span tags
    		txt = txt.replace(/\"([^\"]*)\"/gi, "<span class=\"string\">&quot;$1&quot;</span>");
    		txt = txt.replace(/\'([^\']*)\'/gi, "<span class=\"string\">'$1'</span>");
    		for(var x = 0; x < keywords.length; x++)
    		{
    			var re = new RegExp("([^A-Za-z0-9_])("+keywords[x]+")([^A-Za-z0-9_])","gi");
    			txt = txt.replace(re, "$1<span class=\"keyword\">$2</span>$3");
    		}
    		txt = txt.replace(/(\/\/.*)/gi, "<span class=\"comment\">$1</span>");
    		txt = txt.replace(/\/\*(.*)\*\//gi, "<span class=\"comment\">/*$1*\/</span>");
    		txt = txt.replace(/([0-9]+)/gi, "<span class=\"number\">$1</span>");
    		txt = txt.replace(/(#.*)/gi, "<span class=\"prepro\">$1</span>");
    
    		// I don't really understand this part
    		// I have to create a range
    		// set its contents to the contents of the pre element
    		// then create a cCF which doesn't seem to have any connection to the range object
    		// then replace the code
    		var r = document.createRange();
    		r.selectNodeContents(code[i]);
    
    		var f = r.createContextualFragment(txt);
    
    		code[i].replaceChild(f, code[i].childNodes[0]);
    	}
    }
    It's used to markup C++ code within a document, so that a stylesheet can be used to show syntax highlighting. It only works in Gecko browsers (well, Seamonkey at least) because of createContextualFragment. The only reason I've had to do it this way is because innerHTML is read-only in XHTML.
    Last edited by hemebond; 12-16-2004 at 05:09 AM. Reason: Added commenting to the code

  • #13
    Banned
    Join Date
    Sep 2003
    Posts
    3,620
    Thanks
    0
    Thanked 0 Times in 0 Posts
    This may help in reaching the #text of the nodeTypes == 1 (elements)....

    Code:
    <script type="text/javascript">
      function addSpans(){
       var p = document.getElementsByTagName('p');
        for(var i=0; i<p.length; i++){
         var node = p[i].childNodes; 
          for(var j=0; j<node.length; j++){
            if(node[j].nodeType == 1 && node[j].hasChildNodes() == true){
              for(var k=0; k<node[j].childNodes.length; k++){
               var span = document.createElement('span');
                   span.style.cursor = 'pointer';
                   span.style.color = 'blue'; // TESTING PURPOSES ONLY, PLEASE REMOVE //; 
                   span.onclick = function(){ alert(this.innerHTML+'==This word has class!') };
                   span.appendChild(document.createTextNode(node[j].childNodes[k].nodeValue)); 
                   node[j].childNodes[k].parentNode.replaceChild(span,node[j].childNodes[k]);
              }
            }
            if(node[j].nodeType == 3 && node[j].nodeValue.match(/.*[^\s\.]/gi)){
               var span = document.createElement('span');
                   span.style.cursor = 'pointer';
                   span.style.color = 'red'; // TESTING PURPOSES ONLY, PLEASE REMOVE //;
                   span.onclick = function(){ alert(this.innerHTML+'==This word has no class!') };
                   span.appendChild(document.createTextNode(node[j].nodeValue)); 
                   node[j].parentNode.replaceChild(span,node[j]);
             
            }
          }
        }
      }
      window.onload = addSpans;
    
    
    </script>
    </head>
    
    <body>
    <div>
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    </div>
    </body>
    However, I still am having problems splitting up more than one word in nodeType == 3, wrapping each individual word in spans and returning them to the nodeValue... somehow, whenever I try, I lose the nodeValue completely...
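
    One way the per-word split could be approached, offered as a sketch rather than a tested cross-browser solution. Assigning a string of markup to nodeValue only ever produces literal text, so the new spans have to be built as nodes. The names splitWords and wrapTextNode are hypothetical. The sketch assumes an engine whose String.split() keeps captured separators in the result (standard behaviour, but not something older IE versions did), and \w only covers ASCII letters, digits and the underscore.

```javascript
// Hypothetical helper: cut text into pieces, with words at the odd indexes.
// "past the." -> ["", "past", " ", "the", "."]
function splitWords(text) {
  return text.split(/(\w+)/);
}

// Hypothetical helper: replace one text node with a span per word, leaving
// spaces and punctuation as plain text nodes between the spans.
// `doc` is the owning document, passed in so nothing here relies on globals.
function wrapTextNode(textNode, doc) {
  var pieces = splitWords(textNode.nodeValue);
  var frag = doc.createDocumentFragment();
  for (var i = 0; i < pieces.length; i++) {
    if (pieces[i] === "") continue;   // nothing to insert
    if (i % 2 === 1) {                // a word: give it its own span
      var span = doc.createElement("span");
      span.appendChild(doc.createTextNode(pieces[i]));
      frag.appendChild(span);
    } else {                          // whitespace or punctuation
      frag.appendChild(doc.createTextNode(pieces[i]));
    }
  }
  textNode.parentNode.replaceChild(frag, textNode);
}
```

    Because childNodes is a live list that changes as nodes are replaced, a caller would probably want to copy the text nodes of a paragraph into a plain array first and then call wrapTextNode(node, document) on each.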

    ......Willy

  • #14
    Banned
    Join Date
    Sep 2003
    Posts
    3,620
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Alright, I found a method to split up node type #text ....
    I'm not sure how efficient this is and I have only tested it with IE but I would assume it should work cross-browser... The method I used was splitText() and I'm not even sure if I understand it... But I got it to do what I was wanting...

    Code:
    <script type="text/javascript">
     <!--// 
      function addSpans(){ // written by: WillyDuitt@hotmail.com //;
       var p = document.getElementsByTagName('p');
        for(var i=0; i<p.length; i++){
          for(var j=p[i].childNodes.length-1; j>-1; j--){
           var node=p[i].childNodes[j]; 
    
            if(node.nodeType == 1){
              for(var k=0; k<node.childNodes.length; k++){ 
               var span = document.createElement('span');
                   span.style.cursor = 'pointer';
                   span.className = node.className;
                   span.style.color = 'red'; // TESTING PURPOSES ONLY, PLEASE REMOVE //; 
                   span.onclick = function(){ alert(this.innerHTML+':className='+this.className) }; 
                   span.appendChild(document.createTextNode(node.childNodes[k].nodeValue)); 
                   node.childNodes[k].parentNode.replaceChild(span,node.childNodes[k]);
              }
            }
    
            if(node.nodeType == 3){
              while(node.nodeValue.lastIndexOf(' ')>-1){
               var span = document.createElement('span');
                   span.style.cursor = 'pointer';
                   span.style.color = 'blue'; // TESTING PURPOSES ONLY, PLEASE REMOVE //; 
                   span.onclick = function(){ alert(this.innerHTML+': This word has no class!') }; 
                   span.appendChild(node.splitText(node.nodeValue.lastIndexOf(' ')));
                   p[i].insertBefore(span,node.nextSibling);
              }   
            }
          }      
    
          // Wrap what is left of the first child, but only when it is a text node
          // (an element node has a null nodeValue).
          if(node.nodeType == 3){
               var span = document.createElement('span');
                   span.style.cursor = 'pointer';
                   span.style.color = 'blue'; // TESTING PURPOSES ONLY, PLEASE REMOVE //;
                   span.onclick = function(){ alert(this.innerHTML+': This word has no class!') };
                   span.appendChild(document.createTextNode(node.nodeValue));
                   node.parentNode.replaceChild(span,node);
          }
        }
      }
      window.onload = addSpans;
     //-->
    </script>
    </head>
    
    <body>
    <div>
    <p>The test <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span>.</p>
    <p>The <span class="noun">Hare</span> <span class="verb">ran</span> past the <span class="adj">lazy</span> <span class="noun">Tortoise</span> test.</p>
    </div>
    I'm not very good at 'splainin, but if there is something you do not understand I will try my best to make sense of it...

    .....Willy

    BTW: Thanks for the interesting question...
    I really had to apply myself and I learned quite a bit...

    Edit: Per Andrew's recommendation, I have replaced the parentheses with square bracket notation here: p[i].childNodes[j]
    Last edited by Willy Duitt; 12-17-2004 at 01:51 AM.

  • #15
    Senior Coder
    Join Date
    Oct 2003
    Location
    Australia
    Posts
    1,963
    Thanks
    0
    Thanked 0 Times in 0 Posts
    That is perfect
    Heck Willy, you've gone all the way and provided me with more features than I was asking for help with (and the features you added just happen to be the next bits in line...Good guess! )

    I ran it in IE and it worked nicely; unfortunately it did nothing in FF/Moz and threw this error:
    p[i].childNodes is not a function
    Luckily it was only a minor error in your script (wow, you're human too) and I fixed it easily:
    Code:
    // replace this line
    var node=p[i].childNodes(j); 
    //with this one
    var node=p[i].childNodes[j];
    You've saved me days of work here Willy -- if there's anything I can do for you, pass it along (except it won't be till the beginning of 2005...I'm working 7 days/week until then )
    Also, I'll make sure the Uni never tries to lay claim to your intellectual property


