regexp replace. Keep capitalization.



shlagish
03-20-2005, 06:46 AM
Here is my situation. I'm not very good at regular expressions. I want to make a script that will add <span> and </span> around certain words.

What I did:
var newstr=str.replace(/word/gi,"<span>word</span>");
It works okay, but there are a few problems.
First, I don't want it to match inside longer words such as "words".
Second, I use the i flag, so it will match wOrD, but then it will replace it with word (no caps).

So basically, what I want is this:
Match all the occurrences of the word, case insensitive, but only if the word stands alone (white space on both sides; on the right, there can also be . , ? ! $ % ( " etc.). Replace the word with <span>word</span>, where the word retains its original capitalization.

What I have so far:


var wordsMeaning=["fox","dog"];

var str="There is a fOx and a doG.", i;
for(i=0;i<wordsMeaning.length;i++){
var reg=new RegExp(" "+wordsMeaning[i]+"(?=([^a-z0-9]))","gi")
str=str.replace(reg,"<span class='hasDef'> "+wordsMeaning[i]+"</span>");
}

It kind of works. It gets the word only when I want it to. But it takes out the original capitalization. Is there any way to avoid this? And also, is there any better way to achieve what I am trying to achieve?

Thank you.

Philip M
03-20-2005, 09:00 AM
var newstr=str.replace(/(\s)(word)(\s)/gi,"<span> $2 </span>");

Willy Duitt
03-20-2005, 11:43 AM
I imagine you are revisiting your Tooltip Script from back in September:
http://www.codingforums.com/showthread.php?t=40867

You can try this:



<script type="text/javascript">
var strMeaning = ['fox','dog'];
var str = 'There is a ufOx fOx and a doG. dogx.';
str = str.split(/\s+/);
for(var i=0; i<strMeaning.length; i++){
for(var j=0; j<str.length; j++){
if(str[j].match(new RegExp('^('+strMeaning[i]+')(?![a-zA-Z])','i'))){
str[j] = str[j].replace(RegExp.$1,'<span class="hasDef">'+RegExp.$1+'</span>');
}
}
} str = str.join(' ');
alert(str);

</script>


.....Willy

shlagish
03-20-2005, 08:09 PM
Yes, I am revisiting my tooltip script :thumbsup:
Would you mind explaining a few things?


str = str.split(/\s+/);
Why the "+"?

if(str[j].match(new RegExp('^('+strMeaning[i]+')(?![a-zA-Z])','i'))){
I don't quite understand this. Why the "^"?
Why the parentheses around strMeaning[i]?

str[j] = str[j].replace(RegExp.$1,'<span class="hasDef">'+RegExp.$1+'</span>');
What's the ".$1"?


Other than that, it looks like it's working perfectly :)
Thank you very much Willy!
I'll go do some more testing :)

shlagish
03-20-2005, 08:57 PM
I tried to adapt your script to my needs. This is what I have. It behaves in a way I couldn't explain...


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="author" content="Shawn Inder" />
<title>Tooltips</title>
<style type="text/css">
<!--

.hasDef { color: #f00; }

-->
</style>
<script type="text/javascript">
<!--

function findWords(){

var wordMeaning=[
{word:"fox",meaning:"Meaning of fox"},
{word:"dog",meaning:"Meaning of dog"}
];
var html=document.body.innerHTML;
var i;
for(i=0;i<wordMeaning.length;i++){
if(html.match(new RegExp('( '+wordMeaning[i].word+')(?![a-zA-Z])','i'))){
alert(RegExp.$1);
html=html.replace(RegExp.$1,'<span class="hasDef">'+RegExp.$1+'</span>');
}
}
document.body.innerHTML=html;
}

window.onload=findWords;

//-->
</script>
</head>
<body>

<p>The quick brown Fox jumped over the lazy dogg.</p>
<p>The quick brown Fox jumped over the lazy dog.</p>

</body>
</html>

It's weird because it matches dog in dogg, but not fox in Foxx. It behaves correctly in not matching fox in afox. But it's wrong in matching only the first occurrences of the words, even if I put the 'g' flag in there.
Do you know what I'm doing wrong?


edit: There is another problem. In the source, say I have:
"<p>Fox</p>"
I want it to be tooltipped.. But it's not surrounded by spaces..

shlagish
03-20-2005, 10:57 PM
I've tried another technique. It works to some extent. I can add a span with the appropriate class and add the appropriate text in it, but it loses its capitalization and I can't add this span at the right place. Maybe you know how to arrange these few things in this technique?


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="author" content="Shawn Inder" />
<title>Tree walker tooltip</title>
<style type="text/css">
<!--

.hasDef { color: #f00; }

-->
</style>
<script type="text/javascript">
<!--

function walkTree(){

var wordMeaning=[
{word:"fox", meaning:"Meaning of fox"},
{word:"dog", meaning:"Meaning of dog"}
];
var body=document.getElementsByTagName('body');
var tags=body[0].getElementsByTagName('*'), i;
for(i=0;i<tags.length;i++){
if(tags[i].childNodes){
var j;
for(j=0;j<tags[i].childNodes.length;j++){
if(tags[i].childNodes[j].nodeType==3){
var k;
for(k=0;k<wordMeaning.length;k++){
var reg=new RegExp(' '+wordMeaning[k].word+'(?![a-zA-Z])','ig');
if(tags[i].childNodes[j].nodeValue.match(reg)){
var span=document.createElement('span');
span.className='hasDef';
span.appendChild(document.createTextNode(wordMeaning[k].word));
tags[i].appendChild(span);
}
}
}
}
}
}
}

window.onload=walkTree;

-->
</script>
</head>
<body>
<div>
<p>The quick brown Fox. jumped over the lazy dog.</p>
<p>The quick brown fox jumped over the lazy dog.</p>
</div>
</body>
</html>


By the way, this script has the strangest bug. If I replace this:


span.appendChild(document.createTextNode(wordMeaning[k].word));

with this


span.appendChild(document.createTextNode(wordMeaning[k].meaning));

The browser crashes and I can do nothing except Ctrl+Alt+Del.

Do you know what this is about? I did a whole lot of testing but I can't isolate the problem...

Harry Armadillo
03-21-2005, 12:56 AM
Something like this?
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Tooltips</title>
<style type="text/css">

.hasDef { color: #f00; }

</style>
<script type="text/javascript">

function fetchTextNodesIn(node,a){
var i=node.childNodes.length;
while(i--)
if(node.childNodes[i].nodeName=="#text")
a[a.length]=node.childNodes[i]
else fetchTextNodesIn(node.childNodes[i],a)
}

function findWords(){
var wordMeaning=[
{word:"fox",meaning:"Get Firefox!"},
{word:"dog",meaning:"Woof"}
];

var text=new Array();

fetchTextNodesIn(document.getElementById('DoItHere'),text);
fetchTextNodesIn(document.getElementById('AndHere'),text);

for(var j=0;j<text.length;j++){
flag=false;
str=text[j].nodeValue.split(/\s+/);
for(var i=0;i<wordMeaning.length;i++){
reg=new RegExp('^('+wordMeaning[i].word+')(?![a-zA-Z])','i');
for(var k=0;k<str.length;k++){
if(str[k].match(reg)){
flag=true;
str[k]=str[k].replace(reg,"<span class='hasDef' title='"+wordMeaning[i].meaning+"'>"+RegExp.$1+"</span>");
}
}
}
if(flag){
var span=document.createElement('span');
var trailing=(trailing=text[j].nodeValue.match(/\s+$/))?trailing:'';
span.innerHTML=str.join(' ')+trailing;
text[j].parentNode.replaceChild(span,text[j]);
}
}
}
window.onload=findWords;
</script>
</head>
<body>
<div>No fox or dog tool-tipping nonsense here!</div>
<div id=DoItHere><p>The quick brown Fox jumped over the lazy dogg.<br/>
The quick brown fox jumped over the lazy <b>dog</b>.</p>
<p>The foxy brown <i>fox</i> humped the lazy dog.</p></div>
<p id=fox>blah blah blah more about the dog and fox</p>
<span id=AndHere>fox fox fox no dogs</span>
</body>
</html>

shlagish
03-22-2005, 06:37 AM
Harry Armadillo: Pretty good, thank you very much.
I adapted some aspects of your script to mine to modify a couple things.

In your script, you need an id, I did not want this.
In your script, I cannot have the option of matching only the first occurrence of the words.
Your script adds a whole bunch of empty spans; I reduced the number by testing whether that node's value was "\n".

Your script uses innerHTML, but I didn't find another way either...
Here is my script:


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="author" content="Shawn Inder" />
<title>Automated tooltip generator (under construction)</title>
<style type="text/css"
><!--

.hasDef { color: #f00; }

-->
</style>
<script type="text/javascript">
<!--

function setupTooltips(){

var onlyFirstOccurence=0,
wordsAndDefs=[
{word:"fox", def:"Meaning of fox"},
{word:"dog", def:"Meaning of dog"}
],
textNodes=new Array(),
allTags=document.body.getElementsByTagName('*'),
tagsI;
for(tagsI=0;tagsI<allTags.length;tagsI++){
var childI;
for(childI=0;childI<allTags[tagsI].childNodes.length;childI++){
if(allTags[tagsI].childNodes[childI].nodeType==3 && allTags[tagsI].childNodes[childI].nodeValue!="\n"){
textNodes[textNodes.length]=allTags[tagsI].childNodes[childI];
}
}
}
var wordsDefsI;
for(wordsDefsI=0;wordsDefsI<wordsAndDefs.length;wordsDefsI++){
var wordOccured=0,
regExp=new RegExp('^('+wordsAndDefs[wordsDefsI].word+')(?![a-zA-Z])','i'),
textI;
for(textI=0;textI<textNodes.length;textI++){
var words=textNodes[textI].nodeValue.split(/\s+/),
wordsI;
for(wordsI=0;wordsI<words.length;wordsI++){
if(!wordOccured && words[wordsI].match(regExp)){
words[wordsI]=words[wordsI].replace(regExp,"<span class='hasDef'>"+RegExp.$1+"</span>");
wordOccured=(onlyFirstOccurence)?1:0;
}
}
var span=document.createElement('span');
span.innerHTML=words.join(' ');
textNodes[textI].parentNode.replaceChild(span,textNodes[textI]);
}
}
}

window.onload=setupTooltips;

-->
</script>
</head>
<body>

<div>
<p>This should match fox, dog, fOx, DoG, fox! and Dog?</p>
<p>It shouldn't match afox, adog, foxes or dogs.</p>
</div>
</body>
</html>

There is one problem though, it only works for the first word (fox), not the second (dog). I just spent a couple hours testing and I really can't find the problem.
If I can fix this, I will be happy. But to be in paradise, I would also want this:

My script starts from this:


<div>
<p>This should match fox, dog, fOx, DoG, fox! and Dog?</p>
<p>It shouldn't match afox, adog, foxes or dogs.</p>
</div>

and transforms it into this:


<div>
<p><span>This should match <span class="hasDef">fox</span>.</span></p>
</div>

Ideally, I would like it to transform it to this:


<div>
<p>This should match <span class="hasDef">fox</span>.</p>
</div>

Notice that there is no extra span in the second example.

Also, to be perfect, I would like my script to use something other than innerHTML. I tried with nodeValue, and it worked, but it just wrote the markup out as text and the tags weren't "used". So on the screen, I would see this:


<span class="hasDef">fox</span>

and it wouldn't be red at all.

Do you have any ideas to
- make all the words in wordsAndDefs be matched
- eliminate the extra <span>s
- use something other than innerHTML
?

Thank you very much for your help so far :thumbsup:

Willy Duitt: You're the first one to have brought up "RegExp.$1".
Could you explain to me exactly what this does?
Thanks for all your help :)

Philip M
03-22-2005, 09:06 AM
var newstr=str.replace(/(\s)(word)(\W*)(\s)/gi,"<span> $2 </span>");

replaces space-word-0 or more non-word characters-space [globally and ignoring case] (e.g. fox, fox? fOX?) with
<span> the word in the same capitalisation as it was </span>
e.g. <span> FoX </span>

Presumably you have an array of items against which you can match
the input for validity. Presumably not fxo, dgo etc.

Or have I not understood what is wanted???

Willy Duitt
03-22-2005, 09:35 AM
Your script is only matching one instance because you are not using a global modifier... However, you cannot use a global modifier to match the words in the array and also use the RegExp.$n properties to retain the case of the individual matches...

Well, at least in my experimenting with the script I wrote I couldn't, which is why I added the split() method with the additional loop to match each word boundary separately, not globally... That makes it possible to save the case of each individual match... There probably is a better way... But that is what I mashed together that worked... (although I would probably change [a-zA-Z] to \w if I had to duitt again) :eek:

The RegExp.$1 is a stored variable and more can be read here:
http://www.webreference.com/js/column5/backreferences.html
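
For the record, the limitation described above comes from reading RegExp.$1 after a match: the replacement string is then built once, before replace() runs. If the replacement string itself contains $1 or $2, replace() fills in each match's own captured text, so a global replace can keep every word's capitalization. A minimal sketch, assuming the words contain only plain letters; wrapWord is a made-up helper name, not something from the scripts above:

```javascript
// Hypothetical helper: wraps every stand-alone occurrence of `word`
// in a span while keeping the capitalization found in the text.
function wrapWord(str, word) {
  // (^|[^a-zA-Z]) captures the character before the word (or the start of
  // the string); (?![a-zA-Z]) is a look-ahead, so the character after the
  // word is checked but not consumed.
  var reg = new RegExp("(^|[^a-zA-Z])(" + word + ")(?![a-zA-Z])", "gi");
  // $1 and $2 are substituted separately for each match.
  return str.replace(reg, '$1<span class="hasDef">$2</span>');
}

var str = "There is a ufOx fOx and a doG. dogx.";
var words = ["fox", "dog"];
for (var i = 0; i < words.length; i++) {
  str = wrapWord(str, words[i]);
}
console.log(str);
```

Only the character in front of the word is consumed, so two tooltip words separated by a single space are both found.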

.....Willy

liorean
03-22-2005, 11:00 AM
You're overcomplicating things....
var
sOrig='string with word, otherWOrd, thIrDworD, and maybe even more than one WORD',
rWords=/\b(word|otherword|thirdword)\b/ig;
alert(sOrig.replace(rWords,'<span class="hasDef">$1</span>'));
If you wish to use an array of words:
var
sOrig='string with word, otherWOrd, thIrDworD, and maybe even more than one WORD',
aWords=[
'word',
'otherword',
'thirdword'],
rWords=new RegExp('\\b('+aWords.join('|')+')\\b','ig');
alert(sOrig.replace(rWords,'<span class="hasDef">$1</span>'));
Pulling such an array together automatically from your wordsAndDefs array should be simple:
var
aWords=[], // must start out as an array, or push() will fail
i=wordsAndDefs.length;
while(i-->0)
aWords.push(wordsAndDefs[i].word);
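
A note on the doubled backslashes: inside a JavaScript string literal, "\b" is the backspace character, so the word-boundary \b has to be written "\\b" when the pattern is built with new RegExp(...). Inside a regex, \b only means backspace within a character class such as [\b]; elsewhere it is a word boundary. A small sketch to illustrate; the variable names are only for this example:

```javascript
// A regex literal takes the pattern as written: \b is a word boundary.
var literal = /\bfox\b/i;

// In a string, "\\b" is the two characters \ and b, which RegExp reads as \b.
var built = new RegExp("\\b" + "fox" + "\\b", "i");

// With a single backslash the string holds a backspace character (code 8),
// so this pattern looks for backspace + "fox" + backspace.
var wrong = new RegExp("\bfox\b", "i");

var samples = ["a Fox!", "firefox", "foxes", "fox-fur"];
for (var i = 0; i < samples.length; i++) {
  console.log(samples[i], literal.test(samples[i]), built.test(samples[i]), wrong.test(samples[i]));
}
```

Note that \b treats a hyphen as a boundary, so fox-fur counts as a match with this pattern.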

Harry Armadillo
03-22-2005, 11:13 AM
I don't think my first version added too many spans... that's what that flag was for. If no tooltip-able words were found, the text node was unchanged. A container span was only used once per tipped text node (and held as many text fragments and tooltip-spans as possible).

Anyway, it is possible to use the minimum number of spans, it's just a pain in the butt. By using a container span and its innerHTML, the browser automagically does the work of setting the needed attributes and putting the text-fragments and tip-spans in proper sibling order. It takes more code to do it manually, for little purpose.
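
Doing it manually amounts to cutting each text node's value into alternating plain-text pieces and matched words, then creating one text node or one span per piece. A rough sketch of just the cutting step, with no DOM involved; splitForTooltips is a hypothetical name, the \b-based pattern is liorean's rather than mine, and the words are assumed to contain only plain letters:

```javascript
// Hypothetical helper: cuts `text` into pieces, marking the pieces that
// are stand-alone tooltip words. Each piece would later become either a
// text node (tip: false) or a span (tip: true).
function splitForTooltips(text, words) {
  // With no words the pattern would match the empty string forever.
  if (words.length === 0) {
    return [{ text: text, tip: false }];
  }
  var reg = new RegExp("\\b(" + words.join("|") + ")\\b", "gi");
  var pieces = [];
  var last = 0;
  var m;
  while ((m = reg.exec(text)) !== null) {
    if (m.index > last) {
      pieces.push({ text: text.slice(last, m.index), tip: false });
    }
    // m[0] is the word exactly as it appears in the text, case included.
    pieces.push({ text: m[0], tip: true });
    last = m.index + m[0].length;
  }
  if (last < text.length) {
    pieces.push({ text: text.slice(last), tip: false });
  }
  return pieces;
}

console.log(splitForTooltips("The Fox and the dogs, DoG!", ["fox", "dog"]));
```

Joining the text of all pieces gives back the original string, so no extra container span is needed when the pieces are inserted as siblings.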

I like the idea of having the option of limiting the repeats of a given tooltip. How about individual limits instead of a global one? Speaking of limiting, in your code it looks like you keep looping and splitting strings after wordOccured is set.

Here's another go from me. I kept the option of limiting which upper-level nodes are examined for tooltipping, because I wouldn't want my nav or title/logo area done. I also tossed in support for quotation marks, encapsulated the subroutines, etc.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Like, whoa!</title>
<style type="text/css">
.hasDef { color: #f00; }
.eek { color: #809; }
</style>

<script type="text/javascript">
function setupTooltips(){
function checkNode(node){
var i=node.firstChild;
do{
if(i.nodeType==3)
tipText(i);
else checkNode(i);
if (!tooltips.length) return;
}while(i=i.nextSibling);
}
function tipText(node){
for(var i=0;i<tooltips.length;i++){
var j=node.nodeValue.search(tooltips[i].regex);
if(j<0) continue;
j+=RegExp.$1.length;
var pre=document.createTextNode(node.nodeValue.substr(0,j));
var word=document.createTextNode(RegExp.$4);
node.nodeValue=node.nodeValue.substr(j+tooltips[i].word.length);
var tt=document.createElement('span')
tt.className=tooltips[i].flavor;
tt.title=tooltips[i].def;
tt.appendChild(word);
node.parentNode.insertBefore(pre,node);
node.parentNode.insertBefore(tt,node);
if(!--tooltips[i].reps) {
tooltips.splice(i,1);
if (!tooltips.length) return;
}
tipText(pre);
tipText(node);
return;
}
}
function tooltipDef(word, def, flavor, reps){
this.word=word;
this.def=def;
this.flavor=flavor;
this.reps=reps;
this.regex=new RegExp('((^)|([\\s\\n\\r\\t\\\'\\\"]))('+word+')(?![a-zA-Z-])','i');
}

var tooltips=new Array();
tooltips[tooltips.length]=new tooltipDef("fox","Guard the Hens!","hasDef", 7);
tooltips[tooltips.length]=new tooltipDef("dog","Arf.", "hasDef",20);
tooltips[tooltips.length]=new tooltipDef("hot-dog","Mmmm, rat hairs","eek", 1);

var targetNodes=[document.body];

for(var i=0;i<targetNodes.length;i++){
checkNode(targetNodes[i]);
if (!tooltips.length) break;
}
}
</script>
</head>
<body onLoad='setupTooltips();'>
<div>
<p>This should match fox, dog, fOx, DoG, fox! and Dog?</p>
<p>It shouldn't match fox-fur, hot-dog, foxy or dog-meat.</p>
</div>
<div>
<p>This should match, <b>fox</b>, 'dog', "fOx", <i>DoG</i>, fox! and Dog?</p>
<p>It shouldn't match firefox, wetdog, foxes or dogs.</p>
</div>
<div><p>fox fox dog dog fox fox dog dog fox dog fox dog</p>
<p>fox fox dog dog fox fox dog dog fox dog fox dog</p></div>

</body>
</html>

shlagish
03-23-2005, 05:14 AM
Philip M:
I tried your method but I can't get it to work. It just doesn't seem to do anything.. Here is what I tried.


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="author" content="Shawn Inder" />
<title>Tooltips</title>
<style type="text/css">
<!--

.hasDef { color: #f00; }

-->
</style>
<script type="text/javascript">
<!--

function findTooltips(){

var wordsAndDefs=[
{word:"fox",def:"Meaning of fox"},
{word:"dog",def:"Meaning of dog"}
],
wordsDefsI,
html=document.body.innerHTML;
for(wordsDefsI=0;wordsDefsI<wordsAndDefs.length;wordsDefsI++){
html=html.replace(/(\s)(wordsAndDefs[wordsDefsI])(\W*)(\s)/gi,"<span class='hasDef'> $2 </span>");
}
}

window.onload=findTooltips;

-->
</script>
</head>
<body>

<h2>Should match</h2>
<p>Fox, dog, dOg, fOX? 'dog", "fox', "dog" 'fox'...</p>
<h2>Shouldn't match</h2>
<p>Firefox, dogs, doggy.</p>

</body>
</html>

Do you know what's wrong?

Willy: Thanks for the explanation about the $1.
Are you telling me that with my method, I can only get the first word?
Also, I tried using \W, but I couldn't make it work, so I went back to [a-zA-Z].

Harry Armadillo: In all my messing around with your code, I forgot all about your flag. So the too-many-spans problem was not real, sorry for that. I put the flag in again; it's better than my checking-for-"\n" thingy..

Liorean: Here is what I did with your script. I wanted to make it possible to choose to have only the first occurrence of any word match, so I added that functionality.


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="author" content="Shawn Inder" />
<title>Tooltips</title>
<style type="text/css">
<!--

.hasDef { color: #f00; }

-->
</style>
<script type="text/javascript">
<!--

function findTooltips(){

var onlyFirstOccurence=0,
wordsAndDefs=[
{word:"fox",def:"Meaning of fox"},
{word:"dog",def:"Meaning of dog"}
];
var wordsDefsI;
for(wordsDefsI=0;wordsDefsI<wordsAndDefs.length;wordsDefsI++){
var originalHtml=document.body.innerHTML,
reg=new RegExp('\\b('+wordsAndDefs[wordsDefsI].word+')\\b',(onlyFirstOccurence)?'i':'ig');
document.body.innerHTML=originalHtml.replace(reg,'<span class="hasDef">$1</span>');
}
}

window.onload=findTooltips;

-->
</script>
</head>
<body>

<h2>Should match</h2>
<p>Fox, dog, dOg, fOX? 'dog", "fox', "dog" 'fox'...</p>

<h2>Shouldn't match</h2>
<p>Firefox, dogs, doggy.</p>

</body>
</html>

Your code is basically perfect :)
After a little more testing, I'll be convinced. Thank you very much :)
btw, I don't quite understand your RegExp.


reg=new RegExp('\\b('+wordsAndDefs[wordsDefsI].word+')\\b','ig');

I read in a link from your signature that \b was "backspace".
Whatever does that mean?
And why are there two backslashes "\\"?
Could you do me a step by step put-into-words of your RegExp please?
Thank you much.
Out of curiosity though, I would like to know why the below script won't match the word "dog".. I have tested a lot and I would enjoy finally knowing what I did that was wrong..


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<meta name="author" content="Shawn Inder" />
<title>Automated tooltip generator (under construction)</title>
<style type="text/css"
><!--

.hasDef { color: #f00; }

-->
</style>
<script type="text/javascript">
<!--

function setupTooltips(){

var onlyFirstOccurence=0,
wordsAndDefs=[
{word:"fox", def:"Meaning of fox"},
{word:"dog", def:"Meaning of dog"}
],
textNodes=new Array(),
allTags=document.body.getElementsByTagName('*'),
tagsI;
for(tagsI=0;tagsI<allTags.length;tagsI++){
var childI;
for(childI=0;childI<allTags[tagsI].childNodes.length;childI++){
if(allTags[tagsI].childNodes[childI].nodeType==3){
textNodes[textNodes.length]=allTags[tagsI].childNodes[childI];
}
}
}
var wordsDefsI;
for(wordsDefsI=0;wordsDefsI<wordsAndDefs.length;wordsDefsI++){
var wordOccured=0,
regExp=new RegExp('^('+wordsAndDefs[wordsDefsI].word+')(?![a-zA-Z])','i'),
textI;
for(textI=0;textI<textNodes.length;textI++){
var words=textNodes[textI].nodeValue.split(/\s+/),
wordsI,
flag=0;
for(wordsI=0;wordsI<words.length;wordsI++){
if(!wordOccured && words[wordsI].match(regExp)){
flag=1;
words[wordsI]=words[wordsI].replace(regExp,"<span class='hasDef'>"+RegExp.$1+"</span>");
wordOccured=(onlyFirstOccurence)?1:0;
}
}
if(flag){
var span=document.createElement('span');
span.innerHTML=words.join(' ');
textNodes[textI].parentNode.replaceChild(span,textNodes[textI]);
}
}
}
}

window.onload=setupTooltips;

-->
</script>
</head>
<body>

<div>
<p>This should match fox, dog, fOx, DoG, fox! and Dog?</p>
<p>It shouldn't match afox, adog, foxes or dogs.</p>
</div>
</body>
</html>


Thank you all for your help!

Harry Armadillo
03-23-2005, 08:06 AM
The first doesn't work because you are searching for the literal phrase wordsAndDefs followed by one of the letters w o r d s D e f s I. To build a regular expression from a string, you need to use the new RegExp(...) constructor, like in your second example. In any event, the match has the problem that it consumes a space on either side of the word. What happens when you have " fox dog dog"? The first match consumes the space between the dogs, leaving no leading space for the second dog to match (so it won't match). That's the reasoning behind Willy's splitting of the string, and why in my second version I explicitly tested the text fragments generated by inserting the tooltip-span.
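
The consumed-space problem can be seen on a plain string. A small sketch, using a sample string with a trailing space so that the second dog would otherwise qualify; the variable names are made up for this example:

```javascript
var text = " fox dog dog ";

// Both spaces are part of the match, so the space between the two dogs
// is used up by the first match and the second dog has no leading space.
var consuming = text.replace(/(\s)(dog)(\s)/gi, "$1<span>$2</span>$3");

// A look-ahead checks for the trailing space without consuming it,
// so it is still available as the leading space of the next match.
var lookahead = text.replace(/(\s)(dog)(?=\s)/gi, "$1<span>$2</span>");

console.log(consuming);
console.log(lookahead);
```

With the look-ahead, both dogs are wrapped; with the consuming pattern, only the first one is.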

The liorean one works well on a simple sample, but what happens on a big, real page with links and pictures and such?
<a href='http://mendel.berkeley.edu/dog.html'>Dog Genome Project</a>
<img src='www.saskschools.ca/~gregory/animals/fox/fox1.jpg' alt='A hungy fox pounces on a mouse'>
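
To make the concern concrete, here is a sketch of what a plain replace over the HTML string does to the link above, next to one possible workaround that leaves anything between < and > alone. Both function names are made up for this example, and the tag-skipping pattern is only a rough approximation (it would be fooled by a > inside an attribute value or by script content), not something from the scripts in this thread:

```javascript
// Plain replace over the whole HTML string, as in the \b approach.
function naiveWrap(html) {
  return html.replace(/\b(dog|fox)\b/gi, '<span class="hasDef">$1</span>');
}

// Rough workaround: walk the string as alternating tags and text runs,
// and only run the word replace on the text runs.
function wrapOutsideTags(html) {
  return html.replace(/(<[^>]*>)|([^<]+)/g, function (all, tag, text) {
    return tag ? tag : naiveWrap(text);
  });
}

var link = "<a href='http://mendel.berkeley.edu/dog.html'>Dog Genome Project</a>";
console.log(naiveWrap(link));       // the href attribute gets a span inside it
console.log(wrapOutsideTags(link)); // only the link text is wrapped
```

Walking the text nodes of the document, as in the DOM-based versions, avoids the problem without having to parse tags by hand.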

The final script won't match dog because of the order in which you've nested your loops. After the Fox, dog, dOg, fOX? 'dog", "fox', "dog" 'fox'... text is tested for fox, it is replaced with a span containing the leftover text and the new tooltip-spans. The original, unchanged text-node still exists, and it is the one referenced in your array. When the dog matching begins, dog is matched against that original node, and the tooltips are built. The big problem comes when you go to replace it: that text-node is no longer part of the document, so its parentNode is null and the script ends with an error when it tries to call replaceChild on it.

shlagish
03-26-2005, 11:07 PM
Please excuse this lateness in my replies. I still need more time to assimilate all that has been said. I'm working on it though. I'll post back soon enough ;)

Philip M
03-27-2005, 10:24 AM
for(wordsDefsI=0;wordsDefsI<wordsAndDefs.length;wordsDefsI++){

html=html.replace(/(\s)(wordsAndDefs[wordsDefsI])(\W*)(\s)/gi,"<span class='hasDef'> $2 </span>");
}

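It does not work because a regex literal cannot contain variables: the pattern looks for the literal text wordsAndDefs followed by one character from the set [wordsDefsI], not for the current word.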

Try something like

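for(wordsDefsI=0;wordsDefsI<wordsAndDefs.length;wordsDefsI++){
// build the pattern from the current word with the RegExp constructor
var word=new RegExp("(^|\\W)("+wordsAndDefs[wordsDefsI].word+")(?!\\w)","gi");
html=html.replace(word,"$1<span class='hasDef'>$2</span>");
}
document.body.innerHTML=html; // write the result back to the page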

shlagish
05-05-2005, 05:46 AM
I'm realising I don't have much time for this anymore. Basically where I'm at is Harry Armadillo's script with minor differences.



<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en" >
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Harmadillo</title>
<style type="text/css">
.hasDef { color: #f00; }
</style>

<script type="text/javascript">
<!--
function setupTooltips(){
function checkNode(node){
var i=node.firstChild;
do{
if(i.nodeType==3){
tipText(i);
} else { checkNode(i);
if(!tooltips.length){
return;
}
}
} while(i=i.nextSibling);
}
function tipText(node){
for(var i=0;i<tooltips.length;i++){
var j=node.nodeValue.search(tooltips[i].regex);
if(j<0){
continue;
}
j+=RegExp.$1.length;
var pre=document.createTextNode(node.nodeValue.substr(0,j));
var word=document.createTextNode(RegExp.$4);
node.nodeValue=node.nodeValue.substr(j+tooltips[i].word.length);
var tt=document.createElement('span');
tt.className="hasDef";
tt.title=tooltips[i].def;
tt.appendChild(word);
node.parentNode.insertBefore(pre,node);
node.parentNode.insertBefore(tt,node);
if(!tooltips[i].firstOcc){
tooltips.splice(i,1);
}
tipText(pre);
tipText(node);
return;
}
}
function tooltipDef(word,def,firstOcc){
this.word=word;
this.def=def;
this.firstOcc=firstOcc;
this.regex=new RegExp('((^)|([\\s\\n\\r\\t\\\'\\\"]))('+word+')(?![a-zA-Z-])','i');
}
var tooltips=new Array();
tooltips[tooltips.length]=new tooltipDef("fox","Get firefox",1);
tooltips[tooltips.length]=new tooltipDef("dog","Arf.",1);

var targetNodes=[document.body];
for(var i=0;i<targetNodes.length;i++){
checkNode(targetNodes[i]);
if(!tooltips.length){
break;
}
}
}

window.onload=setupTooltips;
//-->
</script>
</head>
<body>
<div>
<p>This should match fox, dog, fOx, DoG, fox! and Dog?</p>
<p>It shouldn't match fox-fur, hot-dog, foxy or dog-meat.</p>
</div>
<div>
<p>This should match, <b>fox</b>, 'dog', <a href="fox.html">fOx</a>, <i>DoG</i>, fox! and Dog?</p>
<p>It shouldn't match firefox, wetdog, foxes or dogs.</p>
</div>
<div><p>fox fox dog dog fox fox dog dog fox dog fox dog</p>
<p>fox fox dog dog fox fox dog dog fox dog fox dog</p></div>

</body>
</html>

Any comments?

btw, when I'm completely done with the whole tooltip script, I'll post it :)

Harry Armadillo
05-06-2005, 11:45 AM
Yah, generally good. The only minor problem is in checkNode() where you moved the if(!tooltips.length)...into the 'else' of the previous conditional, which may mean some pointless processing. As it is, it will only check to see if everything is done (no more tooltips to apply) after checking a whole node, and not after checking a mere text node (which is after all, where the last tooltip-age would have been used up...).

As I said, minor. :)

shlagish
05-07-2005, 04:00 AM
function checkNode(node){
var i=node.firstChild;
do{
if(i.nodeType==3){
tipText(i);
if(!tooltips.length){
return;
}
} else { checkNode(i); }
} while(i=i.nextSibling);
}

You're suggesting I do this?

canadianjameson
05-07-2005, 06:26 AM
Tangent:

Slag, I had no idea you were from Montreal! I checked the Jazz & Justice page and to my shock and amusement I live like 5 minutes away from the church you're playing at (I assume you're in the jazz thing). I'm in NDG off Monkland

small world :D

I may PM you, the script has an aspect I may need for something I'm doing.

Cheers :)

Harry Armadillo
05-07-2005, 10:48 AM
Nah, not quite. checkNode() is recursive: while one level of checkNode would have called tipText() just before the supply of tooltips ran out, the rest of the levels of checkNode would have called checkNode. Putting the "we're done, let's quit" conditional inside the checkNode vs tipText conditional means that at least one level of checkNode won't check whether it's done and thus won't quit as soon as it could (offending my sense of efficiency).

Try:
function checkNode(node){
var i=node.firstChild;
do{
if(i.nodeType==3){
tipText(i);
} else { checkNode(i); }
if(!tooltips.length){
return;
}
} while(i=i.nextSibling);
}
That way, checkNode looks to see if there's work still to do regardless of what kind of work it just did...

shlagish
05-07-2005, 07:45 PM
Got it Harmadillo, thanks ;)

And canadianjameson: I don't actually play in the concerts. My dad is John Inder, the one who initiated the whole project and makes things happen. That's how I got into making the website. By the way, the next and last concert is tonight, and I'm going, so that means I'm gonna be 5 minutes away from you tonight :eek: The world is small. As for the script, I'd be glad to help you any way I can :)


