
View Full Version : Regular Expression Woes - matching number patterns



rhyno
10-15-2008, 08:05 PM
Hi,


I've been trying for many hours to match simple number patterns.

In a nutshell, I want to match every 5- or 6-digit number preceded by either whitespace or # (pound sign).
I do not want to match numbers of 7 digits or more.

Here's what I have so far:
matches = notestxt.match(/(?!#|\D)[0-9]{5,6}(?!\d)/ig);
where notestxt is the content of the DIV I'm processing.

It matches 5- and 6-digit strings OK, but also catches 8-digit strings (10046512, e.g.).

What's worse, in IE, matches are highlighted, plus the entire DIV is duplicated! ... and preceded by an extraneous ">

Demo here:
http://jsbin.com/utuva
Code:
http://jsbin.com/utuva/edit (click JavaScript tab)

What's going on here?


Thanks,
--RHYNO

rnd me
10-15-2008, 08:35 PM
would
matches = notestxt.match(/[\s\#]\d{5,6}?/g);

work?

rhyno
10-15-2008, 08:46 PM
rnd_me,

Thanks for the suggestion, but that seems to be capturing the whitespace or # signs as well (I just want to capture the digits). The 8-digit numbers are also having their first 5 digits captured, along with the preceding whitespace.

What I'm after is:

before matching:

Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.

after matching:

Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.

(adding hyperlinks to matched number strings)
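
Since the goal here is to wrap each matched number in a link, one possible approach is to skip the separate match step and let replace() do the work with capture groups. This is only a sketch: the name linkNumbers and the BASE_URL value are made up for illustration, because the thread never says where the links should point.

```javascript
// Hypothetical link target; the thread does not say where the links go.
var BASE_URL = "/part/";

function linkNumbers(text) {
  // Group 1: the whitespace or '#' in front of the number (put back unchanged).
  // Group 2: the 5- or 6-digit number itself.
  // (?!\d) rejects the match when more digits follow, so 8-digit runs are skipped.
  return text.replace(/([\s#])(\d{5,6})(?!\d)/g, function (whole, prefix, digits) {
    return prefix + '<a href="' + BASE_URL + digits + '">' + digits + "</a>";
  });
}

var linked = linkNumbers("amet 09999 #08220 part# 10046512 end");
```

Because the prefix is captured and re-emitted, only the digits end up inside the link text.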

rnd me
10-15-2008, 08:50 PM
ahh.

to stop the 8 digit mismatch:
matches = notestxt.match(/[\s\#]\d{5,6}?\s/g);

now you still have junk left on the end of the number.
while you can mess around with non-capturing parens, i think it's easier and faster to simply run an extra replace inline.

ex: str= str.replace(/\D/g,"") would kill all the non digits.

rhyno
10-15-2008, 09:52 PM
Ahh, great... just about there!

matches = notestxt.match(/[\s\#]\d{5,6}?[\s,]/g);
works great (catching commas after string as well).

I'm able to strip out non-digits with the .replace() too... but only in Firefox.

Naturally, IE7 triggers an "object doesn't support this property or method" error on the:
str = str.replace(/\D/g, "");
line.

Googling it now, but there doesn't seem to be a clear IE/.replace() JS issue?
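
One thing to watch with the pattern above: the trailing [\s,] is consumed as part of each match, so two numbers separated by a single space can lose the second one, because its leading space has already been used up. A rough sketch of the difference, using a made-up test string and a hypothetical helper digitsOnly, with a lookahead in place of the trailing character class:

```javascript
var text = " 12345 67890 #082201, 10046512 ";

// Pattern from the post: the character after the digits is part of the match.
var consuming = text.match(/[\s#]\d{5,6}?[\s,]/g) || [];

// Lookahead variant: (?!\d) only checks the next character, it does not consume it.
var lookahead = text.match(/[\s#]\d{5,6}(?!\d)/g) || [];

// Strip the non-digits from each match separately (hypothetical helper).
function digitsOnly(list) {
  var out = [];
  for (var i = 0; i < list.length; i++) {
    out.push(list[i].replace(/\D/g, ""));
  }
  return out;
}

var fromConsuming = digitsOnly(consuming); // misses 67890
var fromLookahead = digitsOnly(lookahead); // keeps 12345, 67890 and 082201
```

Both variants skip the 8-digit number; only the lookahead version picks up the number that directly follows another one.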

mrhoo
10-15-2008, 09:54 PM
If you have a string that can have any number of matches, and you want to return the matches in an array, you can step through the string with a regular expression exec-

var A=[], M;
while((M=/[\s#](\d{5,6})(?!\d)/g.exec(notestxt))!=null){
A[A.length]=M[1];
}

// A now contains every 5 or 6 digit integer that follows a # or whitespace in notestxt

If you have more than one match, replacing the non-digits with the empty string will turn your matches into a single long string of digits, which won't match-
if you go that route, replace non-digits (\D+) with a space instead.

rhyno
10-15-2008, 10:41 PM
Found the IE issue.

It was barking about the .replace() line, but in reality it had a problem with my for() loop shortcut:

for (m in matches) { ... } // IE no likey

for (var i = 0; i < matches.length; i++) { ... } // OK, go ahead

Sheesh!

Thank you rnd_me for the assistance, and mrhoo, I might try your method in the future.


--RHYNO
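
A possible explanation, which the thread does not confirm: for...in walks every enumerable property of the array, not only the numeric indexes. If the browser or a library attaches extra properties to the array, the loop body gets handed values that are not strings, and calling .replace() on them fails. The sketch below simulates that by adding two properties by hand; the property names are an assumption for illustration only.

```javascript
var matches = " 09999 #08220".match(/[\s#]\d{5,6}/g);

// Simulated extras (assumption): pretend something attached these to the array.
matches.index = 0;
matches.input = " 09999 #08220";

// for...in visits the indexes AND the extra property names.
var seen = [];
for (var key in matches) {
  seen.push(key);
}

// matches["index"] is a number, and numbers have no replace() method.
var numberHasReplace = typeof matches["index"].replace !== "undefined";

// An index-based loop only touches the real matches.
var cleaned = [];
for (var i = 0; i < matches.length; i++) {
  cleaned.push(matches[i].replace(/\D/g, ""));
}
```

Either way, the index-based loop is the safer habit for arrays.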

rnd me
10-15-2008, 11:21 PM
yeah, replace works only on single matches...

glad you got it working.

-cheers!

Philip M
10-16-2008, 02:01 PM
mrhoo's script caused my browser to lock up.

To summarize, the solution to the problem is:-


<script type = "text/javascript">

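var notestxt = "Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.";

// Turn every non-digit into a space so each run of digits is surrounded by whitespace.
var notestxt1 = notestxt.replace(/\D/g, " ");
// Each match still carries its surrounding spaces, e.g. " 08220 ".
var matches = notestxt1.match(/\s\d{5,6}\s/g);
alert(matches);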

</script>
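
Worth noting that flattening every non-digit to a space also drops the "preceded by whitespace or #" requirement from the original question: a number glued to a letter or colon gets matched too. A small comparison on a made-up sample string (the variable names are just for illustration):

```javascript
var sample = "id:12345 ok #67890 end";

// Approach 1: flatten non-digits to spaces, then match whitespace-delimited runs.
var flattened = sample.replace(/\D/g, " ");
var padded = flattened.match(/\s\d{5,6}\s/g) || [];
var viaFlatten = [];
for (var i = 0; i < padded.length; i++) {
  viaFlatten.push(padded[i].replace(/\s/g, "")); // trim the surrounding spaces
}

// Approach 2: require whitespace or '#' in front of the digits in the original text.
var viaPrefix = [];
var rx = /[\s#](\d{5,6})(?!\d)/g;
var m;
while ((m = rx.exec(sample)) !== null) {
  viaPrefix.push(m[1]);
}
```

Here the first approach picks up 12345 even though it follows a colon, while the second returns only 67890. For the sample text in this thread both give the same four numbers.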



mrhoo
10-16-2008, 03:21 PM
By putting a regular expression literal in the loop I was calling for a new RegExp in each iteration-
the lastIndex continually gets set to 0, instead of after the last match.

Defining the RegExp outside of the loop solves it.
Thank you for catching it, Philip.

var notestxt = "Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero."



var A=[], M;
var Rx=/[\s#](\d{5,6})(?!\d)/g;
while((M=Rx.exec(notestxt))!=null){
A[A.length]=M[1];
}
alert(A) // returns [09999,08220,082201,082202]
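
To see why a single RegExp object matters, it can help to watch lastIndex while the loop runs. With the g flag, exec() starts each search at lastIndex and moves it to just past the match; when nothing more is found it returns null and resets lastIndex to 0. A minimal sketch on a shortened test string (the variable names are only for illustration):

```javascript
var text = "x 09999 #08220";
var rx = /[\s#](\d{5,6})(?!\d)/g; // created once, outside the loop

var found = [];
var positions = [];
var m;
while ((m = rx.exec(text)) !== null) {
  found.push(m[1]);             // the digits captured by group 1
  positions.push(rx.lastIndex); // where the next search will start
}

// After the failed final exec(), lastIndex is back at 0.
var finalIndex = rx.lastIndex;
```

If the loop gets a fresh RegExp on every pass, lastIndex starts at 0 each time, the first match is found again and again, and the loop never ends.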

Philip M
10-16-2008, 07:11 PM
Once again, shows that there are more ways than one to kill a cat.


