Regular Expression Woes - matching number patterns
Hi,
I've been trying for many hours to match simple number patterns.
In a nutshell, I want to match every 5- or 6-digit number preceded by either whitespace or # (pound sign).
I do not want to match numbers of 7 digits or more.
Here's what I have so far: matches = notestxt.match(/(?!#|\D)[0-9]{5,6}(?!\d)/ig);
where notestxt is the content of the DIV I'm processing.
It matches 5- and 6-digit strings OK, but also catches part of 8-digit strings (10046512, e.g.).
What's worse, in IE, matches are highlighted, plus the entire DIV is duplicated! ... and preceded by an extraneous ">
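For anyone trying to reproduce the 8-digit catch outside the page, here is a minimal sketch, assuming the sample sentence posted further down the thread. The likely cause is that (?!#|\D) is a lookahead, so it only checks the character about to be matched and says nothing about the character before it. That lets a match start in the middle of a longer number.

```javascript
var notestxt = "Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.";

// The attempt from the question, unchanged.
var matches = notestxt.match(/(?!#|\D)[0-9]{5,6}(?!\d)/ig);

// The last entry is the tail of the 8-digit number:
// matches -> ["09999", "08220", "082201", "082202", "046512"]
```

The engine fails at the first two digits of 10046512, then succeeds starting at the third, because nothing forbids a digit in front of the match.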
Thanks for the suggestion, but that seems to be capturing the whitespace or # signs as well (I just want to capture the digits). The 8-digit numbers are also having their first 5 digits captured, along with the preceding whitespace.
What I'm after is:
before matching:
Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.
after matching:
Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.
To stop the 8-digit mismatch:
matches = notestxt.match(/[\s\#]\d{5,6}?\s/g);
Now you still have junk left on either end of the number.
While you can mess around with non-capturing parens, I think it's easier and faster to simply run an extra replace inline.
e.g. str = str.replace(/\D/g, "") would strip all the non-digits.
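As a rough illustration of that two-step approach, the sketch below runs the pattern against the sample sentence from the question and then strips the non-digits from each match separately. The helper name digitsOnly is made up for this example.

```javascript
var notestxt = "Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.";

// Hypothetical helper: drop everything that is not a digit.
function digitsOnly(s) {
  return s.replace(/\D/g, "");
}

// match() returns null when nothing matches, so fall back to an empty array.
var raw = notestxt.match(/[\s#]\d{5,6}?\s/g) || [];

// Clean each match on its own so the numbers stay separate.
var cleaned = [];
for (var i = 0; i < raw.length; i++) {
  cleaned.push(digitsOnly(raw[i]));
}
// cleaned -> ["09999", "08220", "082202"]
```

On this sample the pattern skips #082201, because that number is followed by a comma rather than whitespace. That is one reason a trailing (?!\d) check may be a better fit than a trailing \s.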
If you have a string that can have any number of matches, and you want to return the matches in an array, you can step through the string with a regular expression exec:
Code:
var A=[], M;
while((M=/[\s#](\d{5,6})(?!\d)/g.exec(notestxt))!=null){
A[A.length]=M[1];
}
// A now contains every 5 or 6 digit integer that follows a # or whitespace in notestxt
If you have more than one match, replacing the non-digits with the empty string will turn your matches into a single long string of digits, which won't match.
If you go that route, replace runs of non-digits (\D+) with a space instead.
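A small sketch of the difference, using a made-up string holding two numbers (the variable names are only for illustration):

```javascript
var matched = " 09999 #08220 ";

// Removing every non-digit glues the two numbers together.
var joined = matched.replace(/\D/g, "");
// joined -> "0999908220"

// Replacing each run of non-digits with one space keeps them apart.
var spaced = matched.replace(/\D+/g, " ");
// spaced -> " 09999 08220 "

// Trim the outer spaces, then split into separate numbers.
var numbers = spaced.replace(/^ +| +$/g, "").split(" ");
// numbers -> ["09999", "08220"]
```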
By putting a regular expression literal in the loop I was creating a new RegExp on each iteration,
so lastIndex kept getting reset to 0 instead of advancing past the last match.
Defining the RegExp outside of the loop solves it.
Thank you for catching it, Philip.
var notestxt = "Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero."
Code:
var A=[], M;
var Rx=/[\s#](\d{5,6})(?!\d)/g;
while((M=Rx.exec(notestxt))!=null){
A[A.length]=M[1];
}
alert(A); // displays 09999,08220,082201,082202
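To see the lastIndex behaviour for yourself, here is a minimal sketch on a short made-up string. It assumes an engine where every evaluation of a regex literal produces a fresh RegExp object, which is how ES5 and later engines behave.

```javascript
var text = "a 12345 b 678901 c";

// One shared RegExp: lastIndex moves forward after each successful exec.
var Rx = /[\s#](\d{5,6})(?!\d)/g;
var positions = [];
var M;
while ((M = Rx.exec(text)) !== null) {
  positions.push(Rx.lastIndex);
}
// positions -> [7, 16]; a failed exec then resets Rx.lastIndex to 0.

// A fresh literal each time starts at 0 again, so it finds the same match twice.
var first = /[\s#](\d{5,6})(?!\d)/g.exec(text)[1];
var again = /[\s#](\d{5,6})(?!\d)/g.exec(text)[1];
// first and again are both "12345"
```

With the literal inside the while condition, the loop would keep finding that first match and never reach null.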
Once again, shows that there are more ways than one to kill a cat.
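One more way, offered as a sketch rather than a recommendation: engines that support lookbehind assertions (an ES2018 feature, so not IE or other older browsers, where the pattern is a syntax error) can check for the preceding whitespace or # without consuming it. A plain match() then returns only the digits.

```javascript
var notestxt = "Lorem ipsum dolor sit amet 09999 #08220 #082201, 082202 consectetuer adipiscing elit. Ut eleifend ipsum nec risus. Proin arcu ligula, part# 10046512 hendrerit et, fringilla a, dapibus sed, libero.";

// (?<=[\s#]) requires whitespace or # just before the digits without capturing it.
// (?!\d) rejects the match if another digit follows.
var found = notestxt.match(/(?<=[\s#])\d{5,6}(?!\d)/g) || [];
// found -> ["09999", "08220", "082201", "082202"]
```

Like the exec loop above, this does not pick up a number at the very start of the string, since nothing precedes it.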