...

checking regex is good



bazz
02-18-2010, 07:13 PM
Hi,

My image files are labelled like this:

image01.jpg or image234.jpg or image01.gif or image01.png

The differences are in the numbers or the extension.

This regex appears to work:



next unless ($image_no =~ /^(image)(\d+)(.jpg|.gif|.png)+$/);


Not sure if it should be that or something like



next unless ($image_no =~ /^(image)(\d+).(jpg|gif|png)+$/);


How can we measure the greediness or other efficiency of a regex?
bazz

CrzySdrs
02-18-2010, 07:28 PM
I don't know of any good way to test efficiency, but I can help make your regex a little better, since it has some issues. Your second regular expression would accept "image123_pnggifjpg", which probably isn't what you were expecting.


/^image(\d+)\.(jpg|gif|png)$/


You need to escape the . because an unescaped . matches any single character (except newline), not just the period you were looking for. The + after (jpg|gif|png) was also unnecessary, since you expect exactly one file extension.
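
To see the difference, here is a minimal sketch that runs a few names through both patterns. The names beyond the ones in the original post, and the helper check(), are made up for illustration.

```perl
use strict;
use warnings;

# Sample names: the first three follow the scheme from the question,
# the last two are hypothetical bad inputs.
my @names = qw(image01.jpg image234.jpg image01.png
               image123_pnggifjpg image01xjpg);

my $loose  = qr/^(image)(\d+).(jpg|gif|png)+$/;  # unescaped dot, repeatable extension
my $strict = qr/^image(\d+)\.(jpg|gif|png)$/;    # literal dot, exactly one extension

# Hypothetical helper: returns 1 if $name matches $re, otherwise 0.
sub check {
    my ($re, $name) = @_;
    return $name =~ $re ? 1 : 0;
}

for my $name (@names) {
    printf "%-20s loose=%d strict=%d\n",
        $name, check($loose, $name), check($strict, $name);
}
```

The last two names get through the loose pattern because the bare . stands in for "_" or "x", and the trailing + lets several extensions run together.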

One thing for efficiency (or memory overhead at least): if you don't need the values from the capture groups, you can tell Perl not to store them with ?:, for example (?:\d+). I can't say exactly how much it helps. If you really want to know everything about Perl regexes, read http://perldoc.perl.org/perlre.html
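
As a rough sketch of both points: the snippet below shows what each form of the pattern returns, then times them with the core Benchmark module, which is one possible way to compare regexes. The sample name and the iteration count are arbitrary choices. The timings will vary with your machine and Perl version, so treat them as a hint rather than a verdict.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $name = 'image234.png';   # sample name, following the scheme in the question

my $capturing     = qr/^image(\d+)\.(jpg|gif|png)$/;
my $non_capturing = qr/^image(?:\d+)\.(?:jpg|gif|png)$/;

# In list context a match returns the captured values; with no capture
# groups at all it returns the list (1) on success.
my @captured   = $name =~ $capturing;
my @uncaptured = $name =~ $non_capturing;

print "capturing:     @captured\n";
print "non-capturing: @uncaptured\n";

# Run each match the same number of times and print a comparison table.
cmpthese(1_000_000, {
    capturing     => sub { my $ok = $name =~ $capturing },
    non_capturing => sub { my $ok = $name =~ $non_capturing },
});
```

If you do need the image number or the extension later, keep the capturing parentheses. The non-capturing form is only a saving when the values are thrown away.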

FishMonger
02-18-2010, 07:32 PM
See: YAPE::Regex::Explain - explanation of a regular expression

koko5
02-18-2010, 07:32 PM
Hi, my suggestion is:
/^image(\d+)\.(jpe?g|gif|png)$/i
The jpe?g alternative accepts both jpg and jpeg, and the i modifier makes the match case-insensitive.
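
A small sketch of how that pattern behaves, assuming you want the number and extension back. The sub parse_image_name() and the test names are hypothetical, not from the posts above.

```perl
use strict;
use warnings;

my $re = qr/^image(\d+)\.(jpe?g|gif|png)$/i;

# Hypothetical helper: in list context, returns (number, extension)
# on a match and an empty list otherwise.
sub parse_image_name {
    my ($name) = @_;
    return $name =~ $re;
}

for my $name (qw(image01.jpg image01.jpeg IMAGE7.PNG image01.jpag image01.bmp)) {
    if (my ($number, $ext) = parse_image_name($name)) {
        print "$name: number=$number ext=$ext\n";
    } else {
        print "$name: no match\n";
    }
}
```

Note that with the i modifier the prefix matches in any case too (IMAGE7.PNG is accepted), so leave it off if the "image" part must be lower case.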

bazz
02-18-2010, 07:46 PM
Thank you all. I'll have a read of YAPE and perldoc.

bazz


