checking regex is good

02-18-2010, 07:13 PM

My image files are labelled like this:

image01.jpg or image234.jpg or image01.gif or image01.png

The differences are in the numbers or the extension.

This regex appears to work:

next unless ($image_no =~ /^(image)(\d+)(.jpg|.gif|.png)+$/);

I'm not sure if it should be that or something like:

next unless ($image_no =~ /^(image)(\d+).(jpg|gif|png)+$/);

How can we measure the greediness or other efficiency of a regex?

02-18-2010, 07:28 PM
I don't know of any good way to test efficiency, but I can help make your regex a little better, since it has some issues. Your second regular expression would accept "image123_pnggifjpg", which probably isn't what you were expecting.


You need to escape the . because an unescaped dot matches any single character (except newline), not just the literal period you were looking for. Also, the + after (jpg|gif|png) is unnecessary, since you expect exactly one file extension.
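To make those two points concrete, here is a small sketch that runs both patterns from the original post, plus a variant with the dot escaped and the trailing + dropped, against a few names. The sample filenames and the labels "first", "second" and "fixed" are made up for illustration.

```perl
use strict;
use warnings;

# The two patterns from the original post, plus a variant with the dot
# escaped and the trailing + removed.
my %patterns = (
    first  => qr/^(image)(\d+)(.jpg|.gif|.png)+$/,
    second => qr/^(image)(\d+).(jpg|gif|png)+$/,
    fixed  => qr/^(image)(\d+)\.(jpg|gif|png)$/,
);

# Hypothetical sample names: one valid, three that should be rejected.
my @names = qw(image01.jpg image123_pnggifjpg image5xjpg image7.jpg.gif);

for my $name (@names) {
    for my $label (qw(first second fixed)) {
        my $result = $name =~ $patterns{$label} ? 'match' : 'no match';
        printf "%-20s %-7s %s\n", $name, $label, $result;
    }
}
```

Only the "fixed" pattern should reject all three of the malformed names; the other two each let at least one of them through.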

One thing for efficiency (or memory overhead at least): if you don't need the values of the capture groups, you can tell Perl not to store them with ?:, for example (?:\d+). I can't say exactly how much it helps. If you really want to know everything about Perl regexes, read http://perldoc.perl.org/perlre.html
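One possible way to put a number on the difference is the core Benchmark module. The following is only a rough sketch: the sample data, the variable names and the count_matches helper are invented for illustration, and the timings will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample data: a mix of matching and non-matching names.
my @names = map { ("image$_.jpg", "image$_.bmp", "photo$_.png") } 1 .. 100;

my $capturing     = qr/^(image)(\d+)\.(jpg|gif|png)$/;
my $non_capturing = qr/^image\d+\.(?:jpg|gif|png)$/;

# Count how many of the sample names a pattern accepts.
sub count_matches {
    my ($re) = @_;
    return scalar grep { /$re/ } @names;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    capturing     => sub { count_matches($capturing) },
    non_capturing => sub { count_matches($non_capturing) },
});
```

For a look at what the regex engine does step by step, rather than how long it takes, the "use re 'debug';" pragma is another option.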

02-18-2010, 07:32 PM
See: YAPE::Regex::Explain - explanation of a regular expression

02-18-2010, 07:32 PM
Hi, my suggestion is:
/^image(\d+)\.(jpe?g|gif|png)$/i
The jpe?g alternative accepts both jpg and jpeg, and the i modifier makes the match case-insensitive.
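A quick way to try that pattern is to run it over a handful of names and print what the two capture groups pick up. This is just a sketch; the sample names and the variables $image_re, $number and $extension are assumptions for illustration.

```perl
use strict;
use warnings;

my $image_re = qr/^image(\d+)\.(jpe?g|gif|png)$/i;

# Hypothetical sample names for illustration.
for my $name (qw(image01.jpg IMAGE234.JPEG image01.gif image01.bmp image.png)) {
    # In list context a successful match returns the captured groups;
    # a failed match returns an empty list, so the condition is false.
    if (my ($number, $extension) = $name =~ $image_re) {
        print "$name: number=$number, extension=$extension\n";
    }
    else {
        print "$name: skipped\n";
    }
}
```

The first three names should be accepted, while image01.bmp (unlisted extension) and image.png (no digits) should be skipped.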

02-18-2010, 07:46 PM
Thank you all. I'll have a read through YAPE and perldoc.