Regex .. anybody have some really good references for this?

07-02-2010, 05:47 PM

I'm a straight up regex noob! :eek: And most of the things I've read don't make sense to me. I can understand the menu at the Chinese restaurant better than regex!

Are there any really really good sources out there? Like Regex for Dummies? I'd like to be able to understand this stuff!

Much thanks!

07-02-2010, 06:36 PM
Dude, welcome to my world. Some people dream in regex; the rest of us just limp along, hoping we can get the most elementary expressions to do what we want them to!

The best way I've found for figuring out just what the hell is happening with a given expression is to use an online regex tester. I've been pretty happy with this one:


With a tester you can make small changes to both the expression and the string you're evaluating and see what effect the change makes.
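If you'd rather not depend on a website, a few lines of Python can play the same role. This is only a rough sketch: `try_regex` is a name made up for illustration, and it leans on nothing but the standard `re` module.

```python
import re

def try_regex(pattern, text):
    """Return the first piece of text the pattern matches, or None."""
    found = re.search(pattern, text)
    return found.group(0) if found else None

# Change either argument a little at a time and watch the result move.
print(try_regex(r"\d+", "room 101"))
print(try_regex(r"\d+", "no digits here"))
```

Tweak the pattern, rerun, and compare; that is essentially all an online tester does for you.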

Aside from that, I find that reading this website very slowly, over and over again, I can absorb some of the concepts. I forget them quickly, so I have to read the same pages over and over. And over. I just don't do well with regex! :rolleyes::(:D


07-02-2010, 06:45 PM
I've used a lot but can't for the life of me remember which ones without just googling. So let me explain the concept:

The truncated and not entirely accurate idea is that a regular expression is an expression that describes a regular syntax. It does this by describing which and how many characters can and cannot appear, in order.

So, the most restrictive regular expression only has one match. For example, I can match the following (line 1):

I am a god

with the following (r1):

/I am a god/

BAM! I have described line 1 with r1! Go regular expressions. But that's kind of useless for the most part. The real power of regular expressions is that they can allow matching based on many rules. The most basic of these rules comes down to how you describe what happens in a particular location in the syntax. For example, I want to be able to accept both of these strings into my syntax.

(line 1)
I am a god
(line 2)
I am a God

So I can write the following regex (yes, I know about case-insensitivity switches, I'm starting slow and easy):

/I am a [gG]od/

Bam! Now, what happened? Basically, this expression says in each position in the string, you've gotta match the characters I provide exactly. But, for the 8th character slot, I provided 2 characters, g and G, so either will work. So that means, I could technically rewrite this as:

/[I][ ][a][m][ ][a][ ][gG][o][d]/

The square brackets make up a character set. So each spot except the "g" has a single acceptable character in its set. But we don't bother with the brackets unless we need them.
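To see the character set at work, here is a minimal sketch in Python (my choice of language; the thread doesn't name one). It assumes only the standard `re` module, and the variable names are made up for illustration.

```python
import re

# [gG] accepts either case, but only in that one position.
short_form = re.compile(r"I am a [gG]od")
# The long-winded form: every position written as its own one-character set.
long_form = re.compile(r"[I][ ][a][m][ ][a][ ][gG][o][d]")

samples = ["I am a god", "I am a God", "I am a Zod"]
short_results = [bool(short_form.search(s)) for s in samples]
long_results = [bool(long_form.search(s)) for s in samples]

for text, matched in zip(samples, short_results):
    print(f"{text!r}: {'match' if matched else 'no match'}")
```

Both forms should agree on every sample, which is the whole point: a bare character is just shorthand for a set with one member.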

So, now you understand character sets, so let's move on to special character sets:

\s - white space
\d - digit
. - anything

So let's say I don't care what you say you are, but you have to say you are some 3-letter object:

I am a god
I am a dog
I am a fop

/I am a .../

Let's say I want to allow spaces OR tabs between words:

/I[ \t]am[ \t]a[ \t].../
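A pattern along the lines of `I[ \t]am[ \t]a[ \t]...` would do that: a set holding a space and a tab sits in each gap. A quick Python sketch, assuming exactly one separator between words (the test strings are my own):

```python
import re

# [ \t] is a character set holding a space and a tab: exactly one of them per gap.
pattern = re.compile(r"I[ \t]am[ \t]a[ \t]...")

with_spaces = pattern.fullmatch("I am a god")
with_tabs = pattern.fullmatch("I\tam\ta\tdog")
no_separator = pattern.fullmatch("Iam a god")   # nothing between "I" and "am"

print(bool(with_spaces), bool(with_tabs), bool(no_separator))
```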


Let's say I want you to write you are any 3-letter word ending in the letter "x"

I am a sex
I am a dex
I am a lex

/I am a ..x/
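Here is a small Python sketch of the two dot patterns above, assuming the standard `re` module. I use `fullmatch` so the whole string has to fit the pattern; the last test string is my own, added to show that a dot is not picky about letters.

```python
import re

any_three = re.compile(r"I am a ...")   # three characters of any kind
ends_in_x = re.compile(r"I am a ..x")   # two of any kind, then a literal x

three = [s for s in ["I am a god", "I am a dog", "I am a fop"]
         if any_three.fullmatch(s)]
x_words = [s for s in ["I am a dex", "I am a lex", "I am a god"]
           if ends_in_x.fullmatch(s)]

# A dot takes digits, spaces and punctuation just as happily as letters.
not_letters = any_three.fullmatch("I am a 1 !")

print(three, x_words, bool(not_letters))
```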

Getting it?

Ok, now, more special characters. There are certain properties of a string that have nothing to do with the characters in it. The 2 most common are beginning and end of the line. It turns out, my regexes above all match more than what I said. They match ANY string as long as the pattern is satisfied at some point:

/I am a god/ will accept "Hey, did you hear? I am a god. Isn't that cool" because the pattern is in the string. How do I restrict it to ONLY what I want?

/^I am a god$/
^ - beginning of line
$ - end of line

Line must start with the letter I

/^I/

or, to accept either case, /^[iI]/

Line must end with a question mark or period

/[?\.]$/


I had to escape the . because . means "any character" if you remember.
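The anchors are easy to check in Python. A hedged sketch: `re.search` looks for the pattern anywhere in the string, so the anchors are what pin it down. The ends-with pattern `[?\.]$` is one plausible way to write that rule, and the short test strings are made up.

```python
import re

sentence = "Hey, did you hear? I am a god. Isn't that cool"

unanchored = re.search(r"I am a god", sentence)    # found in the middle
anchored = re.search(r"^I am a god$", sentence)    # the line must be exactly the phrase
exact = re.search(r"^I am a god$", "I am a god")

starts_with_i = [bool(re.search(r"^[iI]", s)) for s in ["I am", "i am", "Am I"]]
ends_ok = [bool(re.search(r"[?\.]$", s)) for s in ["Really?", "Yes.", "No!"]]

print(bool(unanchored), bool(anchored), bool(exact), starts_with_i, ends_ok)
```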

Only 2 more things left and you'll be advanced enough to use regexes every day. First, NEGATION. You can put any character at the beginning of a line EXCEPT y or Y.

/^[^yY]/

Beginning of line, negated character set of y,Y. So yes, ^ means beginning of line when it's at the beginning of the expression, but it means "not" when it's the first thing inside square brackets.
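A short Python sketch of the negated set, using a pattern like `^[^yY]` and test strings of my own. Note the last case: a negated set still has to match some character, so an empty line fails.

```python
import re

# First ^ anchors to the start of the line; the ^ inside the brackets negates the set.
pattern = re.compile(r"^[^yY]")

checks = {s: bool(pattern.search(s)) for s in ["Yes", "yes", "No", "maybe", ""]}

for text, ok in checks.items():
    print(f"{text!r}: {'accepted' if ok else 'rejected'}")
```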

And finally, count modifiers. You can have as many spaces and tabs as you want between two words (including 0 spaces or tabs):

/[ \t]*/

You can have any number of spaces or tabs, but you need at least one

/[ \t]+/

You can have one comma at the end of your line, but it's optional. You cannot have more than one comma:

/,?$/
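The three count modifiers can be tried out like this in Python. Treat it as a sketch under my own assumptions: the words "I" and "am" and the anchors are added so each pattern describes a whole line. In particular, a bare `,?$` used with a search would still find a match in a line ending in two commas, so the comma pattern here also insists that nothing before the optional comma is a comma.

```python
import re

zero_or_more = re.compile(r"^I[ \t]*am$")    # * : the gap may be missing entirely
one_or_more = re.compile(r"^I[ \t]+am$")     # + : at least one space or tab
optional_comma = re.compile(r"^[^,]*,?$")    # ? : zero or one comma, only at the end

star = [bool(zero_or_more.search(s)) for s in ["Iam", "I am", "I \t  am"]]
plus = [bool(one_or_more.search(s)) for s in ["Iam", "I am", "I \t  am"]]
comma = [bool(optional_comma.search(s)) for s in ["hello", "hello,", "hello,,"]]

# The unanchored fragment on its own does not reject the double comma.
loose = bool(re.search(r",?$", "hello,,"))

print(star, plus, comma, loose)
```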


So again:

[] - character set
[^] - negated character set
^ - beginning of line
$ - end of line
\s - white space
\d - digit
. - any character
* - 0 or more of preceding character/character set
+ - 1 or more of preceding character/character set
? - 0 or 1 of preceding character/character set
\ - escape special character (if you want to accept a whack, i.e. a backslash, and an "s" you would do /\\s/)
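The escaping rule in the last line is the one that trips people up most, so here is a small Python sketch of the difference between `\s` and `\\s`. Raw strings (the `r"..."` prefix) are used so that Python itself leaves the backslashes alone; the test strings are invented.

```python
import re

whitespace = re.compile(r"\s")     # \s  : any whitespace character
literal = re.compile(r"\\s")       # \\s : a real backslash followed by the letter s

text_with_space = "a b"
text_with_backslash = "a\\sb"      # Python source for the 4 characters a \ s b

space_hits = (bool(whitespace.search(text_with_space)),
              bool(literal.search(text_with_space)))
backslash_hits = (bool(whitespace.search(text_with_backslash)),
                  bool(literal.search(text_with_backslash)))

print(space_hits, backslash_hits)
```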

And you're off and running. Get learning. Hopefully these now make way more sense to you and you can do the rest on your own!

07-03-2010, 01:31 AM
Thanks for the sources!! :thumbsup:

If I can get regex down .. I should be able to fly the next shuttle mission! :eek: