proper search pattern for the following

06-14-2006, 06:14 AM
I've been trying to get this for the last 2 hours now and I can't seem to get the right pattern for this.

I'm trying to make it so that certain tags are replaced with new ones, and any <br> found directly before or after the original tags is removed.
For example, I would want the following text...

$text = 'this is my text<br>[code]<br>here is some text<br>[/code]<br>';

changed to..

$text = 'this is my text<div class=display>here is some text</div>';

Here's what I have so far..

$text = preg_replace("/(<br>)\[code\](<br>)(.*?)(<br>)\[\/code\](<br>)/ism","\n<div style='width:50%;background-color:#cccccc;'>$3</div>\n",$text);

But the problem is that the above code only works if the <br>'s are there in each case. I want it to do the change even if the <br>'s are not there.

I've been reading about regular expressions, but I just can't figure it out :confused:

ralph l mayo
06-14-2006, 07:04 AM
Here's what I'd do with the regex:

$text = 'this is my text<br>[code]<br>here is some text<br>[/code]<br>';
$regex =
'/(?:<br[\s\/]*>)* # Match <br followed by any number of spaces and forward slashes (including zero), then a closing >; the whole group repeats any number of times (including zero) and is not captured
\[code\] # Match the tag
(?:<br[\s\/]*>)* # More optional break tags
(.*?) # Content, capture this group
(?:<br[\s\/]*>)* # More optional breaks
\[\/code\] # The ending tag
(?:<br[\s\/]*>)* # Trim as many break tags as possible from the end
/isx'; // i: case-insensitive, s: dot also matches newlines, x: ignore whitespace and allow # comments in the pattern
$text = preg_replace($regex, '<div class="display">$1</div>', $text);

The changes are:
(a) (the one that specifically addresses your question) adding * after a group means match any number of times, *including zero*. This makes the breaks optional. If you want them to match at most once, follow the group with a question mark instead. E.g., /net(?:work)?\sstatus/ matches 'net status' and 'network status'.
(b) using ?: at the beginning of a group marks it as a grouping-only (non-capturing) group rather than a capturing one. It saves a bit of memory and (I think) makes the pattern easier to follow. Since only the content is captured now, the replacement uses $1 instead of $3.
(c) adding [\s\/]* matches as many spaces and forward slashes as possible within breaks, so you'll catch XHTML-style <br/> and <br /> as well as <br> (along with a bunch of malformed junk like <br/ / // / / /////>, but there's no harm in filtering that out).
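
To see points (a) to (c) in action, here is a minimal sketch that wraps the pattern in a function and runs it on input with and without the surrounding breaks. The function name replace_code_tags is made up for this example, and the i, s and x modifiers are an assumption about how the pattern is meant to be run (x is needed for the inline # comments).

```php
<?php
// Hypothetical helper for illustration; not part of any library.
function replace_code_tags($text)
{
    $regex = '/(?:<br[\s\/]*>)*   # optional breaks before the tag
        \[code\]                  # opening tag
        (?:<br[\s\/]*>)*          # optional breaks after it
        (.*?)                     # content (lazy, the only captured group)
        (?:<br[\s\/]*>)*          # optional breaks before the closing tag
        \[\/code\]                # closing tag
        (?:<br[\s\/]*>)*          # optional breaks after it
        /isx';
    return preg_replace($regex, '<div class="display">$1</div>', $text);
}

// Breaks on every side of the tags: all of them are removed.
echo replace_code_tags('this is my text<br>[code]<br>here is some text<br>[/code]<br>'), "\n";
// prints: this is my text<div class="display">here is some text</div>

// No breaks at all: the tags are still replaced.
echo replace_code_tags('a[code]b[/code]c'), "\n";
// prints: a<div class="display">b</div>c

// XHTML-style breaks and upper-case tags are handled too.
echo replace_code_tags('a<br />[CODE]b[/code]<br/>c'), "\n";
// prints: a<div class="display">b</div>c

// (?:work)? is optional: zero or one occurrence, never two.
var_dump(preg_match('/net(?:work)?\sstatus/', 'net status'));         // int(1)
var_dump(preg_match('/net(?:work)?\sstatus/', 'network status'));     // int(1)
var_dump(preg_match('/net(?:work)?\sstatus/', 'networkwork status')); // int(0)
```

Because the content group is lazy, it stops at the first point where the optional breaks and [/code] can match, so trailing breaks end up outside the captured text.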

06-14-2006, 08:33 AM
wow! thanks :thumbsup: that helps a lot and I have a better understanding of these patterns :rolleyes:


06-14-2006, 10:18 AM
btw, how would I now make it so that tags like [color], etc. won't be parsed unless they're outside the [code] tags?

eg. I have this code:

$text = preg_replace("/(?!\[code\](.*?))\[color=(.*?)\](.*?)\[\/color\](?!(.*?)\[\/code\])/","<span style='color:$2;'>$3</span>",$text);

But it doesn't work.. it just continues parsing it

What's the proper pattern for this?

to better understand what i'm trying to do, is i want the equivalent of CF's and