
View Full Version : Regex help - matching nested tags without matching adjacent ones



me'
06-09-2004, 05:45 PM
Hi. I'm in need of a regular expression that will match nested <abbr> tags, then delete any which are children of other abbr tags. For example, if my <abbr>ising algorithm turns up this: <abbr>X<abbr>HTML</abbr></abbr>, I want this regex to match that and to turn out <abbr>XHTML</abbr>.

This is how far I've got:
$text = preg_replace(
'/<abbr title="(.*)">(.*)<abbr.*>(.*)<\/abbr><\/abbr>/',
'<abbr title="$1">$2$3</abbr>',
$text);

However, this matches things like "<abbr>XHTML</abbr> and <abbr>XML</abbr>", and shortens them when it shouldn't.

Any help is appreciated.

PS, a Perl-format regex is preferred, but not necessary.

bcarl314
06-09-2004, 05:49 PM
Try adding some lazy operators...



$text = preg_replace(
'/<abbr title="(.*?)">(.*?)<abbr.*?>(.*?)<\/abbr><\/abbr>/',
'<abbr title="$1">$2$3</abbr>',
$text);
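
To see what the question marks change, here is a small sketch. The sample strings are made up for illustration. The replacement is written with $3 (the pattern only has three groups) and the closing tag spelled </abbr>, which I'm assuming is what was intended.

```php
<?php
// Hypothetical sample input: two adjacent, non-nested <abbr> elements.
$adjacent = '<abbr>XHTML</abbr> and <abbr>XML</abbr>';

// Greedy: (.*) grabs as much as possible, so it runs to the LAST </abbr>.
preg_match('/<abbr>(.*)<\/abbr>/', $adjacent, $greedy);

// Lazy: (.*?) grabs as little as possible, so it stops at the FIRST </abbr>.
preg_match('/<abbr>(.*?)<\/abbr>/', $adjacent, $lazy);

echo $greedy[1], "\n"; // XHTML</abbr> and <abbr>XML
echo $lazy[1], "\n";   // XHTML

// The lazy pattern from this thread, applied to a nested pair
// (sample titles "x" and "y" are assumptions).
$nested = '<abbr title="x">X<abbr title="y">HTML</abbr></abbr>';
$fixed = preg_replace(
    '/<abbr title="(.*?)">(.*?)<abbr.*?>(.*?)<\/abbr><\/abbr>/',
    '<abbr title="$1">$2$3</abbr>',
    $nested
);

echo $fixed, "\n"; // <abbr title="x">XHTML</abbr>
```

Note that the question marks only make each capture as short as possible. They don't by themselves stop a match from starting at an earlier <abbr> on the same line, so it's worth testing against your real text.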

me'
06-09-2004, 05:55 PM
That looks like it's working... thanks very much!!

mordred
06-09-2004, 07:00 PM
bcarl314 wrote: "Try adding some lazy operators..."

<answer mode="nitpick">
The code in your answer is absolutely correct, but regarding the question marks... I've never seen them referred to as lazy operators. Usually these thingies are called "non-greedy" (or "ungreedy") quantifiers. For me, lazy operators would suggest lazy evaluation of expressions, like in "if (funcA() && funcB())", where the condition fails if funcA() returns false, and never calls upon funcB().

Here's some reading material about greedy quantifiers (http://de.php.net/pcre.pattern.syntax#AEN102928). Be aware that this feature is only available for perl-like regexes.
</answer>
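
For anyone unsure what lazy (short-circuit) evaluation of expressions looks like in practice, a minimal sketch follows. funcA() and funcB() are the names from the example above; the $calls array and the $result parameter are hypothetical helpers added only to record which functions actually ran.

```php
<?php
// Each function logs its own name into $calls, then returns $result.
function funcA(array &$calls, bool $result): bool
{
    $calls[] = 'funcA';
    return $result;
}

function funcB(array &$calls, bool $result): bool
{
    $calls[] = 'funcB';
    return $result;
}

// && stops as soon as the left side is false: funcB() is never called.
$calls = [];
$andResult = funcA($calls, false) && funcB($calls, true);
$andCalls = $calls;

// || stops as soon as the left side is true: funcB() is never called.
$calls = [];
$orResult = funcA($calls, true) || funcB($calls, true);
$orCalls = $calls;

echo implode(',', $andCalls), "\n"; // funcA
echo implode(',', $orCalls), "\n";  // funcA
```

This has nothing to do with how the regex engine treats the question mark; it only shows the other meaning of "lazy".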

bcarl314
06-09-2004, 09:09 PM
Sorry, I was too ? to write it. :D


