Thread: RegEx Modifiers

  1. #1
    New Coder
    Join Date
    Jun 2006
    Posts
    67
    Thanks
    20
    Thanked 0 Times in 0 Posts

    RegEx Modifiers

    I've got the following sets of code I'm trying to extract data from using a regular expression:
    Code:
    <TD STYLE=text-align:center;background-color:yellow;color:red;font-size:10px;font-weight:bold; CLASS="dedefault"><p class="centeraligntext">ONLINE COURSE
    </TD>
    <TD STYLE=text-align:center;font-size:10px;background:WHITE; CLASS="dedefault"><p class="centeraligntext">
    3
    </TD>
    I'd like to extract both the "ONLINE COURSE" and the "3"; this is the expression I'm using right now:
    PHP Code:
    $REGEX_3='#\<p class\=\"centeraligntext\"\>(.+?)\<\/TD\>#s';
    Since one value is on the same line as the <p> and the other isn't, I need the expression to ignore the line break so that both can be detected with the same pattern. I've tried adding \m to make it multiline, but I don't know precisely where to put it. I've tried a few different spots, but even when it doesn't cause a syntax error, it doesn't work.

    Can anyone tell me how I need to modify the expression to ignore linebreaks?

    Thanks in advance.
    Last edited by coolcamo8642; 02-18-2011 at 02:50 AM. Reason: [Resolved]

  • #2
    Gütkodierer
    Join Date
    Apr 2009
    Posts
    2,127
    Thanks
    1
    Thanked 426 Times in 424 Posts
    There is no such thing as "ignore the linebreak". It's only about how to match it.

    The multiline modifier only changes whether "^" and "$" match the beginning and end of lines, or of the whole string. You don't have those in your regexp, so you don't need that modifier.
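A minimal sketch of what the "m" modifier actually changes. The two-line sample string is made up for illustration and is not from the original post:

```php
<?php
// Hypothetical sample text with one line break in it.
$text = "first line\nsecond line";

// Without "m", ^ anchors only at the start of the whole string.
$withoutM = preg_match_all('#^\w+#', $text, $a);

// With "m", ^ also anchors right after each line break.
$withM = preg_match_all('#^\w+#m', $text, $b);

var_dump($withoutM); // int(1)
var_dump($withM);    // int(2)
print_r($b[0]);      // "first" and "second"
```

Since the pattern in question contains neither "^" nor "$", adding "m" to it would change nothing.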

    Now, the real problem: You are trying to match a string, which includes linebreaks, with (.+?). The dot metacharacter won't match linebreaks, unless you use the "s" modifier, so normally it would just stop at the end of a line, which is not what you want.
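As a small illustration of the dot behaviour described above (the three-character sample string is an assumption, chosen only to keep the example short):

```php
<?php
// Hypothetical sample: "a", a line break, then "b".
$text = "a\nb";

// By default the dot does not match a line break, so this fails.
$plain = preg_match('#a.b#', $text);

// With the "s" modifier the dot matches line breaks as well.
$dotAll = preg_match('#a.b#s', $text);

var_dump($plain);  // int(0)
var_dump($dotAll); // int(1)
```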

    But you already use the "s" modifier, and your regexp works perfectly. I have no idea what your problem is supposed to be. Maybe you're using preg_match, which will stop after the first match, instead of preg_match_all.

    The right place for modifiers, by the way, is after the closing regexp delimiter, which, in your case, is after the second "#".
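To illustrate the preg_match versus preg_match_all guess, here is a rough sketch that runs the original pattern against the two table cells from the first post. The variable names other than $REGEX_3 are made up for the example, and the line breaks are assumed to be plain "\n":

```php
<?php
// The two cells from the first post, with "\n" as the line break.
$html = '<TD STYLE=text-align:center;background-color:yellow;color:red;font-size:10px;font-weight:bold; CLASS="dedefault"><p class="centeraligntext">ONLINE COURSE' . "\n"
      . '</TD>' . "\n"
      . '<TD STYLE=text-align:center;font-size:10px;background:WHITE; CLASS="dedefault"><p class="centeraligntext">' . "\n"
      . '3' . "\n"
      . '</TD>';

// Original pattern; the "s" modifier sits after the closing "#".
$REGEX_3 = '#\<p class\=\"centeraligntext\"\>(.+?)\<\/TD\>#s';

// preg_match stops after the first match.
preg_match($REGEX_3, $html, $first);

// preg_match_all keeps going; $all[1] holds every capture of group 1.
$count = preg_match_all($REGEX_3, $html, $all);

var_dump($count);  // int(2)
var_dump($all[1]); // both values, still including their line breaks
```

With preg_match only the "ONLINE COURSE" cell comes back, which would look as if the second cell had not matched at all.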

  • Users who have thanked venegal for this post:

    coolcamo8642 (02-18-2011)

  • #3
    New Coder
    Join Date
    Jun 2006
    Posts
    67
    Thanks
    20
    Thanked 0 Times in 0 Posts
    Thanks for the clarification. It seems like I was causing a syntax error when I had attempted to change it earlier, because it's working now as you said. Part of what's extracted ends up on a new line, but that's fine.

    I think regular expressions are one of the most frustrating aspects of programming, so I really appreciate your detail!

  • #4
    Gütkodierer
    Join Date
    Apr 2009
    Posts
    2,127
    Thanks
    1
    Thanked 426 Times in 424 Posts
    The line breaks are in the matched string, because you're matching them with the (.+?). If you don't want them, you can easily change the regexp to
    PHP Code:
    $REGEX_3='#\<p class\=\"centeraligntext\"\>\s*(.+?)\s*\<\/TD\>#s';
    The \s* should eat away the whitespace (including line breaks) just fine. Notice that this way, you don't even need the "s" modifier, as long as the text you want to capture sits on a single line: the surrounding line breaks are already consumed by the \s*, so the dot doesn't have to match them any more.

    So,
    PHP Code:
    $REGEX_3='#\<p class\=\"centeraligntext\"\>\s*(.+?)\s*\<\/TD\>#';
    should also do the trick.
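A quick check of that claim, sketched against the two cells from the first post (the $html and $matches names are made up for the example, and the line breaks are assumed to be plain "\n"):

```php
<?php
// The two cells from the first post, with "\n" as the line break.
$html = '<TD STYLE=text-align:center;background-color:yellow;color:red;font-size:10px;font-weight:bold; CLASS="dedefault"><p class="centeraligntext">ONLINE COURSE' . "\n"
      . '</TD>' . "\n"
      . '<TD STYLE=text-align:center;font-size:10px;background:WHITE; CLASS="dedefault"><p class="centeraligntext">' . "\n"
      . '3' . "\n"
      . '</TD>';

// No "s" modifier: the \s* on both sides consumes the line breaks.
$REGEX_3 = '#\<p class\=\"centeraligntext\"\>\s*(.+?)\s*\<\/TD\>#';

$count = preg_match_all($REGEX_3, $html, $matches);

var_dump($count);      // int(2)
var_dump($matches[1]); // "ONLINE COURSE" and "3", without line breaks
```

This only holds while each value fits on one line. If a cell's text itself contained a line break, the dot could not cross it and the "s" modifier would be needed again.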

    Oh, and another thing: I'd only escape characters if I really need to, because part of why regexps can be frustrating is that they are hard to read, and escaping stuff without really needing to do that makes that problem even worse.

    So, you could easily write it like this:
    PHP Code:
    $REGEX_3='#<p class="centeraligntext">\s*(.+?)\s*</TD>#';
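A small sketch to confirm that the escaped and unescaped spellings behave the same. The one-cell sample and the variable names are assumptions made for the example:

```php
<?php
// Hypothetical one-cell sample with the value on its own line.
$html = '<p class="centeraligntext">' . "\n3\n" . '</TD>';

// Every backslash except the ones in \s is redundant here. Even "/" needs
// no escaping, because the delimiter is "#" rather than "/".
$escaped   = '#\<p class\=\"centeraligntext\"\>\s*(.+?)\s*\<\/TD\>#';
$unescaped = '#<p class="centeraligntext">\s*(.+?)\s*</TD>#';

preg_match_all($escaped, $html, $a);
preg_match_all($unescaped, $html, $b);

var_dump($a === $b); // bool(true)
var_dump($b[1]);     // one capture: "3"
```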

