  1. #1
    Senior Coder
    Join Date
    Jul 2005
    Location
    UK
    Posts
    1,051
    Thanks
    6
    Thanked 13 Times in 13 Posts

    eregi() matching any link

    Given a passage with links in it, I want to process this so that it'll end up as plain text with no links.

    Regular expressions have to be the most complicated thing I have ever attempted to learn. Even after reading a step-by-step guide I still can't wrap my head around them.

    What I think I have is the basic syntax of the expression. I know I need to match strings starting with <a and ending with > (I'll remove the closing element of the link with str_replace since that's always going to be "</a>"). And I know that between the start and end could be any number of any characters. Therefore, the expression should, if I've understood this correctly, be...

    PHP Code:
    <?php
    eregi("^<a(.*)>$", $text, $matches);
    ?>
    Assuming that's right, what I don't get is all the escaping that tends to go on with regular expressions. The tutorial I read mentioned escaping certain characters with backslashes, but made no mention of forward slashes, yet they always seem to feature quite heavily in reg. expressions. Can anyone enlighten me?

    Thanks.
    Last edited by Pennimus; 08-17-2007 at 07:19 PM.

  • #2
    Super Moderator Inigoesdr's Avatar
    Join Date
    Mar 2007
    Location
    Florida, USA
    Posts
    3,642
    Thanks
    2
    Thanked 405 Times in 397 Posts
    Regexlib.com is a good place to start. I wrote this code, which accomplishes what you're looking for:
    PHP Code:
    $test = 'text text text <a href="www.com/blah.ext?arg=123&arg=321" other="1">WWW.LINK.COM</a> MORE TEXT';
    $new = preg_replace('/<a[\s]+[^>]*?href[\s]?=[\s\"\']+(.*?)[\"\']+.*?>([^<]+|.*?)?<\/a>/', '$2', $test);
    echo $new;
    This is just an example; you can adapt it to your needs. You don't need to create a new variable like I did. I used this regex.

  • #3
    Senior Coder
    Thank you a thousand times.

    I was nowhere near. Everything I've read says "." is any character, and "*" is any number of the previous character. Is there a simple reason (other than "that's just the way it is" ) why trying to match any number of any character in the middle doesn't work?

    In the meantime I will try and decipher what you've done so I can finally start to learn the syntax.
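
    For what it's worth, the pattern in the first post fails mainly because of the ^ and $ anchors rather than because of ".*": anchored that way, it only matches when the whole string is a single tag. A minimal sketch, assuming PHP 7 or later (where eregi() no longer exists, so preg_match() with the i flag stands in for it). The sample string and variable names are made up for illustration:

```php
<?php
// Hypothetical sample text with a link in the middle.
$text = 'Visit <a href="page.html">our page</a> today.';

// Anchored: ^ and $ require the whole string to be one tag, so nothing is found.
$anchored = preg_match('/^<a(.*)>$/i', $text, $m1);

// Unanchored: the same pattern is free to match anywhere in the string.
$unanchored = preg_match('/<a(.*)>/i', $text, $m2);

var_dump($anchored);   // int(0)
var_dump($unanchored); // int(1)
```

    So "any number of any character" in the middle does work once the pattern is allowed to start and end inside the text.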

  • #4
    Super Moderator Inigoesdr's Avatar
    It does work; the expression I used has three of them. The syntax is the only difference. By the way, if you're looking for an easy way to test regular expressions, you can try out this program. It has some simple features that can help you out: breaking down the expression, saying the expression in words, showing you its parts, testing matches and replacements, etc. Good luck!

  • #5
    Senior Coder TheShaner's Avatar
    Join Date
    Sep 2005
    Location
    Orlando, FL
    Posts
    1,126
    Thanks
    2
    Thanked 40 Times in 40 Posts
    Quote Originally Posted by Pennimus View Post
    Is there a simple reason (other than "that's just the way it is" ) why trying to match any number of any character in the middle doesn't work?
    Like Inigoesdr said, it does work. The difference with his is that his preg_replace expression replaced the entire link with the text that the anchor tags surrounded. You were having to do two separate expressions, which would work also.

    As for the backslashes, you use them either to escape certain characters or to create special sequences, like \s (which matches any whitespace character, such as a space, tab, or newline). Forward slashes are used at the start and end of a pattern in the preg functions as delimiters, since those functions use Perl-compatible regular expression syntax.

    -Shane
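
    To illustrate the two uses of the backslash described above, here is a short sketch: \s matching any whitespace character, and a backslash turning the special "." into a literal dot. The sample strings and variable names are invented for the example:

```php
<?php
// \s matches a space, a tab, or a newline, so all three separators are split on.
$words = preg_split('/\s+/', "one two\tthree\nfour");

// Unescaped, "." matches any character; written as "\.", it matches only a literal dot.
$anyChar    = preg_match('/^www.example$/', 'wwwXexample');
$literalDot = preg_match('/^www\.example$/', 'wwwXexample');

print_r($words);       // one, two, three, four
var_dump($anyChar);    // int(1)
var_dump($literalDot); // int(0)
```

    The patterns are in single-quoted PHP strings so that the backslashes reach the regex engine unchanged.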

  • #6
    Super Moderator Inigoesdr's Avatar
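    You can use almost any character to mark the beginning and end of the expression, as long as it's the same character at both ends (or a matching pair of brackets) and it isn't alphanumeric, a backslash, or whitespace. If the delimiter also appears inside the pattern, it has to be escaped with a backslash. From what I've seen, people tend to use the forward slash the most.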
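
    A quick sketch of what the choice of delimiter changes in practice. With "/" as the delimiter, the slash in "</a>" has to be escaped; with another delimiter such as "#" it does not. The pattern here is a simplified one written for this illustration, not the one from earlier in the thread, and the sample string is made up:

```php
<?php
$text = 'plain <a href="x.html">linked</a> text';

// "/" as delimiter: the slash inside the pattern must be written as \/.
$withSlash = preg_replace('/<a[^>]*>(.*?)<\/a>/i', '$1', $text);

// "#" as delimiter: the same pattern, with no escaping needed for "/".
$withHash = preg_replace('#<a[^>]*>(.*?)</a>#i', '$1', $text);

echo $withSlash, "\n"; // plain linked text
echo $withHash, "\n";  // plain linked text
```

    Both calls behave identically; picking a delimiter that does not occur in the pattern simply keeps it easier to read.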

  • #7
    Senior Coder
    Thanks, that's a useful tool. Well, I've done some testing with it and realise I was simply missing a question mark. Apparently you have to say "zero or one of any number of any character", not just "any number of any character", which I find slightly counter-intuitive, but there we go.

    Code:
    <a.*?>([^<]+|.*?)?</a>
    What I'm wondering, given that this works (at least according to the Regex Coach) and can be replaced with $1 to get the outcome I'm looking for, is there any benefit to using the much longer expression you gave me?
    Last edited by Pennimus; 08-18-2007 at 11:59 PM.
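
    For reference, a "?" placed directly after a quantifier such as "*" does not mean "zero or one" there. It makes the quantifier lazy, so ".*?" stops at the first point where the rest of the pattern can match, whereas ".*" runs on to the last such point. A small sketch of the difference, using a made-up sample string:

```php
<?php
// Hypothetical sample text with two links.
$text = 'See <a href="a.html">one</a> and <a href="b.html">two</a>.';

// Greedy: .* runs as far as it can, so the match ends at the last > in the string.
preg_match('/<a.*>/i', $text, $greedy);

// Lazy: the added question mark makes the match end at the first > it reaches.
preg_match('/<a.*?>/i', $text, $lazy);

echo $greedy[0], "\n"; // <a href="a.html">one</a> and <a href="b.html">two</a>
echo $lazy[0], "\n";   // <a href="a.html">
```

    This is why the shorter expression needs the question mark: without it, the first tag would swallow everything up to the end of the last link.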

  • #8
    Super Moderator Inigoesdr's Avatar
    It depends on what you want to use it for. The one I posted will only take out links (anchors with an href attribute), while yours will remove all anchors, including ones without an href, for example. You should make sure you have it set to be case-insensitive too.
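
    A rough sketch of that difference, running the two expressions from this thread against a made-up string that contains a named anchor, an ordinary link, and an upper-case link. The variable names are hypothetical, and the short expression has been given "/" delimiters so it can be passed to preg_replace():

```php
<?php
// Hypothetical input: a named anchor (no href), a normal link, and an upper-case link.
$text = '<a name="top">Top</a> | <a href="x.html">Link</a> | <A HREF="y.html">Shout</A>';

// The long expression from post #2 and the short one from post #7.
$long  = '/<a[\s]+[^>]*?href[\s]?=[\s\"\']+(.*?)[\"\']+.*?>([^<]+|.*?)?<\/a>/';
$short = '/<a.*?>([^<]+|.*?)?<\/a>/';

$onlyLinks  = preg_replace($long, '$2', $text);        // unwraps only the lower-case href link
$allAnchors = preg_replace($short, '$1', $text);       // unwraps the named anchor as well
$anyCase    = preg_replace($short . 'i', '$1', $text); // i flag: the upper-case link too

echo $onlyLinks, "\n";  // <a name="top">Top</a> | Link | <A HREF="y.html">Shout</A>
echo $allAnchors, "\n"; // Top | Link | <A HREF="y.html">Shout</A>
echo $anyCase, "\n";    // Top | Link | Shout
```

    Appending "i" after the closing delimiter is what turns on case-insensitive matching in the preg functions.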

