  1. #1
    New to the CF scene
    Join Date
    Mar 2007
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    regular expressions aka argh!

    Okay, I want to find within a string any URLs, like ...

    http://myworld.ebay.co.uk/pipwish

    and turn them into this ...

    <a href="http://myworld.ebay.co.uk/pipwish" target="_blank">[ Link to ebay.co.uk ]</a>

    I almost have it, but am tripping up on showing JUST the domain inside the Link to brackets.

    Any help???

  • #2
    Regular Coder
    Join Date
    Oct 2005
    Location
    Right Here
    Posts
    654
    Thanks
    1
    Thanked 0 Times in 0 Posts
    What code are you using and what output are you getting?

  • #3
    New to the CF scene
    Join Date
    Mar 2007
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    my code, sample data

    Here's what I have so far ...


    $Text="I like http://cgi.ebay.co.uk/F-ck-Graphic-Design-T-shirt-white-SMALL-mens-cool_W0QQitemZ220094569717QQcategoryZ313QQssPageNameZWDVWQQrdZ1QQcmdZViewItem, http://www.google.com, and I connect through FTP to my site through ftp://www.mysite.com.";
    $strText = preg_replace( '/(http|ftp)+(s)?:(\/\/)((\w|\.)+)(\/)?(\S+)?/i', '<a href="\0">[ Link to \4 ]</a>', $Text );
    echo $strText;


    What you'll notice in the output is that the link that's put into the <a> tag includes the punctuation (comma, period) following the original link. This may be a case of wanting to have my cake and eat it too ... but it seems to me I ought to be able to truncate that last character or at least disregard a comma or period, which should not end a URL (though they may be inside one, I guess) ...

    ugh
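
    A minimal sketch of one way to keep the trailing comma or period out of the link, assuming PHP 7 or later. The function name `linkify` and the simplified pattern are illustrative assumptions, not code from this thread. The idea is to match greedily, then peel sentence punctuation off the end inside a callback and put it back after the closing tag. It shows the full host in the brackets; cutting that down to just `ebay.co.uk` is a separate problem and is not attempted here.

    ```php
    <?php
    // Hypothetical helper (not from the thread): linkify URLs but keep
    // trailing sentence punctuation outside the generated <a> tag.
    function linkify(string $text): string
    {
        // Groups: 1 = scheme, 2 = host, 3 = optional path/query.
        $pattern = '~\b(https?|ftp)://([\w.-]+)(/\S*)?~i';

        return preg_replace_callback($pattern, function (array $m): string {
            $url = $m[0];
            // Peel off punctuation that usually belongs to the sentence.
            $trimmed = rtrim($url, '.,;:!?');
            $tail = (string) substr($url, strlen($trimmed));
            // The host class allows dots, so drop one picked up at the end.
            $host = rtrim($m[2], '.');

            return '<a href="' . htmlspecialchars($trimmed) . '" target="_blank">[ Link to '
                . htmlspecialchars($host) . ' ]</a>' . $tail;
        }, $text);
    }

    echo linkify('See http://www.google.com, and ftp://www.mysite.com.');
    ```

    The trade-off is that a URL which really does end in one of those characters loses it, which seems acceptable for text written by people.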

  • #4
    New to the CF scene
    Join Date
    Mar 2007
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Anyone have some ideas?

    Just bumping my thread for any fresh ideas ... I know this is simple stuff, but I'm just not a regex guy ... any help is greatly appreciated!

  • #5
    Senior Coder
    Join Date
    Jan 2007
    Posts
    1,648
    Thanks
    1
    Thanked 58 Times in 54 Posts
    I'm not an expert at reading regular expressions. If I look at mine after a week or two, I'll go crazy!

    Anyway.

    Pass in a $matches variable to preg_match(), print_r it and see what it puts out. Maybe you'll see that you're using a different reference?

    You'll want to have separate captures for each section (| marks the section):

    http:// | www.domain.com | /path/to/file.php

    Then it should be relatively easy.
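
    As a rough illustration of that suggestion, here is a sketch with one capture group per section. The pattern is a simplified assumption for a single URL, not a full URL grammar, and the scheme group leaves out the `://` separator.

    ```php
    <?php
    // One capture group per section: scheme | host | path.
    $pattern = '~^(https?|ftp)://([^/\s]+)(/\S*)?$~i';
    $url = 'http://www.domain.com/path/to/file.php';

    if (preg_match($pattern, $url, $matches)) {
        // $matches[0] is the whole match, [1] the scheme,
        // [2] the host and [3] the path.
        print_r($matches);
    }
    ```

    Printing `$matches` like this makes it easy to see which group number to use as the back-reference in the replacement string.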

  • #6
    Regular Coder ralph l mayo's Avatar
    Join Date
    Nov 2005
    Posts
    951
    Thanks
    1
    Thanked 31 Times in 29 Posts
    Pilfered from CPAN:

    Code:
    (?:(?:http)://(?:(?:(?:(?:(?:(?:[a-zA-Z0-9][-a-zA-Z0-9]*)?[a-zA-Z0-9])[.])*(?:[a-zA-Z][-a-zA-Z0-9]*[a-zA-Z0-9]|[a-zA-Z])[.]?)|(?:[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+)))(?::(?:(?:[0-9]*)))?(?:/(?:(?:(?:(?:(?:(?:[a-zA-Z0-9\-_.!~*'():@&=+$,]+|(?:%[a-fA-F0-9][a-fA-F0-9]))*)(?:;(?:(?:[a-zA-Z0-9\-_.!~*'():@&=+$,]+|(?:%[a-fA-F0-9][a-fA-F0-9]))*))*)(?:/(?:(?:(?:[a-zA-Z0-9\-_.!~*'():@&=+$,]+|(?:%[a-fA-F0-9][a-fA-F0-9]))*)(?:;(?:(?:[a-zA-Z0-9\-_.!~*'():@&=+$,]+|(?:%[a-fA-F0-9][a-fA-F0-9]))*))*))*))(?:[?](?:(?:(?:[;/?:@&=+$,a-zA-Z0-9\-_.!~*'()]+|(?:%[a-fA-F0-9][a-fA-F0-9]))*)))?))?)
    Remove the first ?: to capture the match
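
    To show what removing a `?:` changes, here is a small sketch using a deliberately tiny pattern (an assumption for illustration, not the CPAN expression). A `(?:...)` group only groups, while a plain `(...)` group also stores what it matched.

    ```php
    <?php
    $text = 'Visit http://www.google.com today';

    // Non-capturing group: only the whole match (index 0) is returned.
    preg_match('~(?:http://[\w.]+)~', $text, $nonCapturing);

    // Same pattern with the leading ?: removed: the group's text is
    // now also available at index 1.
    preg_match('~(http://[\w.]+)~', $text, $capturing);

    var_dump(count($nonCapturing));
    var_dump(count($capturing));
    ```

    In a `preg_replace()` replacement string, `\0` always refers to the whole match, so the extra capture is only needed when a numbered group is wanted.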

  • #7
    Regular Coder
    Join Date
    Jan 2006
    Location
    Finland, Hollola
    Posts
    285
    Thanks
    8
    Thanked 0 Times in 0 Posts
    ralph that looks way too complicated and slow imo :/
    PHP 5 & MySQL 5 (Y)

  • #8
    New to the CF scene
    Join Date
    Mar 2007
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Sorta working

    Okay ... this ...

    Code:
    (([\w]+:)?//)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?
    works flawlessly in this on-line validator when using the Javascript option ... but does nothing with preg or ereg ...

    http://www.regextester.com/

    Here's my test string ...

    I like http://cgi.ebay.co.uk/F-ck-Graphic-Design-T-shirt-white-SMALL-mens-cool_W0QQitemZ220094569717QQcategoryZ313QQssPageNameZWDVWQQrdZ1QQcmdZViewItem, http://www.google.com, and I connect through FTP to my site through ftp://www.mysite.com.
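
    One likely reason the preg functions do nothing with it: they expect the pattern wrapped in delimiters, and a pattern that starts with `(` gets its first parenthesised group read as the delimiter pair. The sketch below assumes that is the cause, since the post doesn't show the actual call. It uses `!` as the delimiter because that character doesn't occur in the pattern, and the hex ranges are written as `a-fA-F`. The ereg functions are a different matter: as far as I know they don't understand `\d` and `\w`, and they were removed in PHP 7.

    ```php
    <?php
    // Pattern from the post, split across lines for readability,
    // with the hex ranges written as a-fA-F.
    $body = '(([\w]+:)?//)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?'
          . '([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?'
          . '(/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*'
          . '(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?'
          . '(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?';

    // "!" does not occur in the pattern, so it is safe as a delimiter.
    $pattern = '!' . $body . '!';

    $text = 'I like http://www.google.com, and I connect through ftp://www.mysite.com.';

    // $matches[0] holds every full match found in the text.
    preg_match_all($pattern, $text, $matches);
    print_r($matches[0]);
    ```

    With the delimiters in place, the comma and the final period stay outside both matches, because neither character can end the host or path parts of this pattern.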

  • #9
    Regular Coder ralph l mayo's Avatar
    Join Date
    Nov 2005
    Posts
    951
    Thanks
    1
    Thanked 31 Times in 29 Posts
    Originally posted by kaisellgren:
    ralph that looks way too complicated and slow imo :/
    “For every complex problem, there is a solution that is simple, elegant and wrong.”

    That monstrosity matches per the RFCs. I'd use it unless profiling of the actual application indicated a faster naive solution would be worth the potential for error.

