  1. #1
    Regular Coder Coastal Web's Avatar
    Join Date
    Oct 2004
    Posts
    225
    Thanks
    12
    Thanked 3 Times in 3 Posts

    Fetching urls from a string (hlp pls)

    Greetings everyone,
    I'm trying to write a script but I've run into a roadblock. After searching Google for a bit I wasn't able to turn up anything that really helped (I also searched the forum here to see if this has been asked before).

    What I'm trying to do is create a function that will go through a string, extract all the URLs from the string, and return them as an array.

    For instance....

    <?php

    $str = <<<end
    Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. http://www.test.com/somefile.php Ut wisi enim ad minim veniam, quis nostrud exercitation ulliam corper suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem veleum iriure dolor http://www.domain.com/files/deep/lin...d=123&user=123 in hendrerit in vulputate velit esse molestie consequat, vel willum lunombro dolore eu feugiat nulla facilisis at vero http://www.google.com/ eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.
    end;

    // How would I create a function that goes through a string passed to it (similar to the string above), fetches all of the URLs within that string (if any), and returns those URLs in an array? In this case three URLs would be returned in the array:
    // http://www.test.com/somefile.php
    // http://www.domain.com/files/deep/lin...d=123&user=123
    // http://www.google.com/


    ?>

    If anyone would be willing to help me out with this, it would be greatly appreciated.

    Thanks so much,

  • #2
    Senior Coder
    Join Date
    Mar 2003
    Location
    Atlanta
    Posts
    1,037
    Thanks
    14
    Thanked 30 Times in 28 Posts
    Hmmm, I'm guessing it'll be easiest to use some type of regular expression. I've never written one on my own, only customized some. But I'm guessing if you look for "http://" you will know that it starts the URL. (This assumes that all URLs are full like in your example and not www.example.com.) From there, we know that a URL is not going to have a space in it, so we can look for the first space and know that will be the end of the URL.

    As I've stated though, this is only my hypothesis, but it seems logical, to me anyway. There may be a better way.
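
For anyone who wants to try the idea above without a regular expression, here is a minimal sketch: find each "http://", then cut at the next whitespace character. The function name find_http_urls() is made up for this example, and it deliberately shares the limitations already mentioned (it ignores bare www. addresses, and any punctuation stuck to the end of a URL stays attached).

```php
<?php
// Hypothetical helper (name not from the thread): collects every
// "http://..." run that ends at the next whitespace character.
function find_http_urls(string $text): array
{
    $urls = [];
    $offset = 0;
    // strpos() returns false once no further "http://" marker exists.
    while (($start = strpos($text, 'http://', $offset)) !== false) {
        // strcspn() counts the characters from $start up to the first
        // space, tab, or line break (or the end of the string).
        $length = strcspn($text, " \t\r\n", $start);
        $urls[] = substr($text, $start, $length);
        // Continue searching after the URL just collected.
        $offset = $start + $length;
    }
    return $urls;
}

print_r(find_http_urls('See http://www.google.com/ and http://www.test.com/somefile.php now'));
```

Run on that sample sentence, it should return the two URLs in the order they appear.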
    Most of my questions/posts are fairly straightforward and simple. I post long verbose messages in an attempt to be thorough.

  • #3
    Super Moderator Inigoesdr's Avatar
    Join Date
    Mar 2007
    Location
    Florida, USA
    Posts
    3,647
    Thanks
    2
    Thanked 406 Times in 398 Posts
    PHP Code:
    <?php
    $str = <<<end
    Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat. http://www.test.com/somefile.php Ut wisi enim ad minim veniam, quis nostrud exercitation ulliam corper suscipit lobortis nisl ut aliquip ex ea commodo consequat. Duis autem veleum iriure dolor http://www.domain.com/files/deep/link.php?id=123&user=123 in hendrerit in vulputate velit esse molestie consequat, vel willum lunombro dolore eu feugiat nulla facilisis at vero http://www.google.com/ eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.
    end;

    $matches = array();
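    // Collect every URL-like substring; $matches[0] holds the full matches.
    preg_match_all('/((www|http)(\W+\S+[^).,:;?\]\} \r\n$]+))/i', $str, $matches);
    // Print the matched URLs and stop.
    die('<pre>' . print_r($matches[0], 1));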
    ?>
    Returns:
    Code:
    Array
    (
        [0] => http://www.test.com/somefile.php
        [1] => http://www.domain.com/files/deep/link.php?id=123&user=123
        [2] => http://www.google.com/
    )
    http://www.php.net/manual/en/functio...-match-all.php
    http://regexlib.com/Search.aspx?k=url
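
Since the original question asked for a function that returns the URLs as an array, the same preg_match_all() call could be wrapped roughly like this. It is only a sketch: the name extract_urls() is an assumption for illustration, and the pattern is simply the one from this post, so it carries the same quirks (for example, it strips trailing punctuation such as a comma or full stop).

```php
<?php
// Hypothetical wrapper (name not from the thread) around the pattern
// posted above; returns the matched URLs as a plain array.
function extract_urls(string $text): array
{
    $pattern = '/((www|http)(\W+\S+[^).,:;?\]\} \r\n$]+))/i';
    $matches = [];
    preg_match_all($pattern, $text, $matches);
    // Index 0 holds the full matches; fall back to an empty array.
    return $matches[0] ?? [];
}

$urls = extract_urls('Visit http://www.test.com/somefile.php, then http://www.google.com/ today.');
print_r($urls);
```

For that sample sentence the array should contain the two URLs without the trailing comma.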

  • #4
    Senior Coder
    Join Date
    Mar 2003
    Location
    Atlanta
    Posts
    1,037
    Thanks
    14
    Thanked 30 Times in 28 Posts
    Great job, Inigoesdr. I was just trying this out. I was thinking it would be preg_match_all(); I was just unsure of the pattern. I see that yours will even get the ones that start with www. But that's as far as my understanding goes. I know that the $ is the end of the expression. I'm going to see if I can decipher what this means.

    And, actually I haven't seen you around in a second, so if you're just returning....Welcome Back.

  • #5
    Regular Coder Coastal Web's Avatar
    Join Date
    Oct 2004
    Posts
    225
    Thanks
    12
    Thanked 3 Times in 3 Posts
    Yes, thank you very much Inigoesdr; that does the trick perfectly.

    Now I have a smaller side question that kind of ties in with the script I'm working on here...

    Let's say that once I have these URLs, I want my server to "hit" or "visit" each of the URLs that have been snagged from the string as though it were a browser (via HTTP). What is the fastest, least server-intensive way to do this?

    Would it be with a quick curl connection, using file_get_contents(), maybe fopen(), or perhaps using the snoopy class (http://sourceforge.net/projects/snoopy/)?

    What would be the fastest, least resource-intensive method?

    Thoughts and suggestions on this are appreciated...

  • #6
    Super Moderator Inigoesdr's Avatar
    Join Date
    Mar 2007
    Location
    Florida, USA
    Posts
    3,647
    Thanks
    2
    Thanked 406 Times in 398 Posts
    I would say fopen/file_get_contents.
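
A rough sketch of the file_get_contents() route, for illustration only: the function name visit_urls() and the five-second timeout are assumptions, not something from this thread. It requests each URL once and records whether the request succeeded. Note that fetching http:// URLs this way requires allow_url_fopen to be enabled in php.ini.

```php
<?php
// Hypothetical helper (name and default timeout are assumptions):
// requests each URL once and reports success or failure per URL.
function visit_urls(array $urls, int $timeoutSeconds = 5): array
{
    // The 'http' options apply to http:// and https:// requests.
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'timeout' => $timeoutSeconds],
    ]);

    $results = [];
    foreach ($urls as $url) {
        // @ suppresses the warning raised on failure; the return value
        // is checked explicitly instead.
        $body = @file_get_contents($url, false, $context);
        $results[$url] = ($body !== false);
    }
    return $results;
}
```

The returned array maps each URL to true or false, so it could be fed directly with the array of URLs pulled out of the string earlier in the thread.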

  • #7
    Regular Coder Coastal Web's Avatar
    Join Date
    Oct 2004
    Posts
    225
    Thanks
    12
    Thanked 3 Times in 3 Posts
    Thanks again... I'll do some speed testing with the different options and post the results (if anyone really gives a hoot...)

    Warm regards, and thank you again Inigoesdr; you rock!
    //too bad codingforms.com doesn't have karma points :S

  • #8
    Super Moderator Inigoesdr's Avatar
    Join Date
    Mar 2007
    Location
    Florida, USA
    Posts
    3,647
    Thanks
    2
    Thanked 406 Times in 398 Posts
    You're welcome. And I think the scale under the avatar is for rep.
    Quote, originally posted by StupidRalph:
    And, actually I haven't seen you around in a second, so if you're just returning....Welcome Back.
    Thanks.
    Last edited by Inigoesdr; 08-12-2007 at 04:39 AM.

