  1. #1
    Regular Coder
    Join Date
    Apr 2009
    Posts
    135
    Thanks
    83
    Thanked 0 Times in 0 Posts

    Extracting URL Data From TXT/XML File

    I am looking to extract every URL from an XML / TXT file and keep just the parent URL (the http://host/ part) of each one.

    I tried this but it was not successful:

    (I have been posting the data from a textarea form to this code, this part is working fine):

    PHP Code:
    // If form has been posted then start processing the data
    if (isset($_REQUEST['start'])) {

        // Post the data
        $data = $_POST['data'];

        function get_tags($html) {
            $regexp = '/(http:\/\/)(.*?)(\/)/';
            preg_match_all($regexp, $html, $matches, PREG_SET_ORDER);
            foreach ($matches as $tag) {
                $tags[] = "Http://" . $tag[2] . "/";
            }
            return $tags;
        }

        if (is_array($list = get_tags($data))) {
            foreach ($list as $tag) {
                echo $tag . "\r\n";
            }
        }
        exit;
    }

    (The URLs in the txt file start with Http://, not http://.)

    Any idea how to get this working?

  • #2
    Senior Coder kbluhm's Avatar
    Join Date
    Apr 2007
    Location
    Philadelphia, PA, USA
    Posts
    1,509
    Thanks
    3
    Thanked 258 Times in 254 Posts
    Here's a quick and dirty way of extracting URLs:
    PHP Code:
    preg_match_all( '/\bhttps?\:\/\/\S+/i', $input, $urls );
    $urls[0] will be an array of every bit of text that begins with http:// or https:// and continues until it reaches whitespace or the end of the line. The i flag makes the match case-insensitive, so URLs written as Http:// are picked up too.
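To get from whole URLs to just the parent URL, one option is to pass each match through PHP's built-in parse_url(). This is only a rough sketch: the function name parent_urls() and the sample input are made up for illustration, the pattern is tightened slightly from the one above so it stops at <, > and quotes (an assumption that the file contains XML-style markup), and the scheme is lowercased rather than kept as Http://.

```php
<?php
// Hypothetical helper (not from the thread): returns the unique parent
// URLs (scheme + host + trailing slash) found in a block of text.
function parent_urls(string $input): array
{
    // Based on the pattern in the reply, but stopping at <, > and quotes
    // so that trailing XML/HTML markup is not swallowed into the match.
    // The "i" flag lets it match "Http://" as well as "http://".
    preg_match_all('/\bhttps?:\/\/[^\s<>"\']+/i', $input, $urls);

    $parents = [];
    foreach ($urls[0] as $url) {
        // parse_url() splits a URL into scheme, host, path, and so on.
        $parts = parse_url($url);
        if (empty($parts['scheme']) || empty($parts['host'])) {
            continue; // skip anything parse_url() cannot break down
        }
        $parents[] = strtolower($parts['scheme']) . '://' . $parts['host'] . '/';
    }

    // Drop duplicates and reindex from 0.
    return array_values(array_unique($parents));
}

// Made-up sample input, standing in for the posted textarea data.
$sample = "<url>Http://www.example.com/page.html</url>\n"
        . "<url>Http://www.example.com/other.html</url>";

foreach (parent_urls($sample) as $url) {
    echo $url, "\r\n";
}
```

If the original Http:// capitalisation matters, drop the strtolower() call; if duplicates should be kept, return $parents directly.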
    Last edited by kbluhm; 08-28-2009 at 05:35 PM.
