  1. #1
    Senior Coder
    Join Date
    May 2006
    Posts
    1,673
    Thanks
    28
    Thanked 4 Times in 4 Posts

    What is the best way to turn a local link into a full URL?

    I am using curl and DOMDocument
    to extract the links from my website.

    This is my script:

    PHP Code:
    <?php
    require("my_functions.php");

    $target_url = "http://www.support-focus.com/customer-service-software.html";
    $userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

    echo "<br>Starting<br>Target_url: $target_url";

    // make the cURL request to $target_url
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_URL, $target_url);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
    $page = curl_exec($ch);
    if (!$page) {
        echo "<br />cURL error number:" . curl_errno($ch);
        echo "<br />cURL error:" . curl_error($ch);
        exit;
    }

    // parse the html into a DOMDocument
    $doc = new DOMDocument();
    $doc->loadHTML($page);

    //echo $doc->saveHTML();

    $params = $doc->getElementsByTagName('a'); // find the <a> elements
    foreach ($params as $param) { // go through each link one by one
        // print the href attribute of the current link
        echo "Section Attribute :-> " . $param->getAttribute('href') . "<br>";
    }
    ?>
    As you can see the target page is this one:

    customer service software

    and the output is:

    Starting
    Target_url: http://www.support-focus.com/custome...-software.html

    Section Attribute :-> index.php
    Section Attribute :-> works.php
    Section Attribute :-> pricing.php
    Section Attribute :-> special.php
    Section Attribute :-> contact.php
    Section Attribute :-> login.php
    Section Attribute :-> Customer-Service-Software.php
    Section Attribute :-> articles.php
    Section Attribute :-> Why-Get-An-Internet-Security-Seal.php
    Section Attribute :-> The-Fantastic-Return-on-Investment-from-Trust-Seals.php
    Section Attribute :-> Turn-Browsers-Into-Buyers-Increase-Your-Sales-Conversion.php
    Section Attribute :-> Selecting-The-Best-Trust-Seal-To-Boost-Your-Sales-Conversions.php
    Section Attribute :-> Give-Great-Customer-Service-And-Get-A-Trust-Seal-to-Prove-It.php
    Section Attribute :-> Customer-Service-Software-Solutions-For-Online-Business.php
    Section Attribute :-> 73-Per-Cent-Of-Buyers-Abort-Their-Purchases-How-To-Change-It.php
    Section Attribute :-> Why-Are-Your-Visitors-Not-Buying-Your-Products.php
    Section Attribute :-> http://www.support-focus.com/index.php
    Section Attribute :-> http://www.support-focus.com/special.php
    Section Attribute :-> terms.php
    Section Attribute :-> privacy.php
    Section Attribute :-> earnings_disclaimer.php
    Section Attribute :-> articles.php

    It works quite well, but some of the links are local and some are full URLs.

    Given the code I am already using, what is the best way to get
    all these links shown as complete URLs?

    Is there a DOMDocument method to do this?

    I also want to extract and store the website address,
    i.e. just the "www.support-focus.com" part.
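
    For the last point, a minimal sketch using PHP's built-in parse_url() may be enough. It assumes the target URL from the script above; $host and $site_root are hypothetical variable names introduced only for this illustration.

```php
<?php
// Target URL copied from the script in this post.
$target_url = "http://www.support-focus.com/customer-service-software.html";

// parse_url() with PHP_URL_HOST returns only the host component of the URL.
$host = parse_url($target_url, PHP_URL_HOST);

// Hypothetical extra: scheme plus host, which is handy as a prefix for local links.
$site_root = parse_url($target_url, PHP_URL_SCHEME) . "://" . $host;

echo $host . "\n";      // www.support-focus.com
echo $site_root . "\n"; // http://www.support-focus.com
```

    Storing the scheme together with the host keeps the value directly usable when building full URLs later.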

  • #2
    Senior Coder tomws's Avatar
    Join Date
    Nov 2007
    Location
    Arkansas
    Posts
    2,644
    Thanks
    29
    Thanked 330 Times in 326 Posts
    All your code is doing is pulling hrefs. Since those links are not coded as full URLs, you'll need to add that yourself. Perhaps try something like a preg_match that looks for http://. If it's not there, concatenate the domain with the href result.
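
    A rough sketch of that suggestion, under the assumption that every local link is relative to the site root (which matches the output listed above). prefix_if_local() is a hypothetical helper name, not part of PHP or of the original script.

```php
<?php
// Hypothetical helper: return $href unchanged when it already starts with
// http:// or https://, otherwise prepend the site root to it.
function prefix_if_local(string $href, string $site_root): string
{
    if (preg_match('#^https?://#i', $href)) {
        return $href; // already a full URL
    }
    // Normalise the slashes so exactly one separates the root and the path.
    return rtrim($site_root, '/') . '/' . ltrim($href, '/');
}

$site_root = 'http://www.support-focus.com'; // assumed from the target URL in post #1

echo prefix_if_local('works.php', $site_root) . "\n";
echo prefix_if_local('http://www.support-focus.com/index.php', $site_root) . "\n";
```

    This deliberately ignores other cases such as mailto: links, "#" fragments, and "../" paths, so it is only a starting point.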

  • #3
    Senior Coder
    Join Date
    May 2006
    Posts
    1,673
    Thanks
    28
    Thanked 4 Times in 4 Posts
    Yes, I realize that I can do it with preg_match.

    It could also be done with strpos and substr - but it would be a bit messy.

    But I just thought that if there is something in the DOM class that can do the job then it may be quicker and more efficient.

    So in this case, is preg_match the most efficient method?
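
    As far as I am aware, DOMDocument has no method that rewrites href values into full URLs, so the resolving has to happen in user code either way. Below is a sketch of the strpos/substr route combined with parse_url(). resolve_href() is a hypothetical helper name. It assumes the page URL has a scheme and host, and it does not handle ports, "../" segments, or bare "#fragment" links.

```php
<?php
// Hypothetical helper: resolve $href against the URL of the page it was found on.
function resolve_href(string $href, string $page_url): string
{
    // A leading scheme (http:, https:, mailto:, ...) means the link is already absolute.
    if (preg_match('#^[a-z][a-z0-9+.-]*:#i', $href)) {
        return $href;
    }

    $parts = parse_url($page_url);
    $root  = $parts['scheme'] . '://' . $parts['host'];
    $path  = $parts['path'] ?? '/';

    // Protocol-relative link such as //host/file: reuse the page's scheme.
    if (strpos($href, '//') === 0) {
        return $parts['scheme'] . ':' . $href;
    }
    // Root-relative link such as /terms.php: append it to scheme and host.
    if (strpos($href, '/') === 0) {
        return $root . $href;
    }
    // Otherwise the link is relative to the directory of the current page.
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $href;
}

$page_url = 'http://www.support-focus.com/customer-service-software.html';

echo resolve_href('works.php', $page_url) . "\n";
echo resolve_href('http://www.support-focus.com/special.php', $page_url) . "\n";
```

    For a couple of dozen links per page, the difference in speed between preg_match and the string functions is unlikely to be noticeable next to the cURL request itself, so readability is probably the better criterion here.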

