  1. #1
    Senior Coder
    Join Date
    May 2006
    Posts
    1,673
    Thanks
    28
    Thanked 4 Times in 4 Posts

    New to classes: how do I use this class?

    Hi,

    I want to use this class to extract links from my site
    but I am not sure how to use it.

    I need to pass the URL to the class.

    This is the class:

    PHP Code:
    <?php
    class Reader
    {
        var $buf;
        var $ix;
        var $list;

        function Reader()
        {
            $this->list = array();
        }

        function grab($site)
        {
            $this->buf = file_get_contents($site);
            $this->buf = strtolower($this->buf);
            $len = strlen($this->buf);
            $start = 0;

            while( $start < $len )
            {
                $start = strpos($this->buf, "<a href="http:", $start );
                if( $start == false )
                    break;

                $start = $start + 1;
                $end = strpos($this->buf, "</a>", $start );

                $ln = $this->getSection($this->buf, $start, $end );
                $fln = "<" . $ln;

                array_push($this->list, $fln);

                $start = $end+1;
            }
        }

        function getSection( $buf, $start, $end )
        {
            if( $start > strlen($buf))
                return false;

            if( $end > strlen($buf))
                return false;

            for( $i=$start; $i<$end; $i++ )
            {
                $result .= $buf[$i];
            }

            return $result;
        }

        //get array contents
        function results()
        {
            return $this->list;
        }
    }

    ?>
    I assume that I start off by instantiating a new object:


    PHP Code:
    <?php 
    require("Reader.class.php");

    $reader = new Reader();
    $url = "http://www.my-site.com/";
    From here, what do I do?

  • #2
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    This line needs to be fixed first:
    PHP Code:
    $start = strpos($this->buf, "<a href="http:", $start );
    The inner double quote has to be escaped, so it should be:
    PHP Code:
    $start = strpos($this->buf, "<a href=\"http:", $start );
    Then it is used like this:
    PHP Code:
    <?php 
    require("Reader.class.php");

    $reader = new Reader();
    $url = "http://www.my-site.com/";
    $reader->grab($url);
    print_r($reader->results());
    $reader->results() returns an array of the links that were found. This is a PHP 4 style class, and the same job can probably be done more easily with pattern matching, but this should work.
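For what it's worth, here is a rough sketch of how the same class might look in PHP 5+ style, assuming the goal is unchanged: collect every `<a href="http:...">` fragment up to (but not including) its closing tag. The names `LinkReader` and `grabFromHtml()` are made up for illustration and are not part of the original class. The sketch compares `strpos()` results against `false` with `!==` / `===`, because a match at offset 0 would otherwise be treated as "not found".

```php
<?php
// Hypothetical PHP 5+ rewrite of the Reader class from the first post.
// LinkReader and grabFromHtml() are illustrative names only.
class LinkReader
{
    private $list = array();

    // Fetch a page and collect its links; returns false if the fetch fails.
    public function grab($site)
    {
        $html = file_get_contents($site);
        if ($html === false) {
            return false;
        }
        $this->grabFromHtml($html);
        return true;
    }

    // Collect each '<a href="http:...">text' fragment from an HTML string.
    public function grabFromHtml($html)
    {
        $buf = strtolower($html); // same lowercasing as the original class
        $start = 0;
        while (($start = strpos($buf, '<a href="http:', $start)) !== false) {
            $end = strpos($buf, '</a>', $start);
            if ($end === false) {
                break; // unterminated anchor, stop scanning
            }
            $this->list[] = substr($buf, $start, $end - $start);
            $start = $end + 1;
        }
    }

    // Return the collected fragments.
    public function results()
    {
        return $this->list;
    }
}
```

Usage stays the same as above: create the object, call `grab($url)`, then read `results()`. Like the original, this only sees links where `href` directly follows `<a ` and the URL starts with `http:`.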
    PHP Code:
    header('HTTP/1.1 420 Enhance Your Calm'); 

  • #3
    Senior Coder
    Join Date
    May 2006
    Posts
    1,673
    Thanks
    28
    Thanked 4 Times in 4 Posts
    Thanks for your reply.

    I have been reading about cURL, and it seems that using
    cURL may be faster or "better" than using file_get_contents($site).

    So if I want to update this whole script, I guess that
    I can start off with:

    PHP Code:
    <?php
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'http://www.my-site.com');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $page = curl_exec($ch);
    curl_close($ch);

    if (preg_match_all("/<a href=\"http:\"(.*?)\".*?>(.*?)<\/a>/i", $page, $matches)) {
        print_r($matches);
    }
    ?>
    I am not sure that I have done the pattern correctly.

    Also, is it OK to use the string output $page in this way?

    Thanks for helping.


    When I ran this script I got no output.
    Last edited by jeddi; 10-22-2009 at 02:01 PM.

  • #4
    God Emperor Fou-Lu's Avatar
    Join Date
    Sep 2002
    Location
    Saskatoon, Saskatchewan
    Posts
    16,987
    Thanks
    4
    Thanked 2,660 Times in 2,629 Posts
    Yes, that would be correct since you've used the return transfer option. Error check it too using:
    PHP Code:
    if (false !== ($page = curl_exec($ch)))
    {
        // We got a result
    }
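
A slightly fuller version of that check might look like the sketch below. `fetchPage()` is a hypothetical helper name, and the redirect and timeout settings are my own assumptions rather than anything required here; the cURL calls themselves are the standard ones from the PHP cURL extension.

```php
<?php
// fetchPage() is a hypothetical helper, not code from the thread.
// Returns the page body as a string, or false on failure; on failure
// $error receives the message reported by cURL.
function fetchPage($url, &$error = null)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // assumption: follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumption: give up after 10 seconds

    $page = curl_exec($ch);
    if ($page === false) {
        $error = curl_error($ch); // read the error before closing the handle
    }
    curl_close($ch);

    return $page;
}
```

You would then call `$page = fetchPage('http://www.my-site.com/', $err);` and only run the pattern match when `$page !== false`.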

    As for your pattern, that depends on what you're trying to capture and what you're matching against. This will take any external link (or an internal one if you've included the entire URL in it). However, it requires that the href attribute be the first attribute, and it will not handle SSL (https) links.
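
As a hedged illustration of those two limits, the sketch below loosens the pattern so that href may come after other attributes and https is accepted. It also drops the `\"` that follows `http:` in the pattern from post #3: as written, that pattern needs a double quote directly after `http:`, which no normal link has, and that would probably explain the empty output. `extractLinks()` and the sample markup are assumptions for illustration, and it only handles double-quoted href values.

```php
<?php
// extractLinks() is a hypothetical helper, not code from the thread.
// Returns an array of array('url' => ..., 'text' => ...) for absolute
// http/https links found in double-quoted href attributes.
function extractLinks($html)
{
    // <a, then any attributes, then href="http(s)://...", then the link text.
    $pattern = '~<a(?=\s)[^>]*?\shref="(https?://[^"]*)"[^>]*>(.*?)</a>~is';

    $links = array();
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $links[] = array('url' => $m[1], 'text' => $m[2]);
        }
    }
    return $links;
}

// Sample markup (made up): two absolute links and one relative link.
$sample = '<a class="nav" href="https://example.com/a">A</a> '
        . '<a href="http://example.org/">B</a> <a href="/local">C</a>';
print_r(extractLinks($sample));
```

The relative link is skipped because the pattern insists on an http or https scheme. Regular expressions are still fragile on real-world HTML, so treat this as a starting point only.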
    I'm not a super pattern matcher, so it would take time for me to be satisfied with a pattern, but what you could do instead is match just the href attribute, and only where it is preceded by an <a> tag, using a lookbehind assertion.
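
The href-only idea could be sketched roughly as follows. As far as I know, PCRE lookbehind assertions have to be of fixed (or at least bounded) length, so a lookbehind cannot skip over an arbitrary run of attributes before href. This sketch therefore uses `\K` instead, which drops everything matched so far from the reported match and gives a similar effect. `hrefsInAnchors()` is a hypothetical name, and only double-quoted href values are handled.

```php
<?php
// hrefsInAnchors() is a hypothetical helper, not code from the thread.
// Returns the absolute http/https URLs found in href attributes of <a> tags.
function hrefsInAnchors($html)
{
    // Everything before \K must match but is left out of the result,
    // so only the URL itself ends up in $matches[0].
    $pattern = '~<a(?=\s)[^>]*?\shref="\Khttps?://[^"]+~i';
    preg_match_all($pattern, $html, $matches);
    return $matches[0];
}
```

Because the pattern starts at `<a` followed by whitespace, href attributes on other tags such as `<link>` are ignored.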

