
Thread: Simplify code?

  1. #1
    New Coder
    Join Date
    Jan 2009
    Posts
    91
    Thanks
    17
    Thanked 1 Time in 1 Post

    Simplify code?

    This is my first attempt at scraping a webpage for specific data. I got the job done; however, my code is probably way off. Can anyone give me some pointers on how to improve it?

    PHP Code:
    <?php
    class guild_exp
    {
        public $ch;     // holds our cURL handle
        public $html;   // holds the resulting HTML data
        public $binary; // flag for binary transfers
        public $url;    // holds the URL to be downloaded

        public function __construct()
        {
            $this->html = "";
            $this->binary = 0;
            $this->url = "";
        }

        public function fetchPage($url)
        {
            $this->url = $url;
            if (isset($this->url)) {
                $this->ch = curl_init(); // open a cURL instance
                curl_setopt($this->ch, CURLOPT_RETURNTRANSFER, 1); // return the data instead of printing it
                curl_setopt($this->ch, CURLOPT_URL, $this->url); // set the URL to download
                curl_setopt($this->ch, CURLOPT_FOLLOWLOCATION, false); // do not follow redirects
                curl_setopt($this->ch, CURLOPT_BINARYTRANSFER, $this->binary); // tells cURL if the data is binary or not
                $this->html = curl_exec($this->ch); // pull the webpage from the internet
                curl_close($this->ch); // close the connection
            }
        }

        public function parse_array($beg_tag, $close_tag)
        {
            // ungreedy, case-insensitive match of everything between the two tags
            preg_match_all("($beg_tag.*$close_tag)siU", $this->html, $matching_data);
            return $matching_data[0];
        }
    }

    $myspider = new guild_exp();
    $myspider->fetchPage("http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=657&server=168");
    $exparray = $myspider->parse_array("<DIV", "</DIV>");

    foreach ($exparray as $value) {

    }
    $experience = explode(':', $exparray[38]);
    $divexp = explode('/', $experience[1]);
    $finalexp = explode('<', $divexp[1]);
    $currentexp = $divexp[0];
    $neededexp = $finalexp[0];
    echo "Needed Experience:  $neededexp<br />";
    echo "Current Experience:  $currentexp<br />";
    // share of the needed experience earned so far, as a percentage
    $total = ($currentexp / $neededexp) * 100;
    echo "Percentage Complete:  $total";

    ?>

  • #2
    Regular Coder Iszak
    Join Date
    Jun 2007
    Location
    Perth, Western Australia
    Posts
    332
    Thanks
    2
    Thanked 58 Times in 57 Posts
    Erm, this is a solution but there are some things I should mention.
    a) It doesn't use cURL as I couldn't be bothered to include that.
    b) It uses regular expressions, where many would say you should use an HTML parser.
    c) It is not in class form.

    PHP Code:
    <?php

    $html = file_get_contents('http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=657&server=168');

    preg_match('#<div class="guild-progress-desc">Guild Rank: (\d+)/(\d+)</div>#i', $html, $match);

    // Assumes the page shows "current/needed", as the explode() code in the first post does
    $current = $match[1];
    $needed  = $match[2];
    $percent = round($current / $needed * 100);

    echo "Needed Experience: {$needed}<br>\n";
    echo "Current Experience: {$current}<br>\n";
    echo "Percentage Complete: {$percent}<br>\n";
    Oh, as for pointers: again, most people would probably say to use an HTML parser rather than regular expressions and the explode / split functions. They're probably right, but HTML markup is often broken, so I suggest using regular expressions.
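For comparison with the regular-expression approach, here is a minimal sketch of the HTML-parser route using PHP's bundled DOM extension (DOMDocument and DOMXPath). It is only an illustration: the helper name extract_progress() and the sample markup are made up, the class name guild-progress-desc is taken from the pattern above, and the real page may be structured differently.

```php
<?php
// Sample markup standing in for the downloaded page. The class name comes
// from the regular expression above; the numbers are invented.
$html = '<html><body><div class="guild-progress-desc">Guild Rank: 1200/4800</div></body></html>';

// Hypothetical helper: returns array(first number, second number), or null
// when the element or the "x/y" text is not found.
function extract_progress($html)
{
    $doc = new DOMDocument();

    // Collect libxml warnings about sloppy markup instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//div[@class="guild-progress-desc"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // A small regex is still handy for the text inside the element
    $text = $nodes->item(0)->textContent;
    if (!preg_match('#(\d+)\s*/\s*(\d+)#', $text, $match)) {
        return null;
    }
    return array((int) $match[1], (int) $match[2]);
}

$progress = extract_progress($html);
if ($progress !== null) {
    echo "First: {$progress[0]}, second: {$progress[1]}\n";
}
```

The parser copes with attribute order and extra whitespace changing, which a literal pattern does not; the trade-off is a little more code.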
    Last edited by Iszak; 03-01-2009 at 09:23 AM. Reason: Removed ungreedy parameter

  • Users who have thanked Iszak for this post:

    Hayyel (03-01-2009)

  • #3
    New Coder
    Join Date
    Jan 2009
    Posts
    91
    Thanks
    17
    Thanked 1 Time in 1 Post
    Certainly a lot simpler. Preg_match is something I am unfamiliar with as I am new to arrays and such.

    Thanks for your help.

    Is there a good place to look up code to add so the script only runs if X amount of time has passed since it last ran?
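One common way to run the download only when a certain amount of time has passed is to save the page to a file and compare that file's modification time with the current time. A minimal sketch, assuming a writable cache file; fetch_cached() and its parameters are hypothetical names, not an existing PHP function.

```php
<?php
// Hypothetical helper: returns the cached copy if it is younger than
// $maxAge seconds, otherwise calls $fetch and stores the fresh result.
function fetch_cached($cacheFile, $maxAge, $fetch)
{
    clearstatcache(); // make sure filemtime() is not served from PHP's stat cache

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        return file_get_contents($cacheFile);
    }

    $data = call_user_func($fetch);
    if ($data !== false) {
        // Only overwrite the cache when the download succeeded
        file_put_contents($cacheFile, $data);
    }
    return $data;
}

// Usage sketch (the cache path and the one-hour limit are assumptions):
// $html = fetch_cached('/tmp/guild_info.html', 3600, function () {
//     return file_get_contents('http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=657&server=168');
// });
```

If the script is run from a scheduler such as cron instead of by page visitors, the time check can live in the schedule itself and the cache file becomes optional.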

  • #4
    New Coder
    Join Date
    Jan 2009
    Posts
    91
    Thanks
    17
    Thanked 1 Time in 1 Post
    I have run into a problem. Using the following code I get an error.

    PHP Code:
    <?php

    $html = file_get_contents('http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=657&server=168');

    preg_match('#<div id="rank" class="rankbar-full" style="width:(\d+)24(\d+)%" onmousemove="hoverFollow(event,(this.id+'-hover'));" onmouseout="hoverFollow(event,(this.id+'-hover'));"></div>#i', $html, $match);

    echo "$match[1]";
    ?>
    How do I get this done, since I get an error with the -hover? At least that's what I think is causing the error.
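The error most likely comes from the single quotes around -hover: the pattern itself is a single-quoted PHP string, so the first inner quote ends the string early. Inner single quotes need a backslash, and the parentheses and + in the JavaScript would also be read as regex syntax unless escaped. A minimal sketch of one way to handle both, using preg_quote(); the $sample markup is an assumption based on the pattern in the post, not the real page source.

```php
<?php
// The literal attribute value to look for. Double quotes are used for the
// PHP string here so the single quotes inside need no escaping.
$attr = "hoverFollow(event,(this.id+'-hover'));";

// preg_quote() escapes regex metacharacters such as ( ) + and . and, with
// the second argument, the # delimiter as well.
$pattern = '#onmousemove="' . preg_quote($attr, '#') . '"#i';

// Assumed sample markup. Inside a single-quoted PHP string, a literal
// single quote is written as \'
$sample = '<div id="rank" onmousemove="hoverFollow(event,(this.id+\'-hover\'));"></div>';

$found = preg_match($pattern, $sample);
echo $found === 1 ? "matched\n" : "no match\n";
```

In practice it is usually simpler to skip such attributes altogether with something like [^>]* instead of matching them literally.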

  • #5
    Regular Coder Iszak
    Join Date
    Jun 2007
    Location
    Perth, Western Australia
    Posts
    332
    Thanks
    2
    Thanked 58 Times in 57 Posts
    PHP Code:
    <?php

    $html = file_get_contents('http://realmwar.warhammeronline.com/realmwar/GuildInfo.war?id=657&server=168');

    preg_match('#<div id="rank" class="rankbar-rank"[^>]*>(\d+)</div>#i', $html, $match);

    echo $match[1];

  • Users who have thanked Iszak for this post:

    Hayyel (03-04-2009)

  • #6
    New Coder
    Join Date
    Jan 2009
    Posts
    91
    Thanks
    17
    Thanked 1 Time in 1 Post
    Thanks again for your help. I'll definitely have to find a good regex resource.

