  1. #1
    Mega-ultimate member
    Join Date
    Jun 2002
    Location
    Winona, MN - The land of 10,000 lakes
    Posts
    1,855
    Thanks
    1
    Thanked 45 Times in 42 Posts

    RSS newsfeed and caching the content

    I'm signing up for various newsfeeds, and the providers keep telling me that I should cache the content for a couple of hours before making another request to their server.

    I'm currently using the code I posted in this thread.

    Is there a way to cache the content? Or do I need to write a function that
    1) checks when the last request was made, and
    2) if the last request was more than 2 hours ago, gets the RSS feed and updates the last-request time; otherwise reads an XML file stored from the last successful request?

    I'd really like to find some way to cache the content, rather than write a new function to handle the above. Taking the Larry Wall approach to this one.

  • #2
    Regular Coder
    Join Date
    Jun 2002
    Location
    UK
    Posts
    577
    Thanks
    0
    Thanked 0 Times in 0 Posts
    The way I run my RSS pulls is through a database of feeds, each with a retry duration. Cron runs the script every 30 minutes and updates the feeds whose retry period has expired, which it checks with a simple filemtime call.

    I save the parsed XML as HTML blocks on my server so I don't have to process it twice (I'm not overly au fait with running stylesheeted XML, so I opted for plain HTML).
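The retry check described above might look roughly like the sketch below. It is only an illustration: the function name `feeds_due`, the `file`/`retry` array keys, and the use of a plain array in place of the database are all assumptions, not the poster's actual code.

```php
<?php
// Hypothetical helper (not from the original post): pick the feeds whose
// cached file is older than that feed's retry duration.
// Each feed is array('file' => cache filename, 'retry' => seconds between pulls).
function feeds_due(array $feeds, string $cache_dir, int $now): array
{
    clearstatcache(); // make sure filemtime() is not served from PHP's stat cache
    $due = array();
    foreach ($feeds as $feed) {
        $path = $cache_dir . $feed['file'];
        // A feed with no cache file yet always counts as due.
        if (!file_exists($path) || ($now - filemtime($path)) >= $feed['retry']) {
            $due[] = $feed;
        }
    }
    return $due;
}
```

A cron-driven script could call `feeds_due($feeds, $save_folder, time())` on each run and re-fetch only the feeds it returns, so a 30-minute cron interval never hits a remote server more often than that feed's retry duration allows.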

    A simplified version of the file I use is...

    PHP Code:
    <?php
    class rss_compile
        {
        var $ct = ""; // current tag
        var $f = 0; // on/off flag
        var $ry = array(); // output array
        var $dx = -1; // array index

        function strt($pa, $n, $a="")
            {
            // check current tag - if 'item' set flag and increment index
            $this->ct = $n;
            if($n == "ITEM")
                {
                $this->f = 1;
                $this->dx++;
                }
            }
        function dat($pa, $d)
            {
            // get data from line if within 'item' block
            if($this->f == 1)
                {
                $d = trim(strip_tags($d));
                $this->ry[$this->dx][strtolower($this->ct)] .= $d;
                }
            }
        function nd($pa, $n, $a="")
            {
            // reset flag when exiting 'item' block
            if($n == "ITEM")
                {
                $this->f = 0;
                }
            }
        function set_params($file, $sc)
            {
            $this->file = $file;
            $this->sc = ($sc == "yes") ? "yes" : "no";
            $this->xp = xml_parser_create();
            xml_set_object($this->xp, $this);
            xml_set_element_handler($this->xp, "strt", "nd");
            xml_set_character_data_handler($this->xp, "dat");
            xml_parser_set_option($this->xp, XML_OPTION_CASE_FOLDING, TRUE);
            xml_parser_set_option($this->xp, XML_OPTION_SKIP_WHITE, TRUE);
            if (!($fp = fopen($this->file, "r")))
                {
                die("Could not read $this->file");
                }
            while ($xr = fread($fp, 4096))
                {
                if (!xml_parse($this->xp, $xr, feof($fp)))
                    {
                    // log the parse error - mysql_query() needs the old mysql extension (removed in PHP 7)
                    $log_error = mysql_query("INSERT INTO `glitches` (glitch_id, glitch_ref, glitch_notes) VALUES('','RSS PARSING ERROR - ".$this->file."','".xml_error_string(xml_get_error_code($this->xp))."')");
                    }
                }
            xml_parser_free($this->xp);
            return $this->ry;
            }
        }

    // the editable stuff
    //
    //
    // array        remote file, savename, header text

    $rss_data = array();
    $rss_data[] = array('http://.................news.rss', 'nme.inc', 'News from NME');
    $rss_data[] = array('http://.................more.rdf', 'melodymaker.inc', 'News from Melody Maker');

    $save_folder = '/home/site/www/folder/rss_files/';

    foreach($rss_data AS $rf)
        {
        $file_ref = $save_folder.$rf[1];
        // add a filemtime test on $file_ref here if needed

        $n = new rss_compile;
        $oup = $n->set_params($rf[0], "yes");

        // some output html
        $rss_string = '<span class="rss_title">'.$rf[2].'</span><br /><br />';

        for ($x=0; $x<5; $x++)
            {
            if(trim($oup[$x]['title']) !== "")
                {
                // more html - note image and span classes - amend to suit
                $rss_string .= '<a href="'.$oup[$x]["link"].'" target="teck_window" class="rss_heading">'.$oup[$x]['title'].' &nbsp; <img src="images/read_story.jpg" width="24" height="14" alt="Read Story" border="0" /></a><br /><span class="rss_description">'.$oup[$x]['description'].'</span><br /><br />';
                }
            }

        // mini news - top 5 stories max
        if(!file_exists($save_folder.'mini_'.$rf[1]))
            {
            touch($save_folder.'mini_'.$rf[1]);
            chmod($save_folder.'mini_'.$rf[1], 0777);
            }
        $fo = fopen($save_folder.'mini_'.$rf[1], "w");
        fwrite($fo, $rss_string);
        fclose($fo);

        $cx = count($oup);
        for ($x=5; $x<$cx; $x++)
            {
            if(trim($oup[$x]['title']) !== "")
                {
                // more html - same as before
                $rss_string .= '<a href="'.$oup[$x]["link"].'" target="teck_window" class="rss_heading">'.$oup[$x]['title'].' &nbsp; <img src="images/read_story.jpg" width="24" height="14" alt="Read Story" border="0" /></a><br /><span class="rss_description">'.$oup[$x]['description'].'</span><br /><br />';
                }
            }

        // full feed saving - all stories
        if(!file_exists($save_folder.$rf[1]))
            {
            touch($save_folder.$rf[1]);
            chmod($save_folder.$rf[1], 0777);
            }
        $ff = fopen($save_folder.$rf[1], "w");
        fwrite($ff, $rss_string);
        fclose($ff);
        }
    ?>
    You could either just cron it every hour or so, or add a few features to run it based on each file's last-modified time. The feed list shown is stored in a simple array, which I'm sure you can amend to suit your needs.
    Note: it saves two files per feed, one with the top five stories (prefix 'mini_') and one with all stories.
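For the on-demand variant asked about in the first post (reuse the stored copy unless it is more than a couple of hours old), a minimal sketch could look like this. The function name `cached_feed`, its parameters, and the idea of passing the fetch step in as a callable are assumptions made for illustration; it is not part of the script above.

```php
<?php
// Hypothetical read-through cache (an illustration, not the poster's code).
// $fetch is any callable that returns the feed body as a string,
// or false on failure.
function cached_feed(string $cache_file, int $max_age, callable $fetch): string
{
    clearstatcache(); // avoid a stale filemtime() from PHP's stat cache
    $have_cache = file_exists($cache_file);

    // Cached copy is still fresh: serve it without touching the remote server.
    if ($have_cache && (time() - filemtime($cache_file)) < $max_age) {
        return (string) file_get_contents($cache_file);
    }

    // Cache is missing or too old: fetch, then store the new copy.
    $body = $fetch();
    if ($body !== false && $body !== '') {
        file_put_contents($cache_file, $body, LOCK_EX); // writing resets the mtime
        return $body;
    }

    // Fetch failed: fall back to the stale copy if one exists.
    return $have_cache ? (string) file_get_contents($cache_file) : '';
}
```

With `$max_age` set to 7200, the remote feed is requested at most once every two hours per cache file, and a failed request leaves the previous copy in place. The callable could simply wrap `file_get_contents()` on the feed URL.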
    Ökii - formerly pootergeist
    teckis - take your time and it'll save you time.

  • #3
    New to the CF scene
    Join Date
    Jul 2003
    Posts
    8
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Can you help?

    Ökii, how could I adapt your code to my needs? I want a program (a robot, I guess) that would pull content from URLs that I input and then embed that content on my site. Is this possible? Can you help?

