  1. #1
    Regular Coder
    Join Date
    Dec 2006
    Posts
    417
    Thanks
    168
    Thanked 1 Time in 1 Post

    checking if RSS files still exist

    Hi, I have a MySQL table that stores links to RSS/XML files around the web.

    I am trying to make a script that I can run every now and then to determine which RSS/XML files are still alive and which are dead links.

    I am having trouble making this script because my if/else logic is failing.

    I am currently using the following code:

    PHP Code:
    $sql="SELECT id,sitetitle,sitelink,rssfile,rssdescription FROM myTable ORDER BY id DESC LIMIT 500";
    $result=mysql_query($sql);
    if($result){
        while($row=mysql_fetch_array($result)){
            echo '<p>RSS Name: <a href="'.$row[sitelink].'"><span style="font-size:1.4em;">'.$row[sitetitle].'</span></a>\'s id is: '.$row[id].'</p>';
            echo '<p>- <a href="'.$row[rssfile].'"><span style="font-size:1em;">'.$row[rssfile].'</span></a></p>';
            if(file_exists($row[rssfile])){
            }else{
                echo '<p>RSS FILE DOES NOT EXIST</p>';
            }
            echo '<p>'.$row[rssdescription].'</p>';
            echo '<p><hr 80%></p>';
        }
    }

    When I run this script, "RSS FILE DOES NOT EXIST" is output for every RSS file (even the ones that do exist).

    How can I create a script that checks whether other people's RSS files are alive or not?

  • #2
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    A few steps:
    - send a HEAD request to the URL of the RSS file, which I suppose is $row['rssfile']
    - check if the answer is 200 OK; if not, the link is dead
    - if the link is alive, download the feed only if your local file is older than the remote file (or missing)
    The last step is there to avoid using bandwidth without reason, for both you and the RSS provider.
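
The first two steps above can be sketched without any extension, using PHP's `get_headers()` with a stream context that switches the request method to HEAD. This is only a rough sketch: the helper names `status_code_from_line()` and `feed_is_alive()` and the 10-second timeout are assumptions made for illustration, not something from this thread.

```php
<?php
// Hypothetical helper: extracts the numeric status code from a status line
// such as "HTTP/1.1 200 OK" or "HTTP/2 200". Returns null for any other line.
function status_code_from_line(string $statusLine): ?int
{
    if (preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Hypothetical helper: sends a HEAD request and reports whether the final
// response was 200. The http wrapper follows redirects by default, so the
// header list can contain several status lines; the last one wins.
function feed_is_alive(string $url): bool
{
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'timeout' => 10], // timeout is an assumed value
    ]);
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return false; // DNS failure, timeout, refused connection, ...
    }
    $code = null;
    foreach ($headers as $line) {
        $parsed = status_code_from_line($line);
        if ($parsed !== null) {
            $code = $parsed; // remember the status of the latest response
        }
    }
    return $code === 200;
}
```

Inside the `while` loop from the first post, this would be called as `feed_is_alive($row['rssfile'])`.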

    You still have the habit of not quoting the indexes in arrays,

    best regards

  • #3
    Regular Coder
    Join Date
    Dec 2006
    Posts
    417
    Thanks
    168
    Thanked 1 Time in 1 Post
    Hi oesxyl

    Thanks for posting. Hope all is well.

    I have never made a HEAD request before. I googled it and don't really get it. Can you please show me how?

  • #4
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    Quote Originally Posted by Bobafart View Post
    Hi oesxyl

    Thanks for posting. Hope all is well.
    You are always welcome,

    I have never made a HEAD request before. I googled it and don't really get it. Can you please show me how?
    It is something like this:

    http://www.smart-it-consulting.com/a...de=133&page=36

    best regards

  • #5
    Regular Coder
    Join Date
    Dec 2006
    Posts
    417
    Thanks
    168
    Thanked 1 Time in 1 Post
    why can't I just use if(file_exists($rssurl)) where $rssurl is the URL of the RSS file?

    why send a head request?

  • #6
    Regular Coder
    Join Date
    Dec 2006
    Posts
    417
    Thanks
    168
    Thanked 1 Time in 1 Post
    I am trying to use php's cURL to see if the RSS files exist or not.

    $rssurl is the var I use which contains the URL to each RSS file (in a while loop after querying the db):
    PHP Code:
    $ch = curl_init();

    // set url
    curl_setopt($ch, CURLOPT_URL, $rssurl);

    // return the transfer as a string
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    // $output contains the output string
    $output = curl_exec($ch);

    // close curl resource to free up system resources
    curl_close($ch);
    Now how do I test to see if the file URL exists or not using a cURL function?

  • #7
    Regular Coder
    Join Date
    Dec 2006
    Posts
    417
    Thanks
    168
    Thanked 1 Time in 1 Post
    what if I did this as my check?

    PHP Code:
    // check to see if RSS File exists
    if(curl_exec($ch) === false){
        echo 'error: file does not exist';
    }

    is this considered a good way to address the problem?
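
One caveat with the check above: `curl_exec()` returns `false` only when the transfer itself fails (DNS error, timeout, refused connection). A server that answers 404 still counts as a successful transfer, so the status code has to be read separately with `curl_getinfo()`. A hedged sketch of that idea follows; the helper names `classify_feed()` and `check_feed()`, the verdict strings, and the timeout value are all assumptions for illustration.

```php
<?php
// Hypothetical helper: turns the two cURL results into a short verdict.
// $errno is the value of curl_errno(), $httpCode comes from CURLINFO_HTTP_CODE.
function classify_feed(int $errno, int $httpCode): string
{
    if ($errno !== 0) {
        return 'unreachable'; // the request never completed
    }
    return $httpCode === 200 ? 'alive' : 'dead';
}

// Hypothetical helper: performs a HEAD request for one feed URL.
function check_feed(string $rssurl): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $rssurl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body transferred
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // judge the final target of redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed value
    curl_exec($ch);
    $errno = curl_errno($ch);
    $httpCode = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return classify_feed($errno, $httpCode);
}
```

Keeping the decision in `classify_feed()` separate from the network call makes the rule easy to adjust, for example if other 2xx codes should also count as alive.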

  • #8
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    Quote Originally Posted by Bobafart View Post
    why can't I just use if(file_exists($rssurl)) where $rssurl is the URL of the RSS file?

    why send a head request?
    As far as I know, file_exists doesn't work with a URL, only with files and directories on your computer.
    A HEAD request is only a few bytes: it asks the server whether the resource exists, without fetching it. After the server responds you know what to do, whether to get the RSS or not.
    It saves bandwidth for you and the RSS provider,
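
A small sketch of the first point, on the assumption (my understanding of the standard `http://` stream wrapper) that it does not support `stat()`, which is what `file_exists()` relies on. The URL is a placeholder, not a real feed.

```php
<?php
// A local file: file_exists() can stat it. tempnam() creates the file.
$localPath = tempnam(sys_get_temp_dir(), 'rss');
$localExists = file_exists($localPath);

// A URL: the http:// wrapper cannot stat remote resources, so the answer
// is false whether or not the feed is really there.
$remoteExists = file_exists('http://example.invalid/feed.xml');

var_dump($localExists, $remoteExists);
unlink($localPath); // clean up the temporary file
```

This matches the symptom in the first post, where every feed was reported as missing.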

    best regards

  • #9
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    Quote Originally Posted by Bobafart View Post
    I am trying to use php's cURL to see if the RSS files exist or not.

    $rssurl is the var I use which contains the URL to each RSS file (in a while loop after querying the db):
    PHP Code:
    $ch = curl_init();

    // set url
    curl_setopt($ch, CURLOPT_URL, $rssurl);

    // return the transfer as a string
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    // $output contains the output string
    $output = curl_exec($ch);

    // close curl resource to free up system resources
    curl_close($ch);
    Now how do I test to see if the file URL exists or not using a cURL function?
    Quote Originally Posted by Bobafart View Post
    what if I did this as my check?

    PHP Code:
    // check to see if RSS File exists
    if(curl_exec($ch) === false){
        echo 'error: file does not exist';
    }

    is this considered a good way to address the problem?
    I don't use the PHP curl extension, but I will try to write something and come back when I have finished the code.

    best regards

  • #10
    Master Coder
    Join Date
    Dec 2007
    Posts
    6,682
    Thanks
    436
    Thanked 890 Times in 879 Posts
    PHP Code:
    <?php

    /*
     * Curl settings
     */
    // user agent, put yours here
    $useragent = $_SERVER['HTTP_USER_AGENT'];

    /*
     * Feed settings
     */
    // feed url
    $url = '........';
    // absolute path to the file where we store the feed
    $oldfeed = '.....';

    /*
     * do a HEAD request
     */
    $ch = curl_init();
    curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
    curl_setopt($ch,CURLOPT_URL,$url);
    curl_setopt($ch,CURLOPT_USERAGENT,$useragent);
    curl_setopt($ch,CURLOPT_HEADER,true);
    // CURLOPT_NOBODY sends a HEAD request and does not wait for a body
    curl_setopt($ch,CURLOPT_NOBODY,true);
    $headreq = curl_exec($ch);
    curl_close($ch);

    /*
     * must be 200 Ok and Last-Modified must be newer than mtime of $oldfeed
     */
    // keep only the status line and the Last-Modified header (any letter case)
    function filter_lines($fld){ return preg_match("/^(HTTP|Last-Modified)/i",$fld); }

    $lines = explode("\n",$headreq);
    $rest = array_filter($lines,"filter_lines");
    $status = array_shift($rest);
    // also matches HTTP/2 status lines, which carry no "OK" reason phrase
    if(preg_match("/^HTTP\/\S+\s+200\b/i",$status)){
      $lastmod = array_shift($rest);
      // trim() drops the "\r" left over from the "\r\n" line endings
      $modified = trim(preg_replace("/Last-Modified:\s+(.+)$/i","\\1",$lastmod));
      $timestamp = strtotime($modified);
      if((file_exists($oldfeed) && filemtime($oldfeed) < $timestamp) ||
         !file_exists($oldfeed)){
        /*
         * Download the feed in a given file
         */
        $ch = curl_init();
        $fh = fopen($oldfeed,'w+');
        curl_setopt($ch,CURLOPT_URL,$url);
        curl_setopt($ch,CURLOPT_USERAGENT,$useragent);
        curl_setopt($ch,CURLOPT_FILE,$fh);
        curl_exec($ch);
        curl_close($ch);
        fclose($fh);
        print "Feed downloaded\n";
      }else{
        print "Feed is already updated\n";
      }
    }else{
      print $status."\n";
    }

    ?>
    best regards
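
The freshness test in the script above can also be isolated into a small function, which makes the edge cases easier to see. This is a sketch under assumptions: the name `needs_download()` is hypothetical, and treating a missing or unparsable Last-Modified header as "download anyway" is a choice made here, not something stated in the thread.

```php
<?php
// Hypothetical helper: decides whether the remote feed should be downloaded.
// $lastModified is the raw Last-Modified header value (null if the header is absent),
// $localMtime is filemtime() of the stored copy (null if there is no copy yet).
function needs_download(?string $lastModified, ?int $localMtime): bool
{
    if ($localMtime === null) {
        return true; // no local copy yet
    }
    $remote = $lastModified === null ? false : strtotime(trim($lastModified));
    if ($remote === false) {
        return true; // assumption: without a usable date, download to be safe
    }
    return $localMtime < $remote; // local copy is older than the remote feed
}
```

With the variables from the script above, the call would look like `needs_download($modified, file_exists($oldfeed) ? filemtime($oldfeed) : null)`.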

