
View Full Version : fsockopen, how to have the browser use the http response header



bcarl314
09-09-2003, 01:17 PM
All right, I've got this working from behind a proxy, now I would like to have the browser properly interpret the http response header rather than display it. Any ideas?

Here's the code


<?php
/*your proxy server address*/
$proxy = "192.168.1.1";
/*your proxy server port*/
$port = "8080";
/*the url you want to connect to*/
$url = "http://www.codingforums.com";
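/*the Host header names the server being fetched, not the proxy*/
$host = parse_url($url, PHP_URL_HOST);
$fp = fsockopen($proxy, $port) or die("Unable to open");
/*Connection: close asks the far end to hang up when done, so feof() can end the loop*/
fputs($fp, "GET $url HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n");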
while(!feof($fp)){
$line = fgets($fp, 4000);
print($line);
}
fclose($fp);
?>


The ultimate goal of this is to create an RSS reader, so any tips along those lines would be appreciated. Also, I've been googling for days trying to find an RSS reader written in PHP that works through a proxy (i.e. uses fsockopen). Does anyone have some code examples or links?

Thanks


Screen shot of problem attached

mordred
09-09-2003, 02:29 PM
I think I know how to handle your problem, but I fear I'm introducing another one you might not be aware of. Anyway, the response consists of HTTP headers and the content body. The body is what should be displayed, so we need to retrieve that and print it. Here's some example code for doing that (without the proxy part 'cause I'm not behind one, but the same code applies to it as well):



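$fp = fsockopen("www.codingforums.com", 80, $errno, $errstr, 30);
if (!$fp) {
    // stop here: without a connection there is nothing to split or print
    die("$errstr ($errno)<br>\n");
}
$out = '';
fputs($fp, "GET / HTTP/1.0\r\nHost: www.codingforums.com\r\n\r\n");
// read the whole response (headers and body) into one string
while (!feof($fp)) {
    $out .= fgets($fp, 128);
}
fclose($fp);
// split at the Content-Type header; the matched header itself is kept as $response[1]
$response = preg_split('/(Content-Type\:.+?)[\r\n]+/i', $out, -1, PREG_SPLIT_DELIM_CAPTURE);
// header() accepts one header line per call, so send the lines before Content-Type one by one
foreach (preg_split('/[\r\n]+/', trim($response[0])) as $headerLine) {
    header($headerLine);
}
header($response[1]);
print $response[2];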


I just store the response in a string and split it at the point where the content begins. Anything up to this point, and the boundary itself (what's in the regex) can be sent through header(), the rest can be printed.
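
As an alternative to splitting on the Content-Type line, the headers and the body can also be separated at the first blank line, which is where HTTP ends the header block. The following is only a minimal sketch of that idea, not the code from this thread: the helper names `splitHttpResponse` and `findHeader` and the sample response are made up for illustration.

```php
<?php
// Hypothetical helper (not from the thread): split a raw HTTP response
// into its header lines and its body at the first blank line.
function splitHttpResponse($raw) {
    // Headers end at the first empty line; tolerate bare \n line endings too.
    $parts = preg_split('/\r?\n\r?\n/', $raw, 2);
    $headerLines = preg_split('/\r?\n/', $parts[0]);
    $body = isset($parts[1]) ? $parts[1] : '';
    return array($headerLines, $body);
}

// Hypothetical helper: return the value of one header, or null if it is absent.
function findHeader($headerLines, $name) {
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (stripos($line, $name . ':') === 0) {
            return trim(substr($line, strlen($name) + 1));
        }
    }
    return null;
}

// Made-up response standing in for what fsockopen()/fgets() collected in $out.
$raw = "HTTP/1.0 200 OK\r\nContent-Type: text/xml\r\nX-Cache: MISS from proxy\r\n\r\n<rss></rss>";
list($headerLines, $body) = splitHttpResponse($raw);
$contentType = findHeader($headerLines, 'Content-Type');
// A real script would now call header("Content-Type: $contentType") and print $body.
```

With this split, extra headers added by a proxy never reach the body, so they do not have to be stripped one by one.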

And now the problem: you will see that relative paths to images etc. are not resolved correctly, because the browser interprets them relative to your script file's address. It can be fixed though, it's just work... much work... find all paths, open a new connection for each, download the response and display it according to its content-type.

Concerning useful RSS libraries: The ones I posted in your thread about RSS processing in the XML forum weren't OK?

EDIT: I'm going nuts with automatically parsed URLs... why is it impossible to edit the post that URLs aren't parsed? The checkbox reappears every time I try to edit my post. Duh. People, don't click the link, it doesn't work.

bcarl314
09-09-2003, 02:58 PM
Actually, that shouldn't be too much of a problem, because as I said, I'm going to use this for an RSS news feed, which will have absolute paths for the links, so the relative issue should be resolved.

The real problem (what I posted earlier was simplified) was that because I was getting the http headers, the parser I have threw an error that the document was not well formed.

I'll work with what you posted to eliminate the http headers and try again.



"Concerning useful RSS libraries: The ones I posted in your thread about RSS processing in the XML forum weren't OK?"

They were great, only problem is all the RSS readers / aggregators I found were using "include()" or "fopen()" which don't work from behind my proxy, so the only way to bypass was to use fsockopen, thus the header problem and so on.
Thanks for your help.

bcarl314
09-11-2003, 01:08 PM
Hmm, it looks better, but I'm still getting the following headers, how could I modify the regex to handle them as well?

<headers>
X-Cache: MISS from myProxyServer.com
X-Cache: MISS from myProxyDomain
Proxy-Connection: close
</headers>

mordred
09-11-2003, 01:18 PM
I suppose you just want to get rid of them in the final output, i.e. deleting them with the help of a regexp? This should do:

$out = preg_replace('/^(x-cache|proxy)(.+?)[\n\r]+/im', '', $out);

// and then splitting the "cleaned" output
$response = preg_split(...
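
To see how this replacement and the Content-Type split work together, here is a small check on a made-up response string (the header values and the `<rss/>` body are invented for illustration). It is only a sketch of the behaviour, assuming the proxy headers look like the ones listed above.

```php
<?php
// Made-up response: one proxy header before Content-Type, one after it.
$out = "HTTP/1.0 200 OK\r\n"
     . "X-Cache: MISS from myProxyServer.com\r\n"
     . "Content-Type: text/xml\r\n"
     . "Proxy-Connection: close\r\n"
     . "\r\n"
     . "<rss/>";

// Delete every line that starts with "x-cache" or "proxy" (case-insensitive, per line).
$clean = preg_replace('/^(x-cache|proxy)(.+?)[\n\r]+/im', '', $out);

// Split the cleaned response at the Content-Type header, keeping the header itself.
$response = preg_split('/(Content-Type\:.+?)[\r\n]+/i', $clean, -1, PREG_SPLIT_DELIM_CAPTURE);
$body = $response[2];
```

Because `[\n\r]+` is greedy, removing the last header also removes the blank line after it, which is why `$response[2]` ends up holding only the body here. A header that is not on the removal list and follows Content-Type would still show up in front of the body.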

bcarl314
09-11-2003, 01:41 PM
Mmm, much better, but now another question, the XML parser function has this line:



while ($data = fread($fp, 4096))
xml_parse($xml_parser, $data, feof($fp))
or die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser)));



Which is using a file handle ($fp), and I've got the contents in a variable. What's the easiest way to modify this? Write the variable to a local file and then open it with the parser, or modify the parser line to read the variable? If the latter, how would you do that?

mordred
09-11-2003, 02:06 PM
It depends a little on the context of your application. If you need to work on the whole XML file and it's not too big, I would first accumulate the file contents in a string variable and when it's complete, parse it with

xml_parse($parser, $data);
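
A minimal sketch of the whole-string approach, assuming the feed body is already in `$data`. The sample feed and the title-collecting handlers are made up for illustration, and the closures need PHP 5.3 or later, which is newer than this thread.

```php
<?php
// Made-up feed text standing in for the downloaded body.
$data = '<rss><channel>'
      . '<item><title>First</title></item>'
      . '<item><title>Second</title></item>'
      . '</channel></rss>';

$titles = array();
$current = '';

$parser = xml_parser_create();
// Element names arrive upper-cased because case folding is on by default.
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$titles, &$current) {
        $current = $name;
        if ($name === 'TITLE') {
            $titles[] = '';   // start collecting a new title
        }
    },
    function ($p, $name) use (&$current) {
        $current = '';
    }
);
// Character data may arrive in several pieces, so append instead of assigning.
xml_set_character_data_handler(
    $parser,
    function ($p, $text) use (&$titles, &$current) {
        if ($current === 'TITLE') {
            $titles[count($titles) - 1] .= $text;
        }
    }
);

// The third argument (true) marks this as the final, and here the only, chunk.
if (!xml_parse($parser, $data, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}
```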

However, the approach you have above with the repeated calling of xml_parse() and only sending a chunk of data is better, provided you only do limited work on the XML file, and it's quite big and needs to be downloaded over the network. Say for instance you just want the first three <foobar> elements, and don't care about the other megabytes of the XML file, you'd go with this approach.

You could check in your while loop if the extracted data is already behind the headers, and if so, set a marker that XML parsing should be used from now on, and parse the data chunks in the following iterations.
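
The marker idea can be sketched roughly as follows: read header lines until the blank line, then hand every later chunk to the parser. The function name `parseBodyFromStream` is hypothetical, and the `php://memory` stream only stands in for the socket returned by fsockopen() so the example can run without a network.

```php
<?php
// Hypothetical helper: skip the HTTP headers on $fp, then feed the body
// to the XML parser chunk by chunk. Returns false on the first parse error.
function parseBodyFromStream($fp, $parser) {
    // Phase 1: header lines end at the first empty line.
    while (($line = fgets($fp, 4096)) !== false) {
        if (rtrim($line, "\r\n") === '') {
            break;
        }
    }
    // Phase 2: everything after the blank line is XML.
    $final = false;
    while (!$final) {
        $chunk = fread($fp, 4096);
        // Assumption: an empty read on a blocking stream means the data has ended.
        $final = ($chunk === false || $chunk === '' || feof($fp));
        if (!xml_parse($parser, (string) $chunk, $final)) {
            return false;
        }
    }
    return true;
}

// Stand-in for the socket: an in-memory stream holding a made-up response.
$fp = fopen('php://memory', 'r+');
fwrite($fp, "HTTP/1.0 200 OK\r\nContent-Type: text/xml\r\n\r\n<rss><item/><item/></rss>");
rewind($fp);

$itemCount = 0;
$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    function ($p, $name, $attrs) use (&$itemCount) {
        if ($name === 'ITEM') {
            $itemCount++;
        }
    },
    function ($p, $name) {
    }
);

$ok = parseBodyFromStream($fp, $parser);
fclose($fp);
```

To stop early after the first few elements, a handler could set a flag that the read loop checks before fetching the next chunk.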

bcarl314
09-11-2003, 04:01 PM
All I can say is, whoot! Thanks for the help mordred et al.

So anyone else can have an RSS feeder, here are the files. They use fsockopen!

File rssFeed.php (needs to be included via include)


<?php
//SET PROXY SERVER IP HERE
$proxy = "ENTER PROXY IP HERE";
//SET PROXY PORT HERE (usually 8080)
$port="8080";
//SET YOUR EMAIL ADDRESS HERE
$email = "my.email@something.com";
//SET RESULT LIMITS
$limit=3;
//TO DISPLAY AN RSS FEED CALL THE openRSS FUNCTION WITH THE
//URL OF THE NEWS FEED
//EXAMPLE:
// openRSS("http://z.about.com/6/g/stlouis/b/index.xml");
//END EXAMPLE
print "\n<!-- BEGIN DATA / NEWS FEED -->\n";
print "<!-- For further information contact ".$email." -->\n";

//DO NOT EDIT BELOW UNLESS YOU KNOW WHAT YOU ARE DOING
$i=0;
// *************** BEGIN FUNCTIONS ********************
include_once("rssFunctions.php");
function parseRSSFeed( $fileName, $l ) {
global $url, $email, $limit;
$limit=$l;
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

print "<ul>";
//$fileName holds the feed contents; true marks it as the final (only) chunk
xml_parse($xml_parser, $fileName, true)
or die(sprintf("XML error: %s at line %d",
xml_error_string(xml_get_error_code($xml_parser)),
xml_get_current_line_number($xml_parser)));
print "\n</ul>\n";
print "<!-- END DATA / NEWSFEED -->\n";
xml_parser_free($xml_parser);
}
function openRSS( $file, $l=5 ) {
//the handlers in rssFunctions.php read these globals, so declare them here
global $proxy, $port, $insideitem, $tag, $title, $description, $link, $i;
//open a connection
$fp = fsockopen ($proxy, $port) or die("Unable to open proxy connection");
$out = '';
//get the xml news feed (the Host header names the feed's server, not the proxy)
$host = parse_url($file, PHP_URL_HOST);
fputs ($fp, "GET $file HTTP/1.0\r\nHost: $host\r\n\r\n");
//read news feed
while (!feof($fp)) {
$out .= fgets ($fp,128);
}
//close news file
fclose ($fp);
//clean up headers
$out = preg_replace('/^(age)(.+?)[\n\r]+/im', '', $out);
$out = preg_replace('/^(x-cache|proxy)(.+?)[\n\r]+/im', '', $out);
$response = preg_split('/(Content-Type\:.+?)[\r\n]+/i', $out, -1, PREG_SPLIT_DELIM_CAPTURE);
$contents = $response[2];
//begin reading feed
//set default variables and reset the item counter so each feed starts from zero
$i = 0;
$insideitem = false;
$tag = "";
$title = "";
$description = "";
$link = "";
//parse the xml newsfeed
$lim=$l;
parseRSSFeed($contents, $lim);
}
// *************** END FUNCTIONS ********************
?>


and the rssFunctions.php file (included in the file above)


<?php
function startElement($parser, $name, $attrs) {
global $insideitem, $tag, $title, $description, $link;
if ($insideitem) {
$tag = $name;
} elseif ($name == "ITEM") {
$insideitem = true;
}
}
function endElement($parser, $name) {
global $insideitem, $tag, $title, $description, $link, $i, $limit;
if ($name == "ITEM") {

if(trim($title)!="Customize this feed") {
$i++;
if($i<=$limit) {
printf("\n\t<li><strong><a href='%s'>%s</a></strong> <br />%s",
htmlspecialchars(trim($link), ENT_QUOTES),htmlspecialchars(trim($title)),htmlspecialchars(trim($description)));
printf("</li>");
}
}
$title = "";
$description = "";
$link = "";
$insideitem = false;
}
}
function characterData($parser, $data) {
global $insideitem, $tag, $title, $description, $link;
if ($insideitem) {
switch ($tag) {
case "TITLE":
$title .= $data;
break;
case "DESCRIPTION":
$description .= $data;
break;
case "LINK":
$link .= $data;
break;
}
}
}
?>


This will return an unordered list from the news feed.

Thanks again.

bcarl314
09-12-2003, 03:06 PM
Ok, not sure if I'm talking to myself here, but...

I'm now going to try to modify this code to read the XML newsfeed from weather.com. I've signed up with them and they say that I need to store the feed in cache for at least 30 minutes.

How would I do that?
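
One common way to meet a caching requirement like this is a small file cache keyed on the file's modification time. The following is only a sketch under assumptions: the function name `cachedFetch`, the cache file location, and the stand-in `$fetch` callback are all hypothetical, and the real download would be the fsockopen code from earlier in the thread.

```php
<?php
// Hypothetical helper: return the cached copy if it is younger than $maxAge
// seconds; otherwise call $fetch, store its result, and return that.
function cachedFetch($cacheFile, $maxAge, $fetch) {
    clearstatcache();   // make sure filemtime() is not served from PHP's stat cache
    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        return file_get_contents($cacheFile);
    }
    $fresh = call_user_func($fetch);
    // Only overwrite the cache when the download actually produced something.
    if ($fresh !== false && $fresh !== '') {
        file_put_contents($cacheFile, $fresh, LOCK_EX);
    }
    return $fresh;
}

// Hypothetical cache location; any path writable by the web server would do.
$cacheFile = sys_get_temp_dir() . '/weather_feed_' . uniqid() . '.xml';

// Stand-in for the real download; it counts how often it is called.
$calls = 0;
$fetch = function () use (&$calls) {
    $calls++;
    return '<weather/>';
};

$first  = cachedFetch($cacheFile, 1800, $fetch);   // 1800 seconds = 30 minutes
$second = cachedFetch($cacheFile, 1800, $fetch);   // served from the cache file
unlink($cacheFile);                                // clean up the demo file
```

In the feed script, the string returned by `cachedFetch` would be passed to the parser in place of the freshly downloaded contents.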


