Resolved: Using cURL to Scan an External Page

09-10-2009, 05:37 PM
Never mind, I figured it out. This is definitely a hosting issue. I set up an account with 000webhost (which advertises cURL support) and got the code working. For anyone interested, it appears that Freehostia DOES NOT support the use of cURL on its free accounts. Paid accounts may or may not be different...
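For anyone who wants to rule out a hosting issue before rewriting any code, a minimal sketch like this checks whether the host actually provides the cURL extension (this is just a generic probe, not something from the thread):

```php
<?php
// Minimal sketch: probe for the cURL extension before relying on it.
// On hosts that compile PHP without cURL, curl_init() simply won't exist.
if (function_exists('curl_init')) {
    $info = curl_version(); // returns an array with version details
    echo "cURL is available, version " . $info['version'] . "\n";
} else {
    echo "cURL is NOT available on this host.\n";
}
?>
```

Some free hosts also filter scripts by content rather than disabling the extension, so a clean result here still doesn't guarantee a cURL request will be allowed through.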

I'm experimenting with scanning a page on another domain (a third-party site, not mine at all) and generating content on my own site, on a separate domain, based on what I find there. The plan is to run a string search on the page contents and, if certain text is present on the target site, change what appears on my site accordingly.

So I tried print file_get_contents('http://www.domain.com/target/'); and came up completely empty. I'm assuming the target site uses sessions/cookies or something similar that prevents file_get_contents() from working. I read elsewhere online that cURL can be used for this type of thing, so as a simple attempt to print the full page contents I tried this instead:

<div style="border:1px solid black;padding:10px;">
<?php
function curl_login($url, $data, $proxy, $proxystatus) {
    // Create (or empty) the cookie jar so login cookies can be saved.
    $fp = fopen("cookie.txt", "w");
    fclose($fp);
    $login = curl_init();
    curl_setopt($login, CURLOPT_COOKIEJAR, "cookie.txt");
    curl_setopt($login, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($login, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
    curl_setopt($login, CURLOPT_TIMEOUT, 40);
    curl_setopt($login, CURLOPT_RETURNTRANSFER, TRUE);
    if ($proxystatus == 'on') {
        curl_setopt($login, CURLOPT_SSL_VERIFYHOST, FALSE);
        curl_setopt($login, CURLOPT_PROXY, $proxy);
    }
    curl_setopt($login, CURLOPT_URL, $url);
    curl_setopt($login, CURLOPT_HEADER, TRUE);
    curl_setopt($login, CURLOPT_FOLLOWLOCATION, TRUE);
    curl_setopt($login, CURLOPT_POST, TRUE);
    curl_setopt($login, CURLOPT_POSTFIELDS, $data);
    $result = curl_exec($login); // execute the request
    curl_close($login);          // clean up the handle before returning
    return $result;
}

function curl_grab_page($site, $proxy, $proxystatus) {
    $ch = curl_init();
    if ($proxystatus == 'on') {
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }
    curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
    curl_setopt($ch, CURLOPT_URL, $site);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // return the page instead of printing it
    $result = curl_exec($ch); // execute the request
    curl_close($ch);          // clean up the handle before returning
    return $result;
}

echo curl_grab_page('http://www.domain.com/target/', '', 'off');
?>
</div>


The ONLY thing this script returns to my browser is the text "29 Naahh.. it will not work!" I don't even see the <div> tags, which makes me think this is some type of host issue (Freehostia) where they see the script and refuse to run anything at all. Before I give up, though, I wanted to check whether I mangled the code somehow. This is the first time I've heard of or tried cURL, so I'm pretty much just copying, pasting, and changing the URL.

Can anyone confirm whether Freehostia disallows cURL, or whether I have the code wrong? If cURL isn't going to work for me, is there any way to make file_get_contents() work? Is there something else I should be trying?
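For anyone hitting the same wall: if the host disables cURL but leaves allow_url_fopen on, file_get_contents() can still send request headers via a stream context. Some sites refuse requests that lack a User-Agent, which may be why the plain call came back empty. A hedged sketch (the URL and search string are placeholders, not the real target):

```php
<?php
// Sketch: fetch a remote page with file_get_contents() plus a stream
// context, then run the string search described above.
// Requires allow_url_fopen = On in php.ini.
$context = stream_context_create(array(
    'http' => array(
        'method'  => 'GET',
        'header'  => "User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)\r\n",
        'timeout' => 40, // seconds
    ),
));

$html = @file_get_contents('http://www.domain.com/target/', false, $context);

if ($html === false) {
    echo "Request failed (blocked by the host, or allow_url_fopen is off).\n";
} elseif (strpos($html, 'certain text') !== false) {
    echo "Target text found - show the alternate content.\n";
} else {
    echo "Target text not found - show the default content.\n";
}
?>
```

If the target site really does gate its content behind sessions or cookies, this won't be enough, since a stream context can send cookies but can't easily capture and replay them the way cURL's cookie jar does.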