problem extracting titles of ads from a website



kevinkhan
10-27-2009, 01:35 PM
Hi guys,

I'm trying to learn PHP and I'm running into a few problems :(

I'm trying to extract the titles of ads from this URL:

http://www.carzone.ie/search/results?searchsource=browse&cacheBuster=1256634750309620#nParam=200590%2B219%2B147&sortby=County|1&channel=CARS&currency=EUROS&searchResultsView=SPREADSHEET&maxrows=30&page=1



Here is the script that I am using to try to do this:


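<?php
set_time_limit(0);        // 0 is the documented value for "no time limit"
ob_implicit_flush(true);  // send output to the browser as it is produced
// Close any open output buffers; ob_end_flush() raises a notice when none is active.
while (ob_get_level() > 0) {
    ob_end_flush();
}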


$strURL = "";
if (isset($_POST["crawlUrl"])) {
    $strURL = $_POST["crawlUrl"];
}


// Returns the preg_match_all() match array, or "" when nothing matches.
function getMatches($strMatch, $strContent)
{
    if (preg_match_all($strMatch, $strContent, $objMatches)) {
        return $objMatches;
    }
    return "";
}
?>
<html>
<head>
<title>Project - Extracting Title of ads on www.carzone.ie </title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<form name="frmExtract" method="post" action="">
URL: <input name="crawlUrl" type="text" id="crawlUrl" size="50" value="<?php print htmlspecialchars($strURL); ?>">
<input name="btnCrawl" type="submit" value="Crawl Data">
</form>
<br>
<br>
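<?php
if ($strURL != "") {
    $strListingUrl = $strURL;

    // The original wrapped this in while(true) with no exit, so it refetched the same page forever.
    // Read the whole page into a string; file_get_contents() returns false on failure.
    $strContent = file_get_contents($strListingUrl);

    // Expression to match the link and title of each ad.
    // Assumption: the link is an <a> inside the <li>. The original pattern put href/title
    // on the <li> yet closed an </a>. Check the page source for the real markup.
    $strListMatches = '!<li class="vehicle-images"><a href="(.*)" title="(.*)"><span>(.*)</span></a></li>!isU';
    $objListMatches = ($strContent === false) ? "" : getMatches($strListMatches, $strContent);

    if ($objListMatches == "" || count($objListMatches[2]) == 0) {
        print "No List found or Invalid URL<br>";
    } else {
        // Group 1 holds the href values; group 2 holds the title attributes.
        print_r($objListMatches[2]);
    }
}
?>
</body>
</html>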

Can anybody tell me what I'm doing wrong, please? :(

I keep getting "No List found or Invalid URL".
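
One thing worth checking, offered as a sketch rather than a confirmed diagnosis: in the URL above, everything after the `#` is a fragment. Fragments are handled by the browser and are not sent as part of an HTTP request, so the search parameters (`nParam`, `sortby`, `maxrows`, `page`) may never reach the server, and the listing may be filled in by JavaScript after the page loads. The helper name `describeUrl` is made up for illustration; `parse_url()` is standard PHP.

```php
<?php
// Hypothetical helper: split a URL into the query string (sent to the server)
// and the fragment (kept by the browser).
function describeUrl($strUrl)
{
    $arrParts = parse_url($strUrl);
    return array(
        'query'    => isset($arrParts['query']) ? $arrParts['query'] : '',
        'fragment' => isset($arrParts['fragment']) ? $arrParts['fragment'] : '',
    );
}

$strUrl = 'http://www.carzone.ie/search/results?searchsource=browse&cacheBuster=1256634750309620'
        . '#nParam=200590%2B219%2B147&sortby=County|1&channel=CARS&currency=EUROS'
        . '&searchResultsView=SPREADSHEET&maxrows=30&page=1';

$arrInfo = describeUrl($strUrl);
print "Query: " . $arrInfo['query'] . "\n";
print "Fragment: " . $arrInfo['fragment'] . "\n";
```

If the fragment turns out to hold all of the search criteria, the HTML returned to `file_get_contents()` would not contain the ad list at all, whatever the regular expression looks like.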

Phil Jackson
10-27-2009, 02:27 PM
See, we've run into this problem before. We could tell you how to do it, but it would be breaking the terms and conditions of that site...

kevinkhan
10-27-2009, 02:48 PM
This is for training purposes only... :(
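
For practice that does not touch the live site, the extraction step can be tried on a local HTML string. The sketch below swaps the regular expression for PHP's `DOMDocument` and `DOMXPath`, which cope with attribute order and whitespace changes better than a pattern does. The sample markup and the function name `extractAdTitles` are assumptions made up for illustration, not the real page structure.

```php
<?php
// Hypothetical helper: return the title attribute of every link that sits
// directly inside an <li class="vehicle-images"> element (assumed markup).
function extractAdTitles($strHtml)
{
    $objDoc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $objDoc->loadHTML($strHtml);
    libxml_clear_errors();

    $objXPath = new DOMXPath($objDoc);
    $arrTitles = array();
    foreach ($objXPath->query('//li[@class="vehicle-images"]/a[@title]') as $objLink) {
        $arrTitles[] = $objLink->getAttribute('title');
    }
    return $arrTitles;
}

// Made-up sample markup for practice only.
$strSample = '<ul>'
    . '<li class="vehicle-images"><a href="/ad/1" title="Sample ad one"><span>1</span></a></li>'
    . '<li class="vehicle-images"><a href="/ad/2" title="Sample ad two"><span>2</span></a></li>'
    . '</ul>';

print_r(extractAdTitles($strSample));
```

The XPath expression only matches an exact `class="vehicle-images"` value, so it would need adjusting if the element carries several classes.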


