
Help with this small PHP script



kevinkhan
10-29-2009, 01:17 PM
Please have a look at this script and tell me what is wrong.

I'm trying to extract the link and the text between <a href> and </a>, and I only want the ones whose list item has class="vehicle".




<?php

function getMatches($strMatch,$strContent)
{
    if(preg_match_all($strMatch,$strContent,$objMatches))
    {
        return $objMatches;
    }
    return "";
}


$strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a>
<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>';

$strMatch "#<li class=\"vehicle\"><a href=\"(.*)\">(.*)</a>#";
$objListMatches = getMatches($strMatch,$strContent);


$sUrl = $objListMatches[1];
$sTitle = $objListMatches[2];


echo $sUrl;
echo "<br />";
echo $sTitle;





Any help will be greatly appreciated ;)

Lamped
10-29-2009, 02:20 PM
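Stop using .* - really, just stop using it. It is greedy, so it grabs as much as it can; use a negated character class or a lazy quantifier instead: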

'#<li class="vehicle"><a href="([^"]*)">(.*?)</a>#'

And what is people's obsession with double quotes?
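
To see why .* tends to cause trouble, here is a minimal sketch comparing the greedy form with the tighter one. The $html sample puts two list items on one line, which is an assumption made for illustration (the sample in the question has one item per line, where the problem is less visible).

```php
<?php
// Two items on ONE line (assumed sample) so the greedy match has room to overrun.
$html = '<li class="vehicle"><a href="http://www.domain.ie/a">vehicle1</a>'
      . '<li class="vehicle"><a href="http://www.domain.ie/b">vehicle2</a>';

// Greedy: (.*) grabs as much as it can, so a single match swallows both items.
preg_match_all('#<a href="(.*)">(.*)</a>#', $html, $greedy);

// Negated class for the URL, lazy (.*?) for the title:
// each capture stops at the first delimiter it meets.
preg_match_all('#<a href="([^"]*)">(.*?)</a>#', $html, $tight);

echo count($greedy[0]), "\n"; // 1
echo count($tight[0]), "\n";  // 2
print_r($tight[1]);           // the two captured URLs
```

With the greedy pattern the first capture runs from the first href to the last "> in the string, which is why only one match comes back.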

Phil Jackson
10-29-2009, 02:42 PM
It's good practice, that's why. What if you didn't know whether it was going to be ' or "? One would have to be escaped either way. And by choice I would go with ">([^<]+)<"

barkermn01
10-29-2009, 03:18 PM
Double quotes? OK, without using them, tell me how you echo a new line or tab in the middle of a statement:
\n -- Linux new line
\r -- Mac new line (classic Mac OS)
\r\n -- Windows new line
\t -- tab

They all have to be inside "here", and also:


$hello = "hello ComputerX";
echo "$hello\r\n";
// outputs hello ComputerX followed by a line break

$hello = "hello ComputerX";
echo '$hello\r\n';
// outputs $hello\r\n literally

Phil Jackson
10-29-2009, 03:22 PM
That was my next comment, " for processing, ' for not processing.
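
A minimal sketch of the difference, in case it helps. PHP_EOL and the . concatenation are alternatives that were not mentioned above, and the variable names are only for illustration.

```php
<?php
$hello = "hello ComputerX";

$double = "$hello\t!\n";   // variable and escape sequences are processed
$single = '$hello\t!\n';   // taken literally, character for character

// Single-quoted code can still emit a line break by concatenating
// the PHP_EOL constant (the platform's line ending).
$mixed = $hello . PHP_EOL;

echo $double;
echo $single, "\n";
echo strlen($double), " ", strlen($single), "\n";
```

So single quotes do not rule out new lines, they just mean the escape has to come from somewhere else.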

kevinkhan
10-29-2009, 03:57 PM
I have now tried this:


<?php

function getMatches($strMatch,$strContent)
{
    if(preg_match_all($strMatch,$strContent,$objMatches))
    {
        return $objMatches;
    }
    return "";
}


$strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a>
<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>';

$strMatch '#<li class="vehicle"><a href="([^"]*)">(.*?)</a>#';
$objListMatches = getMatches($strMatch,$strContent);


$sUrl = $objListMatches[1];
$sTitle = $objListMatches[2];


echo $sUrl;
echo "<br />";
echo $sTitle;


?>


but I am getting:

Parse error: parse error in C:\Program Files\Apache Software Foundation\Apache2.2\htdocs\carzoneCrawler\preg_match.php on line 19

That is the line the regular expression is on.

Phil Jackson
10-29-2009, 04:16 PM
$strMatch '#<l

should be

$strMatch = '#<l

kevinkhan
10-29-2009, 04:33 PM
Thanks, ha ha.

I could have spent all day looking at that and wouldn't have spotted it.

OK, so when I change



$strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a>
<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>';



to



$strContent = file_get_contents('http://www.carzone.ie/search/results?searchsource=browse&cacheBuster=1256829754944212');


it returns nothing :(

Does the regular expression need to change for this?

I only want to extract the titles that have the following class:


#<li class="vehicle"><a href="([^"]*)">([^<]*)</a>#is
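
One way to narrow this down, sketched under assumptions: the live page's markup may differ from the hand-written sample (extra attributes, whitespace, several classes on the li), so an exact-string regex can easily miss. The sketch below swaps the regex for PHP's DOM extension and an XPath query. extractVehicleLinks is a made-up name, the $html sample is invented, and nothing here has been checked against the real carzone.ie page.

```php
<?php
// Hypothetical helper: returns [url, title] pairs for links inside <li class="vehicle">.
function extractVehicleLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Matches "vehicle" even when the li carries several classes.
    $query = '//li[contains(concat(" ", normalize-space(@class), " "), " vehicle ")]//a[@href]';

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = [$a->getAttribute('href'), trim($a->textContent)];
    }
    return $links;
}

// Assumed sample markup, slightly messier than the one in the question.
$html = '<ul>
<li class="vehicle featured" id="x1"><a href="http://www.domain.ie/abc?/">vehicle1</a></li>
<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a></li>
<li class="vehicle">
   <a title="t" href="http://www.domain.ie/def?/">vehicle2</a></li>
</ul>';

foreach (extractVehicleLinks($html) as [$url, $title]) {
    echo $url, ' => ', $title, "\n";
}
```

With a live page it is also worth testing the download itself first: file_get_contents() returns false on failure, and fetching a URL with it requires allow_url_fopen to be enabled, so a check like $strContent === false before matching would show whether the page was fetched at all.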

Phil Jackson
10-29-2009, 04:49 PM
<?php

function getMatches($strMatch, $strContent)
{
    // Wrap the pattern in # delimiters; i = case-insensitive, s = dot also matches newlines
    if (preg_match_all("#".$strMatch."#is", $strContent, $objMatches))
    {
        return $objMatches[0]; // the full matched strings
        //return $objMatches[1]; for just the URLs
        //return $objMatches[2]; for just the titles
    }
    else
    {
        return false;
    }
}

$strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a>
<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="van"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>';

// No delimiters here: getMatches() adds them. The title is everything up to the next "<".
$strMatch = '<li class="vehicle"><a href="([^"]*)">([^<]+)</a>';

if ($objListMatches = getMatches($strMatch, $strContent))
{
    foreach ($objListMatches as $vehicle)
    {
        echo $vehicle."<br />";
    }
}
else
{
    // could add something else here if nothing is found
    echo "nothing found";
}

?>
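
If both the URL and the title are wanted together, a variation along these lines might do. It is a sketch rather than a drop-in: PREG_SET_ORDER groups the captures per match instead of per group, and getVehicleLinks is a made-up name.

```php
<?php
// Hypothetical helper: one url/title pair per matching <li class="vehicle">.
function getVehicleLinks($strContent)
{
    $strMatch = '#<li class="vehicle"><a href="([^"]*)">([^<]+)</a>#is';
    $arrLinks = array();

    // PREG_SET_ORDER: $m[0] is the whole match, $m[1] the URL, $m[2] the title.
    if (preg_match_all($strMatch, $strContent, $objMatches, PREG_SET_ORDER)) {
        foreach ($objMatches as $m) {
            $arrLinks[] = array('url' => $m[1], 'title' => $m[2]);
        }
    }
    return $arrLinks; // empty array when nothing matched
}

$strContent = '<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle1</a>
<li class="bus"><a href="http://www.domain.ie/abc?/">Alfa Romeo 147</a>
<li class="vehicle"><a href="http://www.domain.ie/abc?/">vehicle2</a>';

foreach (getVehicleLinks($strContent) as $link) {
    echo $link['title'], ': ', $link['url'], "<br />\n";
}
```

Returning an empty array instead of false or "" means the foreach can run without a separate check.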


