
Help required with a regex problem



Halli
03-14-2007, 11:55 AM
Hi all,

I've been working on a way to extract data from a table on another URL and output it in a format like this: (Man Utd, 29, 12, 1, 1, etc.)

At the moment I have the following PHP program, but when executed on the server it just comes up as a blank page with nothing on it.

Can somebody with a bit more experience than me please point me in the right direction? Perhaps there is a small error in the way I'm going about this code that I haven't picked up, or an error in the regular expression I've assigned to the variable $regexp.


<?

$url = "http://www.sportinglife.com/football/premiership/table/table.html";

$input = @file_get_contents($url) or die("Could not access file: $url");

$regexp = "/(?:<tr>{1}?\s?<td\s?align=\"left\">\s?<a[^>]+>(?:\w+\s?\w)+<\/a>\s?<\/td>(?:<td[^>]+>\d{1,}<\/td>)+<\/tr>)+/";

preg_match_all("$regexp", $input, $matches, PREG_SET_ORDER);

foreach($matches as $match) {

echo $match[1].",";
}

?>


Any help greatly appreciated :)

the-dream
03-14-2007, 01:40 PM
Just taking a look at it now...

the-dream
03-14-2007, 01:44 PM
Sorry! It works fine on mine with your code.

Halli
03-14-2007, 09:44 PM
Any reason why it would work for you and not for me? Mine just comes up blank. Did you do anything different?

Inigoesdr
03-15-2007, 06:18 PM
I don't know what the-dream got, but there are no matches with that regex. I used this code and got 7 matches (Man Utd, Chelsea, Arsenal, Liverpool, Bolton, Everton, Reading):

$url = "http://www.sportinglife.com/football/premiership/table/table.html";
// Double quotes so that $url is actually interpolated into the error message.
$input = @file_get_contents($url) or die("Could not access file: $url");
//$regexp = "/(?:<tr>{1}?\s?<td\s?align=\"left\">\s?<a[^>]+>(?:\w+\s?\w)+<\/a>\s?<\/td>(?:<td[^>]+>\d{1,}<\/td>)+<\/tr>)+/";
$regexp = '/<tr>\n?<td align=\"left\">\n?<a[^>]+>(.*)<\/a>\n?\n?<\/td>\n?(?:<td[^>]+>\d{1,}<\/td>\n?\n?)+<\/tr>+/';
preg_match_all($regexp, $input, $matches, PREG_SET_ORDER);
echo count($matches) . ' matches<br />';
foreach($matches as $match) {

echo $match[1].",";
}
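
For anyone wondering why $match[1] printed nothing in the original script: every group in that pattern is written as (?:...), which groups without capturing, so only $match[0] would ever be filled in even when the pattern matches. Here is a minimal sketch of the difference. The $row markup is made up for illustration and is not the real page source.

```php
<?php
// Hypothetical sample row; the real page markup may differ.
$row = '<tr><td align="left"><a href="/team/1">Man Utd</a></td><td align="right">29</td></tr>';

// Non-capturing group: the name is matched but nothing is stored at index 1.
preg_match('/<a[^>]+>(?:[^<]+)<\/a>/', $row, $nonCapturing);

// Capturing group: the team name is stored at index 1.
preg_match('/<a[^>]+>([^<]+)<\/a>/', $row, $capturing);

var_dump(isset($nonCapturing[1])); // bool(false)
echo $capturing[1], "\n";          // Man Utd
```

So with PREG_SET_ORDER, $match[0] is the whole matched text and $match[1], $match[2], ... only exist for groups written with plain parentheses.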

the-dream
03-15-2007, 06:51 PM
Nope!

Didn't change a thing.

Don't know what happened.

Halli
03-16-2007, 06:27 AM
OK, I have gotten further. I've adjusted the code and gone with what's written below:


<?php

$url = "http://www.sportinglife.com/football/premiership/table/table.html";

$input = @file_get_contents($url) or die("Could not access file: $url");

$regexp = "/<tr>{1}?\s?<td\s?align=\"left\">\s?<a[^>]+>(?:\w+\s?\w)+<\/a>\s?<\/td>.*?(?:<td[^>]*>\d{1,}<\/td>.*?)+<\/tr>/is";

preg_match_all("$regexp", $input, $matches, PREG_SET_ORDER);

echo "<table>\n";

foreach($matches as $match)
{
echo $match[0];
}

echo "</table>\n";

?>

Now, once this is run on the server, the following data is displayed:


Man Utd 29 12 1 1 35 8 11 2 2 31 11 72 47
Chelsea 29 10 4 0 30 8 10 2 3 21 11 66 32
Arsenal 28 9 5 0 34 11 7 2 5 17 12 55 28
Liverpool 29 11 3 1 29 4 5 2 7 15 16 53 24
Bolton 29 8 3 4 21 14 6 2 6 13 20 47 0
Everton 29 7 4 3 21 11 4 6 5 16 15 43 11
Reading 29 9 1 4 26 16 4 3 8 17 22 43 5
Tottenham 29 9 1 4 25 17 3 5 7 15 26 42 -3
Portsmouth 29 8 4 3 22 12 3 4 7 14 19 41 5
Blackburn 29 7 2 5 19 16 5 2 8 16 23 40 -4
Newcastle 29 7 5 3 23 17 3 2 9 11 20 37 -3
Middlesbrough 29 8 3 3 21 14 1 6 8 11 20 36 -2
Aston Villa 29 6 4 4 15 12 1 8 6 14 23 33 -6
Fulham 29 6 5 4 14 13 1 7 6 17 31 33 -13
Wigan 29 5 2 7 14 20 4 3 8 16 24 32 -14
Sheff Utd 29 5 6 4 18 17 3 1 10 7 24 31 -16
Man City 28 5 4 6 10 13 3 2 8 10 21 30 -14
Charlton 29 5 3 6 15 17 1 3 11 11 32 24 -23
Watford 29 2 7 6 13 20 1 4 9 5 23 20 -25
West Ham 29 5 2 8 17 21 0 3 11 4 29 20 -29

The formatting doesn't come through properly when I paste it into this forum, but basically the table comes back as pretty much the same HTML source as on the site I'm extracting it from; even the team names still have HTML links back to the site.

This isn't what I want. What I need is the data in plain text, with commas separating each cell and a line break after each row. There are 20 rows of data and I want them returned like this:

Man Utd, 29, 12, 1, 1, 35 etc
Chelsea, 28, 13, 1, 2, 41 etc
Arsenal, 28, 12, 1, 1, 30 etc

Does anyone know of a way to make this possible with the code I have above? I've tried several different ways, but nothing has worked.

Inigoesdr
03-16-2007, 06:59 AM
Try this out (you may have to modify it a bit):

<?php
$url = 'http://www.sportinglife.com/football/premiership/table/table.html';
// Double quotes so that $url is actually interpolated into the error message.
$input = @file_get_contents($url) or die("Could not access file: $url");
// One capturing group for the team name, then 13 capturing groups, one per numeric column.
// [A-Za-z0-9 ] rather than [A-z0-9 ]: the range A-z also includes the punctuation between Z and a.
$regexp = '/<tr>{1}?\s?<td\s?align="left">\s?<a[^>]+>([A-Za-z0-9 ]+)<\/a>\s?<\/td>.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+.*?(?:<td[^>]+>([0-9\-]+)<\/td>.*?)+<\/tr>/is';
preg_match_all($regexp, $input, $matches, PREG_SET_ORDER);
echo "<table>\n";
foreach($matches as $match)
{
unset($match[0]);
echo implode(', ', $match) . "<br />\n";
}
echo "</table>\n";
?>
This is the output:


<table>
Man Utd, 29, 12, 1, 1, 35, 8, 11, 2, 2, 31, 11, 72, 47<br />
Chelsea, 29, 10, 4, 0, 30, 8, 10, 2, 3, 21, 11, 66, 32<br />
Arsenal, 28, 9, 5, 0, 34, 11, 7, 2, 5, 17, 12, 55, 28<br />
Liverpool, 29, 11, 3, 1, 29, 4, 5, 2, 7, 15, 16, 53, 24<br />
Bolton, 29, 8, 3, 4, 21, 14, 6, 2, 6, 13, 20, 47, 0<br />
Everton, 29, 7, 4, 3, 21, 11, 4, 6, 5, 16, 15, 43, 11<br />
Reading, 29, 9, 1, 4, 26, 16, 4, 3, 8, 17, 22, 43, 5<br />
Tottenham, 29, 9, 1, 4, 25, 17, 3, 5, 7, 15, 26, 42, -3<br />
Portsmouth, 29, 8, 4, 3, 22, 12, 3, 4, 7, 14, 19, 41, 5<br />
Blackburn, 29, 7, 2, 5, 19, 16, 5, 2, 8, 16, 23, 40, -4<br />
Newcastle, 29, 7, 5, 3, 23, 17, 3, 2, 9, 11, 20, 37, -3<br />
Middlesbrough, 29, 8, 3, 3, 21, 14, 1, 6, 8, 11, 20, 36, -2<br />
Aston Villa, 29, 6, 4, 4, 15, 12, 1, 8, 6, 14, 23, 33, -6<br />
Fulham, 29, 6, 5, 4, 14, 13, 1, 7, 6, 17, 31, 33, -13<br />
Wigan, 29, 5, 2, 7, 14, 20, 4, 3, 8, 16, 24, 32, -14<br />
Sheff Utd, 29, 5, 6, 4, 18, 17, 3, 1, 10, 7, 24, 31, -16<br />
Man City, 28, 5, 4, 6, 10, 13, 3, 2, 8, 10, 21, 30, -14<br />
Charlton, 29, 5, 3, 6, 15, 17, 1, 3, 11, 11, 32, 24, -23<br />
Watford, 29, 2, 7, 6, 13, 20, 1, 4, 9, 5, 23, 20, -25<br />
West Ham, 29, 5, 2, 8, 17, 21, 0, 3, 11, 4, 29, 20, -29<br />
</table>
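
The pattern above spells out the number group 13 times because a capturing group inside a repeated (...)+ only keeps its last repetition. A shorter variant, offered only as a sketch, is to do it in two passes: match each row first, then pull out every cell in that row and strip the tags. The $input string here is a made-up stand-in for the fetched page, so the patterns may need adjusting against the real markup.

```php
<?php
// Hypothetical stand-in for the fetched page; the real markup may differ.
$input = <<<HTML
<table>
<tr>
<td align="left"><a href="/team/man-utd">Man Utd</a></td>
<td align="right">29</td><td align="right">12</td><td align="right">47</td>
</tr>
<tr>
<td align="left"><a href="/team/west-ham">West Ham</a></td>
<td align="right">29</td><td align="right">5</td><td align="right">-29</td>
</tr>
</table>
HTML;

$lines = array();

// Pass 1: grab the inside of each table row.
preg_match_all('/<tr[^>]*>(.*?)<\/tr>/is', $input, $rows);

foreach ($rows[1] as $rowHtml) {
    // Pass 2: grab every cell in the row, however many there are.
    preg_match_all('/<td[^>]*>(.*?)<\/td>/is', $rowHtml, $cells);

    // Drop the link markup and surrounding whitespace from each cell.
    $values = array_map('trim', array_map('strip_tags', $cells[1]));
    $lines[] = implode(', ', $values);
}

echo implode("\n", $lines), "\n";
```

On the sample above this prints one comma-separated line per row. Note that it will also pick up any other table rows on a page (header rows, for instance), so on the real page you would probably still want to filter for rows that contain the team link.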

Halli
03-16-2007, 08:21 AM
Mate, you're a legend, thanks so much! :)

It worked a treat.


