View Full Version : Take data from website using PHP

07-27-2008, 11:06 AM

I wrote the code below to capture the images from a website when I submit the URL. In the same way, I want to capture the text information from the website. Please tell me what the code for that would be. Please help me.


$content = file_get_contents($url);

// Match every <img> tag; the src value is captured in group 3.
preg_match_all('/<img(.*)src=("|\')(.*)("|\')(.*)[\/]?>/siU', $content, $match, PREG_PATTERN_ORDER);

echo "<b>Captured Images:</b><br><br>";
foreach ($match[3] as $src) {
    echo '<img src="' . htmlspecialchars($src) . '"><br>';
}

echo "<b>Captured Image URLs:</b><br><br>";
foreach ($match[3] as $src) {
    echo htmlspecialchars($src) . '<br>';
}

07-27-2008, 09:49 PM
Capturing text is far more difficult than capturing an image. The problem is consistency: we cannot be certain whether the text is stored in p, span, or li tags. Unless the page has a declared doctype, there is no guarantee that text will be placed in the proper locations.
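One way to sidestep the tag-consistency problem is to let a real HTML parser do the work. A minimal sketch with PHP's built-in DOMDocument, using an inline HTML snippet where the real script would fetch the page with file_get_contents($url):

```php
<?php
// Sketch: parse the HTML with DOMDocument instead of regexes, then read
// textContent from the elements we care about. An inline snippet stands in
// for the fetched page here.
$content = '<html><body><h1>Title</h1><p>First paragraph.</p>'
         . '<ul><li>Item one</li></ul></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // tolerate malformed real-world HTML
$doc->loadHTML($content);
libxml_clear_errors();

// Collect the text of every <h1>, <p> and <li> element.
$texts = array();
foreach (array('h1', 'p', 'li') as $tag) {
    foreach ($doc->getElementsByTagName($tag) as $node) {
        $texts[] = trim($node->textContent);
    }
}

echo implode("\n", $texts);
```

DOMDocument copes with markup that has no doctype and with unclosed tags, which is exactly the inconsistency the regex approach trips over.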

07-27-2008, 10:02 PM
You could try searching for specific tags (such as p, h1, span) and calculating the length of the characters between the opening tag and the closing tag (strlen). With this number you could extract the characters in between the tags (substr, rather than strstr, since it extracts by offset and length). Loop through the entire document looking for these tags and extracting them.
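The approach above can be sketched with strpos and substr. This is illustration only, not a real parser: it ignores nesting, and the hypothetical helper name extract_tag_contents is my own:

```php
<?php
// Rough sketch: walk the document with strpos and pull out whatever sits
// between an opening tag and its matching closing tag using substr.
// Nested or malformed tags are not handled.
function extract_tag_contents($html, $tag) {
    $results = array();
    $offset  = 0;
    while (($start = strpos($html, '<' . $tag, $offset)) !== false) {
        $open  = strpos($html, '>', $start);              // end of opening tag
        $close = strpos($html, '</' . $tag . '>', $open); // matching close
        if ($open === false || $close === false) {
            break;
        }
        $results[] = substr($html, $open + 1, $close - $open - 1);
        $offset = $close + strlen($tag) + 3;              // skip past "</tag>"
    }
    return $results;
}

print_r(extract_tag_contents('<p>Hello</p><p>World</p><h1>Head</h1>', 'p'));
```

Note that searching for '<p' will also match tags like pre, so a real implementation would check that the next character is '>' or whitespace.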

07-28-2008, 02:32 PM
Just out of curiosity, what is the website you want to parse?

07-29-2008, 01:19 PM
Google "spidering web pages" or "data mining".
There are a number of scripts out there you could use.