...

Scraping multiple pages at once?



Suffice
05-17-2011, 07:22 PM
Hi, I'm still pretty new to CF and have been trying to build a site scraper to grab video embeds for a while now. So far it works surprisingly well, but there is one problem. Whenever the video embeds are separated by a <!--nextpage--> tag in WordPress, each part ends up on a separate page, which makes it difficult to get at all of the embeds at once...

What I have so far:


<?php
if (!isset($_POST['submit'])) {

?>

HTML stuff

<?php
}
else {

$VideoSite = $_POST['VideoSite'];
{
?>

HTML stuff

<?php
$s = file_get_contents($VideoSite);

//Not sure how I'd go about adding this:
$s2 = file_get_contents("$VideoSite/2");

$patterns = array();
$patterns[] = '<embed[^>]+src="(.+?)"';
$patterns[] = '<iframe[^>]+src="(.+?)"';
$patterns[] = '<object[^>]+src="(.+?)"';

$patterns = "#(?:" . implode("|", $patterns) . ")#si";

preg_match_all($patterns, $s, $m);
if (!empty($m[0]))
{
    $edata = array();
    foreach ($m[0] as $match)
    {
        //....rest of code....

What I've been looking for is a way to make this code search multiple URLs in one run. For example, given this:


$VideoSite = 'http://random.com/random-episode-4';

To search for these as well:


http://random.com/random-episode-4/2
http://random.com/random-episode-4/3
http://random.com/random-episode-4/4
...

And so on, up to about page eight...
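For reference, one way the idea above could be sketched: build the list of page URLs from the base URL, fetch each one in turn, and merge whatever embeds turn up. The function names (build_page_urls, extract_embed_srcs, scrape_embeds) and the injectable $fetch parameter are made up for illustration and are not part of the code above. It also assumes that a failed fetch means there are no further pages, which may not hold for every WordPress setup.

```php
<?php
// Hypothetical helpers; the names are illustrative only.

// Build the page list: the base URL, then base/2 ... base/$maxPages.
function build_page_urls(string $baseUrl, int $maxPages): array
{
    $baseUrl = rtrim($baseUrl, '/');
    $urls = array($baseUrl);
    for ($page = 2; $page <= $maxPages; $page++) {
        $urls[] = $baseUrl . '/' . $page;
    }
    return $urls;
}

// Return the src value of every <embed>, <iframe>, or <object> tag in $html.
// A single capture group is used, so every match lands in $m[1].
function extract_embed_srcs(string $html): array
{
    preg_match_all('#<(?:embed|iframe|object)[^>]+src="(.+?)"#si', $html, $m);
    return $m[1];
}

// Fetch each URL in order and merge the results, dropping duplicates.
// $fetch defaults to file_get_contents; passing another callable makes it
// possible to try the loop without touching the network.
function scrape_embeds(array $urls, ?callable $fetch = null): array
{
    $fetch = $fetch ?? 'file_get_contents';
    $found = array();
    foreach ($urls as $url) {
        $html = @$fetch($url);
        if ($html === false) {
            break; // assumption: a failed fetch means no more pages
        }
        $found = array_merge($found, extract_embed_srcs($html));
    }
    return array_values(array_unique($found));
}
```

Usage would be something like scrape_embeds(build_page_urls($VideoSite, 8)). Note that this fetches the pages one after another rather than truly at the same time; if the requests need to run in parallel, PHP's curl_multi_* functions would be the place to look.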

I really need to get this working as efficiently as possible for our site and our mods. I appreciate any help, thank you!


