Does this mean that DOM is not working?



jeddi
10-23-2009, 10:04 AM
Hi,

I am new to using the DOMDocument() function set
and am practicing on a website that I know.

This is my script, but I am not getting any output.

I altered the script to show the contents of $dom
and I find that it is empty??



$target_url = "http://www.expert-world.com/";
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

echo "<br>Starting<br>Target_url: $target_url";

// make the cURL request to $target_url

$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$page= curl_exec($ch);
if (!$page) {
echo "<br />cURL error number:" .curl_errno($ch);
echo "<br />cURL error:" . curl_error($ch);
exit;
}

//echo "<br>Page: $page"; THIS echoed the page fine when uncommented.

// parse the html into a DOMDocument
$dom = new DOMDocument();
@$dom->loadHTML($page);

echo "<br>Dom: $dom";

// grab all the links on the page
$xpath = new DOMXPath($dom);

echo "<br>Xpath: $xpath";

?>

The result:

Starting
Target_url: http://www.expert-world.com/

Shouldn't I be seeing something more?

Does this mean that my php5.2 is not equipped with DOMDocument() ?

Fumigator
10-23-2009, 04:05 PM
$dom is an object, not a string. You can't just echo $dom and see what's in it; you have to use the saveHTML() method of the DOMDocument object.

http://us3.php.net/manual/en/domdocument.savehtml.php
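
To illustrate the point above, here is a minimal sketch of dumping a parsed document with saveHTML(). The $page string is a made-up stand-in for whatever curl_exec() returned, and libxml_use_internal_errors() is used here in place of the @ operator; both are assumptions for the example, not part of the original script.

```php
<?php
// Hypothetical stand-in for the page fetched with cURL in the script above.
$page = '<html><body><p>Hello</p><a href="/about.html">About</a></body></html>';

$dom = new DOMDocument();

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
// loadHTML() parses a string of markup into the document.
$dom->loadHTML($page);
libxml_clear_errors();

// saveHTML() serializes the document back into a string that echo can print.
$html = $dom->saveHTML();

// Escape the markup so the browser shows the source rather than rendering it.
echo htmlspecialchars($html);
```

The same applies to $xpath: a DOMXPath object is not printable either, so it would need a query whose results are then printed.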

jeddi
10-23-2009, 05:58 PM
OK,

I have added the extra line:



$target_url = "http://www.support-focus.com/";
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

echo "<br>Starting<br>Target_url: $target_url";



// make the cURL request to $target_url
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$page= curl_exec($ch);
if (!$page) {
echo "<br />cURL error number:" .curl_errno($ch);
echo "<br />cURL error:" . curl_error($ch);
exit;
}

//echo "<br>Page: $page";

// parse the html into a DOMDocument
$dom = new DOMDocument();
@$dom->loadHTML($page);
echo "<br>Dom: $dom";



// grab all the links on the page
$xpath = new DOMXPath($dom);

echo "<br>Xpath: $xpath";




I also changed the url to: http://www.support-focus.com/

Still I only get:

Starting
Target_url: http://www.support-focus.com/


OOPS - SORRY, IGNORE THIS.
I WILL ADD SOME MORE CODE AFTER I READ A BIT MORE

jeddi
10-23-2009, 06:42 PM
Ok, I have read some more and tried a few things but I
can not get this working.

I am trying to use the DOM to get a list of links
and this is my code:




<?php
/*
* Get links
*/

$target_url = "http://www.support-focus.com/";
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';

echo "<br>Starting<br>Target_url: $target_url";

$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_URL,$target_url);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$page= curl_exec($ch);
if (!$page) {
echo "<br />cURL error number:" .curl_errno($ch);
echo "<br />cURL error:" . curl_error($ch);
exit;
}

//echo "<br>Page: $page";

// parse the html into a DOMDocument
$dom = new DOMDocument();
$doc->loadHTMLFile($page);

echo $doc->saveHTML();

$params = $doc->getElementsByTagName('a'); // Find the a hrefs
$k=0;
foreach ($params as $param) //go to each link1 by 1
{
echo "Link Attribute :-> ".$params->item($k)->getAttribute('href')."<br>"; //get a

$k++;

}
?>

I have looked over it a couple of times and tried different things
but I get no output on the screen :(

(even the Starting<br>Target_url: does not display )

Can anyone help me see what I have done wrong ?
Thanks

jeddi
10-23-2009, 06:58 PM
OK I am missing the DOM class

Fatal error: Class 'DOMDocument' not found i

I thought this was part of php 5.

In fact the manual says so:

There is no installation needed to use these functions; they are part of the PHP core.
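
The manual's statement describes a default build; a distribution can still compile PHP without DOM or ship it as a separate package. A quick way to see what a given server has is a small check script like the sketch below (the variable names are made up for the example).

```php
<?php
// Report whether the DOM extension is available to this PHP build.
$hasDomClass     = class_exists('DOMDocument');
$hasDomExtension = extension_loaded('dom');

echo 'PHP version: ' . PHP_VERSION . "\n";
echo 'DOMDocument class: ' . ($hasDomClass ? 'available' : 'missing') . "\n";
echo 'dom extension: ' . ($hasDomExtension ? 'loaded' : 'not loaded') . "\n";
```

If both lines report the class as missing, no change to the script itself will help; the extension has to be installed or enabled on the server.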

Fumigator
10-23-2009, 10:42 PM
$dom = new DOMDocument();
$doc->loadHTMLFile($page);

echo $doc->saveHTML();

You managed to float from using $dom to using $doc. I suggest you stick with one or the other :p
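
For reference, a sketch of the link loop with a single variable name throughout. It also swaps loadHTMLFile() for loadHTML(), on the assumption that $page holds the markup returned by curl_exec() rather than a file path or URL. The sample markup and the $links array are made up for the example.

```php
<?php
// Hypothetical sample markup standing in for the $page string from curl_exec().
$page = '<html><body><a href="http://example.com/one">One</a>'
      . '<a href="/two.html">Two</a></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // keep parse warnings out of the output
$dom->loadHTML($page);              // loadHTML() takes markup; loadHTMLFile() takes a path or URL
libxml_clear_errors();

// Collect the href of every <a> element, in document order.
$links = array();
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

foreach ($links as $href) {
    echo 'Link Attribute :-> ' . htmlspecialchars($href) . "<br>\n";
}
```

Iterating the node list directly means the separate $k counter and item($k) lookup are not needed.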

jeddi
10-24-2009, 06:12 AM
Well spotted :)

I fixed that, but alas my DOMdoc is still not working.

I ran this command on my server:
rpm -qa | grep php

and got:

php-mcrypt-5.1.6-15.el5.centos.1
wbm-php-pear-1.5-1
php-pdo-5.1.6-23.el5
php-snmp-5.1.6-23.el5
php-devel-5.1.6-23.el5
php-gd-5.1.6-23.el5
php-pear-1.4.9-4
php-common-5.1.6-23.el5
php-cli-5.1.6-23.el5
php-mysql-5.1.6-23.el5
php-pgsql-5.1.6-23.el5
php-mbstring-5.1.6-23.el5
php-odbc-5.1.6-23.el5
php-imap-5.1.6-23.el5
php-5.1.6-23.el5
php-ldap-5.1.6-23.el5
php-xmlrpc-5.1.6-23.el5

The last one in the list looks like the one -
so that means that it should be running ?

Is there something in the php.ini file that I need to turn on ?

You can see the result on this url:

Curl - DOM test (http://www.my-toll-gate.com/curl_5.php)

Any ideas ?
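
One thing worth checking against the package list above: every package is a 5.1.6 build rather than 5.2, and php-xmlrpc is the XML-RPC extension, not DOM. On Red Hat style distributions the DOM extension is usually shipped in a separate php-xml package, which does not appear in the list; treat that as an assumption to verify on the server in question. The sketch below prints what the running PHP actually loaded so it can be compared with the rpm output.

```php
<?php
// List what this PHP build actually loaded, to compare against the rpm output.
$extensions = get_loaded_extensions();
sort($extensions);

echo 'PHP version: ' . PHP_VERSION . "\n";

// php_ini_loaded_file() only exists from PHP 5.2.4 on, so guard the call.
if (function_exists('php_ini_loaded_file')) {
    $ini = php_ini_loaded_file();
    echo 'Loaded php.ini: ' . ($ini ? $ini : '(none)') . "\n";
}

echo 'Loaded extensions: ' . implode(', ', $extensions) . "\n";
echo 'dom present: ' . (in_array('dom', $extensions) ? 'yes' : 'no') . "\n";
```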


