Using PHP SimpleXML: some help needed



jeddi
10-10-2009, 12:24 AM
Hi,

I am looking at a download from clickbank and I notice that
it has two files: a very small one suffixed with .dtd, which I
list below, and a huge file (26 Mb) suffixed with .xml

Here is the .dtd


<!ELEMENT Catalog ( Category* ) >
<!ELEMENT Category ( Name, Site*, Category* ) >
<!ELEMENT Commission ( #PCDATA ) >
<!ELEMENT Description ( #PCDATA ) >
<!ELEMENT EarnedPerSale ( #PCDATA ) >
<!ELEMENT TotalEarningsPerSale ( #PCDATA ) >
<!ELEMENT TotalRebillAmt ( #PCDATA ) >
<!ELEMENT HasRecurringProducts ( #PCDATA ) >
<!ELEMENT Gravity ( #PCDATA ) >
<!ELEMENT Id ( #PCDATA ) >
<!ELEMENT Name ( #PCDATA ) >
<!ELEMENT PercentPerSale ( #PCDATA ) >
<!ELEMENT PopularityRank ( #PCDATA ) >
<!ELEMENT Referred ( #PCDATA ) >
<!ELEMENT Site ( Commission? | Description+ | EarnedPerSale? | TotalEarningsPerSale? | TotalRebillAmt? | Gravity? | Id+ | PercentPerSale? | PopularityRank+ | Referred? | Title+ | HasRecurringProducts )* >
<!ELEMENT Title ( #PCDATA ) >


And here are the first few lines of the .xml file.


<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE Catalog SYSTEM "marketplace_feed_v1.dtd">
<Catalog>
<Category>
<Name>Business to Business</Name>
<Site>
<Id>REGEASY</Id>
<PopularityRank>1</PopularityRank>
<Title><![CDATA[Registry Easy - #1 Converting Registry Cleaner & System Optimizer.]]></Title>
<Description><![CDATA[Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.]]></Description>
<HasRecurringProducts>false</HasRecurringProducts>
<Gravity>226.333</Gravity>
<EarnedPerSale>31.7204</EarnedPerSale>
<PercentPerSale>75.0</PercentPerSale>
<TotalEarningsPerSale>31.7204</TotalEarningsPerSale>
<TotalRebillAmt>0.0</TotalRebillAmt>
<Referred>68.0</Referred>
<Commission>75</Commission>
</Site>
<Site>
<Id>BRYXEN4</Id>
<PopularityRank>2</PopularityRank>
<Title><![CDATA[Keyword Elite 2.0: The New Generation Of Keyword Research Software!]]></Title>
<Description><![CDATA[Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.]]></Description>
<HasRecurringProducts>true</HasRecurringProducts>
<Gravity>229.6</Gravity>
<EarnedPerSale>65.1052</EarnedPerSale>
<PercentPerSale>48.0</PercentPerSale>
<TotalEarningsPerSale>74.1738</TotalEarningsPerSale>
<TotalRebillAmt>15.2186</TotalRebillAmt>
<Referred>79.0</Referred>
<Commission>50</Commission>
</Site>


OK - so that shows the header info and the first two records of data.

Now, the header refers to the .dtd file.

I could use the info in the .dtd file to create a table
with columns (fields) as it states.

Or I could just create the table structure from looking at the first few
records in the xml file that I have shown.

Once I have done that, I guess that I write a php script
to open the file, step through each row, and pull out the contents found between the tags.

As it finds each tag it can locate the contents and update the table.

So:

$CB_file = file('clickbank.xml'); // one array element per line of the file

for ($i = 0; $i < count($CB_file); $i++) {
    $arrayOfLine = explode('???', $CB_file[$i]);

    $sql = "UPDATE cbdb SET ????? = ??????";
    $result = mysql_query($sql) or die("could not update CBDB: " . mysql_error());
}


Yes, I know that I have a lot of gaps to fill in :o

But my question is: can this approach work with an
xml file of 28 Mb, and based on the files that I have,
can you please help me fill in the gaps?

PS I have JUST READ UP ON SIMPLEXML see below

Thanks for any input and help.
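
On the file-size question: SimpleXML parses the whole document into memory, which may well be workable at this size if PHP's memory limit allows it. If it is not, one alternative is to stream the file with XMLReader and hand each <Site> record to SimpleXML on its own. The following is a minimal sketch under that assumption, not a tested importer; the inline sample stands in for the real feed, and the $rows array exists only so the result can be inspected.

```php
<?php
// Sketch only: stream <Site> records one at a time with XMLReader.
// For the real feed, $reader->open('clickbank.xml') would replace
// $reader->XML($sample).
$sample = <<<XML
<Catalog>
  <Category>
    <Name>Business to Business</Name>
    <Site>
      <Id>REGEASY</Id>
      <PopularityRank>1</PopularityRank>
    </Site>
    <Site>
      <Id>BRYXEN4</Id>
      <PopularityRank>2</PopularityRank>
    </Site>
  </Category>
</Catalog>
XML;

$reader = new XMLReader();
$reader->XML($sample);

$category = '';   // name of the most recently seen <Category>
$rows = [];       // collected only for inspection in this sketch

while ($reader->read()) {
    if ($reader->nodeType !== XMLReader::ELEMENT) {
        continue;
    }
    if ($reader->name === 'Name') {
        // In this DTD only <Category> has a <Name> child.
        $category = $reader->readString();
    } elseif ($reader->name === 'Site') {
        // Parse just this one record with SimpleXML.
        $site = new SimpleXMLElement($reader->readOuterXml());
        $rows[] = [$category, (string) $site->Id, (int) $site->PopularityRank];
    }
}
$reader->close();

foreach ($rows as $row) {
    echo implode(' | ', $row), "\n";
}
```

Only one <Site> element is held as a SimpleXML object at any time, so memory use should stay roughly constant regardless of how large the feed is.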

jeddi
10-10-2009, 02:10 PM
Hi,

Someone has suggested using "simplexml"

I have read up on it and as far as I can tell
this is what I need to do:



$CB_file = file('clickbank.xml');

$xmlstr ="<<<XML ".$CB_file." XML";

// ( do I have to *** line breaks at all ? )

// Then I continue to check validity:

$xmlObject = simplexml_load_string($xmlstr);

// not sure about how I go to this next line

$xml = new SimpleXMLElement($xmlstr);


After this I guess that I need a foreach loop to work through the
whole file ?

oesxyl
10-10-2009, 04:45 PM
$CB_file = file('clickbank.xml');

$CB_file is an array, not a string; read the documentation for file():
http://www.php.net/manual/en/function.file.php



$xmlstr ="<<<XML ".$CB_file." XML";

This does not do what you intend. "<<<XML" inside double quotes is just literal text, not heredoc syntax, and $CB_file is an array, not a string, so concatenating it gives the word "Array" rather than the file contents.

Use one of these if you want to build $xmlstr:


$xmlstr = join($CB_file);
$xmlstr = implode($CB_file); // same thing as previous
$xmlstr = file_get_contents('clickbank.xml');

Read the manual for join, implode and file_get_contents (the last one does it in a single step):
http://www.php.net/manual/en/function.join.php
http://www.php.net/manual/en/function.implode.php
http://www.php.net/manual/en/function.file-get-contents.php



// ( do I have to *** line breaks at all ? )

Only if it makes the file easier for you to read.



// Then I continue to check validity:

$xmlObject = simplexml_load_string($xmlstr);

There is also simplexml_load_file, which lets you skip the previous steps (they are unnecessary, in my opinion):
http://www.php.net/manual/en/function.simplexml-load-file.php
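
As a rough illustration of the foreach loop asked about above, here is a minimal sketch that walks Category and then Site. It assumes the structure shown in the first post; the inline sample is a cut-down stand-in for the real file, and $ids is only there to make the result easy to check.

```php
<?php
// Sketch: walk Category -> Site with SimpleXML.
// For the real feed, simplexml_load_file('clickbank.xml') would replace
// simplexml_load_string($sample).
$sample = <<<XML
<Catalog>
  <Category>
    <Name>Business to Business</Name>
    <Site>
      <Id>REGEASY</Id>
      <Gravity>226.333</Gravity>
    </Site>
    <Site>
      <Id>BRYXEN4</Id>
      <Gravity>229.6</Gravity>
    </Site>
  </Category>
</Catalog>
XML;

$xml = simplexml_load_string($sample);
if ($xml === false) {
    exit("Could not parse the XML\n");
}

$ids = [];
foreach ($xml->Category as $category) {
    foreach ($category->Site as $site) {
        // Cast to string to get the plain text of the element.
        $ids[] = (string) $site->Id;
        echo $category->Name, ': ', $site->Id,
             ' (gravity ', (float) $site->Gravity, ")\n";
    }
}
```

Note that the DTD allows a Category to contain further Category elements, so a complete importer would probably need to recurse; this sketch only handles the top level.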



// not sure about how I go to this next line

$xml = new SimpleXMLElement($xmlstr);

The document is already loaded; see the $xmlObject line above.

Always keep the manual close at hand, that's important.

best regards

jeddi
10-10-2009, 06:44 PM
Thanks, I read the manual and a couple of tutes.

I now have something close to working :)

But I get an error when trying to write to the database.
It may be because I need to convert the data?


$sql = "INSERT INTO `clickbank` (cat,id,pop)
VALUES ('$category->Name','$site->Id','$site->PopularityRank')";


I noticed in the tute it said something that might apply:

It gave this example:



$xml = 'test_file.xml';
$xml = simplexml_load_file($xml);
$value_to_store = (string) $xml->make[0]->model;
// This converts the "Mustang" SimpleXMLElement object to a string, making it disk storable.


Does this mean that I have to do this:

$Db_id = (string) $xml->Category->$site->Id;
for each field?

And is this enough? Or do I need to add counters to keep track of which row is being processed and then use something like:


$Db_id = (string) $xml->Category[$cnt1]->$site->Id;


The error message I get from the script is :


could not execute INSERT set up clients.You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'desc,recurr,grav,earn,percent,totearn,rebill,refer,comm) VALUES ' at line 1

Your advice is appreciated :)

oesxyl
10-10-2009, 07:24 PM
Thanks, I read the manual and a couple of tutes.

I now have something close to working :)

But I get and error on trying to write to the database:
it may be because I need to convert the data ?


$sql = "INSERT INTO `clickbank` (cat,id,pop)
VALUES ('$category->Name','$site->Id','$site->PopularityRank')";



Use var_dump to see what's in each variable. I guess each one is an array, so you must extract only the values you need.



And is this enough? Or do I need to add counters to keep track of which row is being processed and then use something like:


$Db_id = (string) $xml->Category[$cnt1]->$site->Id;

Yes, something like that; $cnt1 is the position of the Category node in the tree.
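
A small sketch of the casting and indexing discussed here, assuming the structure from the first post. One thing to watch: writing $xml->Category->$site->Id makes PHP treat $site as the name of a property; with numeric indexes the path is Category[...]->Site[...]->Id. The variable names below are illustrative only.

```php
<?php
// Sketch: what SimpleXML hands back, and how to index into it.
$xml = simplexml_load_string(
    '<Catalog><Category><Name>Business to Business</Name>'
    . '<Site><Id>REGEASY</Id><PopularityRank>1</PopularityRank></Site>'
    . '<Site><Id>BRYXEN4</Id><PopularityRank>2</PopularityRank></Site>'
    . '</Category></Catalog>'
);

// Each child is a SimpleXMLElement object, not an array or a string.
var_dump($xml->Category[0]->Site[1]->Id instanceof SimpleXMLElement); // bool(true)

// Cast to get plain values that can be stored or compared.
$dbId  = (string) $xml->Category[0]->Site[1]->Id;
$dbPop = (int) $xml->Category[0]->Site[1]->PopularityRank;

// count() reports how many <Site> children the category has.
$siteCount = count($xml->Category[0]->Site);

echo $dbId, ' ', $dbPop, ' ', $siteCount, "\n";
```

With foreach the counters are not needed at all, since each pass already gives you the current Category or Site object.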



The error message I get from the script is :

Your advice is appreciated :)

Check my assumption about these being arrays, and also make sure the SQL query is valid. To see the query before you run it, add something like this right after the $sql line:


print '<pre>'.$sql.'</pre>';


best regards

jeddi
10-10-2009, 08:49 PM
OK - that was a good idea

This is the output I get:

INSERT INTO clickbank (cat,id,pop,title,desc,recurr,grav,earn,percent,totearn,rebill,refer,comm)
VALUES
('Business to Business','REGEASY','1','Registry Easy - #1 Converting Registry Cleaner & System Optimizer.','Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! ww.cheesesoft.com/affiliates/registry-easy/.','false','226.333','31.7204','75.0','31.7204','0.0','68.0','75')

Looks like the values are getting through fine.
Could the problem be that grav is set up in the table as double(5,2)? Maybe the 31.7204 doesn't fit.

Actually I don't understand that number: is it supposed to be 31720.40 dollars or 317,204 dollars? Or only 31.72 dollars?

Or maybe it is something in that long description ?

PS I had to edit the url that was in the desc because it got messed up in this forum post.

it was
...Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.'

oesxyl
10-10-2009, 09:17 PM
from http://dev.mysql.com/doc/refman/5.0/en/numeric-type-overview.html:


DOUBLE[(M,D)]

M is the total number of digits and D is the number of digits following the decimal point. If M and D are omitted, values are stored to the limits allowed by the hardware.


I don't know. Judging by the error message in your previous post, the problem seems to be with the value for the cat field or with the opening bracket '(' after VALUES.
Check the column types and whether the values match them.
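
To make the DOUBLE(M,D) definition concrete: 31.7204 is an ordinary decimal number, roughly 31.72, and DOUBLE(5,2) leaves three digits before the point, so it should fit once rounded to two decimals. The sketch below just does that arithmetic in PHP; fitsDouble() is a hypothetical helper written for this illustration, not a MySQL or PHP function.

```php
<?php
// Sketch: check a value against a DOUBLE(M,D) column definition.
function fitsDouble(float $value, int $m, int $d): bool
{
    // M total digits with D after the point leaves M - D before it,
    // so the largest storable magnitude is 10^(M-D) - 10^(-D).
    $max = pow(10, $m - $d) - pow(10, -$d);
    return abs(round($value, $d)) <= $max;
}

$earned = 31.7204;

// The value as it would look with two decimal places.
echo number_format($earned, 2, '.', ''), "\n";

var_dump(fitsDouble($earned, 5, 2));    // within three integer digits
var_dump(fitsDouble(31720.40, 5, 2));   // five integer digits: too wide
```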

best regards

jeddi
10-10-2009, 09:56 PM
OK - think I have found the problem

I think it was the field name "desc", because DESC is a reserved word in SQL (it is used in ORDER BY).

It was a guess but when I changed the name to "descrip" the first three rows get processed OK
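
Renaming the column is one way around it. Another, assuming you wanted to keep the name "desc", is to wrap column names in backticks so MySQL reads them as identifiers. A minimal sketch follows; quoteIdentifier() is a hypothetical helper, not part of any MySQL extension, and the column list is shortened for the example.

```php
<?php
// Sketch: backtick-quote every column name so reserved words such as
// `desc` are read as identifiers rather than keywords.
function quoteIdentifier(string $name): string
{
    // A literal backtick inside a name is escaped by doubling it.
    return '`' . str_replace('`', '``', $name) . '`';
}

$columns = ['cat', 'id', 'pop', 'title', 'desc'];
$columnList = implode(', ', array_map('quoteIdentifier', $columns));

// The ? marks are placeholders for values supplied separately.
$sql = "INSERT INTO `clickbank` ($columnList) VALUES (?, ?, ?, ?, ?)";
echo $sql, "\n";
```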

This is my table structure now:



$sql = "CREATE TABLE `clickbank` (
`cb_id` smallint(8) NOT NULL AUTO_INCREMENT,
`id` varchar(10) NOT NULL default 'none',
`cat` varchar(50) NOT NULL default 'none',
`pop` smallint(8) NOT NULL default '1',
`title` varchar(100) NOT NULL default 'n',
`descrip` varchar(300) NOT NULL default 'n',
`recurr` char(1) NOT NULL default 'n',
`grav` double(10,2) NOT NULL default '99.99',
`earn` double(10,2) NOT NULL default '99.99',
`percent` double(5,2) NOT NULL default '99.99',
`totearn` double(10,2) NOT NULL default '99.99',
`rebill` double(10,2) NOT NULL default '99.99',
`refer` double(10,2) NOT NULL default '99.99',
`comm` double(5,2) NOT NULL default '99.99',
PRIMARY KEY (`cb_id`)
)";


I still have a problem, and it seems to be caused by single quotes in the description data.

This is my output:


0) Business to Business
0) REGEASY1Registry Easy - #1 Converting Registry Cleaner & System Optimizer.Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.false226.33331.720475.031.72040.068.075

INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn, rebill, refer, comm )
VALUES ( 'Business to Business', 'REGEASY', '1', 'Registry Easy - #1 Converting Registry Cleaner & System Optimizer.', 'Stunning Conversions With Extremely Low Refund Rate. Dedicated Affiliate Support. Extraordinary Customer Service. Any Kind Of Conversion Tracking & Multiple Landing Pages. Talk To Us! Http://www.cheesesoft.com/affiliates/registry-easy/.', 'false',
'226.333', '31.7204', '75.0', '31.7204', '0.0', '68.0', '75' )

1) BRYXEN42Keyword Elite 2.0: The New Generation Of Keyword Research Software!Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.true229.665.105248.074.173815.218679.050

INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn, rebill, refer, comm )
VALUES ( 'Business to Business', 'BRYXEN4', '2', 'Keyword Elite 2.0: The New Generation Of Keyword Research Software!', 'Dominate Adwords. Dominate Niche Marketing. Dominate The Search Engines. Go Here For Tons Of Affiliate Tools: Http://www.keywordelite.com/affiliate/.', 'true', '229.6', '65.1052', '48.0', '74.1738', '15.2186', '79.0', '50' )

2) MAVERICKCO3Maverick Coaching - Cell Phone Cash.Cell Phone Cash: A Brand New Course By Maverick Coaching Members Are Making At Least $279/Day With Cell Phones! Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: Http://cellphonecash.maverickcoaching.com/affiliates.php.true674.45912.982850.025.546212.563486.050

INSERT INTO clickbank ( cat, id, pop, title, descrip, recurr, grav, earn, percent, totearn,rebill, refer, comm )
VALUES ( 'Business to Business', 'MAVERICKCO', '3', 'Maverick Coaching - Cell Phone Cash.', 'Cell Phone Cash: A Brand New Course By Maverick Coaching Members Are Making At Least $279/Day With Cell Phones! Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: ttp://cellphonecash.maverickcoaching.com/affiliates.php.', 'true',
'674.459', '12.9828', '50.0', '25.5462', '12.5634', '86.0', '50' )

could not execute INSERT set up clients.You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: Http://cellph' at line 4

oesxyl
10-10-2009, 11:16 PM
'Make Money Or Its Free' Guarantee, 24/7 Phone Support! Affiliates: ttp://cellphonecash.maverickcoaching.com/affiliates.php.'

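The unescaped ' inside the description is the problem: it ends the SQL string early. Escape the value before building the query: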
http://www.php.net/manual/en/function.mysql-escape-string.php
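
For anyone reading this later: mysql_escape_string was already deprecated, and the whole mysql_* extension was removed in PHP 7. A prepared statement avoids the quoting problem altogether. The sketch below uses PDO with an in-memory SQLite database purely so that it runs without a server; that substitution and the two-column table are assumptions made for the example, and with MySQL only the DSN passed to PDO would differ.

```php
<?php
// Sketch: let the driver handle quoting via a prepared statement.
if (!class_exists('PDO') || !in_array('sqlite', PDO::getAvailableDrivers(), true)) {
    exit("PDO SQLite driver not available\n");
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE clickbank (id TEXT, descrip TEXT)');

$descrip = "Customers Get Our 'Make Money Or Its Free' Guarantee, 24/7 Phone Support!";

// Placeholders keep the data out of the SQL text, so the apostrophes
// in $descrip cannot end the string literal early.
$stmt = $pdo->prepare('INSERT INTO clickbank (id, descrip) VALUES (:id, :descrip)');
$stmt->execute([':id' => 'MAVERICKCO', ':descrip' => $descrip]);

// Read the row back to confirm it was stored unchanged.
$stored = $pdo->query('SELECT descrip FROM clickbank')->fetchColumn();
echo $stored, "\n";
```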

best regards


