...

View Full Version : Why is my script writing 500,000 records ?



jeddi
01-24-2010, 08:07 PM
Hi,
I am trying to get an xml file into my mysql tables and I am
struggling with its structure.

This is how the xml looks

(Notice that it has two levels of category)


<?xml version="1.0" encoding="ISO-8859-1"?>
<Catalog>
<Category>
<Name>Business to Business</Name>
<Site>
<Id>PRODUCT01</Id>
<Popularity>18</Popularity>
<Title><![CDATA[A title here]]></Title>
<Description><![CDATA[Some words here]]></Description>
<Commission>75</Commission>
</Site>
<Category>
<Name>Education</Name>
<Site>
<Id>PRODUCT02</Id>
<Popularity>2</Popularity>
<Title><![CDATA[A title here]]></Title>
<Description><![CDATA[Some words here]]></Description>
<Commission>75</Commission>
</Site>
</Category>
<Category>
<Name>Publishing</Name>
<Site>
<Id>PRODUCT03</Id>
<Popularity>6</Popularity>
<Title><![CDATA[A title here]]></Title>
<Description><![CDATA[Some words here]]></Description>
<Commission>75</Commission>
</Site>
</Category>
</Category>
<Category>
<Name>Society &amp; Culture</Name>
<Site>
etc, etc



Now, I thought I had it it sorted, but my little script
managed to created over 500,000 records in the table !!

I have two tables,
One table contains the product data, the other table contains which
categories the product is in. The reason the category table is that
the product maybe in 3 or 4 categories or sub-categories.

The script firts checks to see if the product already exists, if it
does then only the category table is updated with the category
data.

I have done two loops, one to work through the first level
categories and the second to handle the sub-categories. ıt all
looks logical to me, but it is not working :(

Could do with some help :)


This is my code:




$xml = simplexml_load_file($file);

$cnt = 0;

foreach ($xml->xpath('/Catalog/Category') as $top_cat) {

foreach ($top_cat->xpath('Site') as $top_site) {
$sql_ck = "SELECT cb_id FROM cb_update WHERE id = '$top_site->Id' AND day_no = '$this_day'";
$result_ck = mysql_query($sql_ck)
or die("could not FIND ID in cb_update.". mysql_error());

$num = mysql_num_rows($result_ck);

if($num == 0) { // - so the product is not yet recorded.

$title = mysql_real_escape_string($top_site->Title);
$descrip = mysql_real_escape_string($top_site->Description);

$sql_ins = "INSERT INTO cb_update ( cb_date, day_no, id, title, descrip, comm )
VALUES
( '$today', '$this_day','$top_site->Id','$title', '$descrip', '$top_site->Commission' )";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to clicky.". mysql_error());


// Also insert the category

$sql_ins = "INSERT INTO cb_cat_update (id_cat, cat, sub_cat, pop, day_no_cat)
VALUES ('$top_site->Id', '$top_cat->Name', ' ', '$top_site->Popularity', '$this_day')";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to cb_cat_update.". mysql_error());


}
else { // So the product IS recorded, this must be another occurance in a different category - so we just record the extra category.

$sql_ins = "INSERT INTO cb_cat_update (id_cat, cat, sub_cat, pop, day_no_cat)
VALUES ('$top_site->Id', '$top_cat->Name', ' ', '$top_site->Popularity', '$this_day')";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to cl_cat.". mysql_error());

}
}

// Now lets do the usb-categories

foreach ($top_cat->xpath('Category') as $sub_cat) {

foreach ($sub_cat->xpath('Site') as $sub_site) {

$sql_ck = "SELECT cb_id FROM cb_update WHERE id = '$sub_site->Id' AND day_no = '$this_day'";
$result_ck = mysql_query($sql_ck)
or die("could not FIND ID in cb_update.". mysql_error());

$num = mysql_num_rows($result_ck);

if($num == 0) { // - so the product is not yet recorded.

$title = mysql_real_escape_string($sub_site->Title);
$descrip = mysql_real_escape_string($sub_site->Description);

$sql_ins = "INSERT INTO cb_update ( cb_date, day_no, id, title, descrip, comm )
VALUES
( '$today', '$this_day','$sub_site->Id','$title', '$descrip', '$sub_site->Commission' )";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to clicky.". mysql_error());


// Also insert the sub-category

$sql_ins = "INSERT INTO cb_cat_update (id_cat, cat, sub_cat, pop, day_no_cat)
VALUES ('$sub_site->Id', '$top_cat->Name', '$sub_cat->Name', '$sub_site->Popularity', '$this_day')";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to cb_cat_update.". mysql_error());

}
else { // So the product IS recorded, this must be another occurance in a different sub-category - so we just record the extra category and sub-category.

$sql_ins = "INSERT INTO cb_cat_update (id_cat, cat, sub_cat, pop, day_no_cat)
VALUES ('$sub_site->Id', '$top_cat->Name', '$sub_cat->Name', '$sub_site->Popularity', '$this_day')";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to cl_cat.". mysql_error());


}

$cnt++;

}
}
}

I am hoping that someone can see where I have gone wrong
with my script so that it created half a million records :confused:

It should only create about 15,000 records.

Would appreciated some help as I feel I am nearly there but
I have got stuck !!

Thanks :)

Dormilich
01-25-2010, 01:36 PM
what happens, if you comment out the sub-category block?

jeddi
01-26-2010, 09:12 AM
I reduced it to this.


foreach ($xml->xpath('/Catalog/Category') as $top_cat) {

foreach ($top_cat->xpath('Site') as $top_site) {
$sql_ck = "SELECT cb_id FROM cb_update WHERE id = '$top_site->Id' AND day_no = '$this_day'";
$result_ck = mysql_query($sql_ck)
or die("could not FIND ID in cb_update.". mysql_error());

$num = mysql_num_rows($result_ck);

if($num == 0) { // - so the product is not yet recorded.

$title = mysql_real_escape_string($top_site->Title);
$descrip = mysql_real_escape_string($top_site->Description);

$sql_ins = "INSERT INTO cb_update ( cb_date, day_no, id, title, descrip, comm )
VALUES
( '$today', '$this_day','$top_site->Id','$title', '$descrip', '$top_site->Commission' )";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to clicky.". mysql_error());


// Also insert the category

$sql_ins = "INSERT INTO cb_cat_update (id_cat, cat, sub_cat, pop, day_no_cat)
VALUES ('$top_site->Id', '$top_cat->Name', ' ', '$top_site->Popularity', '$this_day')";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to cb_cat_update.". mysql_error());


}
else { // So the product IS recorded, this must be another occurance in a different category - so we just record the extra category.

$sql_ins = "INSERT INTO cb_cat_update (id_cat, cat, sub_cat, pop, day_no_cat)
VALUES ('$top_site->Id', '$top_cat->Name', ' ', '$top_site->Popularity', '$this_day')";

$result_ins = mysql_query($sql_ins) or die("could not execute INSERT to cl_cat.". mysql_error());

}
}

$cnt++;
}


I only got 9750 rows written to my main table
I think it should be a lot more

and I got 14,000 rows written to my category table

Are you saying that this would record the sub-categories
as well ?

Sorry -* but I am a bit mixed up with this :confused:


.

Dormilich
01-26-2010, 01:05 PM
I just want to know, which of the loops causes the 500,000 entries

jeddi
01-26-2010, 01:21 PM
Well
I took out the sub-cat one and the records reduced to
more "normal" amounts

so I guess it must be the sub_cat loop that has an error somewhere
... but I can not see where :(



I think it is something to do with where I am positioning the
foreach loops.

Should the Sub-cat ones start after the <category>
ones have finished, or should they be run inside the <category>
foreach loop (first or second ?) ?


.

jeddi
01-26-2010, 02:14 PM
Hi,
I have put the sub-cat code back in.

So my codes runs:



foreach ($xml->xpath('/Catalog/Category') as $top_cat) {

foreach ($top_cat->xpath('Site') as $top_site) {

// Code block for top_cat

}
}

foreach ($top_cat->xpath('Category') as $sub_cat) {

foreach ($sub_cat->xpath('Site') as $sub_site) {

// Code block for sub_cat

}
}


Doing it this way creates some more records
and looking at them, I see that I am getting
the sub-cat sites for the last <category> only

I am now going to try putting the sub-cat bloclk
inside the first top-cat foreach loop

...



.

jeddi
01-26-2010, 02:29 PM
OK

Done that,

so my code runs:


foreach ($xml->xpath('/Catalog/Category') as $top_cat) {

foreach ($top_cat->xpath('Site') as $top_site) {

// Code block for top_cat

}

foreach ($top_cat->xpath('Category') as $sub_cat) {

foreach ($sub_cat->xpath('Site') as $sub_site) {

// Code block for sub_cat

}
}
}


Now there are less in the main file
( down from 9730 to 7430 )

but there are more entires in the category table
- so I now have to check them all to see if this is the correct structure.

Any ideas ?

meleader2
01-26-2010, 07:55 PM
you'll probably get a better understanding how your code is traversing through your xml if you reduce your xml to just a 1 entry file, and run your script. i'd try that and see how much easier debugging will be :)



EZ Archive Ads Plugin for vBulletin Copyright 2006 Computer Help Forum