View Single Post
Old 11-17-2010, 02:35 PM   PM User | #6
Samhain13
Regular Coder

 
Samhain13's Avatar
 
Join Date: Aug 2008
Location: Pilipinas
Posts: 165
Thanks: 4
Thanked 18 Times in 18 Posts
Samhain13 is on a distinguished road
Woohoo, now that's an example! Anyway, it was a good exercise.

First things first: in order to keep my sanity, I needed to write a helper class that provides a few methods for manipulating the XML (and the nodes) that we have to build. I've attached the script as a TXT file, you should replace the extension to .py if you want to try it.

I also attached a sample usage script (just in case I get disconnected and not be able to get back right away). It has lots and lots of comments, which I'll just be repeating here in parts. Like the helper script, just change the extension to .py.

Now, on using the helper script.

We'll import our helper class, instaniate it, create our basic XML structure:
Code:
from xs2ml import xs2ml

# xs2ml takes the XLS filename as an argument on __init__.
x = xs2ml("sample.xls")

# Give this method the root node name and it's attribute-values, if any.
x.create_root_node("PIES", xmlns="http://www.aftermarket.org/eCommerce/Pies")

# Create our basic node structure. This method takes in the target parent node
# and the node names that need to be created under it. We'll get a list in return.
root_nodes = x.add_nodes(x.dom_.firstChild, "Header", "Items", "Trailer")
Start filling-up the Items node. This is a bit complicated because the values are coming from multiple worksheets, and some values are text nodes while others are attribute-value nodes. Anyway, we'll just use the `Main Item Data Tab` as the basis for our row count. Also, we'll start at row[2] because we know that's were the real data starts.

Code:
for i in range(2, x.sheets["Main Item Data Tab"].nrows):
    # Create the Item node in Items. We'll modify this node as we go along.
    item = x.add_nodes(root_nodes[1], "Item")[0]
    
    # Create an attribute called MaintenanceType. We know that the data for
    # this node is in the Digital Assets Tab, column[1].
    x.cell_to_node(item, "Digital Assets Tab", col=1, row=i, 
                   attr_name="MaintenanceType")
    
    # It seems the MaintenanceType value is reusable, so let's save it.
    MaintenanceType = item.attributes["MaintenanceType"].value
    
    # Create the ItemLevelGTIN node. We know the value is in column[8] of
    # Main Item Data. We also know about its GTINQualifier attribute in col[9].
    ItemLevelGTIN = x.add_nodes(item, "ItemLevelGTIN")[0]
    x.cell_to_node(ItemLevelGTIN, "Main Item Data Tab", col=8, row=i)
    x.cell_to_node(ItemLevelGTIN, "Main Item Data Tab", col=9, row=i,
                   attr_name="GTINQualifier")
    
    # This is going to be a long for loop... but you get the idea.
    
    # One last example, Descriptions nodes:
    Descriptions = x.add_nodes(item, "Descriptions")[0]
    d_nodes = x.add_nodes(Descriptions, "Description", "Description")
    
    # We know what DescriptionCode to put in. LanguageCode, I don't know
    # where to find it but either use the setAttribute way or the cell_to_node
    # way if you know the sheet and cell the value is in.
    d_nodes[0].setAttribute("DescriptionCode", "DES")
    d_nodes[1].setAttribute("DescriptionCode", "SHO")
    
    # MaintenanceType, we'll just get from item since we know that already.
    d_nodes[0].setAttribute("MaintenanceType", MaintenanceType)
    d_nodes[1].setAttribute("MaintenanceType", MaintenanceType)
    
    # Short and long descriptions, according to the XLS file.
    x.cell_to_node(d_nodes[0], "Main Item Data Tab", col=16, row=i)
    x.cell_to_node(d_nodes[1], "Main Item Data Tab", col=15, row=i)
The cell_to_node method, by the way, takes the following arguments:
parent node, worksheet name, column, row, and attribute name (optional, None is default)

cell_to_node reads the cell value of column:row. By default, this value is turned into a text node and appended to the parent node, which results in something like <parent>value</parent>. But if the attribute name is given, an attribute is created within parent and given the cell value, like: <parent attribute_name="value" />

---

Composing the Header and Trailer is more straight forward because their child node values come from a single worksheet and we can easily where the values are coming from. We'll just use some dictionaries here that contain each element's respective column and row. Then we'll feed these maps to a convenience method for generating simple, mapped nodes.

Code:
# Header.
h_map = {
    "PIESVersion": {"col": 1, "row": 2},
    "ParentAAIAID": {"col": 1, "row": 3},
    "BrandOwnerAAIAID": {"col": 1, "row": 4},
    "TechnicalContact": {"col": 1, "row": 5},
    "ContactEmail": {"col": 1, "row": 6},
    "BlanketEffectiveDate": {"col": 1, "row": 7},
}

# Trailer.
t_map = {
    "ItemCount": {"col": 1, "row": 10},
    "TransactionDate": {"col": 1, "row": 11},
}

x.map_simple(root_nodes[0], "Header and Trailer Tab", h_map)
x.map_simple(root_nodes[1], "Header and Trailer Tab", t_map)
The map_simple method takes these arguments:
parent node, worksheet name, and a map that we can feed to cell_to_node as **kargs.

---

Finally, we can choose to save the resulting XML to file or to print it on standard out. Then we'll free our resources (unlink the XML, etc.).

Code:
# x.save_to_file("filename.xml", overwrite=False)
print x.dom_.toprettyxml()
x.done()
Sorry if I didn't make too much sense but I hope this helps you even if it's only a bit.
Attached Files
File Type: txt xs2ml.txt (6.0 KB, 231 views)
File Type: txt xs2ml_usage.txt (3.4 KB, 179 views)
__________________
I am a Man of Truth. I am a Free Human Person. I am a Peacemaker.
** Independent Multimedia Artist in Pasig **

Last edited by Samhain13; 11-17-2010 at 03:10 PM..
Samhain13 is offline   Reply With Quote
Users who have thanked Samhain13 for this post:
grazopper (11-17-2010)