Parser with PHP
I am looking for a tool to parse XML files with PHP without having to configure the server or install a parser on it.
I started writing my own parser as a PHP class. I can read the content of elements and change it.
It's like an asynchronous database connection based on an XML string.
It works fine so far, but there are too many features it doesn't support. For example, it doesn't recognize attributes yet...
Is there a PHP class or anything similar that handles this?
(If anyone is interested in developing this PHP parser with me... welcome!)
12-30-2002, 05:49 AM
There's the Simple API for XML (http://www.php.net/manual/en/ref.xml.php) - which doesn't require extensions that aren't installed by default.
For XSLT you do need to install a processor. Sablotron (http://www.gingerall.com/charlie/ga/xml/p_sab.xml) seems to work well - I have it on a Windows and Linux server, and both work fine.
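For anyone who hasn't used the expat-based functions from that manual page, here is a minimal sketch of the event-driven style. It also shows that attributes arrive as an array in the start-element handler. The function name collect_events, the $events list and the sample document are made up for illustration.

```php
<?php
// Sketch only: collect expat events (start tag, text, end tag) into a list.
// collect_events() and the sample XML are hypothetical, not from any library.
function collect_events(string $xml): array
{
    $events = [];
    $parser = xml_parser_create();
    // Keep tag names as written; by default expat upper-cases them.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        // Start tag: $attrs is an associative array of attribute => value.
        function ($p, $name, $attrs) use (&$events) {
            $events[] = ['start', $name, $attrs];
        },
        // End tag.
        function ($p, $name) use (&$events) {
            $events[] = ['end', $name];
        }
    );
    xml_set_character_data_handler(
        $parser,
        // Text between tags; whitespace-only chunks are skipped here.
        function ($p, $text) use (&$events) {
            if (trim($text) !== '') {
                $events[] = ['text', $text];
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $events;
}

$events = collect_events('<book id="42"><title>XML</title></book>');
```

The parser only reports what it sees, in document order. Anything you want to keep (or change) has to be stored by your own handlers.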
Are the "standard" XML functions listed on that php.net page SAX?
Yes, I have worked with them, but I didn't know that this PHP XML library uses a SAX parser.
What is expat then?
This extension uses expat, which can be found at http://www.jclark.com/xml/. The Makefile that comes with expat does not build a library by default, you can use...
Is expat the SAX-compliant XML parser?
But I couldn't find any function or any other way to modify the XML data.
How do I do that?
(XSLT would be great too, I think - but I want to write an independent, or almost independent, script. And it's not my server, only my webspace.)
Thx very much.
12-30-2002, 07:55 AM
SAX is more like a generic than a specific word - it doesn't refer to the module (which is expat) but rather to the paradigm of "a simple API for XML".
I think ... perhaps it's me who's misunderstood.
I don't think SAX is that general. SAX is an event-oriented API for XML developed on the XML-Dev mailing list (under David Megginson). The homepage URL is down.
A SAX parser is an XML parser that is SAX-compliant.
I have a list of some SAX parsers. They support Java, C++, VB and any COM-compliant language. I haven't found one for PHP so far.
But you could be right...
Those PHP functions are similar to the SAX samples I have (in Java and VB). Both are event-oriented.
Then there is DOM... DOM really is general: it is a W3C standard. But there are several DOM APIs, too.
I found so many interesting things... ;-)
But there doesn't seem to be any way to edit XML data from PHP... regardless of whether you install an XML SAX parser or something like it, or write your own...
Here is another great link:
* xml22 - search and edit XML files - This PHP4 code parses an XML document into a multidimensional array. The produced structure consists of numerical indexed elements representing the tags, each an array by itself. These arrays contain the name of the parsed tag, the level of the XML tree, an array of attributes--if there are any--and the global index to the parent element in the document as well as an array of indices of all immediate children. Enclosed CDATA are stored into a content field of the tag's array.
But I lost the link...
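To make that description concrete, here is a rough sketch of the kind of structure it describes, built with the expat functions. This is not the xml22 code (I haven't seen its source); the function name xml_to_nodes and the field names are assumptions chosen to mirror the description.

```php
<?php
// Sketch only, NOT the xml22 implementation: build a flat, numerically
// indexed list of elements. Each entry holds name, level, attributes,
// parent index, child indices and text content. Field names are assumed.
function xml_to_nodes(string $xml): array
{
    $nodes = [];  // flat list of element records
    $stack = [];  // indices of the elements that are currently open

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$nodes, &$stack) {
            $parent = $stack ? end($stack) : null;
            $index = count($nodes);
            $nodes[] = [
                'name'       => $name,
                'level'      => count($stack),
                'attributes' => $attrs,
                'parent'     => $parent,
                'children'   => [],
                'content'    => '',
            ];
            if ($parent !== null) {
                $nodes[$parent]['children'][] = $index;
            }
            $stack[] = $index;
        },
        function ($p, $name) use (&$stack) {
            array_pop($stack);
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $text) use (&$nodes, &$stack) {
            // Append text to the innermost open element.
            if ($stack) {
                $nodes[end($stack)]['content'] .= $text;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $nodes;
}

$nodes = xml_to_nodes('<list type="todo"><item>a</item><item>b</item></list>');
```

Because every element carries its parent index and its child indices, you can walk the tree in either direction without nested arrays.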
I think I found something...
Does this work, or is it necessary to install the Sablotron processor?
12-30-2002, 06:22 PM
Maybe this (http://www.ister.org/code/xml22/) is what you were looking for.
I've spent a couple of months playing around with the PHP expat methods (which I was thinking of as SAX ... thanks for probably clearing that up!) and doing the kind of thing you're talking about - turning the XML into a multi-dimensional PHP array for sorting and so on; I wrote this (http://www.mori.com/pubinfo/articles.phtml?cat=all) using that paradigm; it works very well, and I've been recommending it as a method to other people on this forum.
But ... well ... it's getting tedious; no validation; it doesn't parse the document in DOM order; to build the array you basically have to maintain global variables such as $currentTag and $currentAttributes as you move through the XML functions, just to remember where you are and what you're supposed to be including. Getting an array that truly represents the hierarchy of the original XML is nightmarishly difficult - I basically didn't bother, and built simple 2D arrays on a sort of $data["tag_name"] = "tag_data"; basis, and wrote the XML doc to avoid the same tag name being used as children of different tags. If you're using that xml22 API maybe this doesn't matter to you ...
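Roughly, the flat $data["tag_name"] = "tag_data" approach looks like the sketch below. It is a reconstruction under assumptions, not the code behind the linked page: the function name xml_to_flat_array and the sample document are invented, and closures stand in for the global variables.

```php
<?php
// Sketch only: flat tag_name => tag_data array built from expat events.
// $currentTag plays the role of the "where am I" state described above.
function xml_to_flat_array(string $xml): array
{
    $data = [];
    $currentTag = null;

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$currentTag) {
            $currentTag = $name;  // remember which tag the next text belongs to
        },
        function ($p, $name) use (&$currentTag) {
            $currentTag = null;   // text after a closing tag is ignored
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $text) use (&$data, &$currentTag) {
            if ($currentTag !== null && trim($text) !== '') {
                // Text may arrive in several chunks, so append.
                $data[$currentTag] = ($data[$currentTag] ?? '') . $text;
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $data;
}

$data = xml_to_flat_array(
    '<article><title>Poll</title><author><name>Ann</name></author></article>'
);
```

The weakness is visible right away: two elements with the same name under different parents end up under the same key, which is why the document has to be written to avoid that.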
But what I'm saying is, personally, I'm tired of dealing with it. There is the DOM module you mentioned ... but most of what I've read about it suggests it's not entirely stable; not really suitable for production use. Mind you, expat is alleged to have memory leaks as well; dunno though - that's pure hearsay as I don't really know; I've not had any problems with it.
So I looked at those XSLT functions you saw, and yes - they do need Sablotron installed. I believe (but I don't know for sure) that Sablotron is a C version of the Java Xalan engine (in C so that it works without a JRE).
Whatever; I just followed the instructions on their website. It's worth installing - XSLT may be a lot more verbose, and hence slower to process, but it is very powerful, flexible, and (of course) infinitely extensible.
It took me 10 minutes to install it on my home server; a quick email to my web host's tech support and now I have it running there as well. Sorted, I say :)
OK, thx. I think I'll try it with this processor.
Is it possible to modify an XML file with XSLT?
But I'll carry on with my own PHP class, too... I've already spent a lot of time on it, and it's much easier than all the other stuff.
(Using it, of course, not developing it.)
And... surprise ;-), I'm not reading into arrays. I started with arrays, but then switched to another concept: working only with the string.
Of course it won't be usable with huge XML databases, but for small ones it's fantastic.
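To give an idea of what "only with the string" can mean, here is a small sketch. It is not my actual class: the helper names get_element_text and set_element_text are made up for this post, and it assumes simple, non-nested elements, since a regular expression cannot handle XML in general.

```php
<?php
// Sketch only: read and replace the text of the FIRST <tag>...</tag> pair
// directly in the XML string. Hypothetical helpers; no nesting, no CDATA.
function element_pattern(string $tag): string
{
    $t = preg_quote($tag, '~');
    // Group 1: optional attributes, group 2: element content.
    return '~<' . $t . '(\s[^>]*)?>(.*?)</' . $t . '>~s';
}

function get_element_text(string $xml, string $tag): ?string
{
    return preg_match(element_pattern($tag), $xml, $m) ? $m[2] : null;
}

function set_element_text(string $xml, string $tag, string $text): string
{
    return preg_replace_callback(
        element_pattern($tag),
        function ($m) use ($tag, $text) {
            // Keep the original attributes, escape the new text.
            return '<' . $tag . ($m[1] ?? '') . '>'
                . htmlspecialchars($text, ENT_XML1 | ENT_QUOTES)
                . '</' . $tag . '>';
        },
        $xml,
        1  // replace only the first match
    );
}

$xml = '<user id="7"><name>Ann</name></user>';
$name = get_element_text($xml, 'name');
$changed = set_element_text($xml, 'name', 'Bob');
```

Every call scans the string again, which is why this is only reasonable for small documents.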
And about what I said about expat and SAX - I don't know whether this is SAX or not. The functions seem to be identical. Maybe expat is a SAX-compliant parser.
I should find out...
Thx a lot.
12-30-2002, 06:44 PM
You can't modify an XML document with XSLT in the file manipulation sense; but you can generate another XML document and then use native methods (like PHP) to write it to a file.
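The same "generate a new document, then write it out" idea also works without an XSLT processor. Here is a sketch that uses the expat functions instead of XSLT to re-emit the document with one element's text replaced and then saves it with file_put_contents. The function name rewrite_element, the sample document and the output path are assumptions, and comments, processing instructions and the XML declaration are dropped in this simplified version.

```php
<?php
// Sketch only: copy a document event by event, swapping the text of every
// element named $target, then write the result to a file.
function rewrite_element(string $xml, string $target, string $newText): string
{
    $out = '';
    $inTarget = false;

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$out, &$inTarget, $target, $newText) {
            $out .= '<' . $name;
            foreach ($attrs as $key => $value) {
                $out .= ' ' . $key . '="'
                    . htmlspecialchars($value, ENT_XML1 | ENT_QUOTES) . '"';
            }
            $out .= '>';
            if ($name === $target) {
                $inTarget = true;  // suppress the old text of this element
                $out .= htmlspecialchars($newText, ENT_XML1 | ENT_QUOTES);
            }
        },
        function ($p, $name) use (&$out, &$inTarget, $target) {
            if ($name === $target) {
                $inTarget = false;
            }
            $out .= '</' . $name . '>';
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $text) use (&$out, &$inTarget) {
            if (!$inTarget) {
                $out .= htmlspecialchars($text, ENT_XML1 | ENT_QUOTES);
            }
        }
    );

    if (!xml_parse($parser, $xml, true)) {
        throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
    }
    return $out;
}

$source = '<config><name lang="en">old</name><size>3</size></config>';
$result = rewrite_element($source, 'name', 'new');

// Hypothetical output location; any writable path would do.
$path = sys_get_temp_dir() . '/example-rewritten.xml';
file_put_contents($path, $result);
```

The original file is never edited in place: a new document is produced and written out, which is the same pattern as transforming with XSLT and saving the output.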
Or you could use XSLT to generate PHP to create an XML document! You see why XSL is so appealing :) Working with it feels like doing something really interesting and cutting edge; working with the expat methods just felt like chewing rocks to make sand. Just IMHO.