I'd appreciate help assessing the efficiency of an XML application


brothercake
10-29-2002, 11:13 PM
I'm not sure if this is actually the right forum though, because it's about PHP as well. Please move it if necessary.

Anyway ... what this application does is parse an XML document with PHP (using exim), turning it into an array which is then sorted and searched to return results. I'm using it to archive a publications index; you can see it working here (http://www.brothercake.com/Ref/MORI_XML/articles.phtml).

With every query it has to parse the XML document, conditionally based on input criteria specifying attribute or node values; the document is 160k and contains around 700 entries. If you do a "search within text" it additionally goes to a remote search engine, which returns the search results in the form of URLs; these are then compared with the <url> tags in the XML document.

All fine and dandy. It works. But what I'm wondering, slightly incredulously ... is this all just ridiculously inefficient? Would I be better off doing this with a database?

brothercake
10-31-2002, 09:47 PM
Could someone move this thread to PHP please; I don't think it really belongs in here.

jkd
10-31-2002, 10:42 PM
Sure thing. :)

Alex Vincent
11-02-2002, 11:28 PM
Think about it this way: your PHP script has to load that entire 160KB XML document each time you want maybe 256 bytes from it. That is horrendously inefficient. In this case, a MySQL database is probably more useful.
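To make that cost concrete, here is a rough sketch in Python (not the PHP from the actual application) of what a parse-everything lookup looks like. The sample document and the `<articles>`, `<article>` and `<title>` element names are made up for illustration; only the `<url>` tag is mentioned in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the 160k index; element names other than <url> are assumed.
SAMPLE_XML = """<articles>
  <article year="2001"><title>Polling methods</title><url>http://example.com/a1</url></article>
  <article year="2002"><title>Survey design</title><url>http://example.com/a2</url></article>
</articles>"""


def find_by_title(xml_text, phrase):
    """Parse the whole document, then check every entry for the phrase."""
    root = ET.fromstring(xml_text)  # the full parse happens on every single call
    phrase = phrase.lower()
    return [
        article.findtext("url")
        for article in root.iter("article")
        if phrase in (article.findtext("title") or "").lower()
    ]


print(find_by_title(SAMPLE_XML, "survey"))
```

Every call pays for parsing all the entries, even when only one of them matches.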

I've been told, however, that for a similarly-sized project, MySQL is overkill... :P But it's better than an XML "flatfile" database.

brothercake
11-03-2002, 04:34 PM
Yeah I was afraid of that. But I know nothing about DBs, so before I commit the time to learning it, I'd be grateful for a bit of clarification:

- since the purpose of the application is, e.g., "find articles where the title contains search phrase", isn't it still going to have to search through the entire DB to find all matches anyway? How is that more efficient, or is it in the nature of databases that such a procedure is inherently faster?


In any case - if MySQL is overkill and XML is inefficient - maybe there's some happy middle ground ...?

Alex Vincent
11-06-2002, 02:58 AM
It's not quite that simple. 8) Modern databases support indexing. As I understand it, that means you can have a smaller file hold the index for the database -- which would tell the server platform where to look in the larger file for the record in question. The smaller file would parse much quicker, and the larger file can be opened at the specific point, without reading the whole thing.

The catch is you have to designate columns of the tables (I don't know what the technical term is) as indexes.
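Here is a small sketch of that idea, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs without a server). The table, column and index names are hypothetical. It asks the database how it plans to run two queries: an exact match on an indexed column, and a "title contains phrase" search.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, year INTEGER, title TEXT, url TEXT)"
)
conn.executemany(
    "INSERT INTO articles (year, title, url) VALUES (?, ?, ?)",
    [
        (2001, "Polling methods", "http://example.com/a1"),
        (2002, "Survey design", "http://example.com/a2"),
    ],
)
# Designate the year column as indexed.
conn.execute("CREATE INDEX idx_articles_year ON articles (year)")


def plan(sql, params=()):
    """Return the human-readable steps SQLite plans to take for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


indexed_plan = plan("SELECT url FROM articles WHERE year = ?", (2002,))
substring_plan = plan("SELECT url FROM articles WHERE title LIKE ?", ("%survey%",))
print(indexed_plan)    # expected to describe a SEARCH using idx_articles_year
print(substring_plan)  # expected to describe a SCAN over the whole table
```

The exact wording of the plan differs between SQLite versions, and MySQL reports its plans differently. The point is that the index helps the exact-match lookup, while a substring search like the second query still visits every row.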

I might be wrong on the theory above, mind you. I've never tried to seriously examine the internal guts of a database...

MySQL is something you should probably learn anyway. Not full-fledged -- just a basic tutorial will help you. In fact, you should learn MySQL to become familiar with other SQL-based databases (SQL stands for Structured Query Language).

brothercake
11-06-2002, 04:12 PM
Yeah that sounds reasonable. Thanks for the info.