...

How do I read XML?



jessjenn
01-02-2004, 09:46 PM
How do I read XML data from a string? I've searched around in a lot of places, even the manual, but couldn't find anything useful. For example, if I had the following:



$x = '<fruits>
<fruit>
<type>apple</type>
<color>red</color>
</fruit>
<fruit>
<type>lemon</type>
<color>green</color>
</fruit>
</fruits>';

How could I read that? I want to read it from a textbox, where the user manually puts it in. How can this be done?

Any help is appreciated. Thanks!

mordred
01-03-2004, 12:02 AM
The function xml_parse() expects a string as input, exactly what you need.
http://us2.php.net/manual/en/function.xml-parse.php

You have to set up the parser and the element handler functions too.

jessjenn
01-03-2004, 05:50 AM
Thanks for the response. I don't see the example, however. All the examples shown there require a file. What I want to do, is loop through it (the string representing xml data) like it was a record set. Each <fruit> is a record, and its child values are fields inside the record. As it loops, it will be inserting it in the database.

If anyone could show me or refer me to some sample code, I'd appreciate it. Thanks!

me'
01-03-2004, 12:16 PM
There's an easy way if you know (or are prepared to learn) XSLT.
$files = array (
    '/_xml' => 'xml code here',
    '/_xsl' => 'xslt code here'
);
$xh = xslt_create();
echo xslt_process($xh, 'arg:/_xml', 'arg:/_xsl', NULL, $files);

You don't have to echo if you don't want to. You could capture the result in a string instead and build a few MySQL queries from it to get a database up and running.

To get the xslt functions going, you'll need the Sablotron dll, which your version of PHP probably came with. Check you have expat.dll, sablot.dll and iconv.dll in c:\windows or c:\winnt, and in <root to php>/dlls. Then modify your php.ini file to uncomment the line:

;extension=php_xslt.dll

(uncommenting simply means getting rid of the semicolon). Also, make sure the line

extension_dir = "C:\Program Files\pdev\php\extensions\"

points to the correct extension folder. For more info on Sablotron, look at this thread (http://www.codingforums.com/showthread.php?s=&threadid=30545). If you want to learn XSLT, there's a great tutorial (http://www.w3schools.com/xsl/default.asp) at w3schools.

jessjenn
01-03-2004, 06:38 PM
Sounds like a good way to do it. Nice and short but I'm working off a web host that doesn't give me access to the php.ini file, so that wouldn't work. Good way to do it though, I'll have to keep that in mind.

me'
01-03-2004, 06:42 PM
Do they have a list of extensions currently installed? The XSLT functions may already be accessible to you.

mordred
01-03-2004, 09:22 PM
Originally posted by jessjenn
Thanks for the response. I don't see the example, however. All the examples shown there require a file.

They do, but they read a portion (a string) out of each file and feed that string to xml_parse(). You just have to strip the file part from the examples in your case, since the XML isn't coming from a file, as it usually does.



What I want to do, is loop through it (the string representing xml data) like it was a record set. Each <fruit> is a record, and its child values are fields inside the record. As it loops, it will be inserting it in the database.

If you want to use SAX parsing to process your XML document, you have to take care of all the record-building and looping yourself. You can only register functions that the parser calls when an element starts, an element ends, text is found, and so on. Below is such a script (a quick one; adjust it to your needs):



$x = '<fruits>
<fruit>
<type>apple</type>
<color>red</color>
</fruit>
<fruit>
<type>lemon</type>
<color>green</color>
</fruit>
</fruits>';

$fruits = array();
$current = '';

$parser = xml_parser_create();

xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, 'startElement', 'endElement');
xml_set_character_data_handler($parser, 'charHandler');

function startElement(&$parser, $name, $attributes) {
    global $fruits, $current;
    if ($name == 'fruit') {
        $fruits[] = array();
    }

    if (in_array($name, array('type', 'color'))) {
        $fruits[count($fruits) - 1][$name] = '';
        $current = $name;
    }
}

function endElement(&$parser, $name) {
    // empty
}

function charHandler(&$parser, $data) {
    global $fruits, $current;
    if (in_array($current, array('type', 'color')) ) {
        $fruits[count($fruits) - 1][$current] = $data;
        $current = '';
    }
}

xml_parse($parser, $x);
var_dump($fruits);


If you need to know parent-child node relations while parsing, perhaps the DOM XML extension is more suited to your needs? Although it's not a default extension and probably not installed on your host.
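For readers on a newer PHP, here is a rough sketch of the DOM route mentioned above. It assumes the DOMDocument class from the dom extension of PHP 5 and later, which is not the DOM XML extension of this thread's era and may not be available on every host. The optional id attribute and the $record variable are only for illustration.

```php
<?php
// Sketch only: assumes PHP 5+ with the dom extension enabled.
$x = '<fruits>
<fruit id="34">
<type>apple</type>
<color>red</color>
</fruit>
<fruit>
<type>lemon</type>
<color>green</color>
</fruit>
</fruits>';

$doc = new DOMDocument();
if (!$doc->loadXML($x)) {
    die('Could not parse the XML string');
}

$fruits = array();
// Each <fruit> element becomes one record.
foreach ($doc->getElementsByTagName('fruit') as $fruit) {
    // getAttribute() returns an empty string when the attribute is missing.
    $record = array('id' => $fruit->getAttribute('id'));
    // Each child element becomes one field of the record.
    foreach ($fruit->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            $record[$child->nodeName] = $child->textContent;
        }
    }
    $fruits[] = $record;
}

var_dump($fruits);
```

Because the whole tree is in memory, the parent-child relation is available directly, so each record could be handed to a database insert inside the outer loop.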

jessjenn
01-04-2004, 03:43 PM
Originally posted by me'
Do they have a list of extensions currently installed? The XSLT functions may already by accessible to you.
Nope they have nothing like that installed. :rolleyes:

jessjenn
01-04-2004, 03:51 PM
mordred, excellent response! Thanks for the sample. It works beautifully. How complicated would it be if I wanted to read an attribute in a tag? Let's say I had my original data but wanted to add an id in all the <fruit> tags:


<fruit id="34">
<type>apple</type>
<color>red</color>
</fruit>
Thanks. If it's too much trouble, it's ok, you helped me enough with the sample code.

jessjenn
01-04-2004, 04:10 PM
mordred, how come in your code, I can't have & > < characters between the tags? When I have:



<something>I like M & Ms and soda</something>


the "&" and what follows disappears. Any thoughts? I tried escaping it with entities and got nothing. Thanks.

me'
01-04-2004, 08:50 PM
Use the character entities instead:
&amp; = &
&lt; = <
&gt; = >

If you have your own DTD, you could specify it as CDATA as well (only works if the tag can never contain other tags)
Originally posted by jessjenn
Nope they have nothing like that installed. :rolleyes:

Shame, XSLT parsing is so much easier than with expat.
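A small sketch of the entity advice in this post, assuming PHP 5.4 or later (for ENT_XML1 and closures) and assuming you build the XML string yourself rather than receive it ready-made. readText() is a hypothetical helper, not part of the code posted in this thread. A raw "&" is not well-formed XML, so a parser should reject the unescaped version; checking the return value of xml_parse() makes that visible instead of silently losing text.

```php
<?php
// Hypothetical helper: returns the concatenated character data,
// or false if xml_parse() reports that the document could not be parsed.
function readText($xml) {
    $text = '';
    $parser = xml_parser_create();
    xml_set_character_data_handler($parser, function ($parser, $data) use (&$text) {
        // Append: the handler may be called more than once per element.
        $text .= $data;
    });
    return xml_parse($parser, $xml, true) ? $text : false;
}

$raw = 'I like M & Ms and soda';

// Turn &, <, > and quotes into XML entities before embedding the text.
$escaped = htmlspecialchars($raw, ENT_QUOTES | ENT_XML1, 'UTF-8');

$good = readText('<something>' . $escaped . '</something>');
$bad  = readText('<something>' . $raw . '</something>');

var_dump($good); // the original text, entities decoded again by the parser
var_dump($bad);  // result for the unescaped, not well-formed input
```

If the XML is typed in by the user, the same check tells you when to show an error message rather than insert half-parsed data.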

mordred
01-05-2004, 12:28 AM
me', I was under the impression that XSLT is used to transform one XML document into another one. Ok, HTML is an option, and plain text too. You could create your SQL plain text query via XSLT, but honestly, I've never seen it done anywhere... and from a gut feeling I would say it looks like a lot of hassle.

My point is mainly that sometimes the generation of another document doesn't solve your problem, if you need the actual values as a data structure in your application. You'd still have to parse the generated document anew to read/process it. For these issues SAX and DOM are more suited, IMNSHO. jessjenn mentioned something about database insertion, so I guess XSLT is not really helpful here. Errh, I mean "would be helpful", because the host is as backwards as most are. Agree with your opinion that it's a shame, but not because expat is more difficult to use.

mordred
01-05-2004, 12:35 AM
jessjenn, to get the attributes of a parsed tag, you can access the "attributes" parameter passed to the function you register as the start-element handler. In my example this function is called startElement (duh!). A var_dump() printing attribute values is included in the example below.

About the entities... that's due to a little bug in the sample code. Quick hacks almost always work under the "type,run,throw it away" workflow pattern.... ;)

A slightly revamped version is here. The decisive change was made in charHandler: it now concatenates the data instead of assigning it.



$x = '<fruits>
<fruit id="foo">
<type>apple</type>
<color>red</color>
</fruit>
<fruit>
<type>lemon is &amp; a green &lt; hat</type>
<color>green</color>
</fruit>
</fruits>';

$fruits = array();
$current = '';

$parser = xml_parser_create();

xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
xml_set_element_handler($parser, 'startElement', 'endElement');
xml_set_character_data_handler($parser, 'charHandler');

function startElement(&$parser, $name, $attributes) {
    global $fruits, $current;

    // Forget the previous field until we know this tag is one we collect.
    $current = '';

    // Attributes arrive as an associative array (name => value).
    if (count($attributes)) {
        var_dump($attributes);
    }

    // Every <fruit> starts a new record.
    if ($name == 'fruit') {
        $fruits[] = array();
    }

    if (in_array($name, array('type', 'color'))) {
        $fruits[count($fruits) - 1][$name] = '';
        $current = $name;
    }
}

function endElement(&$parser, $name) {
    global $current;
    $current = '';
}

function charHandler(&$parser, $data) {
    global $fruits, $current;
    if (in_array($current, array('type', 'color')) ) {
        // Append rather than assign: text around entities arrives in several pieces.
        $fruits[count($fruits) - 1][$current] .= $data;
    }
}

// Note: this handler is never registered with the parser, so it is never called.
function entityHandler(&$parser, $data) {
    global $fruits, $current;
    if (in_array($current, array('type', 'color')) ) {
        $fruits[count($fruits) - 1][$current] = $data;
        $current = '';
    }
}

xml_parse($parser, $x);
var_dump($fruits);
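The script above only dumps the attributes. A possible variation that stores the id inside each record is sketched here. It assumes PHP 5.3 or later for closures, and parseFruits() is a made-up name for illustration, not something from the original post.

```php
<?php
// Hypothetical wrapper around the same SAX approach, without globals.
// Returns an array of records, or false if the XML could not be parsed.
function parseFruits($xml) {
    $fruits = array();
    $current = '';

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attributes) use (&$fruits, &$current) {
            $current = '';
            if ($name == 'fruit') {
                // Keep the id attribute as a field; null when it is absent.
                $id = isset($attributes['id']) ? $attributes['id'] : null;
                $fruits[] = array('id' => $id);
            }
            if (count($fruits) && in_array($name, array('type', 'color'))) {
                $fruits[count($fruits) - 1][$name] = '';
                $current = $name;
            }
        },
        function ($parser, $name) use (&$current) {
            $current = '';
        }
    );

    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$fruits, &$current) {
            if ($current !== '') {
                // Concatenate, as in the revised charHandler.
                $fruits[count($fruits) - 1][$current] .= $data;
            }
        }
    );

    return xml_parse($parser, $xml, true) ? $fruits : false;
}

$x = '<fruits><fruit id="34"><type>apple</type><color>red</color></fruit>'
   . '<fruit><type>lemon is &amp; a green &lt; hat</type><color>green</color></fruit></fruits>';

$records = parseFruits($x);
var_dump($records);
```

Each element of $records is then one row, so a loop over it is the natural place for the database insert.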

jessjenn
01-05-2004, 05:00 AM
Originally posted by me'
Use the character entities instead:
&amp; = &
&lt; = <
&gt; = >


I tried that and nothing. Strange. My solution is putting the string within <![CDATA[ $string ]]>. It works well, I don't have to escape those characters but I think I have to escape brackets.

If I have something like:
<![CDATA[ 11-Nov-2003 ]car & ><<><>< < ]]>
the " 11-Nov-2003 ]" doesn't get displayed, just what follows the "]" before "car", so I'm thinking maybe you have to escape brackets. Is this true? What do you think?
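One thing worth checking here, sketched below under the assumption of PHP 5.3 or later: the character data handler may be called several times for a single element, so a handler that assigns instead of appending can keep only part of the text. A single "]" inside a CDATA section does not need escaping in XML; only the sequence "]]>" ends the section. The $calls counter is just for illustration.

```php
<?php
$xml = '<something><![CDATA[ 11-Nov-2003 ]car & ><<><>< < ]]></something>';

$text  = '';
$calls = 0;

$parser = xml_parser_create();
xml_set_character_data_handler($parser, function ($parser, $data) use (&$text, &$calls) {
    $text .= $data; // append every piece the parser hands over
    $calls++;       // how many pieces depends on the parser library
});

$ok = xml_parse($parser, $xml, true);

echo 'parsed: ', ($ok ? 'yes' : 'no'), "\n";
echo 'handler calls: ', $calls, "\n";
echo 'text: [', $text, "]\n";
```

If the handler-call count is greater than one on your host, that would explain why an assigning handler shows only one part of the string.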

me'
01-05-2004, 12:08 PM
Originally posted by mordred
Agree with your opinion that it's a shame, but not because expat is more difficult to use.

It certainly can be. I wrote an XML parsing algorithm first in PHP using expat that failed miserably for some stupid reason or another. IMO, using XSLT is more suited to the problem, as it must parse the file first before it can transform it. Also, the code is a lot shorter and it can reside more naturally in two different files for ease of editing. That's my opinion though, I'm sure expat beats XSLT hands down in some cases.

jessjenn: just a long shot, as it didn't work before, but try its entity, &#93;?

jessjenn
01-06-2004, 12:18 AM
I tried that, and get the same thing. I'm going bonkers.


