Java & XML, traversing child nodes is adding non-existent siblings


bluecarbon
06-16-2009, 02:24 PM
Hi guys,

I'm trying to write a program that reads in an XML file, takes the child elements of certain nodes and puts them in an array.

E.g. here's a short excerpt of the XML file...

<languages>
<language>
<name>French</name>
<select1>A1</select1>
<select2>A2</select2>
<select3>A3</select3>
<select4>A1</select4>
<select5>A1</select5>
<years1>-&gt;3</years1>
<years2>5-&gt;</years2>
<years3>-&gt;1</years3>
</language>
</languages>

The idea is to pull the text of each child node under language (French, A1, A2 etc...) and put each one in its own slot in the array,
eg
array[0] = French
array[1] = A1
array[2] = A2
etc

This is the code I'm using...

public void userPass(String _userName) throws ParserConfigurationException, SAXException, IOException
{
    bioXML = getFile("passport");
    String[] elems;
    _bioL = bioXML.getElementsByTagName("language");
    for(int i=0; i<_bioL.getLength(); i++)
    {
        elems = new String[55];
        Node temp = _bioL.item(i);
        int j=0;
        for(Node childNode=temp.getFirstChild(); childNode!=null; childNode=childNode.getNextSibling()){
            System.out.println(childNode.getTextContent());
            elems[j] = childNode.getTextContent();
            j++;
        }
        bio.addPassport(elems);
    }
}

and this is the full XML file

<?xml version="1.0" encoding="iso-8859-1"?>
<languages>
<language>
<name>French</name>
<select1>A1</select1>
<select2>A2</select2>
<select3>A3</select3>
<select4>A1</select4>
<select5>A1</select5>
<years1>-&gt;3</years1>
<years2>5-&gt;</years2>
<years3>-&gt;1</years3>
<years4>-&gt;1</years4>
<years5>select</years5>
<years6>-&gt;3</years6>
<years7>select</years7>
<years8>-&gt;3</years8>
<years9>-&gt;3</years9>
<years10>-&gt;3</years10>
<years11>select</years11>
<years12>select</years12>
<years13></years13>
<months1>-&gt;3</months1>
<months2>-&gt;1</months2>
<months3>-&gt;3</months3>
<months4>select</months4>
<months5>select</months5>
<months6></months6>
<dip11>French</dip11>
<dip12>A1</dip12>
<dip13>Leaving Certificate</dip13>
<dip14>Minister for Education</dip14>
<dip15>2008</dip15>
<dip21></dip21>
<dip22></dip22>
<dip23></dip23>
<dip24></dip24>
<dip25></dip25>
<dip31></dip31>
<dip32></dip32>
<dip33></dip33>
<dip34></dip34>
<dip35></dip35>
<dip41></dip41>
<dip42></dip42>
<dip43></dip43>
<dip44></dip44>
<dip45></dip45>
<dip51></dip51>
<dip52></dip52>
<dip53></dip53>
<dip54></dip54>
<dip55></dip55>
<dip61/>
<dip62/>
<dip63/>
<dip64/>
<dip65/>
</language>
<language>
<name>Spanish</name>
<select1>C1</select1>
<select2>C1</select2>
<select3>C1</select3>
<select4>C1</select4>
<select5>C1</select5>
<years1>-&gt;1</years1>
<years2>-&gt;1</years2>
<years3>-&gt;1</years3>
<years4>-&gt;1</years4>
<years5>-&gt;1</years5>
<years6>-&gt;1</years6>
<years7>-&gt;1</years7>
<years8>-&gt;1</years8>
<years9>-&gt;1</years9>
<years10>-&gt;1</years10>
<years11>-&gt;1</years11>
<years12>select</years12>
<years13></years13>
<months1>-&gt;1</months1>
<months2>select</months2>
<months3>-&gt;1</months3>
<months4>select</months4>
<months5>select</months5>
<months6>I have ever culturally experienced Spain or a Spanish speaking country other than on holidays with family when I was younger.</months6>
<dip11>French</dip11>
<dip12>A1</dip12>
<dip13>Leaving Certificate</dip13>
<dip14>Minister for Education</dip14>
<dip15>2008</dip15>
<dip21></dip21>
<dip22></dip22>
<dip23></dip23>
<dip24></dip24>
<dip25></dip25>
<dip31></dip31>
<dip32></dip32>
<dip33></dip33>
<dip34></dip34>
<dip35></dip35>
<dip41></dip41>
<dip42></dip42>
<dip43></dip43>
<dip44></dip44>
<dip45></dip45>
<dip51></dip51>
<dip52></dip52>
<dip53></dip53>
<dip54></dip54>
<dip55></dip55>
<dip61/>
<dip62/>
<dip63/>
<dip64/>
<dip65/>
</language>
</languages>


So the first array should read

elems[0] = French
elems[1] = A1
elems[2] = A2
elems[3] = A3
etc

But instead it's reading

elems[0] =
elems[1] =
elems[2] = French
elems[3] =
elems[4] =
elems[5] = A1
elems[6] =
elems[7] =
elems[8] = A2


etc

And the program continues until it throws an ArrayIndexOutOfBoundsException...

Any help will be much appreciated...

sobrien79
06-16-2009, 04:53 PM
Keep in mind that whitespace between elements is also parsed as a node (a text node).

Consider this:

<node0>Text0</node0> <node2>Text2</node2>

There are 3 nodes there.
[0] = node0
[1] = this whitespace between the nodes
[2] = node2
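
A quick way to see this for yourself, as a minimal sketch: the snippet below parses the line above and lists the children the DOM reports. The class name WhitespaceNodeDemo, the method describeChildren and the wrapping <root> element are made up for illustration (a document needs a single root element); the parser calls are the standard javax.xml.parsers and org.w3c.dom ones.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class WhitespaceNodeDemo {

    // Hypothetical helper: describes every direct child of the root element,
    // including the text nodes that hold nothing but whitespace.
    static List<String> describeChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> out = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild();
                 n != null;
                 n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    out.add("element:" + n.getNodeName());
                } else if (n.getNodeType() == Node.TEXT_NODE) {
                    out.add("text:[" + n.getTextContent() + "]");
                } else {
                    out.add("other");
                }
            }
            return out;
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // <root> is an assumption added so the fragment is a well-formed document.
        String xml = "<root><node0>Text0</node0> <node2>Text2</node2></root>";
        for (String line : describeChildren(xml)) {
            System.out.println(line);
        }
    }
}
```

The middle entry is a text node whose content is the single space between the two elements. In your file the same thing happens with the newline after every closing tag.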

bluecarbon
06-16-2009, 06:00 PM
Ah I see, so when it sees a newline it is taking it as an extra node?

Maybe reformatting the XML file would be appropriate

Thank you!
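
For anyone finding this thread later: the XML file probably does not need reformatting. A possible alternative is to skip every child that is not an element while looping. The sketch below assumes the same structure as the file in the first post but parses it from a String instead of getFile("passport"); the class name ElementChildren and the method readLanguages are made up for illustration. It also collects into a List first, so there is no fixed-size array to overrun.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ElementChildren {

    // Hypothetical helper: returns one String[] per <language> element,
    // holding the text of its element children only.
    static List<String[]> readLanguages(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList languages = doc.getElementsByTagName("language");
            List<String[]> result = new ArrayList<>();
            for (int i = 0; i < languages.getLength(); i++) {
                List<String> values = new ArrayList<>();
                for (Node child = languages.item(i).getFirstChild();
                     child != null;
                     child = child.getNextSibling()) {
                    // Skip whitespace text nodes, comments and anything else
                    // that is not an element.
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        values.add(child.getTextContent());
                    }
                }
                result.add(values.toArray(new String[0]));
            }
            return result;
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened sample with the same newline layout as the original file.
        String xml = "<languages>\n<language>\n<name>French</name>\n"
                + "<select1>A1</select1>\n<select2>A2</select2>\n"
                + "</language>\n</languages>";
        String[] elems = readLanguages(xml).get(0);
        for (int j = 0; j < elems.length; j++) {
            System.out.println("elems[" + j + "] = " + elems[j]);
        }
    }
}
```

Each language in the posted file has 55 child elements, so the original String[55] would also be large enough once the text nodes are skipped.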