XML data representation
If I take data (1,2,3,a,b,c,..) and I put it in XML form (<a>, <b>, <c>,..), does that mean that data becomes information because of XML? Does XML turn data into information by representing it in tags?
I am also reading here in a slideshow: "In JDOM, every XML tree is approached as a document even though the content has nothing to do with documents". I looked up the definition of 'document' on dictionary.com, and it says a document is meant to be informative. 'Informative' means 'to convey information'. Then, if the purpose of XML is to turn data into information, why does the content of an XML tree supposedly have nothing to do with a document, and therefore nothing to do with information? This is confusing. Perhaps the author of that slideshow was using different semantics than I have in mind right now.
Any thoughts on this?
XML stands for Extensible Markup Language.
XML is designed to transport and store data.
If a line of XML code looks like this:
<name>Thomas</name>
you can extract "Thomas" when you go to find out the name.
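For what it's worth, here is a minimal Python sketch of that extraction, using the standard library's xml.etree.ElementTree. The one-line document is the <name>Thomas</name> example discussed in this thread; the variable names are just illustrative.

```python
import xml.etree.ElementTree as ET

# One-line document from the thread; nothing here is specific to any real system.
element = ET.fromstring("<name>Thomas</name>")

tag = element.tag     # the label wrapped around the data: "name"
value = element.text  # the raw data itself: "Thomas"

print(tag, value)
```

The parser only hands back a label and a string. Whether that pair counts as "information" is up to whoever, or whatever, reads it.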
Does that help?
Not really, I was looking for an ontological and metaphysical explanation, i.e. what are we looking at when we see <name>Thomas</name> as opposed to the raw data Thomas? What happens biologically in our brain when our eyes see that pattern? 'Thomas' gets associated with 'name', which gets associated with 'person'. For someone who does not know the name 'Thomas', it would simply be a collection of singularities: T h o m a s.
Is <name>Thomas</name> information? I would think so, because our brain translates that pattern into an entity. Thus, is XML on its own information? Because data comes raw from a database.
So if XML is just another representation of data, does it not necessarily follow that data becomes information when poured into the XML format? After all, information is meaningful by itself (and therefore has semantics), as is said here:
XML translates data into a structure and our brain translates that structure into entities that make sense to us.. ultimately conveying to our brain that what is being seen is actually information. I have my doubts on that last part.
When we introduce 1 and 2 into the element <child>, then our brain (like the gestalt theory postulates: http://en.wikipedia.org/wiki/Gestalt_psychology) translates that <child> as having the 'child' relationship with its 'parent'. We therefore see it as a whole, an entity and not as a 'sum of parts'. Another relationship would be that values 1 and 2 are now associated with the word 'child'. I could say as well <a>1</a>, <a>2</a>. Now our brain knows that 1 is an 'a'. And so forth.
In practical terms however, XML (as is stated in the slides I am reading) represents data like this:
(input) XML file -> (search its structure and represent it) XPath -> (output) XAML, XHTML, XML, CSV. As we can see, another XML file can itself be the result of an input XML file, so it confuses me a little what the purpose of the language actually is. I agree that it is just another representation of data. In my example, one takes data, represents it in XML, uses XPath to search the tree for values, and finally represents them in another format. So yes, it transports data.
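The pipeline above (XML in, XPath query, another format out) could be sketched roughly like this in Python. The input document, its element names, and the function xml_to_csv are all made up for illustration, and ElementTree's findall only supports a limited subset of XPath.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input document; the element names are assumptions, not from the slides.
SOURCE = """<people>
  <person><name>Thomas</name><age>30</age></person>
  <person><name>Anna</name><age>25</age></person>
</people>"""

def xml_to_csv(xml_text):
    """Read XML, select nodes with an XPath-style query, and emit CSV text."""
    root = ET.fromstring(xml_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "age"])
    # "./person" selects the <person> children of the root element.
    for person in root.findall("./person"):
        writer.writerow([person.findtext("name"), person.findtext("age")])
    return buffer.getvalue()

csv_text = xml_to_csv(SOURCE)
print(csv_text)
```

The same values come out the other end in a different shape, which fits the "it transports data" reading.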
So according to this article:
XML does not have presentation semantics. However, when we look at XML our brain creates semantics for us, because of the gestalt.
If XML is extensible, how do you extend it then?
How does XML store data?
I will do some more thinking on this.
"Extensible" means you don't have a finite number of markup tags, unlike, for example, HTML.
like in this example:
If you have a basket containing 3 veggies, say potato, celery, and tomato, in XML it would be:
Or if you would like to represent the quantities of those veggies in your basket, you could do it this way:
sorry i can't explain it better :D
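To make the "extensible" point a bit more concrete, here is a small Python sketch. The tag names basket and veggie, the quantity attribute, and the numbers are all invented for this example; XML predefines none of them, which is the point.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: we made up <basket>, <veggie>, and the quantity attribute.
BASKET = """<basket>
  <veggie quantity="3">potato</veggie>
  <veggie quantity="1">celery</veggie>
  <veggie quantity="5">tomato</veggie>
</basket>"""

root = ET.fromstring(BASKET)

# Map each veggie name to its (invented) quantity.
quantities = {v.text: int(v.get("quantity")) for v in root.findall("veggie")}
print(quantities)
```

Nothing had to be registered anywhere before using these tags. In HTML you would be limited to the elements the language already defines.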
XML is simply a self describing language. That means that when I create a document that contains:
It would mean that I have a collection of animals which are type Cat and Dog. That is a human readable aspect of it, the self-describing mechanism. The tags are structural elements and are how we logically divide the sections. It carries all meaning, and absolutely no meaning at the same time. I could just as easily have a text document:
What gives the XML the meaning is the doctype or schema in use, and how the language reading it interprets it. So that would indicate that the element names chosen don't represent information; rather, they represent a logical division of said information. Does that mean it doesn't represent information? No, it doesn't. I could just as easily write the same document as:
And it may be just as valid as the previous (although IMHO I wouldn't consider this to be information as it's structural, but that doesn't mean it cannot be treated as such). This is the point of XML: aside from syntax, there are no rules, only how I interpret said rules.
The choice of names is for human readability, simply the associations we make as people. That would fall in line with why I would name a method getPerson() instead of _53askk66k477kj4j(). Both are completely valid, and ultimately either can be chosen. Obviously I'll use the one that I can read though.
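A quick Python sketch of the naming point: the two documents below are assumptions of mine (the original snippets aren't shown here), one with readable tag names and one with gibberish names. A parser gets the same structure and the same values from both.

```python
import xml.etree.ElementTree as ET

# Both documents are hypothetical; only the tag names differ.
READABLE = "<animals><animal>Cat</animal><animal>Dog</animal></animals>"
OPAQUE = "<x1><q7>Cat</q7><q7>Dog</q7></x1>"

def child_values(xml_text):
    """Return the text of each child of the root, ignoring what the tags are called."""
    root = ET.fromstring(xml_text)
    return [child.text for child in root]

readable_values = child_values(READABLE)
opaque_values = child_values(OPAQUE)
print(readable_values, opaque_values)
```

The machine is equally happy with either. Only the human reader benefits from animals and animal.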
So the purpose of XML? To describe itself to another language based on the ruleset governed by the language itself. Take a SOAP service for example. The WSDL is a type of schema document that dictates what information is provided, and how it's presented. This is then used by a SOAP client to know what and how it can ask for information from the SOAP server. It is loose, so you can provide a plethora of information to it regarding arguments and types, but it's tight in the sense that it must come in a particularly described structure. This is the difference between well-formed XML and correct XML (usually called valid XML). Valid XML is XML that is governed by a DTD or schema ruleset and adheres to it. Well-formed XML simply follows the syntax rules; on its own it has no particular meaning.
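Here is a rough Python sketch of that distinction. Python's standard library can check well-formedness but does not validate against a DTD or schema, so follows_rules below is a hand-written stand-in for a ruleset, not real schema validation. The rule itself (root must be animals, children must be animal) is an assumption made up for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """True if the text obeys XML syntax (tags nest and close properly)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def follows_rules(xml_text):
    """Stand-in for a schema: root must be <animals> holding only <animal> children."""
    root = ET.fromstring(xml_text)
    return root.tag == "animals" and all(child.tag == "animal" for child in root)

broken = "<animals><animal>Cat</animals>"            # mismatched tags: not well formed
unexpected = "<animals><car>Volvo</car></animals>"   # well formed, but breaks the rules
expected = "<animals><animal>Cat</animal></animals>" # well formed and follows the rules

print(is_well_formed(broken), is_well_formed(unexpected), follows_rules(unexpected), follows_rules(expected))
```

So a document can pass the syntax check and still be rejected by whoever defines the ruleset, which is roughly what a WSDL or DTD does for a real service.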
That was very quick to the point. It was exactly what I was thinking. Thanks, I understand it now. :)