Extracting Contents of a Node
binaryWeapon
05-25-2009, 06:56 PM
This seems like a simple question, and I've googled it several times with no success. My question is: how do you extract the contents of a node? For example, if I have
<span> blah blah blah <img src="foo.bar"> </span>
I want to just get blah blah blah <img src="foo.bar">
Is there anything other than innerHTML which will do the job for me?
oesxyl
05-26-2009, 01:30 AM
textContent for Gecko, firstChild.nodeValue for Safari, and text for IE.
Something like this:
function getValue(node) {
    var result = null;
    if (document.implementation && document.implementation.createDocument) {
        // W3C DOM browsers: try textContent first
        result = node.textContent;
        if (typeof result == 'undefined') {
            // no textContent: fall back to the first child's value
            result = node.firstChild;
            if (result) {
                result = result.nodeValue;
            }
        }
    } else if (document.attachEvent && !document.addEventListener) {
        // IE event model only: use the text property
        result = node.text;
    }
    return result;
}
best regards
binaryWeapon
05-26-2009, 05:02 AM
Won't textContent give me just that... the text content? I need to preserve any descendant nodes as well.
oesxyl
05-26-2009, 05:32 AM
With node.getElementsByTagName('*') you get the descendants but lose the structure; it's just a list. I don't think there is a solution that is both general and simple.
innerHTML remains the simplest way, in my opinion. :)
best regards
rnd me
05-26-2009, 10:27 AM
the problem is not getting the contents; what's lacking is a way to store the contents in an object-oriented fashion.
the example <span> in the post is the perfect demonstration to convey what i mean.
you can't really store the contents in a useful fashion.
premise:
-nodes come in two types: text and element.
-elements (can) have kids and names, textNodes can't
-node lists are flat and miss textNodes or children
by that definition, you could store the contents in an element.
span.cloneNode(true) would dupe the branch and its contents.
this preserves the tree, values, etc.
the span you were trying to dispose of is an element though, so that's not what you want, since it's back where you started!
so that won't work.
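if the wrapper element is the only objection to cloneNode, one way around it (an addition here, not something proposed in this thread) is to move the clone's children into a DocumentFragment, which holds nodes without being an element itself. a minimal sketch, assuming a standard DOM; extractContents is a made-up helper name, not a built-in:

```javascript
// Sketch: copy a node's contents into a DocumentFragment, leaving the
// wrapper element (the <span> in the example) behind.
// extractContents is a hypothetical helper name, not a DOM built-in.
function extractContents(node) {
  var fragment = node.ownerDocument.createDocumentFragment();
  var copy = node.cloneNode(true); // deep copy; the original stays in place
  // appendChild moves a node out of its old parent, so each pass
  // takes whatever is currently the copy's first child
  while (copy.firstChild) {
    fragment.appendChild(copy.firstChild);
  }
  return fragment;
}
```

passing the returned fragment to appendChild elsewhere inserts the children themselves, with no wrapper. it is still a copy, so listeners attached with addEventListener do not come along.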
you could also get a list of elements via span.getElementsByTagName("*");
this has no tree or the text "blah blah blah" in the result set.
so that won't work.
you could also gather span.childNodes.
for the simple example, this should work.
"blah blah blah" would come back as a text node, and the <img> would be there.
but if you had any sub-tags, like an <a> tag for example, its contents would not make it into span.childNodes; only the <a> itself would, since childNodes lists direct children only.
so that might work.
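to keep nested contents as well, the childNodes walk can be made recursive. the sketch below is one possible shape, not a standard API: toTree and the plain-object layout (type, name, attributes, children) are assumptions made for illustration.

```javascript
// nodeType values from the DOM spec: 1 = element, 3 = text
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// toTree is a hypothetical helper: it walks childNodes recursively and
// returns plain objects, so nested tags keep their place in the tree.
function toTree(node) {
  var result = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    if (child.nodeType === TEXT_NODE) {
      result.push({ type: 'text', value: child.nodeValue });
    } else if (child.nodeType === ELEMENT_NODE) {
      // copy attributes as written, e.g. src on the <img>
      var attrs = {};
      var list = child.attributes || [];
      for (var j = 0; j < list.length; j++) {
        attrs[list[j].name] = list[j].value;
      }
      result.push({
        type: 'element',
        name: child.nodeName.toLowerCase(),
        attributes: attrs,
        children: toTree(child) // recurse so sub-tags keep their contents
      });
    }
    // comments and other node types are skipped in this sketch
  }
  return result;
}
```

calling toTree(span) on the example would give an array with the text and the img entry, without the span itself. whether plain objects are a useful storage format depends on the use case.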
you could also use .innerHTML.
it might not give you "<img src="foo.bar">" though.
depending on the browser, it may give you "<img src="http://site.com./foo.bar">", with the URL resolved.
innerHTML also discards form values, parses style attribs, and neglects some properties that were modified by scripts.
so that won't work.
in conclusion, there is no simple answer here.
you have to design a solution to fit the needs of each use case.
as a side note, .textContent works in all browsers except IE, so you can simply say:
function getValue(node){
return node.innerText || node.textContent || node.value || node.text;
}
this should work in all browsers without resorting to object detection or conditionals.
you might want to add properties like .alt, .nodeValue, .innerHTML, etc into the stack; depends on what you need to do.
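one thing to watch with the || chain: an empty string is falsy, so a node whose innerText is "" falls through to the next property. if that matters, a variant that checks for the first property actually holding a string avoids it, at the cost of a loop and a conditional. a rough sketch; getValueStrict is a made-up name, and the property list just mirrors the one above:

```javascript
// getValueStrict is a hypothetical variant of getValue: it returns the
// first listed property that holds a string, so an empty "" is kept
// instead of being skipped the way it is by a chain of || operators.
function getValueStrict(node) {
  var props = ['innerText', 'textContent', 'value', 'text'];
  for (var i = 0; i < props.length; i++) {
    var candidate = node[props[i]];
    if (typeof candidate === 'string') {
      return candidate;
    }
  }
  return null; // none of the listed properties held a string
}
```

for most cases the one-liner is fine; this only makes a difference when an empty value needs to be told apart from a missing one.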