  1. #1
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts

    How to read CDATA with Javascript?

    Hi,
    I want to read the parameters from CDATA within an HTML page, similar to the following:
    <html>
    <head>
    <title>AnyTitle</title>
    </head>
    <body>
    <h1>Header</h1><br/>
    Text1
    <br/>
    Text2
    <script defer="defer" type="text/javascript"><!--//--><![CDATA[//><!--param1="value1";param2="value2";//--><!]]></script>
    </body>
    </html>

    I can find the script element, but unfortunately I don't know how to proceed.
    Any ideas are welcome!!!

    Thanks in advance, _.stan._

  • #2
    Senior Coder rnd me's Avatar
    Join Date
    Jun 2007
    Location
    Urbana
    Posts
    4,346
    Thanks
    11
    Thanked 589 Times in 570 Posts
    what is up with the comments in that code?
    there is no reason for all of those.

    for example, three line comments on the same line are completely pointless.
    SGML and CDATA ? it should be one or the other to be valid.

    my point, is that if you didn't have the commenting, the values are available to javascript as param1 and param2.

    if it's all inside a script block, it shouldn't be parsed.
    thus, you'll need a regexp to process the text contents of the script block.


    Code:
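    // sample markup; "<\/script>" is escaped so this also works inside an inline script block
    var tt = '<script defer="defer" type="text/javascript"><!--//--><![CDATA[//><!--param1="value1";param2="value2";//--><!]]><\/script>';

    var matches = [];
    // capture everything between "<![CDATA[" and the next "]]>" (non-greedy)
    tt.replace(/<!\[CDATA\[([\s\S]*?)\]\]>/g, function (a, b) { matches.push(b); });

    alert(matches);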
    the regexp used extracts the contents of each CDATA section, as requested. It still pulls in the other inner comments.

    you could slice the string, replace() the comment chars, or refine the regexp to get just the stuff you want. from the looks of the example, once the comment markers are stripped you could likely just eval() the result of matches.join("\n") to turn all the params into real vars...
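As an alternative to eval(), the name/value pairs can be pulled straight out of the script text with a second regexp. This is only a sketch: parseParams is a hypothetical helper name, and it assumes every value is double-quoted and contains no escaped quotes, as in the sample from the first post. It should be fed the content of the script block, not the whole tag, otherwise attributes like defer="defer" would match too.

```javascript
// Hypothetical helper: extracts name="value" pairs from the text of a script block.
// Assumption: values are double-quoted and contain no escaped quotes.
function parseParams(scriptText) {
  var params = {};
  var pairPattern = /([A-Za-z_$][\w$]*)\s*=\s*"([^"]*)"/g;
  var match;
  // exec() with the g flag advances through the string one pair at a time.
  while ((match = pairPattern.exec(scriptText)) !== null) {
    params[match[1]] = match[2];
  }
  return params;
}

// Content of the script block from the first post.
var sample = '<!--//--><![CDATA[//><!--param1="value1";param2="value2";//--><!]]>';
var parsed = parseParams(sample);
```

Because the pattern only looks for identifier="value" pairs, the surrounding comment and CDATA markers do not need to be stripped first.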

    good luck!

    Last edited by rnd me; 04-24-2009 at 01:27 AM.

  • #3
    Senior Coder Arbitrator's Avatar
    Join Date
    Mar 2006
    Location
    Splendora, Texas, United States of America
    Posts
    3,302
    Thanks
    28
    Thanked 276 Times in 270 Posts
    Quote Originally Posted by _.stan._
    Hi,
    I want to read the parameters from CDATA within a HTML page, similar like the following:
    <html>
    <head>
    <title>AnyTitle</title>
    </head>
    <body>
    <h1>Header</h1><br/>
    Text1
    <br/>
    Text2
    <script defer="defer" type="text/javascript"><!--//--><![CDATA[//><!--param1="value1";param2="value2";//--><!]]></script>
    </body>
    </html>

    I can find the script element, but unfortunately I don't know how to proceed.
    Any ideas are welcome!!!

    Thanks in advance, _.stan._
    The W3C DOM specs only allow you to access CDATA nodes in XML documents. [1] While HTML 4.01 documents can have CDATA sections too [2], no browser supports them in such documents except Opera; hence why the DOM spec limits the CDATA interface to XML documents. That may change though since the current HTML 5 Working Draft mandates that they be supported. [3]

    That said, using CDATA sections within an HTML script element is totally pointless since the content of that element is CDATA by default. [4] Your script element and its content can, thus, be reduced to the following:

    Code:
    <script type="text/ecmascript" defer="defer">
    	var param1 = "value1";
    	var param2 = "value2";
    </script>
    Note that I added var keywords; it's good practice to declare your variables. (I also changed the formatting and MIME type to suit my preferences, but those changes have no effective functional difference.)

    If you're trying to access the content of the script element, you could use DOM Core methods; namely, you would need to access the text node's (i.e., firstChild node's) data [5] or nodeValue [6] properties.

    Code:
    var first_script_element = document.getElementsByTagName("script").item(0);
    var script_data = first_script_element.firstChild.data; // or
    script_data = first_script_element.firstChild.nodeValue;
    However, due to a lack of support in Windows Internet Explorer, you have to use an alternative from W3C DOM2 HTML: access the script element's text property [7]:

    Code:
    var script_data = document.getElementsByTagName("script").item(0).text;
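The three access paths mentioned above could be combined into one fallback helper along these lines. This is a rough sketch, not tested across browsers; getScriptText is a hypothetical name, and the helper only relies on the properties already discussed (text, firstChild.data) plus textContent from DOM Level 3 Core.

```javascript
// Hypothetical helper: returns the inline source of a script element as a string.
// It only reads properties, so any object with the same shape works as well.
function getScriptText(scriptElement) {
  if (typeof scriptElement.text === "string") {
    return scriptElement.text;          // DOM Level 2 HTML: HTMLScriptElement.text
  }
  if (typeof scriptElement.textContent === "string") {
    return scriptElement.textContent;   // DOM Level 3 Core
  }
  var child = scriptElement.firstChild; // DOM Core text node, if there is one
  return child ? child.data : "";
}
```

In a page it would be called as getScriptText(document.getElementsByTagName("script").item(0)).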
    Finally, your HTML document is not properly written. You need to have a document type declaration such as <!doctype html public "-//W3C//DTD HTML 4.01//EN"> at the top of your document. Don't use XML empty-element tag syntax [8] in HTML documents; in other words, use <br> instead of <br/>. Lastly, if you choose to use the aforementioned HTML 4.01 Strict doctype declaration, you can't nest inline content like Text1 and Text2 directly within the body element; wrapping everything within div elements is one method of addressing that issue.

    1. http://www.w3.org/TR/DOM-Level-3-Cor...ml#ID-E067D597
    2. http://www.w3.org/TR/REC-html40/appe...s.html#h-B.3.5
    3. http://www.whatwg.org/specs/web-apps...cdata-sections
    4. http://www.w3.org/TR/REC-html40/types.html#type-cdata
    5. http://www.w3.org/TR/DOM-Level-3-Cor...ml#ID-72AB8359
    6. http://www.w3.org/TR/DOM-Level-3-Cor...tml#ID-F68D080
    7. http://www.w3.org/TR/DOM-Level-2-HTM...ml#ID-46872999
    8. http://www.w3.org/TR/REC-xml/#dt-eetag
    For every complex problem, there is an answer that is clear, simple, and wrong.

  • #4
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Hi Guys,
    thanks for the answer, but...

    I was afraid I would be advised to avoid the kind of code I pasted here, but that is something I cannot influence. I have to deal with code like that and have no means to change it (it is used in our system to transfer statistics parameters).

    So, if you have suggestions for how to read data from the code above, please let me know.

    Thanks, _.stan._

  • #5
    Senior Coder Arbitrator's Avatar
    Join Date
    Mar 2006
    Location
    Splendora, Texas, United States of America
    Posts
    3,302
    Thanks
    28
    Thanked 276 Times in 270 Posts
    Quote Originally Posted by _.stan._
    I was afraid I would be advised to avoid the kind of code I pasted here, but that is something I cannot influence. I have to deal with code like that and have no means to change it (it is used in our system to transfer statistics parameters).
    If that is the case, you should have indicated such in your initial post; when people submit poorly written code, the assumption is that they don't know how to write it properly or, otherwise, made mistakes. Clarifying that would have saved time.

    Anyway, I still can't tell how much control you have over your code; are you only able to alter the HTML content of the script element?

    Quote Originally Posted by _.stan._
    So, if you have suggestions for how to read data from the code above, please let me know.
    I gave such suggestions. (Specifically, they allow you to load the content of the script element into a variable as a string then do with it what you wish. I have to wonder whether or not you actually read the post or just skimmed it.)

    If neither rnd me nor I gave suggestions that help you, you may want to clarify the problem and what you're trying to accomplish while detailing (as mentioned) exactly which parts of it you have control over.

  • #6
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Hi,
    to be more precise, I have several script tags in these pages, so I have to find out which one contains these parameters. Since no id or name is used, I have to find all script tags. I used the same method as mentioned above:
    var scripts = document.getElementsByTagName('script');

    Then I would like to somehow get the content of the tags as a string and simply search it. But unfortunately I cannot get the content at all. I tried the following methods:
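    var menu = '';
    for (var i = 0; i < scripts.length; i++) {
        // alternatives tried, one at a time:
        menu += removeWrappers(scripts[i].innerHTML);
        menu += scripts[i].innerHTML;
        // scripts[i] is already the element, so no item(0); firstChild is null for empty scripts
        menu += scripts[i].firstChild.nodeValue;
        menu += scripts[i].firstChild.data;
        for (var j = 0; j < scripts[i].childNodes.length; j++) {
            menu += scripts[i].childNodes[j].nodeValue;
        }
    }

    // strips <...> and [...] spans; note that this also removes the <!-- ... --> span holding the parameters
    function removeWrappers(obj) {
        obj = obj.replace(/<[^>]*>/g, '');
        return obj.replace(/\[[^\]]*\]/g, '');
    }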

    Consider the above lines and blocks as separate solutions, but I did not want to have a long code with copy pasted lines.

    It would be fine to get everything between the two tags as a string, but I cannot manage it.

    Thanks, _.stan._

  • #7
    Senior Coder Arbitrator's Avatar
    Join Date
    Mar 2006
    Location
    Splendora, Texas, United States of America
    Posts
    3,302
    Thanks
    28
    Thanked 276 Times in 270 Posts
    Quote Originally Posted by _.stan._
    Then I would like to somehow get the content of the tags as a string and simply search it. But unfortunately I cannot get the content at all.

    [...]

    It would be fine to get everything between the two tags as a string, but I cannot manage it.
    I covered two of your failed methods as well as one that works in WIE in my initial post:

    Code:
    var script_data = document.getElementsByTagName("script").item(0).text;
    That code only retrieves the content of the first script element to appear in the document though and won't allow you to retrieve information from external scripts (linked via the src attribute), at least, in the browser in which I did my testing: Mozilla Firefox. I'm not aware of any method for retrieving the content of external script files.

    That said, the below code uses the above technique to push the content of all script elements into an array; for external scripts, a zero-length string is pushed into the array. After this code is executed, you can then use a loop (not provided) to search the strings in the array for whatever information that you need:

    Code:
    var script_elements = document.getElementsByTagName("script");
    var script_data_array = [];
    for (var i = 0; i < script_elements.length; i++) {
    	script_data_array.push(script_elements.item(i).text);
    }
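The search loop that was left out could look roughly like this. It is a sketch only: findScriptIndex is a hypothetical helper, the marker string "param1=" is an assumption based on the sample in the first post, and exampleScripts stands in for the script_data_array built above.

```javascript
// Hypothetical helper: returns the index of the first string containing the marker, or -1.
function findScriptIndex(scriptDataArray, marker) {
  for (var i = 0; i < scriptDataArray.length; i++) {
    if (scriptDataArray[i].indexOf(marker) !== -1) {
      return i;
    }
  }
  return -1;
}

// Stand-in for the array built from the page's script elements.
var exampleScripts = [
  "",                   // external script: zero-length string
  "var unrelated = 1;", // some other inline script
  '<!--//--><![CDATA[//><!--param1="value1";param2="value2";//--><!]]>'
];
var foundAt = findScriptIndex(exampleScripts, "param1=");
```

The string at the returned index can then be sliced or run through a regexp to get the individual values.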

  • Users who have thanked Arbitrator for this post:

    _.stan._ (04-27-2009)

  • #8
    New to the CF scene
    Join Date
    Apr 2009
    Posts
    4
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Thanks

    I struggled with this for hours before I realized that I had been able to get the content of the script tags from the very beginning; when I wrote it out for debugging, it was not visible because it was inside comments.

  • #9
    Senior Coder Arbitrator's Avatar
    Join Date
    Mar 2006
    Location
    Splendora, Texas, United States of America
    Posts
    3,302
    Thanks
    28
    Thanked 276 Times in 270 Posts
    Quote Originally Posted by _.stan._
    I struggled with this for hours before I realized that I had been able to get the content of the script tags from the very beginning; when I wrote it out for debugging, it was not visible because it was inside comments.
    I would get rid of the comments. They're a leftover from the past that most people use without understanding why they're using them. And, as far as I can tell, there's no practical reason to use them; I've yet to meet a single person that can tell me which browser(s) such commenting schemes would cause pages to render better in.

    The CDATA section markers are also unnecessary unless you plan on serving XML-compatible XHTML Web pages; that doesn't seem very likely considering that your document doesn't declare a namespace.

    Finally, while you've probably already figured this out, you can get more reliable results by outputting the text via the window.alert method (non-standard) or by appending a text node, created via the document.createTextNode method, to the document. (Presumably, you were using the HTMLElement.innerHTML property earlier.)
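If the debug output has to go through innerHTML anyway, escaping the markup characters first keeps the comment markers visible instead of having them parsed as a comment. A minimal sketch, assuming a hypothetical escapeHtml helper:

```javascript
// Hypothetical helper: makes a string safe to display through innerHTML.
// "&" is replaced first so the entities produced by the later steps are not escaped again.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

var raw = '<!--param1="value1";//-->';
var safe = escapeHtml(raw);
```

Assigning safe to an element's innerHTML shows the original text, comment markers included.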

