  • #1
    New Coder
    Join Date
    Jul 2005
    Posts
    10
    Thanks
    0
    Thanked 0 Times in 0 Posts

    How to capture the content between open and close tags

    Hi, I have an HTML file with a lot of content, including many open and close tags. I am trying to capture the date between the open and close tags shown below, but I am not sure whether a regular expression would help. If a regular expression would work for this, how would I write it?

    <mso:Date_x0020_updated msdt:dt="string">2005-04-06T00:00:00Z</mso:Date_x0020_updated>


    Please help!

    Thanks so much

    Many
    Last edited by manijs; 07-08-2005 at 07:43 AM.
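
On the regular-expression part of the question: if the markup is available as a string, a pattern along the following lines can capture the text between one opening and closing tag. This is only a minimal sketch; the helper name `captureBetweenTags` is hypothetical, and it assumes the tag contains plain text with no nested tags.

```javascript
// Hypothetical helper: returns the text between <tagName ...> and </tagName>,
// or null when the tag is not found. Works on a string, not on the DOM.
function captureBetweenTags(html, tagName) {
  // Escape characters that are special in a regular expression.
  var safeName = tagName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // <tag attributes> (captured text without nested tags) </tag>, case-insensitive
  var pattern = new RegExp(
    '<' + safeName + '\\b[^>]*>([^<]*)</' + safeName + '\\s*>', 'i');
  var match = pattern.exec(html);
  return match ? match[1] : null;
}

var sample = '<mso:Date_x0020_updated msdt:dt="string">' +
  '2005-04-06T00:00:00Z</mso:Date_x0020_updated>';
var updated = captureBetweenTags(sample, 'mso:Date_x0020_updated');
console.log(updated); // prints "2005-04-06T00:00:00Z"
```

A regular expression like this is fine for one known, flat tag; it is not a general HTML parser.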

  • #2
    Regular Coder
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    375
    Thanks
    0
    Thanked 0 Times in 0 Posts
    My first thought was to grab the tag's innerHTML, but IE6 does not treat it as a tag in the normal sense (what look like its opening and closing tags are treated as two separate tags), so I came up with this convoluted method:
    Code:
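    function getMsoDateTagData(){
    // text of the opening tag, lowercased; its length (40) is the offset to skip
    var marker='mso:date_x0020_updated msdt:dt="string">';
    var msoTest=document.getElementsByTagName('mso:Date_x0020_updated')[0];
    // read the parent's markup, since the tag itself has no usable innerHTML in IE6
    var blah=msoTest.parentNode.innerHTML;
    // drop everything up to and including the last opening tag
    blah=blah.substr(blah.toLowerCase().lastIndexOf(marker)+marker.length);
    // keep the text up to the next tag
    blah=blah.substr(0,blah.indexOf('<'));
    return blah;
    }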
    As long as you don't have more than one of these tags on the page, this will give you the string you are looking for. One possible usage:

    alert (getMsoDateTagData())

    One caveat - you must wait until the page has loaded to use this function. One other thing, this all assumes that the name of the tag never changes.
    Last edited by jscheuer1; 07-08-2005 at 09:33 AM.
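
The slicing steps in the function above can be tried outside a browser on a plain string. The sketch below is an illustration only: `textAfterMarker` is a hypothetical name, the sample markup is made up, and the uppercase tag name merely shows why the search is done on a lowercased copy (presumably because the browser may change the case of tag names in innerHTML).

```javascript
// Hypothetical helper: the same slicing steps, applied to a plain string.
function textAfterMarker(html, marker) {
  // Search case-insensitively, as the original does with toLowerCase().
  var start = html.toLowerCase().lastIndexOf(marker.toLowerCase());
  if (start === -1) {
    return null; // marker not present, nothing to slice
  }
  // Skip the marker itself; marker.length replaces the hand-counted offset.
  var rest = html.substring(start + marker.length);
  var end = rest.indexOf('<');
  // If no further tag follows, keep the remainder of the string.
  return end === -1 ? rest : rest.substring(0, end);
}

var marker = 'mso:date_x0020_updated msdt:dt="string">';
var parentMarkup = '<MSO:DATE_X0020_UPDATED msdt:dt="string">' +
  '2005-04-06T00:00:00Z</MSO:DATE_X0020_UPDATED>';
console.log(marker.length);                         // prints 40
console.log(textAfterMarker(parentMarkup, marker)); // prints "2005-04-06T00:00:00Z"
```

Returning null when the marker is missing avoids the silent wrong result that a lastIndexOf value of -1 plus a fixed offset would otherwise produce.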

  • #3
    Regular Coder
    Join Date
    May 2005
    Posts
    313
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Internet Explorer

    If you can, declare mso as a namespace (see the reference page "CUSTOM Element | custom Object"); its tagName would then be "Date_x0020_updated" (or whatever follows the mso: prefix).
    Thanks in advance!

  • #4
    Kor
    Red Devil Mod
    Join Date
    Apr 2003
    Location
    Bucharest, ROMANIA
    Posts
    8,478
    Thanks
    58
    Thanked 379 Times in 375 Posts
    object.firstChild.nodeValue
    or
    object.firstChild.data

    if the content is a textNode
    KOR
    Offshore programming
    -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
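
The firstChild.nodeValue suggestion can be wrapped in a small guard so that it returns nothing useful only when the first child really is not a text node. This is a sketch under assumptions: `firstChildText` is a hypothetical name, and the plain objects below merely stand in for DOM nodes so the code runs without a browser.

```javascript
var TEXT_NODE = 3; // value of Node.TEXT_NODE in the DOM specification

// Hypothetical helper: returns the text of a node's first child,
// or null when there is no first child or it is not a text node.
function firstChildText(node) {
  if (!node || !node.firstChild) {
    return null;
  }
  var child = node.firstChild;
  return child.nodeType === TEXT_NODE ? child.nodeValue : null;
}

// Plain-object stand-in for an element that contains one text node.
var fakeElement = {
  nodeType: 1,
  firstChild: { nodeType: 3, nodeValue: '2005-04-06T00:00:00Z' }
};
console.log(firstChildText(fakeElement));                       // prints "2005-04-06T00:00:00Z"
console.log(firstChildText({ nodeType: 1, firstChild: null })); // prints null
```

In a browser the same function could be passed a real element, provided the browser actually exposes the text as a child of that element.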

  • #5
    Regular Coder
    Join Date
    May 2005
    Posts
    313
    Thanks
    0
    Thanked 0 Times in 0 Posts
    If the object is an object.

  • #6
    Kor
    Red Devil Mod
    Join Date
    Apr 2003
    Location
    Bucharest, ROMANIA
    Posts
    8,478
    Thanks
    58
    Thanked 379 Times in 375 Posts
    Everything can be considered an object (well, almost). The problem is, as jscheuer1 noticed, how to refer to the object and, I should add, which methods can be applied to it. The XML DOM is not quite the same as the HTML DOM from this point of view.

  • #7
    Regular Coder
    Join Date
    May 2005
    Posts
    313
    Thanks
    0
    Thanked 0 Times in 0 Posts
    A custom object is created when its namespace is declared. "Otherwise, the custom tag is treated as an unknown tag when the document is parsed."

  • #8
    New Coder
    Join Date
    Jul 2005
    Posts
    10
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Hello jscheuer1

    Hi jscheuer1, the script you posted earlier was very useful. How do I also grab the content between the <mso:Prime_x0020_SME msdt:dt="string">mailto:Many, Many</mso:Prime_x0020_SME> tags
    and write both values out on the same page?

    please see below:

    <mso:Date_x0020_updated msdt:dt="string">2005-04-06T00:00:00Z</mso:Date_x0020_updated>
    <br>
    <mso:Prime_x0020_SME msdt:dt="string">mailto:Many, Many</mso:Prime_x0020_SME>


    Thanks,
    I appreciate your help.

    Many

  • #9
    Regular Coder
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    375
    Thanks
    0
    Thanked 0 Times in 0 Posts
    After checking the DOM inspector, I thought firstChild.data looked very promising. It worked great in FF, but neither it nor firstChild.nodeValue worked in IE. I'm hesitant to declare a namespace for these tags because I suspect they are proprietary MS tags to begin with. Anyway, thanks for the ideas. Now, for manijs, here is what works here (put this in the head):
    Code:
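    <script type="text/javascript">
    // Both functions read the parent's innerHTML and slice out the text after the opening tag.
    function getMsoDateTagData(){
    // text of the opening tag, lowercased; its length (40) is the offset to skip
    var marker='mso:date_x0020_updated msdt:dt="string">';
    var msoTest=document.getElementsByTagName('mso:date_x0020_updated')[0];
    var blah=msoTest.parentNode.innerHTML;
    // drop everything up to and including the last opening tag
    blah=blah.substr(blah.toLowerCase().lastIndexOf(marker)+marker.length);
    // keep the text up to the next tag
    blah=blah.substr(0,blah.indexOf('<'));
    return blah;
    }
    
    function getMsoPrimeTagData(){
    // same approach; this opening tag is 37 characters long
    var marker='mso:prime_x0020_sme msdt:dt="string">';
    var msoTest=document.getElementsByTagName('mso:prime_x0020_sme')[0];
    var blah=msoTest.parentNode.innerHTML;
    blah=blah.substr(blah.toLowerCase().lastIndexOf(marker)+marker.length);
    blah=blah.substr(0,blah.indexOf('<'));
    return blah;
    }
    
    // wait until the page has loaded, then write both values into the span
    window.onload=function(){
    document.getElementById('datum').innerHTML=getMsoDateTagData()+' '+getMsoPrimeTagData();
    };
    </script>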
    and this goes in the body:
    Code:
    <span id="datum"></span>
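
If more fields are needed later, writing one function per tag gets repetitive. As a rough alternative, assuming the page markup is available as a string, every mso:* tag could be collected in one pass. The name `collectMsoFields` and the sample string are hypothetical, and the pattern assumes each tag holds plain text only.

```javascript
// Hypothetical helper: maps each mso:* tag name (lowercased) to its text.
function collectMsoFields(html) {
  var fields = {};
  // Group 1: the name after "mso:"; group 2: the text up to the closing tag.
  // \1 requires the closing tag to carry the same name as the opening tag.
  var pattern = /<mso:([\w.-]+)[^>]*>([^<]*)<\/mso:\1\s*>/gi;
  var match;
  while ((match = pattern.exec(html)) !== null) {
    fields[match[1].toLowerCase()] = match[2];
  }
  return fields;
}

var page =
  '<mso:Date_x0020_updated msdt:dt="string">2005-04-06T00:00:00Z</mso:Date_x0020_updated>\n' +
  '<br>\n' +
  '<mso:Prime_x0020_SME msdt:dt="string">mailto:Many, Many</mso:Prime_x0020_SME>';
var fields = collectMsoFields(page);
console.log(fields['date_x0020_updated'] + ' ' + fields['prime_x0020_sme']);
// prints "2005-04-06T00:00:00Z mailto:Many, Many"
```

Keys are lowercased so that lookups do not depend on how the browser or the document happens to capitalize the tag names.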

  • #10
    Kor
    Red Devil Mod
    Join Date
    Apr 2003
    Location
    Bucharest, ROMANIA
    Posts
    8,478
    Thanks
    58
    Thanked 379 Times in 375 Posts
    Quote (jscheuer1): "It worked great in FF but neither it nor firstChild.nodeValue worked in IE."
    It might be the so-called "gap problem". If there is a possible text node (even an empty one, hence a "gap"), Mozilla treats it as a child, while IE ignores it...

  • #11
    Regular Coder
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    375
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Kor, I've run across the gap problem before, but this seems to be the opposite situation: FF sees the text as the firstChild, whereas IE does not. In my experience, the intervening blank text node throws off FF, not IE.
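
For reference, the usual workaround for the gap problem is to skip whitespace-only text nodes while walking the children. The sketch below assumes nothing beyond the standard nodeType, nodeValue, firstChild and nextSibling properties; `firstMeaningfulChild` is a hypothetical name, and plain objects stand in for DOM nodes so it runs without a browser.

```javascript
// Hypothetical helper: first child that is not a whitespace-only text node.
function firstMeaningfulChild(node) {
  var child = node.firstChild;
  while (child) {
    var isBlankText = child.nodeType === 3 && /^\s*$/.test(child.nodeValue);
    if (!isBlankText) {
      return child;
    }
    child = child.nextSibling;
  }
  return null; // no children, or only blank text nodes
}

// Node-like stand-ins: a blank text node (the "gap") followed by a <span>.
var span = { nodeType: 1, nodeName: 'SPAN', nextSibling: null };
var gap = { nodeType: 3, nodeValue: '\n    ', nextSibling: span };
var container = { nodeType: 1, firstChild: gap };
console.log(firstMeaningfulChild(container).nodeName); // prints "SPAN"
```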

  • #12
    Kor
    Red Devil Mod
    Join Date
    Apr 2003
    Location
    Bucharest, ROMANIA
    Posts
    8,478
    Thanks
    58
    Thanked 379 Times in 375 Posts
    Well, both data and nodeValue will return the inner text (except that, at least in theory, nodeValue is a read-only attribute), so I guess that innerHTML (even though it is not part of the standard DOM) must have solved the problem... If not, as you have said, then I guess you should loop through the object's child nodes and clone them into a collection of objects.

  • #13
    New Coder
    Join Date
    Jul 2005
    Posts
    10
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi jscheuer1, based on your script in post #9 (the getMsoDateTagData() and getMsoPrimeTagData() functions plus the window.onload handler), what if the open and close tags whose content I am trying to grab are in the <head></head> instead of the <body></body>? How do I modify the code to grab the contents as before? Is it necessary to declare the <span id="datum"></span> in the body?

    Thanks a lot,

    Many

  • #14
    Regular Coder
    Join Date
    May 2005
    Posts
    313
    Thanks
    0
    Thanked 0 Times in 0 Posts
    You guys are discussing this as if it were a legitimate technical issue; it isn't. Forget about Firefox. This is a Microsoft Office document. Just use IE to transform it, and be done already...

  • #15
    Regular Coder
    Join Date
    Mar 2005
    Location
    SE PA USA
    Posts
    375
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Enumerator, I'm not sure how that gets the OP the data desired.

    Manijs, I don't see why the tags being in the head would make a difference. The document.getElementsByTagName() method will scan the entire document. If these are proprietary tags, perhaps IE will not see them as tags at all when placed in the head. Did you test it out and find that to be the case? If so we can try a different method.

    <span id="datum"></span>

    Must be in the body, yes. But it needn't be anywhere if you have other uses for the data. Once the tags in question have been parsed by the browser, or at the very latest once the whole page has been parsed:

    getMsoDateTagData()

    and

    getMsoPrimeTagData()

    can be used like string values in any other code you like.
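
As an illustration of using the two return values elsewhere, the sketch below parses the date and strips the mailto: prefix. The two functions here are stand-ins that return the sample data from the thread so the code runs outside a browser, and parsing the string with Date assumes a JavaScript engine that understands ISO 8601 timestamps, which older browsers may not.

```javascript
// Stand-ins for the thread's functions; they return the sample values.
function getMsoDateTagData() { return '2005-04-06T00:00:00Z'; }
function getMsoPrimeTagData() { return 'mailto:Many, Many'; }

// The date string is in ISO 8601 form, so Date can parse it directly
// (assumption: an ECMAScript 5 or later engine).
var updated = new Date(getMsoDateTagData());
// Drop the "mailto:" prefix if present.
var contact = getMsoPrimeTagData().replace(/^mailto:/i, '');

var summary = 'Updated ' + updated.getUTCFullYear() + '-' +
  (updated.getUTCMonth() + 1) + '-' + updated.getUTCDate() + ' by ' + contact;
console.log(summary); // prints "Updated 2005-4-6 by Many, Many"
```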

