  • #1
    WA
    Administrator
    Join Date
    Mar 2002
    Posts
    2,596
    Thanks
    2
    Thanked 19 Times in 18 Posts

    Tough one- using regular expressions to remove HTML tags?

    Ok, this is probably as much a challenge as it is a question. Within a string, I'm looking for a way to remove all potential HTML tags. For example, using the following string:

    var mystring='This is a <b>very</b> interesting <a href="http://www.javascriptkit.com">site</a> for JS'

    The resulting output should be instead:

    var mystring='This is a very interesting site for JS'

    I would define an HTML tag as anything that's surrounded by < >.

    Obviously a regular expression is needed here, though I would guess back references specifically. It's definitely not my strong suit.

    Thanks!
    - George
    - JavaScript Kit- JavaScript tutorials and 400+ scripts!
    - JavaScript Reference- JavaScript reference you can relate to.

  • #2
    Senior Coder
    Join Date
    Jun 2002
    Location
    frankfurt, german banana republic
    Posts
    1,848
    Thanks
    0
    Thanked 0 Times in 0 Posts
    You could try

    Code:
    mystring = mystring.replace(/\<.+?\>/g, '');
    though it doesn't make use of backreferences...
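    For anyone following along, here is a minimal sketch of that suggestion applied to the sample string from the first post. The helper name stripTags is made up for illustration, and the backslashes before < and > are dropped because those characters need no escaping in a JavaScript regex.

```javascript
// Hypothetical helper name; the pattern is the one suggested above,
// written without the (optional) backslashes before < and >.
function stripTags(input) {
  // "<", then as few characters as possible (lazy .+?), then ">".
  // The g flag repeats the replacement for every match in the string.
  return input.replace(/<.+?>/g, '');
}

const mystring = 'This is a <b>very</b> interesting <a href="http://www.javascriptkit.com">site</a> for JS';
const stripped = stripTags(mystring);

console.log(stripped); // This is a very interesting site for JS
```

    This only treats "anything surrounded by < >" as a tag, as defined in the question. It is not an HTML parser, so it should not be relied on for sanitizing untrusted input.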

  • #3
    Senior Coder
    Join Date
    Jun 2002
    Location
    near Oswestry
    Posts
    4,508
    Thanks
    0
    Thanked 0 Times in 0 Posts
    dude - you should get O'Reilly's book - it changed my life

  • #4
    WA
    Administrator
    Join Date
    Mar 2002
    Posts
    2,596
    Thanks
    2
    Thanked 19 Times in 18 Posts
    Thanks guys. mordred, your solution may just work for me! I was getting ahead of myself with the comment on back referencing. It seems I only need that when trying to replace a regular HTML tag with a custom one, such as from:

    <a href="http://www.dynamicdrive.com">

    to

    [a url="http://www.dynamicdrive.com"]

    or vice versa.

    I'll try and shoot some holes into your solution a bit later. If it's bullet proof, that'd be awesome.
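    A rough sketch of the tag conversion described above, assuming the tag is exactly an a element with a single double-quoted href attribute. The function names are hypothetical. In the replacement string, $1 refers back to whatever the first capture group matched.

```javascript
// Hypothetical helpers; they assume the exact form <a href="..."> and [a url="..."].
function anchorToBracket(input) {
  // ([^"]*) captures the URL between the quotes; $1 reinserts it.
  return input.replace(/<a href="([^"]*)">/g, '[a url="$1"]');
}

function bracketToAnchor(input) {
  // Square brackets are special in a regex, so they are escaped here.
  return input.replace(/\[a url="([^"]*)"\]/g, '<a href="$1">');
}

const html = '<a href="http://www.dynamicdrive.com">';
const bracket = anchorToBracket(html);

console.log(bracket);                           // [a url="http://www.dynamicdrive.com"]
console.log(bracketToAnchor(bracket) === html); // true
```

    Tags with extra attributes, single quotes, or different spacing would need a looser pattern than this one.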
    - George

  • #5
    Senior Coder
    Join Date
    Aug 2002
    Posts
    3,467
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Regular expressions by their nature are greedy, and unfortunately there is no way to Ungreedy a regex in JS. So, I think the more appropriate syntax is
    Code:
    mystring = mystring.replace(/\<[^\>]+\>/g, '');
    My Site | fValidate | My Brainbench | MSDN | Gecko | xBrowser DOM | PHP | Ars | PVP
    “Minds are like parachutes. They don't work unless they are open”
    “Maturity is simply knowing when to not be immature”
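    One practical difference between the two patterns posted so far, sketched with a made-up input: in JavaScript the dot does not match a line break by default, while a negated character class such as [^>] does. A tag that wraps onto a second line is therefore only caught by the negated-class version.

```javascript
// Made-up input: the opening tag contains a line break.
const multiline = 'Visit <a\n href="http://www.javascriptkit.com">JavaScript Kit</a> today';

// Lazy dot version: "." stops at the newline, so the opening tag is not matched.
const lazyResult = multiline.replace(/<.+?>/g, '');

// Negated class version: [^>] matches any character except ">", newlines included.
const negatedResult = multiline.replace(/<[^>]+>/g, '');

console.log(JSON.stringify(lazyResult));
console.log(JSON.stringify(negatedResult)); // "Visit JavaScript Kit today"
```

    Both patterns still stop at the first > they find, so a > inside an attribute value would trip up either one.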

  • #6
    Senior Coder
    Join Date
    Jun 2002
    Location
    frankfurt, german banana republic
    Posts
    1,848
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Of course you can. That's what the question mark after the quantifier is for in my regexp. IIRC the greedy/ungreedy switching was added in JavaScript 1.5, but hey, who targets NN4 or IE4 any longer today?
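    To make the difference concrete, here is a small comparison on a shortened version of the original sample string. The variable names are only for illustration.

```javascript
const sample = 'This is a <b>very</b> interesting site';

// Greedy: .+ runs from the first "<" to the LAST ">", swallowing "very" too.
const greedy = sample.replace(/<.+>/g, '');

// Lazy: .+? stops at the first ">" it can, so each tag is matched on its own.
const lazy = sample.replace(/<.+?>/g, '');

console.log(JSON.stringify(greedy)); // "This is a  interesting site"
console.log(JSON.stringify(lazy));   // "This is a very interesting site"
```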

  • #7
    Senior Coder
    Join Date
    Aug 2002
    Posts
    3,467
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Originally posted by mordred
    Of course you can. That's what the question mark after the quantifier is for in my regexp. IIRC the greedy/ungreedy switching was added in JavaScript 1.5, but hey, who targets NN4 or IE4 any longer today?
    Whoa, cool. Thanks mordred

  • #8
    Senior Coder
    Join Date
    Jun 2002
    Location
    near Oswestry
    Posts
    4,508
    Thanks
    0
    Thanked 0 Times in 0 Posts
    ungreediness switching?? Is that generally available in regex, or can it only be done with vendor or language specific syntax?

  • #9
    Senior Coder
    Join Date
    Aug 2002
    Posts
    3,467
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Originally posted by brothercake
    ungreediness switching?? Is that generally available in regex, or can it only be done with vendor or language specific syntax?
    Not sure, but typically it's a flag or modifier. For example, that same regex in PHP would be
    PHP Code:
    $mystring = preg_replace( "/\\<.*\\>/U", "", $mystring );

  • #10
    Senior Coder
    Join Date
    Jun 2002
    Location
    near Oswestry
    Posts
    4,508
    Thanks
    0
    Thanked 0 Times in 0 Posts
    so it's the U in that case?

  • #11
    Senior Coder
    Join Date
    Aug 2002
    Posts
    3,467
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Originally posted by brothercake
    so it's the U in that case?
    Yes. But, obviously as all flags do, the Ungreedy modifier affects the whole pattern.
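    By contrast, JavaScript has no pattern-wide ungreedy flag as far as I know. Laziness is chosen per quantifier with the trailing ?, so greedy and lazy quantifiers can sit side by side in one pattern. A small sketch with a made-up input:

```javascript
// (a+?) is lazy and takes as few "a"s as it can; (a*) is greedy and takes the rest.
const parts = 'aaab'.match(/^(a+?)(a*)b$/);

console.log(parts[1]); // a
console.log(parts[2]); // aa
```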

  • #12
    Senior Coder
    Join Date
    Jun 2002
    Location
    near Oswestry
    Posts
    4,508
    Thanks
    0
    Thanked 0 Times in 0 Posts
    thanks

  • #13
    Senior Coder
    Join Date
    Aug 2002
    Posts
    3,467
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I've also since learned that the double quantifier (using the ? after a quantifier) is the PCRE syntax for ungreedy matching.

  • #14
    Senior Coder
    Join Date
    Jun 2002
    Location
    near Oswestry
    Posts
    4,508
    Thanks
    0
    Thanked 0 Times in 0 Posts
    ahh .. hence mordred's original expression ..

