  1. #1
    Regular Coder
    Join Date
    Dec 2003
    Posts
    100
    Thanks
    1
    Thanked 0 Times in 0 Posts

    Regular expressions that can match multiple substrings

    I'm trying to use regular expressions to filter information out of an HTML table. For example, I might want to strip a row from the table with a certain class, so I used the following piece of code.

    Code:
    html = html.replace(/<tr class="some class">.*<\/tr>/, "");
    However, since this is a table, it's filled with </tr> tags, and the replace call keeps going until the last </tr> it can find, so it deletes everything from my row to the end of the table. Is there any way I can have the replacement act only upon the row I want? That is, can I tell replace to stop at the first </tr> tag it comes across, instead of the last one?

  • #2
    Senior Coder rnd me's Avatar
    Join Date
    Jun 2007
    Location
    Urbana
    Posts
    4,296
    Thanks
    10
    Thanked 584 Times in 565 Posts
    You can make the quantifier non-greedy by putting a "?" right after it.

    Example:
    Code:
    html = html.replace(/<tr class="some class">.*?<\/tr>/, "");
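One caveat worth checking, sketched below under assumptions: in JavaScript the "." does not match line breaks, so the pattern above only removes a row that sits on a single line. The sample markup and the names multiLine, unchanged and stripped are made up for illustration; the original post does not show its table. Swapping "." for [\s\S] is one common workaround, not necessarily the only one.

```javascript
// Hypothetical multi-line input; the original post does not show its markup.
const multiLine = [
  "<table>",
  '<tr class="some class">',
  "<td>a</td>",
  "</tr>",
  "<tr><td>b</td></tr>",
  "</table>",
].join("\n");

// "." stops at line breaks, so this pattern finds nothing to remove here.
const unchanged = multiLine.replace(/<tr class="some class">.*?<\/tr>/, "");

// [\s\S] matches any character, newlines included; *? keeps it non-greedy.
const stripped = multiLine.replace(/<tr class="some class">[\s\S]*?<\/tr>/, "");

console.log(unchanged === multiLine); // true
console.log(stripped);
```

If the rows in your table are all on one line, the shorter .*? version is enough.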

  • #3
    Regular Coder
    Join Date
    Dec 2003
    Posts
    100
    Thanks
    1
    Thanked 0 Times in 0 Posts
    Thanks, that works perfectly. However, I fail to see exactly why it works. The question mark makes it so that only one instance of .* is searched for, I presume, but I don't see how that relates to my problem. That one instance could still be the entire table, could it not? Would you care to elaborate?

  • #4
    Supreme Master coder! Philip M's Avatar
    Join Date
    Jun 2002
    Location
    London, England
    Posts
    17,918
    Thanks
    203
    Thanked 2,531 Times in 2,509 Posts
    html = html.replace(/<tr class="some class">.*<\/tr>/, "");

    . means any character (except a line break)
    * means zero or more instances of the preceding item

    so the regex grabs everything between the first <tr class .....
    and the last </tr>. This is called "greedy" matching.

    html = html.replace(/<tr class="some class">.*?<\/tr>/, "");

    When the ? character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. A non-greedy pattern matches as little of the searched string as possible, whereas the default greedy pattern matches as much of the searched string as possible.
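To see the difference side by side, here is a minimal sketch. The sample table string and the names table, greedy and lazy are assumptions for illustration, not taken from the original post.

```javascript
// Hypothetical sample data: a tiny single-line table, not from the original post.
const table =
  '<table><tr class="some class"><td>a</td></tr><tr><td>b</td></tr></table>';

// Greedy: .* runs to the end of the line, then backtracks to the LAST </tr>.
const greedy = table.replace(/<tr class="some class">.*<\/tr>/, "");

// Non-greedy: .*? grows one character at a time and stops at the FIRST </tr>.
const lazy = table.replace(/<tr class="some class">.*?<\/tr>/, "");

console.log(greedy); // <table></table>
console.log(lazy);   // <table><tr><td>b</td></tr></table>
```

The greedy version removes both rows because the last </tr> belongs to the second row; the non-greedy version removes only the row with the class.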



  • #5
    Regular Coder
    Join Date
    Dec 2003
    Posts
    100
    Thanks
    1
    Thanked 0 Times in 0 Posts
    I see, I wasn't aware that the question mark did that as well. Thanks for the explanation.

