Reverse Regex


adios
10-10-2002, 04:58 AM
How does one specify a regex pattern in the negative? That is, if you used it normally with the String.replace() method, it would replace all occurrences of the pattern with the specified string; suppose you wanted all non-matching occurrences to be replaced? I know you can negate character classes like [^\d\w] - but how would you 'negate' an entire pattern?

beetle
10-10-2002, 05:50 AM
Uh, maybe with backreferences?

normal
str = str.replace(/\d+/g," ");

reversed
str = str.replace(/.*(\d+).*/g," $1 ");

Well, that sort of does it. I think this task would need to be approached for a specific case. I don't know of any way to just plain negate it.
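
One way to approach the "reversed" case in general is to walk the matches and replace whatever lies between them. The following is only a sketch: replaceNonMatches is a hypothetical helper name, not a built-in, and it assumes a modern JavaScript engine (it relies on RegExp.prototype.flags) and a pattern that does not match the empty string.

```javascript
// Hypothetical helper: keep every match of `pattern`, and replace each
// non-matching stretch between matches with `replacement`.
function replaceNonMatches(input, pattern, replacement) {
  // Work on a copy with the global flag so exec() advances through the string.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);
  let out = '';
  let last = 0; // index just past the previous match
  let m;
  while ((m = re.exec(input)) !== null) {
    if (m.index > last) out += replacement; // text before this match
    out += m[0];                            // the match itself is kept
    last = m.index + m[0].length;
    if (m[0].length === 0) re.lastIndex++;  // guard against looping on empty matches
  }
  if (last < input.length) out += replacement; // trailing non-matching text
  return out;
}

console.log(replaceNonMatches('abc123def45', /\d+/, ' ')); // " 123 45"
console.log(replaceNonMatches('hello', /ll/, ''));          // "ll"
```

Unlike the /.*(\d+).*/ version, this does not depend on how greedy .* happens to be, because the non-matching text is never matched by a regex at all.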

joh6nn
10-10-2002, 09:03 AM
i'd say you have to build an opposite RE. for instance, \d is any one digit, and \D is any one character that's not a digit.

check this out for help with RE's:
http://evolt.org/article/Regular_Expressions_in_JavaScript/17/36435/index.html

jkd
10-10-2002, 02:50 PM
/[^(?:originalpattern)]/g

Perhaps?

beetle
10-10-2002, 03:37 PM
I'd say there's some testing/exploring to be done here, although I admit my brain at the moment is being powered solely by the cup of coffee in my hand, so I'm having a hard time thinking of a useful subject...

adios
10-10-2002, 08:43 PM
Many thanks for all the replies (and joh6nn for that link). I'm busy digesting it all (and drinking yet more coffee) at the moment. I wonder if Perl (or another language) has built-in support for this sort of functionality. I was really ruminating as to how limited pattern 'negation' seems to be in JS: using that circumflex with character classes, or, as suggested, 'flipping' metacharacters. Must be a way to 'flip' entire patterns. Maybe not.

:confused:

jkd
10-10-2002, 09:12 PM
Originally posted by adios
Must be a way to 'flip' entire patterns. Maybe not.

:confused:

I'm pretty sure what I posted achieves that:

'hello'.replace(/ll/g, '') == 'heo'

'hello'.replace(/[^(?:ll)]/g, '') == 'll'

mordred
10-10-2002, 09:44 PM
IIRC that shouldn't work, because characters inside a character class lose the special meaning they usually have.

alert('hello(?:)'.replace(/[^(?:ll)]/g, ''));

It alerts "ll(?:)" - not quite a negative pattern; every character that isn't listed within the brackets is simply stripped.
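
To make the point above concrete, here is a small sketch. The first half shows that the bracketed version keeps any of the listed characters wherever they appear. The second half is one possible alternative, not something proposed in this thread: keepOnlyMatches is a hypothetical helper that uses alternation plus a replacement callback, and it hard-codes the 'll' pattern purely for illustration.

```javascript
// Inside [...] the characters ( ? : ) are literals, so this class means
// "any character other than (, ?, :, l or )".
const cls = /[^(?:ll)]/g;
console.log('hello'.replace(cls, ''));     // "ll"
console.log('hello(?:)'.replace(cls, '')); // "ll(?:)"
console.log('lol'.replace(cls, ''));       // "ll", although "ll" never occurs in "lol"

// Hypothetical alternative: try the real pattern first, otherwise consume
// one character. The callback keeps the captured match and drops the rest.
function keepOnlyMatches(input) {
  return input.replace(/(ll)|[\s\S]/g, (whole, kept) =>
    kept === undefined ? '' : kept
  );
}

console.log(keepOnlyMatches('hello'));     // "ll"
console.log(keepOnlyMatches('hello(?:)')); // "ll"
console.log(keepOnlyMatches('lol'));       // ""
```

The alternation works because the engine tries (ll) before falling back to the single-character branch at each position.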

beetle
10-10-2002, 10:18 PM
IIRC? Someone help me here, my Acronym-O-Matic 2000 is in the shop....

Just talked to a Perl buddy. He says that !~ negates a regex pattern in Perl...well, here's his example anyhow...instead of
$bob =~ m/match/;
you'd say
$bob !~ m/match/;

if it's a replace

id =~ s/[^m][^a][^t][^c][^h]//g

For what it's worth....
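
For comparison, a rough JavaScript sketch of the two Perl ideas above. Perl's !~ negates the result of the match test, not the pattern itself, so the closest JavaScript equivalent is probably just !pattern.test(str). doesNotMatch is a hypothetical helper name, and the example assumes non-global patterns (a global regex carries lastIndex state between test() calls).

```javascript
// Rough equivalent of Perl's `$bob !~ m/match/`: true when the pattern
// is NOT found anywhere in the string.
function doesNotMatch(input, pattern) {
  return !pattern.test(input);
}

console.log(doesNotMatch('no such word', /match/)); // true
console.log(doesNotMatch('a match here', /match/)); // false

// The five negated classes are not "anything except the word match".
// They match five characters that each differ from "match" at the same position.
const perChar = /[^m][^a][^t][^c][^h]/;
console.log(perChar.test('hello')); // true: every position differs
console.log(perChar.test('march')); // false: 'm' is in position 1, though "march" is not "match"
```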

whammy
10-11-2002, 02:07 AM
Hmm... do you have an example of a pattern that you don't want to match? :D

I have learned a bit about regular expressions and wouldn't mind at least trying to figure it out. :D

adios
10-11-2002, 05:20 AM
If I remember correctly :D, Perl always seems to have something in the toolkit.

See what you can make of this:


<html>
<head>
<title>untitled</title>
<script type="text/javascript" language="javascript">

var objRegExp = /<\/?[^>]+>/gi;

onload = function() {
document.forms[0].reset();
var sTest = document.getElementsByTagName('body').item(0).innerHTML.substring(2);
document.f1.readoutBef.value = sTest;
sTest = sTest.match(objRegExp);
document.f1.readoutAft.value = sTest.join('');
}

</script>
</head>
<body><script type="text/javascript" language="javascript">
document.write('<font size="4" color="indigo">',objRegExp,'</font><br /><br />');
</script>
Here is some <em>phony</em> HTML. I'm putting it <font color="red">here</font> in order to prove a point - JavaScript <u>regular expressions</u> are OK.<br /><br />
<h5>Here's the deal:</h5>
This is a string that contains text and markup. Can you give me a regular expression that matches all-<i>non</i>-tag content? Please?<br /><br />
Actually...I'd like to be able to strip out the textual content & replace it with a designated string, naturally not in between tags where there was no text to begin with. Ugh.<br /><br />
<strong>from:</strong><br />
<font size="2" face="times new roman">&#149; The Man &#149;</font>
<form name="f1">Before:<br />
<textarea name="readoutBef" rows="15" wrap="soft" style="width:760px;font-size:11px;"></textarea>
<br />After:<br />
<textarea name="readoutAft" rows="6" wrap="soft" style="width:760px;font-size:11px;"></textarea>
</form>
</body>
</html>

beetle
10-11-2002, 06:19 AM
This comes close...

<html>
<head>
<title>untitled</title>
<script type="text/javascript" language="javascript">

var objRegExp = /<\/?[\s\w!#\$%\^&\*\(\)-=\+\\\/\[\];:'\"\?\.,]+>/gi;

onload = function() {
var t = document.documentElement.outerHTML;
var f = document.forms[0];
f.readoutBef.value = t;
t = t.replace(/>[^<>\r\n]+</gi,"><");
f.readoutAft.value = t;
}

</script>
</head>
<body><script type="text/javascript" language="javascript">
document.write('<font size="4" color="indigo">',objRegExp,'</font><br /><br />');
</script>
Here is some <em>phony</em> HTML. I'm putting it <font color="red">here</font> in order to prove a point - JavaScript <u>regular expressions</u> are OK.<br /><br />
<h5>Here's the deal:</h5>
This is a string that contains text and markup. Can you give me a regular expression that matches all-<i>non</i>-tag content? Please?<br /><br />
Actually...I'd like to be able to strip out the textual content & replace it with a designated string, naturally not in between tags where there was no text to begin with. Ugh.<br /><br />
<strong>from:</strong><br />
<font size="2" face="times new roman">• The Man •</font>
<form name="f1">Before:<br />
<textarea name="readoutBef" rows="15" wrap="soft" style="width:760px;font-size:11px;"></textarea>
<br />After:<br />
<textarea name="readoutAft" rows="6" wrap="soft" style="width:760px;font-size:11px;"></textarea>
</form>
</body>
</html>
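
Another way to get at the non-tag content, sketched here under some assumptions: split the string on the tag pattern with a capturing group, so the tags are kept in the result array, then replace only the pieces that contain text. replaceTextOutsideTags is a hypothetical helper, it reuses the /<\/?[^>]+>/ tag pattern from the earlier post, and it assumes an engine whose split() includes captured groups in its result (as current engines do). Like any regex over markup, it will be fooled by a ">" inside an attribute value.

```javascript
// Hypothetical helper: replace each run of text between tags with
// `replacement`, leaving tags and empty/whitespace-only gaps untouched.
function replaceTextOutsideTags(html, replacement) {
  // With a capturing group, split() returns text and tags alternately:
  // even indexes are text between tags, odd indexes are the tags.
  const parts = html.split(/(<\/?[^>]+>)/);
  return parts
    .map((part, i) => {
      if (i % 2 === 1) return part;                // a tag: keep as-is
      return /\S/.test(part) ? replacement : part; // text: replace only if non-blank
    })
    .join('');
}

console.log(replaceTextOutsideTags('Some <em>phony</em> HTML.<br /><br />', '*'));
// "*<em>*</em>*<br /><br />"
```

Nothing is inserted between the two adjacent br tags, which covers the "not in between tags where there was no text to begin with" requirement.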