View Full Version : Slow RegEx issue



Ultragames
01-02-2008, 05:38 AM
Happy New Year all,

I am running this function onkeydown from a text input. The function changes the color of the text depending on the results from this regex test:


return (content.match(new RegExp(/^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*\.(\w{2}|(com|net|org|edu|int|mil|gov|arpa|biz|aero|name|coop|info|pro|museum))$/))) ? true : false;

The longer my string gets, the longer this RegEx takes. I have quite a powerful computer, but my browser lags quite a bit when the string gets over 12 or so characters. (This regex tests an email address.)

I am thinking that it might be faster either to have my server do the regex and use AJAX to test the string, or to put a 200-400 millisecond timeout on this function and clear the timeout every time they hit a new key (to limit the number of times this script runs while still giving them near-live results).

So my question is: is there a faster way to do good email address verification, should I do this by AJAX, or should I use a timer to cut down the number of times the function runs?

If you think I should use a timer, can you give me a working example of clearing the same timer from two functions? I always have issues getting a global variable to work when I want to set a timer in one function and clear it in another.

Thanks!

rnd me
01-02-2008, 06:01 AM
1. what about emails from .co.uk?

2. about globals: don't use them, use object props on the function itself:


function tes(a){ tes.data+=a; return tes.data }
tes.data = "";

alert(tes("hello"))
alert(tes(" world"))


also might glimpse at dustin's take for inspiration (http://www.dustindiaz.com/update-your-email-regexp/)

note that using "|" will slow down the regexp eval considerably, as it has to do a test for each case, inclusive of patterns to the left/right of the |.
i would imagine that using a [\w]{2,6} would be faster.
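
beyond the alternation, the repeated groups may be worth a look. in \w+([\.-]?\w+)* the separator is optional, so a run of letters can be split between the leading \w+ and the group in many different ways, and a backtracking engine may try all of them before a match fails. a minimal sketch, assuming the goal is to accept the same strings: make the separator mandatory inside the group so each input has only one way to match. the names original, unambiguous, samples and longInput are made up for illustration.

```javascript
// Pattern from the first post, written as a plain literal.
var original = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*\.(\w{2}|(com|net|org|edu|int|mil|gov|arpa|biz|aero|name|coop|info|pro|museum))$/;

// Sketch (not from the thread): the separator is required inside each
// repeated group, so a run of word characters can only be consumed one way.
var unambiguous = /^\w+(?:[.-]\w+)*@\w+(?:[.-]\w+)*\.(?:\w{2}|com|net|org|edu|int|mil|gov|arpa|biz|aero|name|coop|info|pro|museum)$/;

// Hypothetical sample inputs, kept short so the original pattern stays quick.
var samples = [
  "user@example.com",
  "first.last@mail.example.org",
  "user@example",
  "a..b@example.com",
  "user@@example.com",
  "x@y.uk"
];

// Print both results side by side for comparison.
samples.forEach(function (s) {
  console.log(s, original.test(s), unambiguous.test(s));
});

// A long input with no "@": only the unambiguous pattern is run here,
// because the original may take a very long time on input like this.
var longInput = new Array(41).join("a"); // 40 "a" characters
console.log(unambiguous.test(longInput));
```

this does not make the pattern any more correct as an email check, it only removes the ambiguity that makes failing matches expensive.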

a trick i have used in situations like this is to break the regexp into two parts.
first match all the substrings that might remotely be construed as an email, like "\w+@\w+\.\w{2,6}", and then run the precise pattern only on those matches. that gives you the best of both worlds: a fast gateway and a precise validation filter.
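
a rough sketch of that two-step idea, assuming the pattern from the first post is the precise filter. one caveat: the gateway must never be stricter than the precise pattern, and "\w+@\w+\.\w{2,6}" would turn away something like user@my-domain.com, so the sketch uses a looser gateway of my own choosing. the names gateway, precise and isValidEmail are made up for illustration.

```javascript
// Step 1: loose, cheap check. It only asks for a word character on each
// side of an "@" and a dot followed by 2-6 word characters at the end.
// This is a loosened variant of the gateway quoted above (an assumption).
var gateway = /\w@\w.*\.\w{2,6}$/;

// Step 2: the stricter pattern from the first post, compiled once.
var precise = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*\.(\w{2}|(com|net|org|edu|int|mil|gov|arpa|biz|aero|name|coop|info|pro|museum))$/;

function isValidEmail(text) {
  // Most partially typed input fails the gateway, so the precise
  // pattern only runs on strings that already look like an address.
  return gateway.test(text) && precise.test(text);
}

console.log(isValidEmail("user@example.com"));   // true
console.log(isValidEmail("user@my-domain.com")); // true
console.log(isValidEmail("user@example"));       // false
console.log(isValidEmail("user"));               // false
```

this cuts down how often the precise pattern runs while someone is still typing. it does not change that pattern's worst case on input that passes the gateway but then fails.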

Ultragames
01-02-2008, 08:38 AM
Thank you,

First of all, I don't see how the code you posted has anything to do with my question. I'm assuming I'm being stupid tonight, and I was hoping you could explain.

Also, could you place the regex sections you were talking about within the regex I posted, so that I can see where you mean?

Thanks for the help!

Philip M
01-02-2008, 09:24 AM
The numerous OR | operators are the cause. Suggest you try:-

/^([a-z0-9])(([\-.]|[_]+)?([a-z0-9]+))*(@)([a-z0-9])((([a-z0-9\-\.]+))?)*((.[a-z]{2,3})?(.[a-z]{2,6}))$/i

which covers co.uk and so on.

There is no point in trying to validate to excess, as the biggest risk is that the address as entered is simply wrong due to a typo, e.g. philip@mydomian.co.uk



chump2877
01-02-2008, 09:31 AM
So my question is, Is there a faster way to do good email address verification, or should I do this by AJAX, or should I use a timer to try and remove iterations of the function.
The easiest thing to do would be to validate the email address onsubmit instead of onkeydown. Eliminates the issue entirely.

Philip M
01-02-2008, 09:36 AM
The easiest thing to do would be to validate the email address onsubmit instead of onkeydown. Eliminates the issue entirely.

Or better still onchange or onblur to get more immediate validation.

But the guy wants:- "The function changes the color of the text depending on the results from this regex test:"

I suppose the idea is that if the user types a character not consistent with an email address such as @@ then it turns red or something. Too clever by half IMHO.

Ultragames
01-02-2008, 09:58 AM
The easiest thing to do would be to validate the email address onsubmit

That's nearly pointless. At that point you should just do server-side validation (which any good programmer is going to do anyway). I am looking for (and already have, but am trying to speed up) a nice-looking live validation that lets the user know when they have entered the correct information.

On another note, thank you Philip, I will give that regex a try. There are A LOT of regexes out there for email validation, and it's difficult to find one that covers the majority of email addresses without being too slow. (A regex that conforms to every rule of email address creation and domain names would be many, many lines long.)

Philip M
01-02-2008, 12:15 PM
The simplest validation is to require the email address to be entered twice (i.e. confirm), which gets rid of most typos.

Another good regex is:-

/^.+@[^\.].*\.[A-Za-z]{2,6}$/

Simple, but allows for foreign language characters.


chump2877
01-02-2008, 12:54 PM
That's nearly pointless. At that point you should just do server side validation (which any good programmer is going to do anyway.)


It's hardly pointless...The point of client-side validation is to validate fields before you make another trip back to the server...However you choose to do your client side validation, that is the bottom line...

I think Philip said it right: "Too clever by half IMHO."

What's happening here is you're getting caught up in making the validation "pretty", but the more you "pretty-ify" it, the less functional your validation becomes.

So, in my opinion, the validation on onkeydown that you are implementing here is a little over the top and doesn't really enhance the effectiveness of the validation. Sometimes simple is better -- for the developer and the end user.

Ultragames
01-02-2008, 11:10 PM
Regardless of all of this, I still need help on my regex, server side or client side.

Philip:
The regex that you gave me seems to have a few problems.


/^([a-z0-9])(([\-.]|[_]+)?([a-z0-9]+))*(@)([a-z0-9])((([a-z0-9\-\.]+))?)*((.[a-z]{2,3})?(.[a-z]{2,6}))$/i

This regex returns true if the email address has no dot. For instance myEmail@address passes true. Can that be corrected?

Trinithis
01-03-2008, 02:11 AM
Looking at your original regex, this should make it faster:



var myFunc = (function() {
    // Compiled once and kept in the closure, so it is not rebuilt on every call.
    var pattern = /^\w+(?:[.-]?\w+)*@\w+(?:[.-]?\w+)*\.(?:com|net|org|edu|int|mil|gov|arpa|biz|aero|name|coop|info|pro|museum|[a-z]{2})$/;
    return function(args) {
        // other function code here
        return pattern.test(args); // e.g. myFunc("abc@hotmail.com")
    };
})();


All I did was rid the regex of capturing parentheses, changed it to regex.test(str) instead of str.match(regex), and stored a copy of the regex as a "quasi-local" variable. No need of recompiling that regex every time :D.
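
The same closure trick may also cover the timer question from the first post: a timer ID kept in a "quasi-local" variable can be set by one function and cleared by another without any global. A minimal sketch, assuming the standard setTimeout and clearTimeout timers; the names makeDelayedValidator, schedule, cancel, onResult and delayMs are made up for illustration.

```javascript
// Hypothetical helper: runs pattern.test() only after typing has paused.
function makeDelayedValidator(pattern, onResult, delayMs) {
  var timerId = null; // shared by both inner functions through the closure

  // Clears the pending check, if there is one.
  function cancel() {
    if (timerId !== null) {
      clearTimeout(timerId);
      timerId = null;
    }
  }

  // Call this on every keystroke; each call discards the previous timer.
  function schedule(text) {
    cancel();
    timerId = setTimeout(function () {
      timerId = null;
      onResult(pattern.test(text));
    }, delayMs);
  }

  return { schedule: schedule, cancel: cancel };
}
```

It could be wired up as something like var v = makeDelayedValidator(pattern, colorTheField, 300); and then v.schedule(input.value) from the key handler, where colorTheField and input stand in for whatever the page already has.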

Philip M
01-03-2008, 10:02 AM
Regardless of all of this, I still need help on my regex, server side or client side.

Philip:
The regex that you gave me seems to have a few problems.


/^([a-z0-9])(([\-.]|[_]+)?([a-z0-9]+))*(@)([a-z0-9])((([a-z0-9\-\.]+))?)*((.[a-z]{2,3})?(.[a-z]{2,6}))$/i

This regex returns true if the email address has no dot. For instance myEmail@address passes true. Can that be corrected?

It returns false for me. Have you copied it correctly?

(.[a-z]{2,6}) requires a dot followed by 2-6 alpha characters at the end of the string
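
A quick check may help settle this. As printed in this thread, the two dots near the end of the pattern are unescaped, and an unescaped . matches any character, so myEmail@address can pass (for example with "ress" matched by the final group). One possible explanation, and it is only a guess, is that backslashes were lost somewhere in posting. The sketch below compares the printed pattern with a variant that escapes those two dots; the names asPrinted and escaped are made up for illustration.

```javascript
// The pattern exactly as it appears in this thread.
var asPrinted = /^([a-z0-9])(([\-.]|[_]+)?([a-z0-9]+))*(@)([a-z0-9])((([a-z0-9\-\.]+))?)*((.[a-z]{2,3})?(.[a-z]{2,6}))$/i;

// Assumed intent: the last two groups should start with a literal dot.
var escaped = /^([a-z0-9])(([\-.]|[_]+)?([a-z0-9]+))*(@)([a-z0-9])((([a-z0-9\-\.]+))?)*((\.[a-z]{2,3})?(\.[a-z]{2,6}))$/i;

console.log(asPrinted.test("myEmail@address"));     // true
console.log(escaped.test("myEmail@address"));       // false
console.log(escaped.test("philip@mydomain.co.uk")); // true
```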


