URL validation with regular expression

11-05-2002, 02:37 PM
Hi all - am looking for an answer to a particular problem...

I have a form on my site that allows visitors to provide the URL of the site... Am looking for a reg exp for validating that the URL is correctly formed...

Anyone able to help?


Dominic :D

11-05-2002, 04:25 PM
Can you provide the rules for a correctly formed URL?

What URLs are you going to accept?


All are valid URLs. Which do you plan on accepting?

11-05-2002, 04:37 PM
Oops - sorry beetle.... should have been clearer.

Am only looking to validate anything http, i.e.


or with a querystring, i.e.


Am not so worried about ppl putting the http:// in front of their entered address as I can check and add if necessary... have been struggling to design the correct pattern match for the rest...

I am getting better with reg exps but this one has me stumped... what I'm looking to do is...

function isValidURL(url) {
if (!/regularexpressionpatternforurlhere/.test(url)) {
return false;
}
return true;
}

Hope that makes it clearer.

Dominic :confused:

11-05-2002, 05:03 PM
Well, here's a first draft...



match at beginning of string

optional match of the http://

At least one group of accepted characters and a period

match domain at end of string.
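
The four parts described above can be sketched as a single pattern. This is a minimal sketch, not the draft from the post: the accepted characters (letters, digits, hyphen), the two-letter minimum for the last label, and the names draftUrlPattern and matchesDraftPattern are all assumptions.

```javascript
// Sketch only: character classes and the TLD rule are assumptions.
const draftUrlPattern = new RegExp(
  "^" +                // match at beginning of string
  "(http://)?" +       // optional match of the http://
  "([a-z0-9-]+\\.)+" + // at least one group of accepted characters and a period
  "[a-z]{2,}$",        // match domain at end of string
  "i"                  // ignore case
);

// Hypothetical wrapper so the pattern can be called like a validator.
function matchesDraftPattern(url) {
  return draftUrlPattern.test(url);
}
```

Because the pattern is anchored at the domain, an address with a path or querystring after the host would be rejected by this sketch.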

11-05-2002, 05:26 PM
Cool - thanks beetle - works perfectly....

Funnily enough, after I posted my message I found this...


which also validates with a querystring, so have cut it down to...


to validate http:// only

It doesn't check for max number of characters for the TLD - so I'm going to try to decipher it tonight - which should confuse me some - LOL. Am not sure if I'll work it out though :confused:


Dominic :D

11-05-2002, 05:35 PM
How useful is checking for a maximum TLD length? Some months ago, ICANN added .museum to the available TLDs, and the effect was that most older regexps validating URLs needed an update, and that only happened where the problem was reported and easy to fix (the intent behind a regexp is difficult to see from its syntax).

Because we don't know what TLDs might pop up in the near future, I would refrain from trying to set a maximum length at the moment.

Just my 2 cents.
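
To illustrate the point about .museum, here is a small sketch comparing a TLD check with an upper bound against one without. Both patterns and their names (cappedTld, openTld) are hypothetical and not taken from any post in this thread.

```javascript
// Hypothetical patterns for the last label of a host name.
const cappedTld = /\.[a-z]{2,4}$/i; // assumes a TLD never has more than 4 letters
const openTld = /\.[a-z]{2,}$/i;    // only requires at least 2 letters

const shortHost = "example.com";
const longHost = "example.museum";

const results = {
  cappedShort: cappedTld.test(shortHost), // true
  cappedLong: cappedTld.test(longHost),   // false: "museum" has 6 letters
  openShort: openTld.test(shortHost),     // true
  openLong: openTld.test(longHost)        // true
};
```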

11-05-2002, 05:37 PM
Good point mordred... who knows what ICANN are going to do next...

Dominic :D

11-05-2002, 08:39 PM
Huh.... what's going on...

Am incorporating the URL check into my site at home but first need to check whether the submitted url starts with http:// ... so I wrote this just to check whether I was getting it right...

function isValidURL(url) {
if (!/^((http)|(HTTP)):\/\//.test(url)); {
alert ("oops - not OK");
return false;
}
alert ("everything OK");
return true;
}

So why does it return false even when I start the url with http:// or HTTP://?

Very confused :confused:


11-05-2002, 10:46 PM
Because one semicolon is in the wrong place, the if statement ends right there with an empty body, so the block after it is always carried out and the function returns false.

if (!/^((http)|(HTTP)):\/\//.test(url)); {

Nothing wrong with the RegExp as far as I can see, though.
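
A minimal sketch of the difference a semicolon after the condition makes. The function names withStraySemicolon and withoutSemicolon are hypothetical, and the alerts are left out so the snippet runs outside a browser.

```javascript
function withStraySemicolon(url) {
  if (!/^((http)|(HTTP)):\/\//.test(url)); { // the ";" ends the if statement here
    return false;                            // plain block, runs for every input
  }
  return true; // never reached
}

function withoutSemicolon(url) {
  if (!/^((http)|(HTTP)):\/\//.test(url)) {
    return false; // runs only when the prefix is missing
  }
  return true;
}
```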

11-05-2002, 10:59 PM
Hmmm..... thanks mordred....

I took out the semi-colon (the one you showed in red) and it still returns false even if I use a URL starting with http://

function isValidURL(url) {
if (!/^((http)|(HTTP)):\/\//.test(url)) {
alert ("oops - not OK");
return false;
}
alert ("everything OK");
return true;
}

Any ideas????

Dominic :confused:

11-05-2002, 11:02 PM
just do this

if (!/^http:\/\//i.test(url)) {

See that red i? It's called a pattern modifier, and the i modifier means case-insensitive matching, or ignore case.

The other common modifier is g, which signifies global matching: the pattern is applied to every match in the string (for example when replacing) rather than only the first.

If you look back, you can see that I used the i modifier in my first post :D
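
A short sketch of what the two modifiers change. The sample strings and variable names are made up for illustration.

```javascript
// i: ignore case when matching
const withI = /^http:\/\//i.test("HtTp://example.com");   // true
const withoutI = /^http:\/\//.test("HTTP://example.com"); // false

// g: act on every match instead of only the first
const firstOnly = "www.example.com".replace(/\./, "-");   // "www-example.com"
const everyMatch = "www.example.com".replace(/\./g, "-"); // "www-example-com"
```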

11-05-2002, 11:11 PM
LOL - that's what I did in the first place but changed it to be verbose.... still returning the error though

Have attached as a text file - VERY strange....

Dominic :(

11-06-2002, 12:28 AM
Did you check out the regular expression I'm using to validate URLs?


I got it from http://www.regexlib.com/ .

Seems to work pretty good... I haven't run into any problems with it yet, at any rate. :)

11-06-2002, 01:07 AM
Hi dominicall,

<form name="checkurl" onsubmit="return isValidURL(this.url.value)">
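
One way a check like this can fail for every input is when the function receives the form field itself rather than its text. That is only an assumption about the earlier problem, since the attached file is not shown here. In the sketch below, fakeField is a hypothetical stand-in for a form field so that the snippet runs without a browser.

```javascript
function isValidURL(url) {
  return /^http:\/\//i.test(url);
}

// Stand-in for a form field; test() converts a non-string argument to a
// string first, and an object becomes text such as "[object Object]".
const fakeField = { value: "http://www.example.com" };

const passedField = isValidURL(fakeField);       // false
const passedValue = isValidURL(fakeField.value); // true
```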


11-06-2002, 01:33 AM
No offense Owl, but where's the javascript function you're calling (isValidUrl())?

Also, when you're using server-side scripting, it's a lot more reliable than javascript, since you don't have to rely on the client having javascript enabled on their browser.


11-06-2002, 01:43 AM
Thanks Owl... that worked....

Re: the client side vs server side validation - I probably go over the top but actually do both... client side since it's instant and gives ppl the chance to change the form before submission and then server side for those without javascript.

All I need to do now is work out how to add the http:// to the front of the url if it hasn't been added by the user...


Dominic :rolleyes:
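
A minimal sketch of adding the prefix when it is missing. The function name addHttpIfMissing is hypothetical, and treating https:// as already prefixed and trimming surrounding whitespace are assumptions that go beyond what was asked.

```javascript
function addHttpIfMissing(url) {
  const trimmed = url.trim();
  // Leave the address alone if it already starts with http:// or https://
  if (/^https?:\/\//i.test(trimmed)) {
    return trimmed;
  }
  return "http://" + trimmed;
}
```

The result can then be passed to whatever validation pattern is in use, so the pattern no longer needs to treat the prefix as optional.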

11-06-2002, 02:06 AM
Take a look at the regex I'm using, I think it works pretty good, but I can't guarantee it's totally correct, since I didn't write it. It hasn't given me any grief anyway.

As for the javascript thing, my bad... I didn't realize you were using both methods of validation. In my opinion it's great to do that, as long as you are also validating on the server-side since you can catch errors client-side before they are ever posted to the server.

P.S. dominicall, have you gotten the message about your functions that I posted? I can save you a lot of grief since most of them are unnecessary. ;)

11-06-2002, 02:13 AM
Yeah - got your message... sent you a reply...

Send me a msg with your thoughts - look forward to it.

Dominic :D

11-07-2002, 01:19 AM
Well, you can just use ASP to see if the input starts with http:// (or https:// or ftp://):

Function BeginsWithHTTP(byVal str)
Dim bghRegEx
Set bghRegEx = New RegExp
bghRegEx.Pattern = "^(https?|ftp):\/\/.*$"
bghRegEx.IgnoreCase = True
BeginsWithHTTP = bghRegEx.Test(str)
End Function

Response.Write(BeginsWithHTTP("http://www.blah.com") & "<br />" & vbCrLf)
Response.Write(BeginsWithHTTP("https://www.blah.com") & "<br />" & vbCrLf)
Response.Write(BeginsWithHTTP("ftp://www.blah.com") & "<br />" & vbCrLf)
Response.Write(BeginsWithHTTP("www.blah.com") & "<br />" & vbCrLf)

I just tested that and it seems to work. You should also be able to use that regular expression in javascript like this:

alert("Oh my gosh! You didn't type a valid URL!");
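
A fuller sketch of that JavaScript check, using the same pattern as the ASP function with the i flag standing in for IgnoreCase. The helper name beginsWithProtocol and the variable url are hypothetical, and console.log replaces alert so the snippet also runs outside a browser.

```javascript
// Hypothetical helper mirroring the ASP function above.
function beginsWithProtocol(str) {
  return /^(https?|ftp):\/\/.*$/i.test(str);
}

const url = "www.blah.com"; // hypothetical input
if (!beginsWithProtocol(url)) {
  console.log("Oh my gosh! You didn't type a valid URL!");
}
```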


As for your other functions, I have yet to reply to your email since I haven't had time. But I will show you the right way to do it. I did learn something valuable from the functions that you posted though, so it's all good! :)

11-07-2002, 01:46 AM
Originally posted by whammy
"^((http|https|ftp):\/\/).*$"

Wouldn't this work?

11-07-2002, 01:47 AM
Yeah it would, and it's more elegant. I typed that up rather quickly after looking at some other posts... good catch, though :).

I'll fix my above posts to include that in the regexes.

Although if you want to get nitpicky, we could split that apart even more with pipes, parentheses and question marks... but then it would be more code instead of less, so this is still the best way assuming dominicall's string is not part of a larger picture. ;)
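
As a quick check that the longer and shorter spellings seen in this thread accept the same input, here is a small sketch. The sample addresses and variable names are made up for illustration.

```javascript
const longForm = /^((http|https|ftp):\/\/).*$/i; // each protocol spelled out
const shortForm = /^(https?|ftp):\/\/.*$/i;      // "s" made optional with "?"

const samples = [
  "http://www.blah.com",
  "https://www.blah.com",
  "ftp://www.blah.com",
  "HTTP://WWW.BLAH.COM",
  "www.blah.com",
  "httpss://www.blah.com"
];

// True when both patterns give the same answer for every sample.
const agree = samples.every(function (s) {
  return longForm.test(s) === shortForm.test(s);
});
```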