Resolved: How to allow European special characters in a string



Jesper Mller
02-16-2010, 04:45 PM
I need to check some name fields in a form to make sure they only contain alphabetic characters. That should be simple, but the European languages contain more letters than a-z.

It must allow special European characters (accented and other non-ASCII letters), but not signs like $"#/=+? and so on.


I could try making a list of all the characters allowed. Luckily this form is mostly for Western Europe / Scandinavia, but if I had to make a form for the whole of Europe (not to say worldwide) the list would be VERY long.
The European languages are divided into 13 different ISO sets, and some of the sets have letters where others have signs.

What would be the best way to do this? :confused:

Fou-Lu
02-16-2010, 05:19 PM
Pattern matching is probably the easiest way to do this.
I'd expect that locale matching using just \w would provide the accented characters. If not, you can try to use the Unicode patterns with \pL+. I believe you need to use the u modifier when you're dealing with multibyte charsets though.
Aside from that, you can use the mbstring library to see if it has anything useful.
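
A minimal sketch of the pattern-matching idea above, assuming the input is UTF-8 encoded. The function name isLettersOnly is made up for illustration; \pL is the PCRE property for "any Unicode letter", and the u modifier makes PCRE treat the pattern and the subject as UTF-8.

```php
<?php
// Hypothetical helper: true when $name consists only of Unicode letters.
function isLettersOnly($name)
{
    // \A and \z anchor the match to the whole string.
    // preg_match() returns 1 on a match, 0 on no match and false on
    // error (for example when $name is not valid UTF-8).
    return preg_match('/\A\pL+\z/u', $name) === 1;
}

var_dump(isLettersOnly('Jensen'));   // bool(true)
var_dump(isLettersOnly('Jensen$'));  // bool(false)
```

The strict comparison with 1 means that invalid input is rejected rather than silently accepted.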

Jesper Mller
02-16-2010, 06:28 PM
Thanks for your input.
But I must admit I simply don't understand any of that code.
(Yes, I'm new to PHP.)

Do you mean something like preg_match("/\pL+/u", $name)?

Fou-Lu
02-16-2010, 06:39 PM
o.O
That's exactly what I meant, yes. Did it work?

Jesper Mller
02-16-2010, 06:42 PM
Not sure if it works yet ... trying it out.
Must say those expressions are giving me a hard time...

Jesper Mller
02-16-2010, 07:22 PM
Well ... it's ALMOST working

Try this


$test1 = "abcdefghijklmnopqrstuvwxyz 0123456789";
echo $test1."<br><br>";
echo preg_replace('/\pL[0-9]+/u','',$test1);
echo "<br><br>";
$test2 = "abcdefghijklmnopqrstuvwxyz0123456789";
echo $test2."<br><br>";
echo preg_replace('/\pL[0-9]+/u','',$test2);

test2 will removed the
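
A possible explanation for the result above, offered as a sketch: outside square brackets, \pL[0-9]+ means "one letter followed by one or more digits", so it only matches where a letter sits directly before a run of digits. Putting \pL inside the character class matches letters or digits instead. The sample strings below are ASCII stand-ins, not the original test data.

```php
<?php
$pattern = '/\pL[0-9]+/u';

// The space separates the last letter from the digits, so nothing matches.
$withSpace = preg_replace($pattern, '', 'xyz 0123');
var_dump($withSpace);     // string(8) "xyz 0123"

// Here "z0123" is one letter plus digits, so exactly that part is removed.
$withoutSpace = preg_replace($pattern, '', 'xyz0123');
var_dump($withoutSpace);  // string(2) "xy"

// Letters OR digits, in any order: move \pL inside the brackets.
$classBased = preg_replace('/[\pL0-9]+/u', '', 'xyz 0123');
var_dump($classBased);    // string(1) " "
```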

Jesper Mller
02-16-2010, 07:35 PM
There is also a problem if there is a / in the string



$test = "abcdefghijklmnopqrstuvwxyz0123456789.-&#012";
echo preg_replace('/\pL+/u','',$test);
echo "<br><br>";
$test2 = "abcdefghijklmnopqrstuvwxyz0123456789.-&/#012";
echo preg_replace('/\pL+/u','',$test2);
echo "<br><br>";

that gives:
0123456789.-
and
0123456789.-&/#012
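
One likely reason for the difference, stated as an assumption since the thread never confirms it: the regex only removes the letters in both cases, but a browser reads &#012 as an HTML numeric character reference (a form feed), so it is not displayed. &/#012 is not a valid reference and shows up as typed. Escaping the output with htmlspecialchars() makes the real result visible.

```php
<?php
$test   = 'abcdefghijklmnopqrstuvwxyz0123456789.-&#012';
$result = preg_replace('/\pL+/u', '', $test);

// The raw result still contains "&#012"; only the letters are gone.
var_dump($result);          // string(17) "0123456789.-&#012"

// Escaped for HTML output, so the browser shows the characters literally.
$escaped = htmlspecialchars($result);
echo $escaped, "\n";        // 0123456789.-&amp;#012
```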

Jesper Mller
02-16-2010, 07:35 PM
Think I'll have to play with this a lot to get it to work as I want :-P

Fou-Lu
02-16-2010, 08:23 PM
Hah, yep. I can't test this ATM since I'm at work, and I'm busy getting things ready to go to move to a new place, but I'll try to test this out when I get home.

It's looking pretty good so far though. I would expect that it shouldn't remove the &#012, but it clearly does (I wouldn't consider 012 to be a character that you'd want to use).
Try this for your pattern: /(\pLu|\pLT|\pLl)+/u. That will skip 'modifier' and 'other' letters, but I'm afraid I have no idea what either are defined as :o

Jesper Mller
02-16-2010, 08:33 PM
Thanks Fou-Lu

I'm simply trying/testing different combinations and seeing what happens, and that way trying to learn the ways of expressions.
I'll try your suggestion...

My goal is to allow only letters, spaces and the . - & signs in the names :-)

Jesper Mller
02-16-2010, 08:43 PM
Funny ... /(\pLu|\pLT|\pLl)+/u only removes kl from the string :-)
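
A likely explanation, sketched under the assumption that PCRE reads \p followed by a single letter as a complete property: \pLl is then \pL (any letter) followed by a literal l, which in the plain alphabet only matches "kl". Two-letter properties need braces, as in \p{Ll}.

```php
<?php
$alphabet = 'abcdefghijklmnopqrstuvwxyz';

// "\pLl" = any letter + literal "l": only "kl" fits.
$noBraces = preg_replace('/\pLl/u', '', $alphabet);
var_dump($noBraces);   // string(24) "abcdefghijmnopqrstuvwxyz"

// With braces, \p{Ll} is the "lowercase letter" property.
$lowerGone = preg_replace('/\p{Ll}+/u', '', 'Hans-Christian');
var_dump($lowerGone);  // string(3) "H-C"

// Upper, title and lower case letters together, as one character class.
$casedGone = preg_replace('/[\p{Lu}\p{Lt}\p{Ll}]+/u', '', 'Mr. Jensen');
var_dump($casedGone);  // string(2) ". "
```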

Jesper Mller
02-16-2010, 09:28 PM
Think I got it now

if (strlen(preg_replace('/[\.\-\s\&]|\pL+/u','',$test)) >0) {
echo "ERROR";
}

This should allow for names like:
Mller
Mr. Jensen
Hans-Christian
Jnson
Glud&Marstand

Probably could have been done smarter, and I'll still have to do some more testing
(And have to figure out what \p and L actually do)

Thanks for the help :thumbsup:
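
The same rule can also be written as a single whole-string match instead of replace-and-count. This is only a sketch: isValidName is a hypothetical name, the source file is assumed to be saved as UTF-8, and the sample names are chosen for the demonstration.

```php
<?php
// Hypothetical helper: letters, whitespace, ".", "&" and "-" only.
function isValidName($name)
{
    // Inside [...] the dot and ampersand are literal, and "-" is literal
    // because it comes last. \A...\z forces the whole string to match.
    return preg_match('/\A[\pL\s.&-]+\z/u', $name) === 1;
}

$names = array('Mr. Jensen', 'Hans-Christian', 'Glud&Marstand', 'Søren', 'Pay$');
foreach ($names as $name) {
    echo $name, ': ', isValidName($name) ? 'ok' : 'ERROR', "\n";
}
```

Unlike the replace-and-count version, this sketch also rejects an empty string, because the + requires at least one allowed character.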

Fou-Lu
02-16-2010, 10:39 PM
\p indicates that a Unicode character property is being matched, and L is the Letter property (breakdowns are things like Letter Upper, Letter Lower, Letter Title, Letter Other and Letter Modifier).

Glad you got it sorted out.

Jesper Mller
02-16-2010, 10:55 PM
Thanks

Played with them a little .. got that the P vs p thing gave me letters vs not letters
But thought that the u was for Unicode (it surely looks funny if I remove it)

pLT and pLl I could not get to work
pLt removed st from the string
PLl removed kl from the string
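
The \p versus \P observation can be reproduced with a short sketch (the sample string is chosen for illustration): lowercase \pL matches letters, uppercase \PL matches everything that is not a letter.

```php
<?php
$sample = 'Mr. Jensen 42';

// \pL+ removes the letters and keeps the rest.
$lettersRemoved = preg_replace('/\pL+/u', '', $sample);
var_dump($lettersRemoved);  // string(5) ".  42"

// \PL+ removes everything except the letters.
$lettersKept = preg_replace('/\PL+/u', '', $sample);
var_dump($lettersKept);     // string(8) "MrJensen"
```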