UTF-8 chars and preg_replace

01-12-2010, 02:16 AM
Just wondered if some of you chaps more familiar with regex and Unicode could enlighten me. Would the following two expressions:


match the less-than symbol whether it's encoded in UTF-8, Latin-1, etc.?


01-12-2010, 04:47 PM
The former will match the literal octet 0x3C (which is "<" in ASCII). That works for all ASCII supersets such as ISO-8859-*, UTF-8, Windows-1252 and others, but not for encodings like UTF-16, where "<" takes two bytes and the octet 0x3C can also turn up inside other characters.
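
A minimal sketch of the byte-matching point, in case it helps. The pattern `/</`, the `&lt;` replacement and the sample strings are assumptions made up for illustration; the UTF-16LE bytes are written out by hand so that no extension is needed.

```php
<?php
// Hypothetical byte-oriented pattern (no u flag): it matches the octet 0x3C.
$pattern = '/</';

// "a<b" in an ASCII superset (the bytes are identical in ISO-8859-1 and UTF-8).
$ascii = "a<b";

// "a<b" in UTF-16LE, written out by hand: every character takes two bytes.
$utf16le = "a\x00<\x00b\x00";

// ASCII superset: the octet 0x3C is always "<", so the replacement is correct.
$fixedAscii = preg_replace($pattern, '&lt;', $ascii);

// UTF-16LE: the octet still matches, but single-byte text gets spliced in
// between two-byte code units, so the result is no longer valid UTF-16.
$brokenUtf16 = preg_replace($pattern, '&lt;', $utf16le);

echo $fixedAscii, "\n";            // a&lt;b
echo bin2hex($brokenUtf16), "\n";  // 6100266c743b006200 (9 bytes, odd length)
```

So for UTF-16 input the safer route is probably to convert to UTF-8 first and convert back afterwards, rather than running a byte-oriented pattern on it directly.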

The latter requires the input string to be valid UTF-8, and will always fail (and throw a warning, IIRC) if it is not. If you're only dealing with ASCII characters and you don't want the UTF-8 validation, you're better off without the u flag.
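
A rough sketch of the failure case, again with assumed names: the pattern `/</u` and the sample strings are hypothetical. It only checks the return value and `preg_last_error()`, and makes no claim about whether a warning is emitted.

```php
<?php
// Hypothetical pattern with the u flag: the subject must be valid UTF-8.
$pattern = '/</u';

$valid   = "caf\xC3\xA9 < th\xC3\xA9"; // "café < thé" encoded as UTF-8
$invalid = "caf\xE9 < th\xE9";         // same text in ISO-8859-1: not valid UTF-8

// Valid UTF-8: the replacement goes through.
$ok      = preg_replace($pattern, '&lt;', $valid);
$okError = preg_last_error();

// Invalid UTF-8: preg_replace() returns NULL and records an error code.
$failed      = preg_replace($pattern, '&lt;', $invalid);
$failedError = preg_last_error();

// Without the u flag the same bytes are processed without validation.
$bytewise = preg_replace('/</', '&lt;', $invalid);

var_dump($okError === PREG_NO_ERROR);           // bool(true)
var_dump($failed);                              // NULL
var_dump($failedError === PREG_BAD_UTF8_ERROR); // bool(true)
```

Checking the result for NULL (or checking `preg_last_error()`) after a call with the u flag is a cheap way to notice when the input wasn't UTF-8 after all.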

01-13-2010, 02:50 AM
Cheers. :)