
View Full Version : Eregi expression major bug, no? The "i" thing...



Shadowfox
02-17-2006, 09:47 PM
Short and clear:

The i modifier (case-insensitive) does NOT work for Cyrillic encodings (such as windows-1251).

Example:
Imagine XXXX is written in Cyrillic characters, not English.

We have a search form. A script searches a file for the criteria specified in this form. The script contains the i modifier for case-insensitive matching.

WHAT IS THE PROBLEM, you will ask me.

Well, here is the situation:

Search criteria: Dog
Finds in file: Dog DoG and DOg if they exist.

Search criteria: Xxxx
Finds in file: ONLY Xxxx if it exists. Exact phrase only: matching is case-sensitive for Cyrillic letters in ONE AND THE SAME SCRIPT.

Could anybody please give a reasonable explanation of that bizarre issue?

Kid Charming
02-17-2006, 11:16 PM
Cyrillic is a multibyte character set, and standard PHP string functions are designed for single-byte charsets. You need to work with PHP's multibyte string functions (http://us3.php.net/manual/en/ref.mbstring.php).
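
A minimal sketch of what a case-insensitive search with the mbstring functions could look like. It assumes the mbstring extension is loaded and that the text is handled as UTF-8; windows-1251 itself is a single-byte encoding, so data stored that way would be converted first. The sample strings and the `$rawBytes` name are made up for illustration.

```php
<?php
// Sketch only: assumes mbstring is loaded and this file is saved as UTF-8.

// Hypothetical sample text; in the thread this would come from the searched file.
$haystack = 'В файле есть слово СОБАКА.';
$needle   = 'Собака';

// If the file is stored as windows-1251, convert it first, e.g.:
// $haystack = mb_convert_encoding($rawBytes, 'UTF-8', 'Windows-1251');

// mb_stripos() compares characters case-insensitively in the given encoding.
$found = mb_stripos($haystack, $needle, 0, 'UTF-8') !== false;

var_dump($found);
```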

marek_mar
02-17-2006, 11:49 PM
AFAIK support for unicode is a key feature of PHP6...

Shadowfox
02-18-2006, 12:04 AM
Just tell me there is a multibyte replacement for
preg_match
preg_split

because using
mb_eregi
mb_split
instead did not work...
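
One possible route, sketched under assumptions: if the text is UTF-8 (or converted to it), preg_match and preg_split can stay as they are with the `u` modifier added, which makes PCRE treat the pattern and subject as UTF-8 so that `i` also applies to Cyrillic letters. This relies on PCRE being built with Unicode support; the sample line below is hypothetical.

```php
<?php
// Sketch only: assumes UTF-8 input and a PCRE build with Unicode support.

// Hypothetical sample data standing in for a line of the searched file.
$line = 'Собака, СОБАКА и собака';

// "i" = case-insensitive, "u" = treat pattern and subject as UTF-8.
$count = preg_match_all('/собака/iu', $line, $matches);

// Split on commas or whitespace; "u" keeps multibyte characters intact.
$words = preg_split('/[\s,]+/u', $line, -1, PREG_SPLIT_NO_EMPTY);

var_dump($count, $words);
```

Without the `u` modifier the same pattern works on raw bytes, and case folding for non-ASCII letters then depends on the locale.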

Kid Charming
02-18-2006, 01:13 AM
Are you sure the mbstring extension is enabled in your installation?
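
A quick way to check this from a script, as a rough sketch (phpinfo() on the host would show the same information):

```php
<?php
// Reports whether the mbstring extension is loaded in this PHP installation.
$hasMbstring = extension_loaded('mbstring');

// mb_eregi()/mb_split() additionally need mbstring's regex support compiled in.
$hasMbRegex = function_exists('mb_eregi');

var_dump($hasMbstring, $hasMbRegex);
```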

Shadowfox
02-18-2006, 01:38 PM
Well, the site is hosted on a hosting company's server, so I should ask the support team. Still, I am pretty sure mbstring is enabled.


