02-05-2013, 10:02 PM | #16
Fou-Lu
God Emperor


 
Join Date: Sep 2002
Location: Saskatoon, Saskatchewan
Posts: 15,635
Thanks: 4
Thanked 2,448 Times in 2,417 Posts
UTF-8 is an encoding of Unicode. It uses a variable number of bytes per character: the ASCII range fits in a single byte (7 significant bits), and anything beyond ASCII takes two to four bytes, so 32 bits max. A multibyte encoding like this is required to represent the Asian characters.
PHP itself will pass UTF-8 through, though, since it is just bytes when it comes down to it. The DBMS stores the text in a declared charset, so it knows how to represent the characters; PHP can accept the "text" from the DB and send a UTF-8 charset header to the browser so that it also interprets it correctly. The problem is the middle man: if you try to manipulate the string in PHP, you have to tell it which encoding to use so it knows how many bytes make up a character. When I ask for $str[0], by default it takes the first byte off the string, which is 8 bits. If it's a 16-bit character, then I only end up with "half" of the character I need, which when presented as text will likely appear as nothing but rubbish.
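To make the "half a character" point concrete, here is a minimal sketch. It assumes the mbstring extension is loaded, and the sample string "été" is only an illustration, not something from the earlier posts. It compares byte indexing with the encoding-aware mb_substr():

```php
<?php
// "été" written as raw UTF-8 bytes; each "é" takes two bytes (0xC3 0xA9).
$str = "\xC3\xA9t\xC3\xA9";

// Plain indexing works on bytes, so this is only the first half of "é".
$firstByte = $str[0];

// mb_substr() is told the encoding, so it returns the whole character.
$firstChar = mb_substr($str, 0, 1, 'UTF-8');

echo strlen($str), "\n";             // 5 (bytes)
echo mb_strlen($str, 'UTF-8'), "\n"; // 3 (characters)
echo bin2hex($firstByte), "\n";      // c3
echo bin2hex($firstChar), "\n";      // c3a9
```

Echoing $firstByte on its own to a UTF-8 page would most likely show a replacement glyph, which is the "rubbish" described above.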
__________________
As of PHP 5.5, the MySQL library has been officially deprecated. It is recommended to move to either MySQLi or PDO libraries for your mysql connectivity. See here for help choosing which interface you prefer: http://php.net/manual/en/mysqlinfo.api.choosing.php
02-05-2013, 10:22 PM | #17
nee
New Coder

 
Join Date: Feb 2013
Posts: 17
Thanks: 1
Thanked 0 Times in 0 Posts
I was doing some reading, and I feel like there is something wrong with my last post.
On line 10, where I used mb_strtolower, don't I have to mark the variable $string as UTF-8? Or is that only needed when converting from lower case to capitals, because with UTF-8 mb_strtolower will only convert upper-case characters that are marked with the Unicode property?

Can I add trim() to line 10 and have everything be okay? Or is there a better way? I want to eliminate the error caused by starting a string with a space. I would think it would strip away the first letter as well if I did that. I know I can try it, I just want to be correct. Just because something might work doesn't mean it is the appropriate way to do it.

Last edited by nee; 02-05-2013 at 10:24 PM..
02-05-2013, 10:39 PM | #18
Fou-Lu
Don't think the trim will be a problem. It may be wiser to stick with PCRE, since you can represent whitespace with the \s character class. Actually, pretty much everything can be done with the PCRE match/replace functions, which may be less tedious than using the mbstring functions in general.
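A rough sketch of both options, for comparison. The sample value is made up for illustration. By default trim() only strips a fixed set of ASCII whitespace bytes, and those bytes never occur inside a multibyte UTF-8 sequence, so it should not eat into the first letter. The PCRE version uses the /u modifier so the pattern and subject are treated as UTF-8:

```php
<?php
// "  Été " as raw UTF-8 bytes, with leading and trailing spaces.
$string = "  \xC3\x89t\xC3\xA9 ";

// trim() removes ASCII whitespace bytes from both ends only.
$trimmed = trim($string);

// PCRE equivalent: strip leading or trailing whitespace; /u enables UTF-8 mode.
$viaPcre = preg_replace('/^\s+|\s+$/u', '', $string);

var_dump($trimmed === "\xC3\x89t\xC3\xA9"); // bool(true)
var_dump($viaPcre === $trimmed);            // bool(true)
```

Either approach leaves the multibyte "É" at the start intact; the PCRE route mainly pays off once you need more than plain trimming.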

You've already marked the mb_strtolower call as using UTF-8. That is what the $e represents in this function. No, it's not required, *but* if you don't pass it, it will default to the internal encoding (see mb_internal_encoding()), which may or may not be UTF-8.
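As a small sketch of what passing the encoding looks like (assuming mbstring is available; $e mirrors the variable name used in the thread, and the sample word is hypothetical):

```php
<?php
$e = 'UTF-8';             // encoding name handed to the mb_ functions
$string = "\xC3\x89COLE"; // "ÉCOLE" as raw UTF-8 bytes

// With the encoding given explicitly, the result does not depend on
// whatever the internal encoding happens to be on this machine.
$lower = mb_strtolower($string, $e);

echo bin2hex($lower), "\n";        // c3a9636f6c65, i.e. "école"
echo mb_internal_encoding(), "\n"; // the default used when $e is omitted
```

Printing mb_internal_encoding() is a quick way to see what you would get if you left the second argument off.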
02-05-2013, 10:52 PM | #19
nee
Yes, but shouldn't I do it like this?

PHP Code:
$orig_string = mb_strtolower($string, $e);
If I don't use the mb_ functions, won't that do away with the Unicode?

Last edited by nee; 02-05-2013 at 11:32 PM..
02-06-2013, 03:27 PM | #20
Fou-Lu
Oh, I see what you mean. Yes, it should be; if you don't specify the encoding to use, then it will default to the internal encoding, which may or may not be UTF-8.