Chinese characters in PHP

04-13-2012, 08:50 AM
Client page:

<meta http-equiv="Catchup Scheduler" content="text/html; charset=UTF-8">

<form action="next.php" method="POST" >
<input type="text" name="programme_name"></br>
<input type="submit">
Server page:

$name = $_POST['programme_name'];

mysql_query("SET character_set_client=utf8",$con);
mysql_query("SET character_set_connection=utf8", $con);
mysql_query("SET character_set_results=utf8", $con);

$query = "Select * from programme where Programme_name Like '%".$name."'";
The problem is that even though the Chinese text exists in the database, the query returns zero results. If I print the POST value $name, it displays the correct text with the Chinese characters.

But if I assign the same text directly to a variable, such as:

$name= "valuevalue ( 官话 )";
$query = "Select * from programme where Programme_name Like '%".$name."'";
it produces results. How can this be? The value from POST finds nothing, but the statically declared value does. My OS is XP SP3.

Is there a problem with the "internal" representation of Chinese characters when they come from POST? Do I need to set something in my OS to support Chinese text?

04-13-2012, 12:58 PM
The problem is your encoding, both in the page and in the MySQL collation. You need to use UTF-8 for both. This opens up a minefield of problems, as I recently found out when I had to switch to UTF-8 for a project.

Using Notepad++, you need to encode your HTML as UTF-8 without BOM yet still encode the PHP files as ANSI, whilst everything going in and out of the database needs to be UTF-8. Confused? You're not the only one. To make matters worse, I found that pages on my local system would output different characters from the same pages on the live webserver, which made it pretty much impossible for me to develop the code locally before uploading.
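To see why a mismatch between encodings breaks the match, here is a minimal sketch, assuming the mbstring extension is loaded. It writes 官话 as explicit UTF-8 bytes so the result does not depend on how the PHP file itself is saved, then re-encodes it as EUC-CN. EUC-CN is only a stand-in for whatever non-UTF-8 encoding a browser might submit; it is an assumption for illustration, not a diagnosis of the setup above.

```php
<?php
// "官话" spelled out as UTF-8 bytes, so the file's own encoding does not matter.
$utf8 = "\xE5\xAE\x98\xE8\xAF\x9D";

// The same two characters re-encoded as EUC-CN (assumed example encoding).
$gb = mb_convert_encoding($utf8, 'EUC-CN', 'UTF-8');

// Same text to a human, different bytes to a LIKE comparison.
echo 'UTF-8:  ', bin2hex($utf8), ' (', strlen($utf8), " bytes)\n";
echo 'EUC-CN: ', bin2hex($gb), ' (', strlen($gb), " bytes)\n";

// mb_check_encoding() reports whether a byte string is well-formed UTF-8.
var_dump(mb_check_encoding($utf8, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($gb, 'UTF-8'));
```

Running the same bin2hex() call on $_POST['programme_name'] and on the hard-coded string would show whether the two really contain the same bytes.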

Fou-Lu is far better with this stuff than me and recently got me out of a very big hole regarding this stuff so you may be best waiting for further advice from them.

04-13-2012, 06:53 PM
Yeah, PHP doesn't natively support Unicode; its strings are plain byte sequences, and native support was only planned for version 6.
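A small illustration of what that means in practice, assuming the mbstring extension is available: the core string functions count and index bytes, while the mb_* functions work on characters once they are told the encoding. The sample string is written as escaped UTF-8 bytes so it behaves the same however the file is saved.

```php
<?php
$s = "\xE5\xAE\x98\xE8\xAF\x9D"; // "官话" as UTF-8 bytes

echo strlen($s), "\n";             // 6: bytes
echo mb_strlen($s, 'UTF-8'), "\n"; // 2: characters

// Indexing returns one byte, not one character.
echo bin2hex($s[0]), "\n";                        // e5
echo bin2hex(mb_substr($s, 0, 1, 'UTF-8')), "\n"; // e5ae98
```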

I'd visually compare each of these queries. Write one query with the characters explicitly in it (the one that works), and call it $query1. Then write another one that takes the $_POST and call it $query2. Evaluate each of the characters within them. I'll make a table (with poor standards :P):

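$query1 = 'cat'; // put your queries here.
$query2 = 'mouse';

// strlen() counts bytes, so the loop walks both strings byte by byte.
$iMaxLength = max(strlen($query1), strlen($query2));

print '<table border="1">';
for ($i = 0; $i < $iMaxLength; ++$i) {
    // Past the end of the shorter string, show an empty cell and byte value 0.
    $cl = isset($query1[$i]) ? $query1[$i] : '';
    $cr = isset($query2[$i]) ? $query2[$i] : '';
    $nl = ($cl === '') ? 0 : ord($cl);
    $nr = ($cr === '') ? 0 : ord($cr);
    // Columns: left byte, right byte, left byte value, right byte value.
    printf('<tr><td>%1$s</td><td>%3$s</td><td>%2$d</td><td>%4$d</td></tr>', $cl, $nl, $cr, $nr);
}
print '</table>';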

Do the last two columns match row by row? Another good thing to try is to use mb_strlen instead of strlen in the max test to see if it provides different results.
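If eyeballing the table gets tedious, the same comparison can be sketched as a function that reports where the two strings first diverge. first_diff_offset() is a hypothetical helper name, and the two sample strings are stand-ins with arbitrary differing bytes, not the actual queries from this thread.

```php
<?php
// Hypothetical helper: returns the offset of the first differing byte,
// or -1 if the two strings are byte-for-byte identical.
function first_diff_offset($a, $b)
{
    $n = min(strlen($a), strlen($b));
    for ($i = 0; $i < $n; ++$i) {
        if ($a[$i] !== $b[$i]) {
            return $i;
        }
    }
    // One string is a prefix of the other, or they are equal.
    return strlen($a) === strlen($b) ? -1 : $n;
}

$query1 = "LIKE '%\xE5\xAE\x98'"; // stand-in for the hard-coded query (UTF-8 bytes)
$query2 = "LIKE '%\xB9\xD9'";     // stand-in for the POST-built query (other bytes)

$at = first_diff_offset($query1, $query2);
echo "first difference at byte $at\n";
// Hex dump of each string from the point where they diverge.
echo bin2hex(substr($query1, $at)), "\n";
echo bin2hex(substr($query2, $at)), "\n";
```

A result of -1 for the real queries would mean the bytes are identical and the cause lies elsewhere, for example in the connection character set.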