View Full Version : Allow Multiple Languages?



Tanner8
01-17-2012, 11:59 PM
Hello, I searched around and can't find much information on this. How can I allow extended character sets and non-standard characters to be stored in the database? I realize I can choose the collation, but is there one that covers all the others? They all seem very specific. I need to handle, for instance, Spanish accented characters, Russian letters, Arabic, etc.

Currently the stored text will be primarily English, but that may change, so I need to know how to handle these characters in the future. Thank you.

abduraooft
01-18-2012, 10:13 AM
You could use a UTF-8 collation (say utf8_general_ci) for your table and/or the required fields, and then set a utf8 charset for the MySQL connection, like
mysql_query("SET NAMES 'utf8'");

See http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html
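The same idea with a current PHP API might look like the sketch below. It is only an illustration under assumptions not stated in the thread: PDO with the pdo_mysql driver, MySQL 5.5.3 or later (needed for utf8mb4; on older servers plain utf8 as suggested above is the fallback), and a hypothetical `replies` table with a `body` column. The host, database and credentials are placeholders. The old mysql_* functions were removed in PHP 7, which is why PDO is used here.

```php
<?php
// Hypothetical helper: builds a PDO DSN that requests a utf8mb4 connection.
// The charset option plays the role of running SET NAMES after connecting.
function build_dsn(string $host, string $dbname): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbname);
}

// Hypothetical helper: stores one reply using a prepared statement,
// so the text needs no manual escaping before insertion.
function store_reply(PDO $pdo, string $body): void
{
    $stmt = $pdo->prepare('INSERT INTO replies (body) VALUES (?)');
    $stmt->execute([$body]);
}

// Usage (needs a running MySQL server; names and credentials are placeholders):
// $pdo = new PDO(build_dsn('localhost', 'app_db'), 'app_user', 'secret',
//                [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// $pdo->exec('ALTER TABLE replies CONVERT TO CHARACTER SET utf8mb4 '
//          . 'COLLATE utf8mb4_unicode_ci');
// store_reply($pdo, 'ñ Привет مرحبا 日本語');
```

MySQL's utf8 charset stores at most three bytes per character, so it covers Spanish, Russian, Arabic and most CJK text, while utf8mb4 also covers four-byte characters.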

Tanner8
02-04-2012, 09:40 PM
Thank you for the response and I apologize for the late reply.

That doesn't seem to work. I included that after each connection, and I made sure the appropriate fields in the table use the utf8_unicode_ci collation. However, after escaping the input, what goes into the database for character 'ñ' is &Atilde;&plusmn;

So 'ñ' => '&Atilde;&plusmn;'

I tried encoding to UTF-8 through PHP before insertion, but that didn't change anything either. Any more ideas? Thank you.

After more playing around, it seems that if I do an html_entity_decode I can get the ñ out; however, that is bad practice because XSS is then possible.

It seems to have been an encoding problem with htmlentities.

htmlentities($reply, ENT_COMPAT, 'UTF-8');

That call allows me to use an extended character set. However, Asian characters are being represented in the database as squares. Is there any way to get that working as well? This would be better suited for the PHP section.
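A small sketch of why the third argument matters, assuming the input arrives as UTF-8 bytes. Before PHP 5.4 the default charset of htmlentities() was ISO-8859-1, which would explain the result reported earlier in this post. The variable names are made up for the illustration.

```php
<?php
// "ñ" encoded as UTF-8 is the two bytes 0xC3 0xB1.
$input = "\xC3\xB1";

// Wrong charset declared: each byte is treated as its own Latin-1 character.
$wrong = htmlentities($input, ENT_COMPAT, 'ISO-8859-1');

// Correct charset declared: the two bytes are decoded as a single character.
$right = htmlentities($input, ENT_COMPAT, 'UTF-8');

echo $wrong, "\n"; // &Atilde;&plusmn;
echo $right, "\n"; // &ntilde;
```

Characters that have no named HTML entity, such as Chinese or Arabic letters, are passed through unchanged when 'UTF-8' is declared.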

BubikolRamios
02-05-2012, 07:31 PM
There are a zillion things that can affect your problem: OS, web server, .....
In general, setting MySQL to UTF-8 is the right way to sort out the problem on that side; outside of MySQL,
you will spend some more time sorting everything out.

Tanner8
02-05-2012, 09:24 PM
What is the "proper" htmlentities encoding I should use, however? Changing that really alters what is stored in the database. I tried a few different flags but they all just produce squares in the DB. I need to get it to put in something like &#1581; for instance.
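If numeric references like &#1581; are really what is wanted, one possible route is mbstring's mb_encode_numericentity(). The sketch below assumes the mbstring extension is loaded and that the input is valid UTF-8; the function name `to_numeric_entities` is made up for the example.

```php
<?php
// Hypothetical helper: turns every non-ASCII character into a decimal
// numeric character reference and leaves plain ASCII untouched.
function to_numeric_entities(string $utf8): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

// U+062D (Arabic letter hah) is the bytes 0xD8 0xAD in UTF-8.
echo to_numeric_entities("\xD8\xAD"), "\n"; // &#1581;
```

This does not escape characters such as < or ", so it is not a substitute for HTML escaping.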

BubikolRamios
02-05-2012, 11:35 PM
In general I always put this on my pages (Java/JSP):



<%@ page language="java" contentType="text/html; charset=utf-8" pageEncoding="utf-8"%>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />


Then, if you are using Tomcat, you need to set UTF-8 there too.
That should bring you close to a working result.

abduraooft
02-06-2012, 05:55 AM
"However after escaping the input, what goes into the database for character 'ñ' is &Atilde;&plusmn;"
Could you show some relevant code? Do you convert the input to htmlentities before insertion?

Tanner8
02-06-2012, 06:35 AM
I just got it working. The reason it kept breaking even after I fixed the htmlentities call is that I had tried using "utf8_encode()" while experimenting with random things and forgot to remove it.

This way, however, the text is NOT entity-encoded in the database; it is stored there as plain text. So Chinese characters are stored directly in there. Is that a security issue? It still encodes the important characters like " < >, so should that be enough? Thank you
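A short illustration of the double encoding this causes, assuming the input was already UTF-8. utf8_encode() converts from ISO-8859-1 to UTF-8 and is deprecated as of PHP 8.2, so the sketch uses the equivalent mb_convert_encoding() call (mbstring extension assumed).

```php
<?php
// "ñ" that is already UTF-8: bytes 0xC3 0xB1.
$utf8 = "\xC3\xB1";

// Same conversion utf8_encode() performs: every byte is read as a
// Latin-1 character and re-encoded as UTF-8.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // c3b1
echo bin2hex($double), "\n"; // c383c2b1, which displays as "Ã±"
```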

abduraooft
02-06-2012, 07:12 AM
"So Chinese characters are stored directly in there. Is that a security issue?"
There's nothing wrong with that. After all, they are just characters, like English letters, that sit at a different location in the Unicode table.
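One common way to apply this is to store the raw UTF-8 text and escape only when printing it into HTML. A minimal sketch, assuming the page is served as UTF-8; the function name `render_reply` and the sample text are made up for the example.

```php
<?php
// Hypothetical helper: escapes the HTML-significant characters
// (& < > " ') and leaves every other character as it is.
function render_reply(string $stored): string
{
    return htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
}

$stored = '你好 <script>alert("x")</script>';
echo render_reply($stored), "\n";
// 你好 &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```

The Chinese characters pass through untouched, while the markup that could carry a script is neutralised.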