  1. #1
    New Coder
    Join Date
    Aug 2009
    Posts
    51
    Thanks
    9
    Thanked 0 Times in 0 Posts

    Allow Multiple Languages?

    Hello, I searched around and can't find much information on this. How can I allow extended character sets and non-standard characters to be stored in the database? I realize I can choose the collation, but is there one that covers all the others? They all seem very specific: for instance, Spanish accented characters, Russian letters, Arabic, etc.

    Currently, English will primarily be stored, but that may change, so I need to know how to handle these characters in the future. Thank you.

  • #2
    Supreme Master coder! abduraooft's Avatar
    Join Date
    Mar 2007
    Location
    N/A
    Posts
    14,852
    Thanks
    160
    Thanked 2,223 Times in 2,210 Posts
    Blog Entries
    1
    You could use a UTF-8 collation (say utf8_general_ci) for your table and/or the required fields, and then set a utf8 charset for the MySQL connection, like
    mysql_query("SET NAMES 'utf8'");

    See http://dev.mysql.com/doc/refman/5.0/...onnection.html
    The Dream is not what you see in sleep; Dream is the thing which doesn't let you sleep. --(Dr. APJ. Abdul Kalam)

  • Users who have thanked abduraooft for this post:

    Tanner8 (02-06-2012)

  • #3
    New Coder
    Thank you for the response and I apologize for the late reply.

    That doesn't seem to work. I included that after each connection, and I made sure the appropriate fields in the table use the utf8_unicode_ci collation. However, after escaping the input, what goes into the database for the character 'ñ' is &Atilde;&plusmn;

    So 'ñ' => '├▒'
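
A minimal sketch of how both garbled forms reported above can come from the same two bytes, assuming the original character is 'ñ' stored as UTF-8. Python is used here only to inspect the bytes (the thread itself uses PHP), and the choice of Latin-1 and code page 437 as the "wrong" encodings is an assumption inferred from the reported output.

```python
# 'ñ' (U+00F1) occupies two bytes in UTF-8: 0xC3 0xB1.
raw = "ñ".encode("utf-8")

# Reading those bytes as Latin-1 (ISO-8859-1) yields two characters, 'Ã±',
# which htmlentities-style escaping would render as &Atilde;&plusmn;.
as_latin1 = raw.decode("latin-1")

# Reading the same bytes as code page 437 (a DOS console code page)
# yields the box-drawing pair '├▒'.
as_cp437 = raw.decode("cp437")
```

If this assumption holds, the bytes in the database may already be correct UTF-8, and the garbling happens in whichever layer reads them with the wrong character set.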

    I tried encoding to UTF-8 through PHP before insertion, but that didn't change anything either. Any more ideas? Thank you.

    After more playing around, it seems that if I do an html_entity_decode I can get the ñ out; however, that is bad practice because XSS is then possible.

    It seems to have been an encoding problem with htmlentities.

    htmlentities($reply, ENT_COMPAT, 'UTF-8');

    That allows me to use an extended character set. However, Asian characters are being represented in the database as squares. Is there any way to get that working as well? This would be better suited to the PHP section.
    Last edited by Tanner8; 02-05-2012 at 02:08 AM. Reason: More info

  • #4
    Senior Coder
    Join Date
    Dec 2005
    Location
    Slovenia
    Posts
    1,960
    Thanks
    120
    Thanked 76 Times in 76 Posts
    There are a zillion things that can affect your problem: OS, web server, ...
    In general, setting MySQL to UTF-8 is the right way to sort out the problem on the database side; outside of MySQL,
    you will need to spend some more time sorting everything else out.
    Found a flower or bug and don't know what it is ?
    agrozoo.net galery
    if you don't spot search button at once, there is search form:
    agrozoo.net galery search

  • #5
    New Coder
    Quote Originally Posted by BubikolRamios View Post
    There are a zillion things that can affect your problem: OS, web server, ...
    In general, setting MySQL to UTF-8 is the right way to sort out the problem on the database side; outside of MySQL,
    you will need to spend some more time sorting everything else out.
    What is the "proper" htmlentities encoding I should use, however? Changing that really alters what is stored in the database. I tried a few different flags, but they all just produce squares in the DB. I need to get it to store something like &#1581; for instance.
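
For reference, a value such as &#1581; is a numeric character reference: the decimal Unicode code point of the character wrapped in `&#` and `;`. The sketch below (Python, not the PHP used in the thread) shows one way such a reference can be produced and reversed; the sample character is chosen here because its code point is 1581, and nothing in the thread says which character the poster had in mind.

```python
import html

text = "ح"  # ARABIC LETTER HAH, code point U+062D (decimal 1581)

# Replace every non-ASCII character with its numeric character reference.
entity = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
# entity is now "&#1581;"

# Decoding the reference gives back the original character.
decoded = html.unescape(entity)
```

Storing such references is only one option; with a UTF-8 column and connection, the character itself can be stored directly.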

  • #6
    Senior Coder
    In general, I always put this on my pages (Java):

    Code:
    <%@ page language="java" contentType="text/html; charset=utf-8" pageEncoding="utf-8"%> 
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    Then, if you are using Tomcat, you need to set UTF-8 there too.
    That should bring you close to the result.

  • #7
    Supreme Master coder! abduraooft's Avatar
    However, after escaping the input, what goes into the database for the character 'ñ' is &Atilde;&plusmn;
    Could you show some relevant code? Do you convert the input to htmlentities before insertion?

  • #8
    New Coder
    Quote Originally Posted by abduraooft View Post
    Could you show some relevant code? Do you convert the input to htmlentities before insertion?

    I just got it working. The reason it kept breaking, even after trying htmlentities, is that I had tried using "utf8_encode()" while experimenting with random things and forgot to remove it.
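
A rough sketch of why a leftover encoding call can produce this symptom, assuming the function in question behaves like PHP's utf8_encode(), which converts ISO-8859-1 to UTF-8. The helper `latin1_to_utf8` below is a hypothetical Python stand-in for that behaviour, not the poster's code. Applied to input that is already UTF-8, it encodes the text a second time.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Hypothetical stand-in for an ISO-8859-1 to UTF-8 converter:
    treat every input byte as a Latin-1 character, then encode as UTF-8."""
    return data.decode("latin-1").encode("utf-8")

already_utf8 = "ñ".encode("utf-8")             # bytes C3 B1
double_encoded = latin1_to_utf8(already_utf8)  # bytes C3 83 C2 B1

# Read back as UTF-8, the stored value is now two characters: 'Ã±'.
stored_text = double_encoded.decode("utf-8")

# Undoing one layer of encoding recovers the original character.
repaired = stored_text.encode("latin-1").decode("utf-8")
```

Removing the extra conversion, as the poster did, avoids the problem at the source; the repair step is only relevant for rows that were already written in the double-encoded form.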

    This way, however, the text is NOT encoded in the database; it is stored plainly. So Chinese characters are stored directly in there. Is that a security issue? It still encodes the important characters like " < >, so should that be enough? Thank you.

  • #9
    Supreme Master coder! abduraooft's Avatar
    So Chinese characters are stored directly in there. Is that a security issue?
    There's nothing wrong with that. After all, they are just characters, like English letters, that sit at different positions in the Unicode table.
    Last edited by abduraooft; 02-06-2012 at 08:18 AM.
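
To illustrate the point, here is a small sketch in Python, with `html.escape` standing in for the PHP escaping functions discussed in the thread (that substitution is an assumption made for illustration). Escaping the HTML-significant characters at output time neutralises markup while leaving the Chinese text untouched.

```python
import html

# Hypothetical user input mixing markup with Chinese text.
comment = '<script>alert("x")</script> 你好'

# Escape &, <, > and quotes; every other character passes through as-is.
safe = html.escape(comment, quote=True)
# safe is now '&lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; 你好'
```

The non-ASCII characters carry no special meaning in HTML, so they do not need to be converted to entities for safety, provided the page is served as UTF-8.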

  • Users who have thanked abduraooft for this post:

    Tanner8 (02-06-2012)

