Spudhead
08-20-2007, 02:10 PM
I just don't understand this character encoding thing. Never have. ANSI, ASCII, UTF, Unicode... it may as well be Greek...
So: my simple CMS lets admin users type content into a textarea. Before it goes into the database, I take care of any dodgy chars (e.g. single quotes) with escape(). When it comes out, I replace "%0D%0A" with a couple of <br/> tags, then unescape() it and dump it all on the page.
Generally, that's fine. However - one client uses a Mac to update her site. It's not causing me any problems as such, although the double quotes look a bit...odd. But she's saying that some chars are getting replaced with those "I don't know what char this is supposed to be" question mark symbols.
To clarify (hopefully - I hope the forum software doesn't do exactly what I'm trying to do and fix the dodgy char):
- Client pastes a “ into textarea.
- I escape() it. Apparently <%=escape("“")%> returns %E2%u20AC%u0153. :confused:
- <%=asc("“")%> returns 226
- I try to fix it with output= replace(output,"“", "“") - which does, it seems, nothing. :confused:
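For what it's worth, those two numbers can be reproduced outside ASP. A rough sketch in Python, assuming (and it is only a guess, nothing confirmed on the server) that the browser sends the curly quote as UTF-8 and the page reads those bytes as Windows-1252. js_escape is a hypothetical stand-in for escape(), written just for this illustration:

```python
# Assumption: the quote arrives as UTF-8 bytes and is read as Windows-1252.
# js_escape is a hypothetical, simplified stand-in for escape(), not the real thing.

SAFE = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
           "abcdefghijklmnopqrstuvwxyz"
           "0123456789@*_+-./")

def js_escape(text):
    """Leave SAFE chars alone, write code points below 256 as %XX,
    and everything else (BMP only in this sketch) as %uXXXX."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in SAFE:
            out.append(ch)
        elif cp < 256:
            out.append("%%%02X" % cp)
        else:
            out.append("%%u%04X" % cp)
    return "".join(out)

pasted = "\u201c"                       # the curly left double quote
raw_bytes = pasted.encode("utf-8")      # three bytes: E2 80 9C
misread = raw_bytes.decode("cp1252")    # three separate chars instead of one

print(js_escape(misread))               # %E2%u20AC%u0153
print(ord(misread[0]))                  # 226

# Undoing the misread: back to the original bytes, then decode as UTF-8.
repaired = misread.encode("cp1252").decode("utf-8")
print(repaired == pasted)               # True
```

If that assumption holds, the one pasted character is being treated as three, which would match both the escape() output and the 226 from asc().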
So... can anyone explain to me, preferably in words of two syllables or less, what the nuts is going on and how to fix it? Is it character encodings? Is it locale IDs? Is it ANSI or Unicode? What is it? :confused:
How the chuff do I find these things and replace them with something... standard?