View Full Version : Encoding vs. charset

05-05-2006, 10:28 PM
I am not quite clear on the difference between encoding and charset in the meta tag. I know the encoding tells the browser how to display the content of the page. What is the role of the charset, then?

05-06-2006, 12:50 AM
charset tells the browser which 'character set' to use. UTF-8 covers a much broader range of characters than the ISO-8859 sets. Most Western characters are available in both, but there are differences.
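To make the "broader range" point concrete, here is a minimal Python sketch. It assumes ISO-8859-1 as the 8859 variant being compared, and the helper name `can_encode` is made up for illustration. It checks which sample strings each encoding can represent, and shows that even a shared character like é is stored as different bytes.

```python
# Sample text: plain ASCII, an accented Latin letter, and Japanese kanji.
samples = ["cafe", "café", "日番谷"]


def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for text in samples:
    # Columns: text, representable in ISO-8859-1?, representable in UTF-8?
    print(text, can_encode(text, "iso-8859-1"), can_encode(text, "utf-8"))

# The same character can exist in both sets yet be stored differently:
latin1_bytes = "é".encode("iso-8859-1")  # one byte
utf8_bytes = "é".encode("utf-8")         # two bytes
print(latin1_bytes, utf8_bytes)
```

The kanji are the only sample that ISO-8859-1 cannot represent at all, while UTF-8 handles all three.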

What do you mean by encoding in the meta? Example?

05-06-2006, 01:31 AM
If you're talking about this:

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

In this context, charset specifies the character encoding, so encoding would be the same thing. If you're asking why it's needed, it's because different characters are included in different character sets. Take for example these "normal" special characters and Japanese kanji:

café æther ©® 日番谷冬獅郎

If you change the character encoding under your browser's View menu, you'll see that the special characters (letter e with acute, letter ae, copyright symbol, and registered symbol) appear as question marks (Firefox) or gibberish (Internet Explorer), because those character sets don't map the page's bytes to the characters as typed. You'll notice (if you have the language installed) that the Japanese characters remain constant over every encoding. This is because the forum re-encoded them into numeric character entities, since under this page's encoding they would not have displayed otherwise. Entities display correctly in every encoding; that's why they say it's safest to use them for special characters.

However, using character entities can be time-consuming and confusing, so generally you want to be able to type in your native language freely with minimal entity use. That is why you specify an encoding. Keep in mind that not everyone browsing the internet speaks English or is Western, so their default encodings might not match what you expect your website to be viewed in, which is yet another reason to declare one. The Western European encoding (which also applies to the US) is ISO-8859-1. The general-purpose encoding is UTF-8.
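Both effects described here can be reproduced outside a browser. The following is a rough Python sketch, not what any browser or this forum actually runs. It assumes the page bytes were written as ISO-8859-1, and it uses ISO-8859-2 and UTF-8 as example "wrong" choices from the View menu.

```python
import html

text = "café æther ©®"

# The bytes a page saved as ISO-8859-1 would contain.
raw = text.encode("iso-8859-1")

# Same bytes read with a different single-byte charset: every byte still
# maps to some character, just not always the one that was typed.
as_latin2 = raw.decode("iso-8859-2")

# Same bytes read as UTF-8: the non-ASCII bytes are invalid sequences,
# so each becomes U+FFFD, the replacement character.
as_utf8 = raw.decode("utf-8", errors="replace")

print(as_latin2)
print(as_utf8)

# Numeric character references use only ASCII, so they survive any
# ASCII-compatible encoding unchanged.
kanji = "日番谷冬獅郎"
refs = kanji.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(refs)                 # starts with &#26085; (the code point of 日)
print(html.unescape(refs))  # the reader resolves the references back to kanji
```

The trade-off matches the one described above: the references are robust but unreadable in the page source, which is why declaring an encoding and typing the characters directly is usually preferred.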

For the most part, websites will display correctly in modern browsers anyway because of their auto-detection abilities (if turned on), so you don't need to worry about it much, but it's good to be safe if you plan on using any characters that are not on the standard American keyboard.
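As a rough illustration of why an explicit declaration is safer than detection, here is a Python sketch of how a tool could read the declared charset from a page instead of guessing. The class `MetaCharsetFinder` and the function `declared_charset` are hypothetical names, and the logic is a simplification, not the algorithm real browsers follow.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Remember the first charset declared in a <meta> tag, if any."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        # Short form: <meta charset="utf-8">
        if attrs.get("charset"):
            self.charset = attrs["charset"].strip().lower()
            return
        # Long form: <meta http-equiv="content-type" content="...; charset=...">
        if (attrs.get("http-equiv") or "").lower() == "content-type":
            for part in (attrs.get("content") or "").split(";"):
                name, _, value = part.partition("=")
                if name.strip().lower() == "charset" and value.strip():
                    self.charset = value.strip().lower()


def declared_charset(html_text):
    """Return the declared charset in lowercase, or None if there is none."""
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


page = '<meta http-equiv="content-type" content="text/html; charset = utf-8"/>'
print(declared_charset(page))
print(declared_charset("<p>no declaration here</p>"))
```

When the function returns None, a reader of the page has nothing to go on but guessing, which is the situation the declaration avoids.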