cfructose
02-20-2007, 02:21 PM
I'm using Chinese, Russian, Hebrew & Greek characters on most pages of a site (along with Latin).
I therefore need to encode in UTF-8, and am using the meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Fine up to here (I think), BUT the BOM is being displayed in some browsers, resulting in either blank space or the dreaded "i-diaeresis, right-angle-quotation-mark, inverted question mark" (ï»¿) rearing its ugly head.
This screws up my formatting, nudging graphics around etc - the problem being compounded by the high number of scattered PHP includes of data files which ALSO require multiple alphabets, leading to more blank spaces and gobbledygook throughout each page, not just at the top.
I HAD been using Notepad2 as a text editor (shoot me!), and later learned that Notepad was automatically adding a BOM, and so went about removing the EF BB BF at the beginning of each page manually with a hex editor.
(I've also changed text editor to one that gives me the option!) :-)
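For anyone who would rather not do the hex-editor step by hand, the same removal could probably be scripted. Below is a rough Python sketch, not something I have run against the real site: the function names `strip_bom` and `strip_bom_in_tree` and the list of file extensions are my own assumptions, and it only touches the three bytes EF BB BF at the very start of a file.

```python
import codecs
from pathlib import Path

# The UTF-8 byte order mark: the bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def strip_bom_in_tree(root, suffixes=(".php", ".html", ".inc")):
    """Rewrite files under root that begin with a BOM.

    The suffixes are an assumption; adjust them to whatever the
    pages and include files are actually called.
    Returns the paths of the files that were changed.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            data = path.read_bytes()
            cleaned = strip_bom(data)
            if cleaned != data:
                # Only rewrite files that actually had a BOM.
                path.write_bytes(cleaned)
                changed.append(str(path))
    return changed
```

Calling `strip_bom_in_tree("site")` (where `site` is a placeholder for the folder holding the pages and includes) would rewrite the affected files in place and return their paths, so it is worth running on a copy first.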
Having removed the BOMs (aka "Bane Of My" existence), my formatting is back on track when viewed in, say, Firefox, but the browser is not automatically rendering the pages as UTF-8.
How do I get round this?
Yes, I can instruct users to change the encoding from the View menu, but even if they're not scared off by that (simple as it is), they'd have to change it every time they visited a new page, which is clearly an unacceptable nuisance.
So:
Does the BOM override the meta tag I mentioned above? - I.e. is there a hierarchy?
(It would seem that the tag does nothing, now that the BOMs have been removed)
How can I retain browsers' ability to detect that I want UTF-8 while preventing certain ones from wreaking havoc with my layout when they print the BOM?
I'm desperately hoping that I've fundamentally misunderstood how it all works (despite the inordinate number of pages I've been reading on the subject) and that there's a simple solution: pleeeeeeeez somebody come and tell me I'm being an idiot...!