
Encoding nightmare -- how to fix?



schwarznavy
05-19-2008, 01:38 PM
Hello,

I started using ListGarden to create RSS Feeds and html pages along with the feeds.

I believe ListGarden encodes the RSS Feeds in UTF-8.

I don't know much about encoding, but when I started mixing in the pages based on those RSS feeds in with my site, I started seeing weird characters like question marks instead of dashes and quotes, and â€“ instead of dashes.

So not really knowing what I was doing, I tried to fix it by going into FrontPage 2003's "Site Settings" and changing the language.

Anyways, things are still a mess.

I decided it might be best to just change everything to UTF-8 (I think I was on Western European before). If I have to go through and fix every little character, it is best to do it now, before the site gets bigger.

Does anyone have any tips on how to do that? I can't just do a blind search for question marks because some question marks might actually be meant to be question marks.

Is there a good way to do this? How can I use FrontPage 2003 to change the encoding on every single page??

Thank you.
Matthew

Apostropartheid
05-19-2008, 04:10 PM
Do a search for the original characters. There's a good chance they will still show up that way when you open the document as ISO/IEC 8859-1 (as Notepad often does). Replace them with HTML entities, like &rsquo; or &mdash;.

If possible, try overriding the character set in ListGarden to iso-8859-1 or windows-1252. That will sort it out entirely.
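
For anyone who would rather script the entity replacement than do it by hand, here is a minimal sketch in Python. Python is only my choice for illustration (nothing in this thread uses it), and the function name `to_entities` is made up. It leaves plain ASCII alone, so real question marks are untouched, and turns every other character into a named entity where one exists, or a numeric one otherwise.

```python
import html.entities

def to_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML entity.

    ASCII characters (including real question marks) pass through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp in html.entities.codepoint2name:
            # Named entity, e.g. U+2019 -> &rsquo;
            out.append("&" + html.entities.codepoint2name[cp] + ";")
        else:
            # No named entity known: fall back to a numeric reference
            out.append("&#%d;" % cp)
    return "".join(out)

sample = "It\u2019s here \u2014 finally?"
converted = to_entities(sample)
print(converted)
```

This only helps once the text has been read with the right encoding. If the file was already opened with the wrong one, the garbled characters are what get converted.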

Arbitrator
05-20-2008, 12:14 AM
"I don't know much about encoding, but when I started mixing in the pages based on those RSS feeds in with my site, I started seeing weird characters like question marks instead of dashes and quotes, and â€“ instead of dashes."

I would imagine that what's happening is that you have characters from a Windows-1252‐encoded document being displayed as UTF-8 due to an inaccurate meta element, XML declaration, or HTTP header. (Note that you can also have a document claiming to be ISO-8859-1 that's actually Windows-1252‐encoded and not realize it, since Web browsers treat them as if they were the same; thus, you could have the same problem even if a document is labeled as ISO-8859-1.)
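
To make the mismatch concrete, here is a small Python sketch (my own illustration, not something taken from ListGarden or FrontPage). It encodes an en dash both ways and then decodes each result with the wrong encoding. Assuming the pages really are a mix of the two encodings, this would account for both symptoms: a Windows-1252 byte read as UTF-8 is invalid and becomes the replacement character (usually drawn as a question mark), while UTF-8 bytes read as Windows-1252 become a three-character sequence.

```python
dash = "\u2013"  # EN DASH

utf8_bytes = dash.encode("utf-8")      # three bytes: E2 80 93
cp1252_bytes = dash.encode("cp1252")   # one byte: 96

# UTF-8 bytes misread as Windows-1252: each byte becomes its own character
as_cp1252 = utf8_bytes.decode("cp1252")

# Windows-1252 byte misread as UTF-8: 0x96 is not valid on its own,
# so the decoder substitutes U+FFFD (the replacement character)
as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

print(repr(as_cp1252))
print(repr(as_utf8))
```

The same round trip can sometimes be reversed: text that shows the three-character sequence can often be repaired by encoding it back to Windows-1252 and decoding as UTF-8, provided nothing was lost in between.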

I would try setting your document editor to read Windows-1252‐encoded (also referred to as ANSI‐encoded) documents. If the characters display correctly, try re‐encoding your document as UTF-8. (How you do this depends upon your editor; some allow you to configure the encoding during the editing process and others allow you to do it during the save process.)
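
If the editor cannot do this for every page at once, the re-encoding step can be scripted. The following is a rough Python sketch under some assumptions: that the pages which are not already valid UTF-8 are Windows-1252, and that they match `*.htm*`. The function names and the `site` folder in the usage comment are hypothetical. Work on a backup copy, since files are overwritten in place.

```python
from pathlib import Path

def reencode_to_utf8(path: Path, source_encoding: str = "cp1252") -> bool:
    """Rewrite one file as UTF-8. Returns False if the file was left untouched."""
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8 (plain ASCII counts too)
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 is Windows-1252.
    # Bytes undefined in Windows-1252 are replaced rather than raising.
    text = raw.decode(source_encoding, errors="replace")
    path.write_bytes(text.encode("utf-8"))
    return True

def convert_tree(root: Path, pattern: str = "*.htm*") -> list:
    """Re-encode every matching file under root; return the paths that changed."""
    changed = []
    for page in sorted(root.rglob(pattern)):
        if page.is_file() and reencode_to_utf8(page):
            changed.append(page)
    return changed

# Example call (hypothetical folder name):
# for page in convert_tree(Path("site")):
#     print("converted", page)
```

This only changes the bytes on disk. Any meta element, XML declaration, or HTTP header that still declares the old encoding has to be updated separately, or the pages will be mislabeled in the other direction.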


