  1. #1
    New Coder
    Join Date
    Mar 2007
    Posts
    30
    Thanks
    5
    Thanked 0 Times in 0 Posts

    Problem: Apostrophe to Question Mark

    Alright, we are nearing completion of our site at:

    www.legionofangels.net

    We are using a blog system by Greymatter.

    As we were finishing the archive styling for the blog entries yesterday, all the apostrophes and quotation marks (' and ") turned into question marks out of nowhere. We did a Google search and found that this happens when you type something in Word or a similar program, then copy it into a form and post it on the web. It has something to do with rich text, smart quotes, or curly quotes.

    Here are a few helpful links explaining the problem:

    http://educdata.educ.ttu.edu/tempora...tequestion.htm

    http://www.webmasterworld.com/forum21/11854.htm

    ---------------------------------

    I'm sorry, people, and maybe it's because I'm a n00b or something, but I cannot get over the fact that when you copy text from a program into a form on the web, some hidden character attributes get transferred as well.

    That's just insane. Thinking it through, it makes me feel that the end users have all the power: I can right-click and copy any image online, open an image program, hit paste, and it's there. Yet why would it work in reverse?

    I would like a fix for this issue, and with as many genius coders as there are in the world, I'm shocked that there isn't one, unless I just haven't found it yet. Why can't we apply some code to the Greymatter blog, the vBulletin forum, or all our website pages so that they don't recognize special characters and everything appears as plain text?

    In short, I don't understand why there isn't a "protection code" fix to stop CSS or HTML from reading things incorrectly.

    If there really is no workaround or code fix for this, can someone please explain to me, as if I know nothing about computers at all, why this occurs? I'd also welcome any suggestions you may have for workarounds.

  • #2
    Regular Coder
    Join Date
    Sep 2007
    Location
    Raleigh, NC
    Posts
    273
    Thanks
    7
    Thanked 59 Times in 52 Posts
    If I understand your question right, you're frustrated because content you paste from a word processor into a form comes up as unknown characters in the archive? (Usually a diamond with a question mark.)

    Coders have thought of this; it comes down to document and file encoding. More than likely Greymatter converted the apostrophes to a Unicode equivalent, which is often called "escaping" special characters. Similar characters like ampersands, em dashes, and quotation marks are also often escaped.

    Depending on the encoding declared on the HTML page, you could be pasting in characters in an incompatible encoding rather than their ASCII equivalents. Also, depending on the collation of your database, it could be misinterpreting the characters.

    Encoding is a vital part of localization and internationalization, and moving text between systems with different encodings can cause things to get lost in translation, especially when going from a multi-byte encoding like UTF-8 to a single-byte encoding like WLATIN1.
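
    To make the encoding mismatch concrete, here is a minimal Python sketch. It is only an illustration, not what Greymatter actually does internally: it takes one sentence containing a curly apostrophe and shows what the bytes look like in different encodings, and what happens when they are converted or read with the wrong one. The variable names are made up for this example.

```python
# Illustration only: how one curly apostrophe can end up as "?" or as U+FFFD.
curly = "I\u2019m watching football"  # U+2019 RIGHT SINGLE QUOTATION MARK

# In Windows-1252 the curly apostrophe is the single byte 0x92.
cp1252_bytes = curly.encode("cp1252")

# UTF-8 needs three bytes (0xE2 0x80 0x99) for the same character.
utf8_bytes = curly.encode("utf-8")

# Latin-1 (ISO-8859-1) has no curly apostrophe, so a lossy conversion
# substitutes a plain question mark.
lossy_latin1 = curly.encode("latin-1", errors="replace")

# Reading Windows-1252 bytes as if they were UTF-8 yields U+FFFD,
# which browsers usually draw as a diamond with a question mark.
misread = cp1252_bytes.decode("utf-8", errors="replace")

print(cp1252_bytes, utf8_bytes, lossy_latin1, repr(misread))
```

    Once a character has been replaced by "?" this way, the original is gone from the stored text, which is why the fix has to happen before or during saving.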

  • Users who have thanked vtjustinb for this post:

    Omega (09-27-2007)

  • #3
    Senior Coder nikkiH's Avatar
    Join Date
    Jun 2005
    Location
    Near Chicago, IL, USA
    Posts
    1,973
    Thanks
    1
    Thanked 32 Times in 31 Posts
    Are you upset that formatting is preserved?
    Most of us enjoy the fact that our formatting is preserved.
    If you don't, then paste it into notepad first to remove the formatting, or use plain textareas instead of components that support rich text.

    If you're annoyed that the component isn't preserving it correctly, see above.

    If this post contains any code, I may or may not have tested it. It's probably just example code, so no getting knickers in a bunch over a typo, OK? If it doesn't have basic error checking in it, such as object detection or checking if objects are null before using them, put that in there. I'm giving examples, not typing up your whole app for you. You run code at your own risk.
    Bored? Visit
    http://www.kaelisspace.com/

  • #4
    New Coder
    Join Date
    Mar 2007
    Posts
    30
    Thanks
    5
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by nikkiH View Post
    Are you upset that formatting is preserved?
    Most of us enjoy the fact that our formatting is preserved.
    If you don't, then paste it into notepad first to remove the formatting, or use plain textareas instead of components that support rich text.

    If you're annoyed that the component isn't preserving it correctly, see above.
    Let's test format encoding:

    Whatever, I’m watching football this afternoon.

    Going to help lay sod on Saturday.

    Have to leave in about 15 minutes.


    I opened up Word and used bold, italic, and underline for lines 1, 2, and 3 of those sentences, respectively. As I'm typing in this box, it appears to contain none of the formatting I applied; we'll see after I post.

    I don't see how the formatting is preserved. If the characters are changed because of a web page's problem with hidden smart or curly quotes, then it's not actually preserving your formatting; it's changing the characters to compensate.

    ------------------------------------

    Thank you for your responses but my frustration lies in not knowing the proper way to attack the problem.

    Do I try to make Greymatter turn everything into plain text when blog entries are posted?

    Do I try to add some type of editor to the web pages, or a code change, to allow the CSS/HTML to recognize rich text as well?

  • #5
    Senior Coder
    Join Date
    Nov 2003
    Location
    Minneapolis, MN
    Posts
    2,879
    Thanks
    2
    Thanked 65 Times in 56 Posts
    HTML is a plain-text markup language. It's not a rich text editor. If you copy and paste from a Word document (especially into an online rich text editor), you are, in essence, copying not only the text you see in the Word doc but also things you don't see: things that Word uses to make text "rich" and that the online editor tries its best to emulate.

    This textarea here is not a rich text editor—you need to surround text in code to add style:

    [B]bold[/B] [U]underline[/U] [I]italic[/I]

    Online rich text editors are not real rich text editors, either. They add HTML code behind the scenes, so when you type:

    I want to make these words bold.

    …it's really saving this:

    I want to make <b>these words bold</b>.

    As far as special characters (e.g., “”, –, —, ñ, etc.)—web designers will sometimes type them in their documents and save the plain text doc with a special encoding (like UTF-8) that can recognize it. If your website is coded to work with the same encoding, you will have no trouble. If you're using a CMS (like you are) the database will also need to be able to save text with that encoding; a lot of times the text is mangled while being saved to the database.

    You can always use HTML entities to get your characters across:

    Let&rsquo;s go to the store.
    Let’s go to the store.
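
    If typing entities by hand is impractical, the conversion can be scripted. The sketch below is one possible approach in Python, not something Greymatter provides: the helper name `to_numeric_entities` is made up for this example. It replaces every non-ASCII character with a numeric character reference, so the text survives any ASCII-compatible encoding.

```python
import html


def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    # "xmlcharrefreplace" is a standard Python codec error handler that
    # writes unencodable characters as &#NNNN; references.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# The curly apostrophe U+2019 becomes &#8217;, equivalent to &rsquo;.
escaped = to_numeric_entities("Let\u2019s go to the store.")

# Going the other way: turn a named entity back into the real character.
restored = html.unescape("Let&rsquo;s go to the store.")

print(escaped)
print(restored)
```

    The numeric form `&#8217;` renders the same as `&rsquo;` in a browser.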

  • Users who have thanked rmedek for this post:

    Omega (09-28-2007)

  • #6
    Senior Coder nikkiH's Avatar
    Join Date
    Jun 2005
    Location
    Near Chicago, IL, USA
    Posts
    1,973
    Thanks
    1
    Thanked 32 Times in 31 Posts
    Quote Originally Posted by Omega View Post
    Thank you for your responses but my frustration lies in not knowing the proper way to attack the problem.
    I think it depends on what you see as the problem: whether you want to preserve the HTML formatting, want plain text, or are just having an issue with a few special characters like smart quotes.
    If you're the primary editor and you normally use Word and it's annoying the crap out of you that this is happening, you can either change Greymatter to use plain text, use a different "rich text" editor plugin that you prefer (looks like they're working on it http://greymatterforum.proboards82.c...ad=1189044953), change Word to not use smart quotes and whatnot (it's in the options), or copy and paste into Notepad first to remove formatting altogether.

    As rmedek said, when you copy from Word, what is really being copied to your clipboard is Word HTML/rich text symbols: the stuff that makes smart quotes and bold text. Some of those things don't translate directly into HTML very well, so they can confuse some editors.

    Did that help at all?


  • Users who have thanked nikkiH for this post:

    Omega (09-28-2007)

  • #7
    New Coder
    Join Date
    Mar 2007
    Posts
    30
    Thanks
    5
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by nikkiH View Post
    Did that help at all?
    Yes, it did help.

    We also saw that thread last night; my coder found it, and our hope of course is that Greymatter can be adjusted to compensate for the characters.

    I hear what everyone is saying about rich text and such, but a plug-in or code of some sort that fixes the issue is what I was going for. I understand that you can use Notepad or change Word to produce plain text only, but I want to eliminate the chance for human error: I'm not the main editor, and multiple users use the blog system. With multiple users, someone will mess up and it's going to happen again, and I don't want to do manual edits or cleanup, as that is very time consuming.

    I agree with rmedek from what I've read; it appears to be an entity code issue.

    Do you recommend I wait for Greymatter's update on the subject or try to change to HTML entities as suggested above?

    ------ basically, I'm asking: there is a way to fix this, but which way should we do it?
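
    For reference, the kind of "code of some sort" being asked about could look roughly like the Python sketch below. This is an assumption-laden illustration, not a Greymatter feature: the mapping `SMART_TO_PLAIN` and the function `plain_punctuation` are hypothetical names, and the code would have to be hooked into whatever step receives the submitted entry before it is saved.

```python
# Hypothetical mapping from Word-style "smart" punctuation to plain ASCII.
# Extend it with whatever characters actually show up in your entries.
SMART_TO_PLAIN = str.maketrans({
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / curly apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
})


def plain_punctuation(text: str) -> str:
    """Return text with smart punctuation swapped for ASCII equivalents."""
    return text.translate(SMART_TO_PLAIN)


print(plain_punctuation("Whatever, I\u2019m watching football this afternoon."))
```

    Running this on every submission removes the dependence on each user remembering to paste through Notepad, at the cost of losing the typographic quotes.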

  • #8
    Senior Coder
    Join Date
    Nov 2003
    Location
    Minneapolis, MN
    Posts
    2,879
    Thanks
    2
    Thanked 65 Times in 56 Posts
    Quote Originally Posted by Omega View Post
    Do you recommend I wait for Greymatter's update on the subject or try to change the HTML entities as suggested above?

    ------ basically asking there is a way to fix this, but which way should we do it?
    Tell your clients not to copy and paste from Word.

    I know the web is a miraculous thing and all, but some things can't be fixed by a "simple plug-in or code of some sort." If it were that easy all the time, I would make a plug-in or code of some sort that enabled my clients to believe me when I said not to copy and paste from Word.

    Besides, it's not even Greymatter's issue—they are using TinyMCE as an editor, which is a third-party WYSIWYG editor. (Which, incidentally, has an optional button that cleans up Word's formatting.) Also, AFAIK, TinyMCE doesn't have an option to convert text to smart typography.

    If you're going to use a WYSIWYG editor with your CMS, you have to be able to accept the limitations, which includes, er, copying and pasting from Word. So tell your users not to do that.

    Edit: I thought Greymatter was already using TinyMCE, which they're not but might be soon. Either way, all of the above stands—don't copy and paste from Word.
    Last edited by rmedek; 09-28-2007 at 07:13 PM.

  • #9
    Senior Coder nikkiH's Avatar
    Join Date
    Jun 2005
    Location
    Near Chicago, IL, USA
    Posts
    1,973
    Thanks
    1
    Thanked 32 Times in 31 Posts
    I want to eliminate the chance for human error
    Build a system that even a fool can use, and only a fool will use it.
    Trying to eliminate human error is an exercise in frustration. Do what you can to help, obviously, but if it's just a matter of educating the users on how to use a tool, I wouldn't kill myself trying to get around it. I'd tell them not to copy and paste from Word for now, then use TinyMCE when you can, with its button that cleans up Word's formatting.

    There will always be someone who doesn't read instructions properly and messes up your stuff. It's nice to minimize it, but don't code yourself into a corner over it, if that makes sense.

    Your effort is likely better spent creating a small SQL script (pretty simple replace) to fix the data until the new editor can be used.
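
    A rough sketch of that kind of replace script follows, written in Python with an in-memory SQLite database so it can be run as-is. Everything about the storage is an assumption made for the example: the table `entries`, the column `body`, and the function `clean_entries` are hypothetical, and the real data store may look quite different. It also only helps where the stored text still contains the curly characters; entries already saved as a literal "?" cannot be told apart from real question marks.

```python
import sqlite3

# Pairs of (smart character, plain replacement) to apply to stored entries.
REPLACEMENTS = [
    ("\u2018", "'"),
    ("\u2019", "'"),
    ("\u201c", '"'),
    ("\u201d", '"'),
]


def clean_entries(conn: sqlite3.Connection) -> None:
    """Run one SQL REPLACE per character over the hypothetical entries table."""
    for smart, plain in REPLACEMENTS:
        conn.execute(
            "UPDATE entries SET body = REPLACE(body, ?, ?)", (smart, plain)
        )
    conn.commit()


# Demo against a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute(
    "INSERT INTO entries (body) VALUES (?)",
    ("Let\u2019s go to the \u201cstore\u201d.",),
)
clean_entries(conn)
cleaned = conn.execute("SELECT body FROM entries").fetchone()[0]
print(cleaned)
```

    Back up the data before running anything like this against real entries.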


  • #10
    UE Antagonizer Fumigator's Avatar
    Join Date
    Dec 2005
    Location
    Utah, USA, Northwestern hemisphere, Earth, Solar System, Milky Way Galaxy, Alpha Quadrant
    Posts
    7,691
    Thanks
    42
    Thanked 637 Times in 625 Posts
    Build a system that even a fool can use, and only a fool will use it.
    That is clever wordplay, but I don't really buy it. I love it when I download some freeware application and I can figure it out in 30 seconds and do what I need to do in a minute. The simpler and more "foolproof" the better, IMO (as long as it does what you need it to do). Of course... that might just make me a fool.

    But I agree with the general premise: you don't want to paste Word text into a browser (though it does work most of the time).

