  • #1
    New to the CF scene
    Join Date
    Jun 2002
    Posts
    2

    Special Character Formatting

    Does anyone know how to strip Word or other word processing formatting from an HTML text area? I can use the Replace function for each ASCII character but there has to be an easier way.

  • #2
    Regular Coder
    Join Date
    Jun 2002
    Location
    Plano, Texas
    Posts
    113
    Whenever I copy text from HTML etc., I usually paste it into Notepad and then copy it from Notepad back to the clipboard.

    This strips unwanted formatting for me.

    Hope this helps.

  • #3
    Senior Coder
    Join Date
    Jun 2002
    Location
    41° 8' 52" N -95° 53' 31" W
    Posts
    3,660
    I actually do the same thing JoeP does... seems the quickest way to me without stripping stuff you don't want.

    Unless you need to replace the stuff when retrieving it from a file dynamically - in which case you will want to use the Replace() function. If the latter is the case, I agree with Dave - have any examples?

    Here's the basic idea though:

    Code:
    myString = Replace(Replace(Replace(Replace(myString,"<",""),">",""),chr(34),""),chr(39),"")
    This would replace <, >, ", and ' with an empty string, i.e. remove them from myString.
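
    The nested calls get hard to read as the list of characters grows. A minimal sketch of the same idea as a loop, assuming a hypothetical helper named StripChars (not something from this thread):

    ```vbscript
    ' Hypothetical helper: removes every character listed in charsToRemove from text.
    Function StripChars(text, charsToRemove)
        Dim i, result
        result = text
        For i = 1 To Len(charsToRemove)
            ' Replace one unwanted character at a time with an empty string
            result = Replace(result, Mid(charsToRemove, i, 1), "")
        Next
        StripChars = result
    End Function

    ' Same effect as the nested Replace() calls: strips <, >, " and '
    myString = StripChars(myString, "<>" & Chr(34) & Chr(39))
    ```

    Adding another character to strip then only means extending the second argument.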

    Last edited by whammy; 06-18-2002 at 12:24 AM.
    Former ASP Forum Moderator - I'm back!

    If you can teach yourself how to learn, you can learn anything. ;)

  • #4
    New to the CF scene
    Join Date
    Jun 2002
    Posts
    2

    Special Characters

    I'm trying to strip Word formatting from pasted text before it gets to the database. The formatting could be tabs, symbols, international alphabet characters, etc.: anything that could be cut and pasted into a text area from a word processor.

  • #5
    Regular Coder
    Join Date
    Jun 2002
    Location
    Round Rock, Texas
    Posts
    443
    Instead of trying to identify all the unwanted characters, as in the above example using Replace(), identify the ones you do want. It's much easier to define what you want than to define every possible character that you don't want.
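
    One way this could look, as a rough sketch: a regular expression with a negated character class removes everything that is not on the allowed list. The helper name KeepAllowed and the particular whitelist are assumptions for illustration only; the list would need extending if international letters should survive.

    ```vbscript
    ' Hypothetical helper: keeps only the characters on an allowed list.
    Function KeepAllowed(text)
        Dim re
        Set re = New RegExp
        ' Assumed whitelist: letters, digits, space, tab, CR, LF and basic punctuation.
        ' [^...] matches any single character that is NOT on the list.
        re.Pattern = "[^A-Za-z0-9 \t\r\n.,;:!?'""()\-]"
        re.Global = True   ' replace every match, not just the first
        KeepAllowed = re.Replace(text, "")
    End Function

    ' Example: clean the pasted text before it goes to the database
    myString = KeepAllowed(myString)
    ```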

  • #6
    Senior Coder
    Join Date
    Jun 2002
    Location
    41° 8' 52" N -95° 53' 31" W
    Posts
    3,660
    Yeah... maybe instead of using Replace(), you could also just use a regular expression that contains the characters that are acceptable to you, and match the whole string against that, like:

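    ' Object references must be assigned with Set
    Set myRegExp = New RegExp

    With myRegExp
        ' Anchored with ^ and $ so the WHOLE string must consist of
        ' word characters (\w) and whitespace (\s) only
        .Pattern = "^[\w\s]*$"
        .IgnoreCase = True
        .Global = True
    End With

    ' Test returns False if any other character is present
    If myRegExp.Test(MyString) = False Then
        myStringError = True
    End If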

    I haven't tested that...
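
    A related sketch, assuming you also want to see which characters trip the check: Execute with a negated class collects every character that is not \w or \s. The helper name FindUnexpected is made up for this example, and as far as I know \w in VBScript only covers A-Z, a-z, 0-9 and underscore, so accented letters would be reported too.

    ```vbscript
    ' Hypothetical helper: returns the characters in text that are neither
    ' word characters (\w) nor whitespace (\s), in order of appearance.
    Function FindUnexpected(text)
        Dim re, m, found
        Set re = New RegExp
        re.Pattern = "[^\w\s]"
        re.Global = True          ' collect every match, not just the first
        found = ""
        For Each m In re.Execute(text)
            found = found & m.Value
        Next
        FindUnexpected = found
    End Function

    ' Example: flag the input if anything unexpected was pasted in
    If Len(FindUnexpected(MyString)) > 0 Then
        myStringError = True
    End If
    ```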


    Or, using another method (not NEARLY as elegant), you could make a string of characters that are acceptable, like:

    myAcceptableCharacters = ".,ABCD"

    etc...

    And loop through the string you're checking to see if the current character is in the string (say using a variable like CurrentCharacter), like:

    If InStr(myAcceptableCharacters, CurrentCharacter) = 0 Then MyError = True
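
    Put together, the loop might look something like this. It is only a sketch: the function name OnlyAllowedChars is hypothetical, and InStr is case-sensitive by default, so lowercase letters would have to be listed separately.

    ```vbscript
    ' Hypothetical helper: True if every character of text appears in allowedChars.
    Function OnlyAllowedChars(text, allowedChars)
        Dim i, CurrentCharacter
        OnlyAllowedChars = True
        For i = 1 To Len(text)
            CurrentCharacter = Mid(text, i, 1)
            ' InStr returns 0 when the character is not found
            If InStr(allowedChars, CurrentCharacter) = 0 Then
                OnlyAllowedChars = False
                Exit Function
            End If
        Next
    End Function

    ' Example with a deliberately short list of acceptable characters
    MyError = Not OnlyAllowedChars(MyString, ".,ABCD")
    ```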
    Last edited by whammy; 06-22-2002 at 12:32 AM.

  • #7
    Senior Coder
    Join Date
    Jun 2002
    Location
    41° 8' 52" N -95° 53' 31" W
    Posts
    3,660
    Heh... that's definitely typical "Word" HTML formatting. YECH.

    HTML Tidy (or the HTML Tidy plugin that comes with HTML-Kit) claims to strip all of the "Word" formatting from a Word-to-HTML page, but from what I've seen it strips almost everything, lol.

    I'm not sure how to overcome the obstacle of someone potentially pasting "Word" characters in a textarea, without using a regular expression or function of some sort.

    If you're not comfortable using regular expressions, you might be better off letting your users know they need to paste from Notepad.
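
    For what it's worth, a "function of some sort" could be as simple as the sketch below, which swaps the most common Word "smart" punctuation for plain equivalents. The name PlainPunctuation is made up, the list of characters is an assumption about what typically gets pasted, and it assumes the submitted text reaches the script as Unicode.

    ```vbscript
    ' Hypothetical helper: swaps common Word "smart" punctuation for plain ASCII.
    Function PlainPunctuation(text)
        Dim result
        result = text
        result = Replace(result, ChrW(8216), "'")      ' left single quote
        result = Replace(result, ChrW(8217), "'")      ' right single quote / apostrophe
        result = Replace(result, ChrW(8220), Chr(34))  ' left double quote
        result = Replace(result, ChrW(8221), Chr(34))  ' right double quote
        result = Replace(result, ChrW(8211), "-")      ' en dash
        result = Replace(result, ChrW(8212), "-")      ' em dash
        result = Replace(result, ChrW(8230), "...")    ' ellipsis
        result = Replace(result, vbTab, " ")           ' tab
        PlainPunctuation = result
    End Function

    ' Example: normalize the textarea contents before saving
    myString = PlainPunctuation(myString)
    ```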