  • #1
    New Coder
    Join Date
    Oct 2006
    Posts
    10
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Weird Symbols on Parse

    Hi,
    I am using SimplePie to parse RSS feeds and for some reason I keep getting symbols like —. I need this to stop! I am using PHP to strip all HTML tags except for <a> and <p>. How can I get these symbols to parse as regular Roman characters or to not appear at all? Please help!

    Thanks,
    Matthew

  • #2
    New Coder
    Join Date
    Oct 2006
    Posts
    10
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Anyone?

  • #3
    Senior Coder
    Join Date
    Jan 2007
    Posts
    1,648
    Thanks
    1
    Thanked 58 Times in 54 Posts
    I am using SimplePie to parse RSS feeds and for some reason I keep getting symbols like â€
    I have no experience with using SimplePie, but I do know that those symbols usually appear with string encoding problems. Check your string encoding settings to see that they match up with what you are feeding it.
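One way to act on the advice above is to look at what encoding the feed itself declares before comparing it with what the page outputs. The following is only a minimal sketch in Python, not SimplePie's own logic: the function name `declared_encoding` is hypothetical, it only reads the XML declaration (it ignores HTTP headers), and it assumes UTF-8 when no declaration is present.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration,
# e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
_XML_DECL = re.compile(
    rb'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def declared_encoding(feed_bytes: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the feed's XML declaration.

    Hypothetical helper for illustration only. Falls back to `default`
    when the declaration is missing or names no encoding.
    """
    head = feed_bytes[:200]
    # Skip a UTF-8 byte order mark if the feed starts with one.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    match = _XML_DECL.match(head)
    if match is None:
        return default
    return match.group(1).decode("ascii").lower()
```

If the value returned here differs from the charset the page is served with, that mismatch is a likely source of the stray symbols.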

  • #4
    New Coder
    Join Date
    Mar 2007
    Posts
    63
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by hackmeanscode
    Hi,
    I am using SimplePie to parse RSS feeds and for some reason I keep getting symbols like —. I need this to stop! I am using PHP to strip all HTML tags except for <a> and <p>. How can I get these symbols to parse as regular Roman characters or to not appear at all? Please help!

    Thanks,
    Matthew
    Are you by any chance getting these RSS feeds from Google? A friend was getting the same thing, and it turned out to be a problem on Google's end. As of a few weeks ago, he was still waiting for them to fix it. I think it might be fixed now, but I'm not certain.

    HTH,
    Bob
    Last edited by RTrev; 03-19-2007 at 05:40 PM.
    Think slow, type fats

  • #5
    New Coder
    Join Date
    Oct 2006
    Posts
    10
    Thanks
    0
    Thanked 0 Times in 0 Posts
    It's occurring quite randomly. It's even happening on Daily Kos feeds, LA Times feeds, etc... By the way, I'm using the feed parsing for something I'm making called Tudit. You can see an example of the weird characters I'm talking about at: http://tudit.com/hihi/channel.php?q=politics

  • #6
    Senior Coder
    Join Date
    Jan 2007
    Posts
    1,648
    Thanks
    1
    Thanked 58 Times in 54 Posts
    Example: "WASHINGTON — One day last week, the entire Federal"

    The 'weird symbol' here represents the long dash (em dash). As I said before, this is an encoding issue, for example UTF-8 vs. ISO-8859-1. The text is being read with the wrong character encoding (a single-byte encoding vs. a multi-byte one), so each multi-byte character comes out as several weird characters.
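The mismatch described above can be reproduced in a few lines. This is a small illustrative sketch in Python rather than PHP, and it assumes the wrong decoder is Windows-1252 (the superset of ISO-8859-1 that browsers commonly substitute); the variable names are made up for the example.

```python
# The em dash is a single character, U+2014.
em_dash = "\u2014"

# Encoded as UTF-8 it occupies three bytes.
utf8_bytes = em_dash.encode("utf-8")
print(utf8_bytes)

# Reading those three bytes with a single-byte encoding yields three
# separate characters instead of one dash.
misread = utf8_bytes.decode("cp1252")
print([f"U+{ord(ch):04X}" for ch in misread])
```

Each of the three bytes is mapped to its own character, which is why one dash in the feed shows up as a short run of unrelated symbols on the page.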

  • #7
    New Coder
    Join Date
    Oct 2006
    Posts
    10
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I've checked into SimplePie and it looks like it's made to handle both types of encoding! Help!
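If the parser already copes with both encodings, the damage may be happening later, for instance when UTF-8 text is sent to a page that declares a different charset. Making the page's declared charset match the text is usually the cleaner fix. For text that has already been mangled, the mis-decoding can sometimes be reversed. The following is a hedged sketch in Python, not part of SimplePie: `repair_mojibake` is a hypothetical name, and it assumes the text was UTF-8 wrongly decoded as Windows-1252.

```python
def repair_mojibake(text: str) -> str:
    """Try to undo UTF-8 text that was wrongly decoded as Windows-1252.

    Hypothetical helper for illustration. If the round trip is not
    possible, the text was probably not damaged in this particular way,
    so it is returned unchanged.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Covers both UnicodeEncodeError and UnicodeDecodeError.
        return text


damaged = "WASHINGTON \u00e2\u20ac\u201d One day last week"
repaired = repair_mojibake(damaged)
print(repaired == "WASHINGTON \u2014 One day last week")
```

Correctly decoded text such as "café" fails the round trip and passes through untouched, so the function is reasonably safe to apply to mixed input, though it cannot recover characters that were already stripped or replaced.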

