  1. #1
    Kor
    Join Date
    Apr 2003
    Location
    Bucharest, ROMANIA
    Posts
    8,478
    Thanks
    58
    Thanked 379 Times in 375 Posts

    French apostrophe problem

    I have to send some data from an XML file to a MySQL table. Some of the text contains French (curly) apostrophes (’), and I need to replace them with straight single quotes (').

    I tried a lot of methods, including regular expressions, with no success. No matter what I tried, the character that ends up in the DB is still a question mark (?).

    Additional Info:

    - the XML is in the Microsoft XML Spreadsheet 2003 format. I get it by exporting an Excel 2007 worksheet. I don't want to encode it manually as UTF-8.

    - I use the DOMDocument::load() method to read the XML file in PHP

    - the fields of the MySQL table have the collation latin1_swedish_ci. I cannot change that.


    Any ideas? Has anyone encountered this problem and found a solution?
    KOR
    Offshore programming
    -*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*

  • #2
    Kor
    Hm...

    I ended up using this function:
    Code:
    function convert_ascii($string) 
    { 
      // Replace Single Curly Quotes
      $search[]  = chr(226).chr(128).chr(152);
      $replace[] = "'";
      $search[]  = chr(226).chr(128).chr(153);
      $replace[] = "'";
    
      // Replace Smart Double Curly Quotes
      $search[]  = chr(226).chr(128).chr(156);
      $replace[] = '"';
      $search[]  = chr(226).chr(128).chr(157);
      $replace[] = '"';
    
      // Replace En Dash
      $search[]  = chr(226).chr(128).chr(147);
      $replace[] = '--';
    
      // Replace Em Dash
      $search[]  = chr(226).chr(128).chr(148);
      $replace[] = '---';
    
      // Replace Bullet
      $search[]  = chr(226).chr(128).chr(162);
      $replace[] = '*';
    
      // Replace Middle Dot
      $search[]  = chr(194).chr(183);
      $replace[] = '*';
    
      // Replace Ellipsis with three consecutive dots
      $search[]  = chr(226).chr(128).chr(166);
      $replace[] = '...';
    
      // Apply Replacements
      $string = str_replace($search, $replace, $string);
    
      // Remove any non-ASCII Characters
      //$string = preg_replace("/[^\x01-\x7F]/","", $string);
    
      return $string; 
    }
    It solves almost all my problems. I still have to handle the ligatures as well, but that is easy.
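
    For the ligatures, something along the same lines should work. This is only a minimal sketch: the function name convert_ligatures() and the choice of which ligatures to map are assumptions for illustration, and it assumes the input is UTF-8 (which is what DOMDocument hands back). It uses strtr() with a byte-sequence map instead of the paired $search/$replace arrays.

```php
<?php
// Hypothetical helper (not part of the function above): replaces a few
// common UTF-8 encoded ligatures with their plain ASCII letter pairs.
function convert_ligatures($string)
{
  $map = array(
    "\xC5\x93"     => 'oe', // U+0153, lowercase oe ligature
    "\xC5\x92"     => 'OE', // U+0152, uppercase OE ligature
    "\xC3\xA6"     => 'ae', // U+00E6, lowercase ae ligature
    "\xC3\x86"     => 'AE', // U+00C6, uppercase AE ligature
    "\xEF\xAC\x81" => 'fi', // U+FB01, fi ligature
    "\xEF\xAC\x82" => 'fl', // U+FB02, fl ligature
  );

  // strtr() with an array replaces every key it finds and never
  // re-scans text it has already replaced.
  return strtr($string, $map);
}
```

    It can be chained with the function above, e.g. convert_ligatures(convert_ascii($text)), before the value goes into the INSERT query.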

    Do you have any other ideas?

