12-12-2011, 02:11 PM | #1
Kor
French apostrophe problem

I have to load some data from an XML file into a MySQL table. Some of the text contains French (curly) apostrophes (’), and I need to replace them with straight single quotes (').

I have tried many methods, including regular expressions, with no success. No matter what I try, the character stored in the DB is still a question mark (?).

Additional Info:

- The XML is in the Microsoft XML Spreadsheet 2003 format. I get it by exporting an Excel 2007 worksheet. I don't want to encode it manually as UTF-8.

- I use the DOMDocument::load() method to read the XML file in PHP

- The fields of the MySQL table use the latin1_swedish_ci collation. I cannot change that.


Any ideas? Has anyone encountered this problem and found a solution?
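
To make the problem concrete, here is a minimal sketch. It assumes the strings read through DOMDocument arrive in PHP as UTF-8, in which case the curly apostrophe (U+2019) is the three-byte sequence E2 80 99. The sample value is hypothetical and only stands in for a cell read from the XML file.

```php
<?php
// U+2019 (right single quotation mark) is three bytes in UTF-8: E2 80 99.
$curly = "\xE2\x80\x99";

// Hypothetical sample value, standing in for text read via DOMDocument.
$text = "l" . $curly . "apostrophe";

// Replace the curly apostrophe with a plain ASCII single quote.
$clean = str_replace($curly, "'", $text);

echo bin2hex($curly), "\n"; // e28099
echo $clean, "\n";          // l'apostrophe
```

A byte-level str_replace() avoids any dependence on regex modifiers or on the encoding of the PHP source file itself.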
__________________
KOR
Offshore programming
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
12-12-2011, 04:12 PM | #2
Kor
Hm...

I ended up using this function:
Code:
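function convert_ascii($string)
{
  // Each $search entry is the UTF-8 byte sequence of one character;
  // the $replace entry at the same index is its ASCII substitute.
  $search  = array();
  $replace = array();

  // Replace Single Curly Quotes (U+2018, U+2019)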
  $search[]  = chr(226).chr(128).chr(152);
  $replace[] = "'";
  $search[]  = chr(226).chr(128).chr(153);
  $replace[] = "'";

  // Replace Smart Double Curly Quotes
  $search[]  = chr(226).chr(128).chr(156);
  $replace[] = '"';
  $search[]  = chr(226).chr(128).chr(157);
  $replace[] = '"';

  // Replace En Dash
  $search[]  = chr(226).chr(128).chr(147);
  $replace[] = '--';

  // Replace Em Dash
  $search[]  = chr(226).chr(128).chr(148);
  $replace[] = '---';

  // Replace Bullet
  $search[]  = chr(226).chr(128).chr(162);
  $replace[] = '*';

  // Replace Middle Dot
  $search[]  = chr(194).chr(183);
  $replace[] = '*';

  // Replace Ellipsis with three consecutive dots
  $search[]  = chr(226).chr(128).chr(166);
  $replace[] = '...';

  // Apply Replacements
  $string = str_replace($search, $replace, $string);

  // Remove any remaining non-ASCII characters
  // (disabled: it would also strip accented letters such as é)
  //$string = preg_replace("/[^\x01-\x7F]/","", $string);

  return $string; 
}
It solves almost all of my problems. I still have to handle the ligatures, but that is easy.
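
For the ligatures, something along these lines might do. This is only a sketch: replace_ligatures() is a hypothetical helper, and the list of ligatures is an assumption about what the data contains, so it should be adjusted to the actual text.

```php
<?php
// Hypothetical helper: maps a few UTF-8 ligatures to plain ASCII letter pairs.
function replace_ligatures($string)
{
    $map = array(
        "\xC5\x93"     => 'oe', // U+0153 latin small ligature oe
        "\xC5\x92"     => 'OE', // U+0152 latin capital ligature OE
        "\xEF\xAC\x81" => 'fi', // U+FB01 latin small ligature fi
        "\xEF\xAC\x82" => 'fl', // U+FB02 latin small ligature fl
    );

    // strtr() with an array replaces every key by its value in one pass.
    return strtr($string, $map);
}

echo replace_ligatures("c\xC5\x93ur"), "\n"; // coeur
```

strtr() is used here instead of two parallel arrays so that each character and its substitute sit on the same line.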

Do you have any other ideas?
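
One thing that may help when hunting for characters the function still misses: instead of stripping every non-ASCII character, list the ones that remain and decide case by case. The sketch below assumes the input is valid UTF-8; find_non_ascii() and the sample string are hypothetical.

```php
<?php
// Hypothetical helper: returns the distinct non-ASCII characters
// found in a UTF-8 string, in order of first appearance.
function find_non_ascii($string)
{
    // The /u modifier makes PCRE treat the subject as UTF-8, so each
    // match is one whole character rather than a single byte.
    preg_match_all('/[^\x00-\x7F]/u', $string, $matches);

    return array_values(array_unique($matches[0]));
}

// Hypothetical sample: "cafe" with e-acute, an em dash, then "deja" with accents.
$sample = "caf\xC3\xA9 \xE2\x80\x94 d\xC3\xA9j\xC3\xA0";

// Print each leftover character next to its UTF-8 bytes in hex.
foreach (find_non_ascii($sample) as $char) {
    echo $char, ' => ', bin2hex($char), "\n";
}
```

The hex output can be copied straight into new chr() or "\x.." entries for convert_ascii().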