Temporarily set relevant line terminators for fputcsv and fgetcsv?



MattyUK
05-30-2011, 07:01 PM
How can I temporarily set the relevant line terminators for fputcsv and fgetcsv?

CR = \r = Carriage Return = Chr(13)
LF = \n = Line Feed, NewLine = Chr(10)
Windows uses CRLF for newlines, and Excel uses a bare LF for line breaks within a CSV field (ALT+ENTER in Excel).
Linux uses LF for newlines.

I want LFs supported inside CSV fields, and CRLF for line terminators.
My Linux server wants to use LFs for line terminators.

I need to use fputcsv and fgetcsv to write arrays to CSV and read them back again. The stored CSV will eventually be imported into Excel.

As you can imagine, these competing requirements for LFs don't sit well together, and Excel imports go badly.

So to summarize: I have PHP array data, from an explode operation and other sources, that contains LFs. I need to store it in CSV (or another suitable file container) using CRLFs for line terminators, and retain compatibility with fputcsv and fgetcsv.

I'm struggling with how to reconcile this conflicting requirement over LFs and can't figure out how to set the relevant line terminators for fputcsv and fgetcsv.

I'm hoping for a full example or pointers to a tutorial that covers this mismatch. Any advice?
What is everyone else doing?
What other problems with this approach are likely?
Is there a better route forward that can support the above requirements, whilst still using a file container and not a MySQL database?
Should I write my own fputcsv and fgetcsv functions to cover the requirement or use PEAR Datagrids?


Yours hopefully
Matty

Fumigator
05-31-2011, 02:46 PM
If you convert the alt+enter markers to some other placeholder byte (regex replace), create the .csv file, then convert them back to alt+enter bytes again, does that get you through the problem?

Apart from that I'd say you may need to code your own csv handlers.
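
The placeholder idea above could be sketched roughly as follows. This is only an illustration: the names LF_PLACEHOLDER, hide_linefeeds and restore_linefeeds are made up for the example, and it assumes the chosen placeholder byte (ASCII 0x1F here) never occurs in the real data.

```php
<?php
// Hypothetical placeholder byte (ASCII Unit Separator); assumes it never
// appears in the real field data.
const LF_PLACEHOLDER = "\x1F";

// Swap every LF inside the fields for the placeholder before fputcsv().
function hide_linefeeds(array $fields): array
{
    return array_map(function ($field) {
        return str_replace("\n", LF_PLACEHOLDER, (string) $field);
    }, $fields);
}

// Put the LFs back after fgetcsv() has parsed the record.
function restore_linefeeds(array $fields): array
{
    return array_map(function ($field) {
        return str_replace(LF_PLACEHOLDER, "\n", (string) $field);
    }, $fields);
}
```

Used as fputcsv($fh, hide_linefeeds($row)) on the way out and restore_linefeeds(fgetcsv($fh)) on the way back in, the only LF left in the stored file is the record terminator. Anything that opens the file directly, such as Excel, will still see the placeholder rather than a line break.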

Fou-Lu
05-31-2011, 06:44 PM
If you convert the alt+enter markers to some other placeholder byte (regex replace), create the .csv file, then convert them back to alt+enter bytes again, does that get you through the problem?

Apart from that I'd say you may need to code your own csv handlers.

Own handlers may make the most sense IMO. It's trivial to use an fread and detect a \n preceded by a \r. That marks the difference between the end of a record and data within the record.
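
One possible shape for such a handler, staying close to the built-in functions rather than parsing by hand: a minimal sketch, assuming the default comma delimiter and double-quote enclosure. The function name fputcsv_crlf is hypothetical. It lets fputcsv do the quoting into a temporary buffer, then swaps only the final LF of the record for CRLF, so LFs inside quoted fields are untouched.

```php
<?php
// Write one record with fputcsv(), then replace its terminating LF with CRLF.
// LFs inside quoted fields are left exactly as they are.
function fputcsv_crlf($handle, array $fields)
{
    $buffer = fopen('php://temp', 'r+');
    fputcsv($buffer, $fields, ',', '"', '\\');
    rewind($buffer);
    $record = stream_get_contents($buffer);
    fclose($buffer);

    // fputcsv() ends the record with exactly one "\n"; swap only that one.
    return fwrite($handle, substr($record, 0, -1) . "\r\n");
}
```

On the reading side a custom parser may not be needed: fgetcsv already treats a line break inside a quoted field as part of the field and accepts CRLF at the end of a record, so a file written this way should read back with plain fgetcsv calls.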
Alternatives include custom Random Access Files, which specify a maximum and consistent size for the strings (so each record would occupy, say, 400 bytes, with the benefit that a known record size lets you use fseek), and XML manipulated at the DOM level (fairly easy to use, simple mid-insertions, but a memory and storage hog). Both of these still use a direct filesystem wrapper, so you don't need to implement a database at this point.
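
To make the fixed-size record idea concrete, here is a rough sketch using the 400-byte figure from the example above. The names RECORD_SIZE, raf_write and raf_read are invented for illustration, and the NUL padding is an assumption: it only works if real data never ends in a NUL byte.

```php
<?php
// Every record occupies exactly this many bytes (figure taken from the post).
const RECORD_SIZE = 400;

// Store $data in slot $index, padded with NUL bytes to the fixed size.
function raf_write($handle, int $index, string $data): void
{
    if (strlen($data) > RECORD_SIZE) {
        throw new LengthException('Record exceeds ' . RECORD_SIZE . ' bytes');
    }
    fseek($handle, $index * RECORD_SIZE);
    fwrite($handle, str_pad($data, RECORD_SIZE, "\0"));
}

// Jump straight to slot $index and strip the NUL padding.
function raf_read($handle, int $index): string
{
    fseek($handle, $index * RECORD_SIZE);
    return rtrim(fread($handle, RECORD_SIZE), "\0");
}
```

Because the offset of a record is just its index times the record size, LFs or CRLFs inside the data need no escaping at all. The cost is wasted space and a hard upper limit on record length.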

Fumigator
05-31-2011, 07:18 PM
Sheesh I never thought I'd see a VSAM/ISAM file again, but here you are talking about one in PHP :D

Fou-Lu
05-31-2011, 08:01 PM
Sheesh I never thought I'd see a VSAM/ISAM file again, but here you are talking about one in PHP :D

Hah, yep but RAF files are awfully handy at times. Of course, in this day and age: db's all the way.

MattyUK
06-01-2011, 03:29 AM
If you convert the alt+enter markers to some other placeholder byte (regex replace), create the .csv file, then convert them back to alt+enter bytes again, does that get you through the problem?
Well, partly. I'm doing that at the moment and dropping in CR instead. OpenOffice Calc handles it, but users will be using Excel, which ignores CR and thus strips out the multiple lines.

When the file is downloaded and imported into Excel there may not be a chance to convert back, unless I force a download wrapper that does the conversion at that time. If I do that, I may read the CSV into an array, replace the CRs with LFs and the record-ending LFs with CRLFs, then use PEAR Spreadsheet_Excel_Writer to stream an XLS file.

I've also come across Structures_DataGrid from PEAR, which looked promising as it can take in an array OR XLS and export back out an XLS. I feared that as the file got larger it would have to be read fully each time and slow things down.


Apart from that I'd say you may need to code your own csv handlers.

Own handlers may make the most sense IMO. It's trivial to use an fread and detect a \n preceded by a \r. That marks the difference between the end of a record and data within the record.

I agree. I will go down this route. I'd better worry about sanitizing strings and string locale encoding again. Might be back on another thread about that... never really got to grips with it.


Alternatives include custom Random Access Files, which specify a maximum and consistent size for the strings (so each record would occupy, say, 400 bytes, with the benefit that a known record size lets you use fseek), and XML manipulated at the DOM level (fairly easy to use, simple mid-insertions, but a memory and storage hog). Both of these still use a direct filesystem wrapper, so you don't need to implement a database at this point.
XML I am avoiding for the reasons you stated. RAF files I have never heard of, but they sound like fixed-width files to some degree. I'll read up on them, but the incoming data strings will vary in size and be of as yet unknown lengths.

Assuming my coming education on RAF files doesn't surprise me, writing my own versions of fputcsv and fgetcsv seems like the best way forward. Thanks for the input. I was really hoping there was an ini setting or constant I could change for the operation and then change back again afterward.

Best Wishes
Matty
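
For anyone finding this thread later: there is still no ini setting, but from PHP 8.1 onward fputcsv accepts an optional end-of-line argument, which covers the per-call terminator asked about here. A short sketch, assuming PHP 8.1 or newer (it will not run on older versions); the sample row is made up.

```php
<?php
// Requires PHP 8.1 or later: the sixth argument sets the record terminator.
$fh = fopen('php://memory', 'r+');
fputcsv($fh, ['42', "first line\nsecond line"], ',', '"', '\\', "\r\n");

// The LF inside the quoted field is kept; only the record ends in CRLF.
rewind($fh);
$csv = stream_get_contents($fh);
```

On earlier versions the terminator is always a single LF, so a wrapper or custom writer is still needed there.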


