07-08-2009, 07:29 AM
The age-old issue of string parsing comes up again ...
I have a text file that contains lines that are SUPPOSED to follow a set format, specifically:
string, string, long string int string double int

The delimiters are therefore:
Comma (,) for the first two fields
Spaces for all other fields

Strings like this would be valid:
Jon, Jack, 100 CPN 5 KTE 1.00 10
Jon,   Jack   100   CPN   5 KTE 1.00 10 // notice the extra spaces

Whereas something like these would be considered invalid:
Jon Jack 100 CPN 5 KTE 1.00 10 // missing the commas
Jon, Jack, 100 CPN 5 KTE 1.00 // missing the last field "10"
Jon, Jack, 100CPN 5 KTE 1.00 10 // missing space between "100" and "CPN"

The goal is to EXTRACT each section and store them, and if possible determine when a string is INVALID (does not follow format).
I have a class with the following data members:

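#include <string>
using std::string;

// Record (named Record here because a member cannot share the class's name)
class Record
{
public:
    string A;
    string B;
    long C;
    string D;
    string E;
    string F;
    double G;
    int H;

    Record(string sLine); // constructor
};

Record::Record(string sLine)
{
    // somehow parse the string here and determine if it is valid //
}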

So, how can I parse the string (sLine) and extract each piece into its component (A, B, C, D, E, F, G, H)...
I was thinking of using the old method of simply doing substring searches, but I find it very error-prone and long-winded ... is there a better way to accomplish this?

Anything anyone would recommend?
Any help would be much appreciated...

07-08-2009, 05:23 PM
You could do it with strtok (http://www.cplusplus.com/reference/clibrary/cstring/strtok/). First tokenize on the comma to pull out the first two fields. What remains is the end part of the line with the numbers, which you can then tokenize on spaces. Keep in mind that strtok modifies the buffer you give it, so run it on a writable copy of the string.
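A rough sketch of that idea, in case it helps. Nothing here comes from the original post except the field names A..H: ParsedLine, parseLine, trim, toLong and toDouble are made-up names. E is kept as a string to match the class, even though the format line lists it as an int. Treat it as a starting point under those assumptions, not a finished parser.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical holder mirroring the members A..H from the question.
struct ParsedLine {
    std::string A, B;
    long C;
    std::string D, E, F;
    double G;
    int H;
};

// Strip leading and trailing spaces/tabs from a token.
static std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    return s.substr(first, s.find_last_not_of(" \t") - first + 1);
}

// Convert a whole token to long; rejects trailing junk such as "100CPN".
static bool toLong(const char* tok, long& out) {
    char* end = 0;
    out = std::strtol(tok, &end, 10);
    return end != tok && *end == '\0';
}

// Convert a whole token to double; rejects trailing junk.
static bool toDouble(const char* tok, double& out) {
    char* end = 0;
    out = std::strtod(tok, &end);
    return end != tok && *end == '\0';
}

// Returns true and fills 'out' only if the line matches the format.
bool parseLine(const std::string& line, ParsedLine& out) {
    // strtok writes '\0' into its input, so work on a mutable copy.
    std::vector<char> buf(line.begin(), line.end());
    buf.push_back('\0');

    // The first two fields are comma-delimited.
    char* a = std::strtok(&buf[0], ",");
    char* b = std::strtok(NULL, ",");
    if (!a || !b) return false;               // fewer than two commas

    // The remaining six fields are space-delimited.
    char* t[6];
    for (int i = 0; i < 6; ++i) {
        t[i] = std::strtok(NULL, " ");
        if (!t[i]) return false;              // too few fields
    }
    if (std::strtok(NULL, " ")) return false; // too many fields

    ParsedLine r;
    long h = 0;
    r.A = trim(a);
    r.B = trim(b);
    if (r.A.empty() || r.B.empty()) return false;
    if (!toLong(t[0], r.C)) return false;
    r.D = t[1];
    r.E = t[2];
    r.F = t[3];
    if (!toDouble(t[4], r.G)) return false;
    if (!toLong(t[5], h)) return false;
    r.H = static_cast<int>(h);                // no range check in this sketch

    out = r;
    return true;
}
```

The constructor could call parseLine(sLine, fields) and set a valid flag from the return value. Two caveats: strtok collapses runs of delimiters, so extra spaces are tolerated but an empty field like "Jon,,Jack" gets skipped silently instead of rejected, and strtok keeps internal state, so it is not safe to use from several threads at once.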

