View Full Version : C++: Ignoring spaces when reading a file?



rjoiram
06-25-2007, 09:37 PM
All,

I have declared a struct, and a vector of that struct's type.
As my program reads the file, the vector's size is increased by one, and the information, which is supposed to be separated by tabs, is put into the attributes(?) of the struct:



while(!inFile.eof()){

areas.resize(areas.size()+1);
inFile>>areas[i].mc;
inFile>>areas[i].st;
inFile>>areas[i].ad;
inFile>>areas[i].lp;
inFile>>areas[i].cp;
inFile>>areas[i].cd;
inFile>>areas[i].sb;
inFile>>areas[i].dm;
inFile>>areas[i].tp;
inFile>>areas[i].yb;
inFile>>areas[i].sz;
inFile>>areas[i].br;
inFile>>areas[i].fb;
inFile>>areas[i].hb;

}


The problem is that some of the data includes spaces and is therefore treated as separate elements. My question: Is there any way to ignore spaces and only split the file on tabs (\t)?

Any help would be greatly appreciated,
Mario

ralph l mayo
06-25-2007, 10:42 PM
You may find getline() to work better for you than >> since it accepts a delimiter argument.

Or, you could overload operator>> for the type of your records, in which case the code you posted should start working like you'd like. Here's a simple example assuming string records:



#include <iostream>
#include <fstream>
#include <string>
#include <sstream>

std::istream& operator>>(std::istream& in, std::string& str)
{
char ch;
std::stringstream ss;

// Test the read itself so a failed get() never appends a stale char
while (in.get(ch))
{
// Return a record at the delimiter
if (ch == '\t')
{
str = ss.str();
return in;
}
// Accumulate chars
ss << ch;
}

// Ran out of stream, this record may be incomplete
str = ss.str();
return in;
}

int main(int argc, char* argv[])
{
if (argc != 2)
{
std::cout << argv[0] << " <filename>" << std::endl;
return 0;
}

std::ifstream ifh(argv[1]);
if (!ifh.good())
{
std::cerr << "Couldn't open " << argv[1] << " for reading." << std::endl;
return 1;
}

std::string str;
while (ifh)
{
ifh >> str;
std::cout << str << std::endl;
}

return 0;
}

rjoiram
06-26-2007, 09:16 PM
Hey, thanks for the reply,

If I can figure it out, I think I'll use getline(). I don't want to change >> because I'm also collecting data from the user. What I can't figure out is how to get getline() to work.



string s;
while(getline(inFile, s)){
//store lines of txt file here
}


But then how do I separate it into a struct based on tabs? (the tab character, \t)

-Thanks for help

ralph l mayo
06-26-2007, 11:21 PM
use the three argument form: getline(source, dest, '\t')

(I don't know why I didn't use that instead of the character loop in the post above, especially since I mentioned it in the post :confused: )
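
A minimal sketch of the three-argument form, assuming the line has already been read into a string. The name split_tabs is made up for illustration; std::getline and std::istringstream are standard. Because only '\t' is treated as a delimiter, spaces stay inside the fields.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split one line into fields at each tab.
// Spaces are ordinary characters here, so "row 1 col 1" stays one field.
std::vector<std::string> split_tabs(const std::string& line)
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    // The third argument replaces the default '\n' delimiter with '\t'.
    while (std::getline(in, field, '\t'))
    {
        fields.push_back(field);
    }
    return fields;
}
```

An empty column in the middle of a line comes back as an empty string. A line that ends in a tab does not yield a trailing empty field with this loop, so that case would need separate handling.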

rjoiram
06-27-2007, 03:17 AM
OK,

The problem is that each line in the text file is a record, and each field within the record is separated by a tab. So basically I need to read the file line by line and then split each record into fields by tabs.

By default, the third parameter of getline() is a newline; here it would be changed to a tab. So there is a problem, because the data needs to be separated first by \n and then by \t, ignoring the spaces ' '.

I'm sorry but I guess my original post was confusing and this was probably how I should have asked in the first place.

Additional help greatly appreciated,
-Mario
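
One way to get the "first by \n, then by \t" behaviour described above is to overload >> for the record type itself, which leaves >> for ordinary strings and user input untouched. This is only a sketch: the Record struct with its three members and the helper read_records are assumptions for illustration, not the real struct from the first post.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record with three tab-separated fields.
struct Record
{
    std::string one;
    std::string two;
    std::string three;
};

// Read one whole line, then split that line on tabs only.
std::istream& operator>>(std::istream& in, Record& rec)
{
    std::string line;
    if (std::getline(in, line))          // outer split: '\n'
    {
        std::istringstream fields(line);
        Record parsed;                   // missing fields stay empty
        std::getline(fields, parsed.one, '\t');    // inner split: '\t'
        std::getline(fields, parsed.two, '\t');
        std::getline(fields, parsed.three, '\t');
        rec = parsed;
    }
    return in;
}

// Hypothetical helper: collect every record until the stream runs out.
std::vector<Record> read_records(std::istream& in)
{
    std::vector<Record> records;
    Record rec;
    while (in >> rec)
    {
        records.push_back(rec);
    }
    return records;
}
```

With this in place, the reading loop can test the extraction directly instead of checking eof(), and the vector grows with push_back rather than resize. A blank line in the file produces a record whose fields are all empty.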

rjoiram
06-27-2007, 04:01 AM
Again... OK,

I've been working on this using getline():



#include<iostream>
#include<fstream>
#include<vector>
#include<stdlib.h>
#include<string>
using namespace std;

struct tests {

string one;
string two;
string three;

};

int main() {

ifstream inFile("test.txt");
vector<string> lines(0);

string s;
int i = 0;

// Separate by newline.
while(getline(inFile, s)){

lines.resize(lines.size()+1);
lines[i] = s;
i++;

}

// Divide into tabs.
vector<tests> test(lines.size());

for(int j=0; j<test.size(); j++){

string temp = "";

bool oneset = false;
bool twoset = false;
bool threeset = false;

for(int x=0; x<lines[j].size(); x++){

if(oneset == false){

if(lines[j][x] == '\t'){

test[j].one = temp;
oneset = true;

} else {

temp += lines[j][x];

}

} else if(oneset == true && twoset == false){

if(lines[j][x] == '\t'){

test[j].two = temp;
twoset = true;

} else {

temp += lines[j][x];

}

} else if(oneset == true && twoset == true && threeset == false){

if(lines[j][x] == '\t'){

test[j].two = temp;
threeset = true;

} else {

temp += lines[j][x];

}

}

}

}

// Reassign null values.
for(int q = 0; q<test.size(); q++){

if(test[q].one == ""){

test[q].one == "_";

}

if(test[q].two == ""){

test[q].two == "_";

}

if(test[q].three == ""){

test[q].three == "_";

}

}

// Output the data.
for(int p = 0; p<test.size(); p++){

cout<<"Data "<<p<<": One: "<<test[p].one<<" Two: "<<test[p].two<<" Three: "<<test[p].three<<endl;

}

// Add a pause so the prompt doesn't close.
int w;
cin>>w;

return(0);

}


It still doesn't work and I can't figure it out. For some reason I can't select and copy the output so I took a screenshot.

[Attachment 5397: screenshot of the program output]

The text file looks like this:



row 1 col 1 row 1 col 3
row 2 col 2 row 2 col 3
row 3 col 1 row 3 col 1


And here is the same but with a '|' after each tab.



row 1 col 1 | |row 1 col 3
|row 2 col 2 |row 2 col 3
row 3 col 1 |row 3 col 1 |


The overall problem seems to be... well, I can't even figure out what's going wrong.

Appreciatively,
-Mario

ralph l mayo
06-27-2007, 08:53 AM
Give this a shot



#include <iostream>
#include <fstream>
#include <string>
#include <vector>

typedef std::vector<std::string> Columns;

Columns split_record(const std::string& record)
{
Columns rv;
std::string::size_type last_pos = 0;

for (
std::string::size_type pos = record.find_first_of('\t');
pos != record.npos;
pos = record.find_first_of('\t', last_pos))
{
// I'm not sure if the "_" for blank columns is just for debug or
// not, if so you can remove the ternary and just push the substr
rv.push_back(
(pos - last_pos == 0)
? "_"
: record.substr(last_pos, pos - last_pos)
);
last_pos = pos + 1;
}

rv.push_back(
(record.length() - last_pos == 0)
? "_"
: record.substr(last_pos, record.length() - last_pos)
);
return rv;
}

int main(int argc, char* argv[])
{
if (argc != 2)
{
std::cout << argv[0] << " <filename>" << std::endl;
return 0;
}

std::ifstream ifh(argv[1]);
if (!ifh.good())
{
std::cerr << "Couldn't open " << argv[1] << " for reading." << std::endl;
return 1;
}

int rec_num = 0, col_num = 0; // initialize both counters
std::string str;
Columns cols;
while (getline(ifh, str))
{
std::cout << "Record " << ++rec_num << ": " << std::endl;
std::cout << str << std::endl;
std::cout << "\tColumns: " << std::endl;
cols = split_record(str);
col_num = 0;
for (Columns::const_iterator ci = cols.begin(); ci != cols.end(); ++ci)
{
std::cout << "\t\t" << ++col_num << ": " << *ci << std::endl;
}
}

return 0;
}

rjoiram
06-27-2007, 04:16 PM
OK, so I tried this and it compiles without errors, but then when it runs the prompt closes before I can see anything. Any ideas?

I tried to put this at the end:



int pause;
std::cin>>pause;


But still no luck.

Thanks,
Mario

ralph l mayo
06-27-2007, 08:46 PM
I don't know, it pauses for me when I add that before return 0;

You might consider just opening a command prompt and running it from there, where it won't close afterwards regardless.

edit: it probably closes because you didn't provide a filename or it couldn't open the file. Am I doing homework here? Seemed legit to begin with :[

Gox
06-27-2007, 09:32 PM
If you're using Dev C++ try putting a system("PAUSE") in your main. This may help to keep the console window open.



#include <cstdlib> // declares system()

int main()
{
//code

system("PAUSE"); // Windows-only: runs the shell's PAUSE command
return 0;
}

rjoiram
06-27-2007, 10:40 PM
No homework, school ended weeks ago...

rjoiram
06-27-2007, 10:41 PM
I am using Dev C++ and I added system("PAUSE") but still no luck.

rjoiram
06-27-2007, 10:47 PM
OK, I tried to output the results to a file, but when I used an ofstream object my compiler complained: "`ofstream' undeclared (first use this function)". I checked and fstream is included, so I don't know what's wrong.

ralph l mayo
06-27-2007, 10:57 PM
std::ofstream

rjoiram
06-27-2007, 11:08 PM
That compiled, but the text file is empty...

I simply changed std::cout to outFile (which is my ofstream object)



while (getline(ifh, str))
{
outFile << "Record " << ++rec_num << ": " << std::endl;
outFile << str << std::endl;
outFile << "\tColumns: " << std::endl;
cols = split_record(str);
col_num = 0;
for (Columns::const_iterator ci = cols.begin(); ci != cols.end(); ++ci)
{
outFile << "\t\t" << ++col_num << ": " << *ci << std::endl;
}
}


I've been doing a lot of research, and something that may be easier: instead of trying to split up the information straight from the file as it is read, would it be better to store all of the information first and then split it up by tabs from there?

-Mario

ralph l mayo
06-28-2007, 02:48 AM
It's difficult to tell where you're going wrong when you don't post all the code.

The last sample I posted *does* read in the data by line and then split it by tabs.

Try changing the code in my last post so that the main function looks like this (leave the utility function as-is):



int main(int argc, char* argv[])
{
if (argc < 2)
{
std::cout << argv[0] << " infile [outfile]" << std::endl;
return 0;
}

std::ifstream ifh(argv[1]);
if (!ifh.good())
{
std::cerr << "Couldn't open " << argv[1] << " for reading." << std::endl;
return 1;
}

std::ostream* out = (argc > 2) ? new std::ofstream(argv[2]) : &std::cout;
if (!out->good())
{
std::cerr << "Couldn't open output" << std::endl;
return 1;
}

int rec_num = 0, col_num = 0; // initialize both counters
std::string str;
Columns cols;
while (getline(ifh, str))
{
*out << "Record " << ++rec_num << ": " << std::endl
<< str << std::endl
<< "\tColumns: " << std::endl;
cols = split_record(str);
col_num = 0;
for (Columns::const_iterator ci = cols.begin(); ci != cols.end(); ++ci)
{
*out << "\t\t" << ++col_num << ": " << *ci << std::endl;
}
}
if (out != &std::cout)
delete out;

return 0;
}


You have to supply an infile on the command line, and if you supply an outfile it is written to instead of stdout:


steve@lee:~/devel/tmp$ cat in
1, 1 1, 2 1, 3
2, 3
3, 2
steve@lee:~/devel/tmp$ g++ -Wall -pedantic ./rf.cpp -o rf && ./rf in
Record 1:
1, 1 1, 2 1, 3
Columns:
1: 1, 1
2: 1, 2
3: 1, 3
Record 2:
2, 3
Columns:
1: _
2: _
3: 2, 3
Record 3:
3, 2
Columns:
1: _
2: 3, 2
3: _
steve@lee:~/devel/tmp$ ./rf in out
steve@lee:~/devel/tmp$ cat out
Record 1:
1, 1 1, 2 1, 3
Columns:
1: 1, 1
2: 1, 2
3: 1, 3
Record 2:
2, 3
Columns:
1: _
2: _
3: 2, 3
Record 3:
3, 2
Columns:
1: _
2: 3, 2
3: _

rjoiram
06-29-2007, 01:09 AM
I was still having a lot of problems, mostly resulting from the "std::". I don't really understand what it is or does, and that's not how we learned it in school...

I did have success however with Boost's library if you want to try that: http://www.boost.org

I got it to work in Dev C++ by creating a new project, modifying the properties and under the Directories tab in the Include Directories tab adding the directory where the Boost Installer installed the libraries.

Thanks for all your help,

Mario

ralph l mayo
06-30-2007, 08:23 PM
The :: operator is the scope resolution operator; prefixing std:: to something tells the compiler to look for it in the namespace std. If you're not familiar with it from school, you probably used 'using namespace std;' at the top of your programs to dump everything from std into the global namespace, which has its pitfalls. (http://www.parashift.com/c++-faq-lite/coding-standards.html#faq-27.5)
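
A small sketch of the options, assuming a made-up namespace demo and a made-up function build_message. It shows a fully qualified name next to a using-declaration, which imports a single name instead of the whole namespace.

```cpp
#include <sstream>
#include <string>

namespace demo
{
    // Hypothetical function living in its own namespace.
    std::string greet(const std::string& name)
    {
        return "Hello, " + name;
    }
}

std::string build_message()
{
    using std::ostringstream;      // using-declaration: imports only this name
    ostringstream out;             // found through the declaration above
    out << demo::greet("Mario");   // qualified: look up greet in namespace demo
    return out.str();
}
```

Here std::string is spelled out in full while ostringstream is not, and both refer to names in namespace std. Nothing else from std is pulled into scope, which is the main difference from 'using namespace std;'.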

rjoiram
06-30-2007, 11:50 PM
That was actually interesting. Yes, we used "using namespace std;" at school. I have a question though:

At school we only added using namespace std; when the <string> library was included. But for other programs we didn't have to type out std::cout either. Do you know why?

By the way, I can actually understand some of your code now. :)

Thanks,
Mario

ralph l mayo
07-01-2007, 12:31 AM
Did you include <iostream> or <iostream.h>? The latter declares its names in the global namespace, which makes unadorned cout and the like acceptable, but it's deprecated and not standard compliant in modern C++.

edit: iostream.h has been deprecated in C++ for "more than five years" (http://www.devx.com/tips/Tip/14447), and the dateline from that article is 2001, pushing it back to more than a decade ago. But I guess a lot of the time when you learn a language in school you learn a time-capsule version from whenever the instructor got their experience in it.

rjoiram
07-01-2007, 02:21 AM
Yeah, we used <iostream.h>.
Wow, I hate my teacher, she taught us all the wrong stuff... :mad:

Thanks for all your help and replies,
Mario


