I have large .txt files with more than a million lines and 7 columns of floating-point numbers per line. The columns are separated by spaces.
Currently, I import the files by reading each line with getline, turning the line into a stringstream, and then extracting the seven values into array variables (please see my minimal example below). However, this procedure is quite slow: it takes around 10 minutes for 3 million lines (500 MB), which corresponds to roughly 0.8 MB/s and is much slower than writing the files takes. My drive is an SSD.
Can you give me advice on how to improve the efficiency of the code?
Best regards, Fabian
C++
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
struct Container { double a, b, c, d, e, f, g; };

void read_my_file(std::ifstream &file, Container *&data) {
    std::string line;
    std::stringstream line_as_stream;
    unsigned long int row;

    data = new Container[300000]; // dynamically allocated because the
                                  // length is usually a user input

    for (row = 0; row < 300000; row++) {
        getline(file, line);
        line_as_stream.str(line);

        // extract the seven space-separated values of this line
        line_as_stream >> data[row].a;
        line_as_stream >> data[row].b;
        line_as_stream >> data[row].c;
        line_as_stream >> data[row].d;
        line_as_stream >> data[row].e;
        line_as_stream >> data[row].f;
        line_as_stream >> data[row].g;

        line_as_stream.clear(); // reset the stream state (eofbit) for the next line
    }
}
int main(void) {
    Container *data = nullptr;
    std::ifstream file;

    file.open("./myfile.txt", std::ios::in);
    read_my_file(file, data);
    std::cout << data[2].b << "\n";
    file.close();

    return 0;
}
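
Edit: For reference, here is a minimal sketch of one possible alternative that keeps the getline loop but avoids building a stringstream for every line, parsing the fields with std::from_chars instead. It assumes C++17 and a standard library that implements from_chars for double (e.g., recent GCC or MSVC); parse_line is just an illustrative helper and not part of my actual code, and I have not benchmarked it.

C++
#include <charconv>      // std::from_chars (C++17)
#include <cstddef>
#include <fstream>
#include <iostream>
#include <string>
#include <system_error>  // std::errc

struct Container { double a, b, c, d, e, f, g; };

// Parse seven space-separated doubles from one line without a stringstream.
// Returns false if a field is missing or malformed.
bool parse_line(const std::string &line, Container &out) {
    const char *p = line.data();
    const char *end = p + line.size();
    double *fields[7] = { &out.a, &out.b, &out.c, &out.d,
                          &out.e, &out.f, &out.g };
    for (double *field : fields) {
        while (p < end && *p == ' ') ++p;              // from_chars does not skip whitespace
        auto result = std::from_chars(p, end, *field); // locale-independent conversion
        if (result.ec != std::errc()) return false;    // conversion failed
        p = result.ptr;                                 // continue after the parsed number
    }
    return true;
}

int main() {
    std::ifstream file("./myfile.txt");
    std::string line;
    Container row{};
    std::size_t count = 0;

    while (std::getline(file, line)) {
        if (!parse_line(line, row)) break;
        ++count; // in real code, store `row` into a pre-allocated buffer here
    }

    std::cout << "parsed " << count << " rows\n";
    return 0;
}

The sketch is only meant to show the structure (no per-line stringstream, no locale machinery); whether it is actually faster on my files is exactly what I am asking about.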