[update: As I hoped, there is a simple answer, getline(), see comments, thank you Charles Lecklider]
The classic is fgets(), it is simple, and easy to use...
Of course, for some reason, fgets() keeps the line ending in the buffer, so I usually end up adding a step to strip it.
The problem, of course, is that you have to pick a maximum line length. This is also an advantage in that you constrain the lines and don't have random memory allocation issues, but computers have so much memory and VM these days. How many times have I seen this code, and seen someone have to change 1000 to 10000 one day?
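A minimal sketch of that fgets() pattern, for illustration only. The helper name read_line_fixed is made up here, and the caller supplies the fixed-size buffer, so any line longer than the buffer is simply split across calls.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): read one line into a
   caller-supplied buffer and strip the trailing "\n" or "\r\n".
   Returns buf, or NULL at EOF or on a read error. */
char *read_line_fixed(char *buf, size_t size, FILE *f)
{
    if (!fgets(buf, (int)size, f))
        return NULL;
    /* strcspn() gives the index of the first '\r' or '\n' (or of the
       terminating NUL if there is none), so this truncates the ending. */
    buf[strcspn(buf, "\r\n")] = '\0';
    return buf;
}
```

Called in a loop with something like char buf[1000], this is exactly where the "change 1000 to 10000" problem comes from.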
What I would like is a simple function that reads a line and mallocs space as needed. Indeed, it could return the allocated space or NULL for error (EOF, or malloc fail). You'd have to free it, but no big issue. Would also be nice if (a) it stripped the line ending as I literally NEVER want that in the line, and (b) seamlessly handled bloody DOS style carriage returns...
So whilst trying to explain some basic C to my mates, whilst at sea, in the middle of the Atlantic, I tried to explain this whilst making a simple CSV file parsing program for them. We did some googling, and found that I am not alone in trying to find such a function. It seems that fscanf() may be the answer. [update: clearly I did not google well enough!]
To be honest fscanf() is a function I just don't use enough. It is very powerful, but I always find myself parsing things more directly. However, I had not considered it as a means to just get a line.
The magic incantation is a format along the lines of %m[^\n].
This reads any characters up to a newline, allocates space (that is what m is), and stores a pointer to it in line. Just what we need. A minor variation to handle carriage returns seems to work too...
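As a rough sketch of the idea, assuming a C library that supports the POSIX.1-2008 m modifier (glibc does): the helper name read_line_scanf and the way the line ending is consumed afterwards are assumptions for illustration, not the code from the original post. Note that the [ conversion needs at least one character, so an empty line has to be handled separately.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line with fscanf() and the "m" modifier.
   Returns a malloc'd string without the line ending (caller frees),
   or NULL at EOF or if allocation fails. */
char *read_line_scanf(FILE *f)
{
    char *line = NULL;
    int n = fscanf(f, "%m[^\r\n]", &line);
    if (n == EOF)
        return NULL;
    if (n == 0) {
        /* Matching failure: the line is empty, so nothing was allocated. */
        line = malloc(1);
        if (!line)
            return NULL;
        *line = '\0';
    }
    /* Consume the line ending by hand: an optional '\r', then '\n'. */
    int c = fgetc(f);
    if (c == '\r')
        c = fgetc(f);
    if (c != '\n' && c != EOF)
        ungetc(c, f);
    return line;
}
```

Consuming the line ending with fgetc() rather than inside the format string is a deliberate choice in this sketch, because white space in a scanf format matches any run of white space, not just one newline.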
Bingo, we have our magic line malloc file reader function. Perfect.
And get this, reading the man page it is clear that the [ conversion does not skip leading white space, which is perfect... So all good.
Except that is not what happens. We did the CSV stuff, and then went on to TSV (tab separated) and magically leading TABs (i.e. an empty first field) were stripped by fscanf().
Why?!?!?!?!?!
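One possible explanation, and it is only a guess since the format string actually used is not shown here: if the format contains a literal "\n" (or a space) to eat the line ending, that is a white space directive, and in the scanf family it matches any amount of white space, including the tabs at the start of the next line. A small sketch with sscanf() and two made-up function names shows the difference:

```c
#include <stdio.h>

/* Hypothetical demo functions; out1 and out2 must each hold 64 bytes. */

/* The "\n" between the conversions is a white space directive: it skips
   ANY run of white space, so a tab starting the second line is eaten. */
int split_with_ws_directive(const char *in, char *out1, char *out2)
{
    return sscanf(in, "%63[^\n]\n%63[^\n]", out1, out2);
}

/* %*1[\n] consumes exactly one newline character and nothing else,
   so the leading tab of the second line survives. */
int split_with_exact_newline(const char *in, char *out1, char *out2)
{
    return sscanf(in, "%63[^\n]%*1[\n]%63[^\n]", out1, out2);
}
```

With the input "a\tb\n\tc", the first version leaves "c" in out2 while the second leaves "\tc".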
Please someone tell me I am being thick and that there is a standard function to do just this. Yes, I could write my own, but this is surely so basic it should be standard C library stuff.
[code mistakes in examples left in for the reader to find]
Perhaps separate the functions of reading a line & then processing the resulting string? Essentially this is how it works in Python:
import fileinput
for line in fileinput.input():
    line = line.strip()
    ...
or other ways of processing the line. Python has a lot of methods on string objects for munging them - better than the standard ones in C. If you are writing little scripts to do stuff then Python is the way to go unless speed or volume is of the essence. Even then libraries with C under the hood help a lot - I'm just now using FFTs in numpy whereas I would have used fftw3 in C previously.
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
After the call, the return value is the length of the line including its ending (n holds the size of the allocated buffer), so stripping the line ending only costs as much as the number of terminating chars.
It'll malloc() or even realloc() - I think it ticks all the boxes you're looking for.
Indeed: it's designed so that you can just call it in a loop with the same buffer and n over and over again:
size_t size = 0;
char *line = NULL;
ssize_t len; /* signed: getline() returns -1 at EOF or on error */
while ((len = getline (&line, &size, stream)) >= 0)
    ...
/* then check for feof(), ferror() etc. */
It does indeed work perfectly. Looks like 2008, so long after I learned C, which is what sometimes catches me out. Thank you - just what I hoped for.
It was very annoying when it was introduced, because, y'know, getline() was not a reserved identifier before POSIX.1 2008, and it's a pretty obvious name, so a *lot* of programs had used it, as a function name, as a variable name, you name it... and they all broke. Nice forward planning!
(... oh, obviously, you'll probably want to free() the buffer after the loop is over, too. It gets reused repeatedly by getline, so if you stash it somewhere, remember to strdup() it first. Yes, I made that mistake.)
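Putting the points from this thread together, here is a hedged sketch of the whole pattern: a signed return value, stripping the ending using the returned length, strdup() before stashing, and free() after the loop. The function longest_line is a made-up example task, not something from the post.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical example: return a malloc'd copy of the longest line in
   stream, with its line ending stripped (caller frees), or NULL if the
   stream has no lines or memory runs out before one is stored. */
char *longest_line(FILE *stream)
{
    char *line = NULL;      /* getline() allocates and grows this buffer */
    size_t size = 0;        /* allocated size, NOT the line length */
    ssize_t len;            /* signed, so the -1 at EOF/error is visible */
    char *longest = NULL;
    size_t longest_len = 0;

    while ((len = getline(&line, &size, stream)) >= 0) {
        /* Strip a trailing "\n" or "\r\n" using the returned length. */
        while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
            line[--len] = '\0';
        if (!longest || (size_t)len > longest_len) {
            /* The buffer is reused on the next call, so stash a copy. */
            char *copy = strdup(line);
            if (!copy)
                break;
            free(longest);
            longest = copy;
            longest_len = (size_t)len;
        }
    }
    free(line);             /* free the getline() buffer once, at the end */
    return longest;
}
```

A caller that needs to tell EOF from a read error would still check feof() and ferror() on the stream afterwards.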
// One of the many reasons I love C++!
#include <fstream>
#include <iostream>
#include <string>
int main(int argc, char** argv) {
    if (argc < 2) {
        std::cout << "Usage:" << argv[0] << " \n";
        return -1;
    }
    std::string line;
    std::ifstream infile(argv[1]);
    if (infile) {
        // std::getline() strips the '\n' and grows the string as needed
        while (getline(infile, line)) {
            std::cout << line << '\n';
        }
    }
    infile.close();
    return 0;
}
And you really think that someone besides you will find this more readable than Nick's example above?
C++ most often transforms simple C code to inelegant garbage made of concatenation of unlikely operators, without bringing any value, except providing you some extra time to drink your coffee while it compiles of course.