File input and output
Writing data to files works in essentially the same way as writing to the terminal. In fact, writing to files in C is very similar to writing to files in Python (or perhaps the reverse is more appropriate).
Just as we did in the Python section, we will limit our discussion to writing ASCII data. On POSIX systems, files opened in binary mode behave identically to files opened in text mode. As usual, Windows does its own strange thing (text mode translates line endings there), which doesn’t concern us here.
The stdio.h header that we’ve been including to write to the terminal is also the appropriate header to use for file I/O. This header provides a typedef called FILE that absorbs any platform-specific considerations.
Opening a file with read permissions proceeds as:
const char* fname = "filename.ext";
FILE *fp1 = fopen(fname, "r");
/* Or */
FILE *fp2 = fopen("anotherOne.ext", "r");
Note that we are declaring a pointer to the type FILE. This is what fopen returns, and it allows the file to be passed by reference to any functions that interact with it. If the file cannot be opened, fopen returns NULL.
Closing the file associated with a file pointer proceeds as:
fclose(fp1);
fclose(fp2);
Note the similarity between fopen / fclose and malloc / free. As mentioned in the dynamic memory section, I would advise always writing the function fclose as soon as you write fopen to help prevent (some) bugs from coming up.
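Since fopen returns NULL when a file cannot be opened, it is worth checking the result before using it. The following is a minimal sketch of one way to do that; the helper name open_or_report is made up for illustration and is not part of stdio.h.

```c
#include <stdio.h>

/* Hypothetical helper (not part of stdio.h): open a file and report
 * any failure on stderr. Returns NULL if the file could not be opened. */
FILE *open_or_report(const char *fname, const char *mode)
{
  FILE *fp = fopen(fname, mode);
  if (fp == NULL) {
    /* perror prints fname followed by the system's reason for the failure */
    perror(fname);
  }
  return fp;
}
```

A caller would test the returned pointer against NULL, use the file only if the test passes, and then call fclose on it exactly once.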
Files can be opened in different modes by changing the second argument. The supported modes are:
- "r" to read from a file
- "w" to write to a file (any existing contents are discarded)
- "a" to append to a file
  - If no file exists with the given name this operates as "w"
- "r+" to read from a file (extended)
- "w+" to write to a file (extended)
- "a+" to append to a file (extended)
The extended modes allow mixed input/output access, but have some touchy stipulations that we aren’t going to talk about here.
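To make the difference between "w" and "a" concrete, here is a small sketch. Both helper names (write_line and count_lines) are hypothetical, and fgetc is a standard stdio.h function that reads a single character.

```c
#include <stdio.h>

/* Hypothetical helper: write one line of text to fname using the given mode.
 * Returns 0 on success and -1 if the file could not be opened. */
int write_line(const char *fname, const char *mode, const char *text)
{
  FILE *fp = fopen(fname, mode);
  if (fp == NULL) {
    return -1;
  }
  fprintf(fp, "%s\n", text);
  fclose(fp);
  return 0;
}

/* Hypothetical helper: count the newline characters in fname.
 * Returns -1 if the file could not be opened. */
int count_lines(const char *fname)
{
  FILE *fp = fopen(fname, "r");
  if (fp == NULL) {
    return -1;
  }
  int c, n = 0;
  while ((c = fgetc(fp)) != EOF) {
    if (c == '\n') {
      n++;
    }
  }
  fclose(fp);
  return n;
}
```

Calling write_line twice with "w" leaves a single line in the file, since the second call discards what the first wrote. Calling it with "w" and then "a" leaves two lines.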
Formatted input/output
Just as printf and scanf allow us to write to (read from) the terminal, the functions fprintf and fscanf allow us to write to (read from) files. They behave exactly as before, but now a file pointer is given before the remaining arguments.
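One consequence is that a function taking a FILE pointer can write to the terminal or to a file without changing. The sketch below assumes a made-up helper called write_record; stdout is the standard stream that printf itself writes to.

```c
#include <stdio.h>

/* Hypothetical helper: write one record to whichever stream is supplied.
 * Returns the number of characters written, as reported by fprintf. */
int write_record(FILE *out, int step, double value)
{
  return fprintf(out, "%d %e\n", step, value);
}
```

Here write_record(stdout, 3, 0.5) prints to the terminal just as printf would, while write_record(fp, 3, 0.5) sends the same text to the file that fp refers to.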
Let’s start by writing some files out. Consider this sample code:
/* File: fileOut.c
 * Author: Ian May
 * Purpose: Demonstrate writing to a file
 */

#include <stdlib.h>
#include <stdio.h>

int main()
{
  /* Set filename and open */
  const char* fname = "sim.dat";
  FILE *fp = fopen(fname,"w");
  /* fopen returns NULL on failure, so stop before writing */
  if (fp == NULL) {
    perror(fname);
    return EXIT_FAILURE;
  }

  /* Write something to the file */
  fprintf(fp, "This file is %s\n", fname);
  fprintf(fp, "%d %le %lf\n", 7, 2.7e-3, 13.46);

  /* Close the file */
  fclose(fp);

  return 0;
}
Then you can compile it by:
gcc fileOut.c -o fileOut.ex
Try running this program and calling cat on the generated file.
Exercise: Try changing what gets written to the file, and try using the append mode.
Exercise: Write a program that creates 10 files with names sim_00.dat, sim_01.dat, … , sim_09.dat. Have your program write a timestamp into each file.
Formatted input proceeds exactly as output, just using the "r" file mode and fscanf in place of fprintf. Consider this code, which can read files generated by the previous example:
/* File: fileIn.c
 * Author: Ian May
 * Purpose: Demonstrate reading from a very specific file
 */

#include <stdlib.h>
#include <stdio.h>

int main()
{
  /* Set filename and open */
  const char* fname = "sim.dat";
  FILE *fp = fopen(fname,"r");
  /* fopen returns NULL if the file is missing or unreadable */
  if (fp == NULL) {
    perror(fname);
    return EXIT_FAILURE;
  }

  /* Read from the file; the width 15 leaves room for the null-terminator */
  char s0[16] = "", s1[16] = "", s2[16] = "", s3[16] = "";
  fscanf(fp, "%15s %15s %15s %15s", s0, s1, s2, s3);
  int n = 0;
  double p = 0., q = 0.;
  fscanf(fp, "%d %le %lf\n", &n, &p, &q);

  /* Close the file */
  fclose(fp);

  /* Report what was read in */
  printf("The file contained:\n");
  printf("%s %s %s %s\n", s0, s1, s2, s3);
  printf("%d %e %f\n", n, p, q);

  return 0;
}
Then you can compile it by:
gcc fileIn.c -o fileIn.ex
Running this will indeed echo back to you the file that the previous example wrote (assuming you didn’t change it too much). However, this probably doesn’t look like what you were expecting. One particular question arises.
Question: Why are there four %s format specifiers when reading the first line of our special file?
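Whatever format string is used, fscanf returns the number of items it managed to convert, and checking that count is a cheap way to notice a file that does not have the expected layout. A minimal sketch, using a hypothetical helper named read_three:

```c
#include <stdio.h>

/* Hypothetical helper: read "int double double" from fp.
 * Returns 1 if all three values were converted and 0 otherwise. */
int read_three(FILE *fp, int *n, double *p, double *q)
{
  /* fscanf returns the number of items assigned, or EOF if the input
   * ran out before anything could be converted */
  return fscanf(fp, "%d %le %lf", n, p, q) == 3;
}
```

When read_three returns 0, the values behind n, p and q should not be trusted, since some of them may not have been assigned.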
Reading whole lines from a file
It is frequently easier to read a whole line from a file, then decide how to process it. In the above example we saw that the %s conversion in fscanf stops reading at any whitespace character, and in particular stops at a space unless you explicitly account for it. Instead, we’ll use the function fgets. Consider this small re-write of the above example:
/* File: fileIn2.c
 * Author: Ian May
 * Purpose: Demonstrate reading from a very specific file, now with fgets
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LENGTH 256

int main()
{
  /* Set filename and open */
  const char* fname = "sim.dat";
  FILE *fp = fopen(fname,"r");
  /* fopen returns NULL if the file is missing or unreadable */
  if (fp == NULL) {
    perror(fname);
    return EXIT_FAILURE;
  }

  /* Read the first line from the file */
  char l1[MAX_LINE_LENGTH] = "";
  fgets(l1, MAX_LINE_LENGTH, fp);
  /* Remove newline character from string if present */
  size_t len = strlen(l1);
  if (len > 0 && l1[len-1] == '\n') {
    l1[len-1] = '\0';
  }

  /* Read the second line from the file */
  char l2[MAX_LINE_LENGTH] = "";
  fgets(l2, sizeof(l2), fp);

  /* Extract numbers from second line */
  int n = 0;
  double p = 0., q = 0.;
  sscanf(l2, "%d %le %lf\n", &n, &p, &q);

  /* Close the file */
  fclose(fp);

  /* Report what was read in */
  printf("The file contained:\n");
  printf("%s\n", l1);
  printf("%d %e %f\n", n, p, q);

  return 0;
}
Then you can compile it by:
gcc fileIn2.c -o fileIn2.ex
This is a little better than before; we can at least read full lines from the file without having to account for whitespace so explicitly. Observe the following about fgets:
- The first argument is a character array, or buffer, to store the read line into
- The function reads at most N-1 characters from the line, where N is the second argument. The stored string is always null-terminated
- The function stops reading if a newline character is encountered
- The function stops reading if the end of the file (EOF) is reached
- The buffer will contain the newline character if it was found
  - The example shows one way to remove this character
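These points can be wrapped up in a small function. The sketch below uses a hypothetical helper called read_line, and removes the newline with strcspn from string.h, which is a different technique from the one in the example above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (capacity size) and strip
 * the trailing newline if present. Returns 1 if something was read and
 * 0 otherwise. */
int read_line(FILE *fp, char *buf, int size)
{
  if (fgets(buf, size, fp) == NULL) {
    return 0;
  }
  /* strcspn gives the index of the first '\n', or the string length
   * if there is no newline, so this assignment is always in bounds */
  buf[strcspn(buf, "\n")] = '\0';
  return 1;
}
```

If a line is longer than size-1 characters, the first call returns only the start of it, and the next call carries on from where the first one stopped.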
Typically we will want to continue reading a file regardless of how many lines are present. The function fgets actually has a return value that we can use to accomplish this. To see how, let’s write a simplified clone of the cat command:
/* File: catClone.c
 * Author: Ian May
 * Purpose: Use fgets to write a simplified version of cat
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LENGTH 256

int main(int argc, char **argv)
{
  /* Exit if no file was supplied */
  /* Note that we should also do other error checking */
  if (argc < 2) {
    printf("You need to supply an input file!\n");
    return 0;
  }

  /* Set filename and open */
  FILE *fp = fopen(argv[1],"r");
  /* fopen returns NULL if the file is missing or unreadable */
  if (fp == NULL) {
    perror(argv[1]);
    return EXIT_FAILURE;
  }

  /* Buffer to store lines in */
  char line[MAX_LINE_LENGTH];

  /* Read the lines from the file until EOF */
  while (fgets(line, MAX_LINE_LENGTH, fp) != NULL) {
    /* Print the line */
    printf("%s",line);
  }

  /* Close the file */
  fclose(fp);

  return 0;
}
Then you can compile it by:
gcc catClone.c -o catClone.ex
and run it like so:
$ ./catClone.ex sim.dat
This is a pretty weird way to write a while loop compared to what we’ve seen so far. Let’s walk through how this works:
- The fgets function returns a pointer to char. This pointer is NULL when reading the line failed (e.g. there is nothing left in the file).
- We put the call to fgets inside the conditional statement for the while loop
  - This means fgets is called, and line is filled every time the conditional is evaluated.
  - If fgets returns NULL, the line buffer is not filled, but that doesn’t matter since the loop body doesn’t run.
Exercise: Try printing line an additional time after the while loop. Does it do what you expected?
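The same loop pattern works for any line-by-line processing, not just echoing. As a sketch, the hypothetical helper count_matches below counts the lines containing a given piece of text, using strstr from string.h.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LENGTH 256

/* Hypothetical helper: count the lines of fname that contain needle.
 * Returns -1 if the file could not be opened. Lines longer than the
 * buffer are seen by fgets in several pieces, so they may be counted
 * more than once. */
int count_matches(const char *fname, const char *needle)
{
  FILE *fp = fopen(fname, "r");
  if (fp == NULL) {
    return -1;
  }

  char line[MAX_LINE_LENGTH];
  int count = 0;
  /* Same loop condition as in catClone: stop when fgets returns NULL */
  while (fgets(line, sizeof(line), fp) != NULL) {
    if (strstr(line, needle) != NULL) {
      count++;
    }
  }

  fclose(fp);
  return count;
}
```

Only the loop body differs from catClone; the loop condition and the buffer handling are unchanged.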
As a brief aside, did you know that gcc can compile from stdin? This means we can pass a file from cat to gcc and get an executable back. That is, we can use our clone of the cat command to compile itself:
./catClone.ex catClone.c | gcc -x c - -o catClone2.ex
It isn’t very useful, but it’s a fun little curiosity.
Putting it all together
As a more useful example, let’s write a program that can read the input files we used for the Mathieu example back in the Fortran chapter. Recall that these are structured like this:
num_points 101
q_index 40
run_name Mathieu_101_40
Every line consists of two entries, the name and value of each parameter we want to read in. Our goal is to write a program that can extract these values from the input file. Additionally, the program should work regardless of what order lines are specified in, and regardless of whether all lines are present.
We can accomplish this in the following:
/* File: readInit.c
 * Author: Ian May
 * Purpose: Short program to read init files from the Mathieu example
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LENGTH 256

int main(int argc, char **argv)
{
  /* Exit if no file was supplied */
  /* Note that we should also do other error checking */
  if (argc < 2) {
    printf("You need to supply an init file!\n");
    return 0;
  }

  /* Declare the relevant parameters and give them default values */
  int N = 101;
  double q = 40.;
  /* Same size as back, so the strcpy below cannot overflow */
  char runName[MAX_LINE_LENGTH];
  strcpy(runName,"default_run"); /* why is this line here? */

  /* Set filename and open */
  FILE *fp = fopen(argv[1],"r");
  /* fopen returns NULL if the file is missing or unreadable */
  if (fp == NULL) {
    perror(argv[1]);
    return EXIT_FAILURE;
  }

  /* Buffers to store lines in */
  char line[MAX_LINE_LENGTH];
  char front[MAX_LINE_LENGTH];
  char back[MAX_LINE_LENGTH];

  /* Read the lines from the file until EOF */
  while (fgets(line, MAX_LINE_LENGTH, fp) != NULL) {
    /* Process the line to see what it specifies */
    /* Skip blank or incomplete lines so front and back are always set */
    if (sscanf(line,"%s %s\n",front,back) != 2) {
      continue;
    }
    /* We only have three settings, so an if-else chain is easy */
    if (strcmp(front,"num_points") == 0) {
      N = atoi(back);
    } else if (strcmp(front,"q_index") == 0) {
      q = atof(back);
    } else if (strcmp(front,"run_name") == 0) {
      strcpy(runName,back);
    } else {
      printf("Specified parameter unrecognized: %s\n",front);
    }
  }

  /* Close the file */
  fclose(fp);

  /* Report back the found values */
  printf("We read in the following parameters:\n");
  printf("%d grid points\n",N);
  printf("A q value of %f\n",q);
  printf("A run name of %s\n",runName);

  return 0;
}
Then you can compile it by:
gcc readInit.c -o readInit.ex
Try running it as:
./readInit.ex mathieu.init
Before examining how this program works, try running it a few times. Try mixing up the lines in the init file, try messing up the spelling of the parameters. What happens if the supplied values don’t make sense? What happens if you specify the same parameter multiple times?
Now, let’s observe how this program actually works:
- Just like in the catClone program, we read lines using fgets until we reach the end of the file
- We split each given line into two strings
  - The first is compared to the parameter names we care about
  - If the first matches, then the second is used to set that parameter
  - If the line has more than two fields separated by spaces, then the later ones are ignored
- Since we keep checking all possible matches, the order of the parameters doesn’t matter, and specifying them multiple times will just overwrite the earlier values.
- Unrecognized lines are reported to the user, and ignored.
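One weakness of atoi is that it cannot report bad input: a value such as abc simply converts to 0. If that matters, strtol from stdlib.h can detect the problem. The sketch below is one possible approach rather than part of the original program, and the helper name parse_int is made up for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert text to an int, rejecting anything that is
 * not entirely a valid number. Returns 1 on success and 0 on failure, in
 * which case *out is left unchanged. */
int parse_int(const char *text, int *out)
{
  char *end;
  errno = 0;
  long value = strtol(text, &end, 10);
  if (end == text || *end != '\0') {
    return 0; /* no digits at all, or trailing junk such as "12abc" */
  }
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
    return 0; /* value does not fit in an int */
  }
  *out = (int)value;
  return 1;
}
```

In readInit.c this could replace the call to atoi, for example by keeping the default value of N and printing a warning whenever parse_int(back, &N) returns 0.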