2.8. I/O in C (Standard and File)

C supports many functions for performing standard I/O as well as file I/O. In this section, we discuss some of the most commonly used interfaces for I/O in C.

2.8.1. Standard Input/Output (I/O)

Every running program begins with three default I/O streams: standard out (stdout), standard in (stdin), and standard error (stderr). A program can write (print) output to stdout and stderr, and it can read input values from stdin. stdin is usually defined to read in input from the keyboard, whereas stdout and stderr output to the terminal.

The C stdio.h library provides the printf function used for printing to standard out and the scanf function that can be used to read in values from standard in. C also has functions to read and write one character at a time (getchar and putchar), as well as other functions and libraries for reading and writing characters to standard I/O streams. A C program must explicitly include stdio.h to call these functions.

You can change the location that a running program’s stdin, stdout, and/or stderr read from or write to. One way to do this is by redirecting one or all of them to read from or write to a file. Here are some example shell commands for redirecting a program’s stdin, stdout, or stderr to a file ($ is the shell prompt):

#  redirect a.out's stdin to read from file infile.txt:
$ ./a.out < infile.txt

#  redirect a.out's stdout to print to file outfile.txt:
$ ./a.out > outfile.txt

# redirect a.out's stdout and stderr to file outfile.txt:
$ ./a.out &> outfile.txt

# redirect all three to different files:
#   (< redirects stdin, 1> stdout, and 2> stderr):
$ ./a.out < infile.txt 1> outfile.txt 2> errorfile.txt

printf

C’s printf function resembles formatted print calls in Python, where the caller specifies a format string to print. The format string often contains escape sequences that print special characters such as tabs (\t) or newlines (\n), and placeholders for values in the output (% followed by a type specifier). When adding placeholders to a format string passed to printf, pass their corresponding values as additional arguments following the format string. Here are some example calls to printf:

printf.c
int x = 5, y = 10;
float pi = 3.14;

printf("x is %d and y is %d\n", x, y);

printf("%g \t %s \t %d\n", pi, "hello", y);

When run, these printf statements output:

x is 5 and y is 10
3.14 	 hello 	 10

Note the tab characters (\t) get printed in the second call, and the different formatting placeholders for different types of values (%g, %s, and %d).

Here’s a set of formatting placeholders for common C types:

%f, %g: placeholders for a float or double value
%d:     placeholder for a decimal value (char, short, int)
%u:     placeholder for an unsigned decimal
%c:     placeholder for a single character
%s:     placeholder for a string value
%p:     placeholder to print an address value

# long and long long include an l or ll prefix:
%ld: placeholder for a long value
%lu: placeholder for an unsigned long value
%lld: placeholder for a long long value
%llu: placeholder for an unsigned long long  value

Here are some examples of their use:

float labs;
int midterm;

labs = 93.8;
midterm = 87;

printf("Hello %s, here are your grades so far:\n", "Tanya");
printf("\t midterm: %d (out of %d)\n", midterm, 100);
printf("\t lab ave: %f\n", labs);
printf("\t final report: %c\n", 'A');

When run, the output will look like this:

Hello Tanya, here are your grades so far:
	 midterm: 87 (out of 100)
	 lab ave: 93.800003
	 final report: A

C also allows you to specify the field width with format placeholders. Here are some examples:

%5.3f: print float value in space 5 chars wide, with 3 places beyond decimal
%20s:  print the string value in a field of 20 chars wide, right justified
%-20s: print the string value in a field of 20 chars wide, left justified
%8d:   print the int value in a field of 8 chars wide, right justified
%-8d:  print the int value in a field of 8 chars wide, left justified

Here’s a larger example that uses field width specifiers with placeholders in the format string:

printf_format.c
#include <stdio.h> // library needed for printf

int main() {
    float x, y;
    char ch;

    x = 4.50001;
    y = 5.199999;
    ch = 'a';      // ch stores ASCII value of 'a' (the value 97)

    printf("%.1f %.1f\n", x, y); // .1: print x and y with 1 digit after the decimal point

    printf("%6.1f \t %6.1f \t %c\n", x, y, ch);
    printf("%6.1f \t %6.1f \t %c\n", x+1, y+1, ch+1);  // ch+1 is 98, the ASCII value of 'b'
    printf("%6.1f \t %6.1f \t %c\n", x*20, y*20, ch+2);
    return 0;
}

When run, the program output looks like this:

4.5 5.2
   4.5 	    5.2 	 a
   5.5 	    6.2 	 b
  90.0 	  104.0 	 c

Note how the use of tabs and field widths in the last three printf statements results in tabular output.

Finally, C defines placeholders for displaying values in different representations:

%x:     print value in hexadecimal (base 16)
%o:     print value in octal (base 8)
%d:     print value in signed decimal  (base 10)
%u:     print value in unsigned decimal (unsigned base 10)
%e:     print float or double in scientific notation
(before the C23 standard, printf has no formatting option to display a value in binary)

Here is an example using placeholders to print values in different representations:

int x;
char ch;

x = 26;
ch = 'A';

printf("x is %d in decimal, %x in hexadecimal and %o in octal\n", x, x, x);
printf("ch value is %d which is the ASCII value of  %c\n", ch, ch);

When run, the program output looks like this:

x is 26 in decimal, 1a in hexadecimal and 32 in octal
ch value is 65 which is the ASCII value of  A

scanf

The scanf function provides one method for reading in values from stdin (usually from the user entering them via the keyboard) and storing them in program variables. The scanf function is a bit picky about the exact format in which the user enters data, which can make it sensitive to badly formed user input.

The arguments to the scanf function are similar to those of printf: scanf takes a format string that specifies the number and type of input values to read in followed by the locations of program variables into which the values should be stored. Programs typically combine the address of (&) operator with a variable name to produce the location of the variable in the program’s memory — the memory address of the variable. Here’s an example call to scanf that reads in two values (an int and a float):

scanf_ex.c
int x;
float pi;

// read in an int value followed by a float value ("%d%g")
// store the int value at the memory location of x (&x)
// store the float value at the memory location of pi (&pi)
scanf("%d%g", &x, &pi);

Individual input values must be separated by at least one whitespace character (e.g., spaces, tabs, newlines). When reading a numeric value, scanf skips over any leading whitespace characters and stops at the first character that cannot be part of the number. As a result, a user could enter the values 8 and 3.14 with any amount of whitespace before the first value and between the two values, and scanf will read in 8 and assign it to x and read in 3.14 and assign it to pi. For example, this input with lots of spaces around the two values results in storing 8 in x and 3.14 in pi:

           8                   3.14

Programmers often write format strings for scanf that only consist of placeholder specifiers without any other characters between them. For reading in the two numbers above, the format string might look like:

// read in an int and a float separated by at least one white space character
scanf("%d%g",&x, &pi);

getchar and putchar

The C functions getchar and putchar respectively read or write a single character value from stdin and to stdout. getchar is particularly useful in C programs that need to support careful error detection and handling of badly formed user input (scanf is not robust in this way).

ch = getchar();  // read in the next char value from stdin
putchar(ch);     // write the value of ch to stdout

2.8.2. File Input/Output (I/O)

The C standard I/O library (stdio.h) includes a stream interface for file I/O. A file stores persistent data: data that lives beyond the execution of the program that created it. A text file represents a stream of characters, and each open file tracks its current position in the character stream. When opening a file, the current position starts at the very first character in the file, and it moves as a result of every character read (or written) to the file. To read the 10th character in a file, the first 9 characters need to first be read (or the current position must be explicitly moved to the 10th character using the fseek function).

C’s file interface views a file as an input or output stream, and library functions read or write to the next position in the file stream. The fprintf and fscanf functions serve as the file I/O counterparts to printf and scanf. They use a format string to specify what to write or read and include arguments that provide values or storage for the data that gets written or read. Similarly, the library provides the fputc, fgetc, fputs, and fgets functions for reading and writing individual characters or strings to file streams. While other libraries support file I/O in C programs, we only present the stdio.h library’s stream interface to text files in detail.

Text files may contain special chars like the stdin and stdout streams: newlines ('\n'), tabs ('\t'), etc. Additionally, upon reaching the end of a file’s data, C’s I/O library functions return a special end-of-file value (EOF) that represents the end of the file. EOF is an int value that is distinct from every char value, and it is not stored in the file itself. Functions reading from a file can test for EOF to determine when they have reached the end of the file stream.

2.8.3. Using text files in C

To read or write a file in C, follow these steps:

  1. DECLARE a FILE * variable

    FILE *infile;
    FILE *outfile;

    The declarations above create pointer variables of the library-defined FILE * type. An application program should not dereference these pointers. Instead, they refer to a specific file stream when passed to I/O library functions.

  2. OPEN the file: associate the variable with an actual file stream by calling fopen. When opening a file, the mode parameter determines whether the program opens it for reading ("r"), writing ("w"), or appending ("a").

    infile = fopen("input.txt", "r");  // relative path name of file, read mode
    if (infile == NULL) {
        printf("Error: unable to open file %s\n", "input.txt");
        exit(1);   // exit is declared in stdlib.h
    }
    
    // fopen with absolute path name of file, write mode
    outfile = fopen("/home/me/output.txt", "w");
    if (outfile == NULL) {
        printf("Error: unable to open outfile\n");
        exit(1);
    }

    The fopen function returns NULL to report errors, which may occur if it’s given an invalid file name or the user doesn’t have permissions to open the specified file (e.g., not having write permissions to the output.txt file).

  3. USE I/O operations to read, write, or move the current position in the file

    int ch;  // EOF is not a char value, but is an int.
             // since all char values can be stored in int, use int for ch
    
    ch = getc(infile);      // read next char from the infile stream
    if (ch != EOF) {
        putc(ch, outfile);  // write char value to the outfile stream
    }

  4. CLOSE the file: use fclose to close the file when the program no longer needs it.

    fclose(infile);
    fclose(outfile);

The stdio library also provides functions to change the current position in a file:

// to reset current position to beginning of file
void rewind(FILE *f);

rewind(infile);

// to move to a specific location in the file:
// (returns 0 on success and a nonzero value on error)
int fseek(FILE *f, long offset, int whence);

fseek(f, 0, SEEK_SET);    // seek to the beginning of the file
fseek(f, 3, SEEK_CUR);    // seek 3 chars forward from the current position
fseek(f, -3, SEEK_END);   // seek 3 chars back from the end of the file

2.8.4. Standard and File I/O functions in stdio.h

The C stdio.h library has many functions for reading and writing to files and to the standard file-like streams (stdin, stdout, and stderr). These functions can be classified into character-based, string-based, and formatted I/O functions. Briefly, here are some additional details about a subset of these functions:

 ---------------
 Character Based
 ---------------

// returns the next character in the file stream, or EOF at the end of
// the file or on error (EOF is an int value, hence the int return type)
int fgetc(FILE *f);

// writes the char value c to the file stream f
// returns the char value written
int fputc(int c, FILE *f);

// pushes the character c back onto the file stream
// at most one char (and not EOF) can be pushed back
int ungetc(int c, FILE *f);

// like fgetc and fputc but for stdin and stdout
int getchar(void);
int putchar(int c);


 -------------
 String  Based
 -------------

// reads at most n-1 characters into the array s stopping if a newline is
// encountered, newline is included in the array which is '\0' terminated
char *fgets(char *s, int n, FILE *f);

// writes the string s (make sure '\0' terminated) to the file stream f
int fputs(const char *s, FILE *f);


 ---------
 Formatted
 ---------

// writes the contents of the format string to file stream f
//   (with placeholders filled in with subsequent argument values)
// returns the number of characters printed
int fprintf(FILE *f, const char *format, ...);

// like fprintf but to stdout
int printf(const char *format, ...);

// use fprintf to print to stderr:
fprintf(stderr, "Error return value: %d\n", ret);

// read values specified in the format string from file stream f
//   store the read-in values to program storage locations of types
//   matching the format string
// returns number of input items converted and assigned
//   or EOF on error or if EOF was reached
int fscanf(FILE *f, const char *format, ...);

// like fscanf but reads from stdin
int scanf(const char *format, ...);

In general, scanf and fscanf are sensitive to badly formed input. For file I/O, programmers can often assume that an input file is well formatted, so fscanf may be robust enough in such cases. With scanf, badly formed user input can leave variables unassigned, which often causes a program to misbehave or crash. Reading in one character at a time and including code to test values before converting them to different types is more robust, but it requires the programmer to implement more complex I/O functionality.

The format string for fscanf can include the following syntax specifying different types of values and ways of reading from the file stream:

%d integer
%f float
%lf double
%c character
%s string, up to first white space

%[...] string, up to first character not in brackets
%[0123456789] would read in digits
%[^...] string, up to first character in brackets
%[^\n] would read everything up to a newline

It can be tricky to get the fscanf format string correct, particularly when reading a mix of numeric and string or character types from a file.

Here are a few example calls to fscanf (and one to fprintf) with different format strings (assume the fopen calls from above have executed successfully):

int x;
double d;
char c, array[MAX];

// write int & char values to file separated by colon with newline at the end
fprintf(outfile, "%d:%c\n", x, c);

// read an int & char from file where int and char are separated by a comma
fscanf(infile, "%d,%c", &x, &c);

// read a string from a file into array (stops reading at whitespace char)
fscanf(infile,"%s", array);

// read a double and a string up to 24 chars from infile
fscanf(infile, "%lf %24s", &d, array);

// read in a string consisting of only char values in the specified set (0-5)
// stops reading when...
//   20 chars have been read OR
//   a character not in the set is reached OR
//   the file stream reaches end-of-file (EOF)
fscanf(infile, "%20[012345]", array);

// read in a string; stop when reaching a punctuation mark from the set
fscanf(infile, "%[^.,:!;]", array);

// read in two integer values: store first in a long, second in an int
// then read in a char value following the int value
// (assumes an additional declaration: long l;)
fscanf(infile, "%ld %d%c", &l, &x, &c);

In the final example above, the format string explicitly reads in a character value after a number to ensure that the file stream’s current position gets properly advanced for any subsequent calls to fscanf. For example, this pattern is often used to explicitly read in (and discard) a whitespace character (like '\n'), to ensure that the next call to fscanf begins from the next line in the file. Reading an additional character is necessary if the next call to fscanf attempts to read in a character value. Otherwise, having not consumed the newline, the next call to fscanf will read the newline rather than the intended character. If the next call reads in a numeric type value, then leading whitespace chars are automatically discarded by fscanf and the programmer does not need to explicitly read the \n character from the file stream.