Strings

We’ve used strings a few times already, but have yet to formalize them. Really, strings are just arrays of characters, but with one extra feature: strings can be passed around without consideration for their length. That is, we (mostly) don’t need to pass an integer along with the array of characters.

Why is this, and what makes characters so special? Strings are arrays of characters that are null-terminated. This means that a string should always end with the special non-printing character '\0'. The strings defined in the following code snippet in fact hold identical contents:

const char hArr[] = {'H','e','l','l','o','\0'};
const char *hLit = "Hello";

There are a few interesting things to note here:

  • We only need to specify the termination character when we set the string from an array explicitly. A string literal such as "Hello" gets its terminator automatically.

  • The keyword const is present on each of these declarations.

    • It is optional for the first declaration: hArr is an ordinary array that we own, so const only stops us from modifying it by accident.

    • In the second case, the string "Hello" is a string-literal and the compiler is free to place it into read only memory. Therefore, we had better not modify it!

Remark: Does this last point remind you of anything from Python?
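
To see why no separate length is needed, here is a minimal sketch of how a function can find the end of a string by itself. The name count_chars is hypothetical and chosen only for illustration; the standard library function strlen does the same job.

```c
#include <stddef.h>

/* Hypothetical helper (not a library function): count the characters
 * up to, but not including, the null-terminator. */
size_t count_chars(const char *s)
{
  size_t n = 0;
  while (s[n] != '\0') {  /* stop at the terminator */
    n++;
  }
  return n;
}
```

Called on either hArr or hLit from the snippet above, count_chars returns 5, while sizeof(hArr) is 6 because the array also stores the terminator.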

Functions acting on strings

There are many functions that act on strings. Since characters are really just single-byte integers, strings can also act as a low-level representation of memory, and many functions acting on strings don’t immediately seem relevant to strings. We’ll mostly ignore these, though it is good to know that they exist.

Actually, we’ve already used a few functions that act on strings. Consider the main function:

#include <stdlib.h>

int main(int argc, char **argv)
{
  int m = argc>1 ? atoi(argv[1]) : 2;
  /* ... */

  return 0;
}

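The function atoi is provided by stdlib.h and converts the leading part of a given string into an integer if it can. If no conversion is possible it returns 0 and gives no other indication of failure.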
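
Because atoi cannot distinguish a failed conversion from a genuine 0, a stricter conversion is sometimes useful. The following is one possible sketch built on the standard function strtol; the helper name parse_int and its return convention are assumptions made for this illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out on success,
 * returns 0 if str is not entirely a valid int. */
int parse_int(const char *str, int *out)
{
  char *end;
  errno = 0;
  long val = strtol(str, &end, 10);

  /* No digits were read, or something other than digits follows them */
  if (end == str || *end != '\0') return 0;

  /* The value does not fit in a long or in an int */
  if (errno == ERANGE || val < INT_MIN || val > INT_MAX) return 0;

  *out = (int)val;
  return 1;
}
```

In main, one could then write something like int m = 2; if (argc > 1 && !parse_int(argv[1], &m)) { /* report bad input */ } instead of calling atoi directly.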

A few interesting functions acting on strings are demonstrated in the following example:

/* File: stringExamples.c
 * Author: Ian May
 * Purpose: Demonstrate c strings
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main()
{
  /* Simple conversion */
  const char* one = "1";
  const char* small = "2.e-13";
  printf("one = %d, and small = %1.1e\n",atoi(one),atof(small));

  const char* hi = "Hello World";
  char hc[12];

  /* Copy hi into hc */
  strcpy(hc,hi);
  printf("strcpy: %s => %s\n",hi,hc);

  /* Clobber the null-terminator */
  hc[11] = '!';

  /* Copy hc into another string */
  int count = 13;
  char hx[count];
  strncpy(hx,hc,count-1);
  hx[count-1] = '\0';
  /* hc is no longer null-terminated, so limit %s to its 12 bytes */
  printf("strncpy: %.12s => %s\n",hc,hx);

  /* Test if strings are equivalent */
  if (strcmp(hi,hc) == 0) {
    printf("String hi is the same as string hc\n");
  } else {
    printf("String hi is not the same as string hc\n");
  }

  /* hc has no terminator, so compare only its 12 bytes with strncmp */
  if (strncmp(hx,hc,sizeof(hc)) == 0) {
    printf("String hx is the same as string hc\n");
  } else {
    printf("String hx is not the same as string hc\n");
  }

  /* Get string length */
  const char* il = "I love ";
  const char* sc = "scientific computing";
  int clen = strlen(il) + strlen(sc) + 1;

  /* Concatenate into another string */
  char ilsc[clen];
  ilsc[0] = '\0';
  strcat(ilsc,il);
  printf("First strcat: %s\n",ilsc);
  strcat(ilsc,sc);
  printf("Second strcat: %s\n",ilsc);

  return 0;
}

You can compile this example with:

gcc stringExamples.c -o stringExamples.ex

Let’s observe a few things from this example:

  • The functions strcpy and strncpy both copy a source string into a destination string. What is the difference?

    • strcpy uses the null-terminator on the source string to determine the number of bytes to copy. This can fail in two ways: if the destination is too small, or if the source string is missing its null-terminator.

    • strncpy copies like strcpy, but writes at most the number of bytes given by the third argument, which prevents overflowing the destination. If the source does not fit within that limit, the result is not null-terminated, which is why the example sets hx[count-1] = '\0' by hand.

    • Many other string functions come in pairs like this, one guarded and another unguarded.

  • The strcmp function compares two strings, and returns 0 if they are the same. If they are not the same, it returns a non-zero integer whose sign tells the lexicographic order of the two strings.

  • The strlen function does not include the null-terminator when finding the length of a string. Does the usage here indicate why this is the case?

  • The strcat function concatenates strings, but only takes two arguments. That is, it concatenates the second onto the first.

    • Note that the first call to strcat could be replaced by strcpy.
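
The strncpy pattern from the example can be wrapped up as a small helper. This is only a sketch: the name copy_truncated is hypothetical, and it assumes the destination has room for at least one byte.

```c
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for dstSize bytes
 * (dstSize must be at least 1). Long sources are truncated, and dst is
 * always left null-terminated. */
void copy_truncated(char *dst, size_t dstSize, const char *src)
{
  strncpy(dst, src, dstSize - 1);  /* writes at most dstSize-1 bytes */
  dst[dstSize - 1] = '\0';         /* strncpy does not add this on truncation */
}
```

With a 6-byte destination and the source "Hello World", the destination ends up holding "Hello" followed by the terminator.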

Exercise: Add a print statement that shows the third character of the string hi. Try ignoring the above advice and writing a new value over this character. Can you generate a compile-time error? How about a run-time error?

Writing formatted strings

We’ve already seen the flexibility of printf for writing to the terminal. What if we wanted to use that same flexibility to write strings? Consider this common problem: you have a code that generates data at each time step of a problem, and you want to save that data to disk before modifying it for the next time step. How can you generate filenames that include the time step number? Ideally, you could generate files like:

$ ls
sim_00000.dat
sim_00001.dat
sim_00002.dat
...
sim_00108.dat

and so on. We already know that we could print strings like this to the screen using:

for(int nT=0; nT<nSteps; nT++) {
  printf("sim_%05d.dat\n",nT);
}

To create files we will need to capture these into strings, not print them to the screen. This need is so common that there is a special function called sprintf which, you guessed it, writes formatted output into a string instead of onto the screen. We can repair the above by writing:

char fname[14]; /* 13 characters in "sim_00000.dat" plus the null-terminator */
for(int nT=0; nT<nSteps; nT++) {
  sprintf(fname,"sim_%05d.dat",nT);
}

Just like before, there is also snprintf as a byte-limited version. We’ll see this in action in the next section on file I/O. This latter version can also be used to query the number of bytes needed in the destination buffer, and gives a way to check how much space to allocate for the string before actually writing it.
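
As a small sketch of the byte-limited version, the helper below formats a filename into a caller-supplied buffer and uses the return value of snprintf to detect truncation. The name format_name and its return convention are assumptions made for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write "sim_NNNNN.dat" into buf, which holds size bytes.
 * Returns 1 if the whole name fit, 0 if it was truncated or an error occurred. */
int format_name(char *buf, size_t size, int step)
{
  /* snprintf returns the length the full result would have had,
   * not counting the null-terminator */
  int needed = snprintf(buf, size, "sim_%05d.dat", step);
  return needed >= 0 && (size_t)needed < size;
}
```

With a 14-byte buffer this succeeds for step 7, giving "sim_00007.dat". For step 123456 the full name needs 15 bytes, so the helper reports failure and the buffer holds only the truncated, but still null-terminated, text.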

Exercise: Re-write the string concatenation part of the above example to use sprintf instead. Try using snprintf to query the needed allocation size.

Reading from strings

Just as there is a function sprintf that writes formatted data into strings, there is sscanf that extracts formatted data from strings. This perhaps sounds a little strange, but we’ll find a use for it soon. In the interim, consider this small example:

/* File: stringScan.c
 * Author: Ian May
 * Purpose: Extract formatted data from a string
 */

#include <stdlib.h>
#include <stdio.h>

int main()
{
  const char *fdata = "latitude: 36.971944 longitude: -122.026389";
  double lat,lng;
  sscanf(fdata, "latitude: %lf longitude: %lf",&lat,&lng);

  printf("Santa Cruz has a latitude of %lf, and a longitude of %lf\n",lat,lng);

  return 0;
}

You can compile this example with:

gcc stringScan.c -o stringScan.ex
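
The example above ignores the value returned by sscanf, which is the number of items it successfully converted and stored. A cautious version might check it, as in this sketch; the helper name parse_coords is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 only if both numbers were read from text */
int parse_coords(const char *text, double *lat, double *lng)
{
  /* sscanf returns the number of conversions that succeeded */
  int matched = sscanf(text, "latitude: %lf longitude: %lf", lat, lng);
  return matched == 2;
}
```

A caller can then print the coordinates only when parse_coords returns 1, and report a malformed string otherwise.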

Exercise: Try modifying the format string in the call to sscanf. Can you make the call fail? What values are written on failure?