Strings¶
We’ve used strings a few times already, but have yet to formalize them. Really, strings are just arrays of characters, but with one extra feature. Strings can be passed around without consideration for their length; that is, we (mostly) don’t need to pass an integer along with the array of characters.
Why is this, and what makes characters so special? Strings are arrays of characters that are null-terminated. This means that a string should always end with the special non-printing character '\0'. The two strings defined in the following code snippet hold exactly the same characters:
const char hArr[] = {'H','e','l','l','o','\0'};
const char *hLit = "Hello";
There are a few interesting things to note here:
We only need to specify the termination character when we set the string from an array explicitly; the compiler appends it to a string literal automatically.
The keyword const is present on each of these declarations. It is not strictly needed for the first declaration, because hArr is an ordinary array that we would otherwise be allowed to modify.
In the second case, "Hello" is a string literal, and the compiler is free to place it in read-only memory. Therefore, we had better not modify it!
Remark: Does this last point remind you of anything from Python?
Functions acting on strings¶
There are many functions that act on strings. Since characters are really just single-byte integers, strings can also act as a low-level representation of memory, and many functions acting on strings don’t immediately seem relevant to strings. We’ll mostly ignore these, though it is good to know that they exist.
Actually, we’ve already used a few functions that act on strings. Consider the main function:
#include <stdlib.h>

int main(int argc, char **argv)
{
  int m = argc>1 ? atoi(argv[1]) : 2;
  /* ... */
  return 0;
}
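The function atoi is provided by stdlib.h and converts a given string into an integer if it can. If no conversion is possible it returns 0, so a failed conversion cannot be distinguished from the input "0".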
A few interesting functions acting on strings are demonstrated in the following example:
/* File: stringExamples.c
 * Author: Ian May
 * Purpose: Demonstrate c strings
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
  /* Simple conversion */
  const char* one = "1";
  const char* small = "2.e-13";
  printf("one = %d, and small = %1.1e\n",atoi(one),atof(small));

  const char* hi = "Hello World";
  char hc[12];

  /* Copy hi into hc */
  strcpy(hc,hi);
  printf("strcpy: %s => %s\n",hi,hc);

  /* Clobber the null-terminator: hc is now a plain character
     array, not a valid string */
  hc[11] = '!';

  /* Copy hc into another string */
  int count = 13;
  char hx[count];
  strncpy(hx,hc,count-1);
  hx[count-1] = '\0';
  /* Limit the number of bytes printed from hc, since it has no terminator */
  printf("strncpy: %.*s => %s\n",(int)sizeof(hc),hc,hx);

  /* Test if strings are equivalent */
  if (strcmp(hi,hc) == 0) {
    printf("String hi is the same as string hc\n");
  } else {
    printf("String hi is not the same as string hc\n");
  }

  /* hc has no terminator, so compare only its 12 bytes */
  if (strncmp(hx,hc,sizeof(hc)) == 0) {
    printf("String hx is the same as string hc\n");
  } else {
    printf("String hx is not the same as string hc\n");
  }

  /* Get string length */
  const char* il = "I love ";
  const char* sc = "scientific computing";
  int clen = strlen(il) + strlen(sc) + 1;

  /* Concatenate into another string */
  char ilsc[clen];
  ilsc[0] = '\0';
  strcat(ilsc,il);
  printf("First strcat: %s\n",ilsc);
  strcat(ilsc,sc);
  printf("Second strcat: %s\n",ilsc);

  return 0;
}
You can then compile it with:
gcc stringExamples.c -o stringExamples.ex
Let’s observe a few things from this example:
The functions strcpy and strncpy both copy a source string into a destination string. What is the difference? strcpy uses the null-terminator on the source string to determine the number of bytes to copy. This can fail in two ways: the destination may be too small, or the source string may be missing its null-terminator. strncpy behaves like strcpy but writes at most the number of bytes given by the third argument, which prevents buffer overflow. If no null-terminator appears within that many bytes of the source, strncpy does not add one, which is why the example sets hx[count-1] explicitly. Many other string functions come in pairs like this, one guarded and another unguarded.
The strcmp function compares two strings and returns 0 if they are the same. If they are not the same, it returns a non-zero integer whose sign gives the lexicographic order of the two strings.
The strlen function does not include the null-terminator when finding the length of a string. Does the usage here indicate why this is the case?
The strcat function concatenates strings, but only takes two arguments. That is, it concatenates the second onto the first. Note that the first call to strcat could be replaced by strcpy.
Exercise: Add a print statement that shows the third character of the string hi. Try ignoring the above advice and writing a new value over this character. Can you generate a compile-time error? How about a run-time error?
Writing formatted strings¶
We’ve already seen the flexibility of printf for writing to the terminal. What if we wanted to use that same flexibility to write strings? Consider this common problem: you have a code that generates data at each time step of a problem, and you want to save that data to disk before modifying it for the next time step. How can you generate filenames that include the time step number? Ideally, you could generate files like:
$ ls
sim_00000.dat
sim_00001.dat
sim_00002.dat
...
sim_00108.dat
and so on. We already know that we could print strings like this to the screen using:
for(int nT=0; nT<nSteps; nT++) {
  printf("sim_%05d.dat\n",nT);
}
To create files we will need to capture these into strings, not print them to the screen. This need is so common that there is a special function called sprintf which, you guessed it, writes formatted output into a string instead of onto the screen. We can repair the above by writing:
char fname[14]; /* 13 characters plus the null-terminator */
for(int nT=0; nT<nSteps; nT++) {
  sprintf(fname,"sim_%05d.dat",nT);
}
Just like before, there is also snprintf as a byte-limited version. We’ll see this in action in the next section on file I/O. This latter version can also be used to query the number of bytes needed in the destination buffer: its return value is the number of characters the full output requires, not counting the null-terminator. This gives a way to check how much space to allocate for the string before actually writing it.
Exercise: Re-write the string concatenation part of the above example to use sprintf instead. Try using snprintf to query the needed allocation size.
Reading from strings¶
Just as there is a function sprintf that writes formatted data into strings, there is sscanf that extracts formatted data from strings. This perhaps sounds a little strange, but we’ll find a use for it soon. In the interim, consider this small example:
/* File: stringScan.c
 * Author: Ian May
 * Purpose: Extract formatted data from a string
 */

#include <stdlib.h>
#include <stdio.h>

int main()
{
  const char *fdata = "latitude: 36.971944 longitude: -122.026389";
  double lat,lng;
  sscanf(fdata, "latitude: %lf longitude: %lf",&lat,&lng);

  printf("Santa Cruz has a latitude of %lf, and a longitude of %lf\n",lat,lng);

  return 0;
}
You can then compile it with:
gcc stringScan.c -o stringScan.ex
Exercise: Try modifying the format string in the call to sscanf. Can you make the call fail? What values are written on failure?