.. _ch06_c_strings:

======================
Strings
======================

We've used strings a few times already, but have yet to formalize them. Strings are just arrays of characters with one extra feature: they can be passed around without consideration for their length. That is, we (mostly) don't need to pass an integer along with the array of characters. Why is this, and what makes characters so special?

Strings are arrays of characters that are *null-terminated*. This means that a string should always end with the special non-printing character ``'\0'``. The strings defined in the following code snippet are in fact identical:

.. code-block:: C

    const char hArr[] = {'H','e','l','l','o','\0'};
    const char *hLit = "Hello";

There are a few interesting things to note here:

* We only need to specify the termination character when we set the string from an array explicitly.
* The keyword ``const`` is present on each of these declarations.

  - It is unimportant for the first declaration.
  - In the second case, the string ``"Hello"`` is a string-literal and the compiler is free to place it into read-only memory. Therefore, we had better not modify it!

**Remark:** Does this last point remind you of anything from Python?

^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Functions acting on strings
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

There are many functions that act on strings. Since characters are really just single-byte integers, strings can also act as a low-level representation of memory, and many functions acting on strings don't immediately seem relevant to strings. We'll mostly ignore these, though it is good to know that they exist.

Actually, we've already used a few functions that act on strings. Consider the main function:

.. code-block:: C

    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int m = argc>1 ? atoi(argv[1]) : 2;
        /* ... */
        return 0;
    }

The function ``atoi`` is provided by ``stdlib.h`` and converts a given string into an integer if it can.
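One limitation of ``atoi`` is that it gives no way to tell a failed conversion from a legitimate zero. As a sketch of a more defensive alternative, the standard function ``strtol`` (also from ``stdlib.h``) reports where parsing stopped. The helper name ``parse_int`` below is hypothetical and is not part of the examples in this chapter:

.. code-block:: c

    #include <errno.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical helper: convert str to an int. Returns 1 on success and
       0 if str is not a complete, in-range decimal integer. */
    int parse_int(const char *str, int *out)
    {
        char *end;
        errno = 0;
        long val = strtol(str, &end, 10);
        if (end == str || *end != '\0')
            return 0;                      /* no digits, or trailing characters */
        if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
            return 0;                      /* value does not fit in an int */
        *out = (int)val;
        return 1;
    }

    int main(int argc, char **argv)
    {
        int m = 2;                         /* same default as the example above */
        if (argc > 1 && !parse_int(argv[1], &m)) {
            fprintf(stderr, "could not parse '%s' as an integer\n", argv[1]);
            return 1;
        }
        printf("m = %d\n", m);
        return 0;
    }

The check on ``end`` relies on the null-terminator: parsing only counts as complete if it stopped exactly at ``'\0'``.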
A few interesting functions acting on strings are demonstrated in the following example:

.. literalinclude:: ./codes/stringExamples.c
    :language: c
    :linenos:

:download:`Download this code <./codes/stringExamples.c>`

Then you can compile it by:

.. code-block:: console

    gcc stringExamples.c -o stringExamples.ex

Let's observe a few things from this example:

* The functions ``strcpy`` and ``strncpy`` both copy a source string into a destination string. What is the difference?

  - ``strcpy`` uses the null-terminator on the source string to determine the number of bytes to copy. This can fail in two ways: if the destination is too small, or if the source string is missing the null-terminator.
  - ``strncpy`` behaves like ``strcpy`` but writes at most the number of bytes given by the third argument, which prevents buffer overflow. Beware that if the source is at least that long, the destination is left without a null-terminator.
  - Many other string functions come in pairs like this, one guarded and another unguarded.

* The ``strcmp`` function compares two strings, and returns 0 if they are the same. If they are not the same, it returns a non-zero integer whose sign tells the lexicographic order of the two strings.
* The ``strlen`` function does not include the null-terminator when finding the length of a string. Does the usage here indicate why this is the case?
* The ``strcat`` function concatenates strings, but only takes two arguments. That is, it concatenates the second onto the first.

  - Note that the first call to ``strcat`` could be replaced by ``strcpy``.

**Exercise:** Add a print statement that shows the third character of the string ``hi``. Try ignoring the above advice and writing a new value over this character. Can you generate a compile time error? How about a run time error?

^^^^^^^^^^^^^^^^^^^^^^^^^^
Writing formatted strings
^^^^^^^^^^^^^^^^^^^^^^^^^^

We've already seen the flexibility of ``printf`` for writing to the terminal. What if we wanted to use that same flexibility to write strings?
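One standard tool for this is ``snprintf`` from ``stdio.h``, which accepts the same format strings as ``printf`` but writes into a character array of a stated size. The following is only a minimal sketch; the helper ``format_reading``, the buffer names, and the values are all made up for illustration:

.. code-block:: c

    #include <stdio.h>

    /* Hypothetical helper: write "<label> = <value>" into buf, which holds
       size bytes. Returns 1 if the whole string fit, 0 if it was truncated
       or an encoding error occurred. */
    int format_reading(char *buf, size_t size, const char *label, double value)
    {
        /* snprintf returns the length the full string would have had,
           not counting the null-terminator */
        int n = snprintf(buf, size, "%s = %.2f", label, value);
        return n >= 0 && (size_t)n < size;
    }

    int main(void)
    {
        char buf[32];
        if (format_reading(buf, sizeof buf, "temperature", 21.5))
            printf("%s\n", buf);

        char tiny[8];   /* deliberately too small */
        if (!format_reading(tiny, sizeof tiny, "temperature", 21.5))
            printf("truncated to: %s\n", tiny);
        return 0;
    }

Like ``strncpy``, this is the guarded member of a pair: the unguarded ``sprintf`` takes no size argument. Unlike ``strncpy``, ``snprintf`` always null-terminates the destination when the size is non-zero.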
Consider this common problem: you have a code that generates data at each time step of a problem, and you want to save that data to disk before modifying it for the next time step. How can you generate filenames that include a timestamp? Ideally, you could generate files like:

.. code-block:: console

    $ ls
    sim_00000.dat sim_00001.dat sim_00002.dat ... sim_00108.dat

and so on. We already know that we could print strings like this to the screen using:

.. code-block:: c

    /* nSteps is an assumed name for the number of time steps */
    for(int nT=0; nT<nSteps; ++nT)
        printf("sim_%05d.dat\n", nT);

The example ``stringScan.c`` can be compiled by:

.. code-block:: console

    gcc stringScan.c -o stringScan.ex

**Exercise:** Try modifying the format string in the call to ``sscanf``. Can you make the call fail? What values are written on failure?
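To tie the filename problem together, here is a hedged sketch that builds a name like ``sim_00108.dat`` with ``snprintf`` and then recovers the step number with ``sscanf``. It is not the contents of ``stringScan.c``; the helpers ``make_filename`` and ``parse_filename`` are hypothetical names chosen for this illustration:

.. code-block:: c

    #include <stdio.h>

    /* Hypothetical helper: write "sim_NNNNN.dat" for time step nT into buf.
       Returns 1 if the name fit in size bytes, 0 otherwise. */
    int make_filename(char *buf, size_t size, int nT)
    {
        int n = snprintf(buf, size, "sim_%05d.dat", nT);
        return n >= 0 && (size_t)n < size;
    }

    /* Hypothetical helper: read the step number back out of a name.
       sscanf returns the number of conversions it completed, so 1 means
       the %d was matched. Text after the %d is not verified by this count. */
    int parse_filename(const char *name, int *nT)
    {
        return sscanf(name, "sim_%d.dat", nT) == 1;
    }

    int main(void)
    {
        char name[32];
        for (int nT = 0; nT < 3; ++nT) {
            int step = -1;
            if (make_filename(name, sizeof name, nT) && parse_filename(name, &step))
                printf("%s -> step %d\n", name, step);
        }
        return 0;
    }

Checking the return value of ``sscanf`` is the design choice that matters here: it is the only indication of whether the format string actually matched the input.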