.. _ch02-fortran-debugging:

=============================================================
Fortran debugging
=============================================================

Print statements
----------------

Adding print statements to a program is a tried and true method of
debugging, and probably the most ubiquitous method in use. This is not
because it is the best method, but rather because it is the simplest. That
said, print statements are still incredibly useful, and in spite of their
simplicity they are worth looking at in a little more detail.

Print statements can be added almost anywhere in a Fortran code to print
values to the terminal as the program chugs along. You might want to put
some special symbols in debugging statements to flag them as such, which
makes it easier to tell which output is your debug output. It also makes
the statements easier to find later when you want to remove them from the
code, e.g. you might use ``+++`` or ``DEBUG``.

Recall that you can use the upper case file extension ``.F90``, which allows
the use of the C-style preprocessor. In this way, you can write your code to
print out debugging statements only when desired. Consider this example:

.. literalinclude:: ./codes/segfault.F90
   :language: fortran
   :linenos:

:download:`Download this code <./codes/segfault.F90>`

Here you can turn on the print statements at compile time by adding
``-DDEBUG_MODE`` to your compile flags (note the leading ``D``, which stands
for `define`).

Compiling with various gfortran flags
---------------------------------------

There are a number of flags you can use when compiling your code that will
make it easier to debug. Here's a generic set of options you might try:

.. code-block:: console

   $ gfortran -g -Wall -Wextra -fcheck=bounds -pedantic-errors \
         -ffpe-trap=zero,invalid,overflow,underflow program.f90

See :ref:`ch02-fortran-flags` or the `gfortran man page `_ for more
information.
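To make the effect of one of these flags concrete, here is a minimal sketch.
The program name ``fpe_demo`` and the file name ``fpe_demo.f90`` are
hypothetical and not part of the course codes. The program divides by a
variable that happens to be zero, a mistake that usually goes unnoticed at
run time by default.

.. code-block:: fortran

   program fpe_demo
     ! Hypothetical example, not one of the course codes.
     implicit none
     ! volatile keeps the compiler from folding the division away
     ! at compile time if optimization is enabled.
     real, volatile :: x
     real :: y

     x = 0.0
     y = 1.0 / x    ! floating-point division by zero at run time
     print *, 'y = ', y
   end program fpe_demo

Compiled with plain ``gfortran fpe_demo.f90``, this typically runs to
completion and prints an infinite value for ``y``. Adding
``-g -ffpe-trap=zero`` should instead stop the program with a floating-point
exception at the division, provided your platform supports trapping.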
Most of these options indicate that the program should give warnings or die
if certain bad things happen. Compiling with the ``-g`` flag indicates that
debugging information should be generated and saved inside the executable
during compilation. This information can be used to help debug the code
through a debugger such as ``gdb`` or ``lldb``. You generally need to compile
with this option for a debugger to show source lines and variable names.

The ``gdb`` debugger
----------------------

`GDB `_ is the GNU open source debugger for GNU compilers such as gfortran.
Unfortunately it often works poorly on macOS (GDB works better on Linux). You
may find that `lldb `_ works better on macOS, and functions in essentially
the same way.

See more on `GDB commands `_. Also, take a look at this nice
`GDB tutorial `_.

.. note::

   Due to the security policies on macOS, installing ``gdb`` there can be
   very painful. Reportedly, the situation has improved with the release of
   the `Catalina` version of macOS, and installation from ``homebrew`` should
   succeed. `High Sierra` had little to no compatibility, though you may try
   these `instructions `_. I recommend using a Linux system for ``gdb`` if
   possible, or using ``lldb`` on macOS. The commands for ``lldb`` and
   ``gdb`` are not quite the same, but this `command map `_ should help.

Consider the following example:

.. literalinclude:: ./codes/segfault1.f90
   :language: fortran
   :linenos:

:download:`Download this code <./codes/segfault1.f90>`

First compile the code with

.. code-block:: console

   $ gfortran segfault1.f90

and run it. You should see something like

.. code-block:: console

   Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

   Backtrace for this error:
   #0  0x7f9115d13d9f in ???
   #1  0x563962fa4202 in ???
   #2  0x563962fa42f3 in ???
   #3  0x7f9115cfeb24 in ???
   #4  0x563962fa40bd in ???
   #5  0xffffffffffffffff in ???
   [1]    34879 segmentation fault (core dumped)  ./a.out

Now if you compile it with a ``-g`` flag

.. code-block:: console

   $ gfortran -g segfault1.f90 -o segfault.ex

and run it again you should see something like:

.. code-block:: console

   Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

   Backtrace for this error:
   #0  0x7fce73accd9f in ???
   #1  0x55978c040202 in segfault1
           at //LectureF22/Fortran/segfault1.f90:8
   #2  0x55978c0402f3 in main
           at //LectureF22/Fortran/segfault1.f90:12
   [1]    34985 segmentation fault (core dumped)  ./a.out

The backtrace now names the source file and line number of each frame. Now
let's see what happens if we run it inside of GDB. We do this by passing the
executable name as an argument to GDB:

.. code-block:: console

   $ gfortran -g segfault1.f90 -o segfault.ex
   $ gdb segfault.ex

Note that GDB does not start running your program immediately. Instead it
gives you time to set up any breakpoints or mark any variables to watch.
Let's set a breakpoint at line 8 using ``break segfault1.f90:8`` and then run
the executable using ``r``. You should see something similar to this:

.. code-block:: console

   (gdb) break segfault1.f90:8
   Breakpoint 1 at 0x11ea: file segfault1.f90, line 8.
   (gdb) r
   Starting program: //LectureF22/Fortran/segfault.ex

   Breakpoint 1, segfault1 () at segfault1.f90:8
   8           a(i) = i
   (gdb) display a
   1: a = (0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   (gdb) display i
   2: i = 1
   (gdb) c
   Continuing.
      1.00000000

   Breakpoint 1, segfault1 () at segfault1.f90:8
   8           a(i) = i
   1: a = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   2: i = 200
   (gdb) c
   Continuing.
      200.000000

   Breakpoint 1, segfault1 () at segfault1.f90:8
   8           a(i) = i
   1: a = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   2: i = 399
   (gdb) c
   Continuing.
      399.000000

Continuing this, you will eventually see:

.. code-block:: console

   Breakpoint 1, segfault1 () at segfault1.f90:8
   8           a(i) = i
   1: a = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   2: i = 1195
   (gdb) c
   Continuing.
      1195.00000

   Breakpoint 1, segfault1 () at segfault1.f90:8
   8           a(i) = i
   1: a = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   2: i = 1394
   (gdb) c
   Continuing.

   Program received signal SIGSEGV, Segmentation fault.
   0x0000555555555202 in segfault1 () at segfault1.f90:8
   8           a(i) = i
   1: a = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
   2: i = 1394

This at least reveals that the error happened when the program tried to
access ``a(i)`` with ``i=1394``, which is way beyond the limits of the array.

The ``display`` command tells gdb to write the value of a variable to the
screen every time execution pauses, whereas ``print`` (or ``p``) shows the
value just once. The ``c`` command (short for ``continue``) tells gdb to
resume execution until the next breakpoint is hit or the program stops. You
can use the ``next`` or ``n`` command to run the code one line at a time,
stepping over function/subroutine calls. Finally, the ``step`` or ``s``
command will run one line at a time and attempt to step inside
function/subroutine calls.

In this example we've seen that the trouble comes from the ``i`` variable.
Instead of writing ``print i`` all the time we could use the ``watch``
command. This is kind of like a breakpoint, but it triggers every time the
value of the variable changes, regardless of what line caused the change.
Consider the following session:

.. code-block:: console

   (gdb) break segfault1.f90:8
   Breakpoint 1 at 0x11ea: file segfault1.f90, line 8.
   (gdb) r
   Starting program: //LectureF21/Fortran/segfault.ex

   Breakpoint 1, segfault1 () at segfault1.f90:8
   8           a(i) = i
   (gdb) watch i
   Hardware watchpoint 2: i
   (gdb) disable 1
   (gdb) c
   Continuing.
   <...>
   (gdb) c
   Continuing.
      1195.00000

   Hardware watchpoint 2: i

   Old value = 1195
   New value = 1394
   0x0000555555555291 in segfault1 () at segfault1.f90:7
   7         do i = 1, 5000, 199
   (gdb) c
   Continuing.

   Program received signal SIGSEGV, Segmentation fault.
   0x0000555555555202 in segfault1 () at segfault1.f90:8
   8           a(i) = i
   (gdb) c
   Continuing.

   Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

   Backtrace for this error:
   #0  0x7ffff799ed9f in ???
   #1  0x555555555202 in segfault1
           at //LectureF21/Fortran/segfault1.f90:8
   #2  0x5555555552f3 in main
           at //LectureF21/Fortran/segfault1.f90:12

   Program received signal SIGSEGV, Segmentation fault.

Then, after the failure, you can examine the backtrace with the command
``bt``, and visit different stack frames with ``frame <n>``. This is powerful
indeed! Note that some of the same information that we got interactively
before can be obtained from the stack frames. This will often be easier than
specifying ``print`` and stepping forward with ``s`` or ``n``. Just use
whatever suits the situation at hand!

.. note::

   Let's summarize the commands we've used:

   * ``break <file>:<line>`` sets a breakpoint at the specified line in that
     file. Execution will be paused each time that line is executed.
   * ``run`` or ``r`` starts the run of the executable.
   * ``print <var>`` or ``p <var>`` prints the current value of a variable
     once, while ``display <var>`` prints it every time execution pauses.
   * ``watch <var>`` watches a variable and pauses execution every time its
     value changes. The variable must be in the current scope, so you may
     need a breakpoint to get there first.
   * ``c``, ``n``, and ``s`` all resume execution after it has been paused.
     ``c`` runs until the next break or watch point, ``n`` runs line by line,
     and ``s`` runs line by line as well as into function/subroutine calls.
   * ``bt`` shows the backtrace, and ``frame <n>`` lets you move around in
     that trace to look around.
   * ``q`` exits gdb and kills the process if it is still running.

On the other hand, if you compile it with the ``-g`` and ``-fcheck=bounds``
flags (``-fbounds-check`` is a deprecated alias for the latter)

.. code-block:: console

   $ gfortran -g -fcheck=bounds segfault1.f90 -o segfault

and run it again (even without gdb), you now see

.. code-block:: console

   $ ./segfault
      1.00000000
   At line 8 of file segfault1.f90
   Fortran runtime error: Index '200' of dimension 1 of array 'a' above upper bound of 10

   Error termination. Backtrace:
   #0  0x55b0a0bb424c in segfault1
           at //LectureF22/Fortran/segfault1.f90:8
   #1  0x55b0a0bb4396 in main
           at //LectureF22/Fortran/segfault1.f90:12

The bounds check stops the program at the first out-of-bounds access
(``i=200``) and names the offending array, rather than letting the program
run on until it happens to touch invalid memory.

Valgrind
--------------------

`Valgrind `_ is a freely available open source programming tool for detecting
memory leaks and many other memory bugs; it also provides profiling
information. It was originally designed as a free memory debugging tool for
Linux, and is now also available on macOS, Solaris, and even Android.

To use Valgrind for debugging, first install it using one of the following
options:

#. Use a package manager

   * On Linux use your distribution's package manager (e.g.
     ``sudo apt install valgrind``, ``sudo pacman -S valgrind``, or
     whatever your distribution provides).
   * On macOS try ``homebrew`` (e.g., ``brew install valgrind``, followed by
     ``brew link valgrind`` if needed). If this doesn't work and generates
     complaints about ruby updates, please follow the instructions `here `_
     `and here `_.

#. Download the recent Valgrind release (e.g., 3.19 released April 11, 2022)
   from the Valgrind `website `_. Untar it (e.g.,
   ``tar -xvf valgrind-3.19.0.tar.bz2``), open the README and follow the
   steps therein. The installation should look something like this:

   .. code-block:: console

      $ ./configure --prefix=/usr/local/opt/valgrind
      $ make
      $ make install

   If the last command ``make install`` fails due to insufficient
   permissions, try ``sudo make install``.

.. note::

   The closed source nature of macOS means that Valgrind may or may not be
   fully supported depending on your version of macOS. If installation fails
   using your package manager then you should try installing it from source
   using the second option.

Recall our buggy code:

.. literalinclude:: ./codes/segfault1.f90
   :language: fortran
   :linenos:

:download:`Download this code <./codes/segfault1.f90>`

To use Valgrind, first ensure that you compile the code with various useful
debugging flags, and at the very least use the ``-g`` flag. E.g.:

.. code-block:: console

   $ gfortran -g -Wall -Wextra -Wimplicit-interface segfault1.f90 -o segfault1

Then pass the executable to Valgrind. There are many flags to control how
Valgrind behaves and what it reports. On macOS, it is suggested to include
``--dsymutil=yes``, but it is unnecessary on Linux.

.. code-block:: console

   $ valgrind --leak-check=full --dsymutil=yes --track-origins=yes ./segfault1

If you compile the code without the flags above (``-g`` is the most
important one), then you probably won't get useful information (e.g., line
numbers in the source files) from running Valgrind.

.. note::

   Valgrind and GDB satisfy different use cases. GDB is inherently
   interactive, letting you poke at the executable while it runs. It also has
   very low overhead, and can be used during fairly large runs of the code.
   However, you need to put more work into it to get very complete
   information out of it. Valgrind has much higher overhead, and can be hard
   to use on large runs of the code. In exchange for this overhead, Valgrind
   reports back very comprehensive information, especially around memory
   leaks and errors. Valgrind also includes a host of other more advanced
   tools. Together GDB and Valgrind (and good old print statements!) cover
   most debugging needs.

**Question:** Can you think of any types of errors/bugs that are *best*
handled by print statements compared to these more powerful and capable
tools?