Interacting with the OS from Python

os module: directory and file manipulation

The os module provides a portable way of using operating-system-dependent functionality. Many of its functions mirror familiar Bash commands (see Basic Unix/Linux Commands), which makes it quite intuitive to learn. For a more complete list of what is available, please see https://docs.python.org/3.7/library/os.path.html. Here we’ll take a short tour of what is included in this module.

Get the current directory as a string (Bash equivalent: pwd):

>>> import os
>>> os.getcwd()
'/Users/mickey_mouse/Documents/ucsc/2030_fall/am129/playground/python'

List the contents of a directory (Bash equivalent: ls):

>>> os.curdir
'.'
>>> type(os.curdir)
str
>>> os.listdir(os.curdir)  # alternatively, os.listdir('.')
['ch05_python_exception_debugging.rst', 'ch05_python_oop.rst', 'ch05_python_IO.rst', 'ch05_python_libraries.rst']

Change to another directory (Bash equivalent: cd):

>>> path = '../'
>>> os.chdir(path)
>>> os.getcwd()
'/Users/mickey_mouse/Documents/ucsc/2030_fall/am129/playground'

Make a directory (Bash equivalent: mkdir):

>>> os.mkdir('codes')
>>> 'codes' in os.listdir(os.curdir)
True

Rename a directory or file (Bash equivalent: mv):

>>> os.rename('codes','example_codes')
>>> 'codes' in os.listdir(os.curdir)
False
>>> 'example_codes' in os.listdir(os.curdir)
True

Remove an empty directory (Bash equivalent: rmdir):

>>> os.rmdir('example_codes')
>>> 'example_codes' in os.listdir(os.curdir)
False

Delete a file (Bash equivalent: rm):

>>> fp = open('junk.txt','w')
>>> fp.close()
>>> 'junk.txt' in os.listdir(os.curdir)
True
>>> os.remove('junk.txt')
>>> 'junk.txt' in os.listdir(os.curdir)
False

os.path: path manipulation

os.path provides common operations on path names. To try them out, first create an empty file:

>>> fp = open('junk.txt','w')
>>> fp.close()

>>> a = os.path.abspath('junk.txt') # returns the absolute path to a file/directory
>>> a
'/Users/mickey_mouse/Documents/ucsc/2030_fall/am129/playground/junk.txt'

>>> os.path.split(a)
('/Users/mickey_mouse/Documents/ucsc/2030_fall/am129/playground', 'junk.txt')

>>> os.path.dirname(a)  # alternatively, os.path.split(a)[0]
'/Users/mickey_mouse/Documents/ucsc/2030_fall/am129/playground'

>>> os.path.basename(a) # alternatively, os.path.split(a)[1]
'junk.txt'

>>> os.path.splitext(os.path.basename(a))
('junk', '.txt')

>>> os.path.exists('junk.txt') # alternatively, 'junk.txt' in os.listdir(os.curdir)
True

>>> os.path.isfile('junk.txt')
True

>>> os.path.isdir('junk.txt')
False

Many of these operations seem trivial, but consider the wide range of possible paths and all of the characters that might be present. How might you account for hidden files and directories that have a leading ‘.’? What about files with multiple extensions?
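To explore the questions above, here is a small sketch of how os.path.splitext treats hidden files and names with several extensions. The helper all_extensions is a hypothetical illustration, not part of os.path, and the file names are made up.

```python
import os


def all_extensions(name):
    """Collect every extension of name by applying os.path.splitext repeatedly."""
    exts = []
    root, ext = os.path.splitext(name)
    while ext:
        exts.insert(0, ext)             # keep the extensions in left-to-right order
        root, ext = os.path.splitext(root)
    return exts


# A leading '.' is treated as part of the name, not as an extension.
print(os.path.splitext('.bashrc'))          # ('.bashrc', '')

# Only the last extension is split off.
print(os.path.splitext('archive.tar.gz'))   # ('archive.tar', '.gz')

print(all_extensions('archive.tar.gz'))     # ['.tar', '.gz']
print(all_extensions('.bashrc'))            # []
```

Note that the helper treats every dot-separated suffix as an extension, so a name such as v1.2.txt would report '.2' as one of them. Whether that is what you want depends on your naming conventions.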

Running shell commands

You can also run shell commands directly (refer back to Basic Unix/Linux Commands) using os.system, though this won’t be as portable as the above operations were. Consider:

>>> cmd1 = 'ls -l'
>>> os.system(cmd1)
total 0
drwxr-xr-x  5 mickey_mouse  staff  160 May  3 19:34 fortran
-rw-r--r--  1 mickey_mouse  staff    0 May 29 18:33 junk.txt
drwxr-xr-x  4 mickey_mouse  staff  128 May 29 17:43 python
drwxr-xr-x  5 mickey_mouse  staff  160 May 14 12:25 sphinx
0

With os.system you can run almost any command you would normally type in the shell:

>>> os.system('touch foo')
0
>>> os.system(cmd1)
total 0
-rw-r--r--  1 mickey_mouse  staff    0 May 29 18:37 foo
drwxr-xr-x  5 mickey_mouse  staff  160 May  3 19:34 fortran
-rw-r--r--  1 mickey_mouse  staff    0 May 29 18:33 junk.txt
drwxr-xr-x  4 mickey_mouse  staff  128 May 29 17:43 python
drwxr-xr-x  5 mickey_mouse  staff  160 May 14 12:25 sphinx
0

What is the 0 that gets returned from these system calls?
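One way to investigate this question is to run a command whose exit status you control. The sketch below uses subprocess.run from the standard library, which is not covered above, as an alternative to os.system; it launches a second Python interpreter (via sys.executable) so that it does not depend on any particular shell command being installed.

```python
import subprocess
import sys

# A child process that prints a message and exits normally.
ok = subprocess.run([sys.executable, '-c', 'print("hello")'],
                    capture_output=True, text=True)

# A child process that exits with a status of our choosing.
bad = subprocess.run([sys.executable, '-c', 'import sys; sys.exit(3)'])

print(ok.returncode)        # 0
print(ok.stdout.strip())    # hello
print(bad.returncode)       # 3
```

Unlike os.system, subprocess.run takes the command as a list of arguments and can capture the output as a string, so the result can be used later in the script rather than only printed to the terminal.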

glob: pattern matching on files

The glob module provides convenient file pattern matching, much like ls *.txt in Bash lists all files with the .txt extension. For example, to find all files with a .py extension, or all files whose extension is exactly three characters long, from inside a Python script you can do:

>>> import glob
>>> glob.glob('*.py')
[ ... ]
>>> glob.glob('*.???')
[ ... ]

This returns a list of strings containing all filenames that matched. If you only need to iterate over the matches, and don’t need the list for anything else, you can use iglob, which returns an iterator instead:

>>> for fn in glob.iglob('*.py'):
...     print(os.path.splitext(fn)[0])
...
histogram
global
<and so on>
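Since the output above depends on what happens to be in your directory, here is a self-contained sketch that creates a few empty files in a scratch directory and then matches them. The file names are invented for illustration, and tempfile is used only to avoid cluttering your working directory.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as scratch:
    # Create a handful of empty files to match against.
    for name in ['histogram.py', 'plot.py', 'notes.txt', 'data.csv', 'Makefile']:
        with open(os.path.join(scratch, name), 'w'):
            pass

    # glob.escape protects any special characters in the directory name itself.
    base = glob.escape(scratch)
    py_files = sorted(os.path.basename(p)
                      for p in glob.glob(os.path.join(base, '*.py')))
    three_char = sorted(os.path.basename(p)
                        for p in glob.glob(os.path.join(base, '*.???')))

print(py_files)     # ['histogram.py', 'plot.py']
print(three_char)   # ['data.csv', 'notes.txt']
```

The results are sorted here because glob does not promise any particular ordering of the matches.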

sys module: system-specific information

We’ve encountered the sys module a few times now. This module provides system-specific information related to the Python interpreter. We’ve previously used it to look at the path for importing modules, and for access to command line arguments.

To see which platform and version of Python you are running, and where it is installed:

>>> import sys
>>> sys.platform
'linux'

>>> sys.version
'3.9.7 (default, Oct 10 2021, 15:13:22) \n[GCC 11.1.0]'

>>> sys.prefix
'/usr'

Recall that sys.path is a list of strings that specifies the search path for modules. It is initialized from the PYTHONPATH environment variable, along with installation-dependent defaults:

>>> sys.path
['/usr/bin',
'/usr/lib/python39.zip',
'/usr/lib/python3.9',
'/usr/lib/python3.9/lib-dynload',
'',
'/home/ian/.local/lib/python3.9/site-packages',
'/usr/lib/python3.9/site-packages',
'/usr/lib/python3.9/site-packages/IPython/extensions',
'/home/ian/.ipython']
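A script can use the same attributes to adapt to the interpreter it runs under. The following is a minimal sketch; the directory /hypothetical/project/lib is a made-up placeholder, and the 3.7 threshold is an arbitrary choice for illustration.

```python
import sys

# sys.version_info is a tuple-like object, which is easier to compare
# than the sys.version string.
major, minor = sys.version_info[:2]
is_recent = sys.version_info >= (3, 7)
print(f'Python {major}.{minor} on {sys.platform}, recent enough: {is_recent}')

# sys.path is an ordinary list, so it can be extended at run time.
extra = '/hypothetical/project/lib'     # placeholder path, not a real location
if extra not in sys.path:
    sys.path.append(extra)

print(sys.path[-1])     # /hypothetical/project/lib
```

Changes made to sys.path this way last only for the current interpreter session; setting PYTHONPATH in the shell is the usual way to make them persistent.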

Exercise

  1. Write a simple routine that removes the junk_1.txt, …, junk_n.txt files you created in the exercises of the previous section, Recap & Outlook.