Overview on Python Programming¶
Google drive folder with Jupyter Notebooks used in class: Jupyter Notebooks from class
This section gives a brief overview of Python programming, showing some of the basic syntax through short examples. We will study these in more detail later in the chapter.
Interactive Python¶
In what follows we’ll display the >>> prompt of the standard Python shell. To enter interactive mode, call python from your terminal:
$ python
>>>
If you want to exit interactive mode, you can type:
>>> exit()
$
Or just type ctrl+d (as we saw when running cat interactively).
As already mentioned, multi-line Python statements are best written in a *.py file rather than typed interactively at the prompt, though initially we’ll just play around with the interactive prompt.
When typing a line, if Python recognizes it as an unfinished block, it will give a line starting with three dots, like
>>> if 1>2:
...     print("oops!")
... else:
...     print("This is what we expect")
...
This is what we expect
>>>
Once done with the full command, press <return> at the ... prompt. This tells Python we are done and it will execute the command. Note that the print statements must be indented; otherwise you will see an error message like
IndentationError: expected an indented block
Now let’s talk more about why this happens.
Indentation: In Python indentation is everything!¶
Most computer languages have some form of begin-end structure, or opening and closing braces, or some such thing to clearly delineate scope. Good programmers generally also indent their code, so it is easier for a reader to see what is inside a loop, particularly if there are multiple nested loops. In most languages, including Fortran, indentation is just a matter of style. Consider this perfectly valid Fortran code:
!! /lectureNote/chapters/chapt03/codes/examples/noIndentationForFortran.f90
!!
!! In this example, we show that there is no need to write a Fortran routine
!! using proper indentations. Without indentations, it may look ugly but it
!! still compiles and runs.
!!
program noIndentationForFortran

implicit none
integer,parameter :: a=1
if (a>2) then
 print*,'oops!'
 else
 print*,'this is what we expect'
 end if



end program noIndentationForFortran
In Python, indentation is everything. There are no begin-end’s or curly braces; rather, scope is enforced and notated directly through indentation. Everything that is supposed to be at one level of a loop, or within a conditional block, must be indented to that level. Once the loop is done the indentation must go back out to the previous level. There are some other rules you need to learn, such as that the “else” in an if-else block like the one above has to be indented to the same level as the if. The same is true when defining a function, in which the function body must be indented at least one level beyond the def statement.
Before proceeding further, let’s have a quick look at a Python example which approximates \(\pi\) with a series:
"""
/lectureNote/chapters/chapt03/codes/examples/pi.py

Remarks:
1. Docstring goes here with triple double quotation marks.
   The end of the docstring is closed by another triple double
   quotation marks.

2. You can put a block of comment lines in this way, anywhere in
   the code.

3. If you want to comment an individual line, you can use #

"""


# Let's import NumPy library and give it a short nickname as 'np'
import numpy as np

def estimate_pi(threshold):

    # INDENTATION, INDENTATION, AND INDENTATION!!!
    print('pi estimator using threshold = ', threshold)

    error = 10*threshold  # initialize to be larger than the threshold
    n = 0                 # initialize counter
    pi_appn = 0           # initialize pi approximation

    while (error > threshold):
        pi_appn += 16.**(-n) * (4./(8.*n+1.) - 2./(8.*n+4.) - 1./(8.*n+5.) - 1./(8.*n+6.))

        # Putting 'a dot' followed by 'absolute' after 'np' means that
        # 'absolute' is one of the functions which is available and provided in
        # the NumPy module.
        error = np.absolute(pi_appn - np.pi)

        # 'augmented assignment statement', same as n = n+1
        n += 1

        # output to screen
        print(n, pi_appn, error)

"""
Block comment:
Now call the function, estimate_pi, using 10^(-14) as a threshold value.
Call the function only when the file is executed in the script mode, but not
when it is imported as a module (We will learn more on this soon!)
"""

print('Printing name:', __name__)

if __name__ == "__main__":
    # INDENTATION, INDENTATION, AND INDENTATION!!!
    estimate_pi(1e-14)
The result should look like:
$ python3 pi.py
Printing name: __main__
pi estimator using threshold = 1e-14
1 3.13333333333 0.00825932025646
2 3.14142246642 0.000170187167327
3 3.14158739035 5.26324321148e-06
4 3.14159245757 1.96022357457e-07
5 3.14159264546 8.1294566634e-09
6 3.14159265323 3.61705332352e-10
7 3.14159265357 1.69122493787e-11
8 3.14159265359 8.20232770593e-13
9 3.14159265359 4.08562073062e-14
10 3.14159265359 1.7763568394e-15
How many spaces to indent each level is a matter of style, but you must be consistent within a single code. The standard is 4 spaces, and all newly written code should conform to this. Indent with spaces rather than tabs: Python 3 raises an error when tabs and spaces are mixed inconsistently within a block, so the safest habit is to never use the actual TAB character. A good text editor will let you bind the tab key so that it inserts spaces.
You can also call the function estimate_pi in the above Python routine from other Python codes. For instance, if you have a routine called call_estimate_pi_from_pi.py which looks like:
"""
/lectureNote/chapters/chapt03/codes/examples/call_estimate_pi_from_pi.py

Remark:
1. In this caller routine, you can import pi.py as a module
   and use the estimate_pi function defined therein.

2. We will learn more on this soon.

"""

import pi

print(pi.__name__)

pi.estimate_pi(1e-14)
you can get the same result by running it in the script mode
$ python3 call_estimate_pi_from_pi.py
Printing name: pi
pi
pi estimator using threshold = 1e-14
1 3.13333333333 0.00825932025646
2 3.14142246642 0.000170187167327
3 3.14158739035 5.26324321148e-06
4 3.14159245757 1.96022357457e-07
5 3.14159264546 8.1294566634e-09
6 3.14159265323 3.61705332352e-10
7 3.14159265357 1.69122493787e-11
8 3.14159265359 8.20232770593e-13
9 3.14159265359 4.08562073062e-14
10 3.14159265359 1.7763568394e-15
Notice that the first line of the output says pi, rather than __main__. The __name__ value is set to __main__ only if the file is executed as a script (i.e., $ python3 pi.py), but not if it is imported as a module from another routine, as just shown.
We will see more on this in the following sections (Python scripts and modules) and take a look at how to import modules.
Ending and wrapping lines¶
Normally, in Python each statement is one line, and there is no need to delimit the end of a line. However, a semi-colon may be used to mark the end of a line which then allows multiple statements to be written on one line, such as:
>>> x = 5; print(x)
5
It is generally advised to avoid this for the sake of readability.
If a line of code is too long to fit on a single line, you can break it into multiple lines by putting a backslash at the end of a line:
>>> y = 3 + \
... 4
>>> y
7
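A backslash is not the only option. As a small sketch (the variable names here are purely illustrative), any expression enclosed in parentheses, square brackets, or curly braces may span several lines without a backslash, which is often easier to read and less fragile:

```python
# Implicit continuation: the open parenthesis lets the expression span lines.
total = (1 + 2 +
         3 + 4)

# The same idea works inside square brackets.
names = [
    "alpha",
    "beta",
    "gamma",
]

print(total)       # 10
print(len(names))  # 3
```

With a backslash, a stray space after the backslash is a syntax error, whereas the bracketed form has no such pitfall.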
Python data types¶
The built-in types in Python are:
Numerics
Sequences
Mappings
Classes
Instances
Exceptions
which are discussed in the documentation. Since Python is object oriented, class types allow for a wide array of custom user-defined types, as well as types provided by external libraries.
There are a few fundamental types that we should look at before proceeding. These are:
Numbers (immutable)
Strings (immutable)
Lists (mutable)
Tuples (immutable)
Dictionaries (mutable)
Types can be broadly classified as being either mutable, or immutable. We’ll discuss this more later. Numbers behave mostly as you would expect, so we won’t say too much about them for now. We’ll return to numerics in Python when we cover NumPy and related libraries.
Strings¶
Strings are specified using either single or double quotes, e.g.:
>>> s = 'some text'
>>> s = "some text"
are the same. This is useful if you want strings that themselves contain quotes of a different type
>>> s = ' "some" text '
>>> print(s)
"some" text
>>> s = " 'some' text "
>>> print(s)
'some' text
Strings are our first example of a sequence type. In this case they are sequences of characters. This means that you can access the characters through indices, iterate over them, and test for inclusion:
>>> s = "AM129 is great!"
>>> s[6]
'i'
>>> for c in s:
...     print(c)
...
A
M
1
2
9
 
i
s
 
g
r
e
a
t
!
>>> "AM" in s
True
You can also use triple double quotes, which allow you to write strings spanning multiple lines
>>> s = """Note that a ' doesn't end
... this string and that it spans two lines"""
>>> s
"Note that a ' doesn't end\nthis string and that it spans two lines"
>>> print(s)
Note that a ' doesn't end
this string and that it spans two lines
When we display s at the prompt, the line break at the end of the first line shows up as \n, the newline character. This is what is actually stored. When we print(s) it is rendered as a line break again.
You can put \n in your strings as another way to break lines
>>> print("This spans \ntwo lines")
This spans
two lines
We will learn more about strings as we go. For now you can check out some built-in String methods.
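As a quick taste, here is a minimal sketch of a few standard string methods applied to the string used above. Since strings are immutable, each method returns a new object and leaves s untouched; the variable names on the left are just for illustration.

```python
s = "AM129 is great!"

shouted = s.upper()                    # 'AM129 IS GREAT!'
words = s.split()                      # ['AM129', 'is', 'great!']
replaced = s.replace("great", "fun")   # 'AM129 is fun!'
position = s.find("is")                # 6, the index where "is" starts
starts = s.startswith("AM")            # True

print(shouted)
print(words)
print(replaced)
print(position, starts)
print(s)   # the original string is unchanged
```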
Docstrings¶
Often the first thing you will see in a Python script or module, or in a function or class defined in a module, is a brief description that is enclosed in triple quotes. Although ordinarily this would just be a string, in this special position Python stores it in a special attribute called __doc__. It is called the docstring because it is part of the documentation, and some Python tools automatically use the docstring in various ways.
For instance, these lecture notes are written with a Python code documentation tool called Sphinx. This tool can automatically generate HTML or LaTeX documentation for a code-base, and often makes use of these docstrings.
It’s a good idea to get in the habit of putting a docstring at the top of every Python file and function you write.
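A minimal sketch of how this looks in practice is given below. The function square is a made-up example, not part of the course codes; the point is only that the string placed directly under the def line becomes available as square.__doc__.

```python
def square(x):
    """Return x multiplied by itself."""
    return x * x

# The docstring is stored on the function object itself.
print(square.__doc__)   # Return x multiplied by itself.
print(square(3))        # 9
```

The same text is what help(square) displays in the interactive shell.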
Lists¶
Lists are ordered collections of (potentially disparate) objects. Just like strings, lists are a sequence type which can be iterated over and indexed into. Unlike strings, lists can store anything in each position.
Lists are declared using square brackets:
>>> L = [4,5,6]
>>> L[0]
4
>>> L[1]
5
Note that indexing starts at zero, and just like above we index into the list using square brackets.
Elements of a list need not all have the same type. For example, here’s a list with 5 elements
>>> import numpy as np
>>> L = [5, 2.3, 'abc', [4,'b'], np.cos]
Here’s a way to see what each element of the list is, and its type
>>> for index,value in enumerate(L):
...     print('L[%s] is %16s %s' % (index,value,type(value)))
...
L[0] is                5 <class 'int'>
L[1] is              2.3 <class 'float'>
L[2] is              abc <class 'str'>
L[3] is         [4, 'b'] <class 'list'>
L[4] is    <ufunc 'cos'> <class 'numpy.ufunc'>
Note that L[3] is itself a list containing an integer and a string and that L[4] is a function.
Here we used the function called enumerate. It returns an enumeration of any sequence object. A couple of examples are as follows:
>>> list(enumerate(L))
[(0, 5), (1, 2.3), (2, 'abc'), (3, [4, 'b']), (4, <ufunc 'cos'>)]
>>> list(enumerate(L,start=2))
[(2, 5), (3, 2.3), (4, 'abc'), (5, [4, 'b']), (6, <ufunc 'cos'>)]
Here we have wrapped the calls to enumerate inside the list constructor for the sake of readability. To appreciate the individual items here we’ll first need to talk about tuples.
One nice feature of Python is that you can also index backwards from the end: since L[0] is the first item, L[-1] is what you get going one to the left of this and wrapping around (similar to periodic boundary conditions in math terms, although only indices from -len(L) to len(L)-1 are valid):
>>> for index in [-1, -2, -3, -4, -5]:
...     print('L[%s] is %16s' % (index, L[index]))
...
L[-1] is    <ufunc 'cos'>
L[-2] is         [4, 'b']
L[-3] is              abc
L[-4] is              2.3
L[-5] is                5
In particular, L[-1] always refers to the last item in list L.
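The relationship between negative and non-negative indices can be checked directly. The sketch below uses a shortened version of the list above (without the NumPy function, so that it runs with no imports) and confirms that L[-k] is the same object as L[len(L)-k]. It also shows that an index beyond the end of the list raises an IndexError.

```python
L = [5, 2.3, 'abc', [4, 'b']]
n = len(L)

# L[-k] and L[n-k] refer to the very same object for k = 1, ..., n.
same = all(L[-k] is L[n - k] for k in range(1, n + 1))
print(same)   # True

# Indices outside -n, ..., n-1 are an error.
try:
    L[n]
except IndexError as err:
    message = str(err)
    print("IndexError:", message)
```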
Tuples¶
Tuples are essentially immutable versions of lists (more on mutability later!), and are again a sequence type. They are declared by grouping items together with parentheses. Consider:
>>> t = (1,'AM129',[3,4,'stuff'])
>>> t[1]
'AM129'
This declares t as a tuple with 3 elements inside it: first the integer 1, then a string, and finally a list that itself contains 3 elements.
You will frequently see tuples used to assign multiple items at the same time. Similarly, you will frequently see tuples unpacked through assignment to multiple items. Continuing from above we can do:
>>> i,s,l = t
>>> i
1
>>> s
'AM129'
>>> l
[3, 4, 'stuff']
This also explains how the enumerate call earlier worked. The enumeration is a type that pairs each entry in another sequence type with its index. Each pair of index and item is stored in a tuple:
>>> el = list(enumerate(l))
>>> el[0]
(0, 3)
>>> type(el[0])
<class 'tuple'>
Again, we have expanded the enumeration into a list so that we can see the individual pairings. A more typical usage is:
>>> for idx,val in enumerate(l):
...     print(idx,val)
...
0 3
1 4
2 stuff
In particular, idx and val are set by unpacking each tuple that enumerate returns.
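Tuple packing and unpacking is also the usual way for a function to hand back more than one value. The sketch below uses a hypothetical helper, min_and_max, written only for this illustration.

```python
def min_and_max(values):
    """Return the smallest and largest entries, packed into a tuple."""
    return min(values), max(values)   # the comma builds a tuple

result = min_and_max([3, 1, 4, 1, 5])
print(result)         # (1, 5)
print(type(result))   # <class 'tuple'>

# Unpack the tuple into two names in a single assignment.
lo, hi = min_and_max([3, 1, 4, 1, 5])
print(lo, hi)         # 1 5
```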
Iterators and range¶
We have already been iterating over sequence types, but haven’t explicitly talked about the loop syntax. Let’s remedy that now. This generally takes the form:
>>> for A in B:
...     # Do something with A...
Here, B can be any sequence type (or more specifically any iterable type), and A will be assigned each value inside B in turn throughout the loop. The loop body occurs on new lines after the colon, and must be indented correctly.
Let’s return to our enumerate example one more time:
>>> for idx,val in enumerate(l):
...     print(idx,val)
...
0 3
1 4
2 stuff
Here enumerate(l) is an iterable object, and idx,val are set according to each tuple the enumeration generates.
We can also loop over regular ranges of integers like we did in Fortran. The range class provides iterables that do exactly this:
>>> for idx in range(3):
...     print(l[idx])
...
3
4
stuff
This is typically paired with the len built-in function, which returns the length of any sequence type. Consider equivalently:
>>> for idx in range(len(l)):
...     print(l[idx])
...
3
4
stuff
Of course, this second approach is much less error prone.
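For comparison, the sketch below collects the three loop styles seen so far side by side, and also shows that range accepts optional start, stop, and step arguments, much like the bounds and stride of a Fortran do loop (with the difference that the stop value is excluded). The list names are illustrative only.

```python
l = [3, 4, 'stuff']

# 1. Loop over indices, then index into the list.
by_index = []
for idx in range(len(l)):
    by_index.append(l[idx])

# 2. Loop over the items directly (usually the simplest choice).
direct = []
for val in l:
    direct.append(val)

# 3. Loop over (index, item) pairs when both are needed.
pairs = []
for idx, val in enumerate(l):
    pairs.append((idx, val))

print(by_index == direct)        # True
print(pairs)                     # [(0, 3), (1, 4), (2, 'stuff')]

# range(start, stop, step): stop itself is never included.
print(list(range(0, 10, 2)))     # [0, 2, 4, 6, 8]
```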
Python objects¶
Python is an object-oriented language, which means that virtually everything you encounter in Python (variables, functions, modules, etc.) is an object of some class. There are many classes of objects built into Python and in this course we will primarily be using these pre-defined classes. For large-scale programming projects you would probably define some new classes, which is easy to do. We will take a look at some object-oriented programming with Python at the end of this chapter.
The type function can be used to reveal the type of an object
>>> import numpy as np
>>> type(np)
<class 'module'>
>>> type(np.pi)
<class 'float'>
>>> type(np.cos)
<class 'numpy.ufunc'>
We see that np is a module, np.pi is a double precision floating point real number (a standard Python float), and np.cos is of a special class that’s defined in the NumPy module.
The linspace function creates a numerical array, which is also of a special class defined within the NumPy module:
>>> x = np.linspace(0, 5, 6)
>>> x
array([0., 1., 2., 3., 4., 5.])
>>> type(x)
<class 'numpy.ndarray'>
Objects of a particular class generally have certain operations that are defined on them as part of the class definition. For example, NumPy numerical arrays have a max method defined, which we can use on x in two different ways
>>> np.max(x)
5.0
>>> x.max()
5.0
The first way applies the function max defined in the NumPy module np to x. The second way uses the fact that x, by virtue of being of type numpy.ndarray, automatically has a max method which can be invoked (on itself) by calling x.max() with no argument. Which way is better depends in part on what you’re doing.
Here’s another example
>>> L = [0, 1, 2]
>>> type(L)
<class 'list'>
>>> L.append(4)
>>> L
[0, 1, 2, 4]
L is a list (a standard Python class) and so has a method append that can be used to append an item to the end of the list.
See a list of methods for Python Lists.
Declaring variables?¶
In many languages, such as Fortran, you must declare variables before you can use them. Once you’ve specified that, say, x is a real number, this is the only type of data that can be stored in x. A statement like x = 'string' would not be allowed.
In Python you do not need to explicitly declare variables. The first usage of a variable acts to declare it. You can write:
>>> x = 3.4
>>> 2*x
6.8
>>> x = 'string'
>>> 2*x
'stringstring'
>>> x = [4, 5, 6]
>>> 2*x
[4, 5, 6, 4, 5, 6]
Here x is first used for a real number, then for a character string, then for a list. Note, by the way, that multiplication behaves differently for objects of each different type (which has been specified as part of the definition of each class of objects).
In Fortran, if you declare x to be a real variable then it sets aside a particular 4 bytes of memory for x, enough to hold one floating point number. There’s no way to store 6 characters or a list of 3 integers in these 4 bytes.
In Python it is often better to think of x as simply being a pointer that points to some object. When you type x = 3.4 Python creates an object of type float holding one real number and points x to that. When you type x = 'string' it creates a new object of type str and now points x to that, and so on.
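One way to watch this happen is to ask for the type of whatever x currently points to after each assignment. This is only a small sketch; the names first_type, second_type, and third_type are introduced here for illustration.

```python
x = 3.4
first_type = type(x).__name__    # 'float'

x = 'string'                     # x now points to a different object
second_type = type(x).__name__   # 'str'

x = [4, 5, 6]                    # and now to a list
third_type = type(x).__name__    # 'list'

print(first_type, second_type, third_type)   # float str list
```

The type belongs to the object, not to the name x, which is why the same name can be reused freely.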
Copying objects¶
One implication of the fact that variables are just pointers to objects is that two names can point to the same object, which can sometimes cause confusion. Consider this example
>>> x = [4,5,6]
>>> y = x
>>> y
[4, 5, 6]
>>> y.append(9)
>>> y
[4, 5, 6, 9]
So far nothing too surprising. We initialized y to be x and then we appended another list element to y. But take a look at x
>>> x
[4, 5, 6, 9]
We didn’t really append 9 to x; we appended it to the object that y points to, which happens to be the same object that x points to!
Failing to pay attention to this sort of thing can lead to many headaches.
What if we really want y to be a different object that happens to be initialized by copying x? We can do this by:
>>> x = [4,5,6]
>>> y = list(x)
>>> y
[4, 5, 6]
>>> y.append(9)
>>> y
[4, 5, 6, 9]
>>> x
[4, 5, 6]
This is probably what we wanted. Here list(x) creates a new object, which is a list, using the elements of the list x to initialize it, and y points to this new object. Changing this object doesn’t change the one x pointed to.
You could also use the copy module, which works for more general objects
>>> import copy
>>> y = copy.copy(x)
Things can sometimes be more complicated, particularly if the list x contains more complex objects. See http://docs.python.org/library/copy.html for more information.
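To make the complication concrete, here is a small sketch contrasting copy.copy (a shallow copy) with copy.deepcopy from the same standard module. A shallow copy creates a new outer list but still shares any inner objects, so changing a nested list through the copy also changes it in the original.

```python
import copy

x = [1, [2, 3]]
shallow = copy.copy(x)       # new outer list, same inner list object
deep = copy.deepcopy(x)      # new outer list and new inner list

shallow[1].append(99)        # mutates the inner list shared with x
deep[1].append(-1)           # only affects the deep copy

print(x)        # [1, [2, 3, 99]]
print(shallow)  # [1, [2, 3, 99]]
print(deep)     # [1, [2, 3, -1]]
```

Note that list(x) behaves like the shallow copy here, which is fine for lists of numbers or strings but not for nested lists.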
There are some objects that cannot be changed once created (immutable objects, as described further below). In particular, for floats and integers, you can do things like
>>> x = 3.4
>>> y = x
>>> y = y + 1
>>> y
4.4
>>> x
3.4
Here changing y did not change x, luckily. We don’t have to explicitly make a copy of x for y in this case. If we did, writing any sort of numerical code in Python would be a nightmare.
We didn’t because the line
>>> y = y + 1
above is not changing the object that y points to; instead it is creating a new object that y now points to, while x still points to the old object.
For more about built-in data types in Python, see the reference manual.
Above, we saw that the append method adds a new element to the end of an existing list. More generally, one can use the insert(index, newEntry) method, which adds newEntry before position index
>>> x
[4, 5, 6]
>>> x.insert(0,-10)
>>> x
[-10, 4, 5, 6]
>>> x.insert(2, -20)
>>> x
[-10, 4, -20, 5, 6]
Note
Like man pages in Linux/Unix, one can invoke the built-in Python function help to see a help page on the object under consideration. For example, try
>>> help(x)
or in IPython, you can use a question mark (?) as a shortcut for viewing help on an object, e.g.
In [4]: x?
and see if you can find insert, append, etc.
Note
It is also useful to combine type and help
>>> type(x)
<class 'list'>
>>> help(list)
Help also works on module names and their aliases
>>> import numpy as np
>>> help(np)
Don’t forget that you can always find more details by searching online. Python is heavily used and well documented online. That said, there are also hacky solutions to problems, so take online advice with a grain of salt.
In Python you can also initialize multiple variables in a compact way
>>> x = y = z = 2
>>> print(x, y, z)
2 2 2
>>> x,y = 5.1, 159 # this is in fact a tuple assignment without using (...)
>>> x
5.1
>>> y
159
Now if you want to swap x and y, you could simply do
>>> x,y = y,x
>>> x
159
>>> y
5.1
Mutable and Immutable objects¶
Some objects can be changed after they have been created while others can’t. Understanding the difference is key to understanding why the examples above concerning copying objects behave as they do. Here we briefly look at the following data types:
mutable: lists, dictionaries
immutable: strings, numbers, tuples
Note
The main difference between mutable and immutable objects is illustrated by the following:
id(mutable_object) does not change after changing mutable_object in place
id(immutable_object) does change after “changing” immutable_object, because the name is rebound to a new object
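The note can be checked with a few lines. In this sketch (all variable names are illustrative), a list keeps its identity when it is modified in place, whereas adding 1 to an integer produces a different object and rebinds the name to it.

```python
# Mutable: the list is modified in place, so its identity is unchanged.
items = [4, 5, 6]
id_before = id(items)
items.append(9)
list_same_object = (id(items) == id_before)
print(list_same_object)   # True

# Immutable: n + 1 builds a new integer object and n is rebound to it.
n = 1000
old_n = n                 # keep a second name on the original object
n = n + 1
int_same_object = (n is old_n)
print(int_same_object)    # False
print(old_n, n)           # 1000 1001
```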
Lists (mutable)¶
A list is a mutable object. The statement x = [4,5,6] above created an object that x points to, and the data held in this object can be changed without having to create a new object. The statement y = x points y at the same object, and since it can be changed, any change will affect the object itself and be seen whether we access it using the pointer x or y.
We can check this by
>>> id(x)
1823768
>>> id(y)
1823768
>>> x is y
True
The id function returns an integer that identifies the object; it is analogous to the location in memory where the object is stored (and is the memory address when using CPython). If you do something like x[0] = 1, you will still find that the ids have not changed: both names point to the same object, but the data stored in the object has changed (please check this for yourself). This also means that modifications to a mutable object can be done in place, which helps avoid copying things around in memory.
Strings and numbers (immutable)¶
Some data types are immutable. Once created, these objects cannot be changed. Integers, floats, tuples, and strings are immutable
>>> s = "This is a string"
>>> s[0]
'T'
>>> s[0] = 'b'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
You can index into a string, but you can’t change a character in the string.
The only way to change s is to redefine it entirely (rather than partially) as a new string (which will be stored in a new object)
>>> id(s)
1850368
>>> s = "New string"
>>> id(s)
1850128
What happened to the old object? It depends on whether any other variable was pointing to it. If not, as in the example above, then Python’s garbage collection would recognize it’s no longer needed and free up the memory for other uses. But if any other variable is still pointing to it, the object will still exist, e.g.
>>> s2 = s
>>> id(s2) # same object as s above
1850128
>>> s = "Yet another string" # creates a new object
>>> id(s) # s now points to new object
1813104
>>> id(s2) # s2 still points to the old one
1850128
>>> s2
'New string'
Tuples (immutable)¶
We have seen that lists are mutable. For some purposes we need something like a list but that is immutable (e.g. for dictionary keys, see below). A tuple is precisely the tool to achieve this. It is defined with parentheses (..) (though in principle you can define a tuple without parentheses) rather than the square brackets [..] used for lists
>>> t = (4,5,6) #this is same as t = 4,5,6
>>> t[0]
4
>>> t[0] = 9
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
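The dictionary-key use case mentioned above can be sketched briefly. Dictionaries are covered properly later; here the point is only that a tuple is accepted as a key while a list is rejected with a TypeError. The dictionary grid and its contents are made up for this illustration.

```python
grid = {}
grid[(0, 1)] = 'north'     # tuples are immutable, so they can be keys
grid[(1, 0)] = 'east'
print(grid[(0, 1)])        # north

# A list is mutable and cannot be used as a dictionary key.
list_key_failed = False
try:
    grid[[0, 1]] = 'oops'
except TypeError:
    list_key_failed = True
print(list_key_failed)     # True
```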
Example¶
Consider this example
"""
/lectureNote/chapters/chapt03/codes/examples/mutable_and_immutable/mutable_and_immutable.py

Difference between mutable and immutable objects

"""

def updateInt(n):
    n = n + 10


def updateArray(m):
    m.append(10)

a = 10         # a is an integer, so IMMUTABLE
L = [1, 3, 5]  # L is a list, so MUTABLE

print("before calling a function,")
print("a = ", a)
print("L = ", L)

# let's try to change the value of a and L
# by calling functions
updateInt(a)
updateArray(L)

print("after calling a function,")
print("a = ", a)
print("L = ", L)
We defined two functions, which try to update the argument passed into them. When you execute this file, the output would be different from what our intention may have been.
$ python3 mutable_and_immutable.py
before calling a function,
a = 10
L = [1, 3, 5]
after calling a function,
a = 10
L = [1, 3, 5, 10]
As you can see in the above result, only the mutable object, the list L, reflects the update. What happened to a? It did not get updated, but we also did not get any sort of error like we saw above. The name n only exists within the function body. The statement n = n + 10 pointed n at a new integer object, but that change had no way to propagate back to where the function was called from, so a still points to the original object.
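If the intent is for the caller to see the new value, the usual remedy is to return it and reassign. The sketch below uses two hypothetical functions, updated_int and appended, which are variations on the ones in the example file rather than part of it.

```python
def updated_int(n):
    """Return a new integer instead of trying to modify the argument."""
    return n + 10

def appended(m, item):
    """Return a new list with item added, leaving the argument untouched."""
    return m + [item]

a = 10
a = updated_int(a)        # rebind a to the returned object
print(a)                  # 20

L = [1, 3, 5]
L2 = appended(L, 10)
print(L)                  # [1, 3, 5]       (original list unchanged)
print(L2)                 # [1, 3, 5, 10]
```

Returning a new object keeps the behavior the same for mutable and immutable arguments, at the cost of creating a copy of the list.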
Comments¶
Anything following a # in a line is ignored as a comment (unless of course the # appears in a string)
There is another form of comment, the docstring, which is discussed in the Docstrings section.