Python Overview¶

David E. Bernal Neira
Davidson School of Chemical Engineering, Purdue University

Table of Contents¶

Programming Languages
Python Basics
- Dynamic “typing”
- Pointers vs Objects
- Strong typing
- Data types
- Operators
- Functions
  - Anonymous
- Indenting
- Namespace
- Scope
Data Structures
- Built-In Structures
- List - Examples
- Slice notation
- Tuple - Examples
- Set - Examples
- Dictionary - Examples
Control Statements
- for - Examples
- while - Examples
- if/elif/else - Examples
Object Comprehension
- List Comprehension - Examples
- Set/Dictionary Comprehension - Examples
Classes
- Built-In Classes
- File Objects
- Custom Classes

# Detect whether this notebook is running on Google Colab
try:
  import google.colab
  IN_COLAB = True
except ImportError:
  IN_COLAB = False

If you are using Google Colab, you should save this notebook and any associated text files to their own folder on your Google Drive. Then you will need to adapt the following commands so that the notebook runs from the location of that folder. This is only necessary for the brief section on reading text files into Python.

# If you want to use Google Drive to save/load files, set this to True
USE_GOOGLE_DRIVE = False
if IN_COLAB and USE_GOOGLE_DRIVE:
    from google.colab import drive
    drive.mount('/content/drive')

    # Colab command to navigate to the folder holding the homework,
    # CHANGE FOR YOUR SPECIFIC FOLDER LOCATION IN GOOGLE DRIVE
    # Note: if there are spaces in the path, you need to precede them with a backslash '\'
    %cd /content/drive/My\ Drive/CHE597/Notebooks/1-Python_Overview

Introduction to Python¶

Throughout this course we will learn by doing. At the outset we’ll assume that you know nothing, but getting you up to speed on Python is critical because we’ll need to move to more advanced topics quickly. Here is your crash course on Python.

In the following sections keywords you should know and in-text code will be presented as keywords and code, respectively.

Programming Languages¶

Getting a computer to do something you want is always a matter of translation, because you and the computer (the bundle of connected hardware, including your display, RAM, hard drive, processor, and GPU) speak different languages. Your computer speaks something called “machine code”, or one step higher, closely related “assembly” language, which is almost impossible for humans to read. If we had to write everything in assembly language it would be very difficult to get anything done. So what has developed in modern computing is a system of higher-level programming languages that are human readable (with practice) that get converted into machine code by specialized translators called compilers. Thus, all of the programming languages that you commonly think of like C, Fortran, Python, Java, etc. have specialized vocabulary for common tasks like addition, matrix multiplication, writing files, and generating pictures, that the associated compiler knows how to turn into machine code for the computer to execute.

For example:

Check your understanding¶

In your own words, what is the difference between source code and machine code?
Why does Python feel easier to use than languages that require a separate compile step?

a = 1
b = 2
print(a+b)

is about the simplest Python program possible, but A LOT goes on behind the scenes to make things that simple.

For instance, we created two integers in the system’s memory (1 and 2). We didn’t tell the computer where to put those, we left it up to Python’s interpreter (a behind the scenes compiler) and the system’s operating system (which actually executed the compiled program) to just figure out whether to put those in RAM or on the disk drive, and to do it in a manner that didn’t conflict with other things that might be saved in those places.
We also attached labels to those integers (a and b), so that we could use them later. Creating pointers, or giving names to objects in a program, is something that comes naturally to us, but behind the scenes the compiler needs to do more hard work to provide this abstraction.
How did the computer know that 1 and 2 were integers? How did it know that we didn’t mean 1.0 and 2.0 (floats), or "1" and "2" (strings)? That’s a convention of the language and you have to play by its rules. In Python, a number without a decimal is interpreted as an integer and anything inside of single ('') or double ("") quotes is a string. In other languages you need to be more explicit.
How did the computer know what a+b meant? There are a lot of assumptions we made there without thinking about it. We assumed that Python knows how to interpret +. Thankfully it does. We also assumed that it would know that a and b pointed to things that could be added. What if a and b had been strings?

a = "one"
b = "two"
print(a+b)

onetwo

# Mini-exercise
# Write one sentence explaining why high-level languages exist.
answer = ""
print(answer)
# Expected output: one sentence you wrote

In this case the program interpreted + in a different way, to mean “add these strings together”. This is again something built into the language that we can take for granted in Python but will be different in other languages.
What is print()? I think everyone can guess that print() “prints” whatever is inside its parentheses, but how did the program know it could do this? There are a lot of capabilities that are built into every modern programming language. For instance, all modern programming languages assume every user will potentially need to do things like print out results (print()) and add (+). Some capabilities aren’t available by default and we’ll need to alert the program that we want to use them (we cover this in the importing section).
Finally, at what point did the human-readable commands above (source code) get turned into machine code? In Python this happens behind the scenes. Your code is first translated to an intermediate form (bytecode) and then executed by the Python runtime. You do not need to manually compile your source code, which is why Python is often called an interpreted language, in contrast to languages where a separate compile step produces a standalone executable. The details are beyond what you need right now; just know that Python handles these steps for you so you can focus on writing code.
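To make the bytecode step a little less abstract, the sketch below uses the standard-library dis module to list the instructions Python generates for a tiny function. The function name add_numbers is made up for illustration, and the exact instruction names differ between Python versions, so treat the printed list as indicative only.

```python
import dis

# Hypothetical helper, defined only so there is something to disassemble
def add_numbers(a, b):
    return a + b

# Collect the names of the bytecode instructions Python generated
instruction_names = [ins.opname for ins in dis.get_instructions(add_numbers)]
print(instruction_names)
```

You never need to do this in everyday work; it is only a way to peek at the intermediate form mentioned above.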

Python Basics¶

Dynamic “typing”¶

In Python, objects are one thing and their pointers (you might want to call them labels or variables) are another. As I told you above in the case of a=1, the integer was created somewhere in the available memory and we attached the pointer a to it so that we could use it later. Python automatically “knew” that we meant 1 to be an integer by convention. That is, the type of 1 was determined to be an integer on the fly. We call this aspect of a language “dynamic typing” and it means that the user doesn’t need to explicitly specify the type of an object. This can be a surprise for someone coming from the C-family or Fortran programming languages.
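As a small illustration of types being determined on the fly, the built-in type() function reports what Python decided for each object. The variable names below are arbitrary.

```python
# The type is inferred from how each literal is written
whole = 1      # no decimal point -> int
real = 1.0     # decimal point -> float
text = "1"     # quotes -> str

print(type(whole))
print(type(real))
print(type(text))
```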

Pointers vs Objects¶

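In a dynamically typed language, pointers are labels for objects in memory (like 1 above). They can be plucked off one object and placed on another. For example, the consecutive commands below first attach a to the integer 1 and then move it to the string 'one', so print(a) shows the string: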

a=1
a='one'
print(a)

one
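A short sketch of the same idea with a second label: moving a to a new object does not disturb the old object, which is still reachable through b. The label b is introduced here only for illustration.

```python
a = 1
b = a        # b now labels the same integer object as a
a = 'one'    # move the label a to a new string object

print(a)     # one
print(b)     # 1 (the integer is untouched and still labeled by b)
```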

Strong typing¶

Even though Python determines object types dynamically when they are created, the object type is fixed. We call this aspect of a programming language strong typing. In practical terms, all that this means is that Python will throw an error if you try to do something with an object that it is incapable of. Python will not automatically convert unrelated types (e.g., int to str) just to make your code run. Thus, a in the example above might be alternately assigned to an integer or a string, but operations on the object that a points to must be consistent with that object. The interpreter won’t guess your intentions. For example,

a=1
b=a+1
print(b)

works fine, but

a=1
b=a+"one"

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipython-input-1912392243.py in <cell line: 0>()
      1 a=1
----> 2 b=a+"one"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

throws an error, because string objects can’t be added to integers. Sometimes Python is forgiving, like if you add an integer and float (we’ll return to this), but it is best to be explicit.
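One way to see what "being explicit" looks like in practice is sketched below: the failing operation is wrapped in try/except so the cell keeps running, and the built-in str() and int() functions state which interpretation is intended. The variable names are illustrative.

```python
a = 1

try:
    b = a + "one"              # int + str is not defined
except TypeError as err:
    print("TypeError:", err)

# Explicit conversions state the intent unambiguously
as_text = str(a) + "one"       # string concatenation
as_number = a + int("1")       # integer addition
print(as_text)     # 1one
print(as_number)   # 2
```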

Data types¶

We’ve partly covered this but let’s be comprehensive. We’ll refer to objects as anything that is stored in memory and data as special types of objects that are single values. Data objects are the most elementary of the objects we will work with in Python and more complicated objects like functions and structures will use data objects as inputs. The basic data types that we commonly work with are int, float, str (string), complex, and bool (boolean). Python can handle all of these natively.

Check your understanding¶

What is the difference between an int and a float?
Give one example where Python implicitly converts a numeric type.

# Example 1: integer math
a = 1000
b = 2000
print(a+b)

# Example 2: float math
a = 1000.0
b = 2000.0
print(a+b)
print(type(a+b))

3000
3000.0
<class 'float'>

In the first example, adding two int objects returns an int result (you can tell by the fact that the printed number doesn’t have a decimal).

In the second example, adding two float objects returns a float result. We can also confirm this by printing out the type explicitly.

# Example 1: upcasting, integer math
a = 1
b = 3
print(a/b)

# Example 2: upcasting, mixed integer/float addition
a = 1000
b = 2000.0
print(a+b)

# Example 3: upcasting, float divided by integer
print(b/a)

0.3333333333333333
3000.0
2.0

In all three cases we obtained float results. This is expected behavior known as upcasting. When you ask Python to do something with an object, it will sometimes change the object’s type (if it can) in order to do what you asked.

Example 1: division of two int objects yields a fraction that is not expressible as an int. Python upcasts the int objects to float objects in order to perform the division. Older versions of Python (2.x) would not do this and would instead yield 0 as an answer (the int object that results from truncating the decimal digits).
Example 2: adding an int to a float gives a float result. This is another instance of upcasting, where the lower precision object (int) is converted to the type of the higher precision object (float) in order to carry out the operation. Older versions of Python also support this type of upcasting.
Example 3: dividing a float by an int gives a float result. Yet another example of upcasting. Because one operand is already a float, older versions of Python (2.x) give the same float result here.

In summary, Python is often smart enough to convert numeric data types for you in order to yield sensible results (e.g., int to float). It generally will not convert across unrelated types (e.g., int to str) without you asking for it. However, what is “sensible” to the developers may not be the behavior you intended, and it is important to be aware of upcasting because it will occur without warning and can be difficult to debug when it is occurring unintentionally. Upcasting can be great if you know what you are doing, but generally it is best practice to explicitly cast your results using the built in type functions.
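The sketch below contrasts the two division operators in Python 3 and checks the resulting types, which is a quick way to confirm whether upcasting happened. The variable names are arbitrary.

```python
true_quotient = 7 / 2     # "/" always returns a float
floor_quotient = 7 // 2   # "//" on two ints stays an int
mixed_sum = 1 + 2.0       # int + float is upcast to float

print(true_quotient, type(true_quotient))     # 3.5 <class 'float'>
print(floor_quotient, type(floor_quotient))   # 3 <class 'int'>
print(mixed_sum, type(mixed_sum))             # 3.0 <class 'float'>
```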

More examples:

# Example 1: Explicit casting, integer math
a = 2
b = 3
print(int(a/b))

# Example 2: Explicit casting, float math
a = float(2)
b = float(3)
print(a/b)

# Example 3: complex math
a = 1.0 + 1.0J
b = 1.0 + 5.0J
print(a+b)

# Example 4: string concatenation
a = "1.0"
b = '3.0'
print(a+b)

0
0.6666666666666666
(2+6j)
1.03.0

Example 1: We’ve used the standard Python function int() to convert the result to an int object. Note, even though 0.666666... rounds up to 1, int() does not round: it discards the fractional part (truncates toward zero), resulting in 0. For non-negative values this is the same as taking the “floor”, but for negative values it is not (int(-0.5) is 0, whereas the floor of -0.5 is -1).
Example 2: We’ve used the standard Python function float() to convert a and b to float objects prior to dividing them. There is no ambiguity about the result which is a float.
Example 3: This is just an example of adding complex numbers. a and b in this case are cast as complex objects since we used the J-notation when we defined them.
Example 4: “adding” two string objects doesn’t have mathematical meaning, but Python developers have given the + operator the definition of “concatenation” for string objects. Thus, adding two string objects returns a new string object with the second string attached to the first string. Note, we used double-quotes when we defined a and single-quotes when we defined b. Single and double quotes have the same effect (casting the object as a string). It is useful to have both available in case we want to make a string that contains quotes (e.g., a='the student answered "Python is my favorite programming language"').
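Because int() is easy to confuse with rounding, here is a brief comparison of three ways to turn a float into a whole number, using the standard-library math module. The value -2.7 is chosen only because the three results differ visibly.

```python
import math

value = -2.7
truncated = int(value)        # drops the fractional part (toward zero)
floored = math.floor(value)   # largest integer <= value
rounded = round(value)        # nearest integer

print(truncated)   # -2
print(floored)     # -3
print(rounded)     # -3
```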

What are functions and operators? I’ve snuck these concepts in without alerting you, but I think based on the above examples you might be able to guess what they are. Let’s give each a proper definition and further examples.

Operators¶

Operators are commands that “operate” on objects to their left and right and return a new object or value. This abstract notion is actually extremely intuitive. For example, + and / are arithmetic operators for addition and division of the objects on either side of the operator. < is a logical operator; it returns True when the object on the right is larger than the object on the left (Note: logical operators can be used on things that aren’t obviously numeric). is is another logical operator; it returns True when the pointers on the left and right point to the same object, and False when they don’t. is is similar to == (“equals”), except that == evaluates to True whenever two objects have the same value, even if they are different objects in memory. Note: using is for value comparison is a misuse because many equal values are not the same object (e.g., two different lists with the same contents). Use == for value equality and reserve is for identity checks like x is None. Here are some usage examples:

Note: See Python operator reference: https://docs.python.org/3/reference/expressions.html#operator-precedence

a=1
b=1
c=None

# Example 1: logical "==" operator
print(a == b)

# Example 2: logical "is" operator (identity check)
print(c is None)

# Example 3: logical ">" operator
print(100>1)

# Example 4: logical "<=" operator
print("abce"<="abcd")

True
True
True
False

Example 1: This is an example of using the == logical operator. This will be useful when we get to control structures like if statements where we want something to occur “only if”.
Example 2: This is an example of using the is logical operator for identity. It returns True only if they point to the same object in memory (e.g., x is None). It should not be used to compare values.
Example 3: This is an example of using the > operator. In this case, 100 is greater than 1 so it returns True.
Example 4: This is an example of using the <= (less than or equal to) operator on string data types. In string comparisons the characters are compared one position at a time until a difference is found. Each character’s numeric code (its Unicode code point) determines ‘greater than’ or ‘less than’; for lowercase letters this matches alphabetical order, so 'e' is “greater than” 'd' and the operator evaluates to False. Be careful when making logical string comparisons: every uppercase letter sorts before every lowercase letter, so 'Z' < 'a' is True.
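The difference between == and is is easiest to see with two separate lists that hold the same values. The names x, y, and z below are arbitrary labels.

```python
x = [1, 2, 3]
y = [1, 2, 3]   # a second, separate list with equal contents
z = x           # a second label on the first list

same_value = (x == y)       # True: the contents match
same_object = (x is y)      # False: two different objects in memory
shared_label = (x is z)     # True: both labels point to one object

print(same_value, same_object, shared_label)
```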

Functions¶

In Python, a function is a special type of object that takes other objects as input and does something or optionally returns other object(s) as output(s). Functions are the workhorses of most programs and they are more intuitive to see and use than this abstract definition makes them appear. At their most essential, functions allow you to reuse chunks of code for operation on new inputs or with tweaks to the parameters associated with the reused code.

A function is called in Python by using its name followed by parentheses (e.g., print(1) calls the print() function with the integer 1). Most, though not all, functions will accept some kind of object as an input. The inputted objects to a function are called its arguments, and these go inside the parentheses. We’ve seen several examples of functions already: print(), int(), float(). These are functions that are standard in Python and can be used by default. print() takes in an argument but doesn’t return anything. int() takes in an object that is convertible to an integer and returns an int object. For example:

Note: See defining functions: https://docs.python.org/3/reference/compound_stmts.html#function-definitions

Check your understanding¶

What is the difference between a function’s arguments and its return value?
Why does print() return None?

# example 1: calling the print() function on an integer
a=print(1)

# example 2: print() does not return an object
print(a)

1
None

Example 1: we call the print() function with the int 1. Very sensibly, print(1) prints the number to standard output (i.e., the place on your computer that displays output, in notebooks that is right below the code block). We also try to assign the label a to whatever object that print returns.
Example 2: we call print() on the object that a points to. The result is None, Python’s placeholder for “no value”, confirming that print() does not return anything useful.

return: When we say that a function returns something, we mean that the function creates an object that we intend to keep for use in the rest of the program. Some functions, like print(), return only None. Some functions return several objects.

arguments: the objects that we pass to a function are its arguments. These can be optional or required depending on the function. When multiple arguments are supplied they are separated by a comma (,). For example, print() accepts one or more arguments that are convertible into strings. If an argument is required, the function will fail if you don’t supply it. Required arguments are usually positional, meaning that when you call the function it expects a given argument in a given position (see example 1 below). Functions may also have optional arguments that take on a default value when they are not supplied by the user (see example 4 below). In a function definition, optional arguments are always listed after the required positional arguments.

In summary, functions are nothing more than packages of reusable code for things we commonly need to do. It would be very inconvenient to rewrite the print() function every time we wanted to use it. Likewise, we can define our own functions for specific tasks that will be repeated in our programs. For example:

# example 1: function that returns an object (a and b are positional arguments)
def add(a,b):
    return a+b

# example 2: function that returns multiple objects
def add_sub_mult_div(a,b):
    return a+b,a-b,a*b,a/b

# print results
print(add(1,3))
print(add_sub_mult_div(1.,3.))

# example 3: label results for reuse
a,b,c,d = add_sub_mult_div(1.,3.)
print("a: "+str(a))
print("b: "+str(b))
print("c: "+str(c))
print("d: "+str(d))

4
(4.0, -2.0, 3.0, 0.3333333333333333)
a: 4.0
b: -2.0
c: 3.0
d: 0.3333333333333333

Example 1: We define a simple add function that has a and b as inputs and returns a+b as an output. def name(input1,input2,...): is the syntax in Python for defining a function. We test out this function by calling add(1,3) inside of print() with the result that 4 is printed to the screen. Note the order of operations: add(1,3) returns the int 4, which is then passed as an input to print().
Example 2: We define a function that has a and b as inputs and returns a+b,a-b,a*b,a/b. When a function returns several comma-separated values, Python packs them into a single tuple, so print() of add_sub_mult_div(1.,3.) shows all four floats in parentheses: (4.0, -2.0, 3.0, 0.3333333333333333).
Example 3: If we want to save the outputs from a function for future use, we can name them with an assignment (=). Since add_sub_mult_div() returns four objects, we put four labels on the left-hand side of the =, separated by commas, and we call the function on the right-hand side. When we print the individual results they are the same as in Example 2.

Note: add(1,3) returned an integer, whereas add_sub_mult_div(1.,3.) returned floats. Functions aren’t magic, they simply do what you program them to do. As written, add() returns an object of the same type as its inputs, while the a/b term in add_sub_mult_div() always yields a float, even for int inputs. Likewise, if you supply incompatible inputs to the function, you will get an error (e.g., add_sub_mult_div('one','three')).

Finally, let’s reformulate example 1 so that the second argument b is optional. In this case, it acts the same as the original function when two arguments are supplied, but defaults to adding 0 when only one argument is supplied.

# example 4: add function with an optional second argument (a is positional, b is optional)
def add(a,b=0):
    return a+b

# print results
print(add(1,3))
print(add(1,b=3))
print(add(1))

4
4
1

Note: add(1,3) and add(1,b=3) are equivalent. For optional arguments, you can supply them out of order to the function as long as you specify their name. It is generally more robust to specify the name of optional arguments that you are specifying in case the positional ordering of the function arguments is updated in later versions of the function.
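With more than one optional argument the benefit of naming them becomes clearer. The function scale() and its parameters factor and offset are hypothetical, invented for this sketch.

```python
# Hypothetical function: x is required, factor and offset are optional
def scale(x, factor=1.0, offset=0.0):
    return x * factor + offset

by_position = scale(2.0, 3.0, 1.0)             # relies on argument order
by_name = scale(2.0, offset=1.0, factor=3.0)   # order no longer matters
defaults_only = scale(2.0)                     # both defaults are used

print(by_position, by_name, defaults_only)     # 7.0 7.0 2.0
```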

Note: a method is like a function except that it is associated with a specific class of object. For example, all string objects have the .format method associated with them. Methods are covered in the later section on classes, but there will be some references to them before then, so I want to plant the seed now.
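As a quick taste of the object.method() calling style, the sketch below uses two methods that every string object carries, format() and upper(). The variable names are arbitrary.

```python
template = "Hello, {}!"

# format() fills the {} placeholder and returns a new string
greeting = template.format("Python")
print(greeting)       # Hello, Python!

# upper() returns an upper-case copy; the original string is unchanged
shouted = greeting.upper()
print(shouted)        # HELLO, PYTHON!
```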

Anonymous¶

There is a second way of defining functions in Python that is common enough to be included in the discussion. Specifically, Python allows you to define functions on a single line with the keyword lambda followed by the input arguments and a single expression that is evaluated and returned. These functions are called anonymous because they don’t have a name. Specifically, the expression just creates a function object that the user can assign to a pointer or use in other expressions the same way any other function might be used.

The specific syntax for defining an anonymous function is lambda a,b,c: expression where a,b,c are input arguments to the function and expression is a python expression involving the arguments a,b,c. This is best illustrated with some examples:

sqroot = lambda x: x**(0.5)

print(sqroot)
print("sqroot(100): ",sqroot(100))

# Advanced example
numbers = range(11)
print(list(filter(lambda x: x%2==0,numbers)))
print(list(map(lambda x: x%2==0,numbers)))

<function <lambda> at 0x79711140cd60>
sqroot(100):  10.0
[0, 2, 4, 6, 8, 10]
[True, False, True, False, True, False, True, False, True, False, True]

Note: the first print statement reflects that the pointer sqroot points to a function object.

Note: in the second print statement we call sqroot() like we would any other function.

Note: in the last two examples, anonymous functions that return a boolean indicating whether a number is even are passed to the built-in functions filter and map. These functions haven’t been covered yet, but a common use case for anonymous functions is passing an expression as an argument to another function.
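Another common place to pass an anonymous function is the key argument of the built-in sorted(), which decides what each item is sorted by. The word list below is made up for illustration.

```python
words = ["banana", "fig", "apple"]

# Sort by word length instead of alphabetically
by_length = sorted(words, key=lambda w: len(w))
print(by_length)   # ['fig', 'apple', 'banana']
```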

Indenting¶

How did Python know where the function definitions ended? In most programming languages you would use a “start” keyword, like “def”, and a separate “end” keyword to let the compiler know that you were finished defining your function. In Python, instead of using “end” keywords for the various situations that require them, you instead use indenting to specify when code is part of the function and when it is part of the rest of the program. Indenting is used in this way anytime you are doing something that seems to logically require an “end” statement (e.g., this will figure prominently in the control statements section below).
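The sketch below marks which lines belong to which block purely by their indentation. The function describe() is hypothetical, and it uses an if/else statement, which is covered in the control statements section.

```python
def describe(n):
    # Indented one level: part of the function body
    if n % 2 == 0:
        # Indented two levels: part of the if branch
        label = "even"
    else:
        label = "odd"
    return label

# Back at the left margin: no longer part of describe()
result = describe(3)
print(result)   # odd
```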

Namespace¶

In the examples above, inside the functions the labels a and b have one meaning, but outside the functions, we use a and b to label outputs. How come these don’t conflict in some way? Namespace is the term used for how programming languages organize object labels. The short answer is that each function call (and later, each class) gets its own local namespace: the labels created inside it are kept separate from the labels in the rest of the code, so they cannot conflict. You can make an object “global” so that it is accessible everywhere in your code, but this shouldn’t be done naively and is beyond the scope of this intro. Also note that when you call a function (e.g., add(1,3) in the example above), Python automatically assigns the inputs to the function labels based on position (e.g., a and b in the case of add()); you don’t have to manually do that.
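A minimal sketch of two labels with the same name living in different namespaces: the function double() (a made-up name) rebinds its own a, and the a outside the function is unaffected.

```python
a = "outer"

def double(a):
    a = a * 2    # rebinds the function's own label a only
    return a

result = double(5)
print(result)   # 10
print(a)        # outer
```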

Scope¶

Similar to namespace, scope refers to which objects can be accessed by different parts of your program. For example, a function has access to data objects that are passed as inputs, data objects that are defined in the function itself, and other functions that have already been defined in the program, imported into it (more on this later), or built in to Python. Likewise, the rest of the program cannot access the objects generated by a function unless they are returned and given labels. For example:

# add function that returns an object
def add(a,b):
    return a+b

# subtract function that returns an object
def sub(a,b):
    return a-b

# example 1: function that calls other functions
def add_sub(a,b):
    return add(float(a),float(b)),sub(float(a),float(b))

print(add_sub(1,3))

(4.0, -2.0)

Example 1: In this example, the function add_sub() calls the user-defined functions add() and sub() and also uses the built-in function float() to cast the inputs to floats before passing them on. Note, add_sub() would not work if add() and sub() were defined after the first call of add_sub() (e.g., try moving the definition of add() to after the print() statement).

Scoping is a way of avoiding conflicts in complicated code so that the user is always working with the objects that they intend. Again, “global” objects are exceptions to this rule but will not be covered here.
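The other direction of scoping can be sketched as follows: a label created inside a function is not visible to the rest of the program. The names compute() and hidden are hypothetical, and try/except is used only so the cell keeps running.

```python
def compute():
    hidden = 42          # exists only while compute() is running
    return hidden + 1

value = compute()
print(value)             # 43

name_error_raised = False
try:
    print(hidden)        # hidden was never defined out here
except NameError as err:
    name_error_raised = True
    print("NameError:", err)
```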

# Mini-exercise
# Create variables a and b of different types.
a = ""  # TODO
b = ""  # TODO

# Valid operation (fill one):
# print(???)

# Invalid operation (leave commented):
# print(a + b)
# Expected output: a valid result from your chosen operation

Data Structures¶

Built-In Structures¶

We’ve already covered individual data objects like floats, ints, etc. In any practical application it will quickly become inconvenient to work with objects on an individual basis and we will want to collect objects together--that’s what data structures are for. Thankfully in Python the most useful structures come built-in or can easily be imported from standard packages. As with many things, it is easier to show than tell, so let’s run through the definitions and demonstrate their use:

list: lists are the structures that store data in order. Lists have built-in methods (see below) for adding more data, removing data, modifying data, and accessing data in the list. You can modify objects in the list without creating a whole new list, for this reason we say that lists are mutable. Defined using [] notation (check examples below).
tuple: tuples are structures that store data in order but are immutable. That is, tuples are like lists, except that you can’t add or remove from them without creating a whole new object in memory. Sometimes you need things that are immutable (e.g., the next two structures require immutable objects!). Defined using () (check examples below).
set: sets are like lists but they are unordered and no duplicates are allowed. Unordered? Yes, sets collect objects but don’t keep track of where they were put. On a technical side, it is actually more accurate to say that sets are hyperspecific about where they’ve put things (internally) and so you as the user cannot modify that order. This makes sets very efficient at calculating membership (e.g., is “1” in our set?) and intersections (common objects in multiple sets). If you wanted to know if the numbers 7, 23, and 108 existed in three lists or three sets, it would be much faster to do the latter. Defined using {} with comma-separated values (e.g., {1,2,3}) or set(list) notation; note that an empty {} creates a dictionary, so use set() for an empty set. (check examples below).
dictionary: dictionaries are structures that store objects in pairs. Each pair consists of a key and a value. In analogy, a phone book is a dictionary, where the key is a person’s name, and the value their phone number. You could accomplish the same thing using two sets of ordered lists, but dictionaries have the advantage that their keys are stored...as...a...set! Returning to our phonebook example, this would make it really easy to figure out if someone is in the phonebook (otherwise you would need to read every name). It also means that there can’t be two identical keys in your dictionary. Defined with {} notation with key:value pairs. (check examples below)
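Before the detailed examples, here is a compact sketch that touches each of the four structures once. All names and values (including the phone numbers) are invented for illustration.

```python
numbers_list = [7, 23, 108, 7]          # ordered, mutable, duplicates kept
numbers_tuple = (7, 23, 108)            # ordered, immutable
numbers_set = set(numbers_list)         # unordered, duplicates removed
phone_book = {"Ada": "555-0100", "Grace": "555-0199"}   # key:value pairs

print(len(numbers_list), len(numbers_set))   # 4 3
print(23 in numbers_set)                     # True
print("Ada" in phone_book)                   # True (membership checks keys)
print(phone_book["Grace"])                   # 555-0199

# Tuples cannot be modified in place
tuple_is_immutable = False
try:
    numbers_tuple[0] = 1
except TypeError as err:
    tuple_is_immutable = True
    print("TypeError:", err)
```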

List - Examples¶

# example 1 - creating a list of ints
a = [1,2,3,4]
print("a:"+str(a))

# example 2 - adding an element and creating a new list
b = a + [100]
print("b:"+str(b))

# example 3 - removing an element from the list and storing it in a new variable
c = b.pop(0)
print("c:"+str(c)+", b:"+str(b))

# example 4 - returning the minimum value
d = min(a)
print("d:"+str(d))

# example 5 - grabbing the first item in a list (without removal)
e = a[0]
print("e:"+str(e))

# example 6 - grabbing the last object in a list (without removal)
f = a[-1]
print("f:"+str(f))

# example 7 - grabbing the first three objects in a list (without removal)
g = a[:3]
print("g:"+str(g))

# example 8 - grabbing the last three objects in a list (without removal)
h = a[-3:]
print("h:"+str(h))

# example 9 - return all objects from the list (without removal)
i = a[:]
print("i:"+str(i))

# example 10 - grabbing every other value from a list (without removal)
j = a[::2]
print("j:"+str(j))

# example 11 - reassign the first element to 100000.0
a[0] = 100000.0
print("a:"+str(a))

a:[1, 2, 3, 4]
b:[1, 2, 3, 4, 100]
c:1, b:[2, 3, 4, 100]
d:1
e:1
f:4
g:[1, 2, 3]
h:[2, 3, 4]
i:[1, 2, 3, 4]
j:[1, 3]
a:[100000.0, 2, 3, 4]

Example 1: We create a list using [], or “square-bracket” notation. Python will interpret any square brackets that are not attached to a pointer (e.g., Example 5 is square brackets attached to a pointer) as a list. In this case there are four objects in the list (i.e., 4 int objects) and they will be stored in the order they were added.
Example 2: Adding two lists together is interpreted as “extend” or “concatenate”, with the first list followed by the second list. In this case the second list only has a single element (100), so the new list will have 5 objects.
Example 3: Two things are happening here that need explanations. First, we are using structure.method notation for the first time. A method is just a name for a function that is attached to a structure or class (we’ll discuss classes later). Methods come with the structure (list in this case) for free, and they usually do something to the structure they are attached to; that’s why it makes sense to define the functionality as a method and not as an ordinary function. In this case the method is pop(), and it removes an element from the list based on which index you supply it as an argument. Since lists are 0-indexed, the 0 corresponds to the first element in the list. Second, pop() both modifies the list and returns the removed element: b.pop(0) removes the first element, 1, from b and assigns it to the pointer c.
Example 4: min() and max() are functions that ship with Python by default. These functions accept lists, tuples, sets and other structures and return the minimum and maximum value (without removing it from the structure itself). Here we assign the minimum value the label d.
Example 5: You can access one or more objects from lists using “slice” notation (i.e., square brackets [] attached to a list or its pointer). In this case a is a pointer to the list [1,2,3,4], so a[0] returns the first object in the list (remember, everything in Python is 0-indexed).
Example 6: When using [] notation with negative integers the indexing will count from the end of the list. In this case a[-1] returns the last element, a[-2] would have returned the second from last, etc.
Example 7: Index ranges can be specified using a colon : . If you use [start:stop] then the objects between index start (inclusive) and stop (exclusive) will be returned as a new list. If you don’t specify start or stop, the slice automatically runs from the beginning or to the end of the list, respectively. In this case, a[:3] returns all objects from the beginning of the list through the object at index 2 as a new list.
Example 8: Colon notation can also be used with negative integers for specifying indices relative to the end of the list. In this case, a[-3:] returns all objects from the third-to-last through the end of the list as a new list.
Example 9: If neither the start nor the stop index is specified then all objects are returned. In this case, the list i would hold the same objects as the list a.
Example 10: An optional “stride” or “step” can be specified when using slice notation. When it isn’t specified a default of 1 is used. In this example, a[::2], neither the start nor the end index is set and we set the step to 2, so every other object is returned from the list.
Example 11: Since lists are mutable we can reassign their values. In this case, note that the resulting list has a mixture of int and float objects.

Slice notation¶

Since it is so important, here’s a quick summary of slice notation. list[start:stop:step] returns every step-th object from index start up to, but excluding, index stop. list[start:] returns objects from start through the end of the list, list[:stop] returns objects from the beginning of the list up to (but excluding) the object at index stop, list[:] returns all objects, and list[::step] returns objects every step apart. You can use slice notation for accessing objects from any ordered data structure (e.g., tuple).
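To make the summary concrete, here is a small sketch that runs each slice pattern on one list. The list letters and the variable names are made up for illustration; the negative step in the last line is not discussed above, but it follows the same start:stop:step rule.

```python
# Hypothetical list used only to illustrate the slice patterns
letters = ["a", "b", "c", "d", "e", "f"]

first_three = letters[:3]      # start omitted: begin at index 0, stop before index 3
from_third = letters[2:]       # stop omitted: run through the end of the list
middle = letters[1:4]          # indices 1, 2 and 3 (index 4 is excluded)
every_other = letters[::2]     # step of 2 over the whole list
odd_positions = letters[1:5:2] # indices 1 and 3
last_two = letters[-2:]        # negative start counts from the end
full_copy = letters[:]         # a new list holding the same objects
backwards = letters[::-1]      # a negative step walks the list in reverse

print(first_three)    # ['a', 'b', 'c']
print(from_third)     # ['c', 'd', 'e', 'f']
print(middle)         # ['b', 'c', 'd']
print(every_other)    # ['a', 'c', 'e']
print(odd_positions)  # ['b', 'd']
print(last_two)       # ['e', 'f']
print(backwards)      # ['f', 'e', 'd', 'c', 'b', 'a']

# The same notation works on other ordered structures, such as tuples
tuple_slice = ("a", "b", "c", "d")[1:3]
print(tuple_slice)    # ('b', 'c')
```

Note that every slice returns a new object, so full_copy can be modified without touching letters.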

Tuple - Examples¶

# example 1 - creating a tuple 
a = (1,2,3,4)
print("a:"+str(a))

# example 2 - extending a tuple
b = a + a
print("b:"+str(b))

# example 3 - finding the min of a tuple
c = min(a)
print("c:"+str(c))

# example 4 - returning the last two objects
d = a[-2:]
print("d:"+str(d))

# example 5 - try to reassign the first element to 100000.0 
a[0] = 100000.0

a:(1, 2, 3, 4)
b:(1, 2, 3, 4, 1, 2, 3, 4)
c:1
d:(3, 4)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipython-input-2069829904.py in <cell line: 0>()
     16 
     17 # example 5 - try to reassign the first element to 100000.0
---> 18 a[0] = 100000.0

TypeError: 'tuple' object does not support item assignment

Example 1: We create a tuple using (), with objects separated by commas. Python will interpret any parentheses containing comma-separated objects that are not attached to a pointer or function (e.g., in Example 3 the parentheses are associated with the min() function) as a tuple. In this case there are four objects in the tuple (i.e., 4 int objects) and they will be stored in the order they were added.
Example 2: Adding two tuples together is interpreted as “extend” or “concatenate”, with the first tuple followed by the second tuple.
Example 3: min() and max() are functions that ship with Python by default. These functions accept lists, tuples, sets, among other structures, and return the minimum and maximum value.
Example 4: You can use “slice” notation on tuple objects (and any other ordered data structure).
Example 5: tuples are immutable so they do not support item reassignment. Python should have printed a TypeError at this point.
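If you do need a tuple with a different first element, one option is to build a new tuple rather than modify the old one. The sketch below combines concatenation (Example 2) and slice notation (Example 4); the name b is just an illustrative label.

```python
a = (1, 2, 3, 4)

# Tuples cannot be changed in place, but a new tuple can be assembled from pieces.
# The trailing comma matters: (100000.0,) is a one-element tuple,
# whereas (100000.0) is just a float wrapped in parentheses.
b = (100000.0,) + a[1:]

print(a)  # (1, 2, 3, 4)  <- the original tuple is untouched
print(b)  # (100000.0, 2, 3, 4)
```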

Set - Examples¶

# example 1 - creating a set
a = {1,2,3,4}
print("a:"+str(a))

# example 2 - sets do not have order or duplicates
b = {4,4,4,4,3,2,1}
print("a==b: "+str(a==b))

# example 3 - extending a set
a.add(5.0)
print("a:"+str(a))

# example 4 - sets only store unique values
a.add(4)
print("a:"+str(a))

# example 5 - checking if a number is in the set
print("2 in a: "+str(2 in a))

# example 6 - finding the union and intersection of two sets
b = {3,4,5,6,7}
print("b:"+str(b))
print("union of a and b:"+str(a.union(b)))
print("intersection of a and b:"+str(a.intersection(b)))

a:{1, 2, 3, 4}
a==b: True
a:{1, 2, 3, 4, 5.0}
a:{1, 2, 3, 4, 5.0}
2 in a: True
b:{3, 4, 5, 6, 7}
union of a and b:{1, 2, 3, 4, 5.0, 6, 7}
intersection of a and b:{3, 4, 5}

Example 1: We create a set using {}, with objects separated by commas. Alternatively, we could have defined the same object as set([1,2,3,4]). Likewise, since sets don’t have duplicates and are unordered, all of the following would result in identical objects {1,2,3,4}, {4,3,2,1}, and {1,1,1,2,3,4}.
Example 2: This example demonstrates that two sets (a and b) initialized with the same unique objects, albeit with different orders and quantities, compare as equal. Sets don’t have order or allow duplicates, so a and b in this case must hold identical contents.
Example 3: The set.add() method can be used to add objects to an existing set. After printing out the full set, we can see that the float 5.0 has been added. We can also see that sets support mixed data types (in this case int and float).
Example 4: If we try to add an object that is already in the set (in this case the int 4) nothing happens. Sets do not allow duplicate objects! If you wanted to find the unique objects in a really long list, converting the list to a set would do it.
Example 5: in is an operator that returns True or False depending on whether the object on the left hand side is in the data structure on the right hand side. in comparisons can be slow for lists because, in the worst case, every object needs to be compared against the thing on the left hand side of in. Because of the way that sets store objects, in can be evaluated very rapidly.
Example 6: sets can perform set logical operations very effectively (surprised?). If you want to calculate which objects two groups have in common, or which objects in group a are not in group b, then sets are the data structure that you want to use. In this case we’ve demonstrated how to calculate the union (set that results from combining unique objects of each) and intersection (set of objects in common to both) of two sets a and b, using the methods of the same name.
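Two of the uses mentioned above were described but not shown: finding the objects in one group that are not in another, and finding the unique objects in a list. Here is a minimal sketch of both; group_a, group_b, and readings are made-up names and values. We print with sorted() because sets have no guaranteed order.

```python
# Hypothetical groups used only for illustration
group_a = {1, 2, 3, 4, 5}
group_b = {3, 4, 5, 6, 7}

# Objects in group_a that are not in group_b (method form and operator form)
only_in_a = group_a.difference(group_b)
only_in_a_op = group_a - group_b
print(sorted(only_in_a))      # [1, 2]
print(only_in_a == only_in_a_op)  # True

# The operators | and & are shorthand for union and intersection
print(sorted(group_a | group_b))  # [1, 2, 3, 4, 5, 6, 7]
print(sorted(group_a & group_b))  # [3, 4, 5]

# Unique objects in a list: convert the list to a set
readings = [3, 1, 3, 2, 1, 3]
unique_readings = set(readings)
print(sorted(unique_readings))    # [1, 2, 3]
```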

Dictionary - Examples¶

# example 1 - creating a dictionary
a = {1:"one",2:"two",3:"three",4:"four"}
print("a:"+str(a))

# example 2 - accessing a dictionary value by key
print("a[3]: "+a[3])

# example 3 - adding a key:value pair to a dictionary
a[99] = "ninety-nine"
print("a:"+str(a))

# example 4 - dictionary of dictionaries
a = {"benzene": {"MW": 78.11, "MF":"C6H6"}, "ethane": {"MW":30.07, "MF":"C2H6"}}
print(a["benzene"]["MW"])

a:{1: 'one', 2: 'two', 3: 'three', 4: 'four'}
a[3]: three
a:{1: 'one', 2: 'two', 3: 'three', 4: 'four', 99: 'ninety-nine'}
78.11

Example 1: We create a dictionary using {key:value,...}, with key:value pairs separated by commas. Only immutable (strictly speaking, hashable) objects can be keys in a dictionary. Basic data types like ints, floats, and strings are immutable (tuples too, provided everything inside them is immutable!) and can serve as dictionary keys. Any object type can be a value (e.g., we could have lists as values in a dictionary). In this example, the keys are integers, and the values are the string associated with each key.
Example 2: Dictionary values are accessed by their associated keys. In this case our dictionary is basically a translator: you give it an integer and it returns the string version. a[3] returns "three".
Example 3: Additional key value pairs can be added to dictionaries by supplying the key to an existing dictionary using [] notation and setting it equal to the desired value. In this case we’ve assigned the key 99 to the string ninety-nine.
Example 4: Dictionaries can hold dictionaries. Likewise, all of the other structures mentioned so far can hold other structures (not just integers and floats). Use sequential [][]...[] notation to access nested elements.
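The examples above mention that tuples can be keys and lists can be values, so here is a short sketch of that combination. The dictionary isotopes and its contents are hypothetical and chosen only for illustration; in and .get() are standard ways to check for a key before relying on it.

```python
# Hypothetical data: (element symbol, mass number) tuples as keys, lists as values
isotopes = {("H", 1): ["protium"], ("H", 2): ["deuterium", "heavy hydrogen"]}

# Sequential [] notation: first the key, then the index into the list value
print(isotopes[("H", 2)][0])       # deuterium

# "in" tests the keys of a dictionary, not its values
print(("H", 3) in isotopes)        # False

# Add a new key:value pair, just like Example 3
isotopes[("H", 3)] = ["tritium"]
print(len(isotopes))               # 3

# .get() returns a fallback instead of failing when the key is missing
missing = isotopes.get(("He", 4), [])
print(missing)                     # []
```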

# Mini-exercise
# Build a list of 5 numbers, then map each to its square.
numbers = [ ]  # TODO
squares = {}
# TODO: fill in the loop
print(squares)
# Expected output: a dict mapping each number to its square

{}

Control Statements¶

We motivated the introduction of data structures by saying that it is difficult to do anything useful if you are limited to dealing with individual data points. Similarly, it is difficult to construct useful programs if you have to hard code every individual action you want to do. We will often want to perform the same action many times (for/while loops), or perform an action only under certain circumstances (if/elif/else logic), or will want to attempt something and if it fails implement a fallback plan (try/except logic), or some combination of all three. The tools that accomplish these feats are control statements. These are built into Python, and they are so useful that every language has them in some form. We’ll quickly run through the definitions of each of these then demonstrate their usefulness with examples.

for: You will use for whenever you need to do something “for every x in y”, where x is an object and y is something like a list that potentially holds many objects. In a strict sense you can use for on any object that is iterable, which means the object has rules for what to do when for operates on it. For example, many built-in python objects are iterable, including strings (for loops over their individual characters), although their iteration rules might not be obvious without experimentation. The standard construction of a for loop is for variable in iterable: where variable is a convenient name for the objects returned by the iterable, and iterable is the pointer to an iterable object or an iterable defined directly in the for statement.
while: You will use while whenever you need to do something repeatedly “while x is True”, where x is a condition (e.g., len(a) > 0) and True is a boolean. In practice, while and for loops are largely interchangeable, although it is sometimes easier to do something in one than the other. The standard construction of a while loop is while condition: where condition is an expression that evaluates to a boolean. As long as condition evaluates to True, the loop will run again. A valid condition would be x is False, which evaluates to True whenever x itself is False (you might need to think about this a little bit, but it becomes intuitive very quickly).
if/elif/else: You will use if (and possibly elif and else) when you want to do something “only if x is True”, where x is a condition (e.g., x is False) and True is a boolean. This is kind of like while except that if statements are only evaluated once, not repeatedly until they evaluate to False. If the condition x evaluates to False then the commands inside of if do not get executed. This isn’t the end of the story; a lot of the time we want to provide instructions along the lines of “if not a then do b”, or “if not a and if not b, then do c” or “if not ... a bunch of conditions... do this”. To accomplish this, if comes with two optional partners. The first is elif which is a portmanteau of “else” and “if”. You will use elif when you want to test an additional condition y when the if condition evaluated to False. In words, a typical situation would be “if x: do a; elif y: do b”, where the elif condition would only be evaluated when the if condition evaluates to False. The second partner is else, which you would use for “do this when everything else has evaluated to False”. Thus, else contains commands that only run when all of the preceding if and elif statements have evaluated to False. Both elif and else are optional. You can use if alone, in which case if the conditional expression evaluates to False, the implied else is “do nothing”.
try/except: You will use try and its complementary statement except when you want to “try x and if it fails do exception”. This sounds a lot like if and else, and it is, except that you don’t test anything in the try statement. try and except have a distinctly pythonic flavor, encapsulated in the epigram “it is better to ask forgiveness than permission”. This means that it is sometimes better to just attempt it, than to do a preliminary test to check if you should. This can lead to significant efficiency gains in some applications. Logically, try and except are nearly redundant with if and else, so you can get by without using them initially, but you really should become practiced with them since some exceptions are easier to catch than to test for in advance, and there are some situations where if and else are not equivalent, like when information is time sensitive--for instance, between the time an if statement evaluates to True and the code being evaluated something might have changed that leads the code to fail--using try and except makes your code robust to these kinds of situations. Note, else and finally are also complements to try but I’m intentionally not covering their use here.
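The difference between testing first (if/else) and simply attempting the action (try/except) can be sketched with a dictionary lookup. The dictionary molecular_weights and both function names are hypothetical, made up for this illustration; the two functions return the same results and differ only in style.

```python
# Hypothetical lookup table used only for illustration
molecular_weights = {"benzene": 78.11, "ethane": 30.07}

def lookup_with_if(name):
    # "Ask permission": test whether the key exists, then act
    if name in molecular_weights:
        return molecular_weights[name]
    else:
        return None

def lookup_with_try(name):
    # "Ask forgiveness": attempt the lookup and handle the failure
    try:
        return molecular_weights[name]
    except KeyError:
        return None

print(lookup_with_if("benzene"))    # 78.11
print(lookup_with_try("benzene"))   # 78.11
print(lookup_with_if("propane"))    # None
print(lookup_with_try("propane"))   # None
```

In the try version there is no gap between the check and the action, which is the time-sensitivity point made above.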

Note about colons: Just like with functions, every control statement will need to be terminated with a colon, : , and all of the commands that belong to it will be indented at a consistent level. The following examples will make this clear.

for - Examples¶

Check your understanding¶

When would you choose a while loop instead of a for loop?
What does try/except let you do that if/else does not?

# example 1: loop over list objects
print("example 1:")
for i in [1,2,3,4,5]:
    print(i)
    
# example 2: loop over list object using pointer
print("\nexample 2:")
a = [1,2,3,4,5]
for dummy in a:
    print(dummy)
    
# example 3: loop over partial list
print("\nexample 3:")
for i in a[::2]:
    print(i)

# example 4: modify objects within loop (cumulative summation)
print("\nexample 4:")
cum_sum = [a[0]]
for i in a[1:]:
    cum_sum = cum_sum + [i+cum_sum[-1]]
print(cum_sum)


# example 5: break out of loop early (also if/else use!)
print("\nexample 5:")
a = [1,2,3,None,5,6,7]
for i in a:
    if i is None:
        break
    else:
        print(i)

example 1:
1
2
3
4
5

example 2:
1
2
3
4
5

example 3:
1
3
5

example 4:
[1, 3, 6, 10, 15]

example 5:
1
2
3

Example 1: Demonstration of for loop construction for printing out the elements of a list. Here the list is defined directly in the for loop initialization. We also choose the variable i to place each object in at each iteration. So printing i has the effect of printing out each object in the list, in order.
Example 2: This example is nearly identical to the first, except that we first define the list before the for loop. In the loop initialization we just use the pointer for the iterable object, a, and Python is smart enough to carry out the for iteration on the iterable that a points to. We’ve also changed the iteration variable to dummy just to demonstrate that this choice is arbitrary and can be anything we want.
Example 3: This example demonstrates that slice notation can be used in conjunction with for loop initialization (in this case, to loop over every other item in the list). Most of the core functionality in Python stacks really well and can be combined in this way to accomplish complicated tasks with relatively concise code.
Example 4: This example demonstrates that we can perform non-trivial operations within a loop that carry forward from iteration to iteration. In this case, we are performing a cumulative summation of all integers in the list. We initialize the list cum_sum with the first integer from a. We then use slice notation to loop over the objects in a, starting with the second (remember, Python is 0-indexed). Every time the loop runs, the list cum_sum gets extended with the current cumulative sum (cum_sum[-1]) plus the current value, i. The result at the end is a list of partial cumulative summations indexed to the original list, a.
Example 5: Sometimes we will want to break loops early if something happens. Intuitively, you use the break keyword for this. We’re also showing you how to use if and else within the for loop! These will be demonstrated more clearly soon, but I think you can intuit how they work pretty clearly from this example.
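Lists are not the only iterables that for understands. As a small sketch of the iteration rules mentioned earlier, the loops below run over a string (one character at a time) and over a dictionary (one key at a time). The variable names and values are made up for illustration.

```python
# Looping over a string returns its individual characters
word = "benzene"
e_count = 0
for letter in word:
    if letter == "e":
        e_count += 1
print(e_count)  # 3

# Looping over a dictionary returns its keys (in insertion order)
weights = {"benzene": 78.11, "ethane": 30.07}
names = []
for key in weights:
    names = names + [key]
print(names)    # ['benzene', 'ethane']
```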

while - Examples¶

# example 1: increment an integer
print("example 1:")
i=1
while i < 6:
    print(i)
    i += 1
    
# example 2: loop over list object using pointer
print("\nexample 2:")
a = [1,2,3,4,5]
i=0
# named n_items rather than max so that the built-in max() function is not hidden
n_items=len(a)
while i<n_items:
    print(a[i])
    i += 1
    
# example 3: pop items from list
print("\nexample 3:")
a = [1,2,3,4,5]
while a:
    print(a.pop(0))
    
# example 4: modify objects within loop (powers of 2)
print("\nexample 4:")
a = [1]
while a[-1] < 100:
    a += [a[-1]*2]
print(a[:-1])

# example 5: break out of loop early (also if/else use!)
print("\nexample 5:")
a = [1,2,3,None,5,6,7]
i = 0
while True:    
    if a[i] is None:
        break
    else:
        print(a[i])
    i += 1

example 1:
1
2
3
4
5

example 2:
1
2
3
4
5

example 3:
1
2
3
4
5

example 4:
[1, 2, 4, 8, 16, 32, 64]

example 5:
1
2
3

Example 1: Demonstration of while loop construction for incrementing a variable. In contrast to the for loop, there is no iterable (e.g., a list) built into the control statement; we only need to define the loop condition using a statement that evaluates to a boolean (in this case i < 6). i is incremented each iteration of while using the i += 1 shorthand.
Example 2: This example is just an extension of the first, except that we use our incremented variable i in order to print out sequential elements in an ordered list a. In this case, the result of this while loop is indistinguishable from an equivalent for loop over the list a. Note: we define the maximum value outside of the loop so that the len() function doesn’t get called every iteration (can you think of potential cases where we might want to call len() every time? What if the length of the list is changing?).
Example 3: This example is typical of where while might offer a somewhat more intuitive approach to the equivalent for. In this case the condition is on whether the list a is non-empty. Each iteration a is modified by the .pop(0) method inside the control statement. In this case, the while statement will exit once the list is empty. By default iterables evaluate to True in while statements if they are non-empty.
Example 4: This is another illustration of when while is useful, in particular when the exit condition isn’t easy to calculate in advance. In this trivial case, we want the control statement to terminate once the first result greater than or equal to 100 is obtained (here 128, which the final a[:-1] slice drops before printing). You can imagine less trivial cases where we wouldn’t know this in advance.
Example 5: This example is analogous to the for example above, and shows that 1) while statements are compatible with break conditions, and 2) while statements can be initialized without a logical operator. Specifically, the while condition requires a statement which evaluates to True or False each iteration. In this case, we have initialized it with True which will make it run indefinitely unless a break statement is supplied somewhere within the control statement.
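Example 2 asked when we might want to call len() every iteration. One such case, sketched below under made-up data, is a list that grows while we loop over it: each even number we meet gets its half appended to the end, so the stopping point is not known when the loop starts.

```python
# Hypothetical task: whenever we meet an even number, append half of it to the list
values = [8, 3, 12]
i = 0
while i < len(values):      # len() is re-evaluated every iteration on purpose
    if values[i] % 2 == 0:
        values.append(values[i] // 2)   # the list grows while we loop
    i += 1
print(values)  # [8, 3, 12, 4, 6, 2, 3, 1]
```

Had we stored the length once before the loop, as in Example 2, the loop would have stopped after the original three entries.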

if/elif/else - Examples¶

# example 1: if else, find smallest divisor of a besides 1.
print("example 1:")
a = 1890843693043
i = 2
while i < a:
    if a % i == 0:
        break
    else:
        i += 1
print("{} is the smallest divisor of {} greater than 1.".format(i,a))

# example 2: non-zero numbers and non-empty objects evaluate to True
print("\nexample 2:")
if a:
    print("a exists!")
    
# example 3: if elif else
print("\nexample 3:")
a = 2.1
if a % 2 == 0:
    print("a is even")
elif a % 1 == 0:
    print("a is odd")
else:
    print("a is fractional")

example 1:
8641 is the smallest divisor of 1890843693043 greater than 1.

example 2:
a exists!

example 3:
a is fractional

Example 1: This example utilizes a while loop to find the first number greater than 1 that evenly divides into a. if a match is found, then the while loop breaks; else the divisor, i, gets incremented and the loop continues.
Example 2: This example illustrates that Python evaluates most objects to True when they are used as a condition, thus triggering the corresponding if/elif statement. The exceptions are “empty” or “zero” objects such as 0, 0.0, None, False, and empty strings, lists, tuples, sets, and dictionaries, which evaluate to False. Note that this does not test whether a variable has been defined: using a name that was never assigned raises a NameError, which is the situation try/except is designed to handle.
Example 3: This is an example of if/elif/else usage for the trivial example of determining if a number is even, odd, or fractional. Experiment with different numbers to confirm that the statements evaluate as they should.
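Since if accepts any object as its condition, it is worth seeing which objects count as True and which count as False. The sketch below sorts a made-up list of candidates into two lists using nothing but if/else.

```python
# Hypothetical objects to test; zero values, empty containers and None evaluate to False
candidates = [0, 1, 0.0, 2.1, "", "text", [], [0], None]

truthy = []
falsy = []
for obj in candidates:
    if obj:
        truthy = truthy + [obj]
    else:
        falsy = falsy + [obj]

print(truthy)  # [1, 2.1, 'text', [0]]
print(falsy)   # [0, 0.0, '', [], None]
```

Note that [0] counts as True: the list is non-empty even though the object inside it is zero.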

#Example 1: try with a general except condition
print("Example 1:")
try:
    print(doesnt_exist)
except:
    print("that didn't work!")
 
#Example 2: try with a specific except condition
print("\nExample 2:")
try:
    print(doesnt_exist)
except NameError:
    print("Caught a NameError")

#Example 3: try with a do nothing exception
print("\nExample 3:")
try:
    print(doesnt_exist)
except:
    pass

# Example 4: try with a specific exception and general exception
print("\nExample 4:")
a="string"
try:
    a = a/2
except TypeError:
    print("'a' doesn't support division")
except:
    print("something besides a type error occurred")

Example 1:
that didn't work!

Example 2:
Caught a NameError

Example 3:

Example 4:
'a' doesn't support division

# Mini-exercise
# Write a for loop that prints only the even numbers.
numbers = [1, 2, 3, 4, 5, 6]
# TODO: loop and print evens
# Expected output:
# 2
# 4
# 6

Example 1: In this example the program tries to print out a variable that hasn’t been defined yet. This would ordinarily raise a NameError and terminate execution of the program. In this case, the program instead falls back on the except code when it encounters the error. Here except is completely general because we haven’t listed a specific error we are trying to catch.
Example 2: This is identical to the first example, but here we are being explicit about the type of error we want to catch.
Example 3: This example uses the same try statement as the previous ones, but illustrates that we can do “nothing” by using the pass statement (if you leave the except block empty, Python will raise a syntax error).
Example 4: This example illustrates that we can chain together both specific and general exceptions.

Object Comprehension¶

iterables are used so extensively in Python that there is a really tidy shorthand notation, called list/set/dictionary comprehension, for performing for loop and if else operations on them. This can be somewhat intimidating at first, but comprehensions are ubiquitous in python and make for a really good exercise for improving your ability to think like a Python programmer. Let’s show a few examples of list comprehension to start with.

List Comprehension - Examples¶

Check your understanding¶

Rewrite a simple for loop as a list comprehension.
Where does the if go when you include both if and else?

a = list(range(10)) # Note: range is a built-in that produces a sequence of integers; list() converts it to a list
print("Here is our iterable: {}".format(a))

print("\nExample 1:")
b = [ i*10 for i in a ]
print(b)

print("\nExample 2:")
c = [ i*10 for i in a if i % 2 == 0 ]
print(c)

print("\nExample 3:")
d = [ i*10 if i % 2 == 0 else i for i in a ]
print(d)


list_of_lists = [ list(range(i)) for i in a ]
print("\nExample 4:")
print("list_of_lists: {}".format(list_of_lists))
e = [ j*10 for i in list_of_lists for j in i ]
print("unwrapped and *10, list_of_lists: {}".format(e))

Here is our iterable: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Example 1:
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]

Example 2:
[0, 20, 40, 60, 80]

Example 3:
[0, 1, 20, 3, 40, 5, 60, 7, 80, 9]

Example 4:
list_of_lists: [[], [0], [0, 1], [0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5, 6], [0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7, 8]]
unwrapped and *10, list_of_lists: [0, 0, 10, 0, 10, 20, 0, 10, 20, 30, 0, 10, 20, 30, 40, 0, 10, 20, 30, 40, 50, 0, 10, 20, 30, 40, 50, 60, 0, 10, 20, 30, 40, 50, 60, 70, 0, 10, 20, 30, 40, 50, 60, 70, 80]

Example 1: This is an example of list comprehension. You can tell that it is list comprehension and not set or dict comprehension, because of the square brackets []. Here, i is a dummy variable for the for loop which iterates over a. In object comprehension the operations that we want to perform go at the front of the expression (i*10) followed by the definition of the iterable. The net effect of this one-liner is that a new list is created where each element is ten times the corresponding element of the original list a.
Example 2: This example is identical to the first, except that we only operate on even values of the original list a. This is accomplished by using an if statement after the for loop definition. When you only want to use if conditions (and not also else or elif) you put it at the end of an object comprehension.
Example 3: We add yet another twist here by using both if and else conditions. Here we multiply all even values by 10 and return the original odd values. When you use both if and else, they are no longer a filter on the loop; together they form a conditional expression (value_a if condition else value_b) that computes each element, so they go in the command at the front of the comprehension and not after the for loop (like in example 2).
Example 4: This example illustrates that you can even nest for loops in an object comprehension if the iterable also returns iterables (in this case, a list of lists, but a list of strings would be a perfectly good example too). Here we do a pretty trivial thing and combine the individual elements of the list of lists into a single list and multiply them by 10.
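If the one-liners feel cryptic, it can help to write out the ordinary loops they stand for. The sketch below rebuilds Examples 2, 3, and 4 with explicit for loops (using the list method .append() to add one object at a time) and checks that the results match. The names ending in _loop and _comp, and the short list_of_lists, are made up for illustration.

```python
a = list(range(10))

# Example 2 as a loop: the trailing "if" acts as a filter
c_loop = []
for i in a:
    if i % 2 == 0:
        c_loop.append(i * 10)
c_comp = [i * 10 for i in a if i % 2 == 0]

# Example 3 as a loop: "if ... else" chooses what value to store
d_loop = []
for i in a:
    if i % 2 == 0:
        d_loop.append(i * 10)
    else:
        d_loop.append(i)
d_comp = [i * 10 if i % 2 == 0 else i for i in a]

# Example 4 as a loop: the for clauses nest in the order they are written
list_of_lists = [[0], [0, 1], [0, 1, 2]]
e_loop = []
for i in list_of_lists:
    for j in i:
        e_loop.append(j * 10)
e_comp = [j * 10 for i in list_of_lists for j in i]

print(c_loop == c_comp)  # True
print(d_loop == d_comp)  # True
print(e_loop == e_comp)  # True
print(e_comp)            # [0, 0, 10, 0, 10, 20]
```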

The equivalent comprehensions exist for sets and dictionaries, except that we use {} notation for sets and {} with key:value notation for dictionaries.

Set/Dictionary Comprehension - Examples¶

print("\nExample 1:")
a = { i/10.0 for i in range(10) if i % 2 == 0 }
print(a)

print("\nExample 2:")
b = { str(i):i for i in range(100) }
print(b)

print("\nExample 3:")
c = tuple( i/10.0 for i in range(10) if i % 2 == 0 )
print(c)


Example 1:
{0.0, 0.4, 0.6, 0.8, 0.2}

Example 2:
{'0': 0, '1': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9, '10': 10, '11': 11, '12': 12, '13': 13, '14': 14, '15': 15, '16': 16, '17': 17, '18': 18, '19': 19, '20': 20, '21': 21, '22': 22, '23': 23, '24': 24, '25': 25, '26': 26, '27': 27, '28': 28, '29': 29, '30': 30, '31': 31, '32': 32, '33': 33, '34': 34, '35': 35, '36': 36, '37': 37, '38': 38, '39': 39, '40': 40, '41': 41, '42': 42, '43': 43, '44': 44, '45': 45, '46': 46, '47': 47, '48': 48, '49': 49, '50': 50, '51': 51, '52': 52, '53': 53, '54': 54, '55': 55, '56': 56, '57': 57, '58': 58, '59': 59, '60': 60, '61': 61, '62': 62, '63': 63, '64': 64, '65': 65, '66': 66, '67': 67, '68': 68, '69': 69, '70': 70, '71': 71, '72': 72, '73': 73, '74': 74, '75': 75, '76': 76, '77': 77, '78': 78, '79': 79, '80': 80, '81': 81, '82': 82, '83': 83, '84': 84, '85': 85, '86': 86, '87': 87, '88': 88, '89': 89, '90': 90, '91': 91, '92': 92, '93': 93, '94': 94, '95': 95, '96': 96, '97': 97, '98': 98, '99': 99}

Example 3:
(0.0, 0.2, 0.4, 0.6, 0.8)

Example 1: This is an example of set comprehension. You can tell that it is set comprehension and not list or dict comprehension, because of the curly braces {} and the fact that we don’t have key:value pairs in the leading command. Here, i is a dummy variable for the for loop which iterates over range(10). We divide each even value by 10 and save it to the new set. Since sets are unordered, the printed order of the elements may differ from the order in which they were generated.
Example 2: This is an example of dictionary comprehension. You can tell that it is dictionary comprehension and not list or set comprehension, because of the curly braces {} and the presence of key:value notation. In this case, we make a trivial dictionary with string versions of the integers as keys and the ints themselves as the values.
Example 3: Unfortunately there is no tuple comprehension: round parentheses, (), around a comprehension expression create a generator (an object that produces its values one at a time) rather than a tuple. You can still get the same effect by using tuple( commands for i in iterable ) as we have done in this example (on the backend this is a generator expression whose values are collected into a tuple). You usually use a tuple instead of a list when you want something that is fixed length, so this example is pretty contrived.
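To see what round parentheses around a comprehension actually produce, here is a short sketch. The name gen is made up for illustration. The object is a generator, which hands out its values once; after tuple() has consumed them, nothing is left.

```python
# Round parentheses around a comprehension give a generator, not a tuple
gen = (i / 10.0 for i in range(10) if i % 2 == 0)
print(type(gen).__name__)   # generator

# tuple() pulls the values out of the generator one at a time
c = tuple(gen)
print(c)                    # (0.0, 0.2, 0.4, 0.6, 0.8)

# A generator can only be consumed once, so a second conversion is empty
leftover = tuple(gen)
print(leftover)             # ()
```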

# Mini-exercise
# Rewrite your even-number loop as a list comprehension.
numbers = [1, 2, 3, 4, 5, 6]
evens = [ ]  # TODO
print(evens)
# Expected output: [2, 4, 6]

[]

Classes¶

The built-in data types (int, str, float, etc.) and data structures (list, dictionary, etc.) are actually implemented in Python as things called classes. I’ve somewhat misled you to keep things simple, but here I will pull back the curtain. Everything we discussed previously about these objects is still true, but the class objects that they are examples of need more explanation, and so we will use these in our introductory discussion.

Built-In Classes¶

A class is the central component in what is known as object-oriented programming. A class is strictly an object that combines data and functions. When combined into a class, the associated data are called attributes and the associated functions are called methods. A class is defined in a general form with the attributes it expects and methods it supports. An instance is a specific example of the class after it is defined with specific attributes. This is loosely analogous to when you define a function (class) as opposed to when you call a function (instance).

Note: because classes are so prevalent in the languages that support them and because classes are at the center of object-oriented programming, the term “object” is often used interchangeably with “class instance”.
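The class/instance distinction can be checked directly with the built-in functions type() and isinstance(). In the sketch below, a and b are two made-up instances of the same class, str. The last two lines show that a method is a function that lives on the class: calling it through the instance or through the class gives the same result.

```python
# Two different instances of the same class
a = "my string"
b = "another string"

print(type(a))               # <class 'str'>
print(type(a) is type(b))    # True
print(isinstance(a, str))    # True
print(isinstance(a, list))   # False

# A method called on the instance...
upper_from_instance = a.upper()
# ...is the class's function applied to that instance
upper_from_class = str.upper(a)
print(upper_from_instance)   # MY STRING
print(upper_from_class)      # MY STRING
```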

Let’s take a look at some things we can do with the str class that weren’t highlighted before:

Check your understanding¶

What is the role of __init__ in a class?
How is a method different from a function?

a = 'my string'
print(a.split())
print(a.split('t'))

['my', 'string']
['my s', 'ring']

In this example, .split() is a method associated with the str class that returns a list of tokens based on a delimiter. By default the delimiter is any run of whitespace; in the second example we have set it to the t character. We didn’t even know about the .split() method; we never defined it, but it is there. That is because strings are actually implemented as classes, and so every time you define an instance of a string, the methods associated with the class come along for free. In the above example, 'my string' is the attribute of this specific instance of the string class, and .split() is an example of one of the many methods associated with the string class. Here are some other useful methods for strings:

# Example 1: .format method replaces each {} in a string with a specified object
a = "substitute here -> {}"
print("Example 1 (substitution example):\n")
print(a)
print(a.format("my string is awesome!"))

# Example 2: {} syntax can be used to format strings with a given precision or number of spaces
b = "{:12.6f} {:<20s} {:<10s} {:<2d}".format(20,"20chars","10chars",1)
print("\nExample 2 (formatting examples):\n")
print(b)

# Example 3: the str class has several case methods for capitalization etc.
c = "MiXeD CaSe 123"
print("\nExample 3 (case functions):\n")
print(c.upper())
print(c.lower())
print(c.capitalize())
print(c.title())

# Example 4: .strip('chars') can be used to remove characters from the start
#            and end of a string. Methods can be chained together with left 
#            to right evaluation.
d = "this is a simple sentence."
print("\nExample 4 (.strip examples):\n")
print(d.strip('.te').title())
print(d.title().strip('.te'))

# Example 5: .join(list) can be used to join together strings from a list by
#            inserting the original string between the objects.
e = ["this","is","a","simple","sentence"]
f = [1,2,3,4,5,6,7,9,10]
print("\nExample 5 (.join examples):\n")
print("".join(e))
print(" ".join(e))
print("...".join(e))
print(" ".join([ "{:<10.6f}".format(i) for i in f ]))

Example 1 (substitution example):

substitute here -> {}
substitute here -> my string is awesome!

Example 2 (formatting examples):

   20.000000 20chars              10chars    1 

Example 3 (case functions):

MIXED CASE 123
mixed case 123
Mixed case 123
Mixed Case 123

Example 4 (.strip examples):

His Is A Simple Sentenc
This Is A Simple Sentenc

Example 5 (.join examples):

thisisasimplesentence
this is a simple sentence
this...is...a...simple...sentence
1.000000   2.000000   3.000000   4.000000   5.000000   6.000000   7.000000   9.000000   10.000000

Example 1: This is an example of the .format method. You can use it to replace a pair of curly braces ({}) with a string of your choice. Multiple braces can be present, in which case, the corresponding substitutions are separated by commas.
Example 2: The .format method is powerful because of the ability to format the string that you insert. In these examples we have used the formatting specifications {:12.6f} {:<20s} {:<10s} {:<2d}. The letters f, s, and d mean float, str, and int, respectively. The < means left-align. The numbers 12.6 mean the float should occupy twelve characters, with six digits after the decimal point (zeros and spaces are padded automatically). The : is always required for formatted substitution.
Example 3: This is an example of several case methods for the string class. .upper() and .lower() raise or lower all letters to upper-case and lower-case respectively. .capitalize() and .title() capitalize the first word and all words, respectively. Numbers are unaffected by these methods.
Example 4: This is an example of the .strip() method. Whatever characters are supplied to .strip('chars') are removed from the start and end of the returned string. This example also shows how methods can be chained together, with left-to-right evaluation. The ‘t’ is not removed from the second case, because it is first capitalized and no longer matches the t argument in .strip('.te').
Example 5: The .join() method is powerful for forming formatted strings from lists of strings. This is extremely common when writing delimited text files with data. The join method will put an instance of the string that it is called on between each string in the list. In the first case, the parent string is '', or an “empty string”, which results in no spaces. In the second and third cases, a space and ... are inserted, respectively. In the fourth case, you can see how list comprehension with the .format() method and .join() can be used to output a nicely formatted line of numbers.
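Examples 1, 2, and 5 combine naturally when building one row of a delimited text file. The following is a minimal sketch of that idea; the names template, sentence, values, and row, and the sample data, are made up for illustration and are not part of the examples above.

```python
# Multiple {} placeholders are filled, in order, by the arguments to .format()
template = "{} boils at {:.1f} {}"
sentence = template.format("water", 100, "C")
print(sentence)  # water boils at 100.0 C

# Format each number with two decimals, then join the pieces with commas
values = [1, 2.5, 3]
row = ",".join(["{:.2f}".format(v) for v in values])
print(row)  # 1.00,2.50,3.00
```

Swapping the "," for "\t" or " " would give a tab- or space-delimited row instead.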

Note: in some of these examples the method is called on a pointer to a string (e.g., a.format("my string is awesome!") in Example 1) and in some the method is called on the string itself (e.g., " ".join(e) in Example 5). There is no difference, but it is up to you to make sure that the method you call actually exists for the class instance that your pointer points to.

The above examples show how methods are similar to functions except that by default they have the attributes of the instance passed as inputs (i.e., the string itself in the above examples), and you call them with the .method(args) syntax. Similar to functions, methods can return one, many, or no objects.
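As a small sketch of the points above (one, many, or no returned objects, and what happens when a method does not exist for a class), consider the following. The variable names are illustrative only; .partition() is a standard str method that was not used in the examples above.

```python
text = "key=value"

# One returned object: a list of tokens
tokens = text.split("=")

# Several returned objects: .partition() returns a 3-item tuple we can unpack
head, sep, tail = text.partition("=")

# No returned object: list.sort() works in place and returns None
numbers = [3, 1, 2]
nothing = numbers.sort()

# Calling a method that the class does not define raises AttributeError
try:
    numbers.upper()          # lists have no .upper() method
    error_name = None
except AttributeError as err:
    error_name = type(err).__name__

print(tokens)
print(head, sep, tail)
print(nothing)
print(error_name)
```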

Data structures like lists and dictionaries are likewise implemented as classes within python, and each has its own useful methods that you will use routinely. Here are some examples of list methods:

# Example 1: .index(object) can be used to find the *first* index in a list where the specified object occurs. 
a = [1,2,3,4,4,5]
print("Example 1 (index):\n")
print("a.index(4) = {}".format(a.index(4)))
print("a[a.index(4)] = {}".format(a[a.index(4)]))

# Example 2: .pop(index) can be used to remove the object at index from the list and return it. The returned
#            object can be saved by assigning it to a pointer (e.g., b and c in this case).
b = a.pop(4)
c = a.pop(-1)
print("\nExample 2 (pop):\n")
print("a = {}".format(a))
print("b = {}".format(b))
print("c = {}".format(c))

# Example 3: .sort() can be used to sort a list (order can be reversed using the reverse argument). This 
#            method doesn't return a new list, it modifies the list in place.
print("\nExample 3 (sort):\n")
b = [5,2,3,1,10]
print("before sort: {}".format(b))
b.sort()
print("after sort: {}".format(b))
b.sort(reverse=True) 
print("after sort(reverse=True): {}".format(b))

Example 1 (index):

a.index(4) = 3
a[a.index(4)] = 4

Example 2 (pop):

a = [1, 2, 3, 4]
b = 4
c = 5

Example 3 (sort):

before sort: [5, 2, 3, 1, 10]
after sort: [1, 2, 3, 5, 10]
after sort(reverse=True): [10, 5, 3, 2, 1]

Example 1: the .index(object) method of the list class expects an object that the user wants to find within the list. In the current case, the object is an integer (4) but more generally it could be anything that lists can hold (e.g., another list). .index() returns the index within the list of the first match. In this example, the index 3 is returned, which corresponds to the location within the list of the first 4.
Example 2: The .pop(int) method expects an integer corresponding to the index within the list. The method will remove the element at that index from the list and return it, leaving the original list in the same order as before, but missing the element. The indices of all subsequent elements are automatically shifted down by one. In this example, the first popped element (a 4) is returned and assigned to the pointer b. The second popped element (a 5) is assigned to the pointer c. You can pop elements based on their 0-index or position relative to the end of the list (e.g., -1 in the second case).
Example 3: The .sort() method sorts the list it is called on. Unlike the methods in the other examples, .sort() does not return anything useful (it returns None); instead it reorders the original list in place. .sort() also has the optional argument reverse, which can be used to sort the list in descending order.
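To make the in-place behavior concrete, here is a short sketch contrasting .sort() with the built-in sorted() function, which is not used above but is the usual choice when the original order must be kept. It also shows one way to guard .index() against a missing object, since .index() raises ValueError in that case. The names new_list, returned, target, and position are illustrative.

```python
b = [5, 2, 3, 1, 10]

# sorted() is a built-in function: it returns a new list and leaves b alone
new_list = sorted(b)
print(b)         # [5, 2, 3, 1, 10]
print(new_list)  # [1, 2, 3, 5, 10]

# .sort() changes b itself and returns None
returned = b.sort()
print(returned)  # None
print(b)         # [1, 2, 3, 5, 10]

# .index() raises ValueError if the object is absent, so test membership first
target = 4
position = b.index(target) if target in b else -1
print(position)  # -1
```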

File Objects¶

Files are an example of a built-in class that we have not covered yet. They need to be mentioned here because files have special methods for reading data and this is central to almost all of the useful things we will do with Python.

The easiest way to create an instance of a file, is to open one:

import os
if not os.path.exists('hamlet.txt'):
    !wget -q https://raw.githubusercontent.com/SECQUOIA/PU_CHE597_DSinChemE/main/1-Python_Overview/hamlet.txt -O hamlet.txt

# Note: the program expects hamlet.txt to be in the folder that this notebook is executed from
f = open('hamlet.txt','r')
print(f)

<_io.TextIOWrapper name='hamlet.txt' mode='r' encoding='utf-8'>

Here open() is a built-in Python function that opens files for reading (r option) or writing (w option) and returns a file object. Our pointer f now points to the file object (i.e., _io.TextIOWrapper in the printed result is Python’s way of telling you that).

File objects have built-in methods for reading and writing data:

# Example 1: .readlines() and .close() method
f = open('hamlet.txt','r')
a = f.readlines(10)
print("a: {}".format(a))
f.close()

# Example 2: .read() method 
f = open('hamlet.txt','r')
b = f.read(10)
print("b: {}".format(b))

# Example 3: file objects remember their place
c = f.readlines(10)
print("c: {}".format(c))
f.close()  # close the file opened in Example 2

a: ['ACT I\n', 'SCENE I. Elsinore. A platform before the castle.\n']
b: ACT I
SCEN
c: ['E I. Elsinore. A platform before the castle.\n']

Example 1: the .readlines(hint=-1) method of the file class will return each line of the file as a separate string in a list. By default this will read the whole file. This is bad for large files (see below for a better way). You can supply an optional hint argument to limit how much is read: whole lines are returned until their total size exceeds hint, which is why two complete lines come back here even though the hint is 10. The .close() method closes the file object and releases the associated system resources.
Example 2: The .read(size=-1) method of the file class will return the file contents as a string. By default this will read the whole file. This is bad for large files (see below for a better way). You can supply an optional argument to return at most that many characters.
Example 3: The output of this call to .readlines(10) is different from example 1. This is because we didn’t reopen the file after the f.read(10) call. File objects have an attribute that points to the current position in the file (starting at the beginning when you open the file). As you read from it, the position of this attribute gets updated.
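The position tracking described in Example 3 can be reset without reopening the file by using the .seek() method of file objects. The following is a minimal sketch that writes its own throwaway file (the name demo.txt and its contents are made up) so that it does not depend on hamlet.txt.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("ACT I\nSCENE I\n")

    f = open(path, "r")
    first = f.read(5)     # reads 'ACT I' and advances the position
    rest = f.readline()   # continues from there: only '\n' is left on line 1
    f.seek(0)             # move the position back to the start of the file
    again = f.read(5)     # the same five characters as the first read
    f.close()

print(repr(first), repr(rest), repr(again))
```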

A Better Way to Read Large Files: File objects also have defined iteration rules. This allows you to loop over them line by line without reading the whole thing into memory. This is almost always preferred:

# Example 1: Reading files using for loops
f = open('hamlet.txt','r')
for line in f:
  parts = line.split()
  if parts:
    print(parts[0])
f.close()

ACT
SCENE
FRANCISCO
BERNARDO
Who's
FRANCISCO
Nay,
BERNARDO
Long
FRANCISCO
Bernardo?
BERNARDO
He.
FRANCISCO
You
BERNARDO
'Tis
FRANCISCO
For
And
BERNARDO
Have
FRANCISCO
Not
BERNARDO
Well,
If
The
FRANCISCO
I
Enter
HORATIO
Friends
MARCELLUS
And
FRANCISCO
Give
MARCELLUS
O,
Who
FRANCISCO
Bernardo
Give
Exit
MARCELLUS
Holla!
BERNARDO
Say,
What,
HORATIO
A
BERNARDO
Welcome,
MARCELLUS
What,
BERNARDO
I
MARCELLUS
Horatio
And
Touching
Therefore
With
That
He
HORATIO
Tush,
BERNARDO
Sit
And
That
What
HORATIO
Well,
And
BERNARDO
Last
When
Had
Where
The
Enter
MARCELLUS
Peace,
BERNARDO
In
MARCELLUS
Thou
BERNARDO
Looks
HORATIO
Most
BERNARDO
It
MARCELLUS
Question
HORATIO
What
Together
In
Did
MARCELLUS
It
BERNARDO
See,
HORATIO
Stay!
Exit
MARCELLUS
'Tis
BERNARDO
How
Is
What
HORATIO
Before
Without
Of
MARCELLUS
Is
HORATIO
As
Such
When
So
He
'Tis
MARCELLUS
Thus
With
HORATIO
In
But
This
MARCELLUS
Good
Why
So
And
And
Why
Does
What
Doth
Who
HORATIO
That
At
Whose
Was,
Thereto
Dared
For
Did
Well
Did
Which
Against
Was
To
Had
And
His
Of
Hath
Shark'd
For
That
As
But
And
So
Is
The
Of
BERNARDO
I
Well
Comes
That
HORATIO
A
In
A
The
Did
As
Disasters
Upon
Was
And
As
And
Have
Unto
But
Re-enter
I'll
If
Speak
If
That
Speak
Cock
If
Which,
Or
Extorted
For
Speak
MARCELLUS
Shall
HORATIO
Do,
BERNARDO
'Tis
HORATIO
'Tis
MARCELLUS
'Tis
Exit
We
To
For
And
BERNARDO
It
HORATIO
And
Upon
The
Doth
Awake
Whether
The
To
This
MARCELLUS
It
Some
Wherein
The
And
The
No
So
HORATIO
So
But,
Walks
Break
Let
Unto
This
Do
As
MARCELLUS
Let's
Where
Exeunt

In this example, we have iterated over the lines in the file hamlet.txt. The iteration rules for file objects return one line of the file (based on newline characters) at each iteration of the for loop. Here we are printing out the first word in each line. We check for blank lines explicitly so we don’t index into an empty list.
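The same line-by-line pattern works for accumulating a result instead of printing. As a sketch, the loop below counts lines and words while holding only one line in memory at a time; it uses a small made-up file (demo.txt) rather than hamlet.txt so that it runs anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("BERNARDO\nWho's there?\n\nFRANCISCO\nNay, answer me\n")

    n_lines = 0
    n_words = 0
    f = open(path, "r")
    for line in f:                      # one line per iteration
        n_lines += 1
        n_words += len(line.split())    # a blank line contributes 0 words
    f.close()

print(n_lines, n_words)  # 5 7
```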

If you are like me and commonly forget to close things, then you can make it a habit to use the with statement, which automatically closes the file as soon as the indented block ends:

with open('hamlet.txt','r') as f:
  for line in f:
    parts = line.split()
    if parts:
      print(parts[0])

ACT
SCENE
FRANCISCO
BERNARDO
Who's
FRANCISCO
Nay,
BERNARDO
Long
FRANCISCO
Bernardo?
BERNARDO
He.
FRANCISCO
You
BERNARDO
'Tis
FRANCISCO
For
And
BERNARDO
Have
FRANCISCO
Not
BERNARDO
Well,
If
The
FRANCISCO
I
Enter
HORATIO
Friends
MARCELLUS
And
FRANCISCO
Give
MARCELLUS
O,
Who
FRANCISCO
Bernardo
Give
Exit
MARCELLUS
Holla!
BERNARDO
Say,
What,
HORATIO
A
BERNARDO
Welcome,
MARCELLUS
What,
BERNARDO
I
MARCELLUS
Horatio
And
Touching
Therefore
With
That
He
HORATIO
Tush,
BERNARDO
Sit
And
That
What
HORATIO
Well,
And
BERNARDO
Last
When
Had
Where
The
Enter
MARCELLUS
Peace,
BERNARDO
In
MARCELLUS
Thou
BERNARDO
Looks
HORATIO
Most
BERNARDO
It
MARCELLUS
Question
HORATIO
What
Together
In
Did
MARCELLUS
It
BERNARDO
See,
HORATIO
Stay!
Exit
MARCELLUS
'Tis
BERNARDO
How
Is
What
HORATIO
Before
Without
Of
MARCELLUS
Is
HORATIO
As
Such
When
So
He
'Tis
MARCELLUS
Thus
With
HORATIO
In
But
This
MARCELLUS
Good
Why
So
And
And
Why
Does
What
Doth
Who
HORATIO
That
At
Whose
Was,
Thereto
Dared
For
Did
Well
Did
Which
Against
Was
To
Had
And
His
Of
Hath
Shark'd
For
That
As
But
And
So
Is
The
Of
BERNARDO
I
Well
Comes
That
HORATIO
A
In
A
The
Did
As
Disasters
Upon
Was
And
As
And
Have
Unto
But
Re-enter
I'll
If
Speak
If
That
Speak
Cock
If
Which,
Or
Extorted
For
Speak
MARCELLUS
Shall
HORATIO
Do,
BERNARDO
'Tis
HORATIO
'Tis
MARCELLUS
'Tis
Exit
We
To
For
And
BERNARDO
It
HORATIO
And
Upon
The
Doth
Awake
Whether
The
To
This
MARCELLUS
It
Some
Wherein
The
And
The
No
So
HORATIO
So
But,
Walks
Break
Let
Unto
This
Do
As
MARCELLUS
Let's
Where
Exeunt
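You can check that with really does close the file by inspecting the .closed attribute of the file object. This is a minimal sketch using a made-up temporary file; the names inside, outside, and failed are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("one line\n")

    with open(path, "r") as f:
        inside = f.closed    # still open inside the block
        text = f.read()
    outside = f.closed       # closed automatically once the block ends

    # Reading from a closed file raises ValueError
    try:
        f.read()
        failed = False
    except ValueError:
        failed = True

print(inside, outside, failed)  # False True True
```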

Lastly, you will often want to write data to files. File objects have the .write() method for this purpose, but you need to be sure that you open the file object in the correct mode:

a = [1,2,3,4]
with open('test.txt','w') as f:
  for i in a:
    f.write("{:<20.6f}\n".format(float(i)))

If you executed the above cell, then it should have created a text file in the same folder as this notebook that holds formatted floats for each number in a on individual lines. Be careful: opening an existing text file with open(name,'w') will overwrite its contents. Use open(name,'a') when you open the file if you want to append to it.
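The difference between the 'w' and 'a' modes is easy to see in a small experiment. The sketch below writes to a file inside a temporary folder (so nothing in your notebook folder is touched) and reads the result back; the file contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    with open(path, "w") as f:   # creates the file
        f.write("first\n")
    with open(path, "w") as f:   # 'w' again: the old contents are discarded
        f.write("second\n")
    with open(path, "a") as f:   # 'a': new text is added at the end
        f.write("third\n")

    with open(path, "r") as f:
        lines = f.readlines()

print(lines)  # ['second\n', 'third\n']
```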

Custom Classes¶

In the previous examples we revisited objects that we were familiar with (strings and lists) and showed how python actually implements these as classes. We highlighted a small number of methods that belong to these classes that are standard to these objects because python developers knew that people commonly need to do certain kinds of things with their strings, lists, etc. In general applications, you may find that you want to create your own classes, with certain attributes and methods beyond those that are available through other libraries or standard python distributions. Here we will go through the basics of creating a custom class.

Defining a basic class with only attributes and no methods is almost identical to defining a function:

# Example 1: Create a class with only attributes (a,b,c) no methods.
class test:
  a=[1,2,3]
  b="string"
  c=1

# Create an instance of our class and print its attributes
instance = test()
print(instance.a)
print(instance.b)
print(instance.c)

[1, 2, 3]
string
1

Here, the first line class test: indicates that we are creating a class named test, then the subsequent indented section indicates the additional attributes and methods that belong to the class. In the current example we have no methods, only some hard-coded features (i.e., a, b, and c). We create an instance of the class test and assign it to the pointer instance. We can access attributes of a class using the .attribute syntax, without parentheses.

This example is silly since the whole point of classes is to organize data and their associated functions (i.e., methods) into a single object. How do we get outside data into our class instance? It has to do with the parentheses in the instance = test() call in the example above. Indeed, we can supply input objects to our class using these parentheses, but we also need to make use of a built-in method __init__() that belongs to all python classes. Let’s show an example before explaining:

# Example 2: Create a class with attributes assigned upon initialization. 
class test:
  def __init__(self,a,b,c):
    self.a=a
    self.b=b
    self.c=c

# Create an instance of our class and print its attributes
inst = test([1,2,3],"string",1)
print(inst.a)
print(inst.b)
print(inst.c)

[1, 2, 3]
string
1

In this example, we’ve given our class a method: __init__. Methods are defined identically to functions, except that they are defined within a class definition (i.e., indented with respect to the class keyword) and the first argument of all methods is self. The first argument is special: when a method is called, python fills it in with the instance the method was called on, so wherever self shows up within the class it means “current instance of this class and all of its attributes”. The name itself is only a convention; you don’t have to use self, you could use spaghetti if you are consistent, but self is standard.
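To see that the name self carries no special meaning on its own, here is a small sketch in which the first argument is called spaghetti throughout. The class name Pair and the method total are made up for this illustration; in real code you should stick with self.

```python
class Pair:
    # "spaghetti" plays the role that self normally plays
    def __init__(spaghetti, a, b):
        spaghetti.a = a
        spaghetti.b = b

    def total(spaghetti):
        return spaghetti.a + spaghetti.b

p = Pair(2, 3)
print(p.a, p.b, p.total())  # 2 3 5
```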

__init__ is special, because python automatically executes this method any time you create an instance of a class. Thus, __init__ varies from class to class and will commonly do things like define attributes of the current instance of the class. In the current example, we define the class to take three inputs from the user (a, b, and c) and assign them to the instance attributes self.a, self.b, and self.c. To repeat, since self is the first argument of __init__ within the class, in all subsequent places that it shows up within the class, python will interpret it as “current instance of this class and all of its attributes”. If we define several instances of a class, they will have the same attributes but potentially different values for those attributes. For example:

# Create two instances of our class and print their attributes
inst1 = test(1,1,1)
inst2 = test(0,0,0)
print("inst1.a = {}    inst2.a = {}".format(inst1.a,inst2.a))
print("inst1.b = {}    inst2.b = {}".format(inst1.b,inst2.b))
print("inst1.c = {}    inst2.c = {}".format(inst1.c,inst2.c))

inst1.a = 1    inst2.a = 0
inst1.b = 1    inst2.b = 0
inst1.c = 1    inst2.c = 0

The __init__ method is the biggest stumbling block for people starting to study classes. In summary, it accomplishes two things. First, it gives the programmer the ability to pass inputs when an instance of the class is created. Second, it defines a pointer to the class instance (self by convention) that needs to be used when creating instance-specific attributes and subsequent methods.

The reason that self is needed may become clearer when considering how you might add additional methods. Let’s add a method to our class:

# Example 3: Create a class with an add function
class test:
  def __init__(self,a,b,c):
    self.a=a
    self.b=b
    self.c=c
  def add(self):
    return self.a + self.b + self.c

inst = test(1,1,1)
print(inst.add())

Here we’ve added a second method to our class called add, which returns the sum of attributes. When we defined add we used the syntax def add(self), meaning that we are passing the class instance as an input argument to the function. By doing this we can define self.a + self.b + self.c without error, because the method will have access to those attributes through the class instance that is passed to the method. This may seem convoluted, but it is consistent with how functions are defined. If we had defined the add method as add(), then the method wouldn’t have any inputs to add together.
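One way to convince yourself that the instance really is passed as an input is to call the method through the class and hand it the instance explicitly. The sketch below uses a made-up class name, Triple, with the same attributes as the example, plus a hypothetical NoSelf class whose method was defined without any argument.

```python
class Triple:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

    def add(self):
        return self.a + self.b + self.c

inst = Triple(1, 1, 1)

# These two calls are equivalent: python passes inst as self in the first form
via_instance = inst.add()
via_class = Triple.add(inst)
print(via_instance, via_class)  # 3 3

class NoSelf:
    def add():        # no parameter available to receive the instance
        return 0

# python still passes the instance, so the call fails with a TypeError
try:
    NoSelf().add()
    outcome = "worked"
except TypeError:
    outcome = "TypeError"
print(outcome)  # TypeError
```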

__init__ isn’t the only special method within python. There is a whole class of special methods that start and end with the double underscore (“dunder” methods). For example, we might want to perform comparisons between instances of our class using the == operator. We could make this definition using the __eq__ method:

# Example 4: Create a class with a custom equals comparison
class test:
  def __init__(self,a,b,c):
    self.a=a
    self.b=b
    self.c=c

  def add(self):
    return self.a + self.b + self.c
    
  def __eq__(self,other):
    return self.c == other.c

inst1 = test(1,1,1)
inst2 = test(2,2,1)
inst3 = test(2,2,2)
print("inst1 == inst2 : {}".format(inst1 == inst2))
print("inst1 == inst3 : {}".format(inst1 == inst3))

inst1 == inst2 : True
inst1 == inst3 : False

In the __eq__(self,other) method definition, other is interpreted by the operator to mean the “other object passed to the operator”. This is common to dunder methods associated with operators.
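Other operators follow the same pattern. As a hedged sketch, the class below (Sample is a made-up name, as are its label and value attributes) defines __lt__ for the < operator, which is also what sorted() relies on, and __repr__, which controls how an instance is displayed when printed inside a list.

```python
class Sample:
    def __init__(self, label, value):
        self.label = label
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        # Called for "self < other"; sorted() uses this to order instances
        return self.value < other.value

    def __repr__(self):
        # Called when an instance needs to be displayed
        return "Sample({!r}, {!r})".format(self.label, self.value)

items = [Sample("b", 2), Sample("a", 1), Sample("c", 3)]
ordered = sorted(items)
print(ordered)  # [Sample('a', 1), Sample('b', 2), Sample('c', 3)]
print(Sample("x", 1) < Sample("y", 2))  # True
```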

These are just the basics of classes and methods, but all of the essentials have been set out. Happy programming!

# Mini-exercise
# Define a class with one attribute and one method.
class Greeter:
    def __init__(self, name):
        # TODO
        pass

    def greet(self):
        # TODO
        pass

g = Greeter("World")
print(g.greet())
# Expected output: Hello, World!

None

References¶

Python Tutorial: https://docs.python.org/3/tutorial/
Built-in Types: https://docs.python.org/3/library/stdtypes.html
Control Flow: https://docs.python.org/3/tutorial/controlflow.html
Functions: https://docs.python.org/3/tutorial/controlflow.html#defining-functions