Midterm 2 Review

Midterm 2 practice problems

Midterm Logistics

When and where: Thursday, November 14, 7:30 - 10:00 PM in MBH 102 (although the exam is intended to take less than 2 hours)
What can I bring?: One piece of letter-sized paper with notes on both sides (I will provide copies of the cheat sheet)
What can’t I bring?: Anything else, e.g., book, computer, other notes, etc.
I have a scheduling conflict, can I take the midterm at an alternate time?: Yes. Hopefully you already responded to the Google form to let me know. You can pick up the exam from me in my office on Thursday (or at whatever time we agreed upon).

What will the exam cover?

The exam will cover file reading through object-oriented programming, but not searching/sorting or big-O. It will specifically focus on exam topics 9-16, one question per topic. Those questions may involve the following:

Reading from files
Using command-line arguments
Memory model/references
Sets, Dictionaries, Tuples
- When you can and should use these and other data structures
- Initialization, querying, iterating, updating
- Use of set operators
Recursion
Object-oriented programming

The exam is not cumulative, i.e., it will focus on material since midterm 1, but we haven’t stopped writing functions, using integers/strings, writing loops, etc. as we use more advanced capabilities of Python. The exam will NOT include material that was in the book(s) but that we did not discuss in class, or use in our labs, or practice on PrairieLearn. The exam will NOT include searching/sorting, big-O or vector execution.

Types of questions

Determine the output of code
Rewrite code with better style
Identify bugs in code
Reassemble jumbled Python statements into a viable function
Write code to solve a problem
Short answer

How do you suggest I prepare?

Review the relevant exam topics and identify the key ideas and techniques associated with each topic. Do you understand that key idea?
Practice, practice, practice! Complete the previous exams, (re)-solve the practice problems and the in-class problems (available as “in-class questions” on course webpage).
Review the class notes. Treat the examples in the notes as practice problems, i.e., can you predict the result/solution before you look at it?

What do you suggest I put on my notes page?

Here are some (non-exhaustive) suggestions:

Common code snippets, e.g., reading from a file, creating a histogram with a dictionary, iterating through a dictionary
Common kinds of errors in recursive functions, e.g., missing base case, recursive case not getting smaller, etc.
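
As one illustration of the kind of snippet that could go on a notes page, here is a minimal sketch of creating a histogram with a dictionary and then iterating through it. The function name `histogram` and the sample data are placeholders chosen for this example, not taken from the course materials.

```python
def histogram(items):
    """Return a dictionary mapping each item to the number of times it appears."""
    counts = {}
    for item in items:
        # get returns the default 0 when the item has not been seen yet
        counts[item] = counts.get(item, 0) + 1
    return counts

counts = histogram(["a", "b", "a", "c", "a"])
# Iterating through a dictionary yields its keys
for key in counts:
    print(key, counts[key])
```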

Review questions:

Each semester I need to update my class notes. Write a Python program to print out the contents of a file with all instances of “F23” replaced with “F24”. The filename will be provided as a command line argument, as shown below, and your program should work for any filename provided. You don’t need to handle invalid arguments. If your program is imported, nothing should be printed.

>>> %Run program.py notes.txt
line1 F24
F24 line2 F24

if notes.txt contains

line1 F23
F23 line2 F23

Show a solution

# Import sys to make sys.argv variable with command line arguments available
import sys

def update_file(filename):
    with open(filename, "r") as file:
        for line in file:
            # Use rstrip to remove only the trailing newline read from the file, to avoid printing two newlines
            print(line.rstrip("\n").replace("F23", "F24"))

if __name__ == "__main__":
    # sys.argv is a list containing the program name followed by any command line arguments, e.g.
    # ["program.py", "notes.txt"] in the example above
    update_file(sys.argv[1])

Draw the memory model, as would be generated by Python Tutor/Visualizer (pythontutor.com), after the following code executes:
```
x = [[1, 2], 3]
y = x[:] + [3]*2
y[1] = 5
```
Show a solution

Python visualizer output

Would the memory model be different if the second line is y = x + [3] * 2? No. The concatenation operation (the +) also creates a new list with a copy of x. Like slicing, though, the copy is only one “layer” deep.

Is there an operation you could perform on y that would be visible via x? Yes, modifying the nested list. Since the slicing operation and concatenation copy the outer list, not the inner or nested lists, those nested lists (the [1,2]) remain aliased. For example y[0].append(5) would make changes to that nested list that are visible through both x and y.
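
The aliasing described above can be checked directly. The following is a small sketch that uses the `is` operator to test whether two names refer to the same object; the variable names `outer_is_shared` and `inner_is_shared` are introduced only for this illustration.

```python
x = [[1, 2], 3]
y = x[:] + [3] * 2
y[1] = 5

# The outer lists are distinct objects, so the assignment to y[1] is not visible via x
outer_is_shared = x is y
# The nested list was not copied, so both outer lists refer to the same object
inner_is_shared = x[0] is y[0]

# Modifying the nested list is visible through both x and y
y[0].append(5)
print(x)  # [[1, 2, 5], 3]
print(y)  # [[1, 2, 5], 5, 3, 3]
```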
For the following kinds of data, describe what data structure, e.g., list, set, dictionary, or tuple, would be the most appropriate to use.
1. Storing students in a class along with their grades
2. Storing a shopping list optimized for efficient traversal of the supermarket
3. Storing unique \(x,y\) coordinates
Show a solution
1. Since the grades are associated with a specific student, a dictionary with the student as the key and the grade as the value would be most appropriate.
2. Since the shopping list needs to be ordered in a specific way (e.g., based on the aisles in the store), and potentially re-ordered as items are added/removed, a list would be most appropriate (would also allow for duplicate items). Since there is no key-value association, a dictionary would not be useful.
3. Since the points should be unique, a set is the most appropriate choice. The points themselves should be stored as tuples. Each point is a fixed size (two elements), and because a tuple is immutable it can be stored in a set (a list could not).
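
To make the third answer concrete, here is a brief sketch of storing coordinates as tuples in a set. The names `points` and `list_allowed` are made up for this example.

```python
points = set()
points.add((1, 2))
points.add((3, 4))
points.add((1, 2))  # duplicate point, so the set is unchanged

print(len(points))  # 2

# A list is mutable and therefore not hashable, so it cannot be stored in a set
try:
    points.add([5, 6])
    list_allowed = True
except TypeError:
    list_allowed = False
print(list_allowed)  # False
```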
In the following code
```
d = { 0: "0", 1: "I", 2: "II", 3: "III", 4: "IV", 5: "V" }
print(d[i])
```
which of the alternate definitions of d below would print the same for any value of i in 0-5, inclusive? Select all that apply.
1. d = ["0", "I", "II", "III", "IV", "V"]
2. d = ("0", "I", "II", "III", "IV", "V")
3. d = {"0", "I", "II", "III", "IV", "V"}
4. d = "0IIIIIIIVV"
Show a solution

Answers 1 and 2 can be used as alternate definitions of d. Indexing can’t be used with sets (answer 3). Indexing a string (answer 4) returns a single character, so the result is wrong for most values of i, e.g., d[2] would be "I" instead of "II".
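
One way to convince yourself of this answer is to test each candidate against the original dictionary. This is a sketch only; the helper `matches` and the variable names are invented for this illustration.

```python
original = {0: "0", 1: "I", 2: "II", 3: "III", 4: "IV", 5: "V"}
as_list = ["0", "I", "II", "III", "IV", "V"]
as_tuple = ("0", "I", "II", "III", "IV", "V")
as_set = {"0", "I", "II", "III", "IV", "V"}
as_string = "0IIIIIIIVV"

def matches(candidate):
    """Return True if candidate[i] equals original[i] for every i in 0-5."""
    try:
        return all(candidate[i] == original[i] for i in range(6))
    except TypeError:
        # Sets do not support indexing
        return False

print(matches(as_list))    # True
print(matches(as_tuple))   # True
print(matches(as_set))     # False
print(matches(as_string))  # False
```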
Write a function named shared_bday that takes a list of tuples representing birthdays, e.g., ("January",1), for a group of individuals and returns True if any share a birthday.
Show a solution
def shared_bday(days):
    return len(set(days)) < len(days)
Since the values in a set have to be unique, if there are duplicate birthdays, the size of the set will be smaller than the original list. Recall that a set can be initialized directly from an iterable, e.g. a list.
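
For comparison, here is a loop-based sketch of the same idea that stops as soon as it finds the first repeated birthday. It is offered only as an alternative way to think about the problem; the name `shared_bday_loop` is made up for this example.

```python
def shared_bday_loop(days):
    """Return True if any birthday tuple appears more than once in days."""
    seen = set()
    for day in days:
        if day in seen:
            # This birthday was already observed, so two individuals share it
            return True
        seen.add(day)
    return False

print(shared_bday_loop([("January", 1), ("March", 3), ("January", 1)]))  # True
print(shared_bday_loop([("January", 1), ("March", 3)]))  # False
```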
Write a function that takes two parameters: a dictionary and a number. The function should update the dictionary by adding the number to each value in the dictionary.
Show a solution
def add_num(a_dict, number):
    for key in a_dict:
        a_dict[key] += number
Recall that for key in a_dict: is the same as for key in a_dict.keys():. We don’t need to return the dictionary. Instead this function modifies its arguments, i.e., it modifies the dictionary provided as the argument:
```
>>> a = { 1: 2 }
>>> add_num(a, 10)
>>> a
{1: 12}
```
What does the following function do (in one sentence) assuming x is a list:
```
def mystery(x):
    if len(x) <= 1:
        return True
    else:
        if x[0] < x[1]:
            return False
        else:
            return mystery(x[1:])
```
Show a solution

mystery returns True if the list is in descending sorted order (adjacent equal values are allowed). To figure that out we note that the function returns False if x[0] < x[1], i.e., the preceding value is less than the next value in the list. The only way to return True is to “make it” to the base case; to make it to the base case the preceding value must be greater than or equal to the next value for all pairs of adjacent values.
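
To check this reading of the function, the sketch below compares mystery with a loop-based version that tests every adjacent pair. The name `is_descending` and the test lists are assumptions made for this illustration.

```python
def mystery(x):
    if len(x) <= 1:
        return True
    else:
        if x[0] < x[1]:
            return False
        else:
            return mystery(x[1:])

def is_descending(x):
    """Return True if no value is less than the value that follows it."""
    for i in range(len(x) - 1):
        if x[i] < x[i + 1]:
            return False
    return True

for values in [[], [4], [3, 2, 2, 1], [1, 2], [3, 1, 2]]:
    print(values, mystery(values), is_descending(values))
```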
What is the shape drawn by the following function when invoked as mystery(100,4), assuming the turtle starts at the origin facing to the right?
```
def mystery(a, b):
    if b > 0:
        for i in range(3):
            forward(a)
            left(120)
        forward(a)
        mystery(a/2, b-1)
```
Show a solution
A set of 4 adjacent equilateral triangles of decreasing size, where each next triangle is half the size of the one to its left.

Where does the turtle end up? At the bottom right corner of the figure. How could you modify the code to ensure the turtle ended back at its starting position? As we did in the fractal drawing assignment we could use pending operations, i.e., operations after the recursive call, to “undo” the operations we did before the recursive call. For example:
def mystery(a, b):
    if b > 0:
        for i in range(3):
            forward(a)
            left(120)
        forward(a)
        mystery(a/2, b-1)
        backward(a)
Write a recursive function all_upper that takes a list of strings as a parameter and returns a list of booleans with True for strings in the list that are all uppercase, False otherwise. Recall that the string class has an isupper method that checks if it is all uppercase. For example:
```
>>> all_upper(["YES", "no"])
[True, False]
```
Show a solution
def all_upper(strings):
    if len(strings) == 0:
        return []
    else:
        return [strings[0].isupper()] + all_upper(strings[1:])
There are several problems with this recursive implementation of fibonacci. What are they? Recall that the Fibonacci sequence is 1, 1, 2, 3, 5, 8, 13 …, i.e., \(F_n = F_{n-1} + F_{n-2}\).
```
def fibonacci(n):
    """ Return nth fibonacci number """
    if n == 1 or 2:
        return 1
    else:
        fibonacci(n[1:]) + fibonacci(n[2:])
```
Show a solution
1. n == 1 or 2 is the same as (n == 1) or 2, which is always treated as True in the if statement because the non-zero integer 2 is “truthy”. Should be n == 1 or n == 2.
2. n is an integer, and so the slicing operator is not defined. The recursive calls should use n-1 and n-2.
3. Missing return in the recursive case.
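
Applying all three fixes gives the following sketch of a corrected version. It assumes n is an integer that is at least 1, which the original problem does not state explicitly.

```python
def fibonacci(n):
    """ Return nth fibonacci number, assuming n is an integer >= 1 """
    if n == 1 or n == 2:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

print([fibonacci(n) for n in range(1, 8)])  # [1, 1, 2, 3, 5, 8, 13]
```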

One useful property of averages is that we can compute the average without having the values in memory by maintaining a “running” sum and count of the number of values observed. Implement a class RunningAverage that maintains a running average of values added with an add method. That is, it should support the following computation.

>>> mean = RunningAverage()
>>> for val in range(1, 5):
    mean.add(val)

>>> mean.average()
2.5
>>> mean.add(5)
>>> mean.average()
3.0

Show a solution

class RunningAverage:
    def __init__(self):
        self.total = 0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def average(self):
        return self.total / self.count

It is also possible to compute a “running” variance using Welford’s algorithm!

\[ \begin{aligned} M_{2,n} &= M_{2,n-1}+(x_{n}-{\bar {x}}_{n-1})(x_{n}-{\bar {x}}_{n}) \\ \sigma _{n}^{2} &= {\frac {M_{2,n}}{n}} \\ \end{aligned} \]

where \(M_{2,1}=0\). Implement a class RunningVariance that derives from RunningAverage and computes the variance “online”, i.e., without storing all of the data.
Show a solution
class RunningVariance(RunningAverage):
    def __init__(self):
        super().__init__()
        self.m2 = 0

    def add(self, value):
        if self.count == 0:
            super().add(value)
        else:
            old_mean = self.average()
            super().add(value)
            new_mean = self.average()
            self.m2 += (value - old_mean) * (value - new_mean)

    def variance(self):
        return self.m2 / self.count
Note that we override the add method to keep track of the additional statistic, but delegate to the base class method (super().add(value)) to increment the total and count instance variables. By doing so, we don’t have to copy that code, and instead can reuse the code from RunningAverage. The update formula for \(M_{2,n}\) only applies for \(n>1\) (with \(M_{2,1}=0\)), so we handle the first addition differently (i.e., when self.count is 0).
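
As a sanity check, the sketch below compares the running variance against `statistics.pvariance` from the standard library, which also divides by n. The sample data and the name `running` are chosen only for this illustration, and the two classes are repeated so the example runs on its own.

```python
import math
import statistics

class RunningAverage:
    def __init__(self):
        self.total = 0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def average(self):
        return self.total / self.count

class RunningVariance(RunningAverage):
    def __init__(self):
        super().__init__()
        self.m2 = 0

    def add(self, value):
        if self.count == 0:
            # First value: M2 stays at 0, only the total and count change
            super().add(value)
        else:
            old_mean = self.average()
            super().add(value)
            new_mean = self.average()
            self.m2 += (value - old_mean) * (value - new_mean)

    def variance(self):
        return self.m2 / self.count

data = [2, 4, 4, 4, 5, 5, 7, 9]
running = RunningVariance()
for value in data:
    running.add(value)

# Compare with a tolerance, since the running version uses floating point
print(math.isclose(running.variance(), statistics.pvariance(data)))  # True
```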