Final Review

Final practice problems

Final Logistics

When and where: December 11 7:00-10:00PM in 75 SHS 102.
What can I bring?: One piece of letter-sized paper with notes on both sides (I will provide copies of the cheat sheet)
What can’t I bring?: Anything else, e.g., book, computer, other notes, etc.
Are there additional office hours?: Yes! I will hold regular office hours on Monday, December 9, and additional office hours on Tuesday, December 10, and Wednesday, December 11, from 10:00AM to noon (as well as by appointment).

What will the exam cover?

The exam will effectively have two parts:

Four new questions on the topics covered since midterm 2 (topics numbered 17-20)
Retake opportunities for all topics from midterms 1 and 2 (topics numbered 1-16). Recall that you only retake problems for which you don’t already have an M (3 points) or E (4 points).

The new topics are:

Searching and sorting algorithms
Use big-O analysis to inform algorithm design
Understand numeric representations and associated operations
Vectorized execution

The retake topics are:

Understand the role of function scope
Writing functions with randomness
Writing functions with loops
Choosing appropriate loops
Creating simpler equivalent conditionals
Finding errors
Writing functions with sequences
Utilizing turtle and other modules
Connect Python to the outside world with file I/O and command line arguments
Implications of the Python memory model
Applications of data structures
Writing functions with sets
Writing functions with dictionaries
Understanding and using recursive functions
Finding errors (in recursive functions)
Using Object-Oriented Programming

The exam will NOT include material that was in the readings but that we did not discuss in class or use in our programming assignments or the practice problems.

Types of questions

Determine the output/result of code
Rewrite code with better style
Identify bugs in code
Reassemble jumbled Python statements into a viable function
Fill in missing statements in code
Translate code from NumPy/datascience to “built-in” Python
Write code to solve a problem
Map algorithms to specific implementations and vice-versa
Determine and compare big-O time complexity and execution time

Review Questions

Solutions will be available after class (make sure to reload the page).

You are given four programs, each of which uses one of four different implementations for searching a sorted array: iterative linear search, recursive linear search, iterative binary search, or recursive binary search. Unfortunately, you don’t know which program uses which approach, but you do know the last five calls of the mystery search function for each program.
A. Search called with lists of length 433, 432, 431, 430, 429.
B. Search called once with a list of length 1000.
C. Search called once with a list of length 1000.
D. Search called with lists of length 16, 8, 4, 2, 1.
Which program uses which approach? If there is insufficient information to answer uniquely, indicate the possible approaches for each program.

Show a solution

Since search in program A is called with lists decreasing in size by one, we infer recursive linear search. Since search in program D is called with lists decreasing by a factor of 2, we infer recursive binary search. Given the available information, programs B and C could both be iterative linear search or iterative binary search.
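To see where these call patterns come from, here is a minimal sketch of the two recursive searches. It assumes they recurse on list slices, which the mystery programs may or may not do. The `calls` parameter is a hypothetical addition that records the length of the list in each call.

```python
def recursive_linear(a_list, target, calls):
    """Recursive linear search on slices; records each list length in calls."""
    calls.append(len(a_list))
    if len(a_list) == 0:
        return False
    if a_list[0] == target:
        return True
    # Recurse on everything but the first element (one shorter each call)
    return recursive_linear(a_list[1:], target, calls)


def recursive_binary(a_list, target, calls):
    """Recursive binary search on slices of a sorted list; records each list length in calls."""
    calls.append(len(a_list))
    if len(a_list) == 0:
        return False
    mid = len(a_list) // 2
    if a_list[mid] == target:
        return True
    # Recurse on the half that could contain target (about half as long each call)
    if target < a_list[mid]:
        return recursive_binary(a_list[:mid], target, calls)
    return recursive_binary(a_list[mid + 1:], target, calls)


linear_calls = []
recursive_linear(list(range(8)), 4, linear_calls)
print(linear_calls)  # [8, 7, 6, 5, 4]

binary_calls = []
recursive_binary(list(range(16)), 0, binary_calls)
print(binary_calls)  # [16, 8, 4, 2, 1]
```

The iterative versions make a single call on the full list, which is why a lone call with a list of length 1000 cannot distinguish them.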

If there is insufficient information to answer uniquely, suggest one or more experiments to uniquely identify the approach used by each program.

Show a solution

Assuming that search is a measurable portion of the runtime and we can change the inputs to the search function, we could try doubling the input size. Since linear search has a time complexity of \(\mathcal{O}(n)\), its runtime should double (or at least increase noticeably), while binary search has a time complexity of \(\mathcal{O}(\log n)\), and so its runtime would increase only minimally.

In practice we could likely observe differences in the absolute runtime. But recall that big-O describes the growth rate as the input grows very large (it is an asymptotic analysis), not the runtime itself. Two algorithms having the same complexity does not necessarily mean that they have the same runtime, nor does a lower big-O time complexity mean that one is predictably faster than the other.
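The doubling experiment can be sketched as follows. As an assumption for illustration, this version counts the elements examined instead of measuring wall-clock time, since counts are repeatable. The function names `linear_steps` and `binary_steps` are hypothetical.

```python
def linear_steps(a_list, target):
    """Iterative linear search that returns the number of elements examined."""
    steps = 0
    for val in a_list:
        steps += 1
        if val == target:
            break
    return steps


def binary_steps(a_list, target):
    """Iterative binary search on a sorted list that returns the number of elements examined."""
    steps = 0
    low, high = 0, len(a_list) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if a_list[mid] == target:
            break
        elif a_list[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


for n in [1000, 2000, 4000]:
    data = list(range(n))
    # Searching for -1 (not present) forces the worst case for both approaches
    print(n, linear_steps(data, -1), binary_steps(data, -1))
```

Each doubling of `n` doubles the linear count but adds only one step to the binary count.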
What is the Big-O worst-case time complexity of the following Python code? Assume that list1 and list2 are lists of the same length.
```
def difference(list1, list2):
    result = []
    for val1 in list1:
        if val1 not in list2:
            result.append(val1)
    return result
```
Show a solution

The outer loop has n iterations, while the in operation has a worst-case time complexity of \(\mathcal{O}(n)\), so the total worst-case time complexity is \(\mathcal{O}(n^2)\).

The not in operator in this context is equivalent to not (val1 in list2). The worst case time complexity for in on a list is \(\mathcal{O}(n)\) because we potentially have to examine all the elements in the list. The average case is still \(\mathcal{O}(n)\), because on average we will need to examine half the elements.

How could we solve this more efficiently? With the subtraction operator on sets.
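A minimal sketch of the set-based approach, assuming the list elements are hashable. The names `difference_sets` and `difference_ordered` are hypothetical. The first version does not preserve the order or duplicates of list1; the second keeps both by using a set only for the membership test.

```python
def difference_sets(list1, list2):
    """Return the values in list1 that are not in list2 (order and duplicates not preserved)."""
    return list(set(list1) - set(list2))


def difference_ordered(list1, list2):
    """Same idea, but keeps list1's order and duplicates."""
    set2 = set(list2)  # Building the set takes O(n) time
    # Membership in a set is O(1) on average, so the loop is O(n) on average
    return [val1 for val1 in list1 if val1 not in set2]


print(sorted(difference_sets([1, 2, 3, 4], [2, 4, 6, 8])))  # [1, 3]
print(difference_ordered([3, 1, 3, 2], [2]))                # [3, 1, 3]
```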

Why does in use linear search instead of something faster, like binary search? The in operator is designed to work on any list, not just sorted lists. And as we observed in the practice problems, checking if a list is sorted has the same time complexity as linear search. And so checking first if a list is sorted and then using binary search does not improve our worst-case time complexity compared to just using linear search, and likely makes the code slower in practice.
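The "check first, then binary search" idea might look like the sketch below. This is only an illustration; `is_sorted` and `contains_checked` are hypothetical names, and the binary search step uses `bisect.bisect_left` from the standard library.

```python
import bisect


def is_sorted(a_list):
    """Return True if a_list is in non-decreasing order. O(n) in the worst case."""
    for i in range(len(a_list) - 1):
        if a_list[i] > a_list[i + 1]:
            return False
    return True


def contains_checked(a_list, target):
    """Membership test that uses binary search only when a_list is sorted."""
    if is_sorted(a_list):  # O(n) check dominates the cost
        i = bisect.bisect_left(a_list, target)  # O(log n) binary search
        return i < len(a_list) and a_list[i] == target
    return target in a_list  # O(n) linear search fallback


print(contains_checked([1, 3, 5], 3))  # True
print(contains_checked([5, 1, 3], 4))  # False
```

Either branch costs \(\mathcal{O}(n)\) overall, so the check buys nothing over plain linear search.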
What decimal numbers are represented by the following binary numbers:
1. 1101
  
  Show a solution
  
  13
2. 111
  
  Show a solution
  
  7
3. 10010+11011
  Show a solution
  Adding column by column from the right, with the carries shown in the top row:

       1  1
        10010   18
      + 11011   27
      -------   --
       101101   45
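These answers can be checked in Python. The sketch below uses a hypothetical helper, `to_decimal`, that applies the place-value definition directly, and compares it with the built-in `int(..., 2)` and `bin`.

```python
def to_decimal(bits):
    """Convert a string of binary digits to an integer, one digit at a time."""
    result = 0
    for bit in bits:
        # Shift the digits seen so far one place left, then add the new digit
        result = result * 2 + int(bit)
    return result


print(to_decimal("1101"), int("1101", 2))  # 13 13
print(to_decimal("111"), int("111", 2))    # 7 7

total = to_decimal("10010") + to_decimal("11011")
print(total, bin(total))  # 45 0b101101
```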
Translate the following function using NumPy to just use Python built-ins assuming a_list is a list of floats (instead of a NumPy vector) and lower is a single (scalar) float:
```
def sum_above(a_list, lower):
    return np.sum(a_list[a_list > lower])
```
Show a solution
Recall that a_list[a_list > lower] is “vectorized” indexing: a_list > lower computes a vector (array) of booleans by performing an element-wise comparison, and the indexing operation keeps all values of a_list for which the corresponding boolean is True.

    def sum_above(a_list, lower):
        """Sum all values in a_list greater than lower"""
        result = 0
        for val in a_list:
            if val > lower:
                result += val
        return result
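For comparison, here are two other built-in translations, offered as sketches under hypothetical names. The first mirrors the two vectorized steps (build a boolean mask, then keep the matching values); the second folds both steps into one generator expression.

```python
def sum_above_mask(a_list, lower):
    """Mirror the NumPy steps: compute a boolean mask, then sum the selected values."""
    mask = [val > lower for val in a_list]  # Element-wise comparison
    return sum(val for val, keep in zip(a_list, mask) if keep)


def sum_above_compact(a_list, lower):
    """Filter and sum in a single pass."""
    return sum(val for val in a_list if val > lower)


print(sum_above_mask([1.0, 5.0, 3.0], 2.0))     # 8.0
print(sum_above_compact([1.0, 5.0, 3.0], 2.0))  # 8.0
```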
[From retest #2] Add to the body of the mystery function below such that after the code below executes, z and y have the same value and neither is equal to y’s initial value. If no such body is possible, indicate so. Briefly explain your answer.
```
def mystery(x):
    x = x[:]
    # Add code here...
    return x

y = [1, 2, [3, 4]]
z = mystery(y)
# z and y have the same value, and y is no longer [1, 2, [3, 4]]
```
Show a solution
Since a slice copy is performed, only the nested lists remain aliased and so we want to modify the nested list, e.g., append a value, so z and y have the same value and are different from [1, 2, [3, 4]], the original y.

    def mystery(x):
        x = x[:]
        x[2].append(6)
        return x
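The aliasing can be confirmed with `is`, which tests whether two names refer to the same object. A short check, using the solution above:

```python
def mystery(x):
    x = x[:]          # Shallow copy: new outer list, same nested list object
    x[2].append(6)    # Modifies the nested list shared with the caller
    return x


y = [1, 2, [3, 4]]
z = mystery(y)

print(z == y)        # True: same values
print(z is y)        # False: different outer lists
print(z[2] is y[2])  # True: the nested list is aliased
print(y)             # [1, 2, [3, 4, 6]]
```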
[From retest #2] Assume course enrollment data is stored as a list of tuples, where each tuple contains a CRN number as a string and the ID number, also as a string, of a student enrolled in that course (i.e., ("92669", "00123456") for student “00123456” enrolled in the course “92669”). Write a function named under that takes this list and an integer floor as parameters, and returns a list of tuples with the CRN and the number of students enrolled for courses with floor students or fewer. The order of the CRNs in the returned list does not matter. You can assume that the input list is non-empty and that there are no duplicate enrollments. For example:
```
>>> enrolled = [("92669", "00123456"), ("92669", "00123457"), ("92670", "00123457")]
>>> under(enrolled, 1)
[("92670", 1)]
>>> under(enrolled, 2)
[("92669", 2), ("92670", 1)]
```
Show a solution

    def under(enrolled, floor):
        counts = {}
        for crn, student in enrolled:
            if crn in counts:
                counts[crn] += 1
            else:
                counts[crn] = 1
        result = []
        for crn, count in counts.items():
            if count <= floor:
                result.append((crn, count))
        return result
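As an alternative sketch, the counting step can be delegated to `collections.Counter` from the standard library. The name `under_counter` is hypothetical, and this version assumes the same input format as the problem statement.

```python
from collections import Counter


def under_counter(enrolled, floor):
    """Return (CRN, enrollment) tuples for courses with floor students or fewer."""
    # Count how many times each CRN appears; the student ID is not needed
    counts = Counter(crn for crn, _student in enrolled)
    return [(crn, count) for crn, count in counts.items() if count <= floor]


enrolled = [("92669", "00123456"), ("92669", "00123457"), ("92670", "00123457")]
print(under_counter(enrolled, 1))          # [('92670', 1)]
print(sorted(under_counter(enrolled, 2)))  # [('92669', 2), ('92670', 1)]
```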