Class 25: Recursion I

Objectives for today

Recursion in Action: Tower of Hanoi

Tower of Hanoi is a classic puzzle in which you need to transfer a set of discs from one pole to another pole using a spare pole. It has three simple rules:

  1. Only move one disc at a time
  2. Only the top-most disc on a pole can be moved and it must be placed on the top of another pole
  3. A disc can’t be placed on top of a smaller disc

Let’s develop an algorithm for playing this game, and specifically a function named move_tower that has parameters height, from_pole, to_pole and spare_pole, where height is the number of discs to move from the from_pole to the to_pole. A hint: there is an elegant recursive solution (check out hanoi.py).
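
Here is a minimal sketch of one way move_tower could look. It is not necessarily what hanoi.py contains. Returning the moves as a list of (source, destination) pairs and stopping at a height of 0 are assumptions made for illustration.

```python
def move_tower(height, from_pole, to_pole, spare_pole):
    """Return the list of (source, destination) moves for `height` discs."""
    if height == 0:
        # Assumed base case: no discs means no moves.
        return []
    # Move the top height-1 discs out of the way, onto the spare pole.
    moves = move_tower(height - 1, from_pole, spare_pole, to_pole)
    # Move the largest disc directly to its destination.
    moves.append((from_pole, to_pole))
    # Move the height-1 discs from the spare pole on top of the largest disc.
    moves += move_tower(height - 1, spare_pole, to_pole, from_pole)
    return moves


for source, destination in move_tower(2, "A", "C", "B"):
    print("move disc from", source, "to", destination)
```

Each call only ever describes moving a single disc itself; the rest of the work is handed to two smaller versions of the same problem.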

Iterative factorial

As a starting example, consider computing the factorial. A natural iterative solution is below. How could we approach this problem recursively?

def factorial(n):
    result = 1
    for i in range(2, n+1):
        result *= i
    return result

Recursion

A recursive algorithm is defined in terms of solutions to smaller versions of the same problem. A recursive function, then, calls itself to solve a smaller version of the problem.

Let’s think about solving factorial recursively. Can we break down the factorial problem into a computation and an identical sub-problem?

5! = 5 * 4 * 3 * 2 * 1
5! = 5 * 4!

A first attempt at a recursive factorial function:

def factorial(n):
    return n * factorial(n-1)

Let’s visualize the call stack:

5 * factorial(4)
    |
    4 * factorial(3)
        |
        3 * factorial(2)
            |
            2 * factorial(1)
                |
                1 * factorial(0)
                    |
                    0 * factorial(-1)
                        |
                        ...

So when will this end? Never! At some point we need to terminate the recursion. We call that the base case. The base case and the recursive relationship are the two key elements of any recursive algorithm.
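
In practice Python does not let the recursion run forever: once the call stack grows past the interpreter's recursion limit, it raises a RecursionError. The sketch below shows that behavior; the name factorial_no_base is made up for this illustration.

```python
def factorial_no_base(n):
    # No base case: every call makes another call with a smaller n.
    return n * factorial_no_base(n - 1)


try:
    factorial_no_base(5)
except RecursionError as error:
    # Python gives up once the call stack gets too deep.
    print("Recursion never reached a base case:", error)
```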

For factorial, we know that factorial(1) == 1 (and factorial(0) == 1) so:

def factorial(n):
    if n <= 1:
        return 1
    else:
        return n * factorial(n-1)

Here we see the typical structure of a recursive function: first we check whether we are at a base case and, if so, return the result directly. If not, we invoke the function recursively on a sub-problem.

Clearly this works. But why? Doesn’t each call to factorial overwrite n? No. To help us understand what happens when we call a function (and what we mean by the call stack) let’s use Python Tutor on our factorial function.

Whenever we invoke a function we create a new “frame” on the “call stack” that holds that call’s arguments, local variables and other state. Each call therefore has its own n, and we don’t “overwrite” the parameters when we repeatedly invoke our function.
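
To see the separate frames without Python Tutor, here is a sketch of a factorial that prints as it enters and leaves each call. The depth parameter is an addition for this illustration only; it controls the indentation of the printed trace.

```python
def factorial(n, depth=0):
    indent = "    " * depth
    print(indent + "enter factorial(" + str(n) + ")")
    if n <= 1:
        result = 1
    else:
        result = n * factorial(n - 1, depth + 1)
    # This frame's n is unchanged, even though deeper calls used smaller values.
    print(indent + "factorial(" + str(n) + ") returns " + str(result))
    return result


factorial(3)
```

Each "returns" line reports the same n as the matching "enter" line, which is what we expect if every call keeps its own copy of n.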

PI Questions

How to Write a Recursive Function

We employ a four-step process:

  1. Define the function header, including the parameters
  2. Define the recursive case
    • Assume your function works as intended, but only on smaller instances of the problem. How would you implement your solution?
    • The recursive problem should get “smaller” (or it will never finish!).
  3. Define the base case
    • What is the smallest (or simplest) problem? It should have a direct (i.e. non-recursive) solution.
  4. Put it all together
    • First, check for the base case and return (or do) something specific.
    • If the computation hasn’t reached the base case, compute the solution using the recursive definition and return the result.

Recursion has a similar feel to “induction” in mathematics:

  1. Prove the statement for the first number, or base case
  2. Assume the statement holds for an arbitrary number or input
  3. Prove that the statement holding for that number implies it holds for the next number
  4. Therefore the statement must hold for all values

Let’s use this process to recursively reverse a string (check it out in Python Tutor):

  1. Define the function header, including the parameters

    def reverse(a_string):
    
  2. Define the recursive case

    Assume we have a working reverse function that can only be called on smaller strings. To reverse a string:

    1. Remove the first character
    2. Reverse the remaining characters
    3. Append that first character to the end

  3. Define the base case

    The reverse of the empty string is just the empty string.

  4. Put it all together

    def reverse(a_string):
        if len(a_string) == 0:
            return ""
        else:
            return reverse(a_string[1:]) + a_string[0]
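
To check our work, we can run the finished function on a few inputs. The comments trace by hand how one call expands; the example strings are chosen only for illustration.

```python
def reverse(a_string):
    if len(a_string) == 0:
        return ""
    else:
        return reverse(a_string[1:]) + a_string[0]


# reverse("abc")
#   = reverse("bc") + "a"
#   = (reverse("c") + "b") + "a"
#   = ((reverse("") + "c") + "b") + "a"
#   = (("" + "c") + "b") + "a"
print(reverse("abc"))
print(reverse("hello"))
```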
    

An implementation note … Why doesn’t a_string[1:] produce an index error when a_string is a single letter (e.g. "e"[1:])? Slicing has the nice property that slicing beyond the end of the string evaluates to the empty string, e.g.

>>> "e"[1:]
''

However, indexing a single value (not slicing) beyond the end of a string (or list) will produce an error, e.g.

>>> "e"[1]
Traceback (most recent call last):
  File "<pyshell>", line 1, in <module>
IndexError: string index out of range