Class 9

Objectives for today

  1. Practice implementing functions that loop over strings
  2. Use functions from the random module

Building up strings

Many of our functions will use a similar pattern in which we build up a new string piece-by-piece by appending characters to a result string (initialized as the empty string, ""). We saw some examples of this pattern previously and now want to implement another example. Specifically, we want to write a function named password_gen that takes a single parameter length and generates a random password (as a string) of length characters.

How could we implement our password generator?

There are many ways, but a simple one is to use randint to index into a string of allowed characters. More formally, this is an example of sampling with replacement: every time we sample an item from a set of potential items, e.g. the letters, we return it to the set so it could be sampled again in the future. An alternative is “sampling without replacement”, in which each item can only be selected once. For this application, why would sampling with replacement be preferred?
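
To make the distinction concrete, here is a small sketch contrasting the two kinds of sampling. It assumes the standard-library functions random.choice and random.sample and the string method join, none of which appear in the notes above; the names LETTERS, with_replacement and without_replacement are only for illustration.

```python
from random import choice, sample

# Illustrative set of items to sample from (an assumption for this sketch)
LETTERS = "abc"

# Sampling with replacement: each draw picks from the full set, so
# repeats are possible and we can draw more items than the set contains.
with_replacement = ""
for i in range(5):
    with_replacement = with_replacement + choice(LETTERS)

# Sampling without replacement: each item can be selected at most once,
# so we can draw at most len(LETTERS) items.
without_replacement = "".join(sample(LETTERS, 3))

print(with_replacement)     # 5 characters, repeats allowed (varies between runs)
print(without_replacement)  # some ordering of "abc" (varies between runs)
```

Notice that without replacement we could never generate a password longer than the set of allowed characters, and no character could ever repeat.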

As always, we want to solve this problem in several steps instead of trying to tackle the whole problem at once. What are some possible intermediate steps? As an example, I would start by defining a constant CHARS with the allowed characters. Some possible steps:

  1. Define a constant CHARS with the allowed characters
  2. Create a version of password_gen to create a string of the specified length with a fixed character
  3. Enhance password_gen to create a string with random characters

Now let’s implement those steps.

  1. Define a constant CHARS with the allowed characters

     CHARS = "abcdefghijklmnopqrstuvwxyz0123456789_!@#$%^&*"
    
  2. Create a version of password_gen to create a string of the specified length with a fixed character

     CHARS = "abcdefghijklmnopqrstuvwxyz0123456789_!@#$%^&*"
        
     def password_gen(length):
         result = ""
         for i in range(length):
             result = result + CHARS[0]
         return result
    
  3. Enhance password_gen to create a string with random characters

     from random import randint
        
     CHARS = "abcdefghijklmnopqrstuvwxyz0123456789_!@#$%^&*"
        
     def password_gen(length):
         result = ""
         for i in range(length):
             result = result + CHARS[randint(0, len(CHARS)-1)]
         return result
    

    Recall that randint includes its end value, so to stay within the valid indices of CHARS we need to use len(CHARS)-1 as the end argument.

  4. And finally add the finishing touches, e.g. docstrings.

     from random import randint
        
     CHARS = "abcdefghijklmnopqrstuvwxyz0123456789_!@#$%^&*"
        
     def password_gen(length):
         """
         Generate a random password
            
         Args:
             length: number of characters in the password
                
         Returns:
             Password string
         """
         result = ""
         for i in range(length):
             result = result + CHARS[randint(0, len(CHARS)-1)]
         return result
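
As a quick check on the index arithmetic in step 3, the sketch below draws many indices and confirms that each one is valid for CHARS. It also shows random.choice, a standard-library function not used in the notes above, which picks a random item from a sequence directly and so avoids the len(CHARS)-1 bookkeeping. The name password_gen_choice is hypothetical.

```python
from random import choice, randint

CHARS = "abcdefghijklmnopqrstuvwxyz0123456789_!@#$%^&*"

# randint(a, b) includes both endpoints, so the largest index it can
# produce here is len(CHARS)-1, the last valid index into CHARS.
for i in range(1000):
    index = randint(0, len(CHARS)-1)
    assert 0 <= index < len(CHARS)

def password_gen_choice(length):
    """Hypothetical variant of password_gen that uses choice instead of randint"""
    result = ""
    for i in range(length):
        # choice picks one character of CHARS at random
        result = result + choice(CHARS)
    return result

print(password_gen_choice(8))  # output varies between runs
```

Both versions sample with replacement; which one to use is largely a matter of taste.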
    

Breaking down strings

Our example above focused on string generation. Our next example will focus on extracting information from a string, specifically a date written as MM/DD/YYYY. For example, imagine we had the string “10/20/2022”. We could use slicing to extract the various pieces, e.g.

>>> date = "10/20/2022"
>>> month = int(date[:2])
>>> day = int(date[3:5])
>>> year = int(date[6:])
>>> month
10
>>> day
20
>>> year
2022

What is the potential problem with our approach? What if the person didn’t use exactly two digits for the month and the day and four digits for the year? For example, what would happen if someone entered “1/1/22”?

>>> date = "1/1/22"
>>> month = int(date[:2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '1/'

Let’s develop an approach that works for both of those formats. Since the month, day and year are separated by slashes, what we really want to know is the position of those slashes. Once we know the position of the two slashes, we can readily figure out the indices we should use with our slicing operations.

Let’s start then by writing a function that takes two arguments, a string date and an integer n and returns the index of the nth slash in the string. For simplicity we will assume that date contains at least n slashes. Since we are trying to find the location of a substring in a larger string, the find method is a natural tool to use. We will start by finding the first slash.

def find_slash(date, n):
    index = date.find("/")
    return index

>>> find_slash("1/1/22", 1)
1

By default, find returns the lowest index of the match. To find the next slash, we need to start searching after the preceding slash. When we look at the documentation for find (shown below), we see it takes an optional argument, the position to start searching. Let’s try that out…

>>> help(str.find)
Help on method_descriptor:

find(...)
    S.find(sub[, start[, end]]) -> int
    
    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end].  Optional
    arguments start and end are interpreted as in slice notation.
    
    Return -1 on failure.

We notice that if we set that argument to 2, we get the index of the next slash! Why 2 and not 1, the index of the preceding slash? If we set the start to 1, we begin searching at the index of the first slash and keep finding it over and over again. Instead we need to start at the index of the preceding slash plus 1.

>>> "1/1/22".find("/",0)
1
>>> "1/1/22".find("/",1)
1
>>> "1/1/22".find("/",2)
3

If we combine find with a loop, we can extend our function to find any number of slashes! Try it out before looking at a possible implementation.

def find_slash(date, n):
    """
    Find index of nth forward slash "/" in date

    Args:
        date: Date string containing at least n forward slashes
        n: Find index of nth slash

    Returns:
        Index of nth slash
    """
    index = -1
    for i in range(n):
        # Start the search at the next character after the preceding slash
        index = date.find("/", index + 1)
    return index
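
The function above relies on our assumption that date contains at least n slashes. Without it, a failed search returns -1, and the next iteration would restart the search at index 0. As a sketch of one way to guard against that, the hypothetical variant below (find_slash_checked is not part of the notes) stops as soon as find fails.

```python
def find_slash(date, n):
    """Version from the notes: assumes date has at least n slashes"""
    index = -1
    for i in range(n):
        index = date.find("/", index + 1)
    return index

def find_slash_checked(date, n):
    """Hypothetical variant: return -1 if date has fewer than n slashes"""
    index = -1
    for i in range(n):
        index = date.find("/", index + 1)
        if index == -1:
            # No more slashes; stop instead of restarting from index 0
            return -1
    return index

print(find_slash("1/1/22", 4))          # 1, the search wrapped around
print(find_slash_checked("1/1/22", 2))  # 3
print(find_slash_checked("1/1/22", 4))  # -1
```

Returning -1 mirrors what find itself does on failure, so the caller can check for it in the same way.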

With that in place, we can now successfully parse dates with different length fields!

>>> date = "1/1/22"
>>> first = find_slash(date,1)
>>> second = find_slash(date,2)
>>> int(date[:first])
1
>>> int(date[first+1:second])
1
>>> int(date[second+1:])
22
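
The interactive steps above can be collected into a single function. This is a minimal sketch, assuming the date string always contains exactly two slashes with digits in between; the name parse_date and the choice to return the three values together (as a tuple) are ours and not part of the notes.

```python
def find_slash(date, n):
    """Find index of nth forward slash "/" in date"""
    index = -1
    for i in range(n):
        # Start the search at the next character after the preceding slash
        index = date.find("/", index + 1)
    return index

def parse_date(date):
    """Hypothetical helper: split a month/day/year string into three integers"""
    first = find_slash(date, 1)
    second = find_slash(date, 2)
    month = int(date[:first])
    day = int(date[first+1:second])
    year = int(date[second+1:])
    return month, day, year

print(parse_date("1/1/22"))      # (1, 1, 22)
print(parse_date("10/20/2022"))  # (10, 20, 2022)
```

Because the slices are computed from the slash positions, the same function handles both of the formats we tried earlier.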