'Aardvark' < 'Zebra'
'aardvark' < 'Zebra'
True
False
While Loops
Will the following print statement get executed?
Yes. Most values can be used in a boolean context. In Python, 0, 0.0, None, empty sequences (e.g. "") and a few other values evaluate as False (often termed “falsy”) and everything else is True (often termed “truthy”).
In general, I don’t recommend using this “implicit” type conversion as it just increases the chances for difficult-to-find bugs.
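The truthiness rules above can be checked directly with the built-in bool function. The following is a small sketch; the particular values in the two lists are only illustrative choices.

```python
# Values the notes list as falsy, plus a few truthy ones for contrast.
falsy_values = [0, 0.0, None, ""]
truthy_values = [5, -1, 0.1, "0", " "]

for value in falsy_values:
    print(repr(value), "->", bool(value))   # each prints False

for value in truthy_values:
    print(repr(value), "->", bool(value))   # each prints True

# An implicit conversion in an if statement: the body runs because 5 is truthy.
executed = False
if 5:
    executed = True
    print("This print statement gets executed")
```

Note that the string "0" is truthy: only the empty string is falsy, regardless of what characters a non-empty string contains.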
Now that we know about implicit booleans, though, we can resolve a common bug. In a previous in-class question, the solution was a == b or a == 5. Can we simplify that expression as a == b or 5 (i.e., is == distributive)? No, that expression is evaluated as (a == b) or 5, which is not the same (and will always be truthy, since 5 evaluates as True). What about a == (b or 5)? No, equality is not distributive. Instead, if b is truthy, that expression simplifies to a == b; if not, it simplifies to a == 5. If a and b are both 0, we should get True, but will get False.
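The three expressions can be compared side by side. This is a minimal sketch; the function names are hypothetical and exist only to label each variant.

```python
def original(a, b):
    return a == b or a == 5

def dropped_comparison(a, b):
    return a == b or 5        # parsed as (a == b) or 5

def grouped(a, b):
    return a == (b or 5)      # compares a with b if b is truthy, else with 5

print(original(1, 2))             # False
print(dropped_comparison(1, 2))   # 5, which is truthy
print(original(0, 0))             # True
print(grouped(0, 0))              # False
```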
Let’s think about a and b. If a evaluates to False, do I need to evaluate b? Similarly, in a or b, if a evaluates to True, do I need to evaluate b?
No. Python (and many other languages) “short-circuit” the evaluation of logical expressions. Doing so can both improve efficiency (we don’t perform computations we don’t need) and help us manage potentially problematic situations. For example in the following, we only ever execute dangerous_operation if is_valid(input) evaluates to True.
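One way to observe short-circuiting is to record which functions actually run. In this sketch the bodies of is_valid and dangerous_operation are assumptions made up for illustration, and the parameter is called value rather than input to avoid shadowing the built-in function.

```python
calls = []  # records which functions actually ran

def is_valid(value):
    # Hypothetical check: only non-zero integers are considered valid.
    calls.append("is_valid")
    return isinstance(value, int) and value != 0

def dangerous_operation(value):
    # Hypothetical operation that would raise ZeroDivisionError for 0.
    calls.append("dangerous_operation")
    return 100 / value

# is_valid(0) is False, so dangerous_operation is never called.
ok = is_valid(0) and dangerous_operation(0)
print(ok, calls)

# is_valid(4) is True, so the right-hand side is evaluated.
result = is_valid(4) and dangerous_operation(4)
print(result, calls)
```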
We can also apply relational operators, i.e. <, ==, etc., to other types; most notably strings. For example:
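The two comparisons discussed in the next paragraph can be reproduced with a short sketch; the variable names first and second are only illustrative.

```python
first = "Aardvark" < "Zebra"
second = "aardvark" < "Zebra"

print(first)    # True
print(second)   # False
```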
That second one doesn’t seem to make much sense… Just as + has a different meaning for strings than for integers, the < operator has a meaning specific to strings. Python compares strings using lexicographic ordering, i.e., it compares corresponding characters in order. The first characters are compared; if they are equal, the second characters are compared, and so on. If one string is a prefix of the other, the shorter string is lexicographically less, e.g.,
>>> "abc" < "abcdef"
True
This is not the same as a case-insensitive alphabetical ordering. When two letters are being compared, that comparison is based on their underlying numeric encoding. In the character encoding used by Python (and lots of other software), all upper case letters are less than lower case letters. Hence “aardvark” is “greater than” “Zebra”. You can access this numerical encoding with the ord function, e.g.,
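A brief sketch of ord on a few letters, along with chr, the built-in that converts a number back to a character:

```python
upper_a, upper_z = ord("A"), ord("Z")
lower_a, lower_z = ord("a"), ord("z")

print(upper_a, upper_z)     # 65 90
print(lower_a, lower_z)     # 97 122
print(upper_z < lower_a)    # True: every upper case letter precedes "a"
print(chr(97))              # a
```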
Some more examples:
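A few further comparisons one might try. These are offered as illustrations chosen for this sketch, not as the examples originally shown at this point.

```python
comparisons = {
    '"apple" < "banana"': "apple" < "banana",    # "a" comes before "b"
    '"apple" < "apricot"': "apple" < "apricot",  # first difference is "p" vs "r"
    '"Zebra" < "apple"': "Zebra" < "apple",      # upper case "Z" is less than "a"
    '"10" < "9"': "10" < "9",                    # compares "1" with "9", not 10 with 9
    '"abc" == "ABC"': "abc" == "ABC",            # equality is case sensitive
}

for expression, value in comparisons.items():
    print(expression, "->", value)
```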
If we wanted to ensure that a string comparison was case insensitive, how could we do so? Use the upper or lower methods to ensure consistent case.
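As a sketch of that idea, both strings can be converted to lower case before comparing. The helper name less_than_ignoring_case is hypothetical.

```python
def less_than_ignoring_case(s1, s2):
    """Return True if s1 sorts before s2 when case is ignored."""
    return s1.lower() < s2.lower()

case_sensitive = "aardvark" < "Zebra"
case_insensitive = less_than_ignoring_case("aardvark", "Zebra")

print(case_sensitive)     # False
print(case_insensitive)   # True
```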
while loops
We previously used for loops to execute a block of code for a specific number of repetitions. What if we don’t know how many iterations are needed? What might be such a situation? Obtaining a valid input from a user. We don’t know how many tries it will take someone to provide a valid input.
This is where we apply while loops. The general structure of a while loop is:
The statements in the loop body (i.e. statement1 … statementn) will be executed repeatedly until the boolean expression, the loop conditional, evaluates to False.
Here is a concrete example. What will this code print?
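One possible concrete loop following the general structure described above is sketched here. The counter i and the list printed are assumptions introduced for this illustration.

```python
# General shape:
# while <boolean expression>:
#     statement1
#     ...
#     statementn

i = 0
printed = []          # keeps a record of what was printed
while i < 3:          # loop conditional
    print(i)
    printed.append(i)
    i = i + 1         # changes the value the conditional depends on
print("Done")
```

The loop body runs for i equal to 0, 1 and 2. Once i reaches 3 the conditional is False and execution continues with the statement after the loop.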
What is the necessary implication of while loops? That some statement(s) within the body of the while loop will change the loop conditional (or otherwise terminate the loop). What happens if that is not the case, e.g.
Hit Ctrl-C (Ctrl and C simultaneously) or the Thonny “stop sign” button to stop execution. This is called an infinite loop and is a common problem. You will likely need to use Ctrl-C or the stop sign button at some point.
What about the following loop? Will it terminate?
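A loop matching the description that follows might look like this sketch. The steps counter and the max_steps safety cap are additions made here so that the example can actually be run; without them the loop would never stop.

```python
i = 0
steps = 0
max_steps = 5  # hypothetical safety cap, not part of the loop being discussed

while i < 10:
    i = i - 1          # i moves away from 10, so i < 10 stays True
    steps += 1
    if steps >= max_steps:
        break          # only the cap ends this loop

print(i, steps)
```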
No. Because i starts at 0 and only gets smaller, it will always be less than 10. What about this loop?
from random import randint
i = randint(1, 20)
while i < 10:
    print("How many times will this string get printed?")
    i = i + 1
Yes. If the value is less than 10, the loop will terminate after some number of iterations. If the value is greater than or equal to 10, the loop won’t execute at all.
In addition to changing the loop conditional we can also explicitly terminate the loop with the break statement. As its name suggests, break terminates the loop and begins executing the first statement after the loop. Although break is most commonly used with while loops, it can also be used to end for loops “early” (i.e., only perform a subset of the iterations).
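As a small sketch of ending a for loop early, the following searches a string for a letter and stops at the first match. The string and variable names are chosen only for illustration.

```python
word = "population"
position = -1  # stays -1 if the letter is never found

for index in range(len(word)):
    if word[index] == "u":
        position = index
        break  # skip the remaining iterations

print(position)  # 3
```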
Let’s implement a guessing game in which Python picks a random integer from 1-20 (inclusive), and keeps asking the user to guess the number until they get it right. To help the user, the game should give hints “higher”, or “lower”.
What would be some key elements in such a program?
Check out guessing_game.py. Here we see 3 strategies for implementing the loop:
a correct variable to track if the user answered correctly
break out of a “while True” loop on a correct answer
Additionally, note the use of the input function for reading user inputs. input prints its prompt argument then waits for and returns the string the user typed before hitting Enter/Return. input returns a string and so we need to convert the result to an int for use in our game.
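The “while True” with break strategy could be sketched as follows. This is not the contents of guessing_game.py; the function name, the messages, and the read parameter (which defaults to input but lets the function be exercised without a keyboard) are all assumptions made for this example.

```python
from random import randint

def guessing_game(secret, read=input):
    """Ask for guesses until secret is guessed; return the number of guesses."""
    tries = 0
    while True:
        # read returns a string, so convert it to an int before comparing
        guess = int(read("Guess a number from 1-20: "))
        tries += 1
        if guess == secret:
            print("Correct!")
            break
        elif guess < secret:
            print("higher")
        else:
            print("lower")
    return tries

# Interactive use: guessing_game(randint(1, 20))
```

Passing the secret number in as a parameter keeps the random choice separate from the loop, which makes the loop itself easier to reason about.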
while vs for loops
Often we can implement the same functionality with both a for loop and a while loop. In fact, in Python any for loop can be readily implemented with a while loop. The reverse is trickier. There are tricks that would enable us to make a for loop behave like a while loop in some situations, but they are exactly that: tricks. So for our purposes we should think of while loops as a superset of for loops.
As an example, how could we write a for loop to print the even numbers from 1-10 inclusive?
And the same with a while loop?
We can see the clear correspondence between the while loop and the for loop. Here we effectively re-implemented the range within the while loop by setting the initial value, the end condition and the increment. What are some other ways we could have achieved the same result? One is to iterate through all integers in the range [1,10], but use an if statement, e.g., if i % 2 == 0: to identify and print the even values.
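The three approaches described above can be sketched side by side. The lists are an addition made here so the results can be compared; the loops would work the same with only the print calls.

```python
# for loop: range supplies the initial value, end condition and increment
evens_for = []
for i in range(2, 11, 2):
    print(i)
    evens_for.append(i)

# while loop: the same three pieces written out explicitly
evens_while = []
i = 2               # initial value
while i <= 10:      # end condition
    print(i)
    evens_while.append(i)
    i += 2          # increment

# Alternative: visit every integer in [1, 10] and filter with an if statement
evens_if = []
for i in range(1, 11):
    if i % 2 == 0:
        print(i)
        evens_if.append(i)
```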
for vs. while?
So when do we use a for loop and when do we use a while loop?
We use a for loop when
For example, iterating over all the elements in a sequence, e.g. a string, is a situation where the number of iterations is known at the initiation of the loop (the number of elements in the sequence) and the increment is consistent (one element each iteration).
We would use a while loop in other settings, such as
This is an example where “style” matters but there are not necessarily clear “rules”. Often one approach or the other is more appropriate. The right choice will make the code more elegant and easier to reason about (and easier to debug).
while loops in action
Genes have different versions, or variants, termed alleles. These different alleles can be associated with different traits, e.g. whether you taste certain chemicals as bitter. Population genetics is the study of how evolutionary processes affect the frequencies of alleles in the population. For example, if a population starts with a mixture of two alleles and if there is no advantage for one allele over the other, then one of the alleles will eventually disappear and the other will be present in 100% of the individuals (described as becoming fixed in the population).
To convince ourselves of this phenomenon, we are going to create a simple simulation of a haploid organism (just one copy of each chromosome) that has two alleles ‘a’ and ‘A’. We will represent our population of size n with a string of length n containing the letters ‘a’ and ‘A’. We will then simulate each new generation by randomly sampling from the current population with replacement to create a new population of the same size. We then want to return the number of generations required for one of the alleles to become fixed.
As always we want to decompose our problem into smaller problems that are easier to solve and thus build up the solution piece-by-piece. How could we break this problem into a set of functions that solve smaller problems and what semantic tools are needed for those functions?
Write a function named next_gen that takes the current generation as a parameter and returns the next generation.
What semantic tools are needed here? Likely a for loop, a way to randomly sample from a string, and the string accumulation pattern.
Write a function named pop_sim that takes an initial population as a string and returns the number of generations required for fixation.
What semantic tools are needed here? A loop to iterate over the generations. And a way to detect if both alleles are present in the string.
Let’s start with next_gen and then implement pop_sim. next_gen has a single string parameter, pop, the current population, and returns the new population, a string of the same size. An example would be:
What “pattern” will this function likely take? We could use the string construction pattern we used in PA3 in which we build strings up character by character in a for loop. In this context, the pattern might look like:
Here we want to randomly sample from pop with replacement. As we did before we can use indexing and randint, e.g. pop[randint(0, len(pop)-1)]. As you might imagine this is a very common task, and so the random module has a choice function to do exactly this kind of sampling. The choice function randomly selects one element from a non-empty sequence. I suspect you will find choice helpful in PA4 (and beyond).
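Putting the accumulation pattern and choice together, next_gen might be sketched as follows. This is one plausible version under the description above, not necessarily the implementation used in class.

```python
from random import choice

def next_gen(pop):
    """Return a new population sampled with replacement from pop."""
    new_pop = ""
    for _ in range(len(pop)):      # one draw per individual
        # choice(pop) is equivalent to pop[randint(0, len(pop) - 1)]
        new_pop += choice(pop)
    return new_pop

print(next_gen("aAaAaA"))  # output varies from run to run
```

Because each character is drawn independently from the whole string, the same individual can be picked more than once, which is what sampling with replacement means.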
Now let’s turn to pop_sim, which has a single string parameter, pop, the initial population, and returns the number of generations until fixation. We need a loop to generate the successive generations, but do we know how many generations we will have to simulate? No.
A while loop. In the previous next_gen function, we know the number of loop iterations (the size of the population or the length of the string) and thus could (and should) use a for loop. Here we don’t know the number of generations required to reach fixation and so need to use a while loop.
When should the loop in pop_sim terminate? When one allele becomes fixed. Alternately, when should the loop keep executing? As long as both alleles are present in the population, i.e., both “a” and “A” are in the population string.
What do you want to do each iteration? Here, each iteration is a new generation, that is we want to simulate the next generation resulting from the current generation, or the new pop resulting from the existing pop. Recall we already implemented a function next_gen that generates a new population from an existing population. Let’s use it here.
What is the ultimate return value? We want to return the number of generations, i.e., the number of loop iterations. To keep track of the number of generations, we need to count how many times the loop executes. With for loops that is always a known quantity. With while loops we will need to introduce a “counter” variable that is incremented each time the loop executes.
def pop_sim(pop):
    """
    Simulate allele fixation in a population
    Args:
        pop: Initial population as a string
    Returns:
        Integer number of generations needed to achieve fixation
    """
    generations = 0
    while "a" in pop and "A" in pop:
        pop = next_gen(pop)
        generations += 1
    return generations
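To try pop_sim end to end, it can be combined with a next_gen sketch and a helper that builds a starting population. The helper make_population is hypothetical, as is the population size of 20; the next_gen shown is one plausible version based on the description earlier in these notes.

```python
from random import choice

def make_population(n):
    """Hypothetical helper: build a random initial population of size n."""
    pop = ""
    for _ in range(n):
        pop += choice("aA")
    return pop

def next_gen(pop):
    """Sample a new population of the same size, with replacement."""
    new_pop = ""
    for _ in range(len(pop)):
        new_pop += choice(pop)
    return new_pop

def pop_sim(pop):
    """Return the number of generations until one allele is fixed."""
    generations = 0
    while "a" in pop and "A" in pop:
        pop = next_gen(pop)
        generations += 1
    return generations

start = make_population(20)
print(start, pop_sim(start))  # output varies from run to run
```

A population that already contains only one allele never enters the loop, so pop_sim returns 0 for it.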
Check out a full implementation including a function to generate an initial population.
Adapted from (Libeskind-Hadas and Bush 2014).