Class 28: Object-oriented Programming I

Objectives for today

Why Object-Oriented Programming (OOP)

How many of you have used Object-Oriented Programming, often called “OOP” (sometimes rhymes with “goop”)? Another trick question. Everyone has. Recall that everything in Python is an object (integers, strings, etc.), and so at some level all Python programming is OOP. But that does not explain why that is the case, or what the advantages of OOP might be.

Python has a rich library of “built-in” types (its tag line is “batteries included”), but that can’t possibly include types for every kind of data you might want to represent in your program. Instead Python provides a mechanism - Object-Oriented Programming - for us to create new types as needed, and use those types in the same way we use integers, lists, etc., including performing arithmetic or other operations on those values.

In OOP, we create a Class to define a new type. A Class is like a blueprint for creating objects, or specific instances of that type (much in the way a single construction blueprint can be used to create one or more specific instances of a house). The Class specifies the instance variables (data) contained within an instance and the methods, the computations that can be performed on those instance variables.

As we will see, this approach can offer two key benefits:

  1. We can “encapsulate” all of the complexity of a specific data type, including any data and associated operations on that data. Using objects we can define a “higher level” interface that “abstracts” the specifics of that object.
  2. We can facilitate code reuse through shared interfaces and inheritance.

To see those benefits in action, let’s revisit floating point numbers and their peculiarities…

Creating our own data types

We can observe unexpected behavior when comparing floating point numbers due to insufficient precision. For example,

>>> 0.1 + 0.2 <= 0.3
False

Due to insufficient precision, 0.1 + 0.2 rounds to a number slightly larger than 0.3. We could avoid the inherent imprecision by representing floating point numbers as rational numbers, i.e., the ratio of two integers. For example, 0.1 would be 1 / 10. We could then express all of the above operations as operations on ratios, e.g., \(\frac{a}{b} + \frac{c}{d}\) is \(\frac{ad+cb}{bd}\). Since all the values are integers, that computation produces an exact result!
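To make that arithmetic concrete, here is a small sketch that works through 1/10 + 2/10 using only integers. The variable names follow the formula above. Comparing by cross-multiplication is our choice for this illustration and assumes positive denominators.

```python
# Represent 1/10 and 2/10 as the integer pairs a/b and c/d
a, b = 1, 10
c, d = 2, 10

# a/b + c/d = (a*d + c*b) / (b*d), computed entirely with integers
numerator = a * d + c * b      # 1*10 + 2*10 = 30
denominator = b * d            # 10*10 = 100

# Check 30/100 <= 3/10 by cross-multiplying (assumes positive denominators)
print(numerator * 10 <= 3 * denominator)   # True, since 300 <= 300

# The floating point version of the same comparison
print(0.1 + 0.2 <= 0.3)                    # False
```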

Using the tools already in our toolbox, we could implement our rational number as a tuple of the numerator and denominator, i.e.,

>>> r1 = (1, 10)
>>> r2 = (2, 10)
>>> r1
(1, 10)

and addition as a function, e.g.

def add(left, right):
    return (left[0]*right[1] + right[0]*left[1], left[1]*right[1])

>>> add(r1, r2)
(30, 100)

Doing so is definitely workable, but it requires anyone using that “rational” tuple to know that the first element is the numerator and the second is the denominator. We also can’t use any of Python’s built-in operations in the way we might want; for example, + performs tuple concatenation. That is, we don’t have a very effective “abstraction” for rational numbers.

>>> r1 + r2
(1, 10, 2, 10)

Instead, let’s create a Rational class, that is, a Rational data type that encapsulates the numerator and denominator and any associated operations. See the linked Python file for the complete implementation with docstrings, etc.

# Define a class named Rational (we capitalize class names to distinguish them from functions, etc.)
# Note that classes need docstrings too (see linked file)!
class Rational:
    # Define an initializer that sets the numerator and denominator instance variables. It too needs a
    # docstring, but note we don't include a return value since it does not return. We also
    # don't document the self parameter since it already has a defined role in the Python
    # language specification.
    def __init__(self, numerator, denominator):
        # Initialize the object instance variables
        self.numerator = numerator
        self.denominator = denominator

Above we have defined the Rational class. Within that class, any function definitions are the “methods” of that class. We have started with the “initializer”, __init__, a special method used to construct new instances, or objects, of a class. We can create new instances of Rational just as we created new integers, etc. (that is, when we used int, list, etc. we were creating new instances of those classes):

>>> r1 = Rational(1, 10)

Behind the scenes, Python is creating a new empty object and invoking the initializer, __init__. The self parameter is a special parameter that is always first in the parameter list and is automatically set by Python to be a reference to the newly created (still empty at this point) object. The initializer takes self, the reference to that newly created object, and defines and sets the numerator and denominator instance variables of that object. When it is done, we have a fully initialized Rational object with its own unique numerator and denominator instance variables. Check this code out in Python Tutor.

We can access those instance variables using the “dot” syntax.

>>> r1 = Rational(1, 10)
>>> r1.numerator
1
>>> r1.denominator
10
>>> r2 = Rational(2, 10)
>>> r2.numerator
2
>>> r2.denominator
10
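As a small sketch of what “its own unique instance variables” means, the example below creates two objects and changes one of them. The reassignment of r1.numerator is purely for illustration and is not part of the linked implementation.

```python
class Rational:
    def __init__(self, numerator, denominator):
        # Each new object gets its own numerator and denominator
        self.numerator = numerator
        self.denominator = denominator

r1 = Rational(1, 10)
r2 = Rational(2, 10)

# r1 and r2 are distinct objects created from the same class
print(r1 is r2)                      # False

# Changing an instance variable of r1 leaves r2 untouched
r1.numerator = 7
print(r1.numerator, r2.numerator)    # 7 2
```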

To implement our motivating example above, we need to support addition and comparison. Let’s start with the former. We implement an add method that will add two Rational numbers, via r1.add(r2). Notice the special self parameter. It is always the first parameter for methods, and is automatically set to the object on which the method is invoked, i.e., r1 in the example above. other will be a reference to r2. We perform the addition and return a new Rational object with the appropriate numerator and denominator.

def add(self, other):
    numerator = self.numerator * other.denominator + other.numerator * self.denominator
    denominator = self.denominator * other.denominator
    return Rational(numerator, denominator)

>>> r1 = Rational(1, 10)
>>> r2 = Rational(2, 10)
>>> r3 = r1.add(r2)
>>> r3
<__main__.Rational object at 0x1017e7550>
>>> r3.numerator
30
>>> r3.denominator
100
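One way to see how self gets its value is to write the same call in two forms. In the sketch below, r1.add(r2) and Rational.add(r1, r2) perform the same computation. The second form just makes explicit the argument that Python normally fills in for us.

```python
class Rational:
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def add(self, other):
        numerator = self.numerator * other.denominator + other.numerator * self.denominator
        denominator = self.denominator * other.denominator
        return Rational(numerator, denominator)

r1 = Rational(1, 10)
r2 = Rational(2, 10)

# The usual method call: Python passes r1 as self and r2 as other
r3 = r1.add(r2)

# The same call written out explicitly through the class
r4 = Rational.add(r1, r2)

print(r3.numerator, r3.denominator)   # 30 100
print(r4.numerator, r4.denominator)   # 30 100
```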

PI Questions

Overriding operators

If we attempt any comparisons of these objects, however, we will get unexpected results:

>>> r4 = Rational(3, 10)
>>> r3 == r4
False
>>> r5 = Rational(30, 100)
>>> r3 == r5
False

Why is that? Don’t r3 and r4, and especially r3 and r5, represent the same value? They do, but at the moment Python doesn’t know that. It doesn’t know how Rational objects should be compared, and so it compares them based on their location in memory, not the value they represent. Since those are distinct objects (at distinct locations in memory), the comparison evaluates to False. To implement the appropriate comparison we need to “overload” equality to be specific to Rational. Python actually lets us overload quite a few operators. In our case, we will overload ==, <= and + (so we can implement our motivating example), and override __str__ to customize how Rational objects are displayed. We overload these operators by implementing the special methods __eq__, __le__, and __add__ in our class. Python calls these methods automatically when performing ==, etc.

def __eq__(self, other):
    return self.numerator * other.denominator == other.numerator * self.denominator 

def __le__(self, other):
    return self.numerator * other.denominator <= other.numerator * self.denominator

def __add__(self, other):
    numerator = self.numerator * other.denominator + other.numerator * self.denominator
    denominator = self.denominator * other.denominator
    return Rational(numerator, denominator)

We can now correctly perform the computation in our motivating example! By implementing a class and overriding those operators we have “abstracted away” the details of operations on rational numbers. We can implement our code just as if we were using floats, but we get the correct results!

>>> Rational(1, 10) + Rational(2, 10) <= Rational(3, 10)
True
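Putting the pieces so far together, a minimal self-contained sketch of the class might look like the following (docstrings omitted for brevity). Note one assumption that is ours rather than the notes’: the cross-multiplied <= comparison is only valid when both denominators are positive.

```python
class Rational:
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __eq__(self, other):
        # a/b == c/d exactly when a*d == c*b
        return self.numerator * other.denominator == other.numerator * self.denominator

    def __le__(self, other):
        # Assumes both denominators are positive
        return self.numerator * other.denominator <= other.numerator * self.denominator

    def __add__(self, other):
        numerator = self.numerator * other.denominator + other.numerator * self.denominator
        denominator = self.denominator * other.denominator
        return Rational(numerator, denominator)

r3 = Rational(1, 10) + Rational(2, 10)

print(r3 == Rational(3, 10))          # True
print(r3 == Rational(30, 100))        # True
print(r3 <= Rational(3, 10))          # True

# The == operator is shorthand for calling the special method directly
print(r3.__eq__(Rational(3, 10)))     # True
```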

Much as Python didn’t know how to compare Rational objects by default, it doesn’t know how to print them either. So it defaults to printing the class name and the location in memory:

>>> print(r1)
<__main__.Rational object at 0x105d04748>

By overriding the __str__ method, as below, we get a more useful output:

def __str__(self):
    return str(self.numerator) + "/" + str(self.denominator)

>>> r1 = Rational(1, 10)
>>> print(r1)
1/10
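A related detail: __str__ is what print and str() use, while the interactive prompt displays a value using __repr__. So typing r1 at the prompt would still show the memory location unless __repr__ is also defined. The sketch below adds one. The particular format it returns, "Rational(1, 10)", is our assumption for illustration and not necessarily what the linked implementation does.

```python
class Rational:
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __str__(self):
        # Used by print() and str()
        return str(self.numerator) + "/" + str(self.denominator)

    def __repr__(self):
        # Used when the interactive prompt echoes a value (format chosen for illustration)
        return "Rational({}, {})".format(self.numerator, self.denominator)

r1 = Rational(1, 10)
print(str(r1))     # 1/10
print(repr(r1))    # Rational(1, 10)
```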

Reuse through inheritance

As we noted before, a common use case for precise arithmetic is dealing with currency. We would like to avoid rounding errors when there is money at stake! US dollars could be very naturally represented as a Rational where the numerator is the number of cents and the denominator is 100. But instead of displaying currency as “3/10” we would like to display it as something like “$0.30”. That is, we want a slightly customized version of Rational. We could certainly copy all of the code for Rational into a new class Dollar, customizing the methods we want, but that seems like poor design and style (all that copying should make us nervous). Instead we recognize that a Dollar is a Rational number, and thus the Dollar class can be derived (or inherit) from the Rational class. By doing so, it “inherits” all the instance variables and methods from Rational and then overrides just those that need to be different.

# The Dollar class is derived from Rational, that is Rational is the "base" class (sometimes called 
# "parent" class) and Dollar the "derived" (or "child") class. As a result, Dollar inherits all of 
# Rational's instance variables and methods.
class Dollar(Rational):
    def __init__(self, cents):
        # Use `super()` to call `Rational`'s initializer
        super().__init__(cents, 100)
    
    def __str__(self):
        # Use format method to specify 0 padding up to two digits for cents
        return "${}.{:02d}".format(self.numerator // 100, self.numerator % 100)

Via inheritance we get to reuse all the capabilities of Rational, but with the customized initializer that has an assumed denominator (of 100 cents) and the customized string generation.

>>> d1 = Dollar(10)
>>> print(d1)
$0.10
>>> d2 = Dollar(10) + Dollar(20)
>>> d2 == Dollar(30)
True
>>> print(d2)
3000/10000
>>> type(d2)
<class '__main__.Rational'>

But notice that when we printed d2, we got the Rational version. Why is that? d2 is actually a Rational object, not a Dollar. The root cause is the __add__ method, which returns a Rational. We would like to return a Rational when self is a Rational object and a Dollar when self is a Dollar object. To do so we modify that method to return an instance of the same type as self, e.g.,

def __add__(self, other):
    numerator = self.numerator * other.denominator + other.numerator * self.denominator
    denominator = self.denominator * other.denominator
    return self.__class__(numerator, denominator)

However, when we do so we get an error because the Dollar initializer only takes one argument, but __add__ provides two arguments: the numerator and denominator. We could fix that by using an optional argument that defaults to 100.

def __init__(self, cents, denominator=100):
    # Use `super()` to call `Rational`'s initializer
    super().__init__(cents, denominator)

Now we can print d2, the sum of 10 and 20 cents:

>>> d2 = Dollar(10) + Dollar(20)
>>> print(d2)
$30.00
>>> d2.numerator
3000
>>> d2.denominator
10000

Except we don’t get the right value! Why? Our __str__ method assumes the denominator is 100, but addition produces a larger denominator (here 10000), so the numerator 3000 is displayed as if it were 3000 cents. How could we fix that? For Dollar the denominator should always be a multiple of 100, so we could use the initializer to re-normalize the denominator every time we create a new object, e.g.

def __init__(self, cents, denominator=100):
    # Reduce numerator to have 100 as the denominator, so the values stay "normalized"
    normalize = (denominator // 100)
    super().__init__(cents // normalize, 100)

Now we get the correct result and can take advantage of all the capabilities of Rational without needing to re-implement those methods. The result is much more concise and maintainable!
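For reference, here is a condensed, self-contained sketch that combines the pieces from this section (docstrings and __le__ omitted for brevity). It is a summary of the steps above under the same assumptions, not a replacement for the complete linked implementation.

```python
class Rational:
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __eq__(self, other):
        return self.numerator * other.denominator == other.numerator * self.denominator

    def __add__(self, other):
        numerator = self.numerator * other.denominator + other.numerator * self.denominator
        denominator = self.denominator * other.denominator
        # Build the result with the same class as self (Rational or Dollar)
        return self.__class__(numerator, denominator)

    def __str__(self):
        return str(self.numerator) + "/" + str(self.denominator)


class Dollar(Rational):
    def __init__(self, cents, denominator=100):
        # Rescale so the stored denominator is always 100
        # (assumes denominator is a positive multiple of 100)
        normalize = denominator // 100
        super().__init__(cents // normalize, 100)

    def __str__(self):
        return "${}.{:02d}".format(self.numerator // 100, self.numerator % 100)


d2 = Dollar(10) + Dollar(20)
print(d2)                                   # $0.30
print(type(d2).__name__)                    # Dollar
print(Rational(1, 10) + Rational(2, 10))    # 30/100
```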

Make sure to check out the complete implementation which includes docstrings and additional comments, explanation, etc.

PI Questions

Python inheritance vocabulary

In the above example, Dollar derives or inherits from Rational. We would describe Rational as the base or parent class, and Dollar as the derived or child class. The derived class inherits all the methods and instance variables of its base class(es) (a class can have “grandparent” classes, “great grandparent” classes, etc. forming a chain). When you invoke a method on an object or access an instance variable, Python first searches that class for the relevant method or instance variable, then its base class, continuing recursively “up the chain” until the method/instance variable is found or there is no more base class. Thus methods in a derived class override methods of the same name in parent classes. That is, invoking the __str__ method on a Dollar object executes the __str__ method defined in Dollar, while invoking __add__ executes the method defined in Rational because there is no implementation of __add__ in Dollar (and thus Python finds it in Rational).
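We can inspect that lookup chain directly. The sketch below uses simplified versions of the two classes and the built-in __mro__ attribute (the “method resolution order”), which lists the classes Python searches, in order. vars() shows which names are defined directly in a class.

```python
class Rational:
    def __init__(self, numerator, denominator):
        self.numerator = numerator
        self.denominator = denominator

    def __add__(self, other):
        numerator = self.numerator * other.denominator + other.numerator * self.denominator
        denominator = self.denominator * other.denominator
        return Rational(numerator, denominator)

    def __str__(self):
        return str(self.numerator) + "/" + str(self.denominator)


class Dollar(Rational):
    def __init__(self, cents):
        super().__init__(cents, 100)

    def __str__(self):
        return "${}.{:02d}".format(self.numerator // 100, self.numerator % 100)


# The chain of classes Python searches, starting with Dollar itself
print([cls.__name__ for cls in Dollar.__mro__])   # ['Dollar', 'Rational', 'object']

# __str__ is defined in Dollar, so that version is used for Dollar objects
print("__str__" in vars(Dollar))      # True

# __add__ is not defined in Dollar, so Python finds it in Rational
print("__add__" in vars(Dollar))      # False
print("__add__" in vars(Rational))    # True

# A Dollar object is also a Rational
print(isinstance(Dollar(10), Rational))   # True
```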

Note that there are many different ways to implement OOP and inheritance. Other languages, such as Java (which you will learn in CS201), have slightly different approaches, but the core ideas are similar.

So when is OOP useful?

That is a hard question to answer in the abstract… but I would say that anytime you have a single logical entity, e.g., a rational number, that requires multiple pieces of data to represent, e.g., a numerator and denominator, OOP could be a useful tool. That is, we could benefit from creating a new data type for that logical entity. Another example would be calendar dates. A date is a single logical entity, but encoding a date requires multiple pieces of information, e.g., some combination of year, day and/or month. And we would like to perform computations on dates, e.g., time elapsed between two dates, without needing to know the underlying encoding. Implementing a Date class, with methods for those operations, is a very natural approach.
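As a rough sketch of that idea, a hypothetical Date class might look like the following. The class, its method names, and the choice to overload - for “days elapsed” are all our assumptions for illustration. To keep the example short, the day counting is delegated to the standard library’s datetime.date, which already provides a full-featured date type.

```python
import datetime

class Date:
    def __init__(self, year, month, day):
        # One logical entity stored as three pieces of data
        self.year = year
        self.month = month
        self.day = day

    def _ordinal(self):
        # Day number of this date, delegated to the standard library
        return datetime.date(self.year, self.month, self.day).toordinal()

    def __eq__(self, other):
        return self._ordinal() == other._ordinal()

    def __sub__(self, other):
        # Number of days elapsed from other to self
        return self._ordinal() - other._ordinal()

    def __str__(self):
        return "{:04d}-{:02d}-{:02d}".format(self.year, self.month, self.day)

start = Date(2024, 2, 1)
end = Date(2024, 3, 1)

# Users of Date don't need to know how the date is encoded internally
print(end - start)    # 29 (2024 is a leap year)
print(start)          # 2024-02-01
```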

Credits: Parts of this class were adapted from Alvarado C. et al. (2020). CS for all. Franklin Beedle & Associates