Almost every value in Python is actually an object. Values of type int, float,
etc., are objects, just like strings are objects. For example, here are all the
attributes and methods available for integers:
>>> x = -10
>>> dir(x)
['__abs__', '__add__', '__and__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getnewargs__', '__gt__', '__hash__', '__index__', '__init__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__xor__', 'bit_length', 'conjugate', 'denominator', 'from_bytes', 'imag', 'numerator', 'real', 'to_bytes']
>>> help(x)
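Because integers are objects, the names listed by dir can be used with the usual dot syntax. A small sketch of what that looks like (the operators roughly correspond to the double-underscore methods shown above):

```python
x = -10

# Built-in functions and operators are backed by methods on the object.
print(abs(x), x.__abs__())    # both give 10
print(x + 5, x.__add__(5))    # both give -5

# Regular (non-underscore) methods and attributes work too.
print(x.bit_length())         # bits needed to represent abs(x): 4
print(x.real, x.imag)         # -10 0
```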
When we create a new object, we are allocating memory for that object’s data (e.g., the integer above). When we assign that object to a variable, the variable becomes a reference to that object. Let’s look at the memory model for the following code:
z = 2
x = 1
y = x
x = z
When we reassign a variable we are changing the object the variable “points
to”. When we assign one variable to another, e.g. y = x, we are setting both
variables to point to the same object in memory. Changing these pointers does
not change the underlying objects.
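We can check this memory model ourselves with the is operator, which tests whether two variables reference the same object. A minimal sketch using the same four assignments:

```python
z = 2
x = 1
y = x    # y and x now reference the same object (the integer 1)
x = z    # x is re-pointed at the object z references; y is unaffected

print(x, y, z)   # 2 1 2
print(x is z)    # True: x and z reference the same object
print(y is x)    # False: y still references the integer 1
```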
We saw that lists are very similar to strings in many respects. But one
important difference is mutability. In Python strings are immutable, i.e., they
can’t be modified. Python lists are mutable. Let’s see the impact of
mutability. Notice that changes to the list “pointed to” by a are reflected
in b, since they both reference the same underlying list object.
>>> a = [1, 2, 3]
>>> b = a
>>> a[1] = 4
>>> a
[1, 4, 3]
>>> b
[1, 4, 3]
Let’s draw out the memory model.
In the above code, a and b point to the same underlying list. When we modify
that list via a or b, the changes are reflected in both variables. We would
say a and b are aliased.
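Aliasing is about identity, not equality. The sketch below contrasts an alias with a separate list that merely has equal contents (the variable c is added here just for illustration):

```python
a = [1, 2, 3]
b = a            # alias: no new list is created
c = [1, 2, 3]    # a separate list with equal contents

print(a is b)    # True: same object
print(a == c)    # True: equal values
print(a is c)    # False: different objects

b.append(4)      # modify the shared list via b
print(a)         # [1, 2, 3, 4]
print(c)         # [1, 2, 3]
```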
Aliasing can occur with function parameters as well.
>>> def aliasing(param):
...     param[1] = 4
...
>>> a = [1, 2, 3]
>>> aliasing(a)
>>> a
[1, 4, 3]
Let’s check out the memory model.
Since strings, integers, floats, etc., are also objects, why don’t we have the same aliasing problems? Recall that strings are immutable. So are integers, floats, etc. There are no operations on integers that modify the underlying objects; all methods and operators create new integers.
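To make this concrete, here is a short sketch showing that "changing" an integer or string through one variable leaves any other reference untouched (the variable names are just for illustration):

```python
n = 1000
m = n            # m and n reference the same int object
n += 1           # builds a new int and rebinds n; the original is untouched
print(n, m)      # 1001 1000

s = "hello"
t = s
s = s.upper()    # str methods return new strings
print(s, t)      # HELLO hello

# Attempting an in-place change of a string fails.
try:
    t[0] = "j"
except TypeError as err:
    print("TypeError:", err)
```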
Let’s consider the function below:
def my_function(a, b):
    return a + b
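Recall that a and b are the parameters, specifically “formal parameters”. When
we call my_function, the argument expressions are evaluated and each formal
parameter is assigned a reference to the resulting object. No copy of the
object is made.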
When the arguments are immutable, the multiple references to the same objects don’t matter. But as we saw earlier for mutable objects, like lists, function calls create the possibility for aliasing. Modifications applied via the function parameters are reflected in other variables that are references to the same underlying object.
For example, consider the following:
def my_function(a, b):
    a += b
    return True
Sending lists as arguments, we see that this code changes
the argument that variable a references:
>>> x = [1,2,3]
>>> y = [4,5]
>>> my_function(x, y)
True
>>> x
[1, 2, 3, 4, 5]
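The choice of operator matters here. For lists, a += b extends the existing object, while a = a + b builds a new list and only rebinds the local name. A minimal sketch (extend_in_place and rebind_only are hypothetical names used only for this comparison):

```python
def extend_in_place(a, b):
    a += b       # for lists, += modifies the object a references

def rebind_only(a, b):
    a = a + b    # + creates a new list; only the local name a changes

x = [1, 2, 3]
y = [4, 5]

rebind_only(x, y)
print(x)         # [1, 2, 3]

extend_in_place(x, y)
print(x)         # [1, 2, 3, 4, 5]
```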
Consider another example:
>>> def my_function(a):
...     a = [0]*5 # variable a now points to a new object
...     a[0] = 6
...
>>> x = [1, 2, 3, 4, 5]
>>> my_function(x)
>>> x
[1, 2, 3, 4, 5]
Let’s check out the memory model.
In my_function, why isn’t x modified? With the statement a = [0] * 5 we are
creating a new list object, different from the one originally pointed to by a
(and x), and assigning it to a. Any subsequent change made via a affects this
new object.
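If the intent really is to overwrite the caller's list, one option is slice assignment, which replaces the contents of the existing object instead of rebinding the parameter. A sketch, assuming a hypothetical function name reset_in_place:

```python
def reset_in_place(a):
    a[:] = [0] * len(a)   # replaces the contents of the existing list
    a[0] = 6

x = [1, 2, 3, 4, 5]
reset_in_place(x)
print(x)                  # [6, 0, 0, 0, 0]
```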
Why does Python implement variable and parameter assignment via references instead of copying the object? Performance. We don’t need to copy the entire object. Often we just want to “use” the value, not modify it.
What if we need an independent copy? That is, to perform variable assignment without the potential for aliasing?
For lists we can use slicing. A slice of a list is a new list object, not another reference to the original.
>>> x = [1, 2, 3, 4, 5]
>>> y = x[0:2]
>>> y
[1, 2]
>>> y[0] = 6
>>> y
[6, 2]
>>> x
[1, 2, 3, 4, 5]
>>> y = x[:]
>>> y[4] = 12
>>> y
[1, 2, 3, 4, 12]
>>> x
[1, 2, 3, 4, 5]
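Slicing is not the only way to get a new list with the same elements. As a sketch, the built-in list constructor and the list copy method behave the same way as x[:] here:

```python
x = [1, 2, 3, 4, 5]

y1 = x[:]        # full slice
y2 = list(x)     # list constructor
y3 = x.copy()    # list method

for y in (y1, y2, y3):
    print(y == x, y is x)   # True False: equal contents, different object
```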
Note that slices aren’t truly deep copies. Only the outer list is copied; if the list contains mutable values, e.g., other lists, those values are shared between the original and the slice. Consider the memory model for the following example:
>>> x = [1, 2, [3, 4], 5, 6]
>>> y = x[:]
>>> y[2][0] = 7
>>> y[3] = 8
>>> y
[1, 2, [7, 4], 8, 6]
>>> x
[1, 2, [7, 4], 5, 6]
For truly deep copies, you will need the copy module.
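A minimal sketch of the difference, using copy.copy and copy.deepcopy from the standard library on the same nested list as above (the variable names shallow and deep are just for illustration):

```python
import copy

x = [1, 2, [3, 4], 5, 6]
shallow = copy.copy(x)     # new outer list, nested list is shared
deep = copy.deepcopy(x)    # nested list is copied as well

deep[2][0] = 7
print(x)                   # [1, 2, [3, 4], 5, 6]

shallow[2][0] = 9
print(x)                   # [1, 2, [9, 4], 5, 6]
print(deep)                # [1, 2, [7, 4], 5, 6]
```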
swap
For the code below, what are the values of x and y after invoking swap:
>>> def swap(a, b):
...     temp = a
...     a = b
...     b = temp
...
>>> x = 10
>>> y = 20
>>> swap(x, y)
>>> x
10
>>> y
20
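What is happening here? Let’s visualize the memory model. Inside swap, the assignments only change what the local names a and b point to; x and y still reference their original objects. So can we not implement swap in Python? Recall that we can implement swap using tuples: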
>>> x = 10
>>> y = 20
>>> (y, x) = (x, y)
>>> x
20
>>> y
10
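If we still want swap-like helper functions, two possible approaches are sketched below: return the values in swapped order and let the caller reassign, or swap two positions inside a mutable list. Both function names are hypothetical.

```python
def swapped(a, b):
    """Return the two arguments in reverse order."""
    return b, a

x, y = 10, 20
x, y = swapped(x, y)    # the caller does the rebinding
print(x, y)             # 20 10

def swap_items(a_list, i, j):
    """Swap the elements at indices i and j, modifying a_list."""
    a_list[i], a_list[j] = a_list[j], a_list[i]

nums = [10, 20, 30]
swap_items(nums, 0, 2)
print(nums)             # [30, 20, 10]
```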
For those interested, this optional section provides an example contrasting modifying a list parameter vs. returning a new copy of a list.
Consider the following two functions:
def insert_after(a_list, n1, n2):
    """
    Insert n2 after each occurrence of n1 in a_list.
    (list of int, int, int) -> NoneType
    """
and
def insert_after2(a_list, n1, n2):
    """
    Return a new list consisting of all elements from a_list,
    plus a copy of n2 after each occurrence of n1.
    (list of int, int, int) -> list of int
    >>> insert_after2([3, 4, 5], 3, 10)
    [3, 10, 4, 5]
    """
How would the implementations differ? The first would directly modify
its parameter. The second will need to create a new list to be
returned. Why a while loop in the first function below? Since we
are modifying the list, its length and indices are changing, so we need to
evaluate the stopping condition on every iteration.
def insert_after(a_list, n1, n2):
    """
    Insert n2 after each occurrence of n1 in a_list.
    (list of int, int, int) -> NoneType
    """
    i = 0
    while i < len(a_list):
        if a_list[i] == n1:
            a_list.insert(i+1, n2)
            i += 1  # skip over the element we just inserted
        i += 1
def insert_after2(a_list, n1, n2):
    """
    Return a new list consisting of all elements from a_list,
    plus a copy of n2 after each occurrence of n1.
    (list of int, int, int) -> list of int
    >>> insert_after2([3, 4, 5], 3, 10)
    [3, 10, 4, 5]
    """
    new_list = []
    for element in a_list:
        new_list.append(element)
        if element == n1:
            new_list.append(n2)
    return new_list
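To see why the while loop matters when modifying the list in place, compare it with a for loop over range(len(a_list)), where the range is computed once from the original length. The sketch below repeats insert_after so that it runs on its own; insert_after_for is a hypothetical broken variant written only for this comparison.

```python
def insert_after(a_list, n1, n2):
    """Same approach as above: len(a_list) is re-checked on every pass."""
    i = 0
    while i < len(a_list):
        if a_list[i] == n1:
            a_list.insert(i + 1, n2)
            i += 1   # skip the element just inserted
        i += 1

def insert_after_for(a_list, n1, n2):
    """Hypothetical broken variant: range() uses the original length only."""
    for i in range(len(a_list)):
        if a_list[i] == n1:
            a_list.insert(i + 1, n2)

good = [3, 4, 3]
insert_after(good, 3, 10)
print(good)   # [3, 10, 4, 3, 10]

bad = [3, 4, 3]
insert_after_for(bad, 3, 10)
print(bad)    # [3, 10, 4, 3] (the second 3 is never reached)
```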