Data scientist at Port Jackson Partners in Sydney, Australia. My PhD was in computational biology. In my spare time I write about medical research at BioSky.co.
Did you know that every object in your Python program is given an identifier by the interpreter, unique for as long as the object exists, which you can retrieve with the built-in ‘id()’ function? Let’s see what happens when we assign variables to each other in Python and then print out each variable’s value and object id:
    x = 5
    y = x
    print(x, id(x))
    print(y, id(y))
Now I am printing out two things about each variable: the value it stores (the number 5) and the object id. The object id for both variables should be the same. This is telling us that both the x and y variables are referring to the same object. Now, what happens if I were to change the y variable?
    x = 5
    y = x
    y += 1
    print(x, id(x))
    print(y, id(y))
Here I’ve added 1 to the y variable, and as you can see, both the value and the object id of y are now different from those of x. This change has occurred because integers in Python are immutable. Integers, floats, strings and tuples are all examples of immutable objects. What this means is that when you change the number a variable refers to, the existing object is not modified; instead, the Python interpreter points the variable to a completely different object.
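The same rebinding happens with the other immutable types. Here is a small sketch using a string and a tuple (the variable names are just for illustration). It uses the ‘is’ operator, which compares identity rather than value, so the result does not depend on the particular numbers ‘id()’ happens to return:

```python
# Immutable objects cannot be changed in place, so an "update" rebinds the name.
s = "abc"
t = s            # t and s refer to the same string object
t += "d"         # builds a new string and rebinds t; s is untouched

print(s, t)      # abc abcd
print(s is t)    # False

point = (1, 2)
try:
    point[0] = 10        # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)
```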
Now, the distinction between mutable and immutable objects in Python starts to become more important when you start working with mutable data types like lists:
    x = [1]
    y = x            # y is another name for the same list
    y.append(5)
    print(x, id(x))
    print(y, id(y))
The thing that may surprise a lot of people is that the list both x and y point to has changed! The list now has a 1 and a 5 in it, even though I only appended to y. This is very easy to get tripped up on if you don’t know what Python is really doing under the hood.
Now how do you get around this? Luckily the standard library has us covered with a handy module called ‘copy’:
    import copy

    x = [1]
    y = copy.copy(x)   # y is a new list with the same contents
    y.append(5)
    print(x, id(x))
    print(y, id(y))
Once you use ‘copy.copy’ on x, you can see from the output that y has a new object id and that changes you make to y no longer affect x. Problem solved, right?
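For lists specifically, there are a few other common ways to get the same kind of copy. A brief sketch, with illustrative variable names, showing that each one produces a new list object with equal contents:

```python
import copy

x = [1]

# Each of these creates a new list object holding the same elements.
a = copy.copy(x)
b = x.copy()     # list method
c = list(x)      # list constructor
d = x[:]         # full slice

for name, candidate in [("a", a), ("b", b), ("c", c), ("d", d)]:
    # Equal in value to x, but not the same object as x
    print(name, candidate == x, candidate is x)

a.append(5)
print(x)         # [1]
```

Which one to use is mostly a matter of taste; ‘copy.copy’ has the advantage of working on other container types too.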
Well, there’s just one final thing to consider: what happens if I have a list of lists and I use ‘copy.copy’ on it?
    import copy

    x = [[1, 2], [3, 4]]
    y = copy.copy(x)
    y[0].append(5)   # append to the first inner list
    print(x, id(x))
    print(y, id(y))
You can see that I’ve made a copy of the x variable, then appended to the first list in y (the one containing [1, 2]). However, when you see the output, you’ll notice that like before, the change made to y has also modified x. You can see the reason for this by looking at the object id of the first list of x and y:
    print(x[0], id(x[0]))
    print(y[0], id(y[0]))
Both names refer to the same inner list object: only the outer list was copied and received a new object id. This is known as a shallow copy. To solve this, we need the ‘deepcopy’ function from the same module. It creates independent copies of all the objects nested inside the list, not just a copy of the outer list itself:
    import copy

    x = [[1, 2], [3, 4]]
    y = copy.deepcopy(x)
    y[0].append(5)   # only y's first inner list changes
    print(x, id(x))
    print(y, id(y))
Now we should see that changes to the inner lists of y no longer affect x, and vice versa.
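To see the difference between the two kinds of copy side by side, here is a short sketch (the names ‘shallow’ and ‘deep’ are just for illustration) that checks identity directly with ‘is’ instead of comparing ids by eye:

```python
import copy

x = [[1, 2], [3, 4]]
shallow = copy.copy(x)
deep = copy.deepcopy(x)

# The outer list is a new object in both cases
print(shallow is x, deep is x)   # False False

# Only the deep copy has its own inner lists
print(shallow[0] is x[0])        # True
print(deep[0] is x[0])           # False

deep[0].append(5)
print(x)                         # [[1, 2], [3, 4]]
print(deep)                      # [[1, 2, 5], [3, 4]]
```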
So how is this useful? The reason I decided to write this post this evening was because some of the code I was working on today involved initialising multiple dictionaries with many keys. I could have copied and pasted them, but it was a lot easier and more readable for me to create an initial variable, then use ‘deepcopy’ to initialise the other variables. If I hadn’t been aware of Python object ids and mutable types, then I could very easily have created a massive bug in my code by creating several variables that all modified the same dictionary.
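A minimal sketch of that pattern is below. The dictionary keys and variable names are hypothetical, made up for illustration, since the actual code isn’t shown here:

```python
import copy

# Hypothetical template; the real keys are not given in this post.
template = {"counts": [], "totals": {"a": 0, "b": 0}}

# Plain assignment (first = template) would make every name share one dictionary.
first = copy.deepcopy(template)
second = copy.deepcopy(template)

first["counts"].append(10)
first["totals"]["a"] += 1

print(first)     # {'counts': [10], 'totals': {'a': 1, 'b': 0}}
print(second)    # {'counts': [], 'totals': {'a': 0, 'b': 0}}
print(template)  # {'counts': [], 'totals': {'a': 0, 'b': 0}}
```

Because the values here are themselves mutable (a list and a nested dictionary), a shallow copy would not be enough: the copies would still share those inner objects.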