What are the options to clone or copy a list in Python?
In Python 3, a shallow copy can be made with:
a_copy = a_list.copy()
In Python 2 and 3, you can get a shallow copy with a full slice of the original:
a_copy = a_list[:]
There are two semantic ways to copy a list. A shallow copy creates a new list of the same objects, a deep copy creates a new list containing new equivalent objects.
A shallow copy only copies the list itself, which is a container of references to the objects in the list. If the objects contained themselves are mutable and one is changed, the change will be reflected in both lists.
There are different ways to do this in Python 2 and 3. The Python 2 ways will also work in Python 3.
In Python 2, the idiomatic way of making a shallow copy of a list is with a complete slice of the original:
a_copy = a_list[:]
You can also accomplish the same thing by passing the list through the list constructor,
a_copy = list(a_list)
but using the constructor is less efficient:
>>> timeit
>>> l = range(20)
>>> min(timeit.repeat(lambda: l[:]))
0.30504298210144043
>>> min(timeit.repeat(lambda: list(l)))
0.40698814392089844
In Python 3, lists get the list.copy
method:
a_copy = a_list.copy()
In Python 3.5:
>>> import timeit
>>> l = list(range(20))
>>> min(timeit.repeat(lambda: l[:]))
0.38448613602668047
>>> min(timeit.repeat(lambda: list(l)))
0.6309100328944623
>>> min(timeit.repeat(lambda: l.copy()))
0.38122922903858125
Using new_list = my_list then modifies new_list every time my_list changes. Why is this?
my_list
is just a name that points to the actual list in memory. When you say new_list = my_list
you're not making a copy, you're just adding another name that points at that original list in memory. We can have similar issues when we make copies of lists.
>>> l = [[], [], []]
>>> l_copy = l[:]
>>> l_copy
[[], [], []]
>>> l_copy[0].append('foo')
>>> l_copy
[['foo'], [], []]
>>> l
[['foo'], [], []]
The list is just an array of pointers to the contents, so a shallow copy just copies the pointers, and so you have two different lists, but they have the same contents. To make copies of the contents, you need a deep copy.
To make a deep copy of a list, in Python 2 or 3, use deepcopy
in the copy
module:
import copy
a_deep_copy = copy.deepcopy(a_list)
To demonstrate how this allows us to make new sub-lists:
>>> import copy
>>> l
[['foo'], [], []]
>>> l_deep_copy = copy.deepcopy(l)
>>> l_deep_copy[0].pop()
'foo'
>>> l_deep_copy
[[], [], []]
>>> l
[['foo'], [], []]
And so we see that the deep copied list is an entirely different list from the original. You could roll your own function - but don't. You're likely to create bugs you otherwise wouldn't have by using the standard library's deepcopy function.
eval
You may see this used as a way to deepcopy, but don't do it:
problematic_deep_copy = eval(repr(a_list))
In 64 bit Python 2.7:
>>> import timeit
>>> import copy
>>> l = range(10)
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
27.55826997756958
>>> min(timeit.repeat(lambda: eval(repr(l))))
29.04534101486206
on 64 bit Python 3.5:
>>> import timeit
>>> import copy
>>> l = list(range(10))
>>> min(timeit.repeat(lambda: copy.deepcopy(l)))
16.84255409205798
>>> min(timeit.repeat(lambda: eval(repr(l))))
34.813894678023644