From my understanding of deep/shallow copying, shallow copying assigns a new identifier that points at the same object:
>>> x = [1, 2, 3]
>>> y = x
>>> x, y
([1, 2, 3], [1, 2, 3])
>>> x is y
True
>>> x[1] = 14
>>> x, y
([1, 14, 3], [1, 14, 3])
Deep copying creates a new object with an equivalent value:
>>> import copy
>>> x = [1, 2, 3]
>>> y = copy.deepcopy(x)
>>> x is y
False
>>> x == y
True
>>> x[1] = 14
>>> x, y
([1, 14, 3], [1, 2, 3])
My confusion is this: if y = x creates a shallow copy, and the copy.copy() function also creates a shallow copy of the object, then:
>>> import copy
>>> x = [1,2,3]
>>> y = x
>>> z = copy.copy(x)
>>> x is y
True
>>> x is z
False
>>> id(x),id(y),id(z)
(4301106640, 4301106640, 4301173968)
Why is it creating a new object if it is supposed to be a shallow copy?
Plain assignment (y = x) is not a copy at all; it only binds a second name to the same object. A shallow copy creates a new list object and copies across all the references contained in the source list. A deep copy creates new objects recursively.
You won't see the difference with just immutable contents. Use nested lists to see the difference:
>>> import copy
>>> a = ['foo', 'bar', 'baz']
>>> b = ['spam', 'ham', 'eggs']
>>> outer = [a, b]
>>> copy_of_outer = copy.copy(outer)
>>> outer is copy_of_outer
False
>>> outer == copy_of_outer
True
>>> outer[0] is a
True
>>> copy_of_outer[0] is a
True
>>> outer[0] is copy_of_outer[0]
True
A new copy of the outer list was created, but the contents of the original and the copy are still the same objects.
>>> deep_copy_of_outer = copy.deepcopy(outer)
>>> deep_copy_of_outer[0] is a
False
>>> outer[0] is deep_copy_of_outer[0]
False
The deep copy doesn't share contents with the original; the list a has been recursively copied as well.
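To put the three cases from the question side by side, here is a minimal sketch (the variable names are only illustrative) that mutates both the nested list and the outer list, then shows which of the results see each change:

```python
import copy

inner = [1, 2]
original = [inner, 3]

alias = original                  # assignment: no copy, just a second name
shallow = copy.copy(original)     # new outer list, same inner objects
deep = copy.deepcopy(original)    # new outer list and new inner objects

inner.append(99)                  # mutate the nested list in place
original.append(4)                # mutate the outer list in place

print(alias)    # [[1, 2, 99], 3, 4]
print(shallow)  # [[1, 2, 99], 3]
print(deep)     # [[1, 2], 3]
```

The alias sees both changes, the shallow copy sees only the change to the shared inner list, and the deep copy sees neither.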
I would like to create a copy of an object. I want the new object to possess all properties of the old object (values of the fields). But I want to have independent objects. So, if I change values of the fields of the new object, the old object should not be affected by that.
To get a fully independent copy of an object you can use the copy.deepcopy() function.
For more details about shallow and deep copying please refer to the other answers to this question and the nice explanation in this answer to a related question.
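As a small illustration of that advice, the sketch below deep-copies an instance of a hypothetical Settings class (the class and its fields are made up for this example) and then changes the copy's fields:

```python
import copy

class Settings:  # hypothetical class, used only for illustration
    def __init__(self, name, tags):
        self.name = name
        self.tags = tags

old = Settings("default", ["a", "b"])
new = copy.deepcopy(old)

new.name = "custom"      # rebinding a field on the copy
new.tags.append("c")     # mutating a field's value on the copy

print(old.name, old.tags)  # default ['a', 'b']
print(new.name, new.tags)  # custom ['a', 'b', 'c']
```

Neither kind of change reaches the old object, because deepcopy also copied the list stored in tags.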
How can I create a copy of an object in Python?
So, if I change values of the fields of the new object, the old object should not be affected by that.
You mean a mutable object then.
In Python 3, lists get a copy method (in 2, you'd use a slice to make a copy):
>>> a_list = list('abc')
>>> a_copy_of_a_list = a_list.copy()
>>> a_copy_of_a_list is a_list
False
>>> a_copy_of_a_list == a_list
True
Shallow Copies
Shallow copies are just copies of the outermost container.
list.copy is a shallow copy:
>>> list_of_dict_of_set = [{'foo': set('abc')}]
>>> lodos_copy = list_of_dict_of_set.copy()
>>> lodos_copy[0]['foo'].pop()
'c'
>>> lodos_copy
[{'foo': {'b', 'a'}}]
>>> list_of_dict_of_set
[{'foo': {'b', 'a'}}]
You don't get a copy of the interior objects. They're the same object - so when they're mutated, the change shows up in both containers.
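The same holds for the other common ways of shallow-copying a list. A short sketch, assuming Python 3, that checks each of them:

```python
import copy

source = [{'foo': set('abc')}]

copies = {
    'list.copy()': source.copy(),
    'slice [:]': source[:],
    'list(...)': list(source),
    'copy.copy()': copy.copy(source),
}

for label, candidate in copies.items():
    # Each result is a distinct outer list that holds the same inner dict.
    print(label, candidate is source, candidate[0] is source[0])
```

Every line prints False for the outer list and True for the inner dict, so all four spellings share the interior objects.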
Deep copies
Deep copies are recursive copies of each interior object.
>>> lodos_deep_copy = copy.deepcopy(list_of_dict_of_set)
>>> lodos_deep_copy[0]['foo'].add('c')
>>> lodos_deep_copy
[{'foo': {'c', 'b', 'a'}}]
>>> list_of_dict_of_set
[{'foo': {'b', 'a'}}]
Changes are not reflected in the original, only in the copy.
Immutable objects
Immutable objects do not usually need to be copied. In fact, if you try to, Python will just give you the original object:
>>> a_tuple = tuple('abc')
>>> tuple_copy_attempt = a_tuple.copy()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'copy'
Tuples don't even have a copy method, so let's try it with a slice:
>>> tuple_copy_attempt = a_tuple[:]
But we see it's the same object:
>>> tuple_copy_attempt is a_tuple
True
Similarly for strings:
>>> s = 'abc'
>>> s0 = s[:]
>>> s == s0
True
>>> s is s0
True
and for frozensets, even though they have a copy method:
>>> a_frozenset = frozenset('abc')
>>> frozenset_copy_attempt = a_frozenset.copy()
>>> frozenset_copy_attempt is a_frozenset
True
When to copy immutable objects
Immutable objects should be copied if you need a mutable interior object copied.
>>> tuple_of_list = [],
>>> copy_of_tuple_of_list = tuple_of_list[:]
>>> copy_of_tuple_of_list[0].append('a')
>>> copy_of_tuple_of_list
(['a'],)
>>> tuple_of_list
(['a'],)
>>> deepcopy_of_tuple_of_list = copy.deepcopy(tuple_of_list)
>>> deepcopy_of_tuple_of_list[0].append('b')
>>> deepcopy_of_tuple_of_list
(['a', 'b'],)
>>> tuple_of_list
(['a'],)
As we can see, when the interior object of the copy is mutated, the original does not change.
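A quick way to see both behaviours together is to deep-copy one tuple that holds only immutable objects and one that holds a list. This is a sketch; the identity result for the first tuple is what CPython does and should be treated as an implementation detail:

```python
import copy

flat = (1, 'two', 3.0)        # contains only immutable objects
nested = ([], 'two')          # contains a mutable list

flat_copy = copy.deepcopy(flat)
nested_copy = copy.deepcopy(nested)

print(flat_copy is flat)            # True on CPython: nothing needed copying
print(nested_copy is nested)        # False: a new tuple was built
print(nested_copy[0] is nested[0])  # False: the inner list was copied
```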
Custom Objects
Custom objects usually store data in a __dict__ attribute or in __slots__ (a tuple-like memory structure).
To make a copyable object, define __copy__ (for shallow copies) and/or __deepcopy__ (for deep copies).
from copy import copy, deepcopy

class Copyable:
    __slots__ = 'a', '__dict__'

    def __init__(self, a, b):
        self.a, self.b = a, b

    def __copy__(self):
        return type(self)(self.a, self.b)

    def __deepcopy__(self, memo):  # memo is a dict of ids to copies
        id_self = id(self)  # memoization avoids unnecessary recursion
        _copy = memo.get(id_self)
        if _copy is None:
            _copy = type(self)(
                deepcopy(self.a, memo),
                deepcopy(self.b, memo))
            memo[id_self] = _copy
        return _copy
Note that deepcopy keeps a memoization dictionary of id(original) (or identity numbers) to copies. To enjoy good behavior with recursive data structures, make sure you haven't already made a copy, and if you have, return that.
So let's make an object:
>>> c1 = Copyable(1, [2])
And copy makes a shallow copy:
>>> c2 = copy(c1)
>>> c1 is c2
False
>>> c2.b.append(3)
>>> c1.b
[2, 3]
And deepcopy now makes a deep copy:
>>> c3 = deepcopy(c1)
>>> c3.b.append(4)
>>> c1.b
[2, 3]
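The memo dictionary mentioned above is also what lets deepcopy handle shared and self-referential structures for built-in containers. A minimal sketch with plain lists:

```python
import copy

shared = ['x']
outer = [shared, shared]      # the same list appears twice
outer.append(outer)           # and the outer list contains itself

clone = copy.deepcopy(outer)

print(clone[0] is clone[1])   # True: the shared list was copied only once
print(clone[2] is clone)      # True: the cycle points at the clone itself
print(clone[0] is shared)     # False: the clone does not reuse the original
```

Without the memo, the self-reference would recurse forever and the two references to shared would become two separate lists.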
Shallow copy with copy.copy()
#!/usr/bin/env python3
import copy

class C():
    def __init__(self):
        self.x = [1]
        self.y = [2]

# It copies.
c = C()
d = copy.copy(c)
d.x = [3]
assert c.x == [1]
assert d.x == [3]

# It's shallow.
c = C()
d = copy.copy(c)
d.x[0] = 3
assert c.x == [3]
assert d.x == [3]
Deep copy with copy.deepcopy()
#!/usr/bin/env python3
import copy

class C():
    def __init__(self):
        self.x = [1]
        self.y = [2]

c = C()
d = copy.deepcopy(c)
d.x[0] = 3
assert c.x == [1]
assert d.x == [3]
Documentation: https://docs.python.org/3/library/copy.html
Tested on Python 3.6.5.
I believe the following should work with many well-behaved classes in Python:
def copy(obj):
    return type(obj)(obj)
(Of course, I am not talking here about "deep copies," which is a different story, and which may be not a very clear concept -- how deep is deep enough?)
According to my tests with Python 3, for immutable objects, like tuples or strings, it returns the same object (because there is no need to make a shallow copy of an immutable object), but for lists or dictionaries it creates an independent shallow copy.
Of course this method only works for classes whose constructors behave accordingly. Possible use cases: making a shallow copy of a standard Python container class.
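To make that caveat concrete, here is a sketch of the same idea under a different, hypothetical helper name (so it does not shadow the copy module), tried on two standard containers and on a made-up Point class whose constructor does not accept an instance of itself:

```python
def shallow_copy_via_constructor(obj):
    """Hypothetical helper: copy by passing the object to its own type."""
    return type(obj)(obj)

numbers = [1, 2, 3]
mapping = {'a': [1]}

numbers_copy = shallow_copy_via_constructor(numbers)
mapping_copy = shallow_copy_via_constructor(mapping)

print(numbers_copy is numbers)            # False: a new list
print(mapping_copy['a'] is mapping['a'])  # True: the copy is shallow

class Point:  # hypothetical class whose constructor takes coordinates
    def __init__(self, x, y):
        self.x, self.y = x, y

constructor_failed = False
try:
    shallow_copy_via_constructor(Point(1, 2))
except TypeError as exc:
    constructor_failed = True
    print('not copyable this way:', exc)
```

For classes like Point, copy.copy() is the safer choice because it does not depend on the constructor's signature.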
For example, say I want to make a deep copy of a list a, called b:
a = [1,2,3,4,5]
Is there any difference between:
import copy
b = copy.deepcopy(a)
and:
b = a*1
In both cases I've created a new object (i.e. id(a) == id(b) is False), so are there any practical differences I should understand? Thanks!
No, they aren't equivalent. The multiplication operator makes only a shallow copy. A deep copy also copies the objects the list refers to, recursively, so the new list holds references to new objects. A shallow copy creates only a new top-level list whose elements still refer to the same inner objects, as demonstrated below:
import copy
a = [[],[]]
b = copy.deepcopy(a)
c = a * 1
for i, v in enumerate(a):
    print(id(v), id(b[i]), id(c[i]))
This outputs:
31231832 31261480 31231832
31260800 31261400 31260800
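The practical consequence of those matching ids shows up as soon as an inner list is mutated. A short sketch reusing the same a, b and c:

```python
import copy

a = [[], []]
b = copy.deepcopy(a)
c = a * 1

c[0].append('changed')   # mutate an inner list through the a * 1 copy

print(a)  # [['changed'], []]
print(b)  # [[], []]
print(c)  # [['changed'], []]
```

The change made through c is visible in a, while the deep copy b is unaffected. For a flat list of integers like the one in the question, the two approaches happen to behave the same because there is nothing mutable inside to share.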
Why does the following code
from collections import deque
d = deque('abc')
a = d
d.clear()
print(a)
deque([])
print an empty deque? I expected a to retain the data despite clearing the old deque.
a and d reference the same object. So if you clear it, it will be cleared for "both variables".
You could check that by printing the identity of objects.
>>> id(a)
44988624L
>>> id(d)
44988624L
Assignment never copies a value in Python. With immutable types such as int it only looks that way, because such objects cannot be changed in place.
If you want an independent object you have to copy it explicitly, because a variable only holds a reference to the object.
You could do that with
d = deque('abc')
a = deque('abc')
or with
>>> import copy
>>> d = copy.copy(a)
which results in
>>> id(a)
44988624L
>>> id(d)
44989352L
but then a and d are two different objects, and they will diverge as soon as you modify one of them.
The line:
a = d
does not create a copy - it just creates another name for the same object.
To create a copy, do this:
d = deque('abc')
a = deque(d)
>>> from copy import deepcopy
>>> d = deque('abc')
>>> a = deepcopy(d)
>>> d.clear()
>>> a
deque(['a', 'b', 'c'])
Or you can use deque's built-in copy function.
>>> d = deque('abc')
>>> a = d.__copy__
>>> a
<built-in method __copy__ of collections.deque object at 0x02437C70>
>>> a = d.__copy__()
>>> a
deque(['a', 'b', 'c'])
>>> d.clear()
>>> a
deque(['a', 'b', 'c'])
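For comparison, here is a sketch of three ordinary ways to get an independent deque without calling the dunder method directly. It assumes Python 3.5 or later, where deque.copy() is available:

```python
from collections import deque
import copy

d = deque('abc')

by_constructor = deque(d)       # build a new deque from the old one's items
by_copy_module = copy.copy(d)   # shallow copy via the copy module
by_method = d.copy()            # deque.copy() exists since Python 3.5

d.clear()

print(d)               # deque([])
print(by_constructor)  # deque(['a', 'b', 'c'])
print(by_copy_module)  # deque(['a', 'b', 'c'])
print(by_method)       # deque(['a', 'b', 'c'])
```

All three are shallow copies, which is enough here because the elements are immutable strings.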
You were binding a second name to the same object, which is why clearing d also cleared a. Instead, copy the object d into a, for example with deepcopy, which creates a new object for you rather than another reference to the existing one.
>>> id(a)
37976360
>>> id(d)
37976248
The official Python docs say that using the slicing operator and assigning in Python makes a shallow copy of the sliced list.
But when I write code for example:
o = [1, 2, 4, 5]
p = o[:]
And when I write:
id(o)
id(p)
I get different ids, and appending to one list is not reflected in the other list. Isn't it creating a deep copy, or is there somewhere I am going wrong?
You are creating a shallow copy, because nested values are not copied, merely referenced. A deep copy would create copies of the values referenced by the list too.
Demo:
>>> lst = [{}]
>>> lst_copy = lst[:]
>>> lst_copy[0]['foo'] = 'bar'
>>> lst_copy.append(42)
>>> lst
[{'foo': 'bar'}]
>>> id(lst) == id(lst_copy)
False
>>> id(lst[0]) == id(lst_copy[0])
True
Here the nested dictionary is not copied; it is merely referenced by both lists. The new element 42 is not shared.
Remember that everything in Python is an object, and names and list elements are merely references to those objects. A copy of a list creates a new outer list, but the new list merely receives references to the exact same objects.
A proper deep copy creates new copies of each and every object contained in the list, recursively:
>>> from copy import deepcopy
>>> lst_deepcopy = deepcopy(lst)
>>> id(lst_deepcopy[0]) == id(lst[0])
False
You should know that tests using is or id can be misleading about whether a true copy is being made when the objects involved are immutable and interned, such as strings, integers and tuples that contain immutables.
Consider an easily understood example of interned strings:
>>> l1=['one']
>>> l2=['one']
>>> l1 is l2
False
>>> l1[0] is l2[0]
True
Now make a shallow copy of l1 and test the immutable string:
>>> l3=l1[:]
>>> l3 is l1
False
>>> l3[0] is l1[0]
True
Now make a copy of the string contained by l1[0]:
>>> s1=l1[0][:]
>>> s1
'one'
>>> s1 is l1[0] is l2[0] is l3[0]
True # they are all the same object
Try a deepcopy where every element should be copied:
>>> from copy import deepcopy
>>> l4=deepcopy(l1)
>>> l4[0] is l1[0]
True
In each case, the string 'one' is interned in Python's internal cache of immutable strings, so is shows that they are the same object (they have the same id). What gets interned, and when, depends on the implementation and version, so you cannot depend on it. Interning can be a substantial memory and performance enhancement.
You can force an example that does not get interned instantly:
>>> s2=''.join(c for c in 'one')
>>> s2==l1[0]
True
>>> s2 is l1[0]
False
And then you can use the Python intern function (sys.intern in Python 3) to cause that string to refer to the cached object if found:
>>> l1[0] is s2
False
>>> s2=intern(s2)
>>> l1[0] is s2
True
Same applies to tuples of immutables:
>>> t1=('one','two')
>>> t2=t1[:]
>>> t1 is t2
True
>>> t3=deepcopy(t1)
>>> t3 is t2 is t1
True
And mutable lists of immutables (like integers) can have the list members interned:
>>> li1=[1,2,3]
>>> li2=deepcopy(li1)
>>> li2 == li1
True
>>> li2 is li1
False
>>> li1[0] is li2[0]
True
So you may use python operations that you KNOW will copy something but the end result is another reference to an interned immutable object. The is test is only a dispositive test of a copy being made IF the items are mutable.
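The intern example above uses the Python 2 builtin. A rough Python 3 equivalent, as a sketch using sys.intern, looks like this; whether the run-time string is initially a distinct object is an implementation detail, so only the results that do not depend on it are annotated:

```python
import sys

literal = 'one'
built = ''.join(c for c in 'one')   # assembled at run time

print(built == literal)             # True: equal values

# sys.intern returns the canonical interned string for a given value,
# so interning both names yields one and the same object.
interned_built = sys.intern(built)
interned_literal = sys.intern(literal)
print(interned_built is interned_literal)  # True
```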
Reading the Python 3.2 tutorial here, towards the end one of the examples is
a[:] = []
Is this equivalent to
a = []
? If it is, why did they write a[:] instead of a? If it isn't, what is the difference?
They are not equivalent. These two examples should get you to understand the difference.
Example 1:
>>> b = [1, 2, 3]
>>> a = b
>>> a[:] = []
>>> print(b)
[]
Example 2:
>>> b = [1, 2, 3]
>>> a = b
>>> a = []
>>> print(b)
[1, 2, 3]
That is explained, as you would expect, right there where they use it:
This means that the following slice returns a shallow copy of the list a
The second line doesn't modify the list; it simply arranges for a to point to a new, empty list. The first line modifies the list pointed at by a. Consider this sample session in the Python interpreter:
>>> b=[1,2,3]
>>> a=b
>>> a[:]=[]
>>> a
[]
>>> b
[]
Both a and b point to the same list, so we can see that a[:]=[] empties the list and now both a and b point to the same empty list.
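The two statements can also be compared in a single script. This sketch uses numbered names (a1/b1 and so on, chosen only for illustration) and adds list.clear(), available since Python 3.3, as another in-place spelling:

```python
b1 = [1, 2, 3]
a1 = b1
a1[:] = []          # slice assignment empties the shared list in place
print(a1 is b1, b1)  # True []

b2 = [1, 2, 3]
a2 = b2
a2 = []             # rebinding: only the name a2 changes
print(a2 is b2, b2)  # False [1, 2, 3]

b3 = [1, 2, 3]
a3 = b3
a3.clear()          # list.clear() also mutates the shared list in place
print(a3 is b3, b3)  # True []
```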