Python shallow copies: can they be similar to deep copies?

I am playing about with shallow copies in Python. I came across a gotcha that I was not expecting.
My assumption was that a shallow copy is a new instance of a class, holding references to the same objects as the original. This behavior is shown below.
>>> import copy
>>> a = { 'a': 1, 'b':2, 'c': [[1,2],2,3,4,5,6] }
>>> c = copy.copy(a['c'])
>>> d = a['c']
>>> a['c'] is c
False
>>> a['c'][0] is c[0]
True
>>> a['c'] is d
True
>>> a['c'][0] is d[0]
True
What surprised me was the following. As the elements of the shallow copied list are references to the elements in the list from a I assumed when I changed the mutable 1st element it would also change in a.
>>> c[0] = [3,3]
>>> c
[[3, 3], 2, 3, 4, 5, 6]
>>> a
{'a': 1, 'c': [[1, 2], 2, 3, 4, 5, 6], 'b': 2}
>>> a['c'][0] is c[0]
False
I see that after the change, the first element of c no longer refers to the first element of the list in a.
My Question:
If I changed all elements in the list would it be similar to a deepcopy?

As the elements of the shallow copied list are references to the elements in the list from 'a' I assumed when I changed the mutable 1st element it would also change in 'a'.
The 1st element of c is a list, which is mutable. So, if you actually did mutate it, the result would be visible in a. For example:
>>> a = { 'a': 1, 'b':2, 'c': [[1],2,3,4,5,6] }
>>> c = copy.copy(a['c'])
>>> c[0].append(0)
>>> a
{'a': 1, 'b': 2, 'c': [[1, 0], 2, 3, 4, 5, 6]}
But you didn't mutate it; you just replaced it with a different value.
The fact that both the original value ([1, 2]) and the new one ([3, 3]) are mutable is irrelevant; you aren't mutating anything (except for c, of course… but c is, as you already know, a shallow copy of a['c'], not the same object).
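To make the distinction concrete, here is a minimal sketch that puts the two cases side by side. The names c and d and the value 99 are only illustrative.

```python
import copy

a = {'a': 1, 'b': 2, 'c': [[1, 2], 2, 3, 4, 5, 6]}

# Case 1: replace (rebind) a slot in the shallow copy.
c = copy.copy(a['c'])
c[0] = [3, 3]            # c[0] now refers to a brand-new list
print(a['c'][0])         # [1, 2]  (the original is untouched)

# Case 2: mutate the inner list that the shallow copy shares with a['c'].
d = copy.copy(a['c'])
d[0].append(99)          # d[0] and a['c'][0] are the same object
print(a['c'][0])         # [1, 2, 99]
```

Only the second case is visible through a, because only there is a shared object actually modified.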
So:
If I changed all elements in the list would it be similar to a deepcopy?
No, on two counts. Changing shared elements means you're changing all references. Replacing all elements of the list would be "similar to a deepcopy"… but not the same, unless you replaced them with deepcopy-like copies of the originals. If you replace them with shallow copies, you only push the exact same issue one level down. For example:
>>> a = [[[0]]]
>>> b = copy.copy(a[0])
>>> b[0] = copy.copy(b[0])
>>> a[0] is b
False
>>> a[0][0] is b[0]
False
>>> a[0][0][0] is b[0][0]
True
(In your example, you're replacing them with entirely different and unrelated values, which isn't really like a copy at all… but I think I know what you meant.)
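As a rough sketch of the "replace every element with a deepcopy-like copy" idea, the following compares a manual element-by-element copy with copy.deepcopy. The variable names are made up for illustration.

```python
import copy

a = [[[0]], [1, 2]]

shallow = copy.copy(a)                              # new outer list, shared elements
manual = [copy.deepcopy(item) for item in shallow]  # replace each element with a deep copy
reference = copy.deepcopy(a)

print(manual == reference)         # True: same values
print(shallow[0][0] is a[0][0])    # True: the shallow copy still shares inner objects
print(manual[0][0] is a[0][0])     # False: nothing is shared any more
```

One difference remains, as far as I know: copy.deepcopy(a) remembers what it has already copied, so an inner object that appears in several elements stays shared inside the copy, whereas copying element by element produces a separate copy for each occurrence.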

Related

why mutable objects having same value have different id in Python

Thank you for your valuable time, I have just started learning Python. I came across Mutable and Immutable objects.
As far as I know mutable objects can be changed after their creation.
a = [1,2,3]
print(id(a))
45809352
a = [3,2,1]
print(id(a))
52402312
Then why does the id of the same list "a" change when its values are changed?
Your interpretation is incorrect.
When you assign a new list to a, you change what a refers to.
On the other hand you could do:
a[:] = [3,2,1]
and then the reference would not change.
Mutable means that the content of the object can be changed. For example, a.append(4) actually makes a equal to [1, 2, 3, 4], whereas appending to a string (which is immutable) does not change it; it creates a new one.
However, when you reassign, you create a new object and assign it to a; you don't alter the existing content of a. The previous content is lost (unless it is referred to by some other variable).
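A small sketch of the difference between slice assignment and plain reassignment, using a second name (alias, chosen here for illustration) to keep the original list reachable:

```python
a = [1, 2, 3]
alias = a
original_id = id(a)

a[:] = [3, 2, 1]                 # replaces the contents of the existing list
print(id(a) == original_id)      # True: still the same object
print(alias)                     # [3, 2, 1]

a = [9, 9, 9]                    # rebinds the name a to a new list
print(a is alias)                # False: a now names a different object
print(alias)                     # [3, 2, 1]: the old list is unchanged
```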
If you change a list, its id doesn't change. But you may do things that instead create a new list, and then it will also have a new id.
E.g.,
>>> l=[]
>>> id(l)
140228658969920
>>> l.append(3) # Changes l
>>> l
[3]
>>> id(l)
140228658969920 # Same ID
>>> l = l + [4] # Computes a new list that is the result of l + [4], assigns that
>>> l
[3, 4]
>>> id(l)
140228658977608 # ID changed
When you do
a = [3, 2, 1]
you unlink the list [1, 2, 3] from the variable a,
create a new list [3, 2, 1], and then assign it to the variable a.
Being mutable doesn't mean you assign a new object; it means your original object can be changed "in place", for example via .append():
>>> my_list = [1,2,3]
>>> id(my_list)
140532335329544
>>> my_list.append(5)
>>> id(my_list)
140532335329544
>>> my_list[3] = 4
>>> my_list
[1, 2, 3, 4]
>>> id(my_list)
140532335329544

List being identified with another, using new items in the loop [duplicate]

I have a list named MyList.
I want to copy the list to a new one, then add items to the new one, so I do:
MySecondList=MyList
for item in MyList:
    if item==2:
        MySecondList.append(item)
The problem I am having is that the items are also added to MyList, and as a matter of fact the loop keeps going through MyList's new items too!
Is that normal? Why does it happen? Shouldn't the iteration use only the original list MyList, instead of dynamically growing with the items I add to the other list?
Yes, it is normal, as lists are mutable in Python, and this operation:
MySecondList = MyList
simply creates a new reference to the same list object, and list.append modifies that object in place. (Other operations such as +=, list.extend, and list.pop also modify the list in place.)
You can use a shallow copy here:
MySecondList = MyList[:]
Demo:
>>> from sys import getrefcount
>>> lis = [1,2,3]
>>> foo = lis #creates a new reference to the same object [1,2,3]
>>> lis is foo
True
>>> getrefcount(lis) #number of references to the same object
3 #foo , lis and shell itself
#you can modify the list [1,2,3] through any of its references
>>> foo.append(4)
>>> lis.append(5)
>>> foo,lis
([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
>>> lis = [1,2,3]
>>> foo = lis[:] #assigns a shallow copy of lis to foo
>>> foo is lis
False
>>> getrefcount(lis) #2 (lis + shell), as foo points to a different object
2
#different results here
>>> foo.append(4)
>>> lis.append(5)
>>> foo, lis
([1, 2, 3, 4], [1, 2, 3, 5])
For a list of lists (or a list of other mutable objects), a shallow copy is not enough, as the inner lists (or objects) are just new references to the same objects:
>>> lis = [[1,2,3],[4,5,6]]
>>> foo = lis[:]
>>> foo is lis #lis and foo are different
False
>>> [id(x) for x in lis] #but inner lists are still same
[3056076428L, 3056076716L]
>>> [id(x) for x in foo] #same IDs of inner lists, i.e foo[0] is lis[0] == True
[3056076428L, 3056076716L]
>>> foo[0][0] = 100 # modifying one will affect the other as well
>>> lis[0],foo[0]
([100, 2, 3], [100, 2, 3])
For such cases use copy.deepcopy:
>>> from copy import deepcopy
>>> lis = [[1,2,3],[4,5,6]]
>>> foo = deepcopy(lis)
Because they both reference the same list (and their ids are the same). Observe:
>>> a = [1,2,3]
>>> b = a
>>> b
[1, 2, 3]
>>> a is b
True
>>> b += [1]
>>> b
[1, 2, 3, 1]
>>> a
[1, 2, 3, 1]
>>> a is b
True
Do this instead:
MySecondList = MyList[:]
This makes a copy of the list, so changes to the copy won't affect the original list. You can also use list(MyList).
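As a sketch, here are the usual ways to take a shallow copy of a list, followed by the loop from the question run against a copy. The sample contents of MyList are an assumption, since the question does not show them.

```python
MyList = [1, 2, 2, 3]            # assumed sample data

# Three ways to make a shallow copy of a list.
by_slice = MyList[:]
by_constructor = list(MyList)
by_method = MyList.copy()        # list.copy() exists in Python 3.3+

MySecondList = MyList[:]
for item in MyList:              # MyList no longer grows during the loop
    if item == 2:
        MySecondList.append(item)

print(MyList)                    # [1, 2, 2, 3]
print(MySecondList)              # [1, 2, 2, 3, 2, 2]
```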

Python assigning multiple variables to same value? list behavior

I tried to use multiple assignment as shown below to initialize variables, but I got confused by the behavior. I expected to be able to reassign the values of each list separately, i.e., b[0] and c[0] should still equal 0 as before.
a=b=c=[0,3,5]
a[0]=1
print(a)
print(b)
print(c)
Result is:
[1, 3, 5]
[1, 3, 5]
[1, 3, 5]
Is that correct? what should I use for multiple assignment?
what is different from this?
d=e=f=3
e=4
print('f:',f)
print('e:',e)
result:
('f:', 3)
('e:', 4)
If you're coming to Python from a language in the C/Java/etc. family, it may help you to stop thinking about a as a "variable", and start thinking of it as a "name".
a, b, and c aren't different variables with equal values; they're different names for the same identical value. Variables have types, identities, addresses, and all kinds of stuff like that.
Names don't have any of that. Values do, of course, and you can have lots of names for the same value.
If you give Notorious B.I.G. a hot dog,* Biggie Smalls and Chris Wallace have a hot dog. If you change the first element of a to 1, the first elements of b and c are 1.
If you want to know if two names are naming the same object, use the is operator:
>>> a=b=c=[0,3,5]
>>> a is b
True
You then ask:
what is different from this?
d=e=f=3
e=4
print('f:',f)
print('e:',e)
Here, you're rebinding the name e to the value 4. That doesn't affect the names d and f in any way.
In your previous version, you were assigning to a[0], not to a. So, from the point of view of a[0], you're rebinding a[0], but from the point of view of a, you're changing it in-place.
You can use the id function, which gives you some unique number representing the identity of an object, to see exactly which object is which even when is can't help:
>>> a=b=c=[0,3,5]
>>> id(a)
4473392520
>>> id(b)
4473392520
>>> id(a[0])
4297261120
>>> id(b[0])
4297261120
>>> a[0] = 1
>>> id(a)
4473392520
>>> id(b)
4473392520
>>> id(a[0])
4297261216
>>> id(b[0])
4297261216
Notice that a[0] has changed from 4297261120 to 4297261216—it's now a name for a different value. And b[0] is also now a name for that same new value. That's because a and b are still naming the same object.
Under the covers, a[0]=1 is actually calling a method on the list object. (It's equivalent to a.__setitem__(0, 1).) So, it's not really rebinding anything at all. It's like calling my_object.set_something(1). Sure, likely the object is rebinding an instance attribute in order to implement this method, but that's not what's important; what's important is that you're not assigning anything, you're just mutating the object. And it's the same with a[0]=1.
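One way to see that item assignment is a method call is to intercept it. The TracingList class below is a hypothetical helper written only for this illustration; it is not part of Python.

```python
class TracingList(list):
    """Hypothetical list subclass that records every item assignment."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = []

    def __setitem__(self, index, value):
        self.calls.append((index, value))   # remember what was assigned
        super().__setitem__(index, value)   # then do the normal list update


a = b = TracingList([0, 3, 5])
a[0] = 1                 # goes through __setitem__, no name is rebound
print(a.calls)           # [(0, 1)]
print(b)                 # [1, 3, 5]
print(a is b)            # True
```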
user570826 asked:
What if we have, a = b = c = 10
That's exactly the same situation as a = b = c = [1, 2, 3]: you have three names for the same value.
But in this case, the value is an int, and ints are immutable. In either case, you can rebind a to a different value (e.g., a = "Now I'm a string!"), but that won't affect the original value, which b and c will still be names for. The difference is that with a list, you can change the value [1, 2, 3] into [1, 2, 3, 4] by doing, e.g., a.append(4); since that's actually changing the value that b and c are names for, b will now be [1, 2, 3, 4]. There's no way to change the value 10 into anything else. 10 is 10 forever, just like Claudia the vampire is 5 forever (at least until she's replaced by Kirsten Dunst).
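A short sketch of the same contrast, using x and y as extra illustrative names for the list case:

```python
a = b = c = 10
a += 1                   # ints are immutable: a is rebound to a new object, 11
print(a, b, c)           # 11 10 10

x = y = [1, 2, 3]
x.append(4)              # lists are mutable: the one shared object changes
print(y)                 # [1, 2, 3, 4]
print(x is y)            # True
```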
* Warning: Do not give Notorious B.I.G. a hot dog. Gangsta rap zombies should never be fed after midnight.
Cough cough
>>> a,b,c = (1,2,3)
>>> a
1
>>> b
2
>>> c
3
>>> a,b,c = ({'test':'a'},{'test':'b'},{'test':'c'})
>>> a
{'test': 'a'}
>>> b
{'test': 'b'}
>>> c
{'test': 'c'}
>>>
In Python, everything is an object, including "simple" types (int, float, etc.).
When you assign a new value to a variable, you actually change what it points to, and when you compare two variables with is, you compare their pointers.
(To be clear, a pointer here is the address in memory where the object is stored.)
As a result, when you change a value inside an object, you change it in memory, and that affects all the variables that point to this address.
For your example, when you do:
a = b = 5
this means that a and b point to the same address in memory, which contains the value 5. But when you do:
a = 6
it does not affect b, because a now points to another memory location that contains 6, while b still points to the memory address that contains 5.
But when you do:
a = b = [1,2,3]
a and b again point to the same location. The difference is that if you change one of the list's values:
a[0] = 2
you change the contents of the memory that a points to, but a still points to the same address as b, and as a result b changes as well.
Yes, that's the expected behavior. a, b and c are all set as labels for the same list. If you want three different lists, you need to assign them individually. You can either repeat the explicit list, or use one of the numerous ways to copy a list:
b = a[:] # this does a shallow copy, which is good enough for this case
import copy
c = copy.deepcopy(a) # this does a deep copy, which matters if the list contains mutable objects
Assignment statements in Python do not copy objects - they bind the name to an object, and an object can have as many labels as you set. In your first edit, changing a[0], you're updating one element of the single list that a, b, and c all refer to. In your second, changing e, you're switching e to be a label for a different object (4 instead of 3).
You can use id(name) to check if two names represent the same object:
>>> a = b = c = [0, 3, 5]
>>> print(id(a), id(b), id(c))
46268488 46268488 46268488
Lists are mutable; it means you can change the value in place without creating a new object. However, it depends on how you change the value:
>>> a[0] = 1
>>> print(id(a), id(b), id(c))
46268488 46268488 46268488
>>> print(a, b, c)
[1, 3, 5] [1, 3, 5] [1, 3, 5]
If you assign a new list to a, then its id will change, so it won't affect b and c's values:
>>> a = [1, 8, 5]
>>> print(id(a), id(b), id(c))
139423880 46268488 46268488
>>> print(a, b, c)
[1, 8, 5] [1, 3, 5] [1, 3, 5]
Integers are immutable, so you cannot change the value without creating a new object:
>>> x = y = z = 1
>>> print(id(x), id(y), id(z))
507081216 507081216 507081216
>>> x = 2
>>> print(id(x), id(y), id(z))
507081248 507081216 507081216
>>> print(x, y, z)
2 1 1
In your first example, a = b = c = [1, 2, 3], you are really saying:
'a' is the same as 'b', is the same as 'c', and they are all [1, 2, 3].
If you want to set 'a' equal to 1, 'b' equal to 2 and 'c' equal to 3, try this:
a, b, c = [1, 2, 3]
print(a)
--> 1
print(b)
--> 2
print(c)
--> 3
Hope this helps!
What you need is this:
a, b, c = [0,3,5] # Unpack the list, now a, b, and c are ints
a = 1 # `a` did equal 0, not [0,3,5]
print(a)
print(b)
print(c)
Simply put, in the first case, you are giving multiple names to one list. Only one copy of the list is created in memory, and all the names refer to that location. So changing the list through any of the names modifies that one list in memory.
In the second case, e = 4 rebinds e to a different object; d and f still refer to 3, so they are unaffected.
The code that does what I need could be this:
# test
aux=[[0 for n in range(3)] for i in range(4)]
print('aux:',aux)
# initialization
a,b,c,d=[[0 for n in range(3)] for i in range(4)]
# changing values
a[0]=1
d[2]=5
print('a:',a)
print('b:',b)
print('c:',c)
print('d:',d)
Result:
('aux:', [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
('a:', [1, 0, 0])
('b:', [0, 0, 0])
('c:', [0, 0, 0])
('d:', [0, 0, 5])
To assign the same value to multiple variables, I prefer a list:
a, b, c = [10]*3  # multiply by 3 because we have 3 variables
print(a, type(a), b, type(b), c, type(c))
output:
10 <class 'int'> 10 <class 'int'> 10 <class 'int'>
Initialize multiple names with the same object:
import datetime
time1, time2, time3 = [datetime.datetime.now()]*3
print(time1)
print(time2)
print(time3)
output:
2022-02-25 11:52:59.064487
2022-02-25 11:52:59.064487
2022-02-25 11:52:59.064487
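A caveat worth sketching: multiplying a list repeats references to the same object. That is harmless for immutable values such as ints or datetimes, but with a mutable value all the names end up sharing one object. The variable names below are illustrative only.

```python
# [obj] * 3 repeats a reference to ONE object three times.
shared_a, shared_b, shared_c = [[]] * 3
shared_a.append(1)
print(shared_b, shared_c)        # [1] [1]
print(shared_a is shared_b)      # True

# A generator expression creates a fresh list for each name.
sep_a, sep_b, sep_c = ([] for _ in range(3))
sep_a.append(1)
print(sep_b, sep_c)              # [] []
print(sep_a is sep_b)            # False
```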
E.g., a = b = 10 means that both a and b point to the same 10 in memory. You can check this with id(a) and id(b), which are equal, so a is b is True.
is compares memory locations (identity), not values; == compares values.
Suppose you update a from 10 to 5 with a = 5. This only rebinds a; b still points to 10. The shared location matters only when the value is mutable (a list, for example) and you change it in place through one of the names: then the change is visible through the other name too.
The conclusion is to use chained assignment with mutable values only if you know the consequences. Otherwise create a separate object for each name, e.g. a, b = [], [], and you won't face the consequences explained above when updating either value, because they live in different memory locations.
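A quick sketch for checking the difference between is and ==, and what rebinding does to a name created by chained assignment (x, y and z are illustrative names):

```python
x = [1, 2, 3]
y = [1, 2, 3]            # equal value, separate object
z = x                    # another name for the same object

print(x == y, x is y)    # True False
print(x == z, x is z)    # True True

a = b = 10
a = 5                    # rebinds a only
print(a, b)              # 5 10
```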
The behavior is correct. However, all the variables will share the same reference. Please note the behavior below:
>>> a = b = c = [0,1,2]
>>> a
[0, 1, 2]
>>> b
[0, 1, 2]
>>> c
[0, 1, 2]
>>> a[0]=1000
>>> a
[1000, 1, 2]
>>> b
[1000, 1, 2]
>>> c
[1000, 1, 2]
So, yes, it is different in the sense that if you assign a, b and c differently on a separate line, changing one will not change the others.
Here are two code snippets; choose one:
a = b = c = [0, 3, 5]
a = [1, 3, 5]
print(a)
print(b)
print(c)
or
a = b = c = [0, 3, 5]
a = [1] + a[1:]
print(a)
print(b)
print(c)

What's the difference between a[] and a[:] when assigning values?

I happen to see this snippet of code:
a = []
a = [a, a, None]
# makes a = [ [], [], None] when print
a = []
a[:] = [a, a, None]
# makes a = [ [...], [...], None] when print
It seems the a[:] assignment assigns a pointer but I can't find documents about that. So anyone could give me an explicit explanation?
In Python, a is a name - it points to an object, in this case, a list.
In your first example, a initially points to the empty list, then to a new list.
In your second example, a points to an empty list, then it is updated to contain the values from the new list. This does not change the list a references.
The difference in the end result arises because the right-hand side of an assignment is evaluated first, so in both cases the elements of the new list refer to the original list. In the first case, a is then rebound to the new list, whose elements are the (still empty) list that used to be a. In the second case, the original list's contents are replaced in place, so its elements refer to the list itself, making a recursive structure.
If you are having trouble understanding this, I recommend taking a look at it visualized.
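The two cases can also be checked with is. This is a minimal sketch; the second list is named b here only so that both results can be inspected side by side.

```python
a = []
a = [a, a, None]         # new list whose elements are the OLD empty list
print(a)                 # [[], [], None]
print(a[0] is a)         # False

b = []
b[:] = [b, b, None]      # the SAME list is filled with references to itself
print(b)                 # [[...], [...], None]
print(b[0] is b)         # True
```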
The first will point a to a new object, the second will mutate a, so the list referenced by a is still the same.
For example:
a = [1, 2, 3]
b = a
print(b) # [1, 2, 3]
a[:] = [3, 2, 1]
print(b) # [3, 2, 1]
a = [1, 2, 3]
# b still references the old list
print(b) # [3, 2, 1]
A clearer example, based on @pythonm's response:
>>> a=[1,2,3,4]
>>> b=a
>>> c=a[:]
>>> a.pop()
4
>>> a
[1, 2, 3]
>>> b
[1, 2, 3]
>>> c
[1, 2, 3, 4]
>>>

Understanding dict.copy() - shallow or deep?

While reading up the documentation for dict.copy(), it says that it makes a shallow copy of the dictionary. Same goes for the book I am following (Beazley's Python Reference), which says:
The m.copy() method makes a shallow copy of the items contained in a mapping object and places them in a new mapping object.
Consider this:
>>> original = dict(a=1, b=2)
>>> new = original.copy()
>>> new.update({'c': 3})
>>> original
{'a': 1, 'b': 2}
>>> new
{'a': 1, 'c': 3, 'b': 2}
So I assumed this would update the value of original (and add 'c': 3) also since I was doing a shallow copy. Like if you do it for a list:
>>> original = [1, 2, 3]
>>> new = original
>>> new.append(4)
>>> new, original
([1, 2, 3, 4], [1, 2, 3, 4])
This works as expected.
Since both are shallow copies, why is it that dict.copy() doesn't work as I expect it to? Or is my understanding of shallow vs deep copying flawed?
"Shallow copying" means the contents of the dictionary are not copied by value; the new dictionary just holds new references to the same objects.
>>> a = {1: [1,2,3]}
>>> b = a.copy()
>>> a, b
({1: [1, 2, 3]}, {1: [1, 2, 3]})
>>> a[1].append(4)
>>> a, b
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
In contrast, a deep copy will copy all contents by value.
>>> import copy
>>> c = copy.deepcopy(a)
>>> a, c
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
>>> a[1].append(5)
>>> a, c
({1: [1, 2, 3, 4, 5]}, {1: [1, 2, 3, 4]})
So:
b = a: Reference assignment. Makes a and b point to the same object.
b = a.copy(): Shallow copying. a and b become two separate objects, but their contents still share the same references.
b = copy.deepcopy(a): Deep copying. The structure and contents of a and b become completely isolated.
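The three cases can be verified with identity checks. A minimal sketch, with alias, shallow and deep as illustrative names:

```python
import copy

a = {1: [1, 2, 3]}

alias = a                    # reference assignment
shallow = a.copy()           # new dict, values shared with a
deep = copy.deepcopy(a)      # new dict, values copied as well

print(alias is a)                          # True
print(shallow is a, shallow[1] is a[1])    # False True
print(deep is a, deep[1] is a[1])          # False False
```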
Take this example:
original = dict(a=1, b=2, c=dict(d=4, e=5))
new = original.copy()
Now let's change a value in the 'shallow' (first) level:
new['a'] = 10
# new = {'a': 10, 'b': 2, 'c': {'d': 4, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 4, 'e': 5}}
# no change in original: the assignment rebinds key 'a' in new only
Now let's change a value one level deeper:
new['c']['d'] = 40
# new = {'a': 10, 'b': 2, 'c': {'d': 40, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 40, 'e': 5}}
# new['c'] is the same mutable dictionary as original['c'], so the change shows up in both
It's not a matter of deep copy or shallow copy, none of what you're doing is deep copy.
Here:
>>> new = original
you're creating a new reference to the list/dict referenced by original.
while here:
>>> new = original.copy()
>>> # or
>>> new = list(original) # dict(original)
you're creating a new list/dict which is filled with a copy of the references of objects contained in the original container.
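A small sketch of what "a copy of the references" means in practice, using a list of lists as assumed sample data:

```python
original = [[1, 2], [3, 4]]
new = list(original)         # new outer list, same inner objects

print(new is original)                                  # False
print(all(n is o for n, o in zip(new, original)))       # True

new.append([5, 6])           # top-level change: only new is affected
new[0].append(99)            # nested change: visible through both lists
print(original)              # [[1, 2, 99], [3, 4]]
print(new)                   # [[1, 2, 99], [3, 4], [5, 6]]
```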
Adding to kennytm's answer: when you do a shallow copy with parent.copy(), a new dictionary is created with the same keys, but the values are not copied; they are referenced. If you add a new value to parent_copy, it won't affect parent, because parent_copy is a new dictionary, not a reference.
parent = {1: [1,2,3]}
parent_copy = parent.copy()
parent_reference = parent
print(id(parent), id(parent_copy), id(parent_reference))
#140690938288400 140690938290536 140690938288400
print(id(parent[1]), id(parent_copy[1]), id(parent_reference[1]))
#140690938137128 140690938137128 140690938137128
parent_copy[1].append(4)
parent_copy[2] = ['new']
print(parent, parent_copy, parent_reference)
#{1: [1, 2, 3, 4]} {1: [1, 2, 3, 4], 2: ['new']} {1: [1, 2, 3, 4]}
The id values of parent[1] and parent_copy[1] are identical, which implies that the list [1,2,3] of parent[1] and parent_copy[1] is one object, stored at id 140690938137128.
But the ids of parent and parent_copy are different, which implies
they are different dictionaries, and parent_copy is a new dictionary whose values are references to the values of parent.
"new" and "original" are different dicts, that's why you can update just one of them. The items are shallow-copied, not the dict itself.
In your second part, you should use new = original.copy()
.copy and = are different things.
Contents are shallow copied.
So if the original dict contains a list or another dictionary, modifying one of them in the original or its shallow copy will modify it (the list or the dict) in the other.
