Why does this empty dict break shared references? - python

I have found some Python behavior that confuses me.
>>> A = {1:1}
>>> B = A
>>> A[2] = 2
>>> A
{1: 1, 2: 2}
>>> B
{1: 1, 2: 2}
So far, everything is behaving as expected. A and B both reference the same, mutable, dictionary and altering one alters the other.
>>> A = {}
>>> A
{} # As expected
>>> B
{1: 1, 2: 2} # Why is this not an empty dict?
Why do A and B no longer reference the same object?
I have seen this question: Python empty dict not being passed by reference? and it verifies this behavior, but the answers explain how to fix the provided script not why this behavior occurs.

Here is a pictorial representation *:
A = {1: 1}
# A -> {1: 1}
B = A
# A -> {1: 1} <- B
A[2] = 2
# A -> {1: 1, 2: 2} <- B
A = {}
# {1: 1, 2: 2} <- B
# A -> {}
A = {} creates a completely new object and reassigns the identifier A to it, but does not affect B or the dictionary A previously referenced. You should read this article, it covers this sort of thing pretty well.
Note that, as an alternative, you can use the dict.clear method to empty the dictionary in-place:
>>> A = {1: 1}
>>> B = A
>>> A[2] = 2
>>> A.clear()
>>> B
{}
As A and B are still references to the same object, both now "see" the empty version.
* To a first approximation - similar referencing behaviour is going on within the dictionary too, but as the values are immutable it's less relevant.
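The difference between rebinding and clearing in place can also be checked with the is operator. The following is only a minimal sketch, not part of the original answer; the names rebound_shares, cleared_shares and the two snapshot variables are introduced here purely for illustration.

```python
A = {1: 1}
B = A
A = {}                       # rebinding: A now names a new, empty dict
rebound_shares = A is B      # False: two different objects
b_after_rebind = dict(B)     # snapshot of B, still {1: 1}

A = B                        # make both names refer to the same dict again
A.clear()                    # mutation: empties the one shared dict in place
cleared_shares = A is B      # True: still a single object
b_after_clear = dict(B)      # snapshot of B, now {}

print(rebound_shares, b_after_rebind)   # False {1: 1}
print(cleared_shares, b_after_clear)    # True {}
```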

Remember, variables in python act like labels. So, in the first example, you have a dictionary {1: 1, 2: 2}. That dictionary stays in memory. In the first example, A points to that dictionary, and you say B points to what A is pointing to (It won't point to the label A, but rather what the label A is pointing to).
In the second example, A and B are both pointing to this dictionary, but you point A to a new dictionary ({}). B stays pointing to the old dictionary in memory from the first example.

When you say A = {}, you are changing which dictionary A points to, not destroying the old dictionary. This sample should demonstrate it:
A = {1:1}
print(id(A))
B = A
print(id(B))
B[2] = 5
print(id(B))
print(A)
print(id(A))
A = {}
print(id(A))

It's about the difference between creating a new dictionary and changing an existing dictionary.
A[2] = 2
This modifies the existing dictionary by adding a new key; everything that was already in it is still part of that dictionary.
A = {}
This creates a totally new empty dictionary.

Think about it like this: A is the name of one object, then you make B a different name for that object. That's the first part, but then in the second snippet you make a new object and say: OK, that old object isn't called A anymore; now this new object is called A.
B isn't pointing at A. B and A are both names for the same object, then names for two different objects.

Related

Creating list of dictionaries replaced the value of dict with the last element [duplicate]

I tried the following in the python interpreter:
>>> a = []
>>> b = {1:'one'}
>>> a.append(b)
>>> a
[{1: 'one'}]
>>> b[1] = 'ONE'
>>> a
[{1: 'ONE'}]
Here, after appending the dictionary b to the list a, I'm changing the value corresponding to the key 1 in dictionary b. Somehow this change gets reflected in the list too. When I append a dictionary to a list, am I not just appending the value of dictionary? It looks as if I have appended a pointer to the dictionary to the list and hence the changes to the dictionary are getting reflected in the list too.
I do not want the change to get reflected in the list. How do I do it?
You are correct in that your list contains a reference to the original dictionary.
a.append(b.copy()) should do the trick.
Bear in mind that this makes a shallow copy. An alternative is to use copy.deepcopy(b), which makes a deep copy.
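The same effect explains the symptom named in the title, where every element of the list ends up looking like the last one. A minimal sketch, assuming a loop that reuses a single dict (the loop and the names record, rows_shared and rows_copied are illustrative and do not come from the question):

```python
rows_shared = []
rows_copied = []
record = {}                              # one dict, reused on every iteration

for n in (1, 2, 3):
    record['value'] = n
    rows_shared.append(record)           # appends a reference to the same dict each time
    rows_copied.append(record.copy())    # appends an independent (shallow) snapshot

print(rows_shared)   # [{'value': 3}, {'value': 3}, {'value': 3}]
print(rows_copied)   # [{'value': 1}, {'value': 2}, {'value': 3}]
```

For a dict whose values are themselves mutable, copy.deepcopy(record) would be the safer choice for the snapshot.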
You can also make the copy with dict():
a = []
b = {1:'one'}
a.append(dict(b))
print(a)
b[1]='iuqsdgf'
print(a)
Result:
[{1: 'one'}]
[{1: 'one'}]
Use copy and deepcopy from the copy module:
http://docs.python.org/library/copy.html

Why do variables containing lists in Python act differently from say variable containing integers in terms of storing/pointing towards values? [duplicate]

List reference append code
a = [1,2,3,4,5]
b = a
b.append(6)
print(a)
print(b)
#ans:
[1,2,3,4,5,6]
[1,2,3,4,5,6]
Integer reference code
a = 1
b = a
b +=1
print(a)
print(b)
#ans:
1
2
How do references work in Python for integers vs. lists? In the list example both values are the same, so why is the value of a not 2 in the integer example?
In Python, everything is an object, and every name is a reference (pointer) to an object, per the docs.
On that page you can scroll down and find the following:
Numeric objects are immutable; once created their value never changes
Under that you'll see the int type defined, so it makes perfect sense your second example works.
On the top of the same page, you'll find the following:
Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory.
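Python behaves like Java here: a name holds a reference to an object, and that reference is what gets copied on assignment or passed to a function. In that sense Python, like Java, is pass-by-value (the value being the reference) and doesn't have pass-by-reference semantics: a function can mutate the object it receives, but it cannot rebind the caller's name.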
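What this means in practice can be seen by comparing a function that mutates its argument with one that only rebinds its parameter. This is an illustrative sketch; the function names mutate and rebind are invented for the example.

```python
def mutate(items):
    items.append(99)      # changes the object that the caller also refers to

def rebind(items):
    items = [99]          # only the local name is rebound; the caller is unaffected

data = [1, 2, 3]
mutate(data)
after_mutate = list(data)   # [1, 2, 3, 99]

rebind(data)
after_rebind = list(data)   # still [1, 2, 3, 99]

print(after_mutate, after_rebind)
```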
Looking at your integer example:
>>> a = 1
>>> hex(id(a))
'0x7ffdc64cd420'
>>> b = a + 1
>>> hex(id(b))
'0x7ffdc64cd440'
>>> print(a)
1
>>> print(b)
2
Here, the operation b = a + 1 leaves a at 1, and b is now 2. That's because int is immutable: the addition creates a new object instead of changing the existing one. In CPython, names bound to the value 1 will also refer to the same address:
>>> a = 1
>>> b = 2
>>> c = 1
>>> hex(id(a))
'0x7ffdc64cd420'
>>> hex(id(b))
'0x7ffdc64cd440'
>>> hex(id(c))
'0x7ffdc64cd420'
Now, this sharing only holds true for the values of -5 to 256 in the C implementation (CPython), so beyond that you may get new addresses, but the immutability shown above still holds. I've shown you the sharing of memory addresses for a reason. On the same page you'll find the following:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
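The guarantee for mutable objects in the quoted passage can be checked directly. A small sketch follows; the extra names e and f are introduced only for this illustration, and the identity of equal integers is left out on purpose because it depends on the implementation.

```python
c = []
d = []
distinct = c is not d     # True: two separate list objects are always created

e = f = []
shared = e is f           # True: one list object bound to two names

e.append(1)
print(distinct, shared)   # True True
print(f)                  # [1], because e and f name the same list
print(d)                  # [], because c and d were never connected
```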
So your example:
>>> a = [1, 2, 3, 4, 5]
>>> hex(id(a))
'0x17292e1cbc8'
>>> b = a
>>> hex(id(b))
'0x17292e1cbc8'
I should be able to stop right here: it's obvious that both a and b refer to the same object in memory at address 0x17292e1cbc8. That's because the above is like saying:
# Let's assume that `[1, 2, 3, 4, 5]` is at 0x17292e1cbc8 in memory
>>> a = 0x17292e1cbc8
>>> b = a
>>> hex(b)
'0x17292e1cbc8'
The long and short of it: you're simply assigning a pointer to a new name, but both names point to the same object in memory! Note: This is not the same as a shallow copy, because no new compound object is made.
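The b += 1 from the question deserves a closer look, because augmented assignment behaves differently for the two types. For an int, += computes a new object and rebinds the name; for a list, += extends the existing list in place. A short sketch (the variable names n, m, xs and ys are chosen just for this example):

```python
n = 1
m = n
m += 1                      # ints are immutable: m is rebound to a new object
ints_shared = n is m        # False
print(n, m)                 # 1 2

xs = [1, 2, 3]
ys = xs
ys += [4]                   # lists are mutable: the shared list is extended in place
lists_shared = xs is ys     # True
print(xs, ys)               # [1, 2, 3, 4] [1, 2, 3, 4]
```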

Does python3 dict.copy still only create shallow copies?

After reading on a few places including here: Understanding dict.copy() - shallow or deep?
It claims that dict.copy will create a shallow copy otherwise known as a reference to the same values. However, when playing with it myself in python3 repl, I only get a copy by value?
a = {'one': 1, 'two': 2, 'three': 3}
b = a.copy()
print(a is b) # False
print(a == b) # True
a['one'] = 5
print(a) # {'one': 5, 'two': 2, 'three': 3}
print(b) # {'one': 1, 'two': 2, 'three': 3}
Does this mean that shallow and deep copies do not necessarily affect immutable values?
Integers are immutable; the problem comes when the values are mutable objects. Check this similar example:
import copy
a = {'one': [], 'two': 2, 'three': 3}
b = a.copy()
c = copy.deepcopy(a)
print(a is b) # False
print(a == b) # True
a['one'].append(5)
print(a) # {'one': [5], 'two': 2, 'three': 3}
print(b) # {'one': [5], 'two': 2, 'three': 3}
print(c) # {'one': [], 'two': 2, 'three': 3}
What you are observing has nothing to do with dictionaries at all. You are getting confused by the difference between binding and mutation.
Let's forget dictionaries at first, and demonstrate the issue with simple variables. Once we understand the fundamental point, we can then go back to the dictionary example.
a = 1
b = a
a = 2
print(b) # prints 1
On the first line you create a binding between the name a and the object 1.
On the second line you create a binding between the name b and the value of the expression a ... which is the very same object 1 which was bound to the name a on the previous line.
On the third line you create a binding between the name a and the object 2, in the process forgetting that there ever was a binding between a and the 1.
It is vital to note that this last step cannot in any way affect b!
The situation is completely symmetric, so if line 3 were b = 2 this would have absolutely no effect on a.
Now, people often mistakenly claim that this is somehow a result of the immutability of integers. Integers are immutable in Python, but that is completely irrelevant. If we do something similar with some mutable objects, say lists, then we get equivalent results.
a = [1]
b = a
a = [2]
print(b) # prints [1]
Once again
a is bound to some object
b is bound to the same object
a is now rebound to some different object
This cannot affect b or the object to which it is bound [*] in any way! No attempt has been made anywhere to mutate any object, so mutability is completely irrelevant to this situation.
[*] actually, it does change the reference count of the object (at least in CPython) but that's not really an observable property of the object.
However, if, instead of rebinding a, we
Use a to access the object to which it is bound
Mutate that object
then we will affect b, because the object to which b is bound will be mutated:
a = [1]
b = a
a[0] = 2
print(b) # prints [2]
In summary, you have to understand
The difference between binding and mutation. The former affects a variable (or more generally a location) while the latter affects an object. Therein lies the key difference
Rebinding a name (or location in general) cannot affect the object to which that name was previously bound (beyond changing its reference count).
Now, in your example you create something that looks (conceptually) like this:
a ---> { 'three' ----------------------> 3
         'two' -------------> 2          ^
         'one' ---> 1 }       ^          |
                    ^         |          |
                    |         |          |
b ---> { 'one' ------         |          |
         'two' ----------------          |
         'three' -------------------------
and then a['one'] = 5 simply rebinds the location a['one'] so that it is no longer bound to the 1 but to 5. In other words, that arrow coming out of the first 'one', now points somewhere else.
It is important to remember that this has absolutely nothing to do with the immutability of integers. If you make each and every integer in your example mutable (for example by replacing it with a list which contains it: i.e. replace every occurrence of 1 with [1] (and similarly for 2 and 3)) then you will still observe essentially the same behaviour: a['one'] = [1] will not affect the value of b['one'].
Now, in this latest example, where the values stored in your dictionary are lists and therefore structured, it becomes possible to distinguish between shallow and deep copy:
b = a will not copy the dictionary at all: it will simply make b a new binding to the same single dictionary
b = copy.copy(a) will create a new dictionary with internal bindings to the same lists. The dictionary is copied but its contents (below the top level) are simply referenced by the new dictionary.
b = copy.deepcopy(a) will also create a new dictionary, but it will also create new objects to populate that dictionary, rather than referencing the original ones.
Consequently, if you mutate (rather than rebind) something in the shallow copy case, the other dictionary will 'see' the mutation, because the two dictionaries share objects. This does not happen in the deep copy.
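The three cases can be put side by side. This is only a sketch: the names alias, shallow and deep are made up for the illustration, and the dictionary uses lists as values so that the difference becomes visible.

```python
import copy

a = {'one': [1], 'two': [2]}

alias = a                   # no copy: a second name for the same dict
shallow = copy.copy(a)      # new dict, same inner lists
deep = copy.deepcopy(a)     # new dict, new inner lists

a['one'].append(10)         # mutation of an inner list
a['two'] = [20]             # rebinding of a key, in a only

print(alias)    # {'one': [1, 10], 'two': [20]}
print(shallow)  # {'one': [1, 10], 'two': [2]}
print(deep)     # {'one': [1], 'two': [2]}
```

The alias sees both changes, the shallow copy sees only the mutation of the shared list, and the deep copy sees neither.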
Please consider the following situation; it should make referencing and the copy() method easier to understand.
dic = {'data1': 100, 'data2': -54, 'data3': 247}
dict1 = dic
dict2 = dic.copy()
print(dict2 is dic)
# False
print(dict1 is dic)
# True
The first print statement prints False because dict2 and dic are two separate dictionaries in separate memory locations, even though they have the same contents. This is what happens when we use the copy method.
Second, assigning dic to dict1 does not create a separate dictionary in separate memory; instead, dict1 becomes another reference to dic.
A shallow copy of some container means that a new identical object is returned, but that its values are the same objects.
This means that mutating the values of the copy will mutate the values of the original. In your example, you are not mutating a value; you are instead rebinding a key to a different value.
Here is an example of value mutation.
d = {'a': []}
d_copy = d.copy()
print(d is d_copy) # False
print(d['a'] is d_copy['a']) # True
d['a'].append(1)
print(d_copy) # {'a': [1]}
On the other side, a deepcopy of a container returns a new identical object, but where the values have been recursively copied as well.

Why is the dictionary key not changing

a = (1,2)
b = {a:1}
print(b[a]) # This gives 1
a = (1,2,3)
print(b[a]) # This gives error but b[(1,2)] is working fine
What I understood is that Python doesn't garbage-collect the tuple (1,2) after a is changed to (1,2,3): the tuple (1,2,3) is created as a new object, and the tuple (1,2) is still being referenced in b.
What I didn't understand is why b doesn't change its key after a is changed.
b = {a:1} creates a dictionary with the value of a as a key and 1 as a value. When you assign a new value to a, you create a new object, and b retains the old object as its key.
The following example, using id, may illustrate it:
>>> a = (1,2)
>>> b = {a:1}
>>> id(a)
139681226321288
>>> a = (1,2,3)
>>> id(a)
139681416297520
>>> id(list(b.keys())[0])
139681226321288
Integers, floats, strings and tuples in Python are immutable. A dictionary allows only hashable keys (immutable built-in objects are hashable; a tuple is hashable only if all of its elements are). As @Mureinik has correctly explained the cause, I will give you another example, where you can mutate the data by the process you followed above.
>>> l = [1,2,3]
>>> b = {'3' : l}
>>> b
{'3': [1, 2, 3]}
>>> l.append(5)
>>> l
[1, 2, 3, 5]
>>> b
{'3': [1, 2, 3, 5]}
But you cannot change the keys of a dictionary in place, as they are hashed (only values can be updated). You either have to delete the existing key-value pair or add a new pair.
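Replacing a key therefore comes down to removing the old pair and inserting a new one. A minimal sketch using the tuples from the question; the flag list_key_rejected is a name invented for this example:

```python
b = {(1, 2): 1}

# "Changing" a key really means removing the old pair and adding a new one.
b[(1, 2, 3)] = b.pop((1, 2))
print(b)    # {(1, 2, 3): 1}

# A mutable object such as a list is not hashable, so it is rejected as a key.
try:
    {[1, 2]: 1}
    list_key_rejected = False
except TypeError:
    list_key_rejected = True
print(list_key_rejected)    # True
```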

Different ways of deleting lists

I want to understand why:
a = [];
del a; and
del a[:];
behave so differently.
I ran a test for each to illustrate the differences I witnessed:
>>> # Test 1: Reset with a = []
...
>>> a = [1,2,3]
>>> b = a
>>> a = []
>>> a
[]
>>> b
[1, 2, 3]
>>>
>>> # Test 2: Reset with del a
...
>>> a = [1,2,3]
>>> b = a
>>> del a
>>> a
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> b
[1, 2, 3]
>>>
>>> # Test 3: Reset with del a[:]
...
>>> a = [1,2,3]
>>> b = a
>>> del a[:]
>>> a
[]
>>> b
[]
I did find Different ways of clearing lists, but I didn't find an explanation for the differences in behaviour. Can anyone clarify this?
Test 1
>>> a = [1,2,3] # set a to point to a list [1, 2, 3]
>>> b = a # set b to what a is currently pointing at
>>> a = [] # now you set a to point to an empty list
# Step 1: A --> [1 2 3]
# Step 2: A --> [1 2 3] <-- B
# Step 3: A --> [ ] [1 2 3] <-- B
# at this point a points to a new empty list
# whereas b points to the original list of a
Test 2
>>> a = [1,2,3] # set a to point to a list [1, 2, 3]
>>> b = a # set b to what a is currently pointing at
>>> del a # delete the reference from a to the list
# Step 1: A --> [1 2 3]
# Step 2: A --> [1 2 3] <-- B
# Step 3: [1 2 3] <-- B
# so a no longer exists because the reference
# was destroyed but b is not affected because
# b still points to the original list
Test 3
>>> a = [1,2,3] # set a to point to a list [1, 2, 3]
>>> b = a # set b to what a is currently pointing at
>>> del a[:] # delete the contents of the original
# Step 1: A --> [1 2 3]
# Step 2: A --> [1 2 3] <-- B
# Step 3: A --> [ ] <-- B
# both a and b are empty because they were pointing
# to the same list whose elements were just removed
Of your three "ways of deleting Python lists", only one actually alters the original list object; the other two only affect the name.
a = [] creates a new list object, and assigns it to the name a.
del a deletes the name, not the object it refers to.
del a[:] deletes all references from the list referenced by the name a (although, similarly, it doesn't directly affect the objects that were referenced from the list).
It's probably worth reading this article on Python names and values to better understand what's going on here.
Test 1: rebinds a to a new object, while b still holds a reference to the original object. a is just a name, so rebinding a to a new object does not change the original object that b points to.
Test 2: you del the name a, so it no longer exists, but again you still have a reference to the object in memory through b.
Test 3: a[:] refers to the contents of the list, just as when you copy a list or replace all of its elements, not to the name a. b gets cleared as well because it refers to the same list as a, so changes to the contents of that list affect b too.
The behaviour is documented:
There is a way to remove an item from a list given its index instead
of its value: the del statement. This differs from the pop()
method which returns a value. The del statement can also be used to
remove slices from a list or clear the entire list (which we did
earlier by assignment of an empty list to the slice). For example:
>>>
>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.25, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.25, 1234.5]
>>> del a[:]
>>> a
[]
del can also be used to delete entire variables:
>>>
>>> del a
Referencing the name a hereafter is an error (at least until another
value is assigned to it). We'll find other uses for del later.
So only del a actually deletes a, a = [] rebinds a to a new object and del a[:] clears a. In your second test if b did not hold a reference to the object it would be garbage collected.
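The remark about garbage collection can be observed with a weak reference. This is a sketch under a few assumptions: TrackedList is a hypothetical helper class (built-in list objects cannot be weakly referenced, but instances of a subclass can), and the immediate reclamation is CPython behaviour, so gc.collect() is called to cover other implementations.

```python
import gc
import weakref

class TrackedList(list):
    """Plain list subclass, used only because it supports weak references."""

a = TrackedList([1, 2, 3])
b = a
probe = weakref.ref(a)               # a weak reference does not keep the list alive

del a                                # removes the name only; b still refers to the list
gc.collect()
alive_with_b = probe() is not None   # True

del b                                # the last strong reference is gone
gc.collect()
alive_after = probe() is not None    # False

print(alive_with_b, alive_after)     # True False
```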
del a
is removing the variable a from the scope. Quoting from python docs:
Deletion of a name removes the binding of that name from the local or
global namespace, depending on whether the name occurs in a global
statement in the same code block.
del a[:]
is simply removing the contents of a, since the deletion is passed to the object that a refers to instead of being applied to the name a. Again from the docs:
Deletion of attribute references, subscriptions and slicings is passed
to the primary object involved; deletion of a slicing is in general
equivalent to assignment of an empty slice of the right type (but even
this is determined by the sliced object).
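The equivalence mentioned in the quoted documentation can be tried out directly. A minimal sketch, assuming plain built-in lists (for which slice assignment and slice deletion both work in place); the names c and d are added for the second half of the illustration:

```python
a = [1, 2, 3]
b = a
a[:] = []            # slice assignment: empties the existing list in place
print(b)             # []
print(a is b)        # True

c = [1, 2, 3]
d = c
del c[:]             # slice deletion: same effect
print(d)             # []
print(c is d)        # True
```

In Python 3.3 and later, a.clear() is another way to spell the same in-place operation.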
Of those three methods, only the third actually empties the list that 'a' points to. Let's do a quick overview.
When you write a = [1, 2, 3], it creates a list in memory with the items [1, 2, 3] and then gets 'a' to point to it. When you write b = a, nothing is copied: it simply makes 'b' point to the same list as 'a'. (This is not a shallow copy; even a shallow copy would create a new list object, and a deep copy would also copy the contents into new objects.)
Now, when you write a = [], you are creating a new list with no items in it and getting 'a' to point to it. The original list still exists, and 'b' is pointing to it.
In the second case, del a deletes the pointer to [1, 2, 3] and not the list itself. This means b can still point to it.
Lastly, del a[:] goes through the data 'a' is pointing to and empties its contents. 'a' still exists, so you can use it. 'b' also exists, but it points to the same empty list 'a' does, which is why it gives the same output.
To understand the difference between the different ways of deleting lists, let us look at each of them one by one.
>>> a1 = [1,2,3]
A new list object is created and assigned to a1.
>>> a2 = a1
We assign a1 to a2. So a2 now points to the same list object that a1 points to.
DIFFERENT METHODS EXPLAINED BELOW:
Method-1 Using [] :
>>> a1 = []
On assigning an empty list to a1, there is no effect on a2. a2 still refers to the same list object but a1 now refers to an empty list.
Method-2 Using del a1[:]
>>> del a1[:]
This deletes all the contents of the list object which a1 was pointing to. a1 now points to an empty list. Since a2 was also referring to the same list object, it also becomes an empty list.
Method-3 Using del a1
>>> del a1
>>> a1
NameError: name 'a1' is not defined
This deletes the variable a1 from the scope. Here, just the variable a1 is removed, the original list is still present in the memory. a2 still points to that original list which a1 used to point to. If we now try to access a1, we will get a NameError.
