Why does multiple assignment make distinct references for ints, but not lists or other objects?
>>> a = b = 1
>>> a += 1
>>> a is b
False
>>> a = b = [1]
>>> a.append(1)
>>> a is b
True
In the int example, you first assign the same object to both a and b, but then reassign a with another object (the result of a+1). a now refers to a different object.
In the list example, you assign the same object to both a and b, but then you don't do anything to change that binding. append only changes the internal state of the list object, not its identity. Thus they remain the same.
If you replace a.append(1) with a = a + [1], you end up with a different object, because, again, you assign a new object (the result of a + [1]) to a.
Note that a+=[1] will behave differently, but that's a whole other question.
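For reference, here is a small sketch of the two spellings side by side (standard list behaviour, not specific to this question):
a = b = [1]
a = a + [1]     # builds a new list and rebinds a
print(a is b)   # False
print(b)        # [1] -- b still refers to the original list

a = b = [1]
a += [1]        # extends the existing list in place
print(a is b)   # True
print(b)        # [1, 1] -- the change is visible through b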
Ints are immutable. When a += 1 runs, a no longer refers to the same memory location as b:
https://docs.python.org/2/library/functions.html#id
CPython implementation detail: This is the address of the object in memory.
In [1]: a = b = 100000000000000000000000000000
   ...: print id(a), id(b)
   ...: print a is b
4400387016 4400387016
True

In [2]: a += 1
   ...: print id(a), id(b)
   ...: print a is b
4395695296 4400387016
False
Python behaves differently when changing the value of a mutable object versus an immutable object.
Immutable objects:
These are objects whose values cannot change after initialization,
e.g. int, string, tuple.
Mutable objects:
These are objects whose values can change after initialization,
e.g. all other objects, such as dict, list, and user-defined objects.
When you change the value of a mutable object, Python does not create a new object in memory and move it there; it just modifies the object in the memory where it was created.
It is exactly the opposite for immutable objects: any "change" creates a new object in a new memory location.
For example:
s="awe"
s[0]="e"
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-19-9f16ce5bbc72> in <module>()
----> 1 s[0]="e"
TypeError: 'str' object does not support item assignment
This is telling you that you cannot change the string's contents in place.
Instead, you could do this:
"e"+s[1:]
Out[20]: 'ewe'
This creates a new string object and allocates new memory for it.
Likewise, doing A = B = 1 and then changing A with A = 2 creates a new object in memory, and the variable A now references that location. That is why B's value does not change when you change A.
But this is not the case for a list: since it is a mutable object, changing its value does not move it to a new memory location; it just modifies the existing object.
For example:
a=b=[]
a.append(1)
print a
[1]
print b
[1]
Both give the same value because a and b reference the same memory, so they are equal.
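To see the rebinding directly, here is a small Python 3 illustration using id():
A = B = 1
print(id(A) == id(B))         # True -- one shared int object
A = 2                         # rebinds A to a different object
print(A, B)                   # 2 1
print(id(A) == id(B))         # False

a = b = []
a.append(1)                   # mutates the one shared list
print(a, b, id(a) == id(b))   # [1] [1] True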
The difference is not in the multiple assignment, but in what you subsequently do with the objects. With the int, you do +=, and with the list you do .append.
However, even if you do += for both, you won't necessarily see the same result, because what += does depends on what type you use it on.
So that is the basic answer: operations like += may work differently on different types. Whether += returns a new object or modifies the existing object is behavior that is defined by that object. To know what the behavior is, you need to know what kind of object it is and what behavior it defines (i.e., the documentation). Moreover, you cannot assume that using an operation like += will have the same result as using a method like .append. What a method like .append does is defined by the object you call it on.
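A minimal sketch of that point, applying += to each type (Python 3):
a = b = 1
a += 1           # ints define no in-place add, so this rebinds a to a new object
print(a is b)    # False

a = b = [1]
a += [1]         # list.__iadd__ extends the shared list in place
print(a is b)    # True
print(b)         # [1, 1]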
Related
I have been reading the Python Data Model. The following text is taken from here:
Types affect almost all aspects of object behavior. Even the
importance of object identity is affected in some sense: for immutable
types, operations that compute new values may actually return a
reference to any existing object with the same type and value, while
for mutable objects this is not allowed. E.g., after a = 1; b = 1, a
and b may or may not refer to the same object with the value one,
depending on the implementation, but after c = []; d = [], c and d are
guaranteed to refer to two different, unique, newly created empty
lists. (Note that c = d = [] assigns the same object to both c and d.)
So, it mentions that, for immutable types, operations that compute new values may actually return a reference to an existing object with same type and value. So, I wanted to test this. Following is my code:
a = (1, 2, 3)
b = (1, 2)
c = (3,)
k = b + c
print(id(a))   # 2169349869720
print(id(k))   # 2169342802424
Here, I did an operation that computes a new tuple with the same value and type as a. But I got an object with a different id, which means it references different memory than a. Why is this?
Answering the question based on comments from @jonrsharpe:
Note "may actually return" - it's not guaranteed, it would likely be
less efficient for Python to look through the existing tuples to find
out if one that's the same as the one your operation creates already
exists and reuse it than it would to just create a new one.
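You can see both sides of this in CPython (purely an implementation detail, so don't rely on it):
x = 1000
a = x + 1               # computed at run time -> usually a brand-new int object
b = 1001
print(a is b)           # typically False in CPython

y = 5
c = y + 1               # result falls inside the small-int cache (-5..256)
d = 6
print(c is d)           # True in CPython -- the cached object is reused

t = (1, 2)
k = t + (3,)            # a new tuple is built; CPython does not look for an equal one
m = (1, 2, 3)
print(k == m, k is m)   # True False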
If two variables' values are identical, are they said to share the same memory? So does Python follow a shared-memory concept? And if I change one value, will it change the other?
See Python data model described here
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
I understand that a namedtuple in Python is immutable and the values of its attributes can't be reassigned directly.
from collections import namedtuple

N = namedtuple("N", ['ind', 'set', 'v'])

def solve():
    items = []
    R = set(range(0, 8))
    for i in range(0, 8):
        items.append(N(i, R, 8))
    items[0].set.remove(1)
    items[0].v += 1
Here the last line, where I assign a new value to the attribute v, does not work. But removing the element 1 from the set attribute of items[0] works.
Why is that, and would the same be true if the set attribute were of list type?
Immutability does not get conferred on mutable objects inside the tuple. All immutability means is that you can't change which particular objects are stored, i.e., you can't reassign items[0].set. This restriction is the same regardless of the type of that variable: if it were a list, doing items[0].list = items[0].list + [1,2,3] would fail (can't reassign it to a new object), but doing items[0].list.extend([1,2,3]) would work.
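A quick illustration with a hypothetical namedtuple holding a list (the field names here are invented for the example):
from collections import namedtuple

M = namedtuple("M", ['ind', 'list'])
item = M(0, [1, 2, 3])

item.list.extend([4, 5])        # fine: mutates the list the tuple points at
print(item.list)                # [1, 2, 3, 4, 5]

# item.list = item.list + [6]   # would raise AttributeError: can't set attribute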
Think about it this way: if you change your code to:
new_item = N(i,R,8)
then new_item.set is now an alias for R (Python doesn't copy objects when you reassign them). If tuples conferred immutability to mutable members, what would you expect R.remove(1) to do? Since it is the same set as new_item.set, any changes you make to one will be visible in the other. If the set had become immutable because it has become a member of a tuple, R.remove(1) would suddenly fail. All method calls in Python work or fail depending on the object only, not on the variable - R.remove(1) and new_item.set.remove(1) have to behave the same way.
This also means that:
R = set(range(0,8))
for i in range(0,8):
    items.append(N(i,R,8))
probably has a subtle bug. R never gets reassigned here, and so every namedtuple in items gets the same set. You can confirm this by noticing that items[0].set is items[1].set is True. So, anytime you mutate any of them - or R - the modification would show up everywhere (they're all just different names for the same object).
This is a problem that usually comes up when you do something like
a = [[]] * 3
a[0].append(2)
and a will now be [[2], [2], [2]]. There are two ways around this general problem:
First, be very careful to create a new mutable object when you assign it, unless you do deliberately want an alias. In the nested lists example, the usual solution is to do a = [[] for _ in range(3)]. For your sets in tuples, move the line R = ... to inside the loop, so it gets reassigned to a new set for each namedtuple.
The second way around this is to use immutable types. Make R a frozenset, and the ability to add and remove elements goes away.
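Here is a sketch of both fixes, assuming the same namedtuple N as in the question:
from collections import namedtuple

N = namedtuple("N", ['ind', 'set', 'v'])

# Fix 1: build a fresh set on every iteration, so nothing is aliased.
items = []
for i in range(8):
    R = set(range(8))
    items.append(N(i, R, 8))
print(items[0].set is items[1].set)   # False -- each tuple has its own set

# Fix 2: store an immutable member, so accidental mutation raises an error.
frozen_items = [N(i, frozenset(range(8)), 8) for i in range(8)]
# frozen_items[0].set.remove(1) would raise AttributeError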
You mutate the set, not the tuple. And sets are mutable.
>>> s = set()
>>> t = (s,)
>>> l = [s]
>>> d = {42: s}
>>> t
(set([]),)
>>> l
[set([])]
>>> d
{42: set([])}
>>> s.add('foo')
>>> t
(set(['foo']),)
>>> l
[set(['foo'])]
>>> d
{42: set(['foo'])}
a = [1,2,3,4,5]
b = a[1]
print id(a[1]), id(b) # output shows the same id, hence both represent the same object
del a[1] # delete a[1]; both a[1] and b had the same id, hence both were aliases
print a # output: [1,3,4,5]
print b # output: 2
Both b and a[1] have the same id, but deleting one doesn't affect the other. The Python reference states that 'del' on a subscription deletes the actual object, not the name-object binding. The output [1,3,4,5] proves this statement. But how is it possible that b remains unaffected when both a[1] and b have the same id?
Edit: The part "'del' on a subscription deletes the actual object, not the name-object binding" is not true; the reverse is true. 'del' actually removes the name-object binding. In the case of 'del' on a subscription (e.g. del a[1]), it removes the object 2 from the list object, removes the current binding of a[1] to 2, and makes a[1] bind to 3 instead. Subsequent indexes follow the same pattern.
del doesn't delete objects, it deletes references.
There is an object which is the integer value 2. That one single object was referred to by two places: a[1] and b.
You deleted a[1], so that reference was gone. But that has no effect on the object 2, only on the reference that was in a[1]. So the reference accessible through the name b still reaches the object 2 just fine.
Even if you del all the references, that has no effect on the object. Python is a garbage collected language, so it is responsible for noticing when an object is no longer referenced anywhere at all, so that it can reclaim the memory occupied by the object. That will happen some time after the object is no longer reachable.1
1 CPython uses reference counting to implement its garbage collection2, which allows us to say that objects will usually be reclaimed as soon as their last reference dies, but that's an implementation detail, not part of the language specification. You don't have to understand exactly how Python collects its garbage and shouldn't write programs that depend on it; other Python implementations such as Jython, PyPy, and IronPython do not implement garbage collection this way.
2 Plus an additional garbage collection mechanism to detect cyclic garbage, which reference counting can't handle.
del merely decrements the reference count for that object. So after b = a[1], the object at a[1] has (let's say) 2 references. After del a[1], it is gone from the list and now has only 1 reference, as it's still referenced by b. No actual deallocation occurs until the reference count drops to 0 (in CPython that happens immediately; only cyclic garbage has to wait for a GC cycle).
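One way to watch this happen is with a weak reference; the timing of the final reclamation is a CPython detail, and other implementations may collect later:
import weakref

class Payload(object):
    pass

obj = Payload()
a = [obj]
b = a[0]                   # the list slot and b both refer to the same object
ref = weakref.ref(obj)     # a weak reference does not keep the object alive

del obj
del a[0]                   # the list's reference is gone, but b still holds one
print(ref() is not None)   # True -- the object is still alive

del b                      # last strong reference removed
print(ref() is None)       # True on CPython: the refcount hit zero, so the object was reclaimed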
There are multiple issues at work here. First, calling del on a list member removes the item from the list, which drops the list's reference to the object, but the object will not be deallocated since the variable b still references it. You can never deallocate something you still have a reference to.
The second issue to note here is that integer numbers close to zero are actually pooled and are never deallocated. You should normally not have to bother knowing about this though.
They have the same id because CPython caches small integers and hands out the same objects again, even if you delete your references to them... This is mentioned in the docs:
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object.
We can see this behaviour:
>>> c = 256
>>> id(c)
140700180101352
>>> del c
>>> d = 256
>>> id(d)
140700180101352 # same as id(c) was
>>> e = 257
>>> id(e)
140700180460152
>>> del e
>>> f = 257
>>> id(f)
140700180460128 # different to id(e) !
On p.35 of "Python Essential Reference" by David Beazley, he first states:
For immutable data such as strings, the interpreter aggressively
shares objects between different parts of the program.
However, later on the same page, he states
For immutable objects such as numbers and strings, this assignment
effectively creates a copy.
But isn't this a contradiction? On one hand he is saying that they are shared, but then he says they are copied.
An assignment in Python never creates a copy (it is technically possible only when assignment to a class member has been redefined, for example by using __setattr__, properties, or descriptors).
So after
a = foo()
b = a
whatever was returned from foo has not been copied; instead you have two variables a and b pointing to the same object, no matter whether the object is immutable or not.
With immutable objects, however, it's hard to tell whether this is the case (because you cannot mutate the object using one variable and check whether the change is visible through the other), so you are free to think of a and b as if they cannot influence each other.
For some immutable objects, Python is also free to reuse old objects instead of creating new ones. After
a = x + y
b = x + y
where both x and y are numbers (so the sum is a number and is immutable), it may be that both a and b point to the same object. Note that there is no such guarantee: they may instead point to different objects with the same value.
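For example (the outcome of the is check here is implementation-dependent):
x, y = 1000, 337
a = x + y
b = x + y
print(a == b)   # True -- same value
print(a is b)   # not guaranteed; CPython typically creates two separate int objects here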
The important thing to remember is that Python never ever makes a copy unless specifically instructed to using e.g. copy or deepcopy. This is very important with mutable objects to avoid surprises.
One common idiom you can see is for example:
class Polygon:
    def __init__(self, pts):
        self.pts = pts[:]
        ...
In this case self.pts = pts[:] is used instead of self.pts = pts to make a copy of the whole array of points to be sure that the point list will not change unexpectedly if after creating the object changes are applied to the list that was passed to the constructor.
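To see why the copy matters, here is a small usage sketch with Polygon reduced to just the constructor shown above:
class Polygon:
    def __init__(self, pts):
        self.pts = pts[:]   # defensive copy of the caller's list

pts = [(0, 0), (1, 0), (1, 1)]
poly = Polygon(pts)
pts.append((0, 1))          # the caller keeps mutating their own list...
print(len(poly.pts))        # 3 -- the Polygon's copy is unaffected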
It effectively creates a copy. It doesn't actually create a copy. The main difference between having two copies and having two names share the same value is that, in the latter case, modifications via one name affect the value of the other name. If the value can't be mutated, this difference disappears, so for immutable objects there is little practical consequence to whether the value is copied or not.
There are some corner cases where you can tell the difference between copies and different objects even for immutable types (e.g., by using the id function or the is operator), but these are not useful for Python builtin immutable types (like strings and numbers).
No, assigning a pre-existing str variable to a new variable name does not create an independent copy of the value in memory.
The existence of unique objects in memory can be checked using the id() function. For example, using the interactive Python prompt, try:
>>> str1 = 'ATG'
>>> str2 = str1
Both str1 and str2 have the same value:
>>> str1
'ATG'
>>> str2
'ATG'
This is because str1 and str2 both point to the same object, evidenced by the fact that they share the same unique object ID:
>>> id(str1)
140439014052080
>>> id(str2)
140439014052080
>>> id(str1) == id(str2)
True
Now suppose you modify str1:
>>> str1 += 'TAG' # same as str1 = str1 + 'TAG'
>>> str1
'ATGTAG'
Because str objects are immutable, the above assignment created a new unique object with its own ID:
>>> id(str1)
140439016777456
>>> id(str1) == id(str2)
False
However, str2 maintains the same ID it had earlier:
>>> id(str2)
140439014052080
Thus, execution of str1 += 'TAG' assigned a brand new str object with its own unique ID to the variable str1, while str2 continues to point to the original str object.
This implies that assigning an existing str variable to another variable name does not create a copy of its value in memory.