Why does the original list change? - python

Running this:
a = [[1], [2]]
for i in a:
i *= 2
print(a)
Gives
[[1, 1], [2, 2]]
I would expect to get the original list, as happens here:
a = [1, 2]
for i in a:
i *= 2
print(a)
Which gives:
[1, 2]
Why is the list in the first example being modified?

You are using augmented assignment statements. These operate on the object named on the left-hand side, giving that object the opportunity to update in-place:
An augmented assignment expression like x += 1 can be rewritten as x = x + 1 to achieve a similar, but not exactly equal effect. In the augmented version, x is only evaluated once. Also, when possible, the actual operation is performed in-place, meaning that rather than creating a new object and assigning that to the target, the old object is modified instead.
(bold emphasis mine).
This is achieved by letting objects implement __i[op]__ methods, for =* that's the __imul__ hook:
These methods are called to implement the augmented arithmetic assignments (+=, -=, *=, #=, /=, //=, %=, **=, <<=, >>=, &=, ^=, |=). These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self).
Using *= on a list multiplies that list object and returns the same list object (self) to be 'assigned' back to the same name.
Integers on the other hand are immutable objects. Arithmetic operations on integers return new integer objects, so int objects do not even implement the __imul__ hook; Python has to fall back to executing i = i * 3 in that case.
So for the first example, the code:
a = [[1], [2]]
for i in a:
i *= 2
really does this (with the loop unrolled for illustration purposes):
a = [[1], [2]]
i = a[0].__imul__(2) # a[0] is altered in-place
i = a[1].__imul__(2) # a[1] is altered in-place
where the list.__imul__ method applies the change to the list object itself, and returns the reference to the list object.
For integers, this is executed instead:
a = [1, 2]
i = a[0] * 2 # a[0] is not affected
i = a[1] * 2 # a[1] is not affected
So now the new integer objects are assigned to i, which is independent from a.

The reason your results are different for each example is because list are mutable, but integers are not.
That means when you you modify an integer object in place, the operator must return a new integer object. However, since list are mutable, the changes are simply added to already existing list object.
When you used *= in the for-loop in the first example, Python modify the already existing list. But when you used *= with the integers, a new integer object had to be returned.
This can also be observed with a simple example:
>>> a = 1
>>> b = [1]
>>>
>>> id(a)
1505450256
>>> id(b)
52238656
>>>
>>> a *= 1
>>> b *= 1
>>>
>>> id(a)
1505450256
>>> id(b)
52238656
>>>
As you can see above, the memory address for a changed when we multiplied it in-place. So *= returned a new object. But the memory for the list did not change. That means *= modified the list object in-place.

In the first case your i in the for loop is a list. So you're telling python hey, take the ith list and repeat it twice. You're basically repeating the list 2 times, this is what the * operator does to a list.In the second case your i is a value, so you're applying * to a value, not a list.

Related

Why do variables containing lists in Python act differently from say variable containing integers in terms of storing/pointing towards values? [duplicate]

List reference append code
a = [1,2,3,4,5]
b = a
b.append(6)
print(a)
print(b)
#ans:
[1,2,3,4,5,6]
[1,2,3,4,5,6]
Integer reference in int
a = 1
b = a
b +=1
print(a)
print(b)
#ans:
1
2
how reference works in python integer vs list ? in list both value are same, why is in integer section a value is not 2 ?
In Python, everything is an object. Everything is a name for an address (pointer) per the docs.
On that page you can scroll down and find the following:
Numeric objects are immutable; once created their value never changes
Under that you'll see the int type defined, so it makes perfect sense your second example works.
On the top of the same page, you'll find the following:
Every object has an identity, a type and a value. An object’s identity never changes once it has been created; you may think of it as the object’s address in memory.
Python behaves just like C and Java in that you cannot reassign where the pointer to a name points. Python, like Java, is also pass-by-value and doesn't have a pass-by-reference semantic.
Looking at your first example:
>>> a = 1
>>> hex(id(a))
'0x7ffdc64cd420'
>>> b = a + 1
>>> hex(id(b))
'0x7ffdc64cd440'
>>> print(a)
1
>>> print(b)
2
Here it is shown that the operation b = a + 1 leaves a at 1 and b is now 2. That's because int is immutable, names that point to the value 1 will always point to the same address:
>>> a = 1
>>> b = 2
>>> c = 1
>>> hex(id(a))
'0x7ffdc64cd420'
>>> hex(id(b))
'0x7ffdc64cd440'
>>> hex(id(c))
'0x7ffdc64cd420'
Now this only holds true for the values of -5 to 256 in the C implementation, so beyond that you get new addresses, but the mutability shown above holds. I've shown you the sharing of memory addresses for a reason. On the same page you'll find the following:
Types affect almost all aspects of object behavior. Even the importance of object identity is affected in some sense: for immutable types, operations that compute new values may actually return a reference to any existing object with the same type and value, while for mutable objects this is not allowed. E.g., after a = 1; b = 1, a and b may or may not refer to the same object with the value one, depending on the implementation, but after c = []; d = [], c and d are guaranteed to refer to two different, unique, newly created empty lists. (Note that c = d = [] assigns the same object to both c and d.)
So your example:
>>> a = [1, 2, 3, 4, 5]
>>> hex(id(a))
'0x17292e1cbc8'
>>> b = a
>>> hex(id(b))
'0x17292e1cbc8'
I should be able to stop right here, its obvious that both a and b refer to the same object in memory at address 0x17292e1cbc8. Thats because the above is like saying:
# Lets assume that `[1, 2, 3, 4, 5]` is 0x17292e1cbc8 in memory
>>> a = 0x17292e1cbc8
>>> b = a
>>> print(b)
'0x17292e1cbc8'
Long and skinny? You're simply assigning a pointer to a new name, but both names point to the same object in memory! Note: This is not the same as a shallow copy because no external compound object is made.

Why are lists linked in Python in a persistent way?

A variable is set. Another variable is set to the first. The first changes value. The second does not. This has been the nature of programming since the dawn of time.
>>> a = 1
>>> b = a
>>> b = b - 1
>>> b
0
>>> a
1
I then extend this to Python lists. A list is declared and appended. Another list is declared to be equal to the first. The values in the second list change. Mysteriously, the values in the first list, though not acted upon directly, also change.
>>> alist = list()
>>> blist = list()
>>> alist.append(1)
>>> alist.append(2)
>>> alist
[1, 2]
>>> blist
[]
>>> blist = alist
>>> alist.remove(1)
>>> alist
[2]
>>> blist
[2]
>>>
Why is this?
And how do I prevent this from happening -- I want alist to be unfazed by changes to blist (immutable, if you will)?
Python variables are actually not variables but references to objects (similar to pointers in C). There is a very good explanation of that for beginners in http://foobarnbaz.com/2012/07/08/understanding-python-variables/
One way to convince yourself about this is to try this:
a=[1,2,3]
b=a
id(a)
68617320
id(b)
68617320
id returns the memory address of the given object. Since both are the same for both lists it means that changing one affects the other, because they are, in fact, the same thing.
Variable binding in Python works this way: you assign an object to a variable.
a = 4
b = a
Both point to 4.
b = 9
Now b points to somewhere else.
Exactly the same happens with lists:
a = []
b = a
b = [9]
Now, b has a new value, while a has the old one.
Till now, everything is clear and you have the same behaviour with mutable and immutable objects.
Now comes your misunderstanding: it is about modifying objects.
lists are mutable, so if you mutate a list, the modifications are visible via all variables ("name bindings") which exist:
a = []
b = a # the same list
c = [] # another empty one
a.append(3)
print a, b, c # a as well as b = [3], c = [] as it is a different one
d = a[:] # copy it completely
b.append(9)
# now a = b = [3, 9], c = [], d = [3], a copy of the old a resp. b
What is happening is that you create another reference to the same list when you do:
blist = alist
Thus, blist referes to the same list that alist does. Thus, any modifications to that single list will affect both alist and blist.
If you want to copy the entire list, and not just create a reference, you can do this:
blist = alist[:]
In fact, you can check the references yourself using id():
>>> alist = [1,2]
>>> blist = []
>>> id(alist)
411260888
>>> id(blist)
413871960
>>> blist = alist
>>> id(blist)
411260888
>>> blist = alist[:]
>>> id(blist)
407838672
This is a relevant quote from the Python docs.:
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
Based on this post:
Python passes references-to-objects by value (like Java), and
everything in Python is an object. This sounds simple, but then you
will notice that some data types seem to exhibit pass-by-value
characteristics, while others seem to act like pass-by-reference...
what's the deal?
It is important to understand mutable and immutable objects. Some
objects, like strings, tuples, and numbers, are immutable. Altering
them inside a function/method will create a new instance and the
original instance outside the function/method is not changed. Other
objects, like lists and dictionaries are mutable, which means you can
change the object in-place. Therefore, altering an object inside a
function/method will also change the original object outside.
So in your example you are making the variable bList and aList point to the same object. Therefore when you remove an element from either bList or aList it is reflected in the object that they both point to.
The short answer two your question "Why is this?": Because in Python integers are immutable, while lists are mutable.
You were looking for an official reference in the Python docs. Have a look at this section:
http://docs.python.org/2/reference/simple_stmts.html#assignment-statements
Quote from the latter:
Assignment statements are used to (re)bind names to values and to
modify attributes or items of mutable objects
I really like this sentence, have never seen it before. It answers your question precisely.
A good recent write-up about this topic is http://nedbatchelder.com/text/names.html, which has already been mentioned in one of the comments.

Store reference to primitive type in Python?

Code:
>>> a = 1
>>> b = 2
>>> l = [a, b]
>>> l[1] = 4
>>> l
[1, 4]
>>> l[1]
4
>>> b
2
What I want to instead see happen is that when I set l[1] equal to 4, that the variable b is changed to 4.
I'm guessing that when dealing with primitives, they are copied by value, not by reference. Often I see people having problems with objects and needing to understand deep copies and such. I basically want the opposite. I want to be able to store a reference to the primitive in the list, then be able to assign new values to that variable either by using its actual variable name b or its reference in the list l[1].
Is this possible?
There are no 'primitives' in Python. Everything is an object, even numbers. Numbers in Python are immutable objects. So, to have a reference to a number such that 'changes' to the 'number' are 'seen' through multiple references, the reference must be through e.g. a single element list or an object with one property.
(This works because lists and objects are mutable and a change to what number they hold is seen through all references to it)
e.g.
>>> a = [1]
>>> b = a
>>> a
[1]
>>> b
[1]
>>> a[0] = 2
>>> a
[2]
>>> b
[2]
You can't really do that in Python, but you can come close by making the variables a and b refer to mutable container objects instead of immutable numbers:
>>> a = [1]
>>> b = [2]
>>> lst = [a, b]
>>> lst
[[1], [2]]
>>> lst[1][0] = 4 # changes contents of second mutable container in lst
>>> lst
[[1], [4]]
>>> a
[1]
>>> b
[4]
I don't think this is possible:
>>> lst = [1, 2]
>>> a = lst[1] # value is copied, not the reference
>>> a
2
>>> lst[1] = 3
>>> lst
[1, 3] # list is changed
>>> a # value is not changed
2
a refers to the original value of lst[1], but does not directly refer to it.
Think of l[0] as a name referring to an object a, and a as a name that referring to an integer.
Integers are immutable, you can make names refer to different integers, but integers themselves can't be changed.
There were a relevant discussion earlier:
Storing elements of one list, in another list - by reference - in Python?
According to #mgilson, when doing l[1] = 4, it simply replaces the reference, rather than trying to mutate the object. Nevertheless, objects of type int are immutable anyway.

Strange behavior with a list as a default function argument [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
“Least Astonishment” in Python: The Mutable Default Argument
List extending strange behaviour
Pyramid traversal view lookup using method names
Let's say I have this function:
def a(b=[]):
b += [1]
print b
Calling it yields this result:
>>> a()
[1]
>>> a()
[1, 1]
>>> a()
[1, 1, 1]
When I change b += [1] to b = b + [1], the behavior of the function changes:
>>> a()
[1]
>>> a()
[1]
>>> a()
[1]
How does b = b + [1] differ from b += [1]? Why does this happen?
In Python there is no guarantee that a += b does the same thing as a = a + b.
For lists, someList += otherList modifies someList in place, basically equivalent to someList.extend(otherList), and then rebinds the name someList to that same list. someList = someList + otherList, on the other hand, constructs a new list by concatenating the two lists, and binds the name someList to that new list.
This means that, with +=, the name winds up pointing to the same object it already was pointing to, while with +, it points to a new object. Since function defaults are only evaluated once (see this much-cited question), this means that with += the operations pile up because they all modify the same original object (the default argument).
b += [1] alters the function default (leading to the least astonishment FAQ). b = b + [1] takes the default argument b - creates a new list with the + [1], and binds that to b. One mutates the list - the other creates a new one.
When you define a function
>>> def a(b=[]):
b += [1]
return b
it saves all the default arguments in a special place. It can be actually accessed by:
>>> a.func_defaults
([],)
The first default value is a list has the ID:
>>> id(a.func_defaults[0])
15182184
Let's try to invoke the function:
>>> print a()
[1]
>>> print a()
[1, 1]
and see the ID of the value returned:
>>> print id(a())
15182184
>>> print id(a())
15182184
As you may see, it's the same as the ID of the list of the first default value.
The different output of the function is explained by the fact that b+=... modifies the b inplace and doesn't create a new list. And b is the list being kept in the tuple of default values. So all your changes to the list are saved there, and each invocation of the function works with the different value of b.

The immutable object in python

I see a article about the immutable object.
It says when:
variable = immutable
As assign the immutable to a variable.
for example
a = b # b is a immutable
It says in this case a refers to a copy of b, not reference to b.
If b is mutable, the a wiil be a reference to b
so:
a = 10
b = a
a =20
print (b) #b still is 10
but in this case:
a = 10
b = 10
a is b # return True
print id(10)
print id(a)
print id(b) # id(a) == id(b) == id(10)
if a is the copy of 10, and b is also the copy of 10, why id(a) == id(b) == id(10)?
"Simple" immutable literals (and in particular, integers between -1 and 255) are interned, which means that even when bound to different names, they will still be the same object.
>>> a = 'foo'
>>> b = 'foo'
>>> a is b
True
While that article may be correct for some languages, it's wrong for Python.
When you do any normal assignment in Python:
some_name = some_name_or_object
You aren't making a copy of anything. You're just pointing the name at the object on the right side of the assignment.
Mutability is irrelevant.
More specifically, the reason:
a = 10
b = 10
a is b
is True, is that 10 is interned -- meaning Python keeps one 10 in memory, and anything that is set to 10 points to that same 10.
If you do
a = object()
b = object()
a is b
You'll get False, but
a = object()
b = a
a is b
will still be True.
Because interning has already been explained, I'll only address the mutable/immutable stuff:
As assign the immutable to a variable.
When talking about what is actually happening, I wouldn't choose this wording.
We have objects (stuff that lives in memory) and means to access those objects: names (or variables), these are "bound" to an object in reference. (You could say the point to the objects)
The names/variables are independent of each other, they can happen to be bound to the same object, or to different ones. Relocating one such variable doesn't affect any others.
There is no such thing as passing by value or passing by reference. In Python, you always pass/assign "by object". When assigning or passing a variable to a function, Python never creates a copy, it always passes/assigns the very same object you already have.
Now, when you try to modify an immutable object, what happens? As already said, the object is immutable, so what happens instead is the following: Python creates a modified copy.
As for your example:
a = 10
b = a
a =20
print (b) #b still is 10
This is not related to mutability. On the first line, you bind the int object with the value 10 to the name a. On the second line, you bind the object referred to by a to the name b.
On the third line, you bind the int object with the value 20 to the name a, that does not change what the name b is bound to!
It says in this case a refers to a copy of b, not reference to b. If b
is mutable, the a wiil be a reference to b
As already mentioned before, there is no such thing as references in Python. Names in Python are bound to objects. Different names (or variables) can be bound to the very same object, but there is no connection between the different names themselves. When you modify things, you modify objects, that's why all other names that are bound to that object "see the changes", well they're bound to the same object that you've modified, right?
If you bind a name to a different object, that's just what happens. There's no magic done to the other names, they stay just the way they are.
As for the example with lists:
In [1]: smalllist = [0, 1, 2]
In [2]: biglist = [smalllist]
In [3]: biglist
Out[3]: [[0, 1, 2]]
Instead of In[1] and In[2], I might have written:
In [1]: biglist = [[0, 1, 2]]
In [2]: smalllist = biglist[0]
This is equivalent.
The important thing to see here, is that biglist is a list with one item. This one item is, of course, an object. The fact that it is a list does not conjure up some magic, it's just a simple object that happens to be a list, that we have attached to the name smalllist.
So, accessing biglist[i] is exactly the same as accessing smalllist, because they are the same object. We never made a copy, we passed the object.
In [14]: smalllist is biglist[0]
Out[14]: True
Because lists are mutable, we can change smallist, and see the change reflected in biglist. Why? Because we actually modified the object referred to by smallist. We still have the same object (apart from the fact that it's changed). But biglist will "see" that change because as its first item, it references that very same object.
In [4]: smalllist[0] = 3
In [5]: biglist
Out[5]: [[3, 1, 2]]
The same is true when we "double" the list:
In [11]: biglist *= 2
In [12]: biglist
Out[12]: [[0, 1, 2], [0, 1, 2]]
What happens is this: We have a list: [object1, object2, object3] (this is a general example)
What we get is: [object1, object2, object3, object1, object2, object3]: It will just insert (i.e. modify "biglist") all of the items at the end of the list. Again, we insert objects, we do not magically create copies.
So when we now change an item inside the first item of biglist:
In [20]: biglist[0][0]=3
In [21]: biglist
Out[21]: [[3, 1, 2], [3, 1, 2]]
We could also just have changed smalllist, because for all intents and purposes, biglist could be represented as: [smalllist, smalllist] -- it contains the very same object twice.

Categories