Why are lists linked in Python in a persistent way? - python

A variable is set. Another variable is set to the first. The first changes value. The second does not. This has been the nature of programming since the dawn of time.
>>> a = 1
>>> b = a
>>> b = b - 1
>>> b
0
>>> a
1
I then extend this to Python lists. A list is declared and appended. Another list is declared to be equal to the first. The values in the second list change. Mysteriously, the values in the first list, though not acted upon directly, also change.
>>> alist = list()
>>> blist = list()
>>> alist.append(1)
>>> alist.append(2)
>>> alist
[1, 2]
>>> blist
[]
>>> blist = alist
>>> alist.remove(1)
>>> alist
[2]
>>> blist
[2]
>>>
Why is this?
And how do I prevent this from happening -- I want alist to be unfazed by changes to blist (immutable, if you will)?

Python variables are actually not variables but references to objects (similar to pointers in C). There is a very good explanation of that for beginners in http://foobarnbaz.com/2012/07/08/understanding-python-variables/
One way to convince yourself about this is to try this:
a=[1,2,3]
b=a
id(a)
68617320
id(b)
68617320
id returns an integer that identifies the given object (in CPython, this is its memory address). Since both names give the same id, they refer to the same list, so a change made through one name is visible through the other.
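The same check can be made with the `is` operator, which compares identity directly. This is only a small sketch; the name `c` is introduced here purely for illustration.

```python
a = [1, 2, 3]
b = a            # b is bound to the same list object as a
c = list(a)      # c is a new list with the same contents

print(a is b)    # True: one object, two names
print(a is c)    # False: equal contents, different objects
print(a == c)    # True

b.append(4)      # mutate the shared object through b
print(a)         # [1, 2, 3, 4]
print(c)         # [1, 2, 3]
```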

Variable binding in Python works this way: you assign an object to a variable.
a = 4
b = a
Both point to 4.
b = 9
Now b points to somewhere else.
Exactly the same happens with lists:
a = []
b = a
b = [9]
Now, b has a new value, while a has the old one.
Till now, everything is clear and you have the same behaviour with mutable and immutable objects.
Now comes your misunderstanding: it is about modifying objects.
lists are mutable, so if you mutate a list, the modifications are visible via all variables ("name bindings") which exist:
a = []
b = a # the same list
c = [] # another empty one
a.append(3)
print(a, b, c) # a and b are both [3]; c is [] because it is a different list
d = a[:] # copy it completely
b.append(9)
# now a and b are [3, 9], c is [], and d is [3], a copy of the old a (and b)

What is happening is that you create another reference to the same list when you do:
blist = alist
Thus, blist refers to the same list that alist does, so any modification to that single list is visible through both alist and blist.
If you want to copy the entire list, and not just create a reference, you can do this:
blist = alist[:]
In fact, you can check the references yourself using id():
>>> alist = [1,2]
>>> blist = []
>>> id(alist)
411260888
>>> id(blist)
413871960
>>> blist = alist
>>> id(blist)
411260888
>>> blist = alist[:]
>>> id(blist)
407838672
This is a relevant quote from the Python docs:
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
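As a rough illustration of "a copy is sometimes needed", here are the common ways to make a shallow copy of a flat list. The variable names are made up for this sketch; all four approaches should behave the same for a list of numbers.

```python
import copy

alist = [1, 2]
by_slice = alist[:]              # slice of the whole list
by_constructor = list(alist)     # list() builds a new list
by_method = alist.copy()         # list.copy() method
by_module = copy.copy(alist)     # generic shallow copy

alist.remove(1)                  # only the original changes
print(alist)                     # [2]
print(by_slice, by_constructor, by_method, by_module)
# [1, 2] [1, 2] [1, 2] [1, 2]
```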

Based on this post:
Python passes references-to-objects by value (like Java), and
everything in Python is an object. This sounds simple, but then you
will notice that some data types seem to exhibit pass-by-value
characteristics, while others seem to act like pass-by-reference...
what's the deal?
It is important to understand mutable and immutable objects. Some
objects, like strings, tuples, and numbers, are immutable. Altering
them inside a function/method will create a new instance and the
original instance outside the function/method is not changed. Other
objects, like lists and dictionaries are mutable, which means you can
change the object in-place. Therefore, altering an object inside a
function/method will also change the original object outside.
So in your example you are making the variable bList and aList point to the same object. Therefore when you remove an element from either bList or aList it is reflected in the object that they both point to.
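A small sketch of the difference inside functions, assuming two hypothetical helpers named `rebind` and `mutate`: rebinding the parameter leaves the caller's list alone, while mutating it in place does not.

```python
def rebind(items):
    # Builds a new list; only the local name items is rebound.
    items = items + [99]
    return items

def mutate(items):
    # Changes the caller's list object in place.
    items.append(99)

original = [1, 2]
result = rebind(original)
print(original)  # [1, 2]
print(result)    # [1, 2, 99]

mutate(original)
print(original)  # [1, 2, 99]
```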

The short answer to your question "Why is this?": Because in Python integers are immutable, while lists are mutable.
You were looking for an official reference in the Python docs. Have a look at this section:
http://docs.python.org/2/reference/simple_stmts.html#assignment-statements
Quote from the latter:
Assignment statements are used to (re)bind names to values and to
modify attributes or items of mutable objects
I really like this sentence, have never seen it before. It answers your question precisely.
A good recent write-up about this topic is http://nedbatchelder.com/text/names.html, which has already been mentioned in one of the comments.

Related

Why can't a list be constructed and modified in the same line?

For example, why is a not equal to b?
a = [1]
a.append(2)
print(a) # [1, 2]
b = [1].append(2)
print(b) # None
The syntax for b doesn't look wrong to me, but it is. I want to write one-liners to define a list (e.g. using a generator expression) and then append elements, but all I get is None.
It's because:
append, extend, sort and several other list methods all work "in-place".
What does "in-place" mean? It means the method modifies the original list directly and returns None. With a function that returns a new list instead, you would need:
l = sorted(l)
to update l. But append already modifies the list, so:
l.append(3)
is enough to modify l; you do not need:
l = l.append(3)
If you do:
l = [1].append(2)
Yes, it will modify the list [1], but that list is then discarded because nothing refers to it, while l becomes None, the return value of append.
To get a new list without modifying in place, skip append and do either:
l = l + [2]
Or:
l = [*l, 2]
The one-liner for b does these steps:
Defines a list [1]
Appends 2 to the list in-place
append returns None, so b is None
The same is true for all list methods that alter the list in-place without a return. These are all None:
c = [1].extend([2])
d = [2, 1].sort()
e = [1].insert(1, 2)
...
If you wanted a one-liner that is similar to your define and extend, you could do
c2 = [1, *[2]]
which you could use to combine two generator expressions.
Most list methods that modify the list (append, extend, insert, remove, sort, reverse) work 'in situ': they change the original list and return None. A few, such as pop, do return a value, but never the list itself.
The advantage is that you don't need to assign the result back to the original variable every time you modify it. The drawback is that you can't chain these method calls in one line of code the way it is often done in Javascript, where many array methods return an array.
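If a one-liner is the goal, one option is to stick to expressions that return a new list. The sketch below uses a comprehension plus concatenation or unpacking; `appended` is a hypothetical helper written for this example, not a built-in.

```python
squares = [n * n for n in range(3)]

# Each expression builds a new list; squares itself is left untouched.
with_extra = squares + [100]
unpacked = [*squares, 100]

def appended(items, value):
    """Hypothetical helper: return a new list with value added at the end."""
    return [*items, value]

chained = appended(appended([1], 2), 3)   # calls can be nested in one line

print(with_extra)  # [0, 1, 4, 100]
print(unpacked)    # [0, 1, 4, 100]
print(chained)     # [1, 2, 3]
print(squares)     # [0, 1, 4]
```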

Python not referencing to same list

In code below:
a=[0,1]
b=a
for i in range(2):
    for j in b:
        a=a+[j]
why does a print as:
[0,1,0,1,0,1]
and b as:
[0,1]
However when executed on idle both lists change:
>>> c=[9,0]
>>> d=c
>>> d+=[7]
>>> c
[9, 0, 7]
Since a is being appended, why doesn't b change as is the property of python list assignment?
a is not appended. When you write:
a = a+[j]
you each time construct a list [j] and then construct a new list a+[j] that contains all the elements of a and then j.
Now you let a refer to the new list, but b still refers to the old list. Since the old list is not updated (the state is not altered, for instance through append), the list remains the same (which is good since iterating over a list you alter can have unwanted side effects).
If you used a.append(j) or a += [j] instead of a = a + [j], then the list would be updated in place (in the latter case, a += [j] behaves like a.extend([j])). Since both a and b refer to that list, b would thus also be updated. But mind that since we iterate over b at the same time, we could end up in an infinite loop, so it is better not to do that anyway.
a is not appended. Appending is done with the append method like so:
a.append(1)
Every time you add (a + [j]) you construct a new object.
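The two cases from the question can be compared side by side with `is`. This is just a compact restatement of the behaviour described above, without the loop.

```python
a = [0, 1]
b = a
a = a + [2]      # builds a new list; a is rebound, b keeps the old list
print(a, b)      # [0, 1, 2] [0, 1]
print(a is b)    # False

c = [9, 0]
d = c
d += [7]         # extends the existing list in place
print(c, d)      # [9, 0, 7] [9, 0, 7]
print(c is d)    # True
```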

Copy a list of list by value and not reference [duplicate]

To understand why I was getting an error in a program , in which I tried to find the "minor" of a determinant, I wrote a simpler program because my variables were messed up. This function below takes in a 2 * 2 matrix as an input, and returns a list containing its rows (pointless and inefficient, I know, but I'm trying to understand the theory behind this).
def alpha(A): #where A will be a 2 * 2 matrix
    B = A #the only purpose of B is to store the initial value of A, to retrieve it later
    mylist = []
    for i in range(2):
        for j in range(2):
            del A[i][j]
            mylist.append(A)
            A = B
    return mylist
However, here it seems that B is assigned the value of A dynamically, in the sense that I'm not able to store the initial value of A in B to use it later. Why is that?
Because Python passes references to list objects, not copies of them.
This means that when you write "b=a" you're saying that a and b are names for the same object, so when you change the list through b you also change what a sees, and vice versa.
A way to copy a list by value:
new_list = old_list[:]
If the list contains objects and you want to copy them as well, use generic copy.deepcopy():
import copy
new_list = copy.deepcopy(old_list)
Since Python does not copy a list on assignment or when passing it to a function, A and B are the same object. When you modify B you are also modifying A. This behavior can be demonstrated in a simple example:
>>> A = [1, 2, 3]
>>> def change(l):
...     b = l
...     b.append(4)
...
>>> A
[1, 2, 3]
>>> change(A)
>>> A
[1, 2, 3, 4]
>>>
If you need a copy of A use slice notation:
B = A[:]
A looks like a reference type, not a value type. Reference types are not copied on assignment (unlike e.g. R). You can use copy.copy to make a shallow copy of a list, or copy.deepcopy to make a deep copy that also copies the nested rows.
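For a list of lists such as the 2 * 2 matrix in the question, the difference between a shallow and a deep copy matters. A minimal sketch, with the names `shallow` and `deep` chosen only for this example:

```python
import copy

A = [[1, 2], [3, 4]]
shallow = A[:]             # new outer list, but the same inner row objects
deep = copy.deepcopy(A)    # new outer list and new rows

del A[0][0]                # mutate a row of the original

print(A)        # [[2], [3, 4]]
print(shallow)  # [[2], [3, 4]]  (shares its rows with A)
print(deep)     # [[1, 2], [3, 4]]
```

So to keep the initial value of a nested list around, the deep copy is the one that stays unaffected.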

what is the difference between del a[:] and a = [] when I want to empty a list called a in python? [duplicate]

Please what is the most efficient way of emptying a list?
I have a list called a = [1,2,3]. To delete the content of the list I usually write a = [ ]. I came across a function in python called del. I want to know if there is a difference between del a [:] and what I use.
There is a difference, and it has to do with whether that list is referenced from multiple places/names.
>>> a = [1, 2, 3]
>>> b = a
>>> del a[:]
>>> print(b)
[]
>>> a = [1, 2, 3]
>>> b = a
>>> a = []
>>> print(b)
[1, 2, 3]
Using del a[:] clears the existing list, which means anywhere it's referenced will become an empty list.
Using a = [] sets a to point to a new empty list, which means that other places the original list is referenced will remain non-empty.
The key to understanding here is to realize that when you assign something to a variable, it just makes that name point to a thing. Things can have multiple names, and changing what a name points to doesn't change the thing itself.
This can probably best be shown:
>>> a = [1, 2, 3]
>>> id(a)
45556280
>>> del a[:]
>>> id(a)
45556280
>>> b = [4, 5, 6]
>>> id(b)
45556680
>>> b = []
>>> id(b)
45556320
When you do a[:] you are referring to all elements within the list "assigned" to a. The del statement removes references to objects. So, doing del a[:] is saying "remove all references to objects from within the list assigned to a". The list is still the same object; only its contents have changed. We can see this with the id function, which gives us a number identifying an object. The id of the list before using del and after remains the same, indicating the same list object is assigned to a.
On the other hand, when we assign a non-empty list to b and then assign a new empty list to b, the id changes. This is because we have actually moved the b reference from the existing [4, 5, 6] list to the new [] list.
Beyond just the identity of the objects you are dealing with, there are other things to be aware of:
>>> a = [1, 2, 3]
>>> b = a
>>> del a[:]
>>> print(a)
[]
>>> print(b)
[]
Both b and a refer to the same list. Removing the elements from the a list without changing the list itself mutates the list in place. As b references the same object, we see the same result there. If you did a = [] instead, then a will refer to a new empty list while b continues to reference the [1, 2, 3] list.
>>> list1 = [1,2,3,4,5]
>>> list2 = list1
To get a better understanding, let us look at what happens internally.
>>> list1 = [1,2,3,4,5]
This creates a list object and assigns it to list1.
>>> list2 = list1
The list object which list1 was referring to is also assigned to list2.
Now, let's look at the methods to empty a list and what actually happens internally.
METHOD-1: Set to empty list [] :
>>> list1 = []
>>> list2
[1, 2, 3, 4, 5]
This does not delete the elements of the list but deletes the reference to the list. So, list1 now points to an empty list but all other references will have access to that old list1.
This method just creates a new list object and assigns it to list1. Any other references will remain.
METHOD-2: Delete using slice operator[:] :
>>> del list1[:]
>>> list2
[]
When we use the slice operator to delete all the elements of the list, then all the places where it is referenced, it becomes an empty list. So list2 also becomes an empty list.
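Two other in-place spellings behave like del list1[:]: the list.clear() method (Python 3.3 and later) and assigning an empty list to the full slice. A short sketch using the same names as above:

```python
list1 = [1, 2, 3, 4, 5]
list2 = list1
list1.clear()            # empties the shared list in place
print(list2)             # []

list1 = [1, 2, 3, 4, 5]
list2 = list1
list1[:] = []            # slice assignment also empties it in place
print(list2)             # []
print(list1 is list2)    # True: still one list object
```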
Well, del a[:] reuses the existing list object instead of creating a new one, as the person above me implied. The variable still refers to the same object (same id), just with different contents. However, when your variable is assigned something else, it is bound to a completely different object with a different id.

The immutable object in python

I saw an article about immutable objects.
It says that when:
variable = immutable
That is, when an immutable is assigned to a variable.
for example
a = b # b is an immutable
It says in this case a refers to a copy of b, not a reference to b.
If b is mutable, then a will be a reference to b
so:
a = 10
b = a
a =20
print (b) #b still is 10
but in this case:
a = 10
b = 10
a is b # return True
print(id(10))
print(id(a))
print(id(b)) # id(a) == id(b) == id(10)
If a is a copy of 10, and b is also a copy of 10, why is id(a) == id(b) == id(10)?
"Simple" immutable values may be shared by the implementation. CPython caches small integers (currently -5 through 256) and interns many string literals, which means that even when bound to different names, they will still be the same object.
>>> a = 'foo'
>>> b = 'foo'
>>> a is b
True
While that article may be correct for some languages, it's wrong for Python.
When you do any normal assignment in Python:
some_name = some_name_or_object
You aren't making a copy of anything. You're just pointing the name at the object on the right side of the assignment.
Mutability is irrelevant.
More specifically, the reason:
a = 10
b = 10
a is b
is True, is that CPython caches small integers such as 10 -- meaning it keeps one 10 in memory, and anything that is set to 10 points to that same 10.
If you do
a = object()
b = object()
a is b
You'll get False, but
a = object()
b = a
a is b
will still be True.
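Because sharing of equal immutable values is an implementation detail, a reasonable habit is to compare values with == and keep `is` for identity checks. A small sketch of the distinction, using lists so the result does not depend on any caching:

```python
a = [10, 20]
b = [10, 20]
print(a == b)   # True: same value
print(a is b)   # False: two separate list objects

x = object()
y = x
print(x is y)   # True: two names, one object
```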
Because interning has already been explained, I'll only address the mutable/immutable stuff:
As assign the immutable to a variable.
When talking about what is actually happening, I wouldn't choose this wording.
We have objects (stuff that lives in memory) and means to access those objects: names (or variables), which are "bound" to an object. (You could say they point to the objects.)
The names/variables are independent of each other; they can happen to be bound to the same object, or to different ones. Rebinding one such variable doesn't affect any others.
There is no such thing as passing by value or passing by reference. In Python, you always pass/assign "by object". When assigning or passing a variable to a function, Python never creates a copy, it always passes/assigns the very same object you already have.
Now, when you try to modify an immutable object, what happens? As already said, the object is immutable, so what happens instead is the following: the operation creates a new object with the new value, and the original is left as it was.
As for your example:
a = 10
b = a
a =20
print (b) #b still is 10
This is not related to mutability. On the first line, you bind the int object with the value 10 to the name a. On the second line, you bind the object referred to by a to the name b.
On the third line, you bind the int object with the value 20 to the name a, that does not change what the name b is bound to!
It says in this case a refers to a copy of b, not reference to b. If b
is mutable, the a will be a reference to b
As already mentioned before, there is no such thing as references in Python. Names in Python are bound to objects. Different names (or variables) can be bound to the very same object, but there is no connection between the different names themselves. When you modify things, you modify objects, that's why all other names that are bound to that object "see the changes", well they're bound to the same object that you've modified, right?
If you bind a name to a different object, that's just what happens. There's no magic done to the other names, they stay just the way they are.
As for the example with lists:
In [1]: smalllist = [0, 1, 2]
In [2]: biglist = [smalllist]
In [3]: biglist
Out[3]: [[0, 1, 2]]
Instead of In[1] and In[2], I might have written:
In [1]: biglist = [[0, 1, 2]]
In [2]: smalllist = biglist[0]
This is equivalent.
The important thing to see here, is that biglist is a list with one item. This one item is, of course, an object. The fact that it is a list does not conjure up some magic, it's just a simple object that happens to be a list, that we have attached to the name smalllist.
So, accessing biglist[0] is exactly the same as accessing smalllist, because they are the same object. We never made a copy, we passed the object.
In [14]: smalllist is biglist[0]
Out[14]: True
Because lists are mutable, we can change smalllist, and see the change reflected in biglist. Why? Because we actually modified the object referred to by smalllist. We still have the same object (apart from the fact that it's changed). But biglist will "see" that change because as its first item, it references that very same object.
In [4]: smalllist[0] = 3
In [5]: biglist
Out[5]: [[3, 1, 2]]
The same is true when we "double" the list:
In [11]: biglist *= 2
In [12]: biglist
Out[12]: [[0, 1, 2], [0, 1, 2]]
What happens is this: We have a list: [object1, object2, object3] (this is a general example)
What we get is: [object1, object2, object3, object1, object2, object3]: It will just insert (i.e. modify "biglist") all of the items at the end of the list. Again, we insert objects, we do not magically create copies.
So when we now change an item inside the first item of biglist:
In [20]: biglist[0][0]=3
In [21]: biglist
Out[21]: [[3, 1, 2], [3, 1, 2]]
We could also just have changed smalllist, because for all intents and purposes, biglist could be represented as: [smalllist, smalllist] -- it contains the very same object twice.
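If independent inner lists are wanted instead, one common approach is to build each inner list separately, for example with a list comprehension. The names `shared` and `independent` are made up for this sketch.

```python
shared = [[0, 1, 2]] * 2                      # the same inner list twice
independent = [[0, 1, 2] for _ in range(2)]   # a fresh inner list per item

shared[0][0] = 3
independent[0][0] = 3

print(shared)       # [[3, 1, 2], [3, 1, 2]]
print(independent)  # [[3, 1, 2], [0, 1, 2]]
```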
