Changing one list unexpectedly changes another, too [duplicate] - python

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
Closed 4 years ago.
I have a list of the form
v = [0,0,0,0,0,0,0,0,0]
Somewhere in the code I do
vec=v
vec[5]=5
and this changes both v and vec:
>>> print(vec)
[0, 0, 0, 0, 0, 5, 0, 0, 0]
>>> print(v)
[0, 0, 0, 0, 0, 5, 0, 0, 0]
Why does v change at all?

Why does v change at all?
vec and v are both references.
When you write vec = v, you make vec refer to the same list object that v refers to; nothing is copied.
Therefore changing the data through v will also "change" vec, and vice versa.
If you want two different lists, use:
vec = list(v)
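A minimal sketch of the difference, using the list from the question. The names alias and independent are illustrative and not part of the original code; the is operator checks whether two names refer to the same object.

```python
v = [0, 0, 0, 0, 0, 0, 0, 0, 0]

alias = v              # no copy: both names refer to the same list object
independent = list(v)  # a new list object holding the same elements

alias[5] = 5           # mutates the one list that alias and v share

print(alias is v)        # True
print(independent is v)  # False
print(v)                 # [0, 0, 0, 0, 0, 5, 0, 0, 0]
print(independent)       # [0, 0, 0, 0, 0, 0, 0, 0, 0]
```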

Because v points to the same list in memory as vec does.
If you do not want that, you have to make a copy:
from copy import deepcopy
vec = deepcopy(v)
or
vec = v[:]
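The two options are not interchangeable when the list contains other mutable objects. A small sketch, assuming a nested list (the names shallow and deep are illustrative): v[:] copies only the outer list, while deepcopy also copies the inner lists.

```python
from copy import deepcopy

v = [[1, 2], [3, 4]]   # a list whose elements are themselves lists

shallow = v[:]         # new outer list, but the inner lists are shared with v
deep = deepcopy(v)     # new outer list and new inner lists

v[0][0] = 99           # mutate an inner list through v

print(shallow[0][0])   # 99: the shallow copy sees the change
print(deep[0][0])      # 1: the deep copy does not
```

For a flat list of numbers, as in the question, both approaches give an independent list.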

Python points both names in vec = v to the same spot in memory.
To copy a list, use vec = v[:]
This might all seem counter-intuitive. Why not make copying the list the default behavior? Consider the situation
def foo():
    my_list = some_function()
    # Do stuff with my_list
Wouldn't you want my_list to refer to the exact same list that was created in some_function, rather than have the computer spend extra time creating a copy? For large lists, copying the data can take some time. For this reason, Python does not copy a list upon assignment.
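As a rough check of this, here is a sketch in which some_function is a hypothetical stand-in that builds a large list and also reports the id of the object it created. The caller ends up with that very object, not a copy.

```python
def some_function():
    # Hypothetical stand-in: build a large list and report its identity.
    data = list(range(100000))
    return data, id(data)

def foo():
    my_list, original_id = some_function()
    # The name my_list refers to the object created inside some_function.
    return id(my_list) == original_id

print(foo())  # True
```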
Misc Notes:
If you're familiar with languages that use pointers: internally (in CPython, at least), vec and v are just pointers that reference the address in memory where the list object lives.
Other languages have been able to overcome the obstacles I mentioned through the use of copy on write which allows objects to share memory until they are modified. Unfortunately, Python never implemented this.
For other ways of copying a list, or to do a deep copy, see List changes unexpectedly after assignment. Why is this and how can I prevent it?

Run this code and you will understand why variable v changes.
a = [7, 3, 4]
b = a
c = a[:]
b[0] = 10
print('a:', a, id(a))
print('b:', b, id(b))
print('c:', c, id(c))
This code prints the following output on my interpreter:
a: [10, 3, 4] 140619073542552
b: [10, 3, 4] 140619073542552
c: [7, 3, 4] 140619073604136
As you can see, a and b refer to the same memory location, whereas c refers to a different location altogether. You can say that the variables a and b are aliases for the same list. Thus, any change made through either a or b will be visible through the other as well, but not through c.
Hope this helps! :)

You could use
vec = v[:]
but, to quote:
"Alex Martelli's opinion (at least back in 2007) about this is, that it is a weird syntax and it does not make sense to use it ever. ;) (In his opinion, the next one is more readable)."
vec = list(v)
The quote is from Erez's link, "How to clone or copy a list in Python?"

Related

Why does concatenating a list to another one creates another object in memory, whereas other manipulations induce mutation? [duplicate]

This question already has answers here:
Object id in Python
(4 answers)
Closed 5 years ago.
Case A:
list1=[0, 1, 2, 3]
list2=list1
list1=list1+[4]
print(list1)
print(list2)
Output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
(This non-mutating behavior also happens when concatenating a list of more than a single entry, and when 'multiplying' the list, e.g. list1=list1*2, in fact any type of "re-assignment" that performs an operation with an infix operator to the list and then assigns the result of that operation to the same list name using "=" )
In this case the original list object that list1 pointed to has not been altered in memory, and list2 still points to it. Another object has simply been created in memory for the result of the concatenation, and list1 now points to that new object (there are now two distinct list objects in memory).
Case B:
list1=[0, 1, 2, 3]
list2=list1
list1.append(4)
print(list1)
print(list2)
Output:
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
Case C:
list1=[0, 1, 2, 3]
list2=list1
list1[-1]="foo"
print(list1)
print(list2)
Outputs:
[0, 1, 2, 'foo']
[0, 1, 2, 'foo']
In case B and C the original list object that list1 points to has mutated, list2 still points to that same object, and as a result the value of list2 has changed. (there is still one single list object in memory and it has mutated).
This behavior seems inconsistent to me as a noob. Is there a good reason/utility for this?
EDIT :
I changed the list variables names from "list" and "list_copy" to "list1" and "list2" as this was clearly a very poor and confusing choice of names.
I chose Kabanus' answer as I liked how he pointed out that mutating operations are always(?) explicit in python.
In fact a short and simple answer to my question can be done summarizing Kabanus' answer into two of his statements :
-"In python mutating operations are explicit"
-"The addition[or multiplication] operator [performed on list objects] creates a new object, and doesn't change x[the list object] implicitly."
I could also add:
-"Every time you use square brackets to describe a list, that creates a new list object"
[this is from : http://www-inst.eecs.berkeley.edu/~selfpace/cs9honline/Q2/mutation.html , great explanations there on this topic]
Also I realized after Kabanus' answer how careful one must be tracking mutations in a program :
l=[1,2]
z=l
l+=[3]
z=z+[3]
and
l=[1,2]
z=l
z=z+[3]
l+=[3]
Will yield completely different values for z. This must be a frequent source of errors, mustn't it?
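Running the two orderings side by side makes the difference concrete. This is just the two snippets above with the final z saved under illustrative names (first_z, second_z) so they can be compared.

```python
# First ordering: the in-place += runs while z is still an alias of l.
l = [1, 2]
z = l
l += [3]       # mutates the shared list, so z also sees [1, 2, 3]
z = z + [3]    # builds a new list from [1, 2, 3]
first_z = z

# Second ordering: z is rebound before l is mutated.
l = [1, 2]
z = l
z = z + [3]    # z now names a new list; l is still [1, 2]
l += [3]       # mutates only the list that l names
second_z = z

print(first_z)   # [1, 2, 3, 3]
print(second_z)  # [1, 2, 3]
```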
I'm only in the beginning of my learning and haven't delved deeply into OOP concepts just yet, but I think I'm already starting to understand what the fuss around functional paradigm is about...
l += [4]
has the same effect as:
l.append(4)
(more generally, l += other behaves like l.extend(other)) and won't create a copy.
l = l + [4]
is an assignment to l, it first evaluates the right side of the assignment, then assigns the resulting object to name l. There is no way for this operation to be mutating l.
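One way to see the difference is to keep a second name on the list and check object identity after each statement. A small sketch (alias, same_after_iadd and same_after_add are illustrative names):

```python
l = [1, 2, 3]
alias = l                    # second name for the same list object

l += [4]                     # in-place: extends the existing object
same_after_iadd = l is alias

l = l + [5]                  # builds a new list, then rebinds the name l
same_after_add = l is alias

print(same_after_iadd)  # True
print(same_after_add)   # False
print(alias)            # [1, 2, 3, 4]
print(l)                # [1, 2, 3, 4, 5]
```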
Update: I guess I haven't made myself clear enough. Of course operations on the RHS of the assignment may involve mutating the object that is the current value of LHS; but finally, the result of computing RHS is assigned to the LHS, thus overwriting any previous mutations. Example:
def increment_first(x):
    x[0] += 1
    return []
l = [1]
l = increment_first(l)
While the call to increment_first will increment l[0] as its side effect, the mutated list object will be lost anyway as soon as the value of RHS (in this case - an empty list) is assigned to l.
This is by design. The point is that Python does not like non-explicit side effects. Consider these valid lines in your file:
x=[1,2]
print(x+[3,4])
Note there is no assignment, but it's still a valid line. Do you expect x to have changed after that second line? For me it doesn't make sense.
That's what you're seeing - the addition operator creates a new object, and doesn't change x implicitly. If you feel it should, then what about:
[3,4]+x
Of course, addition does not change behavior in assignment, to avoid confusion.
In python mutating operations are explicit:
x+=[3,4]
Or your example:
x[0]=1
Here you are explicitly asking to change a cell, i.e. explicit mutation. These things are consistent - an operation is always a mutation or it isn't, and it won't be both. Usually it makes sense as well, such as concatenating lists on the fly.

Deleting first element of a list in Python

I am aware that if I wish to delete only the first element of a list t in Python, I can do it with:
del t[0]
which is about as straightforward as it gets. However:
t = t[1:]
also works. The textbook I am learning from says it is generally considered bad practice to use the latter approach, since it does not delete the head of the list per se, but
"The slice operator creates a new list and the assignment makes t refer to it, but none of that has any effect on the list that was passed as an argument."
Why is this bad? Can you name an example where such an approach would significantly alter a function? Thanks in advance.
There are multiple reasons this is not a good idea:
Creating a new list just makes unnecessary work making the new list and deallocating the old list. And in between the two steps, twice the memory is used (because the original list and new list are alive at the same time, just prior to the assignment).
If something else refers to the same list, it does not get updated: u = t; del t[0] changes both u and t. But u = t; t = t[1:] assigns the new list to t while leaving u unchanged.
Lastly, del t[0] is clearer about its intention to remove the element than the more opaque t = t[1:].
Consider a function with the two implementations:
def remove_first_a(t):
    t = t[1:]
def remove_first_b(t):
    del t[0]
Now, see those functions in use:
> l = [1, 2, 3]
> remove_first_a(l)
> l
[1, 2, 3]
> remove_first_b(l)
> l
[2, 3]
The first implementation only reassigns the local variable t which has no effect on the object that was passed as a parameter. The second implementation actually mutates that object. The first function is rather useless in its present shape. You could change it:
def remove_first_a(t):
    return t[1:]
> l = [1, 2, 3]
> x = remove_first_a(l)
> x
[2, 3]
Whether you want one or the other, depends more on the actual use case. Sometimes you want the original list to still be around unchanged for later use, and sometimes you want to make sure the original gets changed in all places that still have a reference to it.
Just an example of del versus slicing.
In [28]: u = t = [1, 2, 3, 4]
In [30]: id(u) == id(t)  # the ids are the same: both names point to one object
Out[30]: True
If we use the del operator:
In [31]: del t[0]
In [32]: t
Out[32]: [2, 3, 4]
In [33]: u
Out[33]: [2, 3, 4]
But if we use the slice operator instead (the outputs below correspond to starting again from u = t = [1, 2, 3, 4]):
In [35]: t = t[1:]
In [36]: t
Out[36]: [2, 3, 4]
In [37]: id(t) == id(u)
Out[37]: False
In [39]: u
Out[39]: [1, 2, 3, 4]
We find that t and u now point to different objects, so when we rebind t to the slice, the list u is not changed.

Numpy vs built-in copy list

What is the difference between the two pieces of code below?
Built-in list code:
>>> a = [1,2,3,4]
>>> b = a[1:3]
>>> b[1] = 0
>>> a
[1, 2, 3, 4]
>>> b
[2, 0]
numpy array
>>> c = numpy.array([1,2,3,4])
>>> d = c[1:3]
>>> d[1] = 0
>>> c
array([1, 2, 0, 4])
>>> d
array([2, 0])
As can be seen, in the numpy case the array c is affected directly. I think that for built-in lists, new memory is allocated for the variable b. Probably in numpy a reference to c[1:3] is assigned to d; I am not clear about this.
How does this work for numpy and for built-in lists?
The key point to understand is that every assignment in Python associates a name with an object in memory. Python never copies on assignment. It now becomes important to understand when new objects are created and how they behave.
In your first example, the slicing in the list creates a new list object. In this case, both of the lists reference some of the same objects (the int 2 and the int 3). The fact that these references are copied is what is called a "shallow" copy. In other words, the references are copied, but the objects they refer to are still the same. Keep in mind that this will be true regardless of the type of thing that is stored in the list.
Now, we create a new object (the int 0) and assign b[1] = 0. Because a and b are separate lists, it should not surprise us that they now show different elements.
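To make the "shallow" part visible, it helps to use elements that are themselves mutable. The following is a sketch with a hypothetical list of one-element lists instead of ints: rebinding a slot in b leaves a alone, but mutating an element that both lists reference shows up in both.

```python
a = [[1], [2], [3], [4]]
b = a[1:3]            # new list object; its slots reference a[1] and a[2]

shared = b[0] is a[1] # the element objects themselves were not copied

b[1] = 0              # rebinds a slot in b only; a is untouched
b[0].append(99)       # mutates an element object that a also references

print(shared)  # True
print(a)       # [[1], [2, 99], [3], [4]]
print(b)       # [[2, 99], 0]
```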
I like the pythontutor visualisation of this situation.
In the array case, "All arrays generated by basic slicing are always views of the original array.".
This new object shares data with the original, and indexed assignment is handled in such a way that any updates to the view will update the shared data.
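A short sketch of how one might check this, assuming numpy is installed. It uses the base attribute and numpy.shares_memory to tell a view from a copy; the name e for the explicit copy is illustrative.

```python
import numpy as np

c = np.array([1, 2, 3, 4])
d = c[1:3]           # basic slicing: a view onto c's data buffer
e = c[1:3].copy()    # an explicit, independent copy

print(d.base is c)             # True: d is a view of c
print(np.shares_memory(c, d))  # True
print(np.shares_memory(c, e))  # False

d[1] = 0             # writes through the view into c
e[0] = 10            # affects only the copy
print(c)             # [1 2 0 4]
```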
This has been covered a lot, but finding a good duplicate is too much work. :(
Let's see if I can quickly describe things with your examples:
>>> a = [1,2,3,4] # a list contains pointers to numbers elsewhere
>>> b = a[1:3] # a new list, with copies of those pointers
>>> b[1] = 0 # change one pointer in b
>>> a
[1, 2, 3, 4] # does not change any pointers in a
>>> b
[2, 0]
An array has a different structure - it has a data buffer with 'raw' numbers (or other byte values).
numpy array
>>> c = numpy.array([1,2,3,4])
>>> d = c[1:3] # a view; a new array but uses same data buffer
>>> d[1] = 0 # change a value in d;
>>> c
array([1, 2, 0, 4]) # we see the change in the corresponding slot of c
>>> d
array([2, 0])
The key point with lists is that they contain pointers to objects. You can copy the pointers without copying the objects; and you can change pointers without changing other copies of the pointers.
To save memory and time, numpy has implemented the concept of a view. It can make a new array without copying values from the original, because it can share the data buffer. But it is also possible to make a copy, e.g.
e = c[1:3].copy()
e[0] = 10
# no change in c
View vs. copy is a big topic in numpy and a fundamental one, especially when dealing with different kinds of indexing (slices, basic, advanced). We can help with questions, but you should also read the numpy docs. There's no substitute for understanding the basics of how a numpy array is stored.
http://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html
http://www.scipy-lectures.org/advanced/advanced_numpy/ (may be more advanced than what you need now)

Variable assignment and manipulation in Python [duplicate]

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
Why does a[0] change? [duplicate]
(1 answer)
Closed 7 years ago.
I am pretty confused by the difference in the results of the Python code below. Why does "a" not change when "b" changes in the first case, but does change in the second?
a=b=[]
b = [1,2,3]
print(b, a)
########## compare with below ################################
a=b=[]
b.append([1,2,3])
print(b, a)
Result of first part is:
[1, 2, 3] []
and result of the second part is:
[[1, 2, 3]] [[1, 2, 3]]
It was suggested that it is a duplicate question but I think here I am changing the list "b" with two different ways and I guess it could show how these two methods of variable manipulation could be different.
Because = is not the same as append.
When you do b = [1, 2, 3], you assign a new value to b. This does not involve the old value of b in any way; you can do this no matter what value b held before.
When you do b.append(...), you modify the object that is the existing value of b. If that object is also the value of other names (in this case, a), then those names will also "see" the change. Note that, unlike assignment, these types of operations depend on what kind of value you have. You can do b.append(...) because b is a list. If b was, say, an integer, you could not do b.append, but you could still do b = [1, 2, 3].
In the first example, you are reassigning the variable b, and now it points to another list.
a = b = [] # a -> [] <- b
b = [1, 2, 3] # a -> [], b -> [1, 2, 3]
In the second example, both a and b variables are pointing to the same list, and you are inserting values in the list:
a = b = [] # a -> [] <- b
b.append([1, 2, 3]) # a -> [[1, 2, 3]] <- b
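The same two cases can be checked with the is operator. A minimal sketch; the result names (same_after_rebind, same_after_append, and so on) are only there for illustration.

```python
# Case 1: rebinding b breaks the link between the two names.
a = b = []
b = [1, 2, 3]
same_after_rebind = a is b
a_after_rebind = a

# Case 2: append mutates the one list that both names share.
a = b = []
b.append([1, 2, 3])
same_after_append = a is b
a_after_append = a

print(same_after_rebind, a_after_rebind)  # False []
print(same_after_append, a_after_append)  # True [[1, 2, 3]]
```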
To elaborate on the other answers, this is straight from the Python docs:
list.append(x)
Add an item to the end of the list; equivalent to a[len(a):] = [x].
Looking at:
a[len(a):] = [x]
it is easy to see that using append is modifying the original list (a in the python sample code, b in your case).
As other people have pointed out, an assignment statement assigns a new value to b instead of modifying the existing list, which is what append does.
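A quick sketch of the equivalence quoted from the docs: both the slice assignment and append change the existing list object, which is why a second name (alias, an illustrative name) sees both changes.

```python
a = [1, 2]
alias = a            # second name for the same list

a[len(a):] = [3]     # slice assignment at the end: modifies the list in place
a.append(4)          # same kind of in-place modification

print(alias)         # [1, 2, 3, 4]
print(alias is a)    # True
```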

Problem with matrices in Python [duplicate]

This question already has answers here:
Multiply operator applied to list(data structure)
(2 answers)
Closed 9 years ago.
I was writing a program in Python (2.5.4), and I realized that my code was not working because of something very unusual. I am going to give an example:
A = [[0]*2]*2
When I print A, I get:
[[0, 0], [0, 0]]
That is OK. But now I want to change the element in the first column and first row. So I type:
A[0][0] = 1
But when I print A again, I get:
[[1, 0], [1, 0]]
However, I was expecting
[[1, 0], [0, 0]]
This is ruining all my code. I want to know why this is happening and how I can fix it.
On the other hand, when I type:
B = [[0,0],[0,0]]
And make:
B[0][0] = 1
I get:
[[1, 0], [0, 0]]
This is even stranger! Aren't the two ways of implementing matrices equivalent? What if I wanted a 100x100 matrix with zeros? For this case, with a 2x2 matrix, I can type [[0, 0], [0, 0]]. But that is not a good solution.
This is because your list contains several references to a list.
>>> a = [0]
>>> l = [a,a]
>>> l[0][0] = "A"
>>> l
[['A'], ['A']]
We create a list and bind it to a. We then store two references to a in the list l via l = [a, a]. Then we manipulate one reference to a, and change its first element to "A". Since a reference refers to a location in memory, by manipulating that reference (either element in l) we change the value in memory, hence affecting all other references to a.
In an illustration of the example above, arrows would represent the references to a: they are the a's in l = [a, a]. When you change the list through one of them, you change the value which they both point to.
We manipulate a via manipulating l[0] (l[0] is a reference to a); as such, we can change the first element in a by changing l[0][0] (which would be the same as a[0]) to "A".
A depiction of your list [[0]*2]*2 would look the same: both elements of the outer list are references to one inner list [0, 0].
"What if you wanted a 100 x 100 matrix of zeros?"
Use a list comprehension:
[[0] * 100 for x in range(100)]
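To see why the comprehension fixes the problem, compare the two constructions on a small matrix. This is a sketch with a hypothetical 3x3 size and illustrative names (shared, independent): the * operator repeats a reference to one row, while the comprehension evaluates [0] * cols afresh for each row.

```python
rows = cols = 3

shared = [[0] * cols] * rows                     # one row object, referenced 3 times
independent = [[0] * cols for _ in range(rows)]  # a new row object per iteration

shared[0][0] = 1
independent[0][0] = 1

print(shared)                            # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
print(independent)                       # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
print(shared[0] is shared[1])            # True
print(independent[0] is independent[1])  # False
```

The inner [0] * cols is safe because the repeated elements are immutable ints.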
