I have been observing this strange behaviour for a while. Now I would like to know the reason.
See the example below.
Can someone explain why - and whether there are other options more similar to the first version that do what the second does.
>>> a
[1, 0, 1, 1]
>>> for el in a:
...     el = 1
...
>>> a
[1, 0, 1, 1]
>>> for i in range(len(a)):
...     a[i] = 1
...
>>> a
[1, 1, 1, 1]
Your first snippet:
for el in a:
Binds the name el to each item of a in turn. el is only a name for that item, not a reference to its slot in the list, so when you reassign el you only rebind the name; the item stored in your list is not replaced.
While this:
a[i]
Refers to the slot in a itself, so assigning to a[i] replaces the item stored at that position.
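A minimal sketch of that distinction, assuming a small example list of lists (the values and the name identity_checks are made up for illustration). It shows that the loop name refers to the very object stored in the list, yet rebinding the name leaves the list alone:

```python
a = [[1], [0], [1]]

identity_checks = []
for i, el in enumerate(a):
    identity_checks.append(el is a[i])  # el names the same object as slot i
    el = [99]                           # rebinding: only the name el changes

print(identity_checks)  # [True, True, True]
print(a)                # [[1], [0], [1]]
```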
To change all the values of a, you can create a new copy and reassign it back to a:
a = [1 for _ in a]
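This is the most concise way, but note that it rebinds a to a new list rather than changing the original one. If you want to have both the value and the index to reassign it, use enumerate:
for index, el in enumerate(a):
    print(el)        # print the current value
    a[index] = 1     # change it in the list,
    print(a[index])  # and print the new one (el still holds the old value)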
Hope this helps!
I generally end up using something like this:
a = [1 for el in a]
List comprehension is my preferred way of updating items in a list avoiding indices.
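One thing worth knowing about this pattern: the comprehension builds a new list and rebinds the name, so any other name that still refers to the old list will not see the change. A small sketch, assuming hypothetical names alias and alias_b, with slice assignment shown as an in-place alternative:

```python
a = [1, 0, 1, 1]
alias = a                 # second name for the same list object
a = [1 for _ in a]        # new list; alias still refers to the old one

b = [1, 0, 1, 1]
alias_b = b
b[:] = [1 for _ in b]     # slice assignment replaces the contents in place

print(alias)    # [1, 0, 1, 1]
print(alias_b)  # [1, 1, 1, 1]
```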
Python's variables are names for objects, not names for positions in memory. Hence
el = 1
does not change the object el is pointing to into 1. Instead, it makes el point to the object 1. It does not modify anything in the list at all. For that you have to modify the list directly:
a[2] = 1
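A plain for loop binds el to each element in turn. It does not copy the element, but assigning to el only rebinds the name, so the list is never written to. Direct iteration is also usually a little faster than looping over indices, because it avoids the extra index lookup.
for el in a:  # el is just a name; rebinding it does not touch a
whereas
for i in range(len(a)):
    a[i] = 1
assigns through the list itself and replaces the reference stored in slot i. In CPython a list is an array of object references, so a[i] is found at base address + i * reference size, which is one reason indexing starts at zero.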
Related
I have the Python code below, where I insert a sublist tmp into the main list lis. Every time I add the sublist to the main list, all the elements in the main list get replaced by the latest sublist.
tmp = [0]
lis = []
tmp[0] = 0
lis.insert(0,tmp)
print lis
tmp[0] = 1
lis.insert(1,tmp)
print lis
Output:
[[0]]
[[1], [1]]
What change should I make to get the output below?
[[0]]
[[0], [1]]
Another way of doing it:
lis = []
tmp = [0]
lis.insert(0,tmp)
print(lis)
tmp = [1]
lis.insert(1,tmp)
print(lis)
Output:
[[0]]
[[0], [1]]
The technical issue is that you are inserting the same list object twice instead of inserting two distinct objects. Because lists are mutable, changes to tmp are reflected in both places that you see it, the 0th and 1st entries of lis. To get around this you need a distinct list for each entry: use a new variable, rebind the old name to a new list, or insert a copy such as list(tmp).
>>> tmp = [0]
>>> lis = []
>>> tmp[0] = 0
>>> lis.insert(0,tmp)
>>> print(lis)
[[0]]
>>> tmp2 = [1]
>>> lis.insert(1,tmp2)
>>> print(lis)
[[0], [1]]
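If you would rather keep mutating a single tmp, another option is to insert a copy each time. This is only a sketch of that idea; list(tmp) makes a shallow copy, which is enough here because the items are plain integers:

```python
tmp = [0]
lis = []

lis.insert(0, list(tmp))   # list(tmp) is a new list with the same items
tmp[0] = 1                 # changing tmp no longer affects lis[0]
lis.insert(1, list(tmp))

print(lis)  # [[0], [1]]
```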
This behavior in Python occurs because lists in Python are mutable.
According to the Python documentation:
Mutable objects can change their value but keep their id().
So essentially when you insert the list tmp into lis at index 0, tmp is lis[0] evaluates to True and id(tmp) == id(lis[0]) also evaluates to True. Then when you later change the element at index 0 of the list tmp from 0 to 1, because mutable objects "change their value but keep their id()", both tmp is lis[0] and id(tmp) == id(lis[0]) still evaluate to True. Therefore inserting tmp again at index 1 of lis results in the same object appearing at both lis[0] and lis[1].
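The identity checks described above can be run directly. In this sketch the names same_before and same_after are made up for illustration:

```python
tmp = [0]
lis = []

lis.insert(0, tmp)
same_before = tmp is lis[0]   # the list stores a reference to tmp itself

tmp[0] = 1                    # mutate tmp in place; its identity is unchanged
same_after = tmp is lis[0]

lis.insert(1, tmp)            # the same object is now stored twice

print(same_before)       # True
print(same_after)        # True
print(lis)               # [[1], [1]]
print(lis[0] is lis[1])  # True
```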
Solution
To mitigate this, instead of lists we can use an immutable object such as a tuple. An immutable object cannot change its value at all; to get a different value you must create a new object, which has its own id(). The limitation is that we cannot modify a tuple's contents after it has been created (no insert, remove, etc.). Tuples are written like lists, but with () instead of [] (square brackets).
For example, a solution (using tuples) to your example problem would be:
tmp = (0,) # A single-element tuple needs a trailing comma after the element
lis = []
lis.insert(0, tmp)
print(lis)
tmp = (1,) # We cannot modify tuples therefore we must create a new one.
lis.insert(1, tmp)
print(lis)
I encountered a (in my opinion) extremely strange behavior, when looping through a list of lists. It is very difficult to explain, but here is an example code:
k = [[0],[1],[2]]
for lis in k:
    lis_copy = lis
    lis_copy.append(0)
    print lis
When executing this, I get:
[0, 0]
[1, 0]
[2, 0]
This is very strange for me, as the list which is appended is a copy of lis,
but lis is appended as well. I would assume this list to be untouched.
For example doing the same with a list of integers the following happens:
k = [0,1,2]
for num in k:
    num_copy = num
    num_copy = 0
    print num
Output:
0
1
2
Just as expected num is not touched by the manipulation of num_copy.
If somebody could explain why this is happening and how to avoid this,
like how to disconnect lis_copy from lis, that would be great.
Wow, I am amazed I did not encounter major problems before without knowing this. I think I should review quite a bit of my code. Anyway, I thought this was somehow connected to the loop, but that seems not to be the case, so I think the best explanation can be found here:
How to clone or copy a list?
This is because assigning a Python list (or dict) to another name does not copy it; the new name becomes a second reference to the same list. If you truly want a copy, use the copy module (deepcopy if the nested objects must be copied too).
You could use copy.copy() or copy.deepcopy() to avoid this behavior:
import copy
k = [[0],[1],[2]]
for lis in k:
    lis_copy = copy.copy(lis)
    lis_copy.append(0)
    print lis
Output:
[0]
[1]
[2]
Source: https://docs.python.org/2/library/copy.html
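The difference between copy.copy() and copy.deepcopy() only shows up with nested containers. A small sketch, assuming the whole outer list k is copied (the names shallow and deep are illustrative):

```python
import copy

k = [[0], [1], [2]]

shallow = copy.copy(k)      # new outer list, but the same inner lists
deep = copy.deepcopy(k)     # new outer list and new inner lists

shallow[0].append(9)        # visible through k, because k[0] is shared
deep[1].append(9)           # not visible through k

print(k)        # [[0, 9], [1], [2]]
print(shallow)  # [[0, 9], [1], [2]]
print(deep)     # [[0], [1, 9], [2]]
```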
Case a:
k = [[0],[1],[2]]
for lis in k:
    lis_copy = lis
    lis_copy.append(0)
    print lis
We have a pointer to a list, and inside the loop we have another pointer made that points to the inner list objects. Then a zero is appended to each object.
Case b:
k = [0,1,2]
for num in k:
    num_copy = num
    num_copy = 0
    print num
We have a pointer to a list, and inside the loop we have another pointer made that points to the inner integers. The difference is that in this case the pointer is changed to then point to a zero object rather than the list elements.
So...
a = [2,3,4,5]
for x in a:
    x += 1
a = [2,3,4,5]
Nada.
but if I ...
a[2] += 1
a = [2,3,5,5]
Clearly my mind fails to comprehend the basics. print(x) returns only the integer within the cell so it should simply add the one automatically for each list cell. What's the solution and what am I not grasping?
In this case you are defining a new variable x, that references each element of a in turn. You cannot modify the int that x refers to, because ints are immutable in Python. When you use the += operator, a new int is created and x refers to this new int, rather than the one in a. If you created a class that wrapped up an int, then you could use your loop as-is because instances of this class would be mutable. (This isn't necessary as Python provides better ways of doing what you want to do)
for x in a:
    x += 1
What you want to do is generate a new list based on a, and possibly store it back to a.
a = [x + 1 for x in a]
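For completeness, here is roughly what the wrapper-class idea mentioned above could look like. MutableInt is a hypothetical class written only for this illustration; as already noted, the list comprehension is the better tool in practice:

```python
class MutableInt:
    """Hypothetical mutable wrapper around an int, for illustration only."""

    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        self.value += other   # mutate this object in place...
        return self           # ...and hand the same object back to +=

    def __repr__(self):
        return "MutableInt(%r)" % self.value


a = [MutableInt(n) for n in [2, 3, 4, 5]]
for x in a:
    x += 1    # calls __iadd__, mutating the object the list also refers to

values = [m.value for m in a]
print(values)  # [3, 4, 5, 6]
```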
To understand what's happening here, consider these two pieces of code. First:
for i in range(len(a)):
    x = a[i]
    x += 1
Second:
for x in a:
    x += 1
These two for loops do exactly the same thing to x. You can see from the first that changing the value of x doesn't change a at all; the same holds in the second.
As others have noted, a list comprehension is a good way to create a new list with new values:
new_a = [x + 1 for x in a]
If you don't want to create a new list, you can use the following patterns to alter the original list:
for i in range(len(a)): # this gets the basic idea across
    a[i] += 1
for i, _ in enumerate(a): # this one uses enumerate() instead of range()
    a[i] += 1
for i, x in enumerate(a): # this one is nice for more complex operations
    a[i] = x + 1
If you want to +1 on elements of a list of ints:
In [775]: a = [2,3,4,5]
In [776]: b=[i+1 for i in a]
...: print b
[3, 4, 5, 6]
Why does for x in a: x += 1 fail?
Because x refers to an int, which is immutable and cannot be modified in place, so += rebinds x to a new int. If x refers to a mutable object, += can work in place:
In [784]: a = [[], []]
In [785]: for x in a:
...:     x+=[1,2,3] #here x==[] and "+=" does the same thing as list.extend
In [786]: a
Out[786]: [[1, 2, 3], [1, 2, 3]]
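The contrast between the two cases can be checked with identity tests. This is a sketch with made-up names (n_alias, xs_alias) that keep a second reference to the original object:

```python
n = 5
n_alias = n
n += 1                            # ints are immutable: n is rebound to a new object
int_rebound = n is not n_alias

xs = []
xs_alias = xs
xs += [1, 2, 3]                   # lists implement += in place, like extend
list_same_object = xs is xs_alias

print(n_alias)           # 5
print(int_rebound)       # True
print(xs_alias)          # [1, 2, 3]
print(list_same_object)  # True
```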
When you say
for x in a:
    x += 1
Python simply binds the name x with the items from a on each iteration. So, in the first iteration x will be referring to the item which is in the 0th index of a. But when you say
x += 1
it is equivalent to
x = x + 1
So, you are adding 1 to the value of x and making x refer to the newly created number (result of x + 1). That is why the change is not visible in the actual list.
To fix this, you can add 1 to each and every element like this
for idx in range(len(a)):
    a[idx] += 1
Now the same thing happens, but we are replacing the old element at index idx with the new element.
Output
[3, 4, 5, 6]
Note: Prefer the list comprehension whenever possible, since it leaves the original list unaltered and constructs a new list based on the old one. So, the same thing can be done like this
a = [item + 1 for item in a]
# [3, 4, 5, 6]
The major difference is that earlier we were making changes to the same list, whereas now we have created a new list and made a refer to it.
In your for loop, you declare a new variable x,
for x in a:
It's this variable that you then add one to
x += 1
And then you do nothing with x.
You should save x somewhere if you want to use it later on :)
The variable x inside the for loop is a name bound to each element of the list a in turn, not to the list's slot. If you rebind x (for example with x += 1 on an int) you will not affect a.
A more "correct" way to increment each element of a list by one is using a list comprehension:
a = [elem + 1 for elem in a]
You could also use the map function:
a = list(map(lambda x: x + 1, a))  # list() is needed on Python 3, where map returns an iterator
When you write a[2], you are referring to the third element of the list a,
because the first element, which in your case is 2, is stored at a[0]; similarly, 3 is at a[1], 4 at a[2] and 5 at a[3].
I was iterating through a list with a for loop, when I realized del seemed to not work. I assume this is because i is representing an object of the for loop and the del is simply deleting that object and not the reference.
And yet, I was sure I had done something like this before and it worked.
alist = [6,8,3,4,5]
for i in alist:
    if i == 8:
        del i
In my code it's actually a list of strings, but the result is the same: even though the if conditional is satisfied, deleting i has no effect.
Is there a way I can delete a number or string in this way? Am I doing something wrong?
Thanks.
Your idea as to why you are seeing that behavior is correct. Hence, I won't go over that.
To do what you want, use a list comprehension to filter the list:
>>> alist = [6,8,3,4,5]
>>> [x for x in alist if x != 8]
[6, 3, 4, 5]
>>> alist = [6,8,8,3,4,5]
>>> [x for x in alist if x != 8]
[6, 3, 4, 5]
>>>
This approach is also usually faster than building the result with an explicit for loop, and it avoids modifying the list while you iterate over it.
The for loop assigns a new value to i at each run.
So, essentially, your for loop above does
i = 6
i = 8
del i
i = 3
i = 4
i = 5
which has no effect.
del does not delete an object. It simply decrements the reference count of the object referenced by its argument. In your code
alist = [6,8,3,4,5]
for i in alist:
    if i == 8:
        del i
you have 6 objects: 5 separate integers, and a list of 5 references (one per integer). The for loop works by executing its body once per element in alist, with i holding a reference to a different element in alist in each iteration. Whichever object is referenced by i has a reference count of at least 2: the reference held by alist and i itself. When you call del i, you are simply decrementing its reference count by making i point to nothing.
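A short sketch of what del does to a name, outside of any loop. The flag name_still_bound is made up for the example; the point is that the name disappears while the list keeps its own reference:

```python
alist = [6, 8, 3, 4, 5]
i = alist[1]       # i and alist[1] now refer to the same int object
del i              # removes the name i only

try:
    i
    name_still_bound = True
except NameError:
    name_still_bound = False

print(name_still_bound)  # False
print(alist)             # [6, 8, 3, 4, 5]
```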
While the following technically works, by deleting all (known) references to the object, it has its own problems (it modifies a list you are currently iterating over) and should not be used.
>>> alist=[6,8,3,4,5]
>>> for i, a in enumerate(alist):
...     if a == 8:
...         del a         # delete the loop variable's reference
...         del alist[i]  # delete the reference held by the list
...
>>> alist
[6, 3, 4, 5]
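To see the kind of problem this causes, try the same loop on a list with two adjacent 8s (an input chosen here for illustration). Deleting an item shifts the later items left while the loop position keeps advancing, so the second 8 is skipped:

```python
alist = [6, 8, 8, 3, 4, 5]
for i, a in enumerate(alist):
    if a == 8:
        del alist[i]   # the next item slides into position i and is never visited

print(alist)  # [6, 8, 3, 4, 5]
```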
Instead, simply use a list comprehension to build a new list to replace the old one
alist = [ x for x in alist if x != 8 ]
If you really want to use del you need to use it on the list:
del alist[i]
(note that in this case i is an index, not the value you want to remove)
But really here you should probably just create another list using list comprehension:
[x for x in alist if x != 8]
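If other names refer to the same list and must see the filtered result too, one option is to assign the comprehension to a full slice, which replaces the contents in place. A sketch, with other_name as a hypothetical second reference:

```python
alist = [6, 8, 8, 3, 4, 5]
other_name = alist                        # second reference to the same list

alist[:] = [x for x in alist if x != 8]   # replace the contents in place

print(other_name)           # [6, 3, 4, 5]
print(other_name is alist)  # True
```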
When I have a for loop:
for row in list:
    row = something_or_other
It seems that sometimes I can assign a value (or append/extend etc.) directly to row and the list changes accordingly, and sometimes I have to do something roundabout like:
for row in list:
    list[list.index(row)] = something_or_other
What gives?!?
You can never reassign the value of row (or, in general, whatever your loop variable is) like this:
lst = [1, 2, 3]
for x in lst:
    x = # code
because this rebinds the variable x entirely (it's saying "forget that x was a member of a list").
However, if x is mutable, for example if it's a list, you can do:
lst = [[1, 2], [3, 4]]
for x in lst:
x.append(10)
and it will actually change the values (to [[1, 2, 10], [3, 4, 10]]). In technical terms, this is the difference between a rebinding operation and a mutating operation.
Assigning to lst[lst.index(row)] results in O(n²) performance instead of O(n), and may cause errors if the list contains multiple identical items.
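A small sketch of the duplicate-item problem, using a made-up list with a repeated value. Because index() always returns the first match, the doubling lands on the wrong slots:

```python
lst = [1, 1, 2]
for row in lst:
    # index() finds the FIRST item equal to row, not the current position
    lst[lst.index(row)] = row * 2

print(lst)  # [4, 2, 2], not the intended [2, 2, 4]
```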
Instead, assign a new list, constructed with a list comprehension or map:
lst = [1,2,3,4]
doubled = [n*2 for n in lst]
Alternatively, you can use enumerate if you really want to modify the original list:
for i,n in enumerate(lst):
    lst[i] = n*2
row in the for loop is just a name for the original item (re-assigning it inside the loop effectively breaks that link). So if the item is mutable, you can call methods on it (such as append, add, extend, etc.) and the changes will be reflected in the underlying object.
The correct idiom is to use:
for rowno, row in enumerate(some_list):
    some_list[rowno] = #...