Unexpected behavior when iterating over a list - python

I did not expect this to work, since I was modifying the object being iterated over, but I didn't expect it to fail this way. I actually expected an exception to be raised.
>>> x = [1, 2, 3]
>>> for a in x:
...     print(a, x.pop(0))
...
1 1
3 2
>>> x
[3]
With a slightly larger range:
>>> x = [1, 2, 3, 4]
>>> for a in x:
...     print(a, x.pop(0))
...
1 1
3 2
>>> x
[3, 4]
And a little bit larger still:
>>> x = [1, 2, 3, 4, 5]
>>> for a in x:
...     print(a, x.pop(0))
...
1 1
3 2
5 3
>>> x
[4, 5]
It's like the for loop is creating a generator from the list, but comparing the "index" to the length of the list to decide when iteration is over.
It still seems like this should generate an exception though, not this bizarre behavior. Is there some reason it doesn't raise an exception?

As you have intuited, the for loop uses an index internally, incrementing it by 1 on each iteration and stopping once the index is no longer less than the current length of the list. This is how iteration is defined for lists. It is defined in other ways for other types, and you can define it for your own classes by implementing __iter__.
Modifying a list while iterating over it is legal, and predictable once you understand how it works. When you remove one or more items, the items after them shift down to lower indexes, and if a removed item is at an index less than or equal to the loop's current index, you end up skipping items when the loop increments the index.
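To make the index-based behaviour concrete, here is a rough sketch that mimics it with an explicit counter. It is only an illustration of the idea, not CPython's actual implementation, and the helper name iterate_like_for is made up for this example.

```python
def iterate_like_for(items, body):
    """Mimic how a for loop walks a list: bump an index, re-check the length each time."""
    index = 0
    while index < len(items):   # the length is re-read on every pass
        current = items[index]
        index += 1              # the index advances before the body runs
        body(current)

x = [1, 2, 3, 4, 5]
seen = []
# Same body as the question: record the loop value and what pop(0) removed.
iterate_like_for(x, lambda a: seen.append((a, x.pop(0))))

print(seen)  # [(1, 1), (3, 2), (5, 3)]
print(x)     # [4, 5]
```

The pairs match the output in the question: each pop(0) shifts the remaining items down by one while the index still moves up by one, so every other item is skipped.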
There are a number of solutions: iterate in reverse order, iterate over a copy, make a list of the indices to be deleted and do that separately, build a new list instead of modifying the existing one, use a while loop instead of an if statement if you are conditionally removing items... there are probably other approaches as well.
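A minimal sketch of three of those approaches, assuming a made-up predicate is_unwanted (here it simply flags even numbers):

```python
def is_unwanted(n):
    # Hypothetical condition for illustration: drop even numbers.
    return n % 2 == 0

# 1. Iterate over a copy and mutate the original.
a = [1, 2, 3, 4, 5, 6]
for n in a[:]:
    if is_unwanted(n):
        a.remove(n)

# 2. Iterate over the indices in reverse, so deletions never shift unvisited items.
b = [1, 2, 3, 4, 5, 6]
for i in range(len(b) - 1, -1, -1):
    if is_unwanted(b[i]):
        del b[i]

# 3. Build a new list instead of modifying the existing one.
c = [n for n in [1, 2, 3, 4, 5, 6] if not is_unwanted(n)]

print(a, b, c)  # [1, 3, 5] [1, 3, 5] [1, 3, 5]
```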

Related

How to iterate and modify over a numpy array?

I have an array that I would like to iterate over and modify the array itself, through inserts or deletions.
for idx, ele in enumerate(np.nditer(array)):
    if idx + 1 < array.shape[0] and ele > array[idx+1]:
        array = np.delete(array, idx+1)
    print(ele)
Given [5, 4, 3, 2, 1], I want the loop to print 5 3 1, because 4 and 2 are smaller than their preceding elements. But Python creates the iterator from the first instance of array, so it prints 5 4 3 2 1.
Generally speaking, I want the iterator to reflect the changes I make to the array within the body of my loop.
You cannot mutate the length of a numpy array, because numpy assigns the required memory for an array upon its creation.
With
array = np.delete(array, idx+1)
you are creating a new array on the right-hand side of the = and rebinding the name array to it.
The return value for enumerate(np.nditer(array)) has already been created at that point and won't recognize that the name array has been rebound.
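One way around the stale iterator is to drive the loop with an explicit index and re-read the current array on every pass. This is only a sketch under that assumption; the function name drop_smaller_successors is made up for illustration.

```python
import numpy as np

def drop_smaller_successors(values):
    """Visit elements left to right, deleting the next element whenever it is smaller."""
    arr = np.asarray(values)
    visited = []
    idx = 0
    while idx < arr.shape[0]:          # the current length is checked on every pass
        ele = arr[idx]
        if idx + 1 < arr.shape[0] and ele > arr[idx + 1]:
            arr = np.delete(arr, idx + 1)   # rebinding is fine: the loop re-reads arr
        visited.append(int(ele))
        idx += 1
    return visited, arr

visited, remaining = drop_smaller_successors([5, 4, 3, 2, 1])
print(visited)  # [5, 3, 1]
```

Note that np.delete copies the array each time, so this is not efficient for large inputs.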
In principle, you can iterate over a sequence and mutate its length at the same time (generally not a good idea). The object just needs to have methods that let you mutate its length (like lists, for example).
Consider:
>>> l = [5, 4, 3, 2, 1]
>>> for idx, ele in enumerate(l):
...:     if ele == 3:
...:         l.pop(idx) # mutates l
...:     print(ele)
...:
5
4
3
1
>>> l
[5, 4, 2, 1]
Notice that
l is mutated.
The loop does not print 2 because popping an element reduces the indexes of all the remaining elements by one. Now l[2] == 2, but index 2 has already been visited by the iterator, so the next print-call prints l[3] which is 1.
This proves that mutations to l have effect on subsequent iterations.
Instead of looping over the array, you can use np.where to find the indices of the elements meeting some condition.
Then, to delete the selected element (or elements), you can use np.delete, passing the source array and a list of indices.
Then save the result, e.g. under the same variable.
To add an element, you can use np.append or np.insert (for details see the NumPy documentation).
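A short sketch of that approach. The condition (dropping even values) is picked only for illustration and is not the condition from the question.

```python
import numpy as np

arr = np.array([5, 4, 3, 2, 1])

# Indices of the elements to drop (illustrative condition: even values).
to_drop = np.where(arr % 2 == 0)[0]

# np.delete returns a new array; the original is left untouched.
kept = np.delete(arr, to_drop)

# np.append and np.insert also return new arrays.
extended = np.append(kept, 0)      # add 0 at the end
patched = np.insert(kept, 1, 9)    # insert 9 before index 1

print(kept.tolist())      # [5, 3, 1]
print(extended.tolist())  # [5, 3, 1, 0]
print(patched.tolist())   # [5, 9, 3, 1]
```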
I have also found a SO post concerning how to loop and delete over an array.
See Deleting elements from numpy array with iteration

Remove a specific item from a list by iterating [duplicate]

I want to remove specific items from a Python list by iterating over it and checking whether each item meets some requirement. At first I was working on a list of custom class objects, but I ran into errors, so I experimented with a plain list of ints, only to find a strange result!
Here is a code excerpt:
>>> a=[1,2,3,4,5]
>>> for i in a:
...     a.remove(i)
...
>>> a
[2, 4]
I expected a to be [] after the loop, but it turned out to be [2, 4], and I wonder why. I found a related question, Remove items from a list while iterating, but it only gives a solution for removing specific items and does not explain the mechanism. I really want to know the reason for this strange result.
Let's try printing the values of i and a while iterating.
for i in a:
    print(i, a)
    a.remove(i)
The output will be:
1 [1, 2, 3, 4, 5]
3 [2, 3, 4, 5]
5 [2, 4, 5]
When you remove an element, the indices of the later elements change: the value at index 1 was 2 earlier, and now it is 3. That becomes the next value of i.
So you've exposed a little of the python implementation. Basically, an array powers the python list, and it is simply incrementing the array index by 1. So it'll go to a[0], a[1], a[2]... and check before each iteration that it's not gonna run off the end of the array. As you remove the first item '1' from the list, '2' moves to a[0]. The array now looks like [2,3,4,5]. The iterator is now pointing to a[1], so now '3' gets removed. Finally, skipping over '4', '5' gets removed.
a = [1,2,3,4,5]
for i in a:
    print("a:%s i=%s"%(a,i))
    a.remove(i)
print("final a: %s"%a)
Gives the output
a:[1, 2, 3, 4, 5] i=1
a:[2, 3, 4, 5] i=3
a:[2, 4, 5] i=5
final a: [2, 4]
Here's the real nuts and bolts if you're interested.
https://github.com/python/cpython/blob/master/Objects/listobject.c#L2832
The reason your solution doesn't work as expected is that the iterator doesn't behave the way you'd expect when the list is modified. If your example were rewritten this way, you'd get the result you expect.
>>> a=[1,2,3,4,5]
>>> b = a[:]
>>> for i in b:
...     a.remove(i)
...
>>> a
[]
This is because 'b' is a copy of 'a', so 'b' doesn't get modified when 'a' does. This means the iterator doesn't have the data structure modified underneath it.
A more efficient solution is:
a = [1,2,3,4,5]
a = [i for i in a if not condition(i)]
This list comprehension copies as it goes through the source list, and only bothers to copy the elements that aren't being removed.
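One caveat: a = [...] rebinds the name a to a new list, so any other reference to the old list still sees the old contents. If that matters, slice assignment replaces the contents in place. A small sketch, with condition as a placeholder predicate assumed purely for illustration:

```python
def condition(i):
    # Placeholder predicate, assumed for illustration: remove odd numbers.
    return i % 2 == 1

a = [1, 2, 3, 4, 5]
alias = a   # a second name for the same list object

# Slice assignment swaps the contents of the existing list object.
a[:] = [i for i in a if not condition(i)]

print(a)           # [2, 4]
print(alias is a)  # True
```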

printing items in a list represented by bit list

I'm having trouble writing a Python function that takes a bit list as input and prints the items represented by that bit list.
The question is about Knapsack, and it is a relatively simple and straightforward one, but I'm new to Python.
The items can be named in a list [1,2,3,4], which corresponds to Type 1, Type 2, Type 3 and so on, but we won't need the "type". The problem is that I represented the solution as a bit list [0,1,1,1], where 0 means not taken and 1 means taken. In other words, the item of type 1 is not taken but the rest are, as represented in the bit list I wrote.
Now we are required to write a Python function that takes the bit list as input and prints the corresponding items. In this case I need the function to print [2,3,4], leaving out the 1 since its bit is 0. Any help on this? It is a 2-mark question but I still couldn't figure it out.
def printItems(l):
    for x in range(len(l)):
        if x == 0:
            return False
        elif x == 1:
            return l
I tried something like that, but it is wrong. Any help would be much appreciated.
You can do this with the zip function, which takes two iterables and returns their elements in pairs:
for bit_item, item in zip(bit_list, item_list):
    if bit_item:
        print(item)
Or if you need a list rather than printing them, you can use a list comprehension:
[item for bit_item, item in zip(bit_list, item_list) if bit_item]
You can use itertools.compress for a quick solution:
>>> import itertools
>>> list(itertools.compress(itertools.count(1), [0, 1, 1, 1]))
[2, 3, 4]
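If the items have real names rather than positions, compress can take the item list directly as its first argument. A small sketch; the item names and the helper selected_items are invented for illustration:

```python
from itertools import compress

def selected_items(item_list, bit_list):
    """Return the items whose matching bit is truthy."""
    return list(compress(item_list, bit_list))

print(selected_items(["rope", "tent", "stove", "map"], [0, 1, 1, 1]))
# ['tent', 'stove', 'map']
```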
The reason your solution doesn't work is because you are using return in your function, where you need to use print, and make sure you are iterating over your list correctly. In this case, enumerate simplifies things, but there are many similar approaches that would work:
>>> def print_items(l):
...     for i, b in enumerate(l, 1):
...         if b:
...             print(i)
...
>>> print_items([0,1,1,1])
2
3
4
You may do it using list comprehension with enumerate() as:
>>> my_list = [0, 1, 1, 1]
>>> taken_list = [i for i, item in enumerate(my_list, 1) if item]  # enumerate starts at 0 by default; the 1 makes it start at 1
>>> taken_list
[2, 3, 4]
Alternatively, if you do not want to use an extra built-in function and prefer to write your own, you can modify your code as:
def printItems(l):
    new_list = []
    for x in range(len(l)):
        if l[x] == 1:
            new_list.append(x+1)  # "x+1" because indexes start at `0` and you need the position
    return new_list
Sample run:
>>> printItems([0, 1, 1, 1])
[2, 3, 4]

Python Variable assignment in a for loop

I understand that in Python, regular C++-style variable assignment is replaced by references to objects, i.e.
a=[1,2,3]
b=a
a.append(4)
print(b) #gives [1,2,3,4]
print(a) #gives [1,2,3,4]
but I'm still confused about why an analogous situation with basic types, e.g. integers, works differently:
a=1
b=a
a+=1
print(b) # gives 1
print(a) # gives 2
But wait, it gets even more confusing when we consider loops!
li=[1,2,3]
for x in li:
    x+=1
print(li) #gives [1,2,3]
Which is what I expected, but what happens if we do:
a,b,c=1,2,3
li=[a,b,c]
for x in li:
    x+=1
print(li) #gives [1,2,3]
Maybe my question should be how to loop over a list of integers and change them without map(), as I need an if statement in there. The only thing I can come up with, short of using
for x in range(len(li)):
    # do stuff to li[x]
is packaging the integers in one-element lists. But there must be a better way.
Well, you need to think in terms of mutable and immutable types.
A list is mutable.
An integer is immutable, which means you end up referring to a new object if you change it. When a+=1 is executed, a is bound to a new object, but b still refers to the original one.
a=[1,2,3]
b=a
a.append(4)
print(b) #[1,2,3,4]
print(a) #[1,2,3,4]
Here you are modifying the list. The list content changes, but the list identity remains.
a=1
b=a
a+=1
This, however, is a reassignment. You assign a different object to a.
Note that if you did a += [4] in the 1st example, you would have seen the same result. This comes from the fact that a += something is the same as a = a.__iadd__(something), with a fallback to a = a.__add__(something) if __iadd__() doesn't exist.
The difference is that __iadd__() tries to do its job "inplace", by modifying the object it works on and returning it. So a refers to the same as before. This only works with mutable objects such as lists.
On immutable objects such as ints __add__() is called. It returns a different object, which leads to a pointing to another object than before. There is no other choice, as ints are immutable.
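The difference can be checked with the is operator, which compares object identity. A quick sketch (the variable names are arbitrary):

```python
a = [1, 2, 3]
b = a
a += [4]                 # list.__iadd__ extends the existing list in place
same_after_iadd = a is b # True: both names still refer to one list

a = a + [5]              # list.__add__ builds a new list, then a is rebound
same_after_add = a is b  # False: b still refers to the old list

n = 1
m = n
n += 1                   # int has no __iadd__, so this falls back to n = n + 1
same_int = n is m        # False: n now names a different object

print(same_after_iadd, same_after_add, same_int)  # True False False
print(b, m)                                       # [1, 2, 3, 4] 1
```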
a,b,c=1,2,3
li=[a,b,c]
for x in li:
    x+=1
print(li) #[1,2,3]
Here x += 1 means the same as x = x + 1. It changes where x refers to, but not the list contents.
Maybe my question should be how to loop over a list of integers and change them without map() as i need a if statement in there.
for i, x in enumerate(li):
    li[i] = x + 1
assigns to every list position the old value + 1.
The important thing here is the variable names. They really are just keys to a dictionary. They are resolved at runtime, depending on the current scope.
Let's have a look what names you access in your code. The locals function helps us: It shows the names in the local scope (and their value). Here's your code, with some debugging output:
a = [1, 2, 3] # a is bound
print(locals())
for x in a: # a is read, and for each iteration x is bound
    x = x + 3 # x is read, the value increased and then bound to x again
    print(locals())
print(locals())
print(x)
(Note I expanded x += 3 to x = x + 3 to increase visibility for the name accesses - read and write.)
First, you bind the list [1, 2, 3] to the name a. Then, you iterate over the list. During each iteration, the value is bound to the name x in the current scope. Your assignment then assigns another value to x.
Here's the output
{'a': [1, 2, 3]}
{'a': [1, 2, 3], 'x': 4}
{'a': [1, 2, 3], 'x': 5}
{'a': [1, 2, 3], 'x': 6}
{'a': [1, 2, 3], 'x': 6}
6
At no point do you assign to an element of a, the list, so you never modify it.
To fix your problem, I'd use the enumerate function to get the index along with the value and then access the list using the name a to change it.
for idx, x in enumerate(a):
    a[idx] = x + 3
print(a)
Output:
[4, 5, 6]
Note you might want to wrap those examples in a function, to avoid the cluttered global namespace.
For more about scopes, read the chapter in the Python tutorial. To further investigate that, use the globals function to see the names of the global namespace. (Not to be confused with the global keyword, note the missing 's'.)
Have fun!
For a C++ head, it is easiest to think of every Python variable as a pointer to an object. When you write a = [1, 2, 3], you essentially write List * a = new List(1, 2, 3). When you write b = a, you essentially write List * b = a.
But when you take out actual items from the lists, these items happen to be numbers. Numbers are immutable; holding a pointer to an immutable object is about as good as holding this object by value.
So your for x in a: x += 1 is essentially
for (int x, it = a.iterator(); it->hasMore(); x=it.next()) {
    x+=1; // the generated sum is silently discarded
}
which obviously has no effect.
If list elements were mutable objects you could mutate them exactly the way you wrote. See:
a = [[1], [2], [3]] # list of lists
for x in a: # x iterates over each sub-list
    x.append(10)
print(a) # prints [[1, 10], [2, 10], [3, 10]]
But unless you have a compelling reason (e.g. a list of millions of objects under heavy memory load) you are better off making a copy of the list, applying a transformation and optionally a filter. This is easily done with a list comprehension:
a = [1, 2, 3, 0]
b = [n + 1 for n in a] # [2, 3, 4, 1]
c = [n * 10 for n in a if n < 3] # [10, 20, 0]
Either that, or you can write an explicit loop that creates another list:
source = [1, 2, 3]
target = []
for n in source:
    n1 = <many lines of code involving n>
    target.append(n1)
Your question has multiple parts, so it's going to be hard for one answer to cover all of them. glglgl has done a great job on most of it, but your final question is still unexplained:
Maybe my question should be how to loop over a list of integers and change them without map() as i need a if statement in there
"I need an if statement in there" doesn't mean you can't use map.
First, if you want the if to select which values you want to keep, map has a good friend named filter that does exactly that. For example, to keep only the odd numbers, but add one to each of them, you could do this:
>>> a = [1, 2, 3, 4, 5]
>>> b = []
>>> for x in a:
...     if x%2:
...         b.append(x+1)
Or just this:
>>> b = map(lambda x: x+1, filter(lambda x: x%2, a))
If, on the other hand, you want the if to control the expression itself—e.g., to add 1 to the odd numbers but leave the even ones alone, you can use an if expression the same way you'd use an if statement:
>>> for x in a:
...     if x%2:
...         b.append(x+1)
...     else:
...         b.append(x)
>>> b = map(lambda x: x+1 if x%2 else x, a)
Second, comprehensions are basically equivalent to map and filter, but with expressions instead of functions. If your expression would just be "call this function", then use map or filter. If your function would just be a lambda to "evaluate this expression", then use a comprehension. The above two examples get more readable this way:
>>> b = [x+1 for x in a if x%2]
>>> b = [x+1 if x%2 else x for x in a]
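One thing to watch for on Python 3: map and filter return lazy iterators rather than lists, so the map/filter versions need a list() call to give the same result as the comprehensions. A quick side-by-side check (variable names are arbitrary):

```python
a = [1, 2, 3, 4, 5]

# Keep only the odd numbers and add one to each.
kept_mapped = list(map(lambda x: x + 1, filter(lambda x: x % 2, a)))
kept_comp = [x + 1 for x in a if x % 2]

# Add one to the odd numbers and leave the even ones alone.
all_mapped = list(map(lambda x: x + 1 if x % 2 else x, a))
all_comp = [x + 1 if x % 2 else x for x in a]

print(kept_mapped, kept_comp)  # [2, 4, 6] [2, 4, 6]
print(all_mapped, all_comp)    # [2, 2, 4, 4, 6] [2, 2, 4, 4, 6]
```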
You can do something like this: li = [x+1 for x in li]

having trouble understanding this code

I just started learning recursion and I have an assignment to write a program that tells the nesting depth of a list. Well, I browsed around and found working code to do this, but I'm still having trouble understanding how it works. Here's the code:
def depth(L) :
    nesting = []
    for c in L:
        if type(c) == type(nesting) :
            nesting.append(depth(c))
    if len(nesting) > 0:
        return 1 + max(nesting)
    return 1
So naturally, I start to get confused at the line with the append that calls recursion. Does anyone have a simple way of explaining what's going on here? I'm not sure what is actually being appended, and going through it with test cases in my head isn't helping. Thanks!
edit: sorry if the formatting is poor, I typed this from my phone
Let me show it to you the easy way, change the code like this:
(### are the new lines I added to your code so you can watch what is happening there)
def depth(L) :
    nesting = []
    for c in L:
        if type(c) == type(nesting) :
            print('nesting before append', nesting) ###
            nesting.append(depth(c))
            print('nesting after append', nesting) ###
    if len(nesting) > 0:
        return 1 + max(nesting)
    return 1
Now let's make a list with a depth of three:
l=[[1,2,3],[1,2,[4]],'asdfg']
You can see our list has 3 elements. The first is a list, the second is a list that contains another list, and the last one is a string. You can clearly see the depth of this list is 3 (i.e. there are 2 lists nested together in the second element of the main list).
Lets run this code:
>>> depth(l)
nesting before append []
nesting after append [1]
nesting before append [1]
nesting before append []
nesting after append [1]
nesting after append [1, 2]
3
Piece of cake! For every element that is itself a list, the function calls itself on that element and appends the result (the depth of that sub-list) to nesting. A sub-list with no lists inside it contributes 1; a sub-list that contains further lists contributes 1 plus the deepest of those. If the element is a string, it is skipped.
At the end, the function returns 1 plus the maximum number in nesting, that is, one level for the current list plus the depth of its deepest sub-list. In our case the second element has depth 2, so the result is 1 + 2 = 3, as we expected.
If you still have problem getting it, try to add more print statements or other variables to the function and watch them carefully and eventually you'll get it.
So this is a function that takes a list and calculates, as you put it, its nesting depth. nesting is a list, so what if type(c) == type(nesting) is saying is: if the item in list L is a list, run the function again on that item and append the result. Each recursive call does the same test until it reaches a list with no nested lists, which returns 1 because every list has a depth of at least 1. Every level above that returns 1 + the largest depth found among its nested lists.
Please tell me if any of this is unclear
Let's start with a couple of examples.
First, let's consider a list with only one level of depth. For Example, [1, 2, 3].
In the above list, the code starts with a call to depth() with L = [1, 2, 3]. It makes an empty list nesting. Iterates over all the elements of L i.e 1, 2, 3 and does not find a single element which passes the test type(c) == type(nesting). The check that len(nesting) > 0 fails and the code returns a 1, which is the depth of the list.
Next, let's take an example with a depth of 2, i.e [[1, 2], 3]. The function depth() is called with L = [[1, 2], 3] and an empty list nesting is created. The loop iterates over the 2 elements of L i.e [1, 2] , 3 and since type([1, 2]) == type(nesting), nesting.append(depth(c)) is called. Similar to the previous example, depth(c) i.e depth([1, 2]) returns a 1 and nesting now becomes [1]. After the execution of the loop, the code evaluates the test len(nesting) > 0 which results in True and 1 + max(nesting) which is 1 + 1 = 2 is returned.
Similarly, the code follows for the depth 3 and so on.
Hope this was helpful.
This algorithm visits the nested lists and adds one for each level of recursion. The call chain is like this:
depth([1, 2, [3, [4, 5], 6], 7]) =
    1 + depth([3, [4, 5], 6]) = 3
        1 + depth([4, 5]) = 2
            1
Since depth([4,5]) never enters the if type(c) == type(nesting) condition because no element is a list, it returns 1 from the outer return, which is the base case.
In the case where, for a given depth, you have more than one nested list, e.g. [1, [2, 3], [4, [5, 6]]], the max depths of both [2, 3] and [4, [5, 6]] are appended during a depth call, and the larger of them is used by the inner return.
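For comparison, here is a more compact variant of the same function. It is a sketch, not the code from the question: it swaps the type(c) == type(nesting) test for isinstance(c, list), which also accepts subclasses of list, and builds nesting with a comprehension.

```python
def depth(L):
    """Return the nesting depth of a list; a flat list has depth 1."""
    # Depth of every element that is itself a list.
    nesting = [depth(c) for c in L if isinstance(c, list)]
    if nesting:
        return 1 + max(nesting)   # one level for L plus its deepest sub-list
    return 1                      # base case: no nested lists

print(depth([1, 2, 3]))                      # 1
print(depth([[1, 2], 3]))                    # 2
print(depth([1, 2, [3, [4, 5], 6], 7]))      # 3
print(depth([1, [2, 3], [4, [5, 6]]]))       # 3
```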
