Why does list.remove() not behave as one might expect?

from pprint import pprint
sites = [['a', 'b', 'c'], ['d', 'e', 'f'], [1, 2, 3]]
pprint(sites)
for site in sites:
    sites.remove(site)
pprint(sites)
Output:
[['a', 'b', 'c'], ['d', 'e', 'f'], [1, 2, 3]]
[['d', 'e', 'f']]
Why is the result not None, or an empty list []?

It's because you're modifying a list as you're iterating over it. You should never do that.
For something like this, you should make a copy of the list and iterate over that.
for site in sites[:]:
    sites.remove(site)

Because resizing a collection while iterating over it is the Python equivalent to undefined behaviour in C and C++. You may get an exception or subtly wrong behaviour. Just don't do it. In this particular case, what likely happens under the hood is:
1. The iterator starts with index 0, stores that it is at index 0, and gives you the item stored at that index.
2. You remove the item at index 0, and everything after it is moved to the left by one to fill the hole.
3. The iterator is asked for the next item, and faithfully increments the index it's at by one, stores 1 as the new index, and gives you the item at that index. But because the remove operation moved the items, the item at index 1 is the item that started out at index 2 (the last item).
4. You delete that.
5. The iterator is asked for the next item, but signals end of iteration because the next index (2) is out of range (the list now has one item, so the only valid index is 0).
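The steps above can be reproduced with an explicit index. This is only a model of the behaviour described, not the interpreter's actual implementation; the function name remove_while_iterating is made up for illustration.

```python
def remove_while_iterating(items):
    """Mimic `for x in items: items.remove(x)` with an explicit index."""
    items = list(items)        # work on a copy so the caller's list is untouched
    visited = []
    index = 0                  # the position the list iterator would remember
    while index < len(items):  # iteration ends once the index is out of range
        current = items[index]
        visited.append(current)
        items.remove(current)  # everything after `current` shifts left by one
        index += 1             # ...but the remembered index still advances
    return visited, items


visited, remaining = remove_while_iterating(
    [['a', 'b', 'c'], ['d', 'e', 'f'], [1, 2, 3]]
)
print(visited)    # [['a', 'b', 'c'], [1, 2, 3]]
print(remaining)  # [['d', 'e', 'f']]
```

The middle item is never visited, which matches the output in the question.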

Normally I would expect the iterator to bail out because the underlying list was modified. With a dictionary, at least, this would happen.
Why is the ['d', 'e', 'f'] item not removed? I can only guess: the iterator probably has an internal counter (or is even based only on the "fallback iteration protocol" with __getitem__).
That is, the first item yielded is sites[0], i.e. ['a', 'b', 'c']. This is then removed from the list.
The second one is sites[1], which is [1, 2, 3] because the indexes have changed. This is removed as well.
The third would be sites[2], but as this would be an IndexError, the iterator stops.

Related

How to print an item in python and remove that item after printing? [duplicate]

This question already has answers here:
How to remove items from a list while iterating?
I am trying to write simple code to print an item of a list and remove it after printing:
list = ['a', 'b', 'c']
for i in list:
    print(i)
    list.remove(i)
But the output is weird:
a
c
Why is the output that way?

When you iterate over a list, you get the items in order of their indices (item 0, item 1, item 2, etc). When you remove an item from a list, the indices of all the items after that shift by one.
In the first iteration, the list is ['a', 'b', 'c'], and i is list[0].
During the first iteration, you remove 'a'.
In the second iteration, the list is ['b', 'c'], and i is list[1]. You get 'c' instead of 'b' because 'c' is now at index 1.
If you want to remove each item as you iterate, the better approach would be to iterate in a while loop as long as the list contains items, and pop as you print:
my_list = ['a', 'b', 'c']
while my_list:
    print(my_list.pop(0))
In many cases, it's better to do the thing you want to do in the iteration, and then clear the list:
for i in my_list:
    print(i)
my_list.clear()

You're removing items while iterating over the list. If you want to alter the list while reading it, you probably want to loop on its length instead:
list = ['a', 'b', 'c']
while len(list):
    # pop(0) does what you want: it reads the element at index 0 and removes it from the list
    print(list.pop(0))
Output:
a
b
c

Explanation
The output seems strange because you are removing items while iterating over the list.
The problem is that Python iterates by index.
Consider this example:
lst = [32, 43, 2]
for x in lst:
    lst.pop(0)
    print(x, lst)
Outputs
32 [43, 2]
2 [2]
Here you can see the problem. The first iteration took the first item, which was then removed; all OK. The problem starts with the second iteration.
The iterator thinks the next index is 1 (the 2nd element), but the item it should visit is now at index 0 because the first element was removed.
You can also fix it by iterating over the list in reverse, since removing the current item then does not shift the indices of the items still to be visited.
Also see this question for more information.
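A minimal sketch of the reverse-iteration idea. It walks the indices from the end and deletes by index, so it does not depend on the elements being unique. The helper name drain_in_reverse is invented for this example.

```python
def drain_in_reverse(items):
    """Visit and delete every item, walking from the last index to the first."""
    seen = []
    for i in reversed(range(len(items))):
        seen.append(items[i])
        del items[i]  # only the tail is removed, so earlier indices stay valid
    return seen


letters = ['a', 'b', 'c']
seen = drain_in_reverse(letters)
print(seen)     # ['c', 'b', 'a']
print(letters)  # []
```

Note that the items are visited in reverse order, which may or may not matter for your use case.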
Possible solutions
You should iterate over a copy instead:
for x in mylist.copy():
    mylist.remove(x)
You could also use a while loop and list.pop:
while mylist:
    print(mylist.pop(0))
Advice
Before leaving, I would like to give some advice.
Don't use the names of builtins (such as list) as variable names; it causes confusion and can break later code that needs the builtin.
I would advise clearing the list after the loop using the list.clear() method.
Use the list.pop method if you want to read a value and remove it at the same time.

python list comprehension with cls

I encountered a snippet of code like the following:
array = ['a', 'b', 'c']
ids = [array.index(cls.lower()) for cls in array]
I'm confused about two points:
What does [... for cls in array] mean? Since cls is a reserved keyword for class, why not just use [... for s in array]?
Why bother to write something complicated like this instead of just [i for i in range(len(array))]?
I believe this code is written by someone more experienced with python than me, and I believe he must have some reason for doing so...

cls is not a reserved word for class. That would be a very poor choice of name by the language designer. Many programmers may use it by convention but it is no more reserved than the parameter name self.
If you use distinct upper and lower case characters in the list, you will see the difference:
array = ['a', 'b', 'c', 'B','A','c']
ids = [array.index(cls.lower()) for cls in array]
print(ids)
[0, 1, 2, 1, 0, 2]
The value at position 3 is 1 instead of 3 because the first occurrence of a lowercase 'b' is at index 1. Similarly, the value at the last position is 2 instead of 5 because the first 'c' is at index 2.
This list comprehension requires that the array always contain a lowercase instance of every uppercase letter. For example, ['a', 'B', 'c'] would make it raise a ValueError, because 'b' is not in the list. Hopefully there are other safeguards in the rest of the program to ensure that this requirement is always met.
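To see the failure mode concretely, here is a small sketch that runs the same comprehension on a list lacking the lowercase counterpart of 'B'. The try/except wrapper is added only for demonstration and is not part of the original snippet.

```python
array = ['a', 'B', 'c']
try:
    ids = [array.index(cls.lower()) for cls in array]
except ValueError as exc:
    # list.index raises ValueError when the value is not present
    ids = None
    print(f"lookup failed: {exc}")

print(ids)  # None
```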
A safer, and more efficient way to write this would be to build a dictionary of character positions before going through the array to get indexes. This would make the time complexity O(n) instead of O(n^2). It could also help make the process more robust.
array = ['a', 'b', 'c', 'B','A','c','Z']
firstchar = {c:-i for i,c in enumerate(array[::-1],1-len(array))}
ids = [firstchar.get(c.lower()) for c in array]
print(ids)
[0, 1, 2, 1, 0, 2, None]
The firstchar dictionary contains the first index in array containing a given letter. It is built by going backward through the array so that the smallest index remains when there are multiple occurrences of the same letter.
{'Z': 6, 'c': 2, 'A': 4, 'B': 3, 'b': 1, 'a': 0}
Then, going through the array to form ids, each character finds the corresponding index in O(1) time by using the dictionary.
Using the .get() method allows the list comprehension to survive an upper case letter without a corresponding lowercase value in the list. In this example it returns None but it could also be made to return the letter's index or the index of the first uppercase instance.
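As a sketch of the "return the letter's own index" variant mentioned above, the dictionary can also be built in a single forward pass with dict.setdefault, which keeps the first index seen for each character. The name first_index is chosen for this example only.

```python
array = ['a', 'b', 'c', 'B', 'A', 'c', 'Z']

# Forward pass: setdefault only stores an index the first time a character is seen.
first_index = {}
for i, c in enumerate(array):
    first_index.setdefault(c, i)

# Fall back to the character's own first index when no lowercase twin exists.
ids = [first_index.get(c.lower(), first_index[c]) for c in array]
print(ids)  # [0, 1, 2, 1, 0, 2, 6]
```

This stays O(n) overall and avoids the reversed enumeration, at the cost of an explicit loop.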

Some developers might be experienced but still write terrible code and just "skate on by".
Having said that, your suggested code for question #2 would give a different result if the list contained any element twice. The original code returns the first index at which each element occurs, whereas yours gives each individual item's position. The results would also differ if the array elements weren't all lowercase.
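A short illustration of that difference, using a made-up list that contains both a repeated element and an uppercase one:

```python
array = ['a', 'b', 'a', 'B']

by_position = [i for i in range(len(array))]
by_first_match = [array.index(cls.lower()) for cls in array]

print(by_position)     # [0, 1, 2, 3]
print(by_first_match)  # [0, 1, 0, 1]
```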

Finding all the elements in a list between two elements (not using index, and with wrap around)

I'm trying to figure out a way to find all the elements that appear between two list elements (inclusive) - but to do it without reference to position, and instead with reference to the elements themselves. It's easier to explain with code:
I have a list like this:
['a','b','c','d','e']
And I want a function that would take two arguments corresponding to elements, e.g. f('a','d'), and return the following:
['a','b','c','d']
I'd also like it to wrap around, e.g. f('d','b'):
['d','e','a','b']
I'm not sure how to go about coding this. One hacky way I've thought of is duplicating the list in question (['a','b','c','d','e','a','b','c','d','e']) and then looping through it and flagging when the first element appears and when the last element does and then discarding the rest - but it seems like there would be a better way. Any suggestions?

def foo(a, b):
    s, e = [a.index(x) for x in b]
    if s <= e:
        return a[s:e+1]
    else:
        return a[s:] + a[:e+1]
print(foo(['a','b','c','d','e'], ['a', 'd'])) # --> ['a', 'b', 'c', 'd']
print(foo(['a','b','c','d','e'], ['d', 'b'])) # --> ['d', 'e', 'a', 'b']

The following obviously needs error handling, as indicated in the comments. Also note that the index() method only returns the index of the first occurrence; you have not specified how you want to handle duplicate elements in the list.
def f(mylist, elem1, elem2):
    posn_first = mylist.index(elem1)   # what if it's not in the list?
    posn_second = mylist.index(elem2)  # ditto
    if posn_first <= posn_second:
        return mylist[posn_first:posn_second+1]
    else:
        return mylist[posn_first:] + mylist[:posn_second+1]
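One possible way to add the missing error handling is sketched below. It assumes that a missing element should be reported as a ValueError with a clearer message. The function name between and the wording of the message are illustrative choices, and duplicates are still resolved to the first occurrence.

```python
def between(items, first, last):
    """Return the elements from `first` to `last` inclusive, wrapping around."""
    try:
        start = items.index(first)
        end = items.index(last)
    except ValueError:
        # list.index raises ValueError for a missing element; re-raise with context
        raise ValueError(
            f"both {first!r} and {last!r} must be in the list"
        ) from None
    if start <= end:
        return items[start:end + 1]
    return items[start:] + items[:end + 1]


letters = ['a', 'b', 'c', 'd', 'e']
print(between(letters, 'a', 'd'))  # ['a', 'b', 'c', 'd']
print(between(letters, 'd', 'b'))  # ['d', 'e', 'a', 'b']
```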

This would be a simple approach, given you always want to use the first appearance of each element in the list:
def get_wrapped_values(input_list, start_element, end_element):
    start = input_list.index(start_element)
    end = input_list.index(end_element) + 1
    return input_list[start:end] if start < end else input_list[start:] + input_list[:end]

Python: is there a temporary pop method?

Is there a method like pop that temporarily removes, say an element of a list, without permanently changing the original list?
One that would do the following:
list = [1,2,3,4]
newpop(list, 0) returns [2,3,4]
list is unchanged
I come from R, where I would just do c(1,2,3,4)[-4] if I wanted to temporarily remove the last element of some list, so forgive me if I'm thinking backwards here.
I know I could write a function like the following:
def newpop(list, index):
    return list[:index] + list[index+1:]
but it seems overly complex? Any tips would be appreciated, I'm trying to learn to think more Python and less R.

I am probably taking the "temporary" bit way too literally, but you could define a context manager to pop the item from the list and insert it back in when you are done working with the list:
from contextlib import contextmanager
@contextmanager
def out(lst, idx):
    x = lst.pop(idx)    # enter 'with'
    yield               # in `with`, no need for `as` here
    lst.insert(idx, x)  # leave 'with'
lst = list("abcdef")
with out(lst, 2):
    print(lst)
    # ['a', 'b', 'd', 'e', 'f']
print(lst)
# ['a', 'b', 'c', 'd', 'e', 'f']
Note: This does not create a copy of the list. All changes you do to the list during with will reflect in the original, up to the point that inserting the element back into the list might fail if the index is no longer valid.
Also note that popping the element and then putting it back into the list will have complexity up to O(n) depending on the position, so from a performance point of view this does not really make sense either, except if you want to save memory on copying the list.
More akin to your newpop function, and probably more practical, you could use a list comprehension with enumerate to create a copy of the list without the offending position. This does not create the two temporary slices and might also be a bit more readable and less prone to off-by-one mistakes.
def without(lst, idx):
    return [x for i, x in enumerate(lst) if i != idx]
print(without(lst, 2))
# ['a', 'b', 'd', 'e', 'f']
You could also change this to return a generator expression by simply changing the [...] to (...), i.e. return (x for ...). This will then be sort of a read-only "view" on the list without creating an actual copy.
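A sketch of that generator variant follows. The name without_lazy is hypothetical. Keep in mind that a generator can be consumed only once, and that it reads from the original list lazily, so later changes to the list would show through.

```python
def without_lazy(lst, idx):
    """Lazily yield every element of lst except the one at position idx."""
    return (x for i, x in enumerate(lst) if i != idx)


letters = list("abcdef")
view = without_lazy(letters, 2)
first_pass = list(view)
second_pass = list(view)  # the generator is already exhausted here

print(first_pass)   # ['a', 'b', 'd', 'e', 'f']
print(second_pass)  # []
print(letters)      # ['a', 'b', 'c', 'd', 'e', 'f']
```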

Retrieving all but one value

I'm looking to retrieve all but one value from a list:
ll = ['a','b','c']
nob = [x for x in ll if x !='b']
Is there any simpler, more pythonic way to do this, with sets perhaps?

Given that the element is unique in the list, you can use list.index:
i = ll.index('b')
l = ll[:i] + ll[i+1:]
Another possibility is to use list.remove:
ll.remove('b')  # notice that ll itself is changed here
Whatever you do, you'll always have to step through the list and compare elements, which gets slow for long lists. However, list.index stops at the first matching element, so you can work with that index alone and avoid comparing the remainder of the list (the slices still copy the remaining elements).
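The "unique" condition matters. The sketch below uses a made-up list with a repeated 'b' to show how the index-based approach differs from the list comprehension in the question.

```python
ll = ['a', 'b', 'c', 'b']

# The comprehension drops every matching element.
without_all = [x for x in ll if x != 'b']

# index + slicing drops only the first match and leaves ll untouched.
i = ll.index('b')
without_first = ll[:i] + ll[i + 1:]

print(without_all)    # ['a', 'c']
print(without_first)  # ['a', 'c', 'b']
print(ll)             # ['a', 'b', 'c', 'b']
```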

list_ = ['a', 'b', 'c']
list_.pop(1)
You can also use .pop and pass the index of the element you want to remove from the list. The call returns 'b', and when you print the list you will see that it now stores ['a', 'c'].
