Delete many elements of list (python) - python

I have a list L.
I can delete element i by doing:
del L[i]
But what if I have a set of non contiguous indexes to delete?
I=set([i1, i2, i3,...])
Doing:
for i in I:
    del L[i]
Won't work.
Any ideas?

Deleting by reverse-iterating over a list to preserve the iterator is a common solution to this problem. But another solution is to change this into a different problem. Instead of deleting items from the list using some criteria (in your case, the index exists in a list of indexes to be deleted), create a new list that leaves out the offending items.
L[:] = [ item for i,item in enumerate(L) if i not in I ]
For that matter, where did you come up with the indexes in I in the first place? You could combine the logic of getting the indexes to be removed and building the new list. Assuming this is a list of objects and you only want to keep those that pass an isValid test:
L[:] = [ item for item in L if item.isValid() ]
This is much more straightforward than:
I = set()
for i in range(len(L)):
    if not L[i].isValid():
        I.add(i)
for i in sorted(I, reverse=True):
    del L[i]
For the most part, I turn any question about "how to delete from a list the items that I don't want" into "how to create a new list containing just the items I want".
EDITED: changed "L = ..." to "L[:] = ..." per Alex Martelli's answer to this question.
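The difference between the two forms only shows up when some other name refers to the same list. A minimal sketch of that, with made-up names (`data`, `alias`, `to_drop`) that are not part of the original answer:

```python
# Hypothetical data: 'alias' refers to the same list object as 'data'.
data = ['a', 'b', 'c', 'd', 'e']
alias = data
to_drop = {1, 3}

# Slice assignment replaces the contents of the existing list object,
# so every reference to that object sees the change.
data[:] = [item for i, item in enumerate(data) if i not in to_drop]
in_place_alias = list(alias)

# Plain assignment only rebinds the name 'data' to a brand-new list;
# 'alias' still points at the old object.
data = [item for item in data if item != 'a']
rebound_alias = list(alias)
```

After the slice assignment both names show the filtered contents; after the plain assignment only `data` does.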

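for i in I:
    del L[i]
won't work, because each deletion shifts every later element down by one, so the remaining indexes in I no longer point at the items you meant (and you don't control the order in which a set yields its elements). This usually shows up as the wrong items being deleted, or as an IndexError once an index runs past the end of the shortened list.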
It's always safe to delete items from the list in the reverse order of their indices. The easiest way to do this is with sorted():
for i in sorted(I, reverse=True):
    del L[i]
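To see the shifting effect concretely, here is a small sketch that compares reverse-order deletion with ascending-order deletion on the same data. The helper name `delete_indexes` and the sample letters are illustrative assumptions:

```python
def delete_indexes(items, indexes):
    """Delete the given positions from items in place, highest index first."""
    for i in sorted(indexes, reverse=True):
        del items[i]
    return items

# Intent: remove 'b', 'c' and 'd' (positions 1, 2, 3).
letters = list("abcdef")
delete_indexes(letters, {1, 2, 3})

# Ascending order goes wrong: once 'b' is gone, 'c' slides down to
# index 1, so index 2 now holds 'd', and so on.
wrong = list("abcdef")
for i in [1, 2, 3]:
    del wrong[i]
```

Deleting from the highest index down never disturbs the positions that are still waiting to be deleted.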

You can use numpy.delete as follows:
import numpy as np
a = ['a', 'l', 3.14, 42, 'u']
I = [1, 3, 4]
np.delete(a, I).tolist()
# Returns: ['a', '3.14'] (NumPy coerces the mixed list to a string array, so 3.14 comes back as the string '3.14')
If you don't mind ending up with a numpy array at the end, you can leave out the .tolist(). I haven't benchmarked it, but numpy's array operations run in compiled code, so for large inputs this may scale better than a pure-Python loop. Bear in mind that converting the list to an array and back has a cost of its own.

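If your original list data can safely be turned into a set (i.e. all values are unique and order doesn't need to be maintained), you could also use set operations. Note that a set difference removes by value, so here I has to hold the values to drop rather than their indexes: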
Lset = set(L)
newset = Lset.difference(I)
You could also maybe do something with a Bag/Multiset, though it probably isn't worth the effort. Paul McGuire's second listcomp solution is certainly best for most cases.

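L = [ item for item in L if L.index(item) not in I ]
# Only correct when every value in L is unique: L.index(item) returns the position of the first match.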
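One caveat with looking positions up through `index()` is duplicate values. The sketch below compares it with `enumerate()` on made-up data (`values` and `drop` are illustrative names):

```python
values = ['x', 'y', 'x', 'z']
drop = {2}   # intent: remove only the second 'x'

# index() always reports the first match, so both 'x' entries map to
# position 0 and nothing is removed.
via_index = [item for item in values if values.index(item) not in drop]

# enumerate() pairs each element with its actual position.
via_enumerate = [item for i, item in enumerate(values) if i not in drop]
```

The `index()` version also rescans the list for every element, so it is quadratic in the list length.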

Related

Python List Indexing or Appending?

What is the best way to add values to a List in terms of processing time, memory usage and just generally what is the best programming option.
list = []
for i in anotherArray:
    list.append(i)
or
list = range(len(anotherArray))
for i in list:
    list[i] = anotherArray[i]
Considering that anotherArray is for example an array of Tuples. (This is just a simple example)
It really depends on your use case. There is no generic answer here as it depends on what you are trying to do.
In your example, it looks like you are just trying to create a copy of the array, in which case the best way to do this would be to use copy:
from copy import copy
list = copy(anotherArray)
If you are trying to transform the array into another array you should use list comprehension.
list = [i[0] for i in anotherArray] # get the first item from tuples in anotherArray
If you are trying to use both indexes and objects, you should use enumerate:
for i, j in enumerate(list):
which is much better than your second example.
You can also use generators, lambdas, maps, filters, etc. All of these possibilities exist because each is "better" for different reasons. The writers of Python are pretty big on "one right way", so trust me, if there was one generic way which was always better, that is the only way that would exist in Python.
Edit: Ran some results of performance for tuple swap and here are the results:
comprehension: 2.682028295999771
enumerate: 5.359116118001111
for in append: 4.177091988000029
for in indexes: 4.612594166001145
As you can tell, comprehension is usually the best bet. Using enumerate is expensive.
Here is the code for the above test:
from timeit import timeit
some_array = [(i, 'a', True) for i in range(0,100000)]
def use_comprehension():
    return [(b, a, i) for i, a, b in some_array]
def use_enumerate():
    lst = []
    for j, k in enumerate(some_array):
        i, a, b = k
        lst.append((b, a, i))
    return lst
def use_for_in_with_append():
    lst = []
    for i in some_array:
        i, a, b = i
        lst.append((b, a, i))
    return lst
def use_for_in_with_indexes():
    lst = [None] * len(some_array)
    for j in range(len(some_array)):
        i, a, b = some_array[j]
        lst[j] = (b, a, i)
    return lst
print('comprehension:', timeit(use_comprehension, number=200))
print('enumerate:', timeit(use_enumerate, number=200))
print('for in append:', timeit(use_for_in_with_append, number=200))
print('for in indexes:', timeit(use_for_in_with_indexes, number=200))
Edit2:
It was pointed out to me that the OP just wanted to know the difference between "indexing" and "appending". Really, those are used for two different use cases as well. Indexing is for replacing objects, whereas appending is for adding. However, in a case where the list starts empty, appending will always be better because the indexing has the overhead of creating the list initially. You can see from the results above that indexing is slightly slower, mostly because you have to create the first list.
Best way is list comprehension :
my_list=[i for i in anotherArray]
But depending on your problem you can use a generator expression (it is more efficient than a list comprehension when you just want to loop over your items and don't need list features such as indexing or len):
my_list=(i for i in anotherArray)
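The trade-off between the two forms can be sketched as follows; `source` is a made-up stand-in for anotherArray:

```python
source = [(1, 'a'), (2, 'b'), (3, 'c')]   # stand-in for anotherArray

as_list = [pair for pair in source]       # built eagerly, reusable
as_gen = (pair for pair in source)        # built lazily, single use

first_pass = [n for n, _ in as_gen]       # consumes the generator
second_pass = [n for n, _ in as_gen]      # nothing left to yield

list_length = len(as_list)                # lists support len() and indexing
```

A generator expression avoids building the whole list in memory, but it can be walked only once and has no length or index access.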
I would actually say the best is a combination of index loops and value loops with enumeration:
for i, j in enumerate(list): # i is the index, j is the value, can't go wrong

I can't delete a list of used numbers from another list of lists [duplicate]

Given a list of numbers:
L = [1, 2, 3, 4, 5]
How do I delete an element, let's say 3, from the list while I iterate over it?
I tried the following code but it didn't do it:
for el in L:
    if el == 3:
        del el
Best is usually to proceed constructively -- build the new list of the items you want instead of removing those you don't. E.g.:
L[:] = [el for el in L if el != 3]
The list comprehension builds the desired list, and the assignment to the "whole-list slice", L[:], ensures you're not just rebinding a name but fully replacing the contents, so the effect is identical to the "removals" you wanted to perform. This is also fast.
If you absolutely, at any cost, must do deletions instead, a subtle approach might work:
>>> ndel = 0
>>> for i, el in enumerate(list(L)):
...     if el==3:
...         del L[i-ndel]
...         ndel += 1
nowhere as elegant, clean, simple, or well-performing as the listcomp approach, but it does do the job (though its correctness is not obvious at first glance and in fact I had it wrong before an edit!-). "at any cost" applies here;-).
Looping on indices in lieu of items is another inferior but workable approach for the "must do deletions" case -- but remember to reverse the indices in this case...:
for i in reversed(range(len(L))):
    if L[i] == 3: del L[i]
indeed this was a primary use case for reversed back when we were debating on whether to add that built-in -- reversed(range(... isn't trivial to obtain without reversed, and looping on the list in reversed order is sometimes useful. The alternative
for i in range(len(L) - 1, -1, -1):
is really easy to get wrong;-).
Still, the listcomp I recommended at the start of this answer looks better and better as alternatives are examined, doesn't it?-).
for el in L:
    if el == 2:
        del L[el]  # removes the 3 only because it happens to sit at index 2 of L = [1, 2, 3, 4, 5]
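Two details in the snippets above are easy to miss: `del el` only unbinds the loop variable, and `del L[el]` uses a value as if it were a position. A small sketch of both, where the second list `M` is made up for illustration:

```python
L = [1, 2, 3, 4, 5]
for el in L:
    if el == 3:
        del el          # unbinds the name only; the list is untouched
after_del_name = list(L)

# Using a value as an index deletes whatever sits at that position.
M = [2, 10, 20, 30]
for el in M:
    if el == 2:
        del M[el]       # deletes M[2], which is 20, not the 2 itself
```

Neither form removes an element by value; `L.remove(3)` or the list comprehension shown earlier does.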

python list.iteritems replacement

I've got a list in which some items shall be moved into a separate list (by a comparator function). Those elements are pure dicts. The question is how should I iterate over such list.
When iterating the simplest way, for element in mylist, I don't know the index of the element. There's no .iteritems() method for lists, which could be useful here. So I've tried to use for index in range(len(mylist)):, which [1] seems over-complicated for Python and [2] does not satisfy me, since range(len()) is calculated once at the beginning and if I remove an element from the list during iteration, I'll get IndexError: list index out of range.
Finally, my question is - how should I iterate over a python list, to be able to remove elements from the list (using a comparator function and put them in another list)?
You can use enumerate function and make a temporary copy of the list:
for i, value in enumerate(old_list[:]):
    # i == index
    # value == dictionary
    # you can safely remove from old_list because we are iterating over copy
Creating a new list really isn't much of a problem compared to removing items from the old one. Similarly, iterating twice is a very minor performance hit, probably swamped by other factors. Unless you have a very good reason to do otherwise, backed by profiling your code, I'd recommend iterating twice and building two new lists:
from itertools import filterfalse  # Python 2 spelled these itertools.ifilter and itertools.ifilterfalse
l1 = list(filter(condition, l))
l2 = list(filterfalse(condition, l))
You can slice-assign the contents of one of the new lists into the original if you want:
l[:] = l1
If you're absolutely sure you want a 1-pass solution, and you're absolutely sure you want to modify the original list in place instead of creating a copy, the following avoids quadratic performance hits from popping from the middle of a list:
j = 0
l2 = []
for i in range(len(l)):
    if condition(l[i]):
        l[j] = l[i]
        j += 1
    else:
        l2.append(l[i])
del l[j:]
We move each element of the list directly to its final position without wasting time shifting elements that don't really need to be shifted. We could use for item in l if we wanted, and it'd probably be a bit faster, but when the algorithm involves modifying the thing we're iterating over, I prefer the explicit index.
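Wrapped in a function, the one-pass approach might look like the sketch below. The function name `partition_in_place` and the sample records are assumptions made for illustration:

```python
def partition_in_place(items, condition):
    """Keep elements passing condition in items; return the rest as a new list."""
    rejected = []
    write = 0                      # next slot for an element we keep
    for read in range(len(items)):
        item = items[read]
        if condition(item):
            items[write] = item    # move the kept element to its final slot
            write += 1
        else:
            rejected.append(item)
    del items[write:]              # drop the leftover tail in one step
    return rejected

records = [{'id': 1, 'ok': True}, {'id': 2, 'ok': False}, {'id': 3, 'ok': True}]
moved = partition_in_place(records, lambda d: d['ok'])
```

The list is shortened once at the end, so no element is shifted more than once.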
I prefer not to touch the original list and do as @Martol1ni suggests, but one way to do it in place and not be affected by the removal of elements would be to iterate backwards:
for i in reversed(range(len(mylist))):
    # do the filtering...
That way a removal only shifts the indices of elements that you have already tested.
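Filled in, the backwards loop could look like this sketch. The comparator `should_move` and the sample dicts are hypothetical:

```python
mylist = [{'n': 1}, {'n': 2}, {'n': 3}, {'n': 4}]
removed = []

def should_move(d):
    """Hypothetical comparator: move dicts whose 'n' is even."""
    return d['n'] % 2 == 0

# Walk from the last index to the first, so popping an element never
# changes the position of anything not yet visited.
for i in reversed(range(len(mylist))):
    if should_move(mylist[i]):
        removed.append(mylist.pop(i))

removed.reverse()   # restore the original relative order
```

Each `pop(i)` still shifts the tail of the list, so this stays quadratic in the worst case.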
Try the filter command, and you can override the original list with it too if you don't need it.
def cmp(i):  # Comparator function returning a boolean for a given item
    ...
# mylist is the initial list
mylist = filter(cmp, mylist)
In Python 3, mylist is now a lazy iterator over the suitable items (in Python 2, filter returns a list directly). You can use list(mylist) if you need to use it more than once.
Haven't tried this yet but.. i'll give it a quick shot:
new_list = [old.pop(i) for i, x in reversed(list(enumerate(old))) if comparator(x)]
You can do this, might be one line too much though.
new_list1 = [x for x in old_list if your_comparator(x)]
new_list2 = [x for x in old_list if x not in new_list1]

Move every element from list l to list p

I want to transfer every element from one list to another with ascending order. This is my code:
l=[10,1,2,3,4,5,6,7,8,9]
p=[]
for x in l :
    p.append(min(l))
    l.remove(min(l))
print(p)
print(l)
But it returns this result:
[1, 2, 3, 4, 5]
[10, 6, 7, 8, 9]
I don't know why it stop at half way, please help me on it...Thanks!
Just do this:
p = sorted(l)
#l = [] if you /really/ want it to be empty after the operation
The reason you're getting wonky behavior is that you're changing the size of the sequence l as you iterate over it, leading you to skip elements.
If you wanted to fix your method, you would do:
for x in l[:]:
l[:] creates a copy of l, which you can safely iterate over while you do things to the original l.
try this:
p = []
while len(l) > 0:
    p.append(min(l))
    l.remove(min(l))
Using while instead of for means you are no longer iterating over the list while you modify it; the loop just re-checks the length each time round.
If you want to retain the original unsorted array, use a copy of l.
Check out this answer for more information. https://stackoverflow.com/a/1352908/1418255
Gee, I hope your lists are short. Otherwise, all that min()'ing will yield a slow piece of code.
If your lists are long, you might try a heap (EG heapq, in the standard library) or tree (EG: https://pypi.python.org/pypi/red-black-tree-mod) or treap (EG: https://pypi.python.org/pypi/treap/).
For what you're doing, I'm guessing a heapq would be nice, unless there's a part of your story you've left out, like needing to be able to access arbitrary values and not just the min repeatedly.
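A minimal sketch of the heapq idea, assuming you only ever need the smallest remaining value (`values` and `ordered` are illustrative names):

```python
import heapq

values = [10, 1, 2, 3, 4, 5, 6, 7, 8, 9]

heap = list(values)     # work on a copy so the original stays intact
heapq.heapify(heap)     # O(n) rearrangement into a min-heap

ordered = []
while heap:
    # heappop removes and returns the smallest item in O(log n).
    ordered.append(heapq.heappop(heap))
```

This replaces the repeated min() and remove() scans, each of which is linear in the list length.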

