Python list.remove(x) 2.7.5 - python

I have two lists shown below. I'm trying to use the list.remove(x) function to remove the files that are in both list1 and list2, but one of my lists has file extensions while the other does not! What should be my approach!?
List1 = ['myfile.v', 'myfile2.sv', 'myfile3.vhd', 'etcfile.v', 'randfile.sv']
List2 = ['myfile', 'myfile2', 'myfile3']
#This is in short what I would like to do, but the file extensions throw off
#the tool!
for x in List2:
    List1.remove(x)
Thanks!

It's really dangerous to loop over a list as you are removing items from it. You'll nearly always end up skipping over some elements.
>>> L = [1, 1, 2, 2, 3, 3]
>>> for x in L:
...     print x
...     if x == 2:
...         L.remove(2)
... 
1
1
2
3
3
It's also inefficient, since each .remove() call is O(n).
It's better to create a new list and bind it back to list1:
import os
list1 = ['myfile.v', 'myfile2.sv', 'myfile3.vhd', 'etcfile.v', 'randfile.sv']
list2 = ['myfile', 'myfile2', 'myfile3']
set2 = set(list2) # Use a set for O(1) lookups
list1 = [x for x in list1 if os.path.splitext(x)[0] not in set2]
or, for an "in-place" version that keeps the same list object:
list1[:] = [x for x in list1 if os.path.splitext(x)[0] not in set2]
For a truly in-place version, as discussed in the comments, that uses no extra O(n) memory and runs in O(n) time:
>>> list1 = ['myfile.v', 'myfile2.sv', 'myfile3.vhd', 'etcfile.v', 'randfile.sv']
>>> p = 0
>>> for x in list1:
...     if os.path.splitext(x)[0] not in set2:
...         list1[p] = x
...         p += 1
...
>>> del list1[p:]
>>> list1
['etcfile.v', 'randfile.sv']
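The filtering step above can be wrapped in a small helper. This is only a sketch: the name remove_by_stem is made up for illustration, and it assumes that only the last extension should be stripped, which is what os.path.splitext does.

```python
import os

def remove_by_stem(filenames, stems):
    """Return the filenames whose extension-less name is not in stems."""
    stem_set = set(stems)  # a set gives O(1) membership tests
    return [name for name in filenames
            if os.path.splitext(name)[0] not in stem_set]

list1 = ['myfile.v', 'myfile2.sv', 'myfile3.vhd', 'etcfile.v', 'randfile.sv']
list2 = ['myfile', 'myfile2', 'myfile3']
list1[:] = remove_by_stem(list1, list2)  # slice assignment keeps the same list object
print(list1)
```

Note that a name such as 'a.tar.gz' has the stem 'a.tar' under this rule, not 'a'.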

For the sake of completeness: if you want to stick with list.remove(element) because it is easy for others to read, you can try the following. Assume a function f that returns True for values that pass your tests and should be kept.
This will NOT reliably work when more than one value has to be removed from the list L:
def rem_vals(L):
    for x in L:
        if not f(x):
            L.remove(x)
Instead, we can use recursion to restart the scan after every removal:
def rem_vals_rec(L):
    for x in L:
        if not f(x):
            L.remove(x)
            rem_vals_rec(L)
Not the fastest (and the recursion depth grows with the number of removed items), but the easiest to read.
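To see why the plain loop skips elements, here is a small sketch. The function is_even is a hypothetical stand-in for f, and rem_vals_naive and rem_vals are illustrative names; the second one uses slice assignment rather than list.remove.

```python
def is_even(x):
    # Hypothetical stand-in for the test function f from the answer above
    return x % 2 == 0

def rem_vals_naive(L, f):
    # Buggy on purpose: removing while iterating shifts later items left,
    # so the element right after each removal is never visited
    for x in L:
        if not f(x):
            L.remove(x)

def rem_vals(L, f):
    # Build the filtered list first, then overwrite the contents in place
    L[:] = [x for x in L if f(x)]

a = [1, 1, 2, 3, 3, 4]
rem_vals_naive(a, is_even)
print(a)  # some odd numbers survive

b = [1, 1, 2, 3, 3, 4]
rem_vals(b, is_even)
print(b)
```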

Related

Easier way to check if an item from one list of tuples doesn't exist in another list of tuples in python

I have two lists of tuples, say,
list1 = [('item1',),('item2',),('item3',), ('item4',)] # Contains just one item per tuple
list2 = [('item1', 'd',),('item2', 'a',),('item3', 'f',)] # Contains multiple items per tuple
Expected output: 'item4' # Item that doesn't exist in list2
As shown in above example I want to check which item in tuples in list 1 does not exist in first index of tuples in list 2. What is the easiest way to do this without running two for loops?
Assuming your tuple structure is exactly as shown above, this would work:
tuple(set(x[0] for x in list1) - set(x[0] for x in list2))
or per #don't talk just code, better as set comprehensions:
tuple({x[0] for x in list1} - {x[0] for x in list2})
result:
('item4',)
This gives you {'item4'}:
next(zip(*list1)) - dict(list2).keys()
The next(zip(*list1)) gives you the tuple ('item1', 'item2', 'item3', 'item4').
The dict(list2).keys() gives you dict_keys(['item1', 'item2', 'item3']), which happily offers you set operations like that set difference.
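The two building blocks can be inspected separately. This sketch assumes Python 3 and relies on every tuple in list2 having exactly two elements, since dict() treats each one as a key/value pair.

```python
list1 = [('item1',), ('item2',), ('item3',), ('item4',)]
list2 = [('item1', 'd'), ('item2', 'a'), ('item3', 'f')]

firsts = next(zip(*list1))  # zip(*...) transposes; its first row holds every first element
keys = dict(list2).keys()   # each 2-tuple becomes a key/value pair

print(firsts)
print(sorted(keys))
print(set(firsts).difference(keys))
```

If the tuples in list2 had three or more elements, dict(list2) would raise a ValueError, so one of the comprehension-based answers would be the safer choice in that case.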
This is the only way I can think of doing it, not sure if it helps though. I removed the commas in the items in list1 because I don't see why they are there and it affects the code.
list1 = [('item1'),('item2'),('item3'), ('item4')] # Plain strings: without the trailing comma these are not tuples
list2 = [('item1', 'd',),('item2', 'a',),('item3', 'f',)] # Contains multiple items per tuple
not_in_tuple = []
OutputTuple = [a for a, b in list2] # First element of every tuple in list2
for i in list1:
    if i not in OutputTuple:
        not_in_tuple.append(i)
for i in not_in_tuple:
    print(i)
You don't really have a choice but to loop over the two lists. One efficient way could be to first construct a set of the first elements of list2:
items = {e[0] for e in list2}
list3 = list(filter(lambda x:x[0] not in items, list1))
Output:
>>> list3
[('item4',)]
Try set.difference:
>>> set(next(zip(*list1))).difference(dict(list2))
{'item4'}
>>>
Or even better:
>>> set(list1) ^ {x[:1] for x in list2}
{('item4',)}
>>>
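One caveat about the ^ version: it is a symmetric difference, so it also reports entries that occur only in list2. It matches the plain difference here only because every first element of list2 also appears in list1. A small sketch, where 'item5' is a made-up extra entry:

```python
list1 = [('item1',), ('item2',), ('item3',), ('item4',)]
# 'item5' is a made-up extra entry that appears only in list2
list2 = [('item1', 'd'), ('item2', 'a'), ('item3', 'f'), ('item5', 'z')]

firsts2 = {x[:1] for x in list2}

sym = set(list1) ^ firsts2   # items in exactly one of the two sets
diff = set(list1) - firsts2  # items of list1 that are missing from list2

print(sorted(sym))
print(sorted(diff))
```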
that is a difference operation for sets:
set1 = set(j[0] for j in list1)
set2 = set(j[0] for j in list2)
result = set1.difference(set2)
output:
{'item4'}
for i in list1:
    a=i[0]
    for j in list2:
        b=j[0]
        if a==b:
            break
    else:
        print(a)

Remove list from list of lists if criterion is met

I am looking to program a small bit of code in Python. I have a list of lists called "keep_list", from which I want to remove any sublist containing a specific value found in another list called "deleteNODE".
for example:
deleteNODE=[0,4]
keep_list=[[0,1,2],[0,2,3],[1,2,3],[4,5,6]]
After running the code the result should be (removing any list containing 0 or 4):
keep_list=[[1,2,3]]
Is there any efficient way of doing this?
I did it like this:
[x for x in keep_list if not set(x).intersection(deleteNODE)]
Since I thought the other answers were better, I also ran timeit on all the 3 answers and surprisingly this one was faster.
Python 3.8.2
>>> import timeit
>>>
>>> deleteNODE=[0,4]
>>> keep_list=[[0,1,2],[0,2,3],[1,2,3],[4,5,6]]
>>>
>>>
>>> def v1(keep, delete):
...     return [l for l in keep if not any(n in l for n in delete)]
...
>>> def v2(keep, delete):
...     return [i for i in keep if len(set(i)&set(delete)) == 0]
...
>>> def v3(keep, delete):
...     return [x for x in keep if not set(x).intersection(delete)]
...
>>>
>>> timeit.timeit(lambda: v1(keep_list, deleteNODE), number=3000000)
7.2224646
>>> timeit.timeit(lambda: v2(keep_list, deleteNODE), number=3000000)
7.1723587
>>> timeit.timeit(lambda: v3(keep_list, deleteNODE), number=3000000)
5.640403499999998
I'm no Python expert so can anyone understand why mine was faster since it appears to be creating a new set for every evaluation?
You can solve this using a list comprehension to cycle through each list within the larger list and by using sets. The & operator between sets returns the intersection (the elements common between both sets). Therefore, if the intersection between the list you're evaluating and deleteNODE is not zero, that means there is a common element and it gets excluded.
keep_list = [i for i in keep_list if len(set(i)&set(deleteNODE)) == 0]
This could be done using list comprehension
deleteNODE=[0,4]
keep_list=[[0,1,2],[0,2,3],[1,2,3],[4,5,6]]
...
keep_list = [l for l in keep_list if not any(n in l for n in deleteNODE)]
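Another variant, offered as a sketch rather than a benchmarked recommendation: build the set of values to delete once and test each sublist with set.isdisjoint, which accepts any iterable. The name delete_set is an assumption for illustration, and the values must be hashable.

```python
deleteNODE = [0, 4]
keep_list = [[0, 1, 2], [0, 2, 3], [1, 2, 3], [4, 5, 6]]

delete_set = set(deleteNODE)  # built once, outside the comprehension
keep_list = [sub for sub in keep_list if delete_set.isdisjoint(sub)]
print(keep_list)
```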

Finding indices of items from a list in another list even if they repeat

This answer works very well for finding indices of items from a list in another list, but the problem with it is, it only gives them once. However, I would like my list of indices to have the same length as the searched for list.
Here is an example:
thelist = ['A','B','C','D','E'] # the list whose indices I want
Mylist = ['B','C','B','E'] # my list of values that I am searching in the other list
ilist = [i for i, x in enumerate(thelist) if any(thing in x for thing in Mylist)]
With this solution, ilist = [1,2,4] but what I want is ilist = [1,2,1,4] so that len(ilist) = len(Mylist). It leaves out the index that has already been found, but if my items repeat in the list, it will not give me the duplicates.
thelist = ['A','B','C','D','E']
Mylist = ['B','C','B','E']
ilist = [thelist.index(x) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
Basically, "for each element of Mylist, get its position in thelist."
This assumes that every element in Mylist exists in thelist. If the element occurs in thelist more than once, it takes the first location.
UPDATE
For substrings:
thelist = ['A','boB','C','D','E']
Mylist = ['B','C','B','E']
ilist = [next(i for i, y in enumerate(thelist) if x in y) for x in Mylist]
print(ilist) # [1, 2, 1, 4]
UPDATE 2
Here's a version that does substrings in the other direction using the example in the comments below:
thelist = ['A','B','C','D','E']
Mylist = ['Boo','Cup','Bee','Eerr','Cool','Aah']
ilist = [next(i for i, y in enumerate(thelist) if y in x) for x in Mylist]
print(ilist) # [1, 2, 1, 4, 2, 0]
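Both substring versions raise StopIteration when an item matches nothing. A sketch of one way around that is to give next() a default; using None as the marker is an assumption here, as is the made-up value 'Zed'.

```python
thelist = ['A', 'B', 'C', 'D', 'E']
Mylist = ['Boo', 'Cup', 'Zed']  # 'Zed' is a made-up value that matches nothing

# The second argument to next() is returned when the generator is exhausted
ilist = [next((i for i, y in enumerate(thelist) if y in x), None) for x in Mylist]
print(ilist)
```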
Below code would work
ilist = [thelist.index(i) for i in Mylist]
Make a reverse lookup from strings to indices:
string_indices = {c: i for i, c in enumerate(thelist)}
ilist = [string_indices[c] for c in Mylist]
This avoids the quadratic behaviour of repeated .index() lookups.
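Two details differ from list.index: if a value repeats in thelist, the dict comprehension keeps its last index, and a value missing from thelist raises KeyError. The sketch below keeps the first index instead and returns None for missing values; the repeated 'B' and the value 'Z' are made up for illustration.

```python
thelist = ['A', 'B', 'C', 'B', 'E']  # 'B' repeats here, unlike the original example
Mylist = ['B', 'C', 'B', 'E', 'Z']   # 'Z' is a made-up value missing from thelist

string_indices = {}
for i, c in enumerate(thelist):
    string_indices.setdefault(c, i)  # keep the first index seen, like list.index

ilist = [string_indices.get(c) for c in Mylist]  # None marks a missing value
print(ilist)
```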
If your data can be implicitly converted to an ndarray, as your example implies, you could use numpy_indexed (disclaimer: I am its author) to perform this kind of operation in an efficient (fully vectorized and N log N) manner.
import numpy_indexed as npi
ilist = npi.indices(thelist, Mylist)
npi.indices is essentially the array-generalization of list.index. Also, it has a kwarg to give you control over how to deal with missing values and such.

Check if there's an item inside a list and if True modify it

list1 = [1,2,3]
def ex(example_list):
    for number in example_list:
        if(number == 2):
            number = 3
ex(list1)
print(list1)
I need to check if there is the number 2 inside of the list1 and if it's inside of it, I want to modify it to 3.
But if I run the command, number would be 3, but list1 would remain [1,2,3] and not [1,3,3]
You can use enumerate() to get the index of the number you need to change:
list1 = [1,2,3]
def ex(example_list):
    for idx, number in enumerate(example_list):
        if(number == 2):
            example_list[idx] = 3
ex(list1)
print(list1)
The loop variable number is just a name bound to each item in turn; assigning to it rebinds the name and leaves the list untouched.
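A minimal sketch of the difference between rebinding the loop name and assigning into the list; the variable before exists only to capture the intermediate state.

```python
list1 = [1, 2, 3]

for number in list1:
    if number == 2:
        number = 3      # rebinds the local name only

before = list(list1)    # snapshot after the first loop
print(before)

for idx, number in enumerate(list1):
    if number == 2:
        list1[idx] = 3  # assigns into the list slot

print(list1)
```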
The logic for checking and replacing can be done altogether in a list comprehension using a ternary operator since you're not actually using the index:
list2 = [3 if num==2 else num for num in list1]
References:
List comprehensions
Conditional expressions
In order to modify a list item, you need to know which slot it is in. The .index() method of lists can tell you.
list1 = [1,2,3]
i = list1.index(2)
list1[i] = 3
Now what happens if the list does not contain 2? index() will throw an exception, which normally will terminate your program. You can catch that error, however, and do nothing:
list1 = [1,2,3]
try:
    i = list1.index(2)
except ValueError:
    pass
else: # no error occurred
    list1[i] = 3
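The same idea can be packaged as a helper. This is a sketch under the assumption that only the first occurrence should change; replace_first is a made-up name.

```python
def replace_first(items, old, new):
    """Replace the first occurrence of old with new; return True if replaced."""
    try:
        i = items.index(old)
    except ValueError:
        return False    # old is not in the list, nothing to do
    items[i] = new
    return True

list1 = [1, 2, 3]
print(replace_first(list1, 2, 3), list1)
print(replace_first(list1, 9, 0), list1)
```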
So... The problem you're having is that, since number contains a basic type (an int), modifying number doesn't modify the reference inside the list. Basically, you need to change the item within the list by using the index of the item to change:
list1 = [1,2,3]
def ex(example_list):
    for i, number in enumerate(example_list):
        if(number == 2):
            example_list[i] = 3 # <-- This is the important part
ex(list1)
print(list1)
Or just using the index (might be clearer):
list1 = [1,2,3]
def ex(example_list):
    for i in range(len(example_list)):
        if(example_list[i] == 2):
            example_list[i] = 3
ex(list1)
print(list1)
l.index(n) will return the index at which n can be found in list l or throw a ValueError if it's not in there.
This is useful if you want to replace the first instance of n with something, as seen below:
>>> l = [1,2,3,4]
>>> # Don't forget to wrap this in try/except in case it fails!
>>> l[l.index(2)] = 3
>>> l
[1, 3, 3, 4]
If you need to replace all 2's with 3's, just iterate through, adding elements. If the element isn't 2, it's fine. Otherwise, make it 3.
l = [e if e != 2 else 3 for e in l]
Usage:
>>> l = [1,2,3,4]
>>> l = [e if e != 2 else 3 for e in l]
>>> l
[1, 3, 3, 4]
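Inside a function like the asker's ex(), a plain assignment such as l = [...] would only rebind the local name. A sketch that keeps the comprehension but still changes the caller's list uses slice assignment:

```python
def ex(example_list):
    # Slice assignment replaces the contents of the caller's list object
    example_list[:] = [3 if e == 2 else e for e in example_list]

list1 = [1, 2, 3, 2]
ex(list1)
print(list1)
```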

Comparing two lists and only printing the differences? (XORing two lists)

I'm trying to create a function that takes in 2 lists and returns the list that only has the differences of the two lists.
Example:
a = [1,2,5,7,9]
b = [1,2,4,8,9]
The result should print [4,5,7,8]
The function so far:
def xor(list1, list2):
    list3=list1+list2
    for i in range(0, len(list3)):
        x=list3[i]
        y=i
        while y>0 and x<list3[y-1]:
            list3[y]=list3[y-1]
            y=y-1
        list3[y]=x
    last=list3[-1]
    for i in range(len(list3) -2, -1, -1):
        if last==list3[i]:
            del list3[i]
        else:
            last=list3[i]
    return list3
print xor([1,2,5,7,8],[1,2,4,8,9])
The first for loop sorts it, second one removes the duplicates. Problem is the result is
[1,2,4,5,7,8,9] not [4,5,7,8], so it doesn't completely remove the duplicates? What can I add to do this.
I can't use any special modules, .sort, set or anything, just loops basically.
You basically want to add an element to your new list if it is present in one and not present in another. Here is a compact loop which can do it. For each element in the two lists (concatenate them with list1+list2), we add element if it is not present in one of them:
[a for a in list1+list2 if (a not in list1) or (a not in list2)]
You can easily transform it into a more unPythonic code with explicit looping through elements as you have now, but honestly I don't see a point (not that it matters):
def xor(list1, list2):
    outputlist = []
    list3 = list1 + list2
    for i in range(0, len(list3)):
        if ((list3[i] not in list1) or (list3[i] not in list2)) and (list3[i] not in outputlist):
            outputlist[len(outputlist):] = [list3[i]]
    return outputlist
Using a set is better:
>>> a = [1,2,5,7,9]
>>> b = [1,2,4,8,9]
>>> set(a).symmetric_difference(b)
{4, 5, 7, 8}
Thanks to #DSM, a better expression is:
>>> set(a)^set(b)
These two statements are the same. But the latter is clearer.
Update: sorry, I did not see the last requirement: cannot use set. As far as I see, the solution provided by #sashkello is the best.
Note: This is really unpythonic and should only be used as a homework answer :)
After you have sorted both lists, you can find duplicates by doing the following:
1) Place iterators at the start of A and B
2) If Aitr is greater than Bitr, advance Bitr after placing Bitr's value in the return list
3) Else if Bitr is greater than Aitr, advance Aitr after placing Aitr's value in the return list
4) Else you have found a duplicate, advance Aitr and Bitr
5) When one list runs out, place the remaining values of the other list in the return list
This code works assuming you've got sorted lists. It works in linear time, rather than quadratic like many of the other solutions given.
def diff(sl0, sl1):
    i0, i1 = 0, 0
    while i0 < len(sl0) and i1 < len(sl1):
        if sl0[i0] == sl1[i1]:
            i0 += 1
            i1 += 1
        elif sl0[i0] < sl1[i1]:
            yield sl0[i0]
            i0 += 1
        else:
            yield sl1[i1]
            i1 += 1
    for i in xrange(i0, len(sl0)):
        yield sl0[i]
    for i in xrange(i1, len(sl1)):
        yield sl1[i]
print list(diff([1,2,5,7,9], [1,2,4,8,9]))
Try this,
a = [1,2,5,7,9]
b = [1,2,4,8,9]
print set(a).symmetric_difference(set(b))
Simple, but not particularly efficient :)
>>> a = [1,2,5,7,9]
>>> b = [1,2,4,8,9]
>>> [i for i in a+b if (a+b).count(i)==1]
[5, 7, 4, 8]
Or with "just loops"
>>> res = []
>>> for i in a+b:
...     c = 0
...     for j in a+b:
...         if i==j:
...             c += 1
...     if c == 1:
...         res.append(i)
...
>>> res
[5, 7, 4, 8]
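If a dict counts as "just loops" (an assumption, since the question only rules out set, .sort and special modules), the counting can be done in one pass instead of rescanning a+b for every element. Like the count-based versions above, this sketch assumes no value repeats within a single list.

```python
def xor(list1, list2):
    counts = {}
    for item in list1 + list2:
        counts[item] = counts.get(item, 0) + 1  # tally every value once
    result = []
    for item in list1 + list2:
        if counts[item] == 1:                   # seen in only one of the lists
            result.append(item)
    return result

print(xor([1, 2, 5, 7, 9], [1, 2, 4, 8, 9]))
```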
