Removing sublists from a list of lists - python

I'm trying to find the fastest way to solve this problem. Say I have a list of lists:
myList = [[1,2,3,4,5],[2,3],[4,5,6,7],[1,2,3],[3,7]]
I'd like to be able to remove all the lists that are sublists of one of the other lists, for example I'd like the following output:
myList = [[1,2,3,4,5],[4,5,6,7],[3,7]]
Where the lists [2,3] and [1,2,3] were removed because they are completely contained in one of the other lists, while [3,7] was not removed because no single list contained all those elements.
I'm not restricted to any one data structure, if a list of lists or a set is easier to work with, that would be fine too.
The best I could come up with was something like this but it doesn't really work because I'm trying to remove from a list while iterating over it. I tried to copy it into a new list but somehow I couldn't get it working right.
for outter in range(0,len(myList)):
    outterSet = set(myList[outter])
    for inner in range(outter,len(myList)):
        innerSet = set(myList[inner])
        if innerSet.issubset(outterSet):
            myList.remove(innerSet)
Thanks.

The key to solving your problem is a list of sets:
lists = [[1,2,3,4,5],[2,3],[4,5,6,7],[1,2,3],[3,7]]
sets = [set(l) for l in lists]
new_list = [l for l,s in zip(lists, sets) if not any(s < other for other in sets)]
This converts the inner lists to sets, compares each set to every other set to see if it is contained within it (using the < operator) and, if it is not strictly contained within another set, adds the original list to the new list of lists.
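As a minimal sketch, the approach above can be wrapped in a function and checked against the example from the question. The name remove_sublists is hypothetical and not part of the original answer. Because < tests for a strict subset, two inner lists with identical elements would both be kept, and the order of elements inside each list is ignored.

```python
def remove_sublists(lists):
    """Return the inner lists that are not strictly contained in another inner list."""
    sets = [set(l) for l in lists]
    # s < other is True only when s is a proper (strict) subset of other,
    # so a list is never removed by being compared with itself.
    return [l for l, s in zip(lists, sets)
            if not any(s < other for other in sets)]


my_list = [[1, 2, 3, 4, 5], [2, 3], [4, 5, 6, 7], [1, 2, 3], [3, 7]]
result = remove_sublists(my_list)
print(result)  # [[1, 2, 3, 4, 5], [4, 5, 6, 7], [3, 7]]
```

Every set is compared with every other set, so the running time grows quadratically with the number of inner lists.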

Related

Find matching elements of two unordered Python lists of different sizes

I'm getting this error: index out of range, in if largerList[j] == smallerList[i]. I'm working on an assignment about binary search trees, I put the trees into lists and I'm just trying to compare the two lists:
def matchList(largerList, smallerList) :
    matches = []
    for i in smallerList:
        for j in largerList:
            if largerList[j] == smallerList[i] :
                matches[i] = smallerList[i]
    return matches
I'm assuming nested for loops iterate over all elements in each loop, and since smallerList is the smaller list, it shouldn't make largerList go out of bounds. The inner for-loop should iterate over the entire larger list, comparing each value to each element of the smaller list. Why doesn't it work?
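The index error comes from using i and j as indexes: for i in smallerList yields the elements of the list, not their positions, so largerList[j] and smallerList[i] use element values as indexes. Compare i and j directly instead.
You also can't set a list value with matches[i] if that index does not exist in matches.
Try appending instead:
Change matches[i] = smallerList[i] to matches.append(i). Don't assign the result back to matches, because append modifies the list in place and returns None.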
Trying to find matching elements in lists like this is rather inefficient. One thing you could improve to make it arguably more pythonic is to use a list comprehension:
matches = [i for i in largerList if i in smallerList]
But then the more mathematically sensible approach still would be to realise that we have two sets of elements and we want to find an intersection of two sets so we can write something like:
matches = set(largerList).intersection(smallerList)
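The two suggestions above do not return quite the same thing, which the following sketch illustrates. The sample values are made up for illustration. The comprehension keeps the order and any duplicates from the larger list, while the set intersection returns an unordered collection of unique values.

```python
# Hypothetical sample data, not taken from the question.
larger_list = [8, 3, 10, 1, 6, 3]
smaller_list = [3, 6, 7]

# Converting the smaller list to a set first makes each "in" test fast.
smaller_set = set(smaller_list)
matches_list = [x for x in larger_list if x in smaller_set]

# intersection() accepts any iterable, so smaller_list can be passed directly.
matches_set = set(larger_list).intersection(smaller_list)

print(matches_list)           # [3, 6, 3]
print(matches_set == {3, 6})  # True
```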

Sorting out unique elements from a list to a set

I was writing a function to save the unique values from a list "list_accepted_car" to a set "unique_accepted_ant". list_car_id is a list with the value ['12','18','3','7']. When I run the code I get the error "unhashable list". Can anyone tell me what is wrong?
list_accepted_car = []  # list to store the accepted cars
unique_accepted_car = set()  # set to store the unique accepted cars
num_accepted = 2  # predefined value for the number of cars allowed to enter
def DoIOpenTheDoor(list_car_id):  # list_car_id is a list of cars allowed to enter
    if len(list_accepted_car) < num_accepted:
        if len(list_car_id) > 0:
            list_accepted_car.append(list_car_id[0:min(len(list_car_id),num_accepted-len(list_accepted_car))])
    unique_accepted_list = set(list_accepted_car)
    print unique_accepted_list
    return list_accepted_car
Assuming list_car_id looks like [1,2,3,4,5], you are appending a sublist of list_car_id to list_accepted_car, so list_accepted_car will look like [[1,2]], i.e. a list of lists.
Then you should change
unique_accepted_list = set(list_accepted_car)
to
unique_accepted_list = set([x for y in list_accepted_car for x in y])
which will extract each element of each sublist and produce a flattened list. (There are other ways to flatten a list of lists.)
You are saving a list of lists, which can't be converted to a set. You have to flatten it first. There are many examples of how to do it (I'll supply one using itertools.chain which I prefer to python's nested comprehension).
Also, as a side note, I'd make this line more readable by separating to several lines:
list_accepted_car.append(list_car_id[0:min(len(list_car_id),num_accepted-len(list_accepted_car))])
You can do:
from itertools import chain
# code ...
unique_accepted_list = set(chain.from_iterable(list_accepted_car))
The best option would be to not use a list at all here, and use a set from the start.
Lists are not hashable objects, and only hashable objects can be members of sets. So, you can't have a set of lists. This instruction:
list_accepted_car.append(list_car_id[0:min(len(list_car_id),num_accepted-len(list_accepted_car))])
appends a slice of list_car_id to list_accepted_car, and a slice of a list is a list. So in effect list_accepted_car becomes a list of lists, and that's why converting it to a set:
unique_accepted_list = set(list_accepted_car)
fails. Maybe what you wanted is extend rather than append? I can't say, because I don't know what you wanted to achieve.
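To make the difference concrete, here is a small sketch comparing append and extend. It assumes the intent was to collect individual car ids, which the question does not confirm, and it simplifies the slice to list_car_id[:num_accepted].

```python
num_accepted = 2
list_car_id = ['12', '18', '3', '7']

# append adds the slice as ONE element, producing a list of lists.
appended = []
appended.append(list_car_id[:num_accepted])   # [['12', '18']]

# extend adds each element of the slice individually.
extended = []
extended.extend(list_car_id[:num_accepted])   # ['12', '18']

# A list of lists cannot be turned into a set: the inner list is unhashable.
try:
    set(appended)
    append_error = None
except TypeError as exc:
    append_error = str(exc)

# A flat list of strings converts without trouble.
unique_accepted = set(extended)

print(append_error)
print(unique_accepted == {'12', '18'})  # True
```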

How can I find the intersection of a list and a nested list?

I have a list of fruits:
fruits = ["apple","banana"]
I also have a nested list of baskets, in which each list contains a string (the name of the basket) and a list of fruits.
baskets = [["basket1",["apple","banana","pear","strawberry"]],["basket2",["strawberry","pear","peach"]],["basket3",["peach","apple","banana"]]]
I would like to know which baskets contain every fruit in the list fruits: the result I expect is a list with two elements, "basket1" and "basket3".
I figured that intersections would be the cleanest way of achieving that, and I tried the following:
myset = set(fruits).intersection(*map(set, set(baskets)))
But I'm getting a TypeError "unhashable type: 'list'". I understand that I can't map lists, but I thought that using the function "set" on both lists would convert them to sets... is there any other way I can find the intersection of a list and a list of lists?
You can loop over baskets and check if the fruits set is a subset of fruits in current basket, if yes store current basket's name.
>>> fruits = {"apple", "banana"} #notice the {}, or `set(["apple","banana"])` in Python 2.6 or earlier
>>> [b for b, f in baskets if fruits.issubset(f)]
['basket1', 'basket3']
You can't hash sets any more than you can hash lists. They both have the same problem: because they're mutable, a value can change its contents, making any set that contains it as a member or any dictionary that contains it as a key suddenly invalid.
You can hash the immutable equivalents of both, tuple and frozenset.
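A quick way to see this is to call hash() on each kind of object. The helper is_hashable below is a hypothetical name used only for this sketch.

```python
def is_hashable(value):
    """Return True if hash(value) succeeds, False if it raises TypeError."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


print(is_hashable([1, 2]))              # False: lists are mutable
print(is_hashable({1, 2}))              # False: sets are mutable
print(is_hashable((1, 2)))              # True: tuple of hashable items
print(is_hashable(frozenset([1, 2])))   # True: frozenset is immutable

# frozensets can therefore be members of another set.
nested = {frozenset(["apple", "banana"]), frozenset(["pear"])}
print(len(nested))  # 2
```

Note that a tuple is only hashable if everything inside it is hashable too, so a tuple containing a list still fails.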
Meanwhile, your immediate problem is ironically created by your attempt to solve this problem. Break this line down into pieces:
myset = set(fruits).intersection(*map(set, set(baskets)))
The first piece is this:
baskets_set = set(baskets)
You've got a list of lists, so set(baskets) is trying to make a set of lists. Which you can't do, because lists aren't hashable.
If you just removed that, and used map(set, baskets), you would then have an iterator of sets, which is a perfectly valid thing.
Of course, as soon as you try to iterate it, it will try to make a set out of the first element of baskets, which is a list that itself contains a list, so you'll run into the error again.
Plus, even if you solve this, the logic still doesn't make any sense. What's the intersection of a set of, say, 3 strings with a set of, say, 3 (frozen)sets of strings? It's empty. The two sets don't have any elements in common. The fact that some elements of the second one may contain elements of the first doesn't mean that the second one itself contains any elements of the first.
You could do it this way using your approach:
fruits = ["apple","banana"]
baskets = [["basket1",["apple","banana","pear","strawberry"]],
           ["basket2",["strawberry","pear","peach"]],
           ["basket3",["peach","apple","banana"]]]
fruitset = set(fruits)
# keep a basket only if its intersection with fruitset is all of fruitset
res = set(b for b, s in ((b, set(c)) for b, c in baskets) if (s & fruitset) == fruitset)
print(res)  # --> a set containing 'basket1' and 'basket3'

Modify the list that is being iterated in python

I need to update a list while it is being iterated over.
Basically, I have a list of tuples called some_list. Each tuple contains a bunch of strings, such as name and path. What I want to do is go over every tuple, look at the name, then find all the tuples that contain the string with an identical path and delete them from the list.
The order does not matter, I merely wish to go over the whole list, but whenever I encounter a tuple with a certain path, all tuples (including oneself) should be removed from the list. I can easily construct such a list and assign it to some_list_updated, but the problem seems to be that the original list does not update...
The code has more or less the following structure:
for tup in some_list[:]:
    ...
    ...somecode...
    ...
    some_list = some_list_updated
It seems that the list does update appropriately when I print it out, but python keeps iterating over the old list, it seems. What is the appropriate way to go about it - if there is one? Thanks a lot!
You want to count the paths using a dictionary, then use only those that have a count of 1, then loop using a list comprehension to do the final filter. Using a collections.Counter() object makes the counting part easy:
from collections import Counter
counts = Counter(tup[index_of_path] for tup in some_list)
some_list = [tup for tup in some_list if counts[tup[index_of_path]] == 1]
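The loop in the question keeps seeing the old items because some_list[:] is evaluated once, when the loop starts, so rebinding the name some_list inside the loop body does not change the copy being iterated. The Counter approach avoids the problem by building the filtered list in one pass after counting. Here is a sketch with made-up data, assuming the path is stored at index 1 of each tuple.

```python
from collections import Counter

index_of_path = 1  # assumption: the path is the second item in each tuple

# Hypothetical sample data.
some_list = [
    ("a", "/tmp/x"),
    ("b", "/tmp/y"),
    ("c", "/tmp/x"),
    ("d", "/tmp/z"),
]

# Count how many tuples share each path.
counts = Counter(tup[index_of_path] for tup in some_list)

# Keep only the tuples whose path occurs exactly once.
some_list = [tup for tup in some_list if counts[tup[index_of_path]] == 1]

print(some_list)  # [('b', '/tmp/y'), ('d', '/tmp/z')]
```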

Sorting a sublist within a Python list of integers

I have an unsorted list of integers in a Python list. I want to sort the elements in a subset of the full list, not the full list itself. I also want to sort the list in-place so as to not create new lists (I'm doing this very frequently). I initially tried
p[i:j].sort()
but this didn't change the contents of p presumably because a new list was formed, sorted, and then thrown away without affecting the contents of the original list. I can, of course, create my own sort function and use loops to select the appropriate elements but this doesn't feel pythonic. Is there a better way to sort sublists in place?
You can write p[i:j] = sorted(p[i:j])
This is because p[i:j] returns a new list. I can think of this immediate solution:
l = p[i:j]
l.sort()
a = 0
for x in range(i, j):
    p[x] = l[a]
    a += 1
"in place" doesn't mean much. You want this.
p[i:j] = list(sorted(p[i:j]))
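The slice-assignment form can be checked with a short sketch. The values of p, i and j are made up for illustration. Assigning to p[i:j] modifies the existing list object, so other references to it see the change, although the slice and sorted() each still create a temporary list.

```python
p = [9, 4, 7, 1, 8, 2, 5]
i, j = 1, 5

alias = p  # second reference to the same list object

# Sort only the elements at positions i..j-1 and write them back in place.
p[i:j] = sorted(p[i:j])

print(p)           # [9, 1, 4, 7, 8, 2, 5]
print(alias is p)  # True: p is still the same list object
```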
