Taking the intersection of N-many lists in Python

What's the easiest way to take the intersection of N-many lists in Python?
If I have two lists a and b, I know I can do:
a = set(a)
b = set(b)
intersect = a.intersection(b)
but I want to do something like a & b & c & d & ... for an arbitrary set of lists (ideally without converting to a set first, but if that's the easiest / most efficient way, I can deal with that.)
I.e. I want to write a function intersect(*args) that will do it for arbitrarily many sets efficiently. What's the easiest way to do that?
EDIT: My own solution is reduce(set.intersection, [a,b,c]) -- is that good?
thanks.

This works for 1 or more lists. The 0 lists case is not so easy, because it would have to return a set that contains all possible values.
def intersection(first, *others):
    return set(first).intersection(*others)

This works with 1 or more lists and does not use multiple parameters:
>>> def intersection(*listas):
...     return set(listas[0]).intersection(*listas[1:])
...
>>> intersection([1,2,3,4],[4,5,6],[2,4,5],[1,4,8])
set([4])
>>> intersection([1,2,3,4])
set([1, 2, 3, 4])
>>>
Not sure this is better than other answers, anyway.

lists = [[5,4,3], [4,2], [6,2,3,4]]
try:
    # lists[0] is intersected with itself once more than necessary, but
    # that should not hurt performance noticeably.
    intersected = set(lists[0]).intersection(*lists)
except IndexError:
    # lists is empty, so there is no lists[0]
    intersected = set()
print(intersected)  # set([4]) on Python 2, {4} on Python 3
Sets can be intersected with any iterable, so there's no need to convert the other lists into sets first.
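A minimal sketch of the intersect(*args) helper the question asks for, relying on the fact that set.intersection accepts any iterable. Returning an empty set when called with no arguments is an assumed convention here, not something the answers agree on, and the elements are assumed to be hashable.

```python
def intersect(*iterables):
    """Return the set of items common to every iterable."""
    if not iterables:
        # Assumption: no input means an empty result.
        return set()
    first, rest = iterables[0], iterables[1:]
    # Only the first argument is converted; the rest may be any iterable.
    return set(first).intersection(*rest)


a = [1, 2, 3, 4]
b = (4, 5, 6)
c = (n for n in range(10) if n % 2 == 0)  # a generator works too

print(intersect(a, b, c))
print(intersect(a))
print(intersect())
```

Only the first argument is turned into a set; the others are consumed as plain iterables.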

Related

Is there a way to define a compare function to issubset in Python

I want to use an approximate comparison to check if two sets are equal.
More exactly, if there is a bijection of approximate elements on two sets.
As a toy example, consider the following function
def my_compare(a, b):
    ep = 0.1
    return a>=b-ep and a<=b+ep
I would like to do something like
a = {1, 2, 3}
b = {1.05, 2, 3}
c = {1.2, 2, 3}
print(a.issubset(b) and b.issubset(a))
print(a.issubset(c) and c.issubset(a))
and get the output
True
False
PS: I know that I am subverting the mathematical definition of a set by doing that. Even so, is there a way?
To keep these comparisons efficient, you may want to consider an alternate underlying data structure, such as an Interval Tree.
The Python portion library at https://github.com/AlexandreDecan/portion could help you out here.
Represent each element x of a set as the interval [x - 0.1, x + 0.1], and look for intersections between the two sets: https://github.com/AlexandreDecan/portion#interval-operations
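For small sets of plain numbers, the check can also be sketched without an extra library. This is an alternative to the interval approach, not the portion API: it sorts both sets and pairs elements by rank, which is enough for numbers on a line with a single fixed tolerance. The names approx_equal and ep are illustrative.

```python
def approx_equal(a, b, ep=0.1):
    """True if a and b can be paired one-to-one with each pair within ep."""
    if len(a) != len(b):
        return False
    # For numbers on a line, pairing by sorted rank minimizes the largest gap,
    # so if any valid pairing exists this one is valid too.
    return all(abs(x - y) <= ep for x, y in zip(sorted(a), sorted(b)))


a = {1, 2, 3}
b = {1.05, 2, 3}
c = {1.2, 2, 3}

print(approx_equal(a, b))  # True
print(approx_equal(a, c))  # False
```

Sorting makes this O(n log n); for large sets or repeated queries an interval tree may still be the better fit.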

best way of combining the values of 2 different tuples

I want to add together the values of 2 tuples (of any size), and create an output tuple.
For example:
a = (1,4)
b = (2,3)
output: (3,7)
Is there a better way to do it than just:
output = (a[0] + b[0], a[1]+b[1])
How about using a generator expression?
output = tuple(a[i] + b[i] for i in range(len(a)))
If you don't know that the tuples are the same length, you could try something fancier like zip (which stops at the length of the shorter tuple), or itertools.izip_longest (itertools.zip_longest in Python 3), which lets you control how tuples of different lengths are handled.
tuple(x+y for (x,y) in zip(a,b))
If you want to stick with 2-tuples, what you have is fine (and probably best). You might consider using a different data structure, one where the + operator adds element-wise. For example:
complex numbers add like 2-vectors (using .real and .imag components)
numpy arrays
Write your own Point class, overriding the __add__ magic method
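The last option might look like the sketch below. The class name Point, its fields x and y, and the as_tuple helper are illustrative choices, not part of the question.

```python
class Point:
    """A minimal 2-D point whose + operator adds element-wise."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Return a new Point rather than mutating either operand.
        return Point(self.x + other.x, self.y + other.y)

    def as_tuple(self):
        return (self.x, self.y)


output = Point(1, 4) + Point(2, 3)
print(output.as_tuple())  # (3, 7)
```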
If you want to do it in a way that doesn't require you to spell out all elements, go with something functional:
output = tuple(map(sum, zip(a,b)))
or a list comprehension, which again you must pass to tuple:
output = tuple([i+j for i,j in zip(a,b)])
You could always replace zip with zip_longest from itertools, using a fill value of 0, if the sizes might differ.
tuple(map(lambda x, y: x + y, a, b))
import operator
tuple(map(operator.add, a, b))
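For tuples of different lengths, the zip_longest suggestion can be sketched as follows (Python 3 spelling; the helper name add_tuples is hypothetical). Missing positions are treated as 0.

```python
from itertools import zip_longest


def add_tuples(a, b):
    """Add two tuples element-wise, padding the shorter one with zeros."""
    return tuple(x + y for x, y in zip_longest(a, b, fillvalue=0))


print(add_tuples((1, 4), (2, 3)))      # (3, 7)
print(add_tuples((1, 4), (2, 3, 10)))  # (3, 7, 10)
```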

Python. How to optimize search functions

Is there any way to optimize these two functions?
first function:
def searchList(list_, element):
    for i in range (0,len(list_)):
        if(list_[i] == element):
            return True
    return False
second function:
return_list=[]
for x in list_search:
    if searchList(list_users,x)==False:
        return_list.append(x)
Yes:
return_list = [x for x in list_search if x not in list_users]
The first function basically checks for membership, so you could use the in operator instead. The second function can be reduced to a list comprehension that filters list_search based on your condition.
For the first function:
def searchList(list_, element):
    return element in list_
You can also write it in one line:
searchList = lambda x,y: y in x
For the second, use a list comprehension as shown in the other answer.
What you are doing with your two functions is building the complement as ozgur pointed out.
Using sets is the easiest approach here:
>>> set([2,2,2,3,3,4])- set([1,2,2,4,5])
set([3])
Your list_search would be the first list and your list_users the second list.
The only difference is that each new user appears only once in the result, no matter how often it occurs in list_search.
Disclaimer: I assumed list_search has no duplicate elements. Otherwise, use this solution.
What you want is exactly the set complement of list_users in list_search.
As an alternative approach, you can use sets to get the difference between two lists, and I think it should be much more performant than the naive lookup, which takes O(n^2).
>>> list_search = [1, 2, 3, 4]
>>> list_users = [4, 5, 1, 6]
>>> print(set(list_search).difference(list_users))
{2, 3}
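If the order and duplicates of list_search matter, a middle ground is to keep the list comprehension but test membership against a set. A minimal sketch, assuming the elements are hashable; the sample data and the name known_users are illustrative.

```python
list_search = [3, 1, 2, 3, 4]
list_users = [4, 5, 1, 6]

# Build the set once so each membership test is O(1) on average.
known_users = set(list_users)

# Order and duplicates from list_search are preserved.
return_list = [x for x in list_search if x not in known_users]

print(return_list)  # [3, 2, 3]
```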

iterate over list of tuples in two notations

I'm iterating over a list of tuples, and was just wondering if there is a smaller notation to do the following:
for tuple in list:
    (a,b,c,d,e) = tuple
or the equivalent
for (a,b,c,d,e) in list:
    tuple = (a,b,c,d,e)
Both of these snippets let me access the tuple both item by item and as a whole. But is there a notation that somehow combines the two lines into the for-statement? It seems like such a Pythonesque feature that I figured it might exist in some shape or form.
The Pythonic way is the first option you mentioned:
for tup in list:
    a,b,c,d,e = tup
This might be a hack that you could use. There might be a better way, but that's why it's a hack. Your examples are all fine and that's how I would certainly do it.
>>> list1 = [(1, 2, 3, 4, 5)]
>>> for (a, b, c, d, e), tup in zip(list1, list1):
...     print(a, b, c, d, e)
...     print(tup)
...
1 2 3 4 5
(1, 2, 3, 4, 5)
Also, please don't use tuple as a variable name.
There isn't anything really built into Python that lets you do this, because the vast majority of the time, you only need to access the tuple one way or the other: either as a tuple or as separate elements. In any case, something like
for t in the_list:
    a,b,c,d,e = t
seems pretty clean, and I can't imagine there'd be any good reason to want it more condensed than that. That's what I do on the rare occasions that I need this sort of access.
If you just need to get at one or two elements of the tuple, say perhaps c and e only, and you don't need to use them repeatedly, you can access them as t[2] and t[4]. That reduces the number of variables in your code, which might make it a bit more readable.
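Index access as described above might look like this sketch; the_list and its contents are made-up sample data.

```python
the_list = [(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)]

totals = []
for t in the_list:
    # t is still available as a whole; only c and e are read, by position.
    totals.append(t[2] + t[4])

print(totals)  # [8, 18]
```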

How to get the list with the higher sum of its elements

I was wondering if anyone could help me with a Python problem I have. I have four lists, and each list holds floats (decimals). I'm adding up all the floats that each list contains. The part I'm stuck on is that I want to know which of the four lists has the highest sum. I know I could use if statements, but does anyone know a more efficient way? For instance:
foodmart = [12.33,5.55]
nike = [42.20,69.99]
gas_station = [0.89,45.22]
toy_store = [10.99,15.32]
Use max():
>>> max(foodmart,nike,gas_station,toy_store, key=sum)
[42.2, 69.99]
help() on max:
max(iterable[, key=func]) -> value
max(a, b, c, ...[, key=func]) -> value
    With a single iterable argument, return its largest item. With two or
    more arguments, return the largest argument.
Represent the lists as a dict and use max with a key function that computes the sum.
Instead of keeping the lists in separate variables, use a dictionary. That makes it easier to identify the right shop and to work with any number of lists / shops without having to enumerate them in the call to max. This would be more Pythonic and maintainable.
>>> shops = dict()
>>> shops['foodmart'] = [12.33,5.55]
>>> shops['nike'] = [42.20,69.99]
>>> shops['gas_station'] = [0.89,45.22]
>>> shops['toy_store'] = [10.99,15.32]
>>> max(shops, key = lambda k:sum(shops[k]))
'nike'
>>> max([1,2],[3,4],[2,3], key=lambda x: sum(x))
[3, 4]
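Building on the dictionary approach, a sketch that returns both the shop name and its total (Python 3 syntax; the names totals, best_shop and best_total are illustrative).

```python
shops = {
    'foodmart': [12.33, 5.55],
    'nike': [42.20, 69.99],
    'gas_station': [0.89, 45.22],
    'toy_store': [10.99, 15.32],
}

# Compute each total once, then pick the (name, total) pair with the largest total.
totals = {name: sum(prices) for name, prices in shops.items()}
best_shop, best_total = max(totals.items(), key=lambda item: item[1])

print(best_shop, round(best_total, 2))  # nike 112.19
```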
