Dynamically compare lists for common occurrences - Python

I want to dynamically use sets to find common entries in lists. If I have a list of lists that could contain any number of lists, how can I go about finding all the common occurrences across all the lists? I have thought about enumerating the list and storing each nested list in its own variable, but then I am not sure how to compare all the individual lists.
Example of List:
l = [[1,2,3,4], [3,6,4,2,1], [6,4,2,6,7,3]]
I want to do something like this but dynamic so it can accept any number of lists:
common = set(l[0]) & set(l[1]) & set(l[2])

Use reduce with a lambda:
>>> l = [[1,2,3,4,], [3,6,4,2,1], [6,4,2,6,7,3]]
>>> from functools import reduce
>>> common = reduce(lambda l1,l2: set(l1) & set(l2), l)
>>> print(common)
{2, 3, 4}
Or, as a slightly modified version of @tobias_k's solution (as pointed out in the comments), you can do it without a lambda:
>>> common = reduce(set.intersection, [set(l[0])] + l[1:])

You can use set.intersection:
set.intersection(*(set(ls) for ls in l)) #evaluates to {2, 3, 4}

You can use reduce from functools package:
from functools import reduce
l = [[1, 2, 3, 4], [3, 6, 4, 2, 1], [6, 4, 2, 6, 7, 3]]
print(reduce(set.intersection, map(set, l)))
Output:
{2, 3, 4}
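All of the answers above share one edge case: with an empty outer list, both reduce and star-unpacking raise a TypeError. A minimal sketch of a guard (common_elements is a made-up helper name), assuming you want an empty set in that case:

```python
from functools import reduce

def common_elements(lists):
    # Intersection of any number of lists; an empty input yields an
    # empty set instead of raising TypeError.
    if not lists:
        return set()
    return reduce(set.intersection, map(set, lists))

print(common_elements([[1, 2, 3, 4], [3, 6, 4, 2, 1], [6, 4, 2, 6, 7, 3]]))  # {2, 3, 4}
print(common_elements([]))  # set()
```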

Related

How to print non-repeating elements with the original list

Given a list of integers nums, return a list of all the elements, but no repeating number should appear more than twice.
Example:
input: nums = [1,1,2,3,3,4,4,4,5]
output: [1,1,2,3,3,4,4,5]
A more flexible implementation using itertools:
from itertools import islice, groupby, chain
nums = [1,1,2,3,3,4,4,4,5]
output = (islice(g, 2) for _, g in groupby(nums))
output = list(chain.from_iterable(output))
print(output) # [1, 1, 2, 3, 3, 4, 4, 5]
You can replace the 2 in islice(g, 2) to tune the maximum number of repeats you want.
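A hypothetical wrapper (cap_repeats is a made-up name) makes that knob explicit; note that groupby only groups consecutive runs, so this assumes duplicates are adjacent, as in the sorted example above:

```python
from itertools import islice, groupby, chain

def cap_repeats(seq, k):
    # Keep at most k elements from each run of consecutive equal values.
    return list(chain.from_iterable(islice(g, k) for _, g in groupby(seq)))

print(cap_repeats([1, 1, 2, 3, 3, 4, 4, 4, 5], 2))  # [1, 1, 2, 3, 3, 4, 4, 5]
print(cap_repeats([1, 1, 2, 3, 3, 4, 4, 4, 5], 1))  # [1, 2, 3, 4, 5]
```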
The easiest and, I guess, most straightforward way to get unique elements is with a set:
list(set(nums)) -> [1, 2, 3, 4, 5]
The downside of this approach is that sets are unordered, and we cannot really depend on how the list will be ordered after the conversion.
If order is important in your case you can do this:
list(dict.fromkeys(nums))
[1, 2, 3, 4, 5]
dicts preserve insertion order as of Python 3.7 (and in CPython 3.6 as an implementation detail), and their keys are unique. So with this small trick we get a list of the unique keys of a dictionary, but still maintain the original order!
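To make the ordering difference concrete, here's a small side-by-side on an unsorted input:

```python
nums = [3, 1, 3, 2, 1]

# set(): unique elements, but iteration order is arbitrary,
# so we sort here just to get a deterministic result.
print(sorted(set(nums)))          # [1, 2, 3]

# dict.fromkeys(): unique keys, first-occurrence order preserved.
print(list(dict.fromkeys(nums)))  # [3, 1, 2]
```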

How to add two lists together, avoid repetitions, and order elements?

I have two lists filled with integers. I wish to add them together such that:
the output list has no duplicate elements,
is in order, and
contains the union of both lists.
Is there any way to do so without creating my own custom function? If not, what would a neat and tidy procedure look like?
For instance:
list1 = [1, 10, 2]
list2 = [3, 4, 10]
Output:
outputlist = [1, 2, 3, 4, 10]
Try this:
combined = [list1, list2]
union = list(set().union(*combined))
This takes advantage of the predefined .union() method of set, which is what you need here.
combined can have as many elements inside it as you like; the asterisk in *combined unpacks them so the union of all of the elements is found.
Also, I list()ed the result, but you could leave it as a set.
As @glibdud states in the comments, it's possible that this might produce a sorted list, but it's not guaranteed, so use sorted() to ensure that it's ordered (like this: union = sorted(set().union(*combined))).
l1 = [1, 10, 2]
l2 = [3, 4, 10]
sorted(list(set(l1 + l2)))
>>> [1, 2, 3, 4, 10]
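Since set().union(*combined) already accepts any number of inputs, the two answers combine into one line that also handles three or more lists (the third list below is just an illustrative addition):

```python
lists = [[1, 10, 2], [3, 4, 10], [10, 42, 2]]

# Union of all lists, de-duplicated by the set, then sorted.
outputlist = sorted(set().union(*lists))
print(outputlist)  # [1, 2, 3, 4, 10, 42]
```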

Python list comprehension to create unequal length lists from a list using conditional

Using list comprehension, itertools or similar functions, is it possible to create two unequal lists from one list based on a conditional? Here is an example:
main_list = [6, 3, 4, 0, 9, 1]
part_list = [4, 5, 1, 2, 7]
in_main = []
out_main = []
for p in part_list:
    if p not in main_list:
        out_main.append(p)
    else:
        in_main.append(p)
>>> out_main
[5, 2, 7]
>>> in_main
[4, 1]
I'm trying to keep it simple, but as an example of usage, the main_list could be values from a dictionary with the part_list containing dictionary keys. I need to generate both lists at the same time.
So long as you have no repeating data and order doesn't matter:
main_set = set([6, 3, 4, 0, 9, 1])
part_set = set([4, 5, 1, 2, 7])
out_main = part_set - main_set
in_main = part_set & main_set
Job done.
If the order (within part_list) is important:
out_main = [p for p in part_list if p not in main_list]
in_main = [p for p in part_list if p in main_list]
otherwise:
out_main = list(set(part_list) - set(main_list))
in_main = list(set(part_list) & set(main_list))
A true itertools-based solution that works on an iterable:
>>> import itertools
>>> part_iter = iter(part_list)
>>> part_in, part_out = itertools.tee(part_iter)
>>> in_main = (p for p in part_in if p in main_list)
>>> out_main = (p for p in part_out if p not in main_list)
Making lists out of these defeats the point of using iterators, but here is the result:
>>> list(in_main)
[4, 1]
>>> list(out_main)
[5, 2, 7]
This has the advantage of lazily generating in_main and out_main from another lazily generated sequence. The only catch is that if you iterate through one before the other, tee has to cache a bunch of data until it's used by the other iterator. So this is really only useful if you iterate through them both at roughly the same time. Otherwise you might as well use auxiliary storage yourself.
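To illustrate, here's a rough sketch (my own, not from the original answer) that drains both tee'd branches alternately, so tee's internal buffer stays small:

```python
from itertools import tee

main_list = [6, 3, 4, 0, 9, 1]
part_list = [4, 5, 1, 2, 7]

part_in, part_out = tee(iter(part_list))
in_main = (p for p in part_in if p in main_list)
out_main = (p for p in part_out if p not in main_list)

# Pull from each generator alternately; tee only needs to buffer
# the items one branch has consumed but the other has not.
result_in, result_out = [], []
sentinel = object()
while True:
    a = next(in_main, sentinel)
    b = next(out_main, sentinel)
    if a is sentinel and b is sentinel:
        break
    if a is not sentinel:
        result_in.append(a)
    if b is not sentinel:
        result_out.append(b)

print(result_in)   # [4, 1]
print(result_out)  # [5, 2, 7]
```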
There's also an interesting ternary operator-based solution. (You could squish this into a list comprehension, but that would be wrong.) I changed main_list into a set for O(1) lookup.
>>> main_set = set(main_list)
>>> in_main = []
>>> out_main = []
>>> for p in part_list:
... (in_main if p in main_set else out_main).append(p)
...
>>> in_main
[4, 1]
>>> out_main
[5, 2, 7]
There's also a fun collections.defaultdict approach:
>>> import collections
>>> in_out = collections.defaultdict(list)
>>> for p in part_list:
... in_out[p in main_list].append(p)
...
>>> in_out
defaultdict(<class 'list'>, {False: [5, 2, 7], True: [4, 1]})
in_main = list(set(main_list) & set(part_list))
out_main = list(set(part_list) - set(in_main))
Start with a list of predicates:
main_set = set(main_list)  # build the set once for O(1) membership tests
test_func = [main_set.__contains__, lambda x: x not in main_set]
# Basically, each of the predicates is a function that returns a True/False
# value (or similar) according to a certain condition.
# Here, you want to test set membership, but you could have more predicates.
print([list(filter(func, part_list)) for func in test_func])
# [[4, 1], [5, 2, 7]]
Then you have your "one-liner", but with a bit of overhead work from maintaining a list of predicates.
As said in other answers, you can speed up lookups by converting main_list to a set beforehand (done above), rather than testing membership against the list directly.

Taking the union of sets

I have a list l of sets. To take the union of all the sets in l I do:
union = set()
for x in l:
    union |= x
I have a feeling there is a more economical/functional way of writing this. Can I improve upon this?
Here's how I would do it (some corrections as per comments):
union_set = set()
union_set.update(*l)
or
union_set = set.union(*l)
>>> l = [{1, 2, 3}, {3, 4, 5}, {0, 1}]
>>> set.union(*l)
{0, 1, 2, 3, 4, 5}
If you're looking for a functional approach, there's little more traditional than reduce():
>>> reduce(set.union, [{1, 2}, {3, 4}, {5, 6}])
{1, 2, 3, 4, 5, 6}
In Python 3, reduce can be found in the functools module; in 2.6 and 2.7, it exists both in functools and (as in older interpreters) as a built-in.
union = reduce(set.union, l)
In Python 2.x, reduce is a built-in. In 3.x, it's in the functools module.
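One practical difference between the two idioms: what they do with an empty list of sets. A quick sketch:

```python
from functools import reduce

l = [{1, 2, 3}, {3, 4, 5}, {0, 1}]
print(set().union(*l))       # {0, 1, 2, 3, 4, 5}
print(reduce(set.union, l))  # {0, 1, 2, 3, 4, 5}

# With an empty input, set().union() still returns an (empty) set,
# while reduce needs an explicit initializer to avoid a TypeError:
print(set().union(*[]))              # set()
print(reduce(set.union, [], set()))  # set()
```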

How To Merge an Arbitrary Number of Tuples in Python?

I have a list of tuples:
l=[(1,2,3),(4,5,6)]
The list can be of arbitrary length, as can the tuples. I'd like to convert this into a list or tuple of the elements, in the order they appear:
f=[1,2,3,4,5,6] # or (1,2,3,4,5,6)
If I knew at development time how many tuples I'd get back, I could just add them:
m = l[0] + l[1] # (1,2,3,4,5,6)
But since I don't know until runtime how many tuples I'll have, I can't do that. I feel like there's a way to use map to do this, but I can't figure it out. I can iterate over the tuples and add them to an accumulator, but that would create lots of intermediate tuples that would never be used. I could also iterate over the tuples, then the elements of the tuples, and append them to a list. This seems very inefficient. Maybe there's an even easier way that I'm totally glossing over. Any thoughts?
Chain them (only creates a generator instead of reserving extra memory):
>>> from itertools import chain
>>> l = [(1,2,3),(4,5,6)]
>>> list(chain.from_iterable(l))
[1, 2, 3, 4, 5, 6]
l = [(1, 2), (3, 4), (5, 6)]
print(sum(l, ()))  # (1, 2, 3, 4, 5, 6)
from functools import reduce  # reduce is a built-in in Python 2
reduce(tuple.__add__, [(1,2,3), (4,5,6)])  # (1, 2, 3, 4, 5, 6)
tuple(i for x in l for i in x) # (1, 2, 3, 4, 5, 6)
Use the Pythonic generator style for all of the following (note the loop order: the outer for i in b must come first; also avoid naming a variable list, which shadows the built-in):
b = [(1,2,3), (4,5,6)]
flat = [x for i in b for x in i]          # produces a list
gen = (x for i in b for x in i)           # produces a generator
tup = tuple(x for i in b for x in i)      # produces a tuple
print(flat)
[1, 2, 3, 4, 5, 6]
>>> from itertools import chain
>>> l = [(1,2,3),(4,5,6)]
>>> list(chain(*l))
[1, 2, 3, 4, 5, 6]
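Both chain(*l) and chain.from_iterable(l) yield the same elements; the difference is that *l unpacks the outer iterable eagerly, while from_iterable consumes it lazily, which matters when the outer sequence is itself a generator. A small sketch (the tuples generator is illustrative):

```python
from itertools import chain

def tuples():
    # A generator of tuples, produced lazily.
    yield (1, 2, 3)
    yield (4, 5, 6)

# from_iterable never materializes the outer sequence:
print(list(chain.from_iterable(tuples())))  # [1, 2, 3, 4, 5, 6]

# chain(*...) must exhaust the generator up front to unpack it:
print(list(chain(*tuples())))               # [1, 2, 3, 4, 5, 6]
```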
You can combine the values in a list using the .extend() function like this:
l = [(1,2,3), (4,5,6)]
m = []
for t in l:
    m.extend(t)
or a shorter version using reduce:
from functools import reduce  # built-in in Python 2
l = [(1,2,3), (4,5,6)]
m = reduce(lambda x, y: x + list(y), l, [])  # [1, 2, 3, 4, 5, 6]
