Python list of sets: find symmetric difference across all elements

Consider this list of sets:
my_input_list = [
    {1, 2, 3, 4, 5},
    {2, 3, 7, 4, 5},
    set(),
    {1, 2, 3, 4, 5, 6},
    set(),
]
I want to get only the exclusive elements, 6 and 7, as the result, either as a list or a set (set preferred).
I tried
print reduce(set.symmetric_difference, my_input_list)
but that gives
{2, 3, 4, 5, 6, 7}
I also tried sorting the list by length: smallest first raises an error because of the two empty sets, and largest first gives the same result as unsorted.
Any help or ideas, please?
Thanks :)

Looks like the most straightforward solution is to count everything and return the elements that only appear once.
This solution uses chain.from_iterable (to flatten your sets) + Counter (to count things). Finally, use a set comprehension to filter elements with count == 1.
from itertools import chain
from collections import Counter
c = Counter(chain.from_iterable(my_input_list))
print({k for k in c if c[k] == 1})
{6, 7}
A quick note: the empty literal {} denotes an empty dict, not an empty set. For the latter, use set().
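For context on why the reduce attempt falls short: chaining symmetric_difference keeps every element that occurs in an odd number of sets, not only the elements that occur once. The sketch below contrasts the two on the question's data; the helper name unique_elements is made up for illustration.

```python
from collections import Counter
from functools import reduce
from itertools import chain


def unique_elements(sets):
    """Return the elements that occur in exactly one of the given sets.

    Hypothetical helper name, not part of any library.
    """
    counts = Counter(chain.from_iterable(sets))
    return {elem for elem, n in counts.items() if n == 1}


my_input_list = [
    {1, 2, 3, 4, 5},
    {2, 3, 7, 4, 5},
    set(),
    {1, 2, 3, 4, 5, 6},
    set(),
]

# Chained symmetric difference keeps elements seen an odd number of times:
# 2, 3, 4 and 5 each occur in three sets, so they survive alongside 6 and 7.
odd_count = reduce(set.symmetric_difference, my_input_list)

# Counting occurrences keeps only the elements seen exactly once.
only_once = unique_elements(my_input_list)

print(odd_count)
print(only_once)
```

Because each set contains an element at most once, the count of an element equals the number of sets it appears in.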

You could use itertools.chain and collections.Counter:
from itertools import chain
from collections import Counter
r = {k for k,v in Counter(chain.from_iterable(my_input_list)).items() if v==1}

Related

Python: Count Values in multi-value Dictionary

How can I count the occurrences of a specific value in a multi-value dictionary?
For example, if I have the keys A and B with different sets of numbers as values, I want to get the count of each number across all of the dictionary's keys.
I've tried this code, but I get 0 instead of 2.
dic = {'A':{0,1,2},'B':{1,2}}
print(sum(value == 1 for value in dic.values()))
Counter is a good option for this, especially if you want more than a single result:
from collections import Counter
from itertools import chain
count = Counter(chain(*(dic.values())))
In the REPL:
>>> count
Counter({1: 2, 2: 2, 0: 1})
>>> count.get(1)
2
Counter simply tallies each item in an iterable. By using chain we treat a collection of collections as one long sequence, gluing everything together. Feeding this straight to Counter does the work of counting how many times each item occurs.
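A small sketch of why the attempt in the question returns 0, next to two ways of getting the count. It assumes the same dic as the question; the variable names are illustrative only.

```python
from collections import Counter
from itertools import chain

dic = {'A': {0, 1, 2}, 'B': {1, 2}}

# The question's attempt compares each whole set with 1, which is never true.
wrong = sum(value == 1 for value in dic.values())

# Testing membership per key counts one specific number.
count_of_one = sum(1 in value for value in dic.values())

# Counter tallies every number in a single pass over the flattened values.
count = Counter(chain.from_iterable(dic.values()))

print(wrong, count_of_one, count)
```

Indexing a Counter with a number that never occurs, such as count[3], returns 0 rather than raising KeyError.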

How to Create Combination of Element in Different Set?

Let's say I have n lists that are not disjoint. I want every combination of n elements, taking one element from each list, such that all elements within a combination are different and no combination is repeated in a different order. So [1,1,2] isn't allowed, and [1,2,3] is the same as [2,1,3].
For example, I have A=[1,2,3], B=[2,4,1], and C=[1,5,3]. The output I want is [[1,2,5],[1,2,3],[1,4,5],[1,4,3],[2,4,1],[2,4,5],[2,4,3],[3,2,5],[3,4,5],[3,1,5]].
I have searched Google and I think the product function in the itertools module can do it, but I have no idea how to exclude combinations with repeated elements or duplicate combinations.
Maybe something like:
from itertools import product
A=[1,2,3]
B=[2,4,1]
C=[1,5,3]
L = list(set([ tuple(sorted(l)) for l in product(A,B,C) if len(set(l))==3 ]))
Of course you would have to change 3 to the relevant value if you work with more than 3 lists.
How about this? Create a dictionary with the sorted tuples as keys, and accept an item only if all three integers are different:
from itertools import product
A=[1,2,3]
B=[2,4,1]
C=[1,5,3]
LEN = 3
dct = {tuple(sorted(item)): item for item in product(A,B,C)
       if len(set(item)) == LEN}
print(dct)
vals = list(dct.values())
print(vals)
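Both answers hard-code the number of lists. A minimal sketch of the same idea for any number of lists follows; the function name distinct_combinations is hypothetical, and it assumes the elements are sortable and hashable.

```python
from itertools import product


def distinct_combinations(*lists):
    """One element per list, all elements different, order ignored.

    Hypothetical helper generalizing the two answers above.
    """
    n = len(lists)
    return {
        tuple(sorted(combo))       # sorting makes (2, 1, 3) equal to (1, 2, 3)
        for combo in product(*lists)
        if len(set(combo)) == n    # reject combos with a repeated element
    }


A = [1, 2, 3]
B = [2, 4, 1]
C = [1, 5, 3]
result = sorted(distinct_combinations(A, B, C))
print(result)
```

For the lists in the question this yields the same ten combinations as the expected output, each written in sorted order.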

Sum of values across all nested dictionaries in python

I have a dictionary of Counters, e.g:
from collections import Counter, defaultdict
numbers = defaultdict(Counter)
numbers['a']['first'] = 1
numbers['a']['second'] = 2
numbers['b']['first'] = 3
I want to get the sum: 1+2+3 = 6
What would be the simplest / idiomatic way to do this in python 3?
Use a nested comprehension:
sum(x for counter in numbers.values() for x in counter.values())
Or sum first the counters (starting with an empty one), and then their values:
sum(sum(numbers.values(), Counter()).values())
Or first each counter's values, and then the intermediate results:
sum(sum(c.values()) for c in numbers.values())
Or use chain:
from itertools import chain
sum(chain.from_iterable(d.values() for d in numbers.values()))
I prefer the first way.
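One difference between the approaches is worth knowing: adding Counters with + drops any key whose combined count is zero or negative, so the variant that first sums the Counters only matches the others when all counts stay positive. A short sketch, using the question's data and then one negative count added purely for illustration:

```python
from collections import Counter, defaultdict

numbers = defaultdict(Counter)
numbers['a']['first'] = 1
numbers['a']['second'] = 2
numbers['b']['first'] = 3

# With positive counts only, both approaches agree.
nested = sum(x for counter in numbers.values() for x in counter.values())
merged = sum(sum(numbers.values(), Counter()).values())

# Illustrative change: a negative count makes them diverge, because
# Counter addition discards the key 'first' once its total is not positive.
numbers['b']['first'] = -3
nested_neg = sum(x for counter in numbers.values() for x in counter.values())
merged_neg = sum(sum(numbers.values(), Counter()).values())

print(nested, merged, nested_neg, merged_neg)
```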
sum(sum(c.values()) for c in numbers.values())
from itertools import chain
sum(chain.from_iterable(d.values() for d in numbers.values()))
# outputs: 6
For performance in Python 2.x, use .itervalues(), which avoids building intermediate lists (this applies to all solutions here).
sum(chain.from_iterable(d.itervalues() for d in numbers.itervalues()))

python: how to know the index when you randomly select an element from a sequence with random.choice(seq)

I know very well how to select a random item from a list with random.choice(seq) but how do I know the index of that element?
import random
l = ['a','b','c','d','e']
i = random.choice(range(len(l)))
print(i, l[i])
You could first choose a random index, then get the list element at that location to have both the index and value.
>>> import random
>>> a = [1, 2, 3, 4, 5]
>>> index = random.randint(0,len(a)-1)
>>> index
0
>>> a[index]
1
You can do it using the randrange function from the random module:
import random
l = ['a','b','c','d','e']
i = random.randrange(len(l))
print(i, l[i])
The most elegant way to do so is random.randrange:
index = random.randrange(len(MY_LIST))
value = MY_LIST[index]
One can also do this in python3, less elegantly (but still better than .index) with random.choice on a range object:
index = random.choice(range(len(MY_LIST)))
value = MY_LIST[index]
The only valid solutions are this solution and the random.randint solutions.
The ones which use list.index are not only slow (O(N) per lookup rather than O(1); if you do this for each element you end up doing O(N^2) comparisons) but will ALSO give skewed/incorrect results if the list elements are not unique.
One would think that this is slow, but it turns out to only be slightly slower than the other correct solution random.randint, and may be more readable. I personally consider it more elegant because one doesn't have to do numerical index fiddling and use unnecessary parameters as one has to do with randint(0,len(...)-1), but some may consider this a feature, though one needs to know the randint convention of an inclusive range [start, stop].
Proof of speed for random.choice: The only reason this works is that the range object is OPTIMIZED for indexing. As proof, you can do random.choice(range(10**12)); if it iterated through the entire list your machine would be slowed to a crawl.
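A quick way to see this in Python 3 is to take the length of a huge range and index into it directly; both are computed arithmetically without materializing any elements. This is only a sketch, and it assumes a 64-bit build where len() of a range this size is allowed.

```python
import random

big = range(10**12)

# Neither call walks the range: len() and indexing are plain arithmetic.
size = len(big)
middle = big[size // 2]

# random.choice only needs len() and indexing, so it returns immediately.
pick = random.choice(big)

print(size, middle, pick)
```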
edit: I had overlooked randrange because the docs seemed to say "don't use this function" (but actually meant "this function is pythonic, use it"). Thanks to martineau for pointing this out.
You could of course abstract this into a function:
def randomElement(sequence):
    index = random.randrange(len(sequence))
    return index, sequence[index]

i, value = randomElement(range(10**15))  # try THAT with .index, heh
# (don't, your machine will die)
# use xrange if using python2
# i, value = (268840440712786, 268840440712786)
If the values are unique in the sequence, you can always say: list.index(value)
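The uniqueness condition matters because list.index always reports the first matching position. The sketch below uses a made-up list with a repeated value to show that the second 'a' can never be reported by a lookup, while choosing the index first has no such problem. The seeded random.Random instance is only there to make the run repeatable.

```python
import random

items = ['a', 'b', 'a', 'c']   # illustrative data with a duplicate value
rng = random.Random(0)         # seeded only so the sketch is repeatable

# Choosing a value and looking it up afterwards: position 2 is unreachable.
value = rng.choice(items)
index_via_lookup = items.index(value)
reachable = {items.index(v) for v in items}

# Choosing the index first works whether or not the values are unique.
index = rng.randrange(len(items))
value_at_index = items[index]

print(index_via_lookup, sorted(reachable), index, value_at_index)
```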
Using randrange(), as has been suggested, is a great way to get the index. By building a one-element dictionary with a comprehension, you can reduce this code to one line as shown below. Since the dictionary has only one element, calling popitem() returns the index and value together as a tuple.
import random
letters = "abcdefghijklmnopqrstuvwxyz"
# dictionary created via comprehension
idx, val = {i: letters[i] for i in [random.randrange(len(letters))]}.popitem()
print("index {} value {}".format(idx, val))
We can also use the sample() method.
If you want to randomly select n elements from a list:
import random
l, n = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 2
index_list = random.sample(range(len(l)), n)
index_list will contain unique indexes.
I prefer sample() over choices() because sample() draws without replacement, so the same index is never picked twice.
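A short sketch of how the sampled indexes can be paired with their values, and how choices() differs. The seeded random.Random instance and the variable names are assumptions made for a repeatable illustration.

```python
import random

l, n = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 2
rng = random.Random(42)   # seeded only so the sketch is repeatable

# sample() draws without replacement: the n indexes are always distinct.
index_list = rng.sample(range(len(l)), n)
pairs = [(i, l[i]) for i in index_list]

# choices() draws with replacement: the same index may appear more than once.
with_replacement = rng.choices(range(len(l)), k=n)

print(pairs, with_replacement)
```

Note that sample() raises ValueError if n is larger than the number of available indexes, whereas choices() accepts any k.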

Common elements between two lists not using sets in Python

I want to count the elements common to two lists. The lists can have duplicate elements, so I can't convert them to sets and use the & operator.
a=[2,2,1,1]
b=[1,1,3,3]
set(a) & set(b) works
a & b doesn't work
Is it possible to do it without set and dictionary?
In Python 3.x (and Python 2.7, when it's released), you can use collections.Counter for this:
>>> from collections import Counter
>>> list((Counter([2,2,1,1]) & Counter([1,3,3,1])).elements())
[1, 1]
Here's an alternative using collections.defaultdict (available in Python 2.5 and later). It has the nice property that the order of the result is deterministic (it essentially corresponds to the order of the second list).
from collections import defaultdict
def list_intersection(list1, list2):
    bag = defaultdict(int)
    for elt in list1:
        bag[elt] += 1
    result = []
    for elt in list2:
        if elt in bag:
            # remove elt from bag, making sure
            # that bag counts are kept positive
            if bag[elt] == 1:
                del bag[elt]
            else:
                bag[elt] -= 1
            result.append(elt)
    return result
For both these solutions, the number of occurrences of any given element x in the output list is the minimum of the numbers of occurrences of x in the two input lists. It's not clear from your question whether this is the behavior that you want.
Using sets is the most efficient, but you could always do r = [i for i in l1 if i in l2].
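One caveat with the comprehension: it keeps every element of l1 that occurs anywhere in l2, so a value repeated in l1 can be counted more often than it appears in l2. The sketch below contrasts it with a variant that uses up each match; common_elements is a hypothetical helper name and the lists are illustrative.

```python
from collections import Counter


def common_elements(l1, l2):
    """Elements of l1 that can be matched one-to-one with elements of l2.

    Hypothetical helper; the result follows the order of l1.
    """
    remaining = Counter(l2)        # how many of each value are still unmatched
    result = []
    for item in l1:
        if remaining[item] > 0:    # a missing key reads as 0 in a Counter
            remaining[item] -= 1
            result.append(item)
    return result


l1 = [1, 1, 1, 2]
l2 = [1, 1, 3, 3]

membership_only = [i for i in l1 if i in l2]   # the third 1 is kept as well
matched = common_elements(l1, l2)              # each 1 in l2 is used once

print(membership_only, matched)
```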
SilentGhost, Mark Dickinson and Lo'oris are right. Thanks very much for reporting this problem. I need the common part of the lists, so for:
a=[1,1,1,2]
b=[1,1,3,3]
the result should be [1,1]
Sorry for commenting in an unsuitable place; I registered today.
I modified your solutions:
def count_common(l1,l2):
    l2_copy=list(l2)
    counter=0
    for i in l1:
        if i in l2_copy:
            counter+=1
            l2_copy.remove(i)
    return counter
l1=[1,1,1]
l2=[1,2]
print(count_common(l1,l2))
1
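The same count can probably be obtained without the repeated list scans by intersecting two Counters and summing the result. This is a sketch with a hypothetical function name, assuming the list elements are hashable.

```python
from collections import Counter


def count_common_counter(l1, l2):
    """Count elements shared by both lists, respecting duplicates.

    Hypothetical alternative to the list-based count_common.
    """
    # & keeps, for each value, the smaller of its two counts.
    return sum((Counter(l1) & Counter(l2)).values())


print(count_common_counter([1, 1, 1], [1, 2]))
print(count_common_counter([2, 2, 1, 1], [1, 1, 3, 3]))
```

The list-based version calls `in` and remove() on l2_copy for every element of l1, which is quadratic in the worst case; the Counter version makes one pass over each list.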
