Given a dataset I want to build all combinations (still using itertools.combinations).
To reduce the number of combinations n!/r!/(n-r)! I want to omit pairs included in other combinations.
For illustration a small example. All combinations(range(9), 3) looks like
(0,1,2), (0,1,3), (0,1,4),... (0,1,8),... (0,7,8), (1,2,3),...
This gives pair (0,1) being part of 7 tuples. Also for all other tuples.
Full wanted output for range(9), 3:
(0, 1, 2)
(0, 3, 4)
(0, 5, 6)
(0, 7, 8)
(1, 3, 6)
(1, 4, 7)
(1, 5, 8)
(2, 3, 8)
(2, 4, 5)
(2, 6, 7)
(3, 5, 7)
(4, 6, 8)
Building tuples of length r given a range of n elements uses (n-1)*n/2 pairs and should provide (n-1)*n/(r-1)/r tuples.
How to build the output?
How to name this "combinations with pair omitting" thing scientifically?
For tuples of length 3
from numpy import roll
from pprint import pprint
def a(s):
if len(s)<3:return
s=list(s)
z=s.pop(0)
s1,s2=s[0::2],s[1::2]
l=[]
for a,b in zip(s1,s2):l+=[(z,a,b)]
z1,z2=s1.pop(0),s2.pop(0)
if(len(s1)>1 and len(s2)>1):
for a,b in sorted(map(sorted,zip(s1,roll(s2,-1)))):l+=[(z1,a,b)]
if(len(s1)>2 and len(s2)>2):
for a,b in sorted(map(sorted,zip(s1,roll(s2,1)))):l+=[(z2,a,b)]
if(len(s1)>3):l+=[a(s1)]
if(len(s2)>3):l+=[a(s2)]
return l
s = range(9)
pprint(a(s))
Using recursion we could build for various lengths
Related
I've created a cartesian product and printed from the following code:
A = [0, 3, 5]
B = [5, 10, 15]
C = product(A, B)
for n in C:
print(n)
And we have a result of
(0, 5)
(0, 10)
(0, 15)
(3, 5)
(3, 10)
(3, 15)
(5, 5)
(5, 10)
(5, 15)
Is it possible to check the sum of each set within the cartesian product to a target value of 10? Can we then return or set aside the sets that match?
Result should be:
(0, 10)
(5, 5)
Finally, how can I count the frequency of numbers in my resulting subset to show that 0 appeared once, 10 appeared once, and 5 appeared twice? I would appreciate any feedback or guidance.
You can use a list comprehension like this:
C = [p for p in product(A, B) if sum(p) == 10]
And to count the frequency of the numbers in the tuples, as #Amadan suggests, you can use collections.Counter after using itertools.chain.from_iterable to chain the sub-lists into one sequence:
from collections import Counter
from itertools import chain
Counter(chain.from_iterable(C))
which returns, given your sample input:
Counter({5: 2, 0: 1, 10: 1})
which is an instance of a dict subclass, so you can iterate over its items for output:
for n, freq in Counter({5: 2, 0: 1, 10: 1}).items():
print('{}: {}'.format(n, freq))
I have list with repeated elements, for example array = [2,2,2,7].
If I use the solution suggested in this answer (using itertools.combinations()), I get:
()
(7,)
(2,)
(2,)
(2,)
(7, 2)
(7, 2)
(7, 2)
(2, 2)
(2, 2)
(2, 2)
(7, 2, 2)
(7, 2, 2)
(7, 2, 2)
(2, 2, 2)
(7, 2, 2, 2)
As you can see some of the 'combinations' are repeated, e.g. (7,2,2) appears 3 times.
The output I would like is:
()
(7,)
(2,)
(7, 2)
(2, 2)
(7, 2, 2)
(2, 2, 2)
(7, 2, 2, 2)
I could check the output for repeated combinations but I don't feel like that is the best solution to this problem.
You can take the set of the combinations and then chain them together.
from itertools import chain, combinations
arr = [2, 2, 2, 7]
list(chain.from_iterable(set(combinations(arr, i)) for i in range(len(arr) + 1)))
# [(), (7,), (2,), (2, 7), (2, 2), (2, 2, 2), (2, 2, 7), (2, 2, 2, 7)]
You would need to maintain a set of tuples that are sorted in the same fashion:
import itertools as it
desired=set([(),(7,),(2,),(7, 2),(2, 2),(7, 2, 2),(2, 2, 2),(7, 2, 2, 2)])
result=set()
for i in range(len(array)+1):
for combo in it.combinations(array, i):
result.add(tuple(sorted(combo, reverse=True)))
>>> result==desired
True
Without using itertools.combinations() and set's:
from collections import Counter
import itertools
def powerset(bag):
for v in itertools.product(*(range(r + 1) for r in bag.values())):
yield Counter(zip(bag.keys(), v))
array = [2, 2, 2, 7]
for s in powerset(Counter(array)):
# Convert `Counter` object back to a list
s = list(itertools.chain.from_iterable(itertools.repeat(*mv) for mv in s))
print(s)
I believe your problem could alternatively be stated as finding the power set of a multiset, at least according to this definition.
However it's worth noting that the method shown above will be slower than the solutions in other answers such as this one which simply group the results from itertools.combinations() into a set to remove duplicates, despite being seemingly less efficient, it is still faster in practice as iterating in Python is much slower than in C (see itertoolsmodule.c for the implementation of itertools.combinations()).
Through my limited testing, the method shown in this answer will outperform the previously cited method when there are approximately 14 distinct elements in your array, each with an average multiplicity of 2 (at which point the other method begins to pull away and run many times slower), however the running time for either method under those circumstances are >30 seconds, so if performance is of concern, then you might want to consider implementing this part of your application in C.
So the problem is essentially this: I have a list of tuples made up of n ints that have to be eliminated if they dont fit certain criteria. This criterion boils down to that each element of the tuple must be equal to or less than the corresponding int of another list (lets call this list f) in the exact position.
So, an example:
Assuming I have a list of tuples called wk, made up of tuples of ints of length 3, and a list f composed of 3 ints. Like so:
wk = [(1,3,8),(8,9,1),(1,1,1)]
f = [2,5,8]
=== After applying the function ===
wk_result = [(1,3,8),(1,1,1)]
The rationale would be that when looking at the first tuple of wk ((1,3,8)), the first element of it is smaller than the first element of f. The second element of wk also complies with the rule, and the same applies for the third. This does not apply for the second tuple tho given that the first and second element (8 and 9) are bigger than the first and second elements of f (2 and 5).
Here's the code I have:
for i,z in enumerate(wk):
for j,k in enumerate(z):
if k <= f[j]:
pass
else:
del wk[i]
When I run this it is not eliminating the tuples from wk. What could I be doing wrong?
EDIT
One of the answers provided by user #James actually made it a whole lot simpler to do what I need to do:
[t for t in wk if t<=tuple(f)]
#returns:
[(1, 3, 8), (1, 1, 1)]
The thing is in my particular case it is not getting the job done, so I assume it might have to do with the previous steps of the process which I will post below:
max_i = max(f)
siz = len(f)
flist = [i for i in range(1,max_i +1)]
def cartesian_power(seq, p):
if p == 0:
return [()]
else:
result = []
for x1 in seq:
for x2 in cartesian_power(seq, p - 1):
result.append((x1,) + x2)
return result
wk = cartesian_power(flist, siz)
wk = [i for i in wk if i <= tuple(f) and max(i) == max_i]
What is happening is the following: I cannot use the itertools library to do permutations, that is why I am using a function that gets the job done. Once I produce a list of tuples (wk) with all possible permutations, I filter this list using two parameters: the one that brought me here originally and another one not relevant for the discussion.
Ill show an example of the results with numbers, given f = [2,5,8]:
[(1, 1, 8), (1, 2, 8), (1, 3, 8), (1, 4, 8), (1, 5, 8), (1, 6, 8), (1, 7, 8), (1, 8, 1), (1, 8, 2), (1, 8, 3), (1, 8, 4), (1, 8, 5), (1, 8, 6), (1, 8, 7), (1, 8, 8), (2, 1, 8), (2, 2, 8), (2, 3, 8), (2, 4, 8), (2, 5, 8)]
As you can see, there are instances where the ints in the tuple are bigger than the corresponding position in the f list, like (1,6,8) where the second position of the tuple (6) is bigger than the number in the second position of f (5).
You can use list comprehension with a (short-circuiting) predicate over each tuple zipped with the list f.
wk = [(1, 3, 8), (8, 9, 1), (1, 1, 1), (1, 9, 1)]
f = [2, 5, 8] # In this contrived example, f could preferably be a 3-tuple as well.
filtered = [t for t in wk if all(a <= b for (a, b) in zip(t, f))]
print(filtered) # [(1, 3, 8), (1, 1, 1)]
Here, all() has been used to specify a predicate that all tuple members must be less or equal to the corresponding element in the list f; all() will short-circuit its testing of a tuple as soon as one of its members does not pass the tuple member/list member <= sub-predicate.
Note that I added a (1, 9, 1) tuple for an example where the first tuple element passes the sub-predicate (<= corresponding element in f) whereas the 2nd tuple element does not (9 > 5).
You can do this with a list comprehension. It iterates over the list of tuples and checks that all of the elements of the tuple are less than or equal to the corresponding elements in f. You can compare tuples directly for element-wise inequality
[t for t in wk if all(x<=y for x,y in zip(t,f)]
# returns:
[(1, 3, 8), (1, 1, 1)]
Here is without loop solution which will compare each element in tuple :
wk_1 = [(1,3,8),(8,9,1),(1,1,1)]
f = [2,5,8]
final_input=[]
def comparison(wk, target):
if not wk:
return 0
else:
data=wk[0]
if data[0]<=target[0] and data[1]<=target[1] and data[2]<=target[2]:
final_input.append(data)
comparison(wk[1:],target)
comparison(wk_1,f)
print(final_input)
output:
[(1, 3, 8), (1, 1, 1)]
P.S : since i don't know you want less and equal or only less condition so modify it according to your need.
I'm trying to find duplicates in a list. I want to preserve the values and insert them into a tuple with their number of occurrences.
For example:
list_of_n = [2, 3, 5, 5, 5, 6, 2]
occurance_of_n = zip(set(list_of_n), [list_of_n.count(n) for n in set(list_of_n)])
[(2, 2), (3, 1), (5, 3), (6, 1)]
This works fine with small sets. My question is: as list_of_n gets larger, will I have to worry about arg1 and arg2 in zip(arg1, arg2) not lining up correctly if they're the same set?
I.e. Is there a conceivable future where I call zip() and it accidentally aligns index [0] of list_of_n in arg1 with some other index of list_of_n in arg2?
(in case it's not clear, I'm converting the list to a set for purposes of speed in arg2, and under the pretense that zip will behave better if they're the same in arg1)
Since your sample output preserves the order of appearance, you might want to go with a collections.OrderedDict to gather the counts:
list_of_n = [2, 3, 5, 5, 5, 6, 2]
d = OrderedDict()
for x in list_of_n:
d[x] = d.get(x, 0) + 1
occurance_of_n = list(d.items())
# [(2, 2), (3, 1), (5, 3), (6, 1)]
If order does not matter, the appropriate approach is using a collections.Counter:
occurance_of_n = list(Counter(list_of_n).items())
Note that both approach require only one iteration of the list. Your version could be amended to sth like:
occurance_of_n = list(set((n, list_of_n.count(n)) for n in set(list_of_n)))
# [(6, 1), (3, 1), (5, 3), (2, 2)]
but the repeated calls to list.count make an entire iteration of the initial list for each (unique) element.
I am having trouble finding a way to do this in a Pythonic way. I assume I can use itertools somehow because I've done something similar before but can't remember what I did.
I am trying to generate all non-decreasing lists of length L where each element can take on a value between 1 and N. For example if L=3 and N=3 then [1,1,1],[1,1,2],[1,1,3],[1,2,2],[1,2,3], etc.
You can do this using itertools.combinations_with_replacement:
>>> L, N = 3,3
>>> cc = combinations_with_replacement(range(1, N+1), L)
>>> for c in cc: print(c)
(1, 1, 1)
(1, 1, 2)
(1, 1, 3)
(1, 2, 2)
(1, 2, 3)
(1, 3, 3)
(2, 2, 2)
(2, 2, 3)
(2, 3, 3)
(3, 3, 3)
This works because c_w_r preserves the order of the input, and since we're passing a nondecreasing sequence in, we only get nondecreasing tuples out.
(It's easy to convert to lists if you really need those as opposed to tuples.)