Remove element from itertools.combinations while iterating? - python

Given a list l and all combinations of its elements, is it possible to remove every combination containing x while iterating over the combinations, so that no combination containing x is considered after it has been removed?
for a, b in itertools.combinations(l, 2):
    if some_function(a, b):
        remove_any_tup_with_a_or_b(a, b)
My list l is pretty big so I don't want to keep the combinations in memory.

A cheap trick is to filter with a disjointness test against a dynamically updated set of excluded values. This does not avoid generating the combinations you want to exclude, so it is not a major performance win. Still, filtering with a C built-in such as isdisjoint is typically faster than Python-level if checks with continue statements, because the filter work is pushed to the C layer:
from future_builtins import filter  # Only on Py2 (omit on Py3), for generator-based filter
import itertools
blacklist = set()
for a, b in filter(blacklist.isdisjoint, itertools.combinations(l, 2)):
    if some_function(a, b):
        blacklist.update((a, b))
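
For reference, here is a self-contained Python 3 sketch of the same filtering idea, where the built-in filter is already lazy. The helper name visit_pairs and the toy predicate are assumptions made for illustration, and the elements must be hashable for the set to work.

```python
import itertools

def visit_pairs(items, some_function):
    """Iterate over 2-combinations, skipping pairs that touch an excluded value."""
    blacklist = set()   # values excluded so far
    visited = []        # pairs that actually reached the loop body
    # filter is lazy on Python 3, so each pair is tested against the
    # blacklist as it stands at the moment the pair is produced.
    for a, b in filter(blacklist.isdisjoint, itertools.combinations(items, 2)):
        visited.append((a, b))
        if some_function(a, b):
            blacklist.update((a, b))
    return visited, blacklist

# Toy predicate (an assumption): exclude both values when they sum to 3.
visited, blacklist = visit_pairs([1, 2, 3, 4], lambda a, b: a + b == 3)
print(visited)    # [(1, 2), (3, 4)]
print(blacklist)  # {1, 2}
```

All six pairs are still generated by itertools.combinations; only the loop body is skipped for the four pairs that contain 1 or 2.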

If you want to remove all tuples containing the number x from itertools.combinations(l, 2), note that there is a one-to-one mapping (mathematically speaking) from the set itertools.combinations([i for i in range(1, len(l))], 2) to the combinations in itertools.combinations(l, 2) that don't contain x.
Example:
The combinations from itertools.combinations([1,2,3,4], 2) that don't contain the number 1 are [(2, 3), (2, 4), (3, 4)]. Notice that this list has as many elements as the list of combinations itertools.combinations([1,2,3], 2)=[(1, 2), (1, 3), (2, 3)].
Since order doesn't matter in combinations, you can map 1 to 4 in [(1, 2), (1, 3), (2, 3)] to get [(1, 2), (1, 3), (2, 3)]=[(4, 2), (4, 3), (2, 3)]=[(2, 4), (3, 4), (2, 3)]=[(2, 3), (2, 4), (3, 4)].
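
One practical reading of this mapping, when x is known before the iteration starts, is to build the combinations from the list with x already removed. The sketch below assumes that situation; the function name combinations_without is made up for illustration.

```python
import itertools

def combinations_without(items, x, r=2):
    """Combinations of items that never contain x, built from the reduced list."""
    reduced = [v for v in items if v != x]
    return itertools.combinations(reduced, r)

l = [1, 2, 3, 4]
direct = list(combinations_without(l, 1))
# Cross-check against generating everything and filtering afterwards.
filtered = [c for c in itertools.combinations(l, 2) if 1 not in c]
print(direct)              # [(2, 3), (2, 4), (3, 4)]
print(direct == filtered)  # True
```

This never generates the unwanted tuples, but it does not cover the case where x only becomes known in the middle of the iteration.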

Related

How do I make "rows" consisting of pairs from a list of objects that is sorted based on their attributes

I have created a class with attributes and sorted its instances by their level of x, from 1 to 6. I then want to arrange the list into pairs, where the object with the highest level of "x" is paired with the object with the lowest level of "x", the second highest with the second lowest, and so on. If I had my way it would look like this, even though objects are not iterable.
for objects in sortedlist:
    i = 0
    row(i) = [[sortedlist[i], list[-(i)-1]]
    i += 1
    if i => len(sortedlist)
        break
Using zip
I think the code you want is:
rows = list(zip(sortedList, reversed(sortedList)))
However, note that this "duplicates" the elements, because every pair appears twice (once in each order):
>>> sortedList = [1, 2, 3, 4, 5]
>>> list(zip(sortedList, reversed(sortedList)))
[(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
If you know that the list has an even number of elements and want to avoid duplicates, you can instead write:
rows = list(zip(sortedList[:len(sortedList)//2], reversed(sortedList[len(sortedList)//2:])))
With the following result:
>>> sortedList = [1,2,3,4,5,6]
>>> list(zip(sortedList[:len(sortedList)//2], reversed(sortedList[len(sortedList)//2:])))
[(1, 6), (2, 5), (3, 4)]
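
If the length may be odd, a small variation is to zip only the first half against the reversed list, which leaves the middle element unpaired. This is a sketch under that assumption; the helper name pair_extremes is hypothetical.

```python
def pair_extremes(sorted_items):
    """Pair lowest with highest, second lowest with second highest, and so on."""
    half = len(sorted_items) // 2
    # zip stops at the shorter input, so only `half` pairs are produced.
    return list(zip(sorted_items[:half], reversed(sorted_items)))

print(pair_extremes([1, 2, 3, 4, 5, 6]))  # [(1, 6), (2, 5), (3, 4)]
print(pair_extremes([1, 2, 3, 4, 5]))     # [(1, 5), (2, 4)]
```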
Using loops
Although I recommend using zip rather than a for-loop, here is how to fix the loop you wrote:
rows = []
for i in range(len(sortedList)):
    rows.append((sortedList[i], sortedList[-i-1]))
With result:
>>> sortedList=[1,2,3,4,5]
>>> rows = []
>>> for i in range(len(sortedList)):
...     rows.append((sortedList[i], sortedList[-i-1]))
...
>>> rows
[(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]

Problem in getting unique elements from list of tuples

My sample input is a=[(1,2),(2,3),(1,1),(2,1)], and the expected output is a=[(1,2),(2,3),(1,1)].
Here, (2,1) is removed because the equivalent pair (1,2) is already present. I tried the code below to remove duplicate pairs:
map(tuple, set(frozenset(x) for x in a))
However, the output is [(1, 2), (2, 3), (1,)]. How can I get the (1,1) pair as (1,1) instead of (1,)?
You can use a dict instead of a set to map the frozensets to the original tuple values. Build the dict in reversed order of the list so that, among duplicates, the tuple closest to the front takes precedence:
{frozenset(x): x for x in reversed(a)}.values()
This returns the following values (on Python 3, wrap the call in list(); the order may differ):
[(1, 2), (2, 3), (1, 1)]
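
If the order of first appearance matters, a plain loop with a set of already seen frozensets keeps it explicitly. This is a minimal sketch; the function name unique_unordered_pairs is an assumption, not part of the answer above.

```python
def unique_unordered_pairs(pairs):
    """Keep the first occurrence of each pair, treating (a, b) and (b, a) as equal."""
    seen = set()
    result = []
    for pair in pairs:
        key = frozenset(pair)   # (1, 2) and (2, 1) share the same key
        if key not in seen:
            seen.add(key)
            result.append(pair)  # store the original tuple, so (1, 1) stays intact
    return result

a = [(1, 2), (2, 3), (1, 1), (2, 1)]
print(unique_unordered_pairs(a))  # [(1, 2), (2, 3), (1, 1)]
```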
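Another approach normalizes each tuple with sorted before deduplicating. Note that this yields the sorted form of each pair, which is not necessarily the original tuple.
Example: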
a=[(1,2),(2,3),(1,1),(2,1)]
print set([tuple(sorted(i)) for i in a])
Output:
set([(1, 2), (2, 3), (1, 1)])

Python 3 - Reducing a list by cyclically reconstructing a given set

I am looking for an algorithm that reduces a list of tuples by cyclically reconstructing a given set as a pattern.
Each tuple contains an id and a set, like (1, {'xy'}).
Example
query = {'xyz'}
my_dict = [(1, {'x'}), (2, {'yx'}), (3, {'yz'}),
           (4, {'z'}), (5, {'x'}), (6, {'y'}),
           (7, {'xyz'}), (8, {'xy'}), (9, {'x'}),]
The goal is to recreate the pattern xyz as often as possible from the second values of the tuples in my_dict. Remaining elements from which the query set cannot be completely reconstructed shall be cut off, hence 'reduce'.
my_dict contains in total: 6 times x, 5 times y, 3 times z.
Given my_dict, valid solutions would be for example:
result_1 = [(7, {'xyz'}), (8, {'xy'}), (4, {'z'}), (1, {'x'}), (3, {'yz'})]
result_2 = [(7, {'xyz'}), (2, {'yx'}), (4, {'z'}), (1, {'x'}), (3, {'yz'})]
result_3 = [(7, {'xyz'}), (9, {'x'}), (6, {'y'}), (4, {'z'}), (1, {'x'}), (3, {'yz'})]
The order of the tuples in the list is NOT important; I sorted them in the order of the query pattern xyz for the purpose of illustration.
Goal
The goal is to have a list of tuples in which the occurrences of the elements from the query set are distributed as evenly as possible.
result_1, result_2 and result_3 all contain in total: 3 times x, 3 times y, 3 times z.
Does anyone know an approach for doing this?
Thanks for your help!
Depending on your application context, a naive brute-force approach might be enough. Using the powerset function from this SO answer:
def find_solutions(query, supply):
    for subset in powerset(supply):
        if is_solution(query, subset):
            yield subset
You would need to implement a function is_solution(query, subset) that returns True when the given subset of the supply (the tuples in my_dict) is a valid solution for the given query.
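
To make the brute-force idea concrete, here is one possible sketch. The powerset function follows the well-known itertools recipe; letter_counts and the validity criterion inside is_solution (every query letter occurs equally often, at least once, and no other letter occurs) are assumptions about what the question means, not something the answer specifies.

```python
from collections import Counter
from itertools import chain, combinations

def powerset(iterable):
    """itertools recipe: every subset, from the empty one up to the full set."""
    items = list(iterable)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def letter_counts(subset):
    """Count each character across the sets of the chosen (id, set) tuples."""
    counts = Counter()
    for _id, strings in subset:
        for s in strings:
            counts.update(s)
    return counts

def is_solution(query, subset):
    """Assumed criterion: all query letters present, equally often, nothing else."""
    letters = set(''.join(query))
    counts = letter_counts(subset)
    return set(counts) == letters and len(set(counts.values())) == 1

def find_solutions(query, supply):
    for subset in powerset(supply):
        if is_solution(query, subset):
            yield subset

query = {'xyz'}
my_dict = [(1, {'x'}), (2, {'yx'}), (3, {'yz'}),
           (4, {'z'}), (5, {'x'}), (6, {'y'}),
           (7, {'xyz'}), (8, {'xy'}), (9, {'x'})]

# Pick a solution that rebuilds the pattern as often as possible.
best = max(find_solutions(query, my_dict), key=lambda s: letter_counts(s)['x'])
print(letter_counts(best)['x'])  # 3
```

With 9 tuples there are only 512 subsets to test, but the cost doubles with every additional tuple, so this only suits small inputs.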

How to select increasing elements of a list of tuples?

I have the following list of tuples:
a = [(1, 2), (2, 4), (3, 1), (4, 4), (5, 2), (6, 8), (7, -1)]
I would like to select the elements whose second value is greater than the second value of the previous tuple. For example, I would select (2, 4) because 4 is greater than 2; in the same manner I would select (4, 4) and (6, 8).
Can this be done in a more elegant way than a loop starting explicitly on the second element?
To clarify, I want to select the tuples whose second element is greater than the second element of the prior tuple.
>>> [right for left, right in pairwise(a) if right[1] > left[1]]
[(2, 4), (4, 4), (6, 8)]
Here pairwise is an itertools recipe (linked from the original answer); since Python 3.10 it is also available as itertools.pairwise.
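
For completeness, the classic pairwise recipe from the itertools documentation looks roughly like this, shown here together with the comprehension from the answer so the snippet runs on its own:

```python
from itertools import tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    first, second = tee(iterable)
    next(second, None)  # advance the second copy by one element
    return zip(first, second)

a = [(1, 2), (2, 4), (3, 1), (4, 4), (5, 2), (6, 8), (7, -1)]
increasing = [right for left, right in pairwise(a) if right[1] > left[1]]
print(increasing)  # [(2, 4), (4, 4), (6, 8)]
```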
You can use a list comprehension to do this fairly easily:
a = [a[i] for i in range(1, len(a)) if a[i][1] > a[i-1][1]]
This uses range(1, len(a)) to start from the second element in the list, then compares the second value in each tuple with the second value from the preceding tuple to determine whether it should be in the new list.
Alternatively, you could use zip to generate pairs of neighbouring tuples:
a = [two for one, two in zip(a, a[1:]) if two[1] > one[1]]
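You can use enumerate to derive indices and then use a list comprehension. Because enumerate(a[1:]) starts counting at 0, index i of the slice corresponds to the preceding element a[i] of the original list:
a = [t for i, t in enumerate(a[1:]) if t[1] > a[i][1]]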
You can use a list comprehension:
[i for i in a if (i[0] < i[1])]
Returns
[(1, 2), (2, 4), (6, 8)]
Edit: I was incorrect in my understanding of the question. The above code will return all tuples in which the second element is greater than the first. This is not the answer to the question OP asked.

Generate ordered tuples of infinite sequences

I have two generators genA and genB and each of them generates an infinite, strictly monotonically increasing sequence of integers.
Now I need a generator that generates all tuples (a, b) such that a is produced by genA and b is produced by genB and a < b, ordered by a + b ascending. In case of ambiguity the ordering is of no importance, i.e. if a + b == c + d, it doesn't matter if it generates (a, b) first or (c, d) first.
For instance, if both genA and genB generate the prime numbers, then the new generator should generate:
(2, 3), (2, 5), (3, 5), (2, 7), (3, 7), (5, 7), (2, 11), ...
If genA and genB were finite lists, zipping and then sorting would do the trick.
Apparently, for all tuples of the form (x, b) the following holds: first(genA) <= x <= max(genA, b) <= b, where first(genA) is the first element generated by genA and max(genA, b) is the last element generated by genA that is less than b.
This is how far I have gotten. Any ideas of how to combine two generators in the described manner?
I don't think it is possible to do this without saving all the results from genA. A solution might look something like this:
import heapq
def gen_weird_sequence(genA, genB):
    heap = []
    a0 = next_a = next(genA)
    saved_a = []
    for b in genB:
        while next_a < b:
            saved_a.append(next_a)
            next_a = next(genA)
        # saved_a now contains all a < b
        for a in saved_a:
            heapq.heappush(heap, (a+b, a, b)) # decorate pair with sorting key a+b
        # (minimum sum in the next round) > b + a0, so yield everything smaller
        while heap and heap[0][0] <= b + a0:
            yield heapq.heappop(heap)[1:] # pop smallest and undecorate
Explanation: The main loop simply iterates over all elements in genB; for each b it fetches all elements from genA that are smaller than b and saves them in a list. It then generates all the tuples (a0, b), (a1, b), ..., (a_n, b) and stores them in a min-heap, which is an efficient data structure when you are only interested in extracting the minimum value of a collection. As with sorting, the trick is not to store the bare pairs but to prepend the value you want to sort on (a+b), since comparisons between tuples start by comparing the first item. Finally, it pops and yields every element whose sum is guaranteed to be smaller than the sum of any pair generated for a later b.
Note that both heap and saved_a will grow while you are generating results, I guess proportionally to the square root of the number of elements generated so far.
Quick test with some primes:
In [2]: genA = (a for a in [2,3,5,7,11,13,17,19])
In [3]: genB = (b for b in [2,3,5,7,11,13,17,19])
In [4]: for pair in gen_weird_sequence(genA, genB): print pair
(2, 3)
(2, 5)
(3, 5)
(2, 7)
(3, 7)
(5, 7)
(2, 11)
(3, 11)
(2, 13)
(3, 13)
(5, 11)
(5, 13)
(7, 11)
(2, 17)
(3, 17)
(7, 13)
as expected. Test with infinite generators:
In [11]: from itertools import *
In [12]: list(islice(gen_weird_sequence(count(), count()), 16))
Out[12]: [(0, 1), (0, 2), (0, 3), (1, 2), (0, 4), (1, 3), (0, 5), (1, 4),
          (2, 3), (0, 6), (1, 5), (2, 4), (0, 7), (1, 6), (2, 5), (3, 4)]
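
On Python 3.7 and later, a StopIteration escaping from next(genA) inside a generator is turned into a RuntimeError, so a finite genA that runs out would break the function above. The following variant is a sketch, not the original answer's code: the name ordered_pairs, the sentinel handling and the final flush of the heap are additions made here so that finite inputs also work.

```python
import heapq
from itertools import count, islice

def ordered_pairs(gen_a, gen_b):
    """Yield (a, b) with a < b in ascending order of a + b."""
    done = object()              # sentinel: gen_a is exhausted
    next_a = next(gen_a, done)
    if next_a is done:
        return
    a0 = next_a                  # smallest a, used for the lower bound on sums
    heap, saved_a = [], []
    for b in gen_b:
        while next_a is not done and next_a < b:
            saved_a.append(next_a)
            next_a = next(gen_a, done)
        for a in saved_a:        # saved_a now holds every a < b
            heapq.heappush(heap, (a + b, a, b))
        # Any later pair has a sum greater than b + a0, so these are safe to emit.
        while heap and heap[0][0] <= b + a0:
            yield heapq.heappop(heap)[1:]
    while heap:                  # gen_b is exhausted: emit the rest in sum order
        yield heapq.heappop(heap)[1:]

print(list(ordered_pairs(iter([2, 3, 5]), iter([2, 3, 5, 7]))))
# [(2, 3), (2, 5), (3, 5), (2, 7), (3, 7), (5, 7)]
print(list(islice(ordered_pairs(count(), count()), 5)))
# [(0, 1), (0, 2), (0, 3), (1, 2), (0, 4)]
```

Unlike the session above, the finite case here also emits the pairs that were still waiting in the heap when the inputs ended.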
