Generate only certain combinations from a list in Python - python

I have used the following script to generate all combinations:
import itertools
x = 7
my_list = range(1, 11)
for k in [x]:
for sublist in itertools.combinations(my_list, k):
print sublist
For the second part I will take 6 random elements from range(1, 11). Let's call them my_second_list.
I need to generate the minimum number of combinations of my_list in order to obtain at least one combination to include let's say 5 elements from my_second_list.
Any ideas on how to do that?

import itertools
x = 7
my_list = range(1,11)
for k in [x]: # isn't this just 7?
your_output = (combination for combination in itertools.combinations(my_list,k) if all(element in combination for element in [1,2,3,4,5]))
It's ugly as heck, but that's how I'd do it (if I'm understanding your question correctly. You're trying to get only those combinations that contain a certain subset of items, right? If you want all combinations BEFORE and INCLUDING that first combination that contains the subset of items, I'd do:
accumulator = list()
subset = [1,2,3,4,5] # or whatever
for k in [x]:
for combination in itertools.combinations(my_list,k):
accumulator.append(combination)
if all(el in combination for el in subset):
break
Depending on your exact use case you may want to consider defining subset as a set (e.g. {1,2,3,4,5}) and do subset.issubset(set(combination)) but it's hard to tell if that's better or not without doing some profiling.

Related

Grouping a grouped list of str without duplicates

I have a grouped list of strings that sort of looks like this, the lists inside of these groups will always contain 5 elements:
text_list = [['aaa','bbb','ccc','ddd','eee'],
['fff','ggg','hhh','iii','jjj'],
['xxx','mmm','ccc','bbb','aaa'],
['fff','xxx','aaa','bbb','ddd'],
['aaa','bbb','ccc','ddd','eee'],
['fff','xxx','aaa','ddd','eee'],
['iii','xxx','ggg','jjj','aaa']]
The objective is simple, group all of the list that is similar by the first 3 elements that is then compared against all of the elements inside of the other groups.
So from the above example the output might look like this (output is the index of the list):
[[0,2,4],[3,5]]
Notice how if there is another list that contains the same elements but in a different order is removed.
I've written the following code to extract the groups but they would return duplicates and I am unsure how to proceed. I also think this might not be the most efficient way to do the extraction as the real list can contain upwards to millions of groups:
grouped_list = []
for i in range(0,len(text_list)):
int_temp = []
for m in range(0,len(text_list)):
if i == m:
continue
bool_check = all( x in text_list[m] for x in text_list[i][0:3])
if bool_check:
if len(int_temp) == 0:
int_temp.append(i)
int_temp.append(m)
continue
int_temp.append(m)
grouped_list.append(int_temp)
## remove index with no groups
grouped_list = [x for x in grouped_list if x != []]
Is there a better way to go about this? How do I remove the duplicate group afterwards? Thank you.
Edit:
To be clearer, I would like to retrieve the lists that is similar to each other but only using the first 3 elements of the other lists. For example, using the first 3 elements from list A, check if list B,C,D... contains all 3 of the elements from list A. Repeat for the entire list then remove any list that contains duplicate elements.
You can build a set of frozensets to keep track of indices of groups with the first 3 items being a subset of the rest of the members:
groups = set()
sets = list(map(set, text_list))
for i, lst in enumerate(text_list):
groups.add(frozenset((i, *(j for j, s in enumerate(sets) if set(lst[:3]) <= s))))
print([sorted(group) for group in groups if len(group) > 1])
If the input list is long, it would be faster to create a set of frozensets of the first 3 items of all sub-lists and use the set to filter all combinations of 3 items from each sub-list, so that the time complexity is essentially linear to the input list rather than quadratic despite the overhead in generating combinations:
from itertools import combinations
sets = {frozenset(lst[:3]) for lst in text_list}
groups = {}
for i, lst in enumerate(text_list):
for c in map(frozenset, combinations(lst, 3)):
if c in sets:
groups.setdefault(c, []).append(i)
print([sorted(group) for group in groups.values() if len(group) > 1])

All possible combination of lists for a given list

I would like to have all possible combinations of lists of a given size.
Lets say for example, I have a given list K, such as K = [3, 5, 2]. With the following code I get what I desire. But how do I generalize this for any list of a given size, without just adding for loops?
assignment= []
for l in range(K[0]):
for m in range(K[1]):
for n in range(K[2]):
a = [l,m,n]
assignment.append(a)
The itertools library can often help with things like this. itertools.product() in particular, using * to pass it a list or iterator of ranges:
list(itertools.product(*(range(n) for n in K))
or
list(itertools.product(*map(range, K)))

Check number not a sum of 2 ints on a list

Given a list of integers, I want to check a second list and remove from the first only those which can not be made from the sum of two numbers from the second. So given a = [3,19,20] and b = [1,2,17], I'd want [3,19].
Seems like a a cinch with two nested loops - except that I've gotten stuck with break and continue commands.
Here's what I have:
def myFunction(list_a, list_b):
for i in list_a:
for a in list_b:
for b in list_b:
if a + b == i:
break
else:
continue
break
else:
continue
list_a.remove(i)
return list_a
I know what I need to do, just the syntax seems unnecessarily confusing. Can someone show me an easier way? TIA!
You can do like this,
In [13]: from itertools import combinations
In [15]: [item for item in a if item in [sum(i) for i in combinations(b,2)]]
Out[15]: [3, 19]
combinations will give all possible combinations in b and get the list of sum. And just check the value is present in a
Edit
If you don't want to use the itertools wrote a function for it. Like this,
def comb(s):
for i, v1 in enumerate(s):
for j in range(i+1, len(s)):
yield [v1, s[j]]
result = [item for item in a if item in [sum(i) for i in comb(b)]]
Comments on code:
It's very dangerous to delete elements from a list while iterating over it. Perhaps you could append items you want to keep to a new list, and return that.
Your current algorithm is O(nm^2), where n is the size of list_a, and m is the size of list_b. This is pretty inefficient, but a good start to the problem.
Thee's also a lot of unnecessary continue and break statements, which can lead to complicated code that is hard to debug.
You also put everything into one function. If you split up each task into different functions, such as dedicating one function to finding pairs, and one for checking each item in list_a against list_b. This is a way of splitting problems into smaller problems, and using them to solve the bigger problem.
Overall I think your function is doing too much, and the logic could be condensed into much simpler code by breaking down the problem.
Another approach:
Since I found this task interesting, I decided to try it myself. My outlined approach is illustrated below.
1. You can first check if a list has a pair of a given sum in O(n) time using hashing:
def check_pairs(lst, sums):
lookup = set()
for x in lst:
current = sums - x
if current in lookup:
return True
lookup.add(x)
return False
2. Then you could use this function to check if any any pair in list_b is equal to the sum of numbers iterated in list_a:
def remove_first_sum(list_a, list_b):
new_list_a = []
for x in list_a:
check = check_pairs(list_b, x)
if check:
new_list_a.append(x)
return new_list_a
Which keeps numbers in list_a that contribute to a sum of two numbers in list_b.
3. The above can also be written with a list comprehension:
def remove_first_sum(list_a, list_b):
return [x for x in list_a if check_pairs(list_b, x)]
Both of which works as follows:
>>> remove_first_sum([3,19,20], [1,2,17])
[3, 19]
>>> remove_first_sum([3,19,20,18], [1,2,17])
[3, 19, 18]
>>> remove_first_sum([1,2,5,6],[2,3,4])
[5, 6]
Note: Overall the algorithm above is O(n) time complexity, which doesn't require anything too complicated. However, this also leads to O(n) extra auxiliary space, because a set is kept to record what items have been seen.
You can do it by first creating all possible sum combinations, then filtering out elements which don't belong to that combination list
Define the input lists
>>> a = [3,19,20]
>>> b = [1,2,17]
Next we will define all possible combinations of sum of two elements
>>> y = [i+j for k,j in enumerate(b) for i in b[k+1:]]
Next we will apply a function to every element of list a and check if it is present in above calculated list. map function can be use with an if/else clause. map will yield None in case of else clause is successful. To cater for this we can filter the list to remove None values
>>> list(filter(None, map(lambda x: x if x in y else None,a)))
The above operation will output:
>>> [3,19]
You can also write a one-line by combining all these lines into one, but I don't recommend this.
you can try something like that:
a = [3,19,20]
b= [1,2,17,5]
n_m_s=[]
data=[n_m_s.append(i+j) for i in b for j in b if i+j in a]
print(set(n_m_s))
print("after remove")
final_data=[]
for j,i in enumerate(a):
if i not in n_m_s:
final_data.append(i)
print(final_data)
output:
{19, 3}
after remove
[20]

Converting combinations of integers into list of sums of these combinations

I need to do this: I have list of integers, i make all possible combinations containing 3 of these numbers, and output is list containing sums of these combinations. (not sum of all combinations together, but for each combination one sum)
My algorithm do it in this way: make list of all possible combinations, then count sum of each combination and save it into list, what is too uneffective if number of integers is large. Here is little sample in python:
import itertools
array=[1,2,3,4]
combs=[]
els = [list(x) for x in itertools.combinations(array, 3)]
combs.extend(els)
result=list(map(sum, combs))
Could you please come up with some more effective solution? Maybe if there is some way of making it a loop where there isn't made a list of combinations at first and counted sum of each combination at second but rather it straight counts sum of just created combination, saves it into list and then continues to another one until all combinations and sums are done.
sum() works with any iterable, not just lists. You could make the code more efficient by applying it directly to the combinations' tuples:
result = [sum(c) for c in itertools.combinations(array, 3)]
If you want to process the result lazily, you can use a generator expression as well:
result = (sum(c) for c in itertools.combinations(array, 3))

How to Create Combination of Element in Different Set?

Let say that I have n lists and they are not disjoint. I want to make every combination of n elements which I get one from every lists I have but in that combination there are different elements and there are no double combination. So, [1,1,2] isn't allowed and [1,2,3] is same as [2,1,3].
For example, I have A=[1,2,3], B=[2,4,1], and C=[1,5,3]. So, the output that I want is [[1,2,5],[1,2,3],[1,4,5],[1,4,3],[2,4,1],[2,4,5],[2,4,3],[3,2,5],[3,4,5],[3,1,5]].
I have search google and I think function product in module itertools can do it. But, I have no idea how to make no same elements in every combinations and no double combinations.
Maybe something like:
from itertools import product
A=[1,2,3]
B=[2,4,1]
C=[1,5,3]
L = list(set([ tuple(sorted(l)) for l in product(A,B,C) if len(set(l))==3 ]))
Of course you would have to change 3 ot the relevant value if you work with more than 3 lists.
how about this? create a dicitonary with the sorted permutations as key. accept values only if all the three integers are different:
from itertools import product
A=[1,2,3]
B=[2,4,1]
C=[1,5,3]
LEN = 3
dct = {tuple(sorted(item)): item for item in product(A,B,C)
if len(set(item)) == LEN}
print(dct)
vals = list(dct.values())
print(vals)

Categories