Find the highest sum combination with conditions from a list - python

So I have a 360 element list that I want to find the highest sum of 11 numbers combination, but with a condition.To make it a bit clearer:
1-Gets a list as input
2-Create a combination list with 11 numbers
3-Check the list for a specific condition
4-If yes, return the list's sum
I tried to use itertools.combination then check for the condition but it took so long as my list is really big.So I'm wondering if there's a way to check for the condition first rather than creating all the combinations then filtering them out.
EDIT: Guys I think you didn't get my question quite well.I want to get the list's combination first(like permutation), not just the highest 11 numbers

Why not sorted the list, descending, and then pick the first 11? If you need the indices, you can just find the numbers that you need from the original list.
import random
import pprint
original_items = random.choices(range(999), k=360)
pprint.pprint(original_items)
highest_11 = sorted(original_items, reverse=True)[:11]
pprint.pprint(highest_11)

Generally, sort the list and find the first 11 numbers that satisfy your condition.
Even if your condition is non deterministic, you can still probably reduce the runtime to linear for the search itself (and thus the runtime will depend on the condition).

Related

How to split a dataframe and select all possible pairs?

I have a dataframe that I want to separate in order to apply a certain function.
I have the fields df['beam'], df['track'], df['cycle'] and want to separate it by unique values of each of this three. Then, I want to apply this function (it works between two individual dataframes) to each pair that meets that df['track'] is different between the two. Also, the result doesn't change if you switch the order of the pair, so I'd like to not make unnecessary calls to the function if possible.
I currently work it through with four nested for loops into an if conditional, but I'm absolutely sure there's a better, cleaner way.
I'd appreciate all help!
Edit: I ended up solving it like this:
I split the original dataframe into multiple by using df.groupby()
dfsplit=df.groupby(['beam','track','cycle'])
This generates a dictionary where the keys are all the unique ['beam','track','cycle'] combinations as tuples
I combined all possible ['beam','track','cycle'] pairs with the use of itertools.combinations()
keys=list(itertools.combinations(dfsplit.keys(),2))
This generates a list of 2-element tuples where each element is one ['beam','track','cycle'] tuple itself, and it doesn't include the tuple with the order swapped, so I avoid calling the function twice for what would be the same case.
I removed the combinations where 'track' was the same through a for loop
for k in keys.copy():
if k[0][1]==k[1][1]:
keys.remove(k)
Now I can call my function by looping through the list of combinations
for k in keys:
function(dfsplit[k[0]],dfsplit[k[1]])
Step 3 is taking a long time, probably because I have a very large number of unique ['beam','track','cycle'] combinations so the list is very long, but also probably because I'm doing it sub-optimally. I'll keep the question open in case someone realizes a better way to do this last step.
EDIT 2:
Solved the problem with step 3, once again with itertools, just by doing
keys=list(itertools.filterfalse(lambda k : k[0][1]==k[1][1], keys))
itertools.filterfalse returns all elements of the list that return false to the function defined, so it's doing the same as the previous for loop but selecting the false instead of removing the true. It's very fast and I believe this solves my problem for good.
I don't know how to mark the question as solved so I'll just repeat the solution here:
dfsplit=df.groupby(['beam','track','cycle'])
keys=list(itertools.combinations(dfsplit.keys(),2))
keys=list(itertools.filterfalse(lambda k : k[0][1]==k[1][1], keys))
for k in keys:
function(dfsplit[k[0]],dfsplit[k[1]])

How can I sum pairs of numbers in a list and if the sum is equal to another element in the list, how can I remove this element from the list?

I have a list of numbers:
nums = [-3,-2,-1,1,2,3,0,0]
And I want to iterate over the list, and add two numbers at a time, and if the sum of these two numbers is an element in the list (excluding the two we just summed), then I want to remove the element. How can I do this?
For example, since -3+2= -1 and -1 is in the list, I can remove -1 from the list.
I tried to create an empty list/set to store the sums of each pair of elements, then compare the elements in nums to this list/set, but this doesn't work, as it violates the second part of what I want to do- in brackets above!
P.S. I would like to remove as many elements as possible, and one worry is that if I remove elements as I iterate the list- before finding all of the pair sums- then I might miss some valid pair sums!
Any help is much appreciated.
Although possibly inefficient (didn't care about efficiency when writing this answer), you could probably do the following:
for first in nums[:]:
if first == 0: continue
for second in nums[:]:
if second == 0: continue
sum = first + second
if sum in nums:
nums.remove(sum)
There are 2 things I will go over in this answer;
nums[:] is the slicing syntax that will just create a copy of the array, whereafter that copy will be iterated over. The syntax for this was found within this answer: https://stackoverflow.com/a/1207461/6586516
There are also checks to see if either the first or second number is equal to 0. This will satisfy the second condition (according to my logic). The only time the sum of a set of numbers can be equal to itself/one of the numbers used in the addition is when n - 1 numbers is equal to 0 (where n is the amount of numbers used in the addition)
You could try something like this:
itertools.combinations returns us with all possible combinations of a the argument. Since you want to numbers, providing 2 as the argument will return all possible combinations with length 2
Then we will just check the conditions and proceed to remove from the list.
from itertools import combinations
nums = [-3,-2,-1,1,2,3,0,0]
nums2=[i for i in combinations(nums,2)]
print(nums2)
for i in nums2:
sums=i[0]+i[1]
if sums in nums and (sums!=i[0] and sums!=i[1]):
try:
print(f"{sums} removed from list.\nNew List: {nums}")
nums.remove(sums)
except ValueError:
print("{sums} not in list")
print(nums)

Keeping count of values available from among multiple sets

I have the following situation:
I am generating n combinations of size 3 from, made from n values. Each kth combination [0...n] is pulled from a pool of values, located in the kth index of a list of n sets. Each value can appear 3 times. So if I have 10 values, then I have a list of size 10. Each index holds a set of values 0-10.
So, it seems to me that a good way to do this is to have something keeping count of all the available values from among all the sets. So, if a value is rare(lets say there is only 1 left), if I had a structure where I could look up the rarest value, and have the structure tell me which index it was located in, then it would make generating the possible combinations much easier.
How could I do this? What about one structure to keep count of elements, and a dictionary to keep track of list indices that contain the value?
edit: I guess I should put in that a specific problem I am looking to solve here, is how to update the set for every index of the list (or whatever other structures i end up using), so that when I use a value 3 times, it is made unavailable for every other combination.
Thank you.
Another edit
It seems that this may be a little too abstract to be asking for solutions when it's hard to understand what I am even asking for. I will come back with some code soon, please check back in 1.5-2 hours if you are interested.
how to update the set for every index of the list (or whatever other structures i end up using), so that when I use a value 3 times, it is made unavailable for every other combination.
I assume you want to sample the values truly randomly, right? What if you put 3 of each value into a list, shuffle it with random.shuffle, and then just keep popping values from the end of the list when you're building your combination? If I'm understanding your problem right, here's example code:
from random import shuffle
valid_values = [i for i in range(10)] # the valid values are 0 through 9 in my example, update accordingly for yours
vals = 3*valid_values # I have 3 of each valid value
shuffle(vals) # randomly shuffle them
while len(vals) != 0:
combination = (vals.pop(), vals.pop(), vals.pop()) # combinations are 3 values?
print(combination)
EDIT: Updated code based on the added information that you have sets of values (but this still assumes you can use more than one value from a given set):
from random import shuffle
my_sets_of_vals = [......] # list of sets
valid_values = list()
for i in range(my_sets_of_vals):
for val in my_sets_of_vals[i]:
valid_values.append((i,val)) # this can probably be done in list comprehension but I forgot the syntax
vals = 3*valid_values # I have 3 of each valid value
shuffle(vals) # randomly shuffle them
while len(vals) != 0:
combination = (vals.pop()[1], vals.pop()[1], vals.pop()[1]) # combinations are 3 values?
print(combination)
Based on the edit you could make an object for each value. It could hold the number of times you have used the element and the element itself. When you find you have used an element three times, remove it from the list

I have written a sorting algorithm. Is this a quicksort?

Is this a quicksort? If I have an array A=[1,43,21,45,56,7], I check the last element with first,second,third to last-1 and swap if last element is less with the compared values. Then I proceed to do the process starting from 2nd element going up to last-1 and comparing with the last element? I continue to do this with third, fourth and last-1.
#quicksort
def qsort(li):
end=len(li)-1
for i in range(len(li)):
begin=0
print begin
while (begin<end):
if li[begin]>li[end]:
temp=li[begin]
li[begin]=li[end]
li[end]=temp
begin+=1
end-=1
return li
As Ian and Peter say, it seems to be a combination of bubble sort and selection sort due to the swapping and maintenance of an order list on one end of the array. DEFINITELY not quicksort - which has complexity O(nlogn) - this algorithm has complexity O(n^2)... for every n elements you make n comparisons.
As others have mentioned in comments, this is not a Quicksort. What you are doing is selecting the end element, making a comparison for each of the other values and then swapping values based on this. Despite using a reference point similar to a "pivot" in a Quicksort, you are not using it to split the list on both higher and lower comparisons (only lower).
If you do wish to write a Quicksort instead, I thoroughly recommend the recursive model instead of an iterative one. You would pick a pivot element and then sort into two new lists depending on whether the element is larger or smaller and then run the function on these new lists. Finally, rejoin the lists in order of: smaller, pivot, larger.
That's a selection sort,
where you compare each element with the rest of the succeeding element and find the least(ascending) or greatest(descending) and swap with that element
.i.e we select the least/greatest element and swap and proceed with next element
However, quicksort is recursive in nature and partitions are done.

Python: Listing the duplicates in a list

I am fairly new to Python and I am interested in listing duplicates within a list. I know how to remove the duplicates ( set() ) within a list and how to list the duplicates within a list by using collections.Counter; however, for the project that I am working on this wouldn't be the most efficient method to use since the run time would be n(n-1)/2 --> O(n^2) and n is anywhere from 5k-50k+ string values.
So, my idea is that since python lists are linked data structures and are assigned to the memory when created that I begin counting duplicates from the very beginning of the creation of the lists.
List is created and the first index value is the word 'dog'
Second index value is the word 'cat'
Now, it would check if the second index is equal to the first index, if it is then append to another list called Duplicates.
Third index value is assigned 'dog', and the third index would check if it is equal to 'cat' then 'dog'; since it matches the first index, it is appended to Duplicates.
Fourth index is assigned 'dog', but it would check the third index only, and not the second and first, because now you can assume that since the third and second are not duplicates that the fourth does not need to check before, and since the third/first are equal, the search stops at the third index.
My project gives me these values and append it to a list, so I would want to implement that above algorithm because I don't care how many duplicates there are, I just want to know if there are duplicates.
I can't think of how to write the code, but I figured the basic structure of it, but I might be completely off (using random numgen for easier use):
for x in xrange(0,10):
list1.append(x)
for rev, y in enumerate(reversed(list1)):
while x is not list1(y):
cond()
if ???
I really don't think you'll get better than a collections.Counter for this:
c = Counter(mylist)
duplicates = [ x for x,y in c.items() if y > 1 ]
building the Counter should be O(n) (unless you're using keys which are particularly bad for hashing -- But in my experience, you need to try pretty hard to make that happen) and then getting the duplicates list is also O(n) giving you a total complexity of O(2n) == O(n) (for typical uses).

Categories