Make generator for pairs excluding those with a 0

Make generator for pairs excluding those with a 0 - python

I would like to generate pairs of integers forever sorted by the sum of their absolute values from smallest to biggest. However I don't want to return any that have a 0 for either part of the pair. So I would like:
(1, 1)
(-1, -1)
(1, 2)
(-1, -2)
(2, 1)
(-2, -1)
[...]
I saw Iterate over pairs in order of sum of absolute values which has really great answers such as:
def pairSums(s = 0): # base generation on target sum to get pairs in order
while True: # s parameter allows starting from a given sum
for i in range(s//2+1): # partitions
yield from {(i,s-i),(s-i,i),(i-s,-i),(-i,i-s)} # no duplicates
s += 1
I could filter out the pairs with a 0 in them but that seems ugly and inefficient (most of the pairs seem to have a 0 in them). Is there way to generate them directly?

I think you just need to exclude the cases i=0 and i=s?
def pairSums(s = 0): # base generation on target sum to get pairs in order
while True: # s parameter allows starting from a given sum
for i in range(1, s//2+1): # partitions
if i != s:
yield from {(i,s-i),(s-i,i),(i-s,-i),(-i,i-s)}
s += 1

Related

Get the largest absolute difference in an array Python

I want to do a for loop that can basically do the absolute difference between every 2 elements of the array until it reaches all of them and then print the highest absolute difference.
arr = []
n = int(input("number of elements in array: "))
for i in range(0, n):
arr.append(input('insert element: '))
I've done this already but I would like to know how slow this method is compared to making the absolute difference between the
first and last element after sorting the array.
EXAMPLE
Input: {2, 7, 3, 4, 1, 9}
Output: 8 (|1 – 9|)
This is what I have tried:
arr = []
n = int(input("número de elementos do array : "))
for i in range(0, n):
arr.append(int(input('escreva os elementos: ')))
arr.sort()
print(arr[-1] - arr[0])

If you are fine with numpy, there's a way to do so.
Firstly, you need to find all the possible non-duplicate solutions from the given input using itertools.combinations
from itertools import combinations
alist = [2, 7, 3, 4, 1, 9]
all_comb = list(combinations(alist, 2))
[(2, 7), (2, 3), (2, 4), (2, 1), (2, 9), (7, 3), (7, 4), (7, 1), (7, 9), (3, 4), (3, 1), (3, 9), (4, 1), (4, 9), (1, 9)]
With this, you can use np.diff to find the differences for every tuple.
abs_diff = abs(np.diff(all_comb)).flatten()
array([5, 1, 2, 1, 7, 4, 3, 6, 2, 1, 2, 6, 3, 5, 8])
Finally, you can get the index of the maximum difference using np.argmax.
all_comb[abs_diff.argmax()]
Out[147]: (1, 9)

arr = []
n = int(input("número de elementos do array : "))
my_min, my_max = None, None
for i in range(0, n):
arr.append(int(input('escreva os elementos: ')))
if my_min is None or abs(arr[i]) < my_min:
my_min = arr[i]
if my_max is None or abs(arr[i]) > my_max:
my_max = arr[i]
print(f"{abs(my_max - my_min)} (|{my_max} - {my_min}|)")
You can achieve this just by "emembering" the number with highest and lowest abs value

As I understood about your query.
Sorting the element and then comparing first & last one is much faster than finding highest difference via iterating through list. This is because when sorting happens internally, as it moves forward it needs to compare with less values because if value is higher than last sorted value it directly appends next to it but if value is less then it just moves one value back rather than starting over again.
But comparing through all possible pairs in list takes much more time as it has to start from first value over again since we don't know which comparing will be highest.
So sorting is much faster to find largest difference than iterating for every possible pair with for loop in list.
I hope I got your query right :)
UPDATED
so the main question is about finding a way to find largest diff in a list with for loop and which should be faster so here it is.
In my opinion below code will be even faster than sorting and finding largest diff. Because here in this code we only need to iterate in list once and we will have answer of largest diff. No need to iterate every possible pair of value.
I think this may help :)
list_a = []
n = int(input("number of elements in array: "))
for i in range(0, n):
# store input after converting to integer.
list_a.append(int(input('insert element: ')))
'''to store largest difference of two current numbers in every eteration'''
largest_diff_so_far = 0;
'''list to store that two numbers we are comparing'''
actual_diff_number = None;
'''start from first number in list. we don't need to go through every possible pair so just picking first number without for loop.'''
first = list_a[0]
'''here we iterate through all number only once till last number in
list'''
for second in list_a :
'''first find diff of current two value'''
current_diff = second - first
'''as we can see when current_diff is larger then previous largest diff we will update their value'''
if largest_diff_so_far == 0 or current_diff > largest_diff_so_far:
'''if first value in list is largest than all then the current diff will be negative and in that case we will run below if code and continue the code so that it will not over -ride anything in remaining code'''
if current_diff < 0:
''' since the diff is negative we will store its absolute value in largest diff variable.'''
largest_diff_so_far = abs(current_diff)
''' since first value is largest then all means it is larger than current second also, so in actual_diff_number we will store values in reverse order, so that our largest value which is stored in first variable will be second in list and by this in later iteration we will avoid over-writing of this largest value'''
actual_diff_number = [second, first]
''' we will also update first variable's value to second variable's value since it smaller than previous value of first and by this next iteration will use this value for diff rather than initial value of first variable which was largest.'''
first = second
continue
'''if above condition is not the case than rest of the below code will run'''
'''largest diff will be current_diff'''
largest_diff_so_far = current_diff
'''storing actual number whose diff is largest till now.'''
actual_diff_number = [first, second]
'''below is main part for saving time. if in current process we find diff which is in minus means our second value is even less than first, in that case we no longer need to carry forward that first value so we will update first value to our current second value and will also update largest diff that is stored previously. since our first value is less than previous first value then our diff will also increase from previous diff.'''
elif current_diff < 0:
first = second
'''update largest diff with new first value'''
largest_diff_so_far = actual_diff_number[1] - first
'''update actual diff number's first value in that list'''
actual_diff_number[0] = first
'''finally print answer since after finishing for loop largest_diff_so_far and actual_diff_number contains the answer that we are finding.'''
print(actual_diff_number, largest_diff_so_far)

Python - efficient way to find first occurences of multiple values

I have a following problem: I need to find first occurences in an array for values greater than or equal than multiple other values.
Example:
array_1 = [-3,2,8,-1,0,5]
array_2 = [5,1]
Script has to find where in array_1 is the first value greater than or equal to each value from array_2 so the expected result in that case would be [3,2] for 1-based indices
A simple loop won't be any good for my case as both array have close to million values and it has to execute quickly preferably under a minute.
Simple loop solution that has a run time of about half an hour:
for j in range(0, len(array_2)):
for i in range(0, len(array_1)):
if array_1[i] >= array_2[j]:
solution[j] = i
break
Edit: indices clarification as #Sergio Tulentsev correctly pointed out

First perform some preprocessing on the data: create a new list that only has the values that are greater than all predecessors in the original data, and combine them in a tuple with the 1-based position where they were found.
So for instance, for the example data [-3,2,8,-1,0,5], this would be:
[(-3, 1), (2, 2), (8, 3)]
Note how the answer to any query can only be 1, 2 or 3, as the values at the other positions are all smaller than 8.
Then for each query use a binary search to find the tuple whose left value is at least the queried value, and return the right value of the found tuple (the position). For the binary search you can rely on the bisect library:
import bisect
def solve(data, queries):
# preprocessing
maxima = []
greatest = float("-inf")
for i, val in enumerate(data):
if val > greatest:
greatest = val
maxima.append((val, i+1))
# main
return [maxima[bisect.bisect_left(maxima, (query,))][1]
for query in queries]
Example use:
data = [-3,2,8,-1,0,5]
queries = [5,1]
print(solve(data, queries)) # [3, 2]

I suggest using a loop over the first array and using max(array_2) for the second one.

How can I efficiently merge two sets of ordered pairs, excluding pairs that are lower on both values from any other pair?

As part of a dynamical programming assignment, I find myself having to do the following.
I have two sorted lists of length 2 tuples (ordered pairs, representing scores on two criteria). One pair of values can only be considered strictly greater than another if it is greater on one criterion and not lower on the other. So, (1,8) and (2,7) are incomparable, while (1,7) is lower than (2,8).
Each input list contains only values that are incomparable with each other. My method merges the two lists, omitting duplicates as well as any values that are strictly inferior to another value in the new, bigger list. Then it sorts the new list.
For example, the following input produces this result:
combine([(1,8), (2, 6), (3, 4)], [(2, 7), (3, 3)])
[(1, 8), (2, 7), (3, 4)]
Here's the code I have currently produced:
def combine(left, right):
# start with lists sorted from biggest to smallest
newlist = left+right
leftlen = len(left)
for i in range(leftlen - 1, -1, -1):
l = newlist[i] # candidate value to be inserted in
for j in range(len(newlist) - 1, leftlen - 1, -1):
r = newlist[j]
if r[0] >= l[0]: # cell with >= food encountered without having previously encountered cell with less water
if r[1] >= l[1]: # this cell also has more water - del candidate
del newlist[i]
leftlen -=1
elif r[0] == l[0]: # equal food and less water than candidate - candidate wins
del newlist[j]
break # either way, no need to consider further cells -
# if this cell doesn't beat candidate, then candidate belongs
if r[1] <= l[1]: # cell with less water encountered before cell with more food
del newlist[j]
for k in range(j -1, leftlen - 1, -1): # continue through right list, deleting cells until a value with
# higher food is found
r = newlist[k]
if r[0] > l[0]: break
else: del newlist[k]
break
newlist.sort(reverse=True)
return newlist
This code does work, but I am wondering if there is a faster approach to solving this kind of problem? When the lists are long, I end up making a lot of pairwise comparisons.
I've tried to prune out some unnecessary comparisons, relying upon the fact that items in each list are always greater on one criterion and lesser on the other. Thus, the lists are reverse sorted on the first value in the tuple, and therefore also sorted on the second value!
One idea I had was to try and use a different ADT - some type of tree perhaps, but I'm not sure if this is going to help or not.
Any suggestions? This is for an assignment, so I'm looking for ideas rather than for somebody to rewrite the whole thing for me :) Cheers!

Recovering Subsets in Subset Sum Problem - Not All Subsets Appear

Brushing up on dynamic programming (DP) when I came across this problem. I managed to use DP to determine how many solutions there are in the subset sum problem.
def SetSum(num_set, num_sum):
#Initialize DP matrix with base cases set to 1
matrix = [[0 for i in range(0, num_sum+1)] for j in range(0, len(num_set)+1)]
for i in range(len(num_set)+1): matrix[i][0] = 1
for i in range(1, len(num_set)+1): #Iterate through set elements
for j in range(1, num_sum+1): #Iterate through sum
if num_set[i-1] > j: #When current element is greater than sum take the previous solution
matrix[i][j] = matrix[i-1][j]
else:
matrix[i][j] = matrix[i-1][j] + matrix[i-1][j-num_set[i-1]]
#Retrieve elements of subsets
subsets = SubSets(matrix, num_set, num_sum)
return matrix[len(num_set)][num_sum]
Based on Subset sum - Recover Solution, I used the following method to retrieve the subsets since the set will always be sorted:
def SubSets(matrix, num_set, num):
#Initialize variables
height = len(matrix)
width = num
subset_list = []
s = matrix[0][num-1] #Keeps track of number until a change occurs
for i in range(1, height):
current = matrix[i][width]
if current > s:
s = current #keeps track of changing value
cnt = i -1 #backwards counter, -1 to exclude current value already appended to list
templist = [] #to store current subset
templist.append(num_set[i-1]) #Adds current element to subset
total = num - num_set[i-1] #Initial total will be sum - max element
while cnt > 0: #Loop backwards to find remaining elements
if total >= num_set[cnt-1]: #Takes current element if it is less than total
templist.append(num_set[cnt-1])
total = total - num_set[cnt-1]
cnt = cnt - 1
templist.sort()
subset_list.append(templist) #Add subset to solution set
return subset_list
However, since it is a greedy approach it only works when the max element of each subset is distinct. If two subsets have the same max element then it only returns the one with the larger values. So for elements [1, 2, 3, 4, 5] with sum of 10 it only returns
[1, 2, 3, 4] , [1, 4, 5]
When it should return
[1, 2, 3, 4] , [2, 3, 5] , [1, 4, 5]
I could add another loop inside the while loop to leave out each element but that would increase the complexity to O(rows^3) which can potentially be more than the actual DP, O(rows*columns). Is there another way to retrieve the subsets without increasing the complexity? Or to keep track of the subsets while the DP approach is taking place? I created another method that can retrieve all of the unique elements in the solution subsets in O(rows):
def RecoverSet(matrix, num_set):
height = len(matrix) - 1
width = len(matrix[0]) - 1
subsets = []
while height > 0:
current = matrix[height][width]
top = matrix[height-1][width]
if current > top:
subsets.append(num_set[height-1])
if top == 0:
width = width - num_set[height-1]
height -= 1
return subsets
Which would output [1, 2, 3, 4, 5]. However, getting the actual subsets from it seems like solving the subset problem all over again. Any ideas/suggestions on how to store all of the solution subsets (not print them)?

That's actually a very good question, but it seems mostly you got the right intuition.
The DP approach allows you to build a 2D table and essentially encode how many subsets sum up to the desired target sum, which takes time O(target_sum*len(num_set)).
Now if you want to actually recover all solutions, this is another story in the sense that the number of solution subsets might be very large, in fact much larger than the table you built while running the DP algorithm. If you want to find all solutions, you can use the table as a guide but it might take a long time to find all subsets. In fact, you can find them by going backwards through the recursion that defined your table (the if-else in your code when filling up the table). What do I mean by that?
Well let's say you try to find the solutions, having only the filled table at your disposal. The first thing to do to tell whether there is a solution is to check that the element at row len(num_set) and column num has value > 0, indicating that at least one subset sums up to num. Now there are two possibilities, either the last number in num_set is used in a solution in which case we must then check whether there is a subset using all numbers except that last one, which sums up to num-num_set[-1]. This is one possible branch in the recursion. The other one is when the last number in num_set is not used in a solution, in which case we must then check whether we can still find a solution to sum up to num, but having all numbers except that last one.
If you keep going you will see that the recovering can be done by doing the recursion backwards. By keeping track of the numbers along the way (so the different paths in the table that lead to the desired sum) you can retrieve all solutions, but again bear in mind that the running time might be extremely long because we want to actually find all solutions, not just know their existence.
This code should be what you are looking for recovering solutions given the filled matrix:
def recover_sol(matrix, set_numbers, target_sum):
up_to_num = len(set_numbers)
### BASE CASES (BOTTOM OF RECURSION) ###
# If the target_sum becomes negative or there is no solution in the matrix, then
# return an empty list and inform that this solution is not a successful one
if target_sum < 0 or matrix[up_to_num][target_sum] == 0:
return [], False
# If bottom of recursion is reached, that is, target_sum is 0, just return an empty list
# and inform that this is a successful solution
if target_sum == 0:
return [], True
### IF NOT BASE CASE, NEED TO RECURSE ###
# Case 1: last number in set_numbers is not used in solution --> same target but one item less
s1_sols, success1 = recover_sol(matrix, set_numbers[:-1], target_sum)
# Case 2: last number in set_numbers is used in solution --> target is lowered by item up_to_num
s2_sols, success2 = recover_sol(matrix, set_numbers[:-1], target_sum - set_numbers[up_to_num-1])
# If Case 2 is a success but bottom of recursion was reached
# so that it returned an empty list, just set current sol as the current item
if s2_sols == [] and success2:
# The set of solutions is just the list containing one item (so this explains the list in list)
s2_sols = [[set_numbers[up_to_num-1]]]
# Else there are already solutions and it is a success, go through the multiple solutions
# of Case 2 and add the current number to them
else:
s2_sols = [[set_numbers[up_to_num-1]] + s2_subsol for s2_subsol in s2_sols]
# Join lists of solutions for both Cases, and set success value to True
# if either case returns a successful solution
return s1_sols + s2_sols, success1 or success2
For the full solution with matrix filling AND recovering of solutions you can then do
def subset_sum(set_numbers, target_sum):
n_numbers = len(set_numbers)
#Initialize DP matrix with base cases set to 1
matrix = [[0 for i in range(0, target_sum+1)] for j in range(0, n_numbers+1)]
for i in range(n_numbers+1):
matrix[i][0] = 1
for i in range(1, n_numbers+1): #Iterate through set elements
for j in range(1, target_sum+1): #Iterate through sum
if set_numbers[i-1] > j: #When current element is greater than sum take the previous solution
matrix[i][j] = matrix[i-1][j]
else:
matrix[i][j] = matrix[i-1][j] + matrix[i-1][j-set_numbers[i-1]]
return recover_sol(matrix, set_numbers, target_sum)[0]
Cheers!

Get random unique regions of a list using python

I have a list of numbers, say list=[100,102,108,307,365,421,433,487,511,537,584].
I want to get unique regions from this list for example region 1 from 102-307, region 2 from 421-487 and region 3 from 511-584. These regions should be non overlapping and unique.

I'll credit #TimPietzcker for pointing me in the direction of this answer, although I didn't use the function he offered (random.sample).
In this code, I choose six indices from those in list_ (renamed from list to avoid overwriting the built-in) without replacement, using np.random.choice. I then sort these indices and iterate over each pair of adjacent indices, taking as a region the values from the first index (i) to the second (j) in the pair, inclusive (hence the j + 1).
(If I had used j instead of j + 1, the indices would never be able to include all the values in list, due to the lack of replacement during the selection phase. For example, if one pair were (1, 3), the minimum value for the first index of the next pair would be 4, because 3 could not be chosen twice. Thus, the first pair would take the values at indices 1 and 2, and the value at 3 would be skipped.)
Since it's possible for j to be equal to len(list_) - 1, I've included a try/except section, which catches the IndexError that would be raised in this case and causes the region to include all values through the end of list_ -- equivalent to taking the values from i to j, inclusive, as for all other cases.
import numpy as np
list_ = [100,102,108,307,365,421,433,487,511,537,584]
n_regions = 3
indices = sorted(np.random.choice(range(len(list_)), size=n_regions * 2,
replace=False))
list_of_regions = []
for i, j in zip(indices[::2], indices[1::2]):
try:
list_of_regions.append(list_[i:j + 1])
except IndexError:
# j + 1 == len(list_), so leave it off.
list_of_regions.append(list_[i:])

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Make generator for pairs excluding those with a 0 - python

I think you just need to exclude the cases i=0 and i=s? def pairSums(s = 0): # base generation on target sum to get pairs in order while True: # s parameter allows starting from a given sum for i in range(1, s//2+1): # partitions if i != s: yield from {(i,s-i),(s-i,i),(i-s,-i),(-i,i-s)} s += 1

Related

Get the largest absolute difference in an array Python

Python - efficient way to find first occurences of multiple values

How can I efficiently merge two sets of ordered pairs, excluding pairs that are lower on both values from any other pair?

Recovering Subsets in Subset Sum Problem - Not All Subsets Appear

Get random unique regions of a list using python

Categories

Resources