Python Maximum Pairwise Fast Solution [dup] - python

I'm taking an algorithms course online and I am trying to calculate the maximum pairwise product in a list of numbers. This question has already been answered before:
maximum pairwise product fast solution and
Python for maximum pairwise product
I was able to pass the assignment by looking at those two posts, but I was hoping someone could help me figure out how to correct my own solution. Using a stress test, I found that if the largest number in the array is at index 0, the function multiplies it by itself.
This is the test case I failed using the assignment automatic grader
Input:
2
100000 90000
Your output:
10000000000
Correct output:
9000000000
Here is my pairwise method and stress test
from random import randint
def max_pairwise_product(numbers):
n = len(numbers)
max_product = 0
for first in range(n):
for second in range(first + 1, n):
max_product = max(max_product,
numbers[first] * numbers[second])
return max_product
def pairwise1(numbers):
max_index1 = 0
max_index2 = 0
#find the highest number
for i, val in enumerate(numbers):
if int(numbers[i]) > int(numbers[max_index1]):
max_index1 = i
#find the second highest number
for j, val in enumerate(numbers):
if j != max_index1 and int(numbers[j]) > int(numbers[max_index2]):
max_index2 = j
#print(max_index1)
#print(max_index2)
return int(numbers[max_index1]) * int(numbers[max_index2])
def stressTest():
while True:
arr = []
for x in range(5):
random_num = randint(2,101)
arr.append(random_num)
print(arr)
print('####')
result1 = max_pairwise_product(arr)
result2 = pairwise1(arr)
print("Result 1 {}, Result2 {}".format(result1,result2))
if result1 != result2:
print("wrong answer: {} **** {}".format(result1, result2))
break
else:
print("############################################# \n Ok", result1, result2)
if __name__ == '__main__':
stressTest()
'''
length = input()
a = [int(x) for x in input().split()]
answer = pairwise1(a)
print(answer)
'''
Any feedback will be greatly appreciated.
Thanks.

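When the maximum is at position 0, both max_index1 and max_index2 stay at 0, because the second loop skips index 0 and nothing else is larger than numbers[0].
That is why the largest number gets multiplied by itself.
Add the following lines before the "#find the second highest number" comment in pairwise1, so that max_index2 starts at an index different from max_index1: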
if max_index1 == 0:
    max_index2 = 1
else:
    max_index2 = 0
So function will be like:
def pairwise1(numbers):
    max_index1 = 0
    max_index2 = 0
    #find the highest number
    for i, val in enumerate(numbers):
        if int(numbers[i]) > int(numbers[max_index1]):
            max_index1 = i
    #start the second search at an index other than max_index1
    if max_index1 == 0:
        max_index2 = 1
    else:
        max_index2 = 0
    #find the second highest number
    for j, val in enumerate(numbers):
        if j != max_index1 and int(numbers[j]) > int(numbers[max_index2]):
            max_index2 = j
    #print(max_index1)
    #print(max_index2)
    return int(numbers[max_index1]) * int(numbers[max_index2])
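For comparison, here is a minimal single-pass sketch that tracks the two largest values directly instead of their indices. It is not the answer's code: the function name max_pairwise_product_fast is made up for illustration, and it assumes the list holds at least two integers.

```python
def max_pairwise_product_fast(numbers):
    """Product of the two largest values taken from different positions."""
    if len(numbers) < 2:
        raise ValueError("need at least two numbers")
    # Seed both slots from the first two elements so they never refer
    # to the same position in the list.
    largest = max(numbers[0], numbers[1])
    second = min(numbers[0], numbers[1])
    for value in numbers[2:]:
        if value > largest:
            # The old maximum becomes the runner-up.
            largest, second = value, largest
        elif value > second:
            second = value
    return largest * second


# The case the grader reported as failing.
print(max_pairwise_product_fast([100000, 90000]))
```

Seeding from the first two elements sidesteps the "both indices start at 0" problem described above.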

Related

Performance improvement for calculating the powerset of a list of integers

I am trying to compute the powerset of a list of prime numbers. I have already done some research, and the preferred way of doing this seems to be a line like
itertools.chain.from_iterable(itertools.combinations(primes, r) for r in range(2, len(primes) + 1))
and then iterating over all combinations to get the products with math.prod(). All in all, the code currently looks like this:
number = 200
p1 = []
# calculate all primes below specified number
for i in range(2, number + 1):
isPrime = True
for prime in p1:
if i % prime == 0:
isPrime = False
if isPrime:
p1.append(i)
Pp = []
myIterable = itertools.chain.from_iterable(itertools.combinations(p1, r) for r in range(2, len(p1) + 1))
# convert iterable to integer array of products -- The code below is extremely slow and should be improved
for x in myIterable:
newValue = math.prod(x)
if newValue <= number:
Pp.append(newValue)
This works, but it is not feasible for any "number" greater than 100 because the execution time is too high. The problem is the last for loop, which takes forever to compute. Everything else performs reasonably well. The powerset has to be restricted to sets whose products are less than or equal to number, as done by the last if statement, or else memory use will explode.
The solution to this problem was to create a pointer array, which crawls through the prime array until the product of the pointed primes gets too high. The needed helper functions can be implemented like this:
def calcProductOfPointers(pointerArray, dataArray):
prod = 1
for pointer in pointerArray:
prod *= dataArray[pointer]
return prod
def incrementPointer(pointerArray, dataArray, threshold):
ret = False
for i in range(1, len(pointerArray) + 1):
index = len(pointerArray) - i
pointerArray[index] += 1
if calcProductOfPointers(pointerArray, dataArray) <= threshold and pointerArray[index] < len(dataArray):
ret = True
break
elif index > 0:
pointerArray[index] = pointerArray[index - 1] + 2
else:
break
return ret
And then the iteration over all powersets can be substituted with this code:
Pp = []
for i in range(2, len(p1) + 1): # start at a minimum of 2 prime factors
primePointers = []
for index in range(i):
primePointers.append(index)
if calcProductOfPointers(primePointers, p1) > number:
break
while calcProductOfPointers(primePointers, p1) <= number:
Pp.append(calcProductOfPointers(primePointers, p1))
if not incrementPointer(primePointers, p1, number):
break
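The same pruning idea can also be written as a short recursive search. This is an alternative sketch, not the pointer-array code above: the names primes_up_to and bounded_prime_products are hypothetical, and it assumes the prime list is sorted ascending and that limit is at least 2.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit, ascending (limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def bounded_prime_products(primes, limit):
    """Products of two or more distinct primes that do not exceed limit."""
    results = []

    def extend(start, product, count):
        for idx in range(start, len(primes)):
            candidate = product * primes[idx]
            if candidate > limit:
                break  # primes ascend, so every later prime is too large as well
            if count + 1 >= 2:
                results.append(candidate)
            extend(idx + 1, candidate, count + 1)

    extend(0, 1, 0)
    return results


print(sorted(bounded_prime_products(primes_up_to(30), 30)))
```

Because a branch is abandoned as soon as the running product exceeds the limit, the search only visits subsets that are actually kept, plus one failed extension per branch.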

Python - Pull random numbers from a list. Populate a new list with a specified length and sum

I am trying to create a function where:
The output list is generated from random numbers from the input list
The output list is a specified length and adds to a specified sum
ex. I specify that I want a list that is 4 in length and adds up to 10. random numbers are pulled from the input list until the criteria is satisfied.
I feel like I am approaching this problem all wrong trying to use recursion. Any help will be greatly appreciated!!!
EDIT: for more context on this problem.... Its going to be a random enemy generator.
The end goal input list will be coming from a column in a CSV called XP. (I plan to use pandas module). But this CSV will have a list of enemy names in the one column, XP in another, Health in another, etc. So the end goal is to be able to specify the total number of enemies and what the sum XP should be between those enemies and have the list generate with the appropriate information. For ex. 5 enemies with a total of 200 XP between them. The result is maybe -> Apprentice Wizard(50 xp), Apprentice Wizard(50 xp), Grung(50), Xvart(25 xp), Xvart(25 xp). The output list will actually need to include all of the row information for the selected items. And it is totally fine to have duplicated in the output as seen in this example. That will actually make more sense in the narrative of the game that this is for.
The csv --> https://docs.google.com/spreadsheets/d/1PjnN00bikJfY7mO3xt4nV5Ua1yOIsh8DycGqed6hWD8/edit?usp=sharing
import random
from random import *
lis = [1,2,3,4,5,6,7,8,9,10]
output = []
def query (total, numReturns, myList, counter):
random_index = randrange(len(myList)-1)
i = myList[random_index]
h = myList[i]
# if the problem hasn't been solved yet...
if len(output) != numReturns and sum(output) != total:
print(output)
# if the length of the list is 0 (if we just started), then go ahead and add h to the output
if len(output) == 0 and sum(output) + h != total:
output.append(h)
query (total, numReturns, myList, counter)
#if the length of the output is greater than 0
if len(output) > 0:
# if the length plus 1 is less than or equal to the number numReturns
if len(output) +1 <= numReturns:
print(output)
#if the sum of list plus h is greater than the total..then h is too big. We need to try another number
if sum(output) + h > total:
# start counter
for i in myList:# try all numbers in myList...
print(output)
print ("counter is ", counter, " and i is", i)
counter += 1
print(counter)
if sum(output) + i == total:
output.append(i)
counter = 0
break
if sum(output) + i != total:
pass
if counter == len(myList):
del(output[-1]) #delete last item in list
print(output)
counter = 0 # reset the counter
else:
pass
#if the sum of list plus h is less than the total
if sum(output) + h < total:
output.append(h) # add h to the list
print(output)
query (total, numReturns, myList, counter)
if len(output) == numReturns and sum(output) == total:
print(output, 'It worked')
else:
print ("it did not work")
query(10, 4, lis, 0)
I guess it would be better to first get all n-size combinations of the given array that add up to the specified number, and then randomly select one of them. Randomly picking values and checking whether the sum equals the specified value can, in the worst case, run indefinitely.
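from itertools import combinations as comb
from random import choice
x = [1,1,2,4,3,1,5,2,6]
def query(arr, total, size):
    # every size-element combination of arr whose sum equals total
    combs = [c for c in comb(arr, size) if sum(c) == total]
    # choice() always picks a valid element; randint(0, len(combs)) is
    # inclusive at both ends and could index one past the end of the list.
    # choice() raises IndexError if no combination matches.
    return choice(combs)
#example 4-item array with items from x, which adds to 10
print(query(x, 10, 4))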
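Since the question says duplicates in the output are fine, a variant that allows repeats may be closer to what is needed. The following is only a sketch under that assumption: random_selection is a hypothetical name, it enumerates every candidate exhaustively (so it is only practical for small inputs), and it picks uniformly among multisets rather than among ordered lists.

```python
import random
from itertools import combinations_with_replacement


def random_selection(values, total, size, rng=random):
    """Pick `size` values (repeats allowed) from `values` that sum to `total`."""
    distinct = sorted(set(values))
    candidates = [c for c in combinations_with_replacement(distinct, size)
                  if sum(c) == total]
    if not candidates:
        return None  # no selection of that size reaches the total
    chosen = list(rng.choice(candidates))
    rng.shuffle(chosen)  # combinations come out sorted, so mix the order
    return chosen


# XP values from the question's example: 5 enemies, 200 XP in total.
print(random_selection([25, 50], 200, 5))
```

With XP values of 25 and 50 the only way to reach 200 with five picks is three 50s and two 25s, so only the order varies between runs.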
If the numbers in your input list are consecutive numbers, then this is equivalent to the problem of choosing a uniform random output list of N integers in the range [min, max], where the output list is ordered randomly and min and max are the smallest and largest number in the input list. The Python code below shows how this can be solved. It has the following advantages:
It does not use rejection sampling.
It chooses uniformly at random from among all combinations that meet the requirements.
It's based on an algorithm by John McClane, which he posted as an answer to another question. I describe the algorithm in another answer.
import random # Or secrets
def _getSolTable(n, mn, mx, sum):
t = [[0 for i in range(sum + 1)] for j in range(n + 1)]
t[0][0] = 1
for i in range(1, n + 1):
for j in range(0, sum + 1):
jm = max(j - (mx - mn), 0)
v = 0
for k in range(jm, j + 1):
v += t[i - 1][k]
t[i][j] = v
return t
def intsInRangeWithSum(numSamples, numPerSample, mn, mx, sum):
""" Generates one or more combinations of
'numPerSample' numbers each, where each
combination's numbers sum to 'sum' and are listed
in any order, and each
number is in the interval '[mn, mx]'.
The combinations are chosen uniformly at random.
'mn', 'mx', and
'sum' may not be negative. Returns an empty
list if 'numSamples' is zero.
The algorithm is thanks to a _Stack Overflow_
answer (`questions/61393463`) by John McClane.
Raises an error if there is no solution for the given
parameters. """
adjsum = sum - numPerSample * mn
# Min, max, sum negative
if mn < 0 or mx < 0 or sum < 0:
raise ValueError
# No solution
if numPerSample * mx < sum:
raise ValueError
if numPerSample * mn > sum:
raise ValueError
if numSamples == 0:
return []
# One solution
if numPerSample * mx == sum:
return [[mx for i in range(numPerSample)] for i in range(numSamples)]
if numPerSample * mn == sum:
return [[mn for i in range(numPerSample)] for i in range(numSamples)]
samples = [None for i in range(numSamples)]
table = _getSolTable(numPerSample, mn, mx, adjsum)
for sample in range(numSamples):
s = adjsum
ret = [0 for i in range(numPerSample)]
for ib in range(numPerSample):
i = numPerSample - 1 - ib
# Or secrets.randbelow(table[i + 1][s])
v = random.randint(0, table[i + 1][s] - 1)
r = mn
v -= table[i][s]
while v >= 0:
s -= 1
r += 1
v -= table[i][s]
ret[i] = r
samples[sample] = ret
return samples
Example:
weights=intsInRangeWithSum(
# One sample
1,
# Count of numbers per sample
4,
# Range of the random numbers
1, 5,
# Sum of the numbers
10)
# Divide by the sum (10) to get weights that sum to 1
weights=[x/10.0 for x in weights[0]]

Finding first pair of numbers in array that sum to value

I'm trying to solve the following Codewars problem: https://www.codewars.com/kata/sum-of-pairs/train/python
Here is my current implementation in Python:
def sum_pairs(ints, s):
right = float("inf")
n = len(ints)
m = {}
dup = {}
for i, x in enumerate(ints):
if x not in m.keys():
m[x] = i # Track first index of x using hash map.
elif x in m.keys() and x not in dup.keys():
dup[x] = i
for x in m.keys():
if s - x in m.keys():
if x == s-x and x in dup.keys():
j = m[x]
k = dup[x]
else:
j = m[x]
k = m[s-x]
comp = max(j,k)
if comp < right and j!= k:
right = comp
if right > n:
return None
return [s - ints[right],ints[right]]
The code seems to produce correct results, however the input can consist of array with up to 10 000 000 elements, so the execution times out for large inputs. I need help with optimizing/modifying the code so that it can handle sufficiently large arrays.
Your code is inefficient for the large test cases, so it times out. Instead you can do:
def sum_pairs(lst, s):
    seen = set()
    for item in lst:
        if s - item in seen:
            return [s - item, item]
        seen.add(item)
We put the values in seen until we find a value that produces the specified sum with one of the seen values.
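To make the behaviour concrete, here is the same function with an explicit return None and a small trace. The input list is an illustrative choice, not taken from the answer.

```python
def sum_pairs(lst, s):
    seen = set()
    for item in lst:
        # If the complement was seen earlier, this pair is the first one
        # to be completed while scanning from the left.
        if s - item in seen:
            return [s - item, item]
        seen.add(item)
    return None  # no pair adds up to s


# 10 -> seen {10}; 5 -> seen {10, 5}; 2 -> seen {10, 5, 2};
# 3 -> seen {10, 5, 2, 3}; 7 -> complement 3 is in seen, so return [3, 7].
print(sum_pairs([10, 5, 2, 3, 7, 5], 10))  # [3, 7]
```

The pair (5, 5) also sums to 10, but its second element comes later in the list, so [3, 7] is returned first. Each element is visited once and set lookups are O(1) on average, which gives O(n) overall.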
Maybe this code:
def sum_pairs(lst, s):
c = 0
while c<len(lst)-1:
if c != len(lst)-1:
x= lst[c]
spam = c+1
while spam < len(lst):
nxt= lst[spam]
if nxt + x== s:
return [x, nxt]
spam += 1
else:
return None
c +=1
lst = [5, 6, 5, 8]
s = 14
print(sum_pairs(lst, s))
Output:
[6, 8]
This answer unfortunately still times out, even though it's supposed to run in O(n log n) (since it is dominated by the sort, the rest of the algorithm running in O(n)). I'm not sure how you can obtain better than this complexity, but I thought I might put this idea out there.
def sum_pairs(ints, s):
# Pair each value with its original index, then sort by value
# (Python 3 removed tuple-unpacking lambdas, so index into the pair instead)
ints_with_idx = sorted(enumerate(ints), key=lambda pair: pair[1])
diff = 1000000
l = 0
r = len(ints) - 1
# Indexes of the sum operands in sorted array
lSum = 0
rSum = 0
while l < r:
# Compute the absolute difference between the current sum and the desired sum
sum = ints_with_idx[l][1] + ints_with_idx[r][1]
absDiff = abs(sum - s)
if absDiff < diff:
# Update the best difference
lSum = l
rSum = r
diff = absDiff
elif sum > s:
# Decrease the large value
r -= 1
else:
# Test to see if the indexes are better (more to the left) for the same difference
if absDiff == diff:
rightmostIdx = max(ints_with_idx[l][0], ints_with_idx[r][0])
if rightmostIdx < max(ints_with_idx[lSum][0], ints_with_idx[rSum][0]):
lSum = l
rSum = r
# Increase the small value
l += 1
# Retrieve indexes of sum operands
aSumIdx = ints_with_idx[lSum][0]
bSumIdx = ints_with_idx[rSum][0]
# Retrieve values of operands for sum in correct order
aSum = ints[min(aSumIdx, bSumIdx)]
bSum = ints[max(aSumIdx, bSumIdx)]
if aSum + bSum == s:
return [aSum, bSum]
else:
return None

maximum pairwise product fast solution

I am trying to run a stress test on the Python maximum pairwise product problem, comparing a fast and a slow algorithm. However, the fast code returns a wrong result in some tests. I think the problem comes from the if condition in the fast algorithm: in some cases the condition is not met even though it should be. I wasn't able to figure out the problem. Any help?
Here is the problem, I/P, O/P details:
Given a sequence of non-negative integers $a_0, \ldots, a_{n-1}$, find the maximum pairwise product, that is, the largest integer that can be obtained by multiplying two different elements from the sequence (or, more formally, $\max_{0 \le i \ne j \le n-1} a_i a_j$). Different elements here mean $a_i$ and $a_j$ with $i \ne j$ (it can be the case that $a_i = a_j$).
Input format
The first line of the input contains an integer $n$. The next line contains $n$ non-negative integers $a_0, \ldots, a_{n-1}$ (separated by spaces).
Constraints:
$2 \le n \le 2 \cdot 10^5$; $0 \le a_0, \ldots, a_{n-1} \le 10^5$.
Output format:
Output a single number — the maximum pairwise product.
from random import randint
from random import randrange
def max_pairwise(n,a):
# n = int(input())
res = 0
# a = [int(x) for x in input().split()]
assert(len(a) == n)
for i in range(0,n):
for j in range(i+1,n):
if a[i]*a[j] > res:
res = a[i]*a[j]
return(res)
def max_pairwise_fast(n,a):
# n = int(input())
# a = [int(x) for x in input().split()]
max_index1 = -1
max_index2 = -1
for i in range(0,n):
if a[i] > a[max_index1]:
max_index1 = i
else:
continue
for j in range(0,n):
if ((a[j] != a[max_index1]) and (a[j] > a[max_index2])):
max_index2 = j
else:
continue
res = a[max_index1]* a[max_index2]
return(res)
#stress_test
while(1):
n = randint(0,9) +2
print(n,"n")
a = []
for i in range (n):
a.append(randrange(9000))
for i in range(n):
print(a[i],'a[i]',"/n")
if (max_pairwise(n,a) == max_pairwise_fast(n,a)):
print(max_pairwise(n,a), max_pairwise_fast(n,a),"true")
else:
print(max_pairwise(n,a), max_pairwise_fast(n,a), "false")
break
This is an example of the result:
6 n
318 a[i] /n
7554 a[i] /n
7531 a[i] /n
7362 a[i] /n
4783 a[i] /n
4897 a[i] /n
56889174 56889174 true
5 n
6879 a[i] /n
6985 a[i] /n
8561 a[i] /n
5605 a[i] /n
3077 a[i] /n
59798585 59798585 true
9 n
8285 a[i] /n
3471 a[i] /n
2280 a[i] /n
2443 a[i] /n
5437 a[i] /n
2605 a[i] /n
1254 a[i] /n
6990 a[i] /n
2943 a[i] /n
57912150 68641225 false
In your fast implementation, whenever you find a new largest number you must also update the second largest to the value of the previous largest; otherwise, there are cases where you end up multiplying numbers that are not the two largest.
def product_of_two_largest(seq):
    largest = float("-inf")
    second_largest = float("-inf")
    for elt in seq:
        if elt > largest:
            # the previous maximum becomes the runner-up
            second_largest = largest
            largest = elt
        elif elt > second_largest:
            second_largest = elt
    return second_largest * largest
Note 1:
Your while loop should also be updated: you are calculating the values twice instead of once.
while True:
    n = randint(0, 9) + 2
    print(n, "n")
    a = []
    for i in range(n):
        a.append(randrange(9000))
    for i in range(n):
        print(a[i], 'a[i]', "/n")
    # compute each result once and reuse it
    slow, fast = max_pairwise(n, a), product_of_two_largest(a)
    if slow == fast:
        print(slow, fast, slow == fast)
    else:  # attention, this never happens now.
        break
Note 2:
As we are dealing only with non-negative integers, you will likely get a faster implementation in practice if you simply sort the sequence and multiply the last two numbers (even though sorting is O(n log n), versus O(n) for the fast implementation above).
b = sorted(a)
print("max1 x max2 = ", b[-1] * b[-2])
Note 3:
Using a heap data structure (the heapq module in the standard library, for example heapq.nlargest) is the theoretical best way to find the n largest items, but you'd likely need hundreds of thousands of items to make it worth your while.
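A minimal sketch of the heap-based note, using heapq.nlargest from the standard library. The function name product_of_two_largest_heap is made up for illustration, and the sequence is assumed to contain at least two numbers.

```python
import heapq


def product_of_two_largest_heap(seq):
    """Multiply the two largest items of seq (needs at least two items)."""
    # nlargest returns the items in descending order.
    largest, second = heapq.nlargest(2, seq)
    return largest * second


print(product_of_two_largest_heap([318, 7554, 7531, 7362, 4783, 4897]))
```

For picking only two items this does the same job as the single-pass loop; nlargest becomes more interesting when the number of items requested is larger.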
def max_pairwise_fast(n, a):
    # despite the names, these hold the two largest values, not indices
    max_index1 = 0
    max_index2 = 0
    for i in range(n):
        if a[i] > max_index1:
            max_index2 = max_index1
            max_index1 = a[i]
        elif a[i] > max_index2:
            max_index2 = a[i]
    return max_index1 * max_index2
if __name__ == '__main__':
    input_n = int(input())
    input_numbers = [int(x) for x in input().split()]
    print(max_pairwise_fast(input_n, input_numbers))
This might not help you.
I was attempting this question at Coursera ( Assignment ) and found that we don't need to make the solution more complex. We can simply store the first two largest integers while scanning from the user and print their product. A complex solution is the main reason for the TLE error.
code in c
#include<stdio.h>
int main() {
long long n, a = 0, b = 0, i = 0, numb = 1;
scanf("%lld", &n);
for (i = 0; i < n; i++){
scanf("%lld", &numb);
if(numb >= a){
b = a;
a = numb;
}
else if(numb > b)
b = numb;
}
printf("%lld", a * b);
return 0;
}
With input array data:
Example: [1, 2, 3]
def max_product(arr):  # renamed so it does not shadow the built-in max
    # a holds the largest value seen so far, b the second largest
    a = 0
    b = 0
    for i in range(len(arr)):
        if arr[i] > a:
            b = a
            a = arr[i]
        elif arr[i] > b:
            b = arr[i]
    return a * b
def max_pairwise_fast(n, a):
max_index1 = 0
max_index2 = 0
for i in range(n):
if a[i] > max_index1:
max_index2 = max_index1
max_index1 = a[i]
elif a[i] > max_index2:
max_index2 = a[i]
return max_index1 * max_index2
if __name__ == '__main__':
input_n = int(input())
input_numbers = [int(x) for x in input().split()]
print(max_pairwise_fast(input_n, input_numbers))

Down to zero problem - getting time exceeded error

Trying to solve hackerrank problem.
You are given Q queries. Each query consists of a single number N. You can perform 2 operations on N in each move. If N=a×b(a≠1, b≠1), we can change N=max(a,b) or decrease the value of N by 1.
Determine the minimum number of moves required to reduce the value of N to 0.
I have used BFS approach to solve this.
a. Generating all prime numbers using a sieve
b. using prime numbers I can simply avoid calculating the factors
c. I enqueue -1 along with all the factors to get to zero.
d. I have also used previous results to not enqueue encountered data.
This still is giving me time exceeded. Any idea? Added comments also in the code.
import math
#find out all the prime numbers
primes = [1]*(1000000+1)
primes[0] = 0
primes[1] = 0
for i in range(2, 1000000+1):
if primes[i] == 1:
j = 2
while i*j < 1000000:
primes[i*j] = 0
j += 1
n = int(input())
for i in range(n):
memoize= [-1 for i in range(1000000)]
count = 0
n = int(input())
queue = []
queue.append((n, count))
while len(queue):
data, count = queue.pop(0)
if data <= 1:
count += 1
break
#if it is a prime number then just enqueue -1
if primes[data] == 1 and memoize[data-1] == -1:
queue.append((data-1, count+1))
memoize[data-1] = 1
continue
#enqueue -1 along with all the factors
queue.append((data-1, count+1))
sqr = int(math.sqrt(data))
for i in range(sqr, 1, -1):
if data%i == 0:
div = max(int(data/i), i)
if memoize[div] == -1:
memoize[div] = 1
queue.append((div, count+1))
print(count)
There are two large causes of slowness with this code.
Clearing an array is slower than clearing a set
The first problem is this line:
memoize= [-1 for i in range(1000000)]
this prepares 1 million integers and is executed for each of your 1000 test cases. A faster approach is to simply use a Python set to indicate which values have already been visited.
Unnecessary loop being executed
The second problem is this line:
if primes[data] == 1 and memoize[data-1] == -1:
If data is prime but data-1 has already been visited, the condition is false, so the code falls through to the slow loop that searches for factors. That loop can never find any (because data is prime).
Faster code
In fact, the improvement due to using sets is so much that you don't even need your prime testing code and the following code passes all tests within the time limit:
import math
n = int(input())
for i in range(n):
memoize = set()
count = 0
n = int(input())
queue = []
queue.append((n, count))
while len(queue):
data, count = queue.pop(0)
if data <= 1:
if data==1:
count += 1
break
if data-1 not in memoize:
memoize.add(data-1)
queue.append((data-1, count+1))
sqr = int(math.sqrt(data))
for i in range(sqr, 1, -1):
if data%i == 0:
div = max(int(data/i), i)
if div not in memoize:
memoize.add(div)
queue.append((div, count+1))
print(count)
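One further tweak that is not part of the answer above: queue.pop(0) on a list shifts every remaining element, while collections.deque pops from the left in O(1). The sketch below wraps the same BFS in a function (the name down_to_zero_bfs is hypothetical) and assumes Python 3.8+ for math.isqrt.

```python
from collections import deque
from math import isqrt


def down_to_zero_bfs(n):
    """Minimum number of moves to reduce n to 0, found by breadth-first search."""
    visited = {n}
    queue = deque([(n, 0)])
    while queue:
        value, moves = queue.popleft()  # O(1), unlike list.pop(0)
        if value == 0:
            return moves
        candidates = [value - 1]
        for i in range(isqrt(value), 1, -1):
            if value % i == 0:
                candidates.append(value // i)  # the larger factor, max(a, b)
        for nxt in candidates:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, moves + 1))


print(down_to_zero_bfs(4))  # 4 -> 2 -> 1 -> 0
```

Integer division and isqrt also avoid the float rounding that int(data/i) and int(math.sqrt(data)) could introduce for large values.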
Alternatively, there's a O(n*sqrt(n)) time and O(n) space complexity solution that passes all the test cases just fine.
The idea is to cache minimum counts for every non-negative integer up to 1,000,000 (the maximum possible input number in the question) before running any query. After that, each query just returns the minimum count stored in the cache for the given number, so retrieving a result takes O(1) time per query.
To find minimal counts for each number (let's call it down2ZeroCounts), we should consider several cases:
0 and 1 have 0 and 1 minimal counts correspondingly.
Prime number p doesn't have factors other than 1 and itself. Hence, its minimal count is 1 plus a minimal count of p - 1 or more formally down2ZeroCounts[p] = down2ZeroCounts[p - 1] + 1.
For a composite number num it's a bit more complicated. For any pair of factors a > 1, b > 1 such that num = a*b, one move takes num to max(a, b), so the minimal count of num is the smallest of down2ZeroCounts[max(a, b)] + 1 over all such pairs and down2ZeroCounts[num - 1] + 1.
So, we can gradually build minimal counts for each number in ascending order. Calculating a minimal count of each consequent number will be based on optimal counts for lower numbers and so in the end a list of optimal counts will be built.
To better understand the approach please check the code:
from __future__ import print_function
import os
import sys
maxNumber = 1000000
down2ZeroCounts = [None] * 1000001
def cacheDown2ZeroCounts():
down2ZeroCounts[0] = 0
down2ZeroCounts[1] = 1
currentNum = 2
while currentNum <= maxNumber:
if down2ZeroCounts[currentNum] is None:
down2ZeroCounts[currentNum] = down2ZeroCounts[currentNum - 1] + 1
else:
down2ZeroCounts[currentNum] = min(down2ZeroCounts[currentNum - 1] + 1, down2ZeroCounts[currentNum])
for i in xrange(2, currentNum + 1):
product = i * currentNum
if product > maxNumber:
break
elif down2ZeroCounts[product] is not None:
down2ZeroCounts[product] = min(down2ZeroCounts[product], down2ZeroCounts[currentNum] + 1)
else:
down2ZeroCounts[product] = down2ZeroCounts[currentNum] + 1
currentNum += 1
def downToZero(n):
return down2ZeroCounts[n]
if __name__ == '__main__':
fptr = open(os.environ['OUTPUT_PATH'], 'w')
q = int(raw_input())
cacheDown2ZeroCounts()
for q_itr in xrange(q):
n = int(raw_input())
result = downToZero(n)
fptr.write(str(result) + '\n')
fptr.close()
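The code above targets Python 2 (xrange, raw_input). Here is a Python 3 sketch of the same caching idea with the limit passed as a parameter so it can be tried on small inputs. The name build_down_to_zero_counts is hypothetical, and the HackerRank input/output handling is left out.

```python
def build_down_to_zero_counts(limit):
    """counts[k] = minimum moves to bring k down to 0, for 0 <= k <= limit."""
    inf = float("inf")
    counts = [inf] * (limit + 1)
    counts[0] = 0
    for num in range(1, limit + 1):
        # Move 1: decrement. counts[num] may already hold a value that was
        # pushed forward from one of its factors.
        counts[num] = min(counts[num], counts[num - 1] + 1)
        # Move 2, seen from the other side: num is the larger factor of
        # num * i for every 2 <= i <= num, so those products can jump to num.
        for i in range(2, num + 1):
            product = num * i
            if product > limit:
                break
            counts[product] = min(counts[product], counts[num] + 1)
    return counts


counts = build_down_to_zero_counts(20)
print(counts[:7])  # [0, 1, 2, 3, 3, 4, 4]
```

After the table is built once, each query is a single list lookup.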
