Related
I am trying to compute the powerset of a list of prime numbers. I have already done some research and the prefered way of doing this seems to be using a line like
itertools.chain.from_iterable(itertools.combinations(primes, r) for r in range(2, len(primes) + 1))
and then iterating over all combinations to get the products with math.prod(). All in all, the code currently looks like this:
number = 200
p1 = []
# calculate all primes below specified number
for i in range(2, number + 1):
isPrime = True
for prime in p1:
if i % prime == 0:
isPrime = False
if isPrime:
p1.append(i)
Pp = []
myIterable = itertools.chain.from_iterable(itertools.combinations(p1, r) for r in range(2, len(p1) + 1))
# convert iterable to integer array of products -- The code below is extremely slow and should be improved
for x in myIterable:
newValue = math.prod(x)
if newValue <= number:
Pp.append(newValue)
This works, but it is not feasible for any "number" greater than 100 because of too high execution time. The problem is the last for loop, which takes forever to compute. Everything else performs reasonably well. The powerset has to be constricted to sets, whos products are less or equal to number, as done using the last if statement, or else the memory will explode.
The solution to this problem was to create a pointer array, which crawls through the prime array until the product of the pointed primes gets too high. The needed helper functions can be implemented like this:
def calcProductOfPointers(pointerArray, dataArray):
prod = 1
for pointer in pointerArray:
prod *= dataArray[pointer]
return prod
def incrementPointer(pointerArray, dataArray, threshold):
ret = False
for i in range(1, len(pointerArray) + 1):
index = len(pointerArray) - i
pointerArray[index] += 1
if calcProductOfPointers(pointerArray, dataArray) <= threshold and pointerArray[index] < len(dataArray):
ret = True
break
elif index > 0:
pointerArray[index] = pointerArray[index - 1] + 2
else:
break
return ret
And then the iteration over all powersets can be substituted with this code:
Pp = []
for i in range(2, len(p1) + 1): # start at a minimum of 2 prime factors
primePointers = []
for index in range(i):
primePointers.append(index)
if calcProductOfPointers(primePointers, p1) > number:
break
while calcProductOfPointers(primePointers, p1) <= number:
Pp.append(calcProductOfPointers(primePointers, p1))
if not incrementPointer(primePointers, p1, number):
break
I am trying to create a function where:
The output list is generated from random numbers from the input list
The output list is a specified length and adds to a specified sum
ex. I specify that I want a list that is 4 in length and adds up to 10. random numbers are pulled from the input list until the criteria is satisfied.
I feel like I am approaching this problem all wrong trying to use recursion. Any help will be greatly appreciated!!!
EDIT: for more context on this problem.... Its going to be a random enemy generator.
The end goal input list will be coming from a column in a CSV called XP. (I plan to use pandas module). But this CSV will have a list of enemy names in the one column, XP in another, Health in another, etc. So the end goal is to be able to specify the total number of enemies and what the sum XP should be between those enemies and have the list generate with the appropriate information. For ex. 5 enemies with a total of 200 XP between them. The result is maybe -> Apprentice Wizard(50 xp), Apprentice Wizard(50 xp), Grung(50), Xvart(25 xp), Xvart(25 xp). The output list will actually need to include all of the row information for the selected items. And it is totally fine to have duplicated in the output as seen in this example. That will actually make more sense in the narrative of the game that this is for.
The csv --> https://docs.google.com/spreadsheets/d/1PjnN00bikJfY7mO3xt4nV5Ua1yOIsh8DycGqed6hWD8/edit?usp=sharing
import random
from random import *
lis = [1,2,3,4,5,6,7,8,9,10]
output = []
def query (total, numReturns, myList, counter):
random_index = randrange(len(myList)-1)
i = myList[random_index]
h = myList[i]
# if the problem hasn't been solved yet...
if len(output) != numReturns and sum(output) != total:
print(output)
# if the length of the list is 0 (if we just started), then go ahead and add h to the output
if len(output) == 0 and sum(output) + h != total:
output.append(h)
query (total, numReturns, myList, counter)
#if the length of the output is greater than 0
if len(output) > 0:
# if the length plus 1 is less than or equal to the number numReturns
if len(output) +1 <= numReturns:
print(output)
#if the sum of list plus h is greater than the total..then h is too big. We need to try another number
if sum(output) + h > total:
# start counter
for i in myList:# try all numbers in myList...
print(output)
print ("counter is ", counter, " and i is", i)
counter += 1
print(counter)
if sum(output) + i == total:
output.append(i)
counter = 0
break
if sum(output) + i != total:
pass
if counter == len(myList):
del(output[-1]) #delete last item in list
print(output)
counter = 0 # reset the counter
else:
pass
#if the sum of list plus h is less than the total
if sum(output) + h < total:
output.append(h) # add h to the list
print(output)
query (total, numReturns, myList, counter)
if len(output) == numReturns and sum(output) == total:
print(output, 'It worked')
else:
print ("it did not work")
query(10, 4, lis, 0)
I guess that it would be better to get first all n-size combinations of given array which adds to specified number, and then randomly select one of them. Random selecting and checking if sum is equal to specified value, in pessimistic scenario, can last indefinitely.
from itertools import combinations as comb
from random import randint
x = [1,1,2,4,3,1,5,2,6]
def query(arr, total, size):
combs = [c for c in list(comb(arr, size)) if sum(c)==total]
return combs[randint(0, len(combs))]
#example 4-item array with items from x, which adds to 10
print(query(x, 10, 4))
If the numbers in your input list are consecutive numbers, then this is equivalent to the problem of choosing a uniform random output list of N integers in the range [min, max], where the output list is ordered randomly and min and max are the smallest and largest number in the input list. The Python code below shows how this can be solved. It has the following advantages:
It does not use rejection sampling.
It chooses uniformly at random from among all combinations that meet the requirements.
It's based on an algorithm by John McClane, which he posted as an answer to another question. I describe the algorithm in another answer.
import random # Or secrets
def _getSolTable(n, mn, mx, sum):
t = [[0 for i in range(sum + 1)] for j in range(n + 1)]
t[0][0] = 1
for i in range(1, n + 1):
for j in range(0, sum + 1):
jm = max(j - (mx - mn), 0)
v = 0
for k in range(jm, j + 1):
v += t[i - 1][k]
t[i][j] = v
return t
def intsInRangeWithSum(numSamples, numPerSample, mn, mx, sum):
""" Generates one or more combinations of
'numPerSample' numbers each, where each
combination's numbers sum to 'sum' and are listed
in any order, and each
number is in the interval '[mn, mx]'.
The combinations are chosen uniformly at random.
'mn', 'mx', and
'sum' may not be negative. Returns an empty
list if 'numSamples' is zero.
The algorithm is thanks to a _Stack Overflow_
answer (`questions/61393463`) by John McClane.
Raises an error if there is no solution for the given
parameters. """
adjsum = sum - numPerSample * mn
# Min, max, sum negative
if mn < 0 or mx < 0 or sum < 0:
raise ValueError
# No solution
if numPerSample * mx < sum:
raise ValueError
if numPerSample * mn > sum:
raise ValueError
if numSamples == 0:
return []
# One solution
if numPerSample * mx == sum:
return [[mx for i in range(numPerSample)] for i in range(numSamples)]
if numPerSample * mn == sum:
return [[mn for i in range(numPerSample)] for i in range(numSamples)]
samples = [None for i in range(numSamples)]
table = _getSolTable(numPerSample, mn, mx, adjsum)
for sample in range(numSamples):
s = adjsum
ret = [0 for i in range(numPerSample)]
for ib in range(numPerSample):
i = numPerSample - 1 - ib
# Or secrets.randbelow(table[i + 1][s])
v = random.randint(0, table[i + 1][s] - 1)
r = mn
v -= table[i][s]
while v >= 0:
s -= 1
r += 1
v -= table[i][s]
ret[i] = r
samples[sample] = ret
return samples
Example:
weights=intsInRangeWithSum(
# One sample
1,
# Count of numbers per sample
4,
# Range of the random numbers
1, 5,
# Sum of the numbers
10)
# Divide by 100 to get weights that sum to 1
weights=[x/20.0 for x in weights[0]]
So I decided to approach this problem by creating another array (lets say arraycount) with the same length of the input array(lets say arraynumber) which is to be sorted. For the program to really check the stability of the sorting function, I decided to use test cases of integer array of repeated occurrences eg arraynumber = {10,8,7,8,2,2,2,1,6}. I then use a function to calculate the occurrences of the number in the array and store them in the arraycount[] which will result as arraycount={1,1,1,2,1,2,3,1,1}.
Naturally after sorting in increasing order they become arraynumber={1,2,2,2,6,7,8,8,10} and the result of the arraycount decides that its stable or not. Hence if its stable arraycount={1,1,2,3,1,1,1,2,1} otherwise it won't be same.
This is the code I have been working on. Help and feedback is appreciated. Main function and input is not taken yet. I will take another array copy of the original arraycount which will be compared afterward to check if it's retained the original position. Sorry if I made any trivial mistakes. Thank you.
def merge(arr1, l, m, r,arr2):
n1 = m - l + 1
n2 = r- m
# create temp arrays
L = [0] * (n1)
R = [0] * (n2)
# Copy data to temp arrays L[] and R[]
for i in range(0 , n1):
L[i] = arr1[l + i]
for j in range(0 , n2):
R[j] = arr1[m + 1 + j]
# Merge the temp arrays back into arr[l..r]
i = 0 # Initial index of first subarray
j = 0 # Initial index of second subarray
k = l # Initial index of merged subarray
while i < n1 and j < n2 :
if L[i] <= R[j]:
arr1[k] = L[i]
arr2[k]=L[i]
i += 1
else:
arr1[k] = R[j]
arr2[k]=R[j]
j += 1
k += 1
# Copy the remaining elements of L[], if there
# are any
while i < n1:
arr1[k] = L[i]
arr2[k]=L[i]
i += 1
k += 1
# Copy the remaining elements of R[], if there
# are any
while j < n2:
arr1[k] = R[j]
arr2[k]=R[j]
j += 1
k += 1
# l is for left index and r is right index of the
# sub-array of arr to be sorted
def mergeSort(arr1,l,r,arr2):
if l < r:
# Same as (l+r)/2, but avoids overflow for
# large l and h
m = (l+(r-1))/2
# Sort first and second halves
mergeSort(arr1, l, m,arr2)
mergeSort(arr1, m+1, r,arr2)
merge(arr1, l, m, r,arr2)
# This function takes last element as pivot, places
# the pivot element at its correct position in sorted
# array, and places all smaller (smaller than pivot)
# to left of pivot and all greater elements to right
# of pivot
def partition(arr1,low,high,arr2):
i = ( low-1 ) # index of smaller element
pivot = arr1[high] # pivot
for j in range(low , high):
# If current element is smaller than or
# equal to pivot
if arr1[j] <= pivot:
# increment index of smaller element
i = i+1
arr1[i],arr1[j] = arr1[j],arr1[i]
arr2[i],arr2[j]=arr2[j],arr2[i]
arr1[i+1],arr1[high] = arr1[high],arr1[i+1]
arr2[i+1],arr2[high]=arr2[high],arr2[i+1]
return ( i+1 )
# The main function that implements QuickSort
# arr1[] --> Array to be sorted,
# arr2[] --> Second Array of the count to be sorted along with first array
# low --> Starting index,
# high --> Ending index
# Function to do Quick sort
def quickSort(arr1,low,high,arr2):
if low < high:
# pi is partitioning index, arr[p] is now
# at right place
pi = partition(arr1,low,high,arr2)
# Separately sort elements before
# partition and after partition
quickSort(arr1, low, pi-1,arr2)
quickSort(arr1, pi+1, high,arr2)
def countArray(arr1,arr2):
i=0,j=1,k=0
while i<range(len(arr1[])):
while j<range(len(arr1[])):
if arr1[i]==arr1[j]:
arr2[i]=k+1
j=j+1
k=0
i=i+1
arraylist=[]
arraycount=[]
def isStable(arr1,arr2):
k=0
while i < range(len(arr1[])):
if arr[i]==arr2[i]:
k=k+1
i=i+1
if k==len(arr2):
print("Stable sort")
else:
print("Unstable sort")
First of all, sorry about the naive question. But I couldn't find help elsewhere
I'm trying to create an Optimal Search Tree using Dynamic Programing in Python that receives two lists (a set of keys and a set of frequencies) and returns two answers:
1 - The smallest path cost.
2 - The generated tree for that smallest cost.
I basically need to create a tree organized by the most accessed items on top (most accessed item it's the root), and return the smallest path cost from that tree, by using the Dynamic Programming solution.
I've the following implemented code using Python:
def optimalSearchTree(keys, freq, n):
#Create an auxiliary 2D matrix to store results of subproblems
cost = [[0 for x in xrange(n)] for y in xrange(n)]
#For a single key, cost is equal to frequency of the key
#for i in xrange (0,n):
# cost[i][i] = freq[i]
# Now we need to consider chains of length 2, 3, ... .
# L is chain length.
for L in xrange (2,n):
for i in xrange(0,n-L+1):
j = i+L-1
cost[i][j] = sys.maxint
for r in xrange (i,j):
if (r > i):
c = cost[i][r-1] + sum(freq, i, j)
elif (r < j):
c = cost[r+1][j] + sum(freq, i, j)
elif (c < cost[i][j]):
cost[i][j] = c
return cost[0][n-1]
def sum(freq, i, j):
s = 0
k = i
for k in xrange (k,j):
s += freq[k]
return s
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
print(optimalSearchTree(keys, freq, n))
I'm trying to output the answer 1. The smallest cost for that tree should be 142 (the value stored on the Matrix Position [0][n-1], according to the Dynamic Programming solution). But unfortunately it's returning 0. I couldn't find any issues in that code. What's going wrong?
You have several very questionable statements in your code, definitely inspired by C/Java programming practices. For instance,
keys = [10,12,20]
freq = [34,8,50]
n=sys.getsizeof(keys)/sys.getsizeof(keys[0])
I think you think you calculate the number of items in the list. However, n is not 3:
sys.getsizeof(keys)/sys.getsizeof(keys[0])
3.142857142857143
What you need is this:
n = len(keys)
One more find: elif (r < j) is always True, because r is in the range between i (inclusive) and j (exclusive). The elif (c < cost[i][j]) condition is never checked. The matrix c is never updated in the loop - that's why you always end up with a 0.
Another suggestion: do not overwrite the built-in function sum(). Your namesake function calculates the sum of all items in a slice of a list:
sum(freq[i:j])
import sys
def optimalSearchTree(keys, freq):
#Create an auxiliary 2D matrix to store results of subproblems
n = len(keys)
cost = [[0 for x in range(n)] for y in range(n)]
storeRoot = [[0 for i in range(n)] for i in range(n)]
#For a single key, cost is equal to frequency of the key
for i in range (0,n):
cost[i][i] = freq[i]
# Now we need to consider chains of length 2, 3, ... .
# L is chain length.
for L in range (2,n+1):
for i in range(0,n-L+1):
j = i + L - 1
cost[i][j] = sys.maxsize
for r in range (i,j+1):
c = (cost[i][r-1] if r > i else 0)
c += (cost[r+1][j] if r < j else 0)
c += sum(freq[i:j+1])
if (c < cost[i][j]):
cost[i][j] = c
storeRoot[i][j] = r
return cost[0][n-1], storeRoot
if __name__ == "__main__" :
keys = [10,12,20]
freq = [34,8,50]
print(optimalSearchTree(keys, freq))
I'm implementing the Select Algorithm (a.k.a. Deterministic Select). I've got it working for small arrays/lists but when my array size gets above 26 it breaks with the following error: "RuntimeError: maximum recursion depth exceeded". For arrays size 25 and below there is no problem.
My ultimate goal is to have it run for arrays of size 500 and do many iterations. The iterations are not an issue. I have already researched StackOverflow and have seen article: Python implementation of "median of medians" algorithm among many others. I had a hunch that duplicates in my random generated array may have been causing a problem but that doesn't seem to be it.
Here's my code:
import math
import random
# Insertion Sort Khan Academy video: https://www.youtube.com/watch?v=6pyeMmJTefg&list=PL36E7A2B75028A3D6&index=22
def insertion_sort(A): # Sorting it in place
for index in range(1, len(A)):# range is up to but not including len(A)
value = A[index]
i = index - 1 # index of the item that is directly to the left
while i >= 0:
if value < A[i]:
A[i + 1] = A[i]
A[i] = value
i = i - 1
else:
break
timeslo = 0 # I think that this is a global variable
def partition(A, p):
global timeslo
hi = [] #hold things larger than our pivot
lo = [] # " " smaller " " "
for x in A: # walk through all the elements in the Array A.
if x <p:
lo = lo + [x]
timeslo = timeslo + 1 #keep track no. of comparisons
else:
hi = hi + [x]
return lo,hi,timeslo
def get_chunks(Acopy, n):
# Declare some empty lists to hold our chunks
chunk = []
chunks = []
# Step through the array n element at a time
for x in range(0, len(Acopy), n): # stepping by size n starting at the beginning
# of the array
chunk = Acopy[x:x+n] # Extract 5 elements
# sort chunk and find its median
insertion_sort(chunk) # in place sort of chunk of size 5
# get the median ... (i.e. the middle element)
# Add them to list
mindex = (len(chunk)-1)/2 # pick middle index each time
chunks.append(chunk[mindex])
# chunks.append(chunk) # assuming subarrays are size 5 and we want the middle
# this caused some trouble because not all subarrays were size 5
# index which is 2.
return chunks
def Select(A, k):
if (len(A) == 1): # if the array is size 1 then just return the one and only element
return A[0]
elif (len(A) <= 5): # if length is 5 or less, sort it and return the kth smallest element
insertion_sort(A)
return A[k-1]
else:
M = get_chunks(A, 5) # this will give you the array of medians,,, don't sort it....WHY ???
m = len(M) # m is the size of the array of Medians M.
x = Select(M, m/2)# m/2 is the same as len(A)/10 FYI
lo, hi, timeslo = partition(A, x)
rank = len(lo) + 1
if rank == k: # we're in the middle -- we're done
return x, timeslo # return the value of the kth smallest element
elif k < rank:
return Select(lo, k) # ???????????????
else:
return Select(hi, k-rank)
################### TROUBLESHOOTING ################################
# Works with arrays of size 25 and 5000 iterations
# Doesn't work with " 26 and 5000 "
#
# arrays of size 26 and 20 iterations breaks it ?????????????????
# A = []
Total = 0
n = input('What size of array of random #s do you want?: ')
N = input('number of iterations: ')
# n = 26
# N = 1
for x in range(0, N):
A = random.sample(range(1,1000), n) # make an array or list of size n
result = Select(A, 2) #p is the median of the medians, 2 means the 3rd smallest element
Total = Total + timeslo # the total number of comparisons made
print("the result is"), result
print("timeslo = "), timeslo
print("# of comparisons = "), Total
# A = [7, 1, 3, 5, 9, 2, 83, 8, 4, 13, 17, 21, 16, 11, 77, 33, 55, 44, 66, 88, 111, 222]
# result = Select(A, 2)
# print("Result = "), result
Any help would be appreciated.
Change this line
return x, timeslo # return the value of the kth smallest element
into
return x # return the value of the kth smallest element
You can get timeslo by printing it in the end. Returning x with timeslo is not correct, because it will be used in the partition(A, p) to split array, where the parameter p should be the median number from previous statement x = Select(M, m/2)