I wrote an algorithm to solve the 0-1 knapsack problem, which works perfectly and is as follows:
def zero_one_knapsack_problem(weight: list, items: list, values: list, total_capacity: int) -> list:
    """
    A function that implements dynamic programming to solve the 0-1 knapsack problem. It supposedly has
    exponential time complexity.
    :param weight: the weight list; each element corresponds to the item at the same index
    :param items: the array of items, ordered the same as the weight list and values list
    :param values: the values list
    :param total_capacity: the total capacity of the knapsack
    :return: how to fill the knapsack
    """
    items_length = len(items) + 1
    total_capacity += 1
    # Create the table
    table = [[0 for w in range(total_capacity)] for y in range(items_length)]
    for i in range(1, items_length):
        for j in range(total_capacity):
            if weight[i-1] > j:  # Item does not fit, carry over the best value without it
                table[i][j] = table[i-1][j]
            else:
                # Take it or leave it, whichever gives the larger value
                table[i][j] = max(values[i-1] + table[i-1][j-weight[i-1]], table[i-1][j])
    print("The optimal value to carry is: ${}".format(table[items_length-1][total_capacity-1]))
From the analysis, the time complexity is Θ(items_length * total_capacity), which comes from the two nested loops together (ignoring constants). Then I read online that this method has exponential time complexity (and not just from one source; many blogs say it is exponential). I can't see how that follows. For example, consider any of the examples below:
1-) 10 * 100000000000 = 1×10¹²
2-) 11 * 100000000000 = 1.1×10¹²
3-) 12 * 100000000000 = 1.2×10¹²
# difference between each
between 2 and 3: 1.2×10¹² − 1.1×10¹² = 100000000000
between 1 and 2: 1.1×10¹² − 1.0×10¹² = 100000000000
As you can see, increasing the input by 1 doesn't cause any exponential growth. So how, in a mathematical sense, can they say this algorithm is exponential?
With a problem size of N bits, you can have, for example, sqrt(N) objects with weights about sqrt(N) bits long, and total_capacity about sqrt(N) bits long.
That makes total_capacity about 2^sqrt(N), and your solution takes O(sqrt(N) * 2^sqrt(N)) time, which is certainly exponential.
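To make that concrete, here is a small sketch (my own illustration with a hypothetical helper, not code from the question) that counts DP table cells instead of timing anything: adding one bit to the encoding of total_capacity doubles the work, which is exactly exponential growth when measured against input size in bits.
# Hypothetical helper: the DP fills a table of (num_items + 1) * (total_capacity + 1)
# cells, so the work is proportional to this product.
def knapsack_table_cells(num_items: int, total_capacity: int) -> int:
    return (num_items + 1) * (total_capacity + 1)

# Keep the item count fixed and grow the number of bits needed to write the capacity.
for bits in range(10, 31, 5):
    capacity = 2 ** bits  # a capacity that takes `bits` bits to encode
    print(f"{bits:2d}-bit capacity -> {knapsack_table_cells(10, capacity):,} table cells")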
I have implemented RadixSort for a list of numbers (quite similar to e.g. this one, the only difference being that it uses a class RadixSort, which IMO shouldn't make any difference in terms of complexity):
from itertools import chain

class RadixSort:
    def __init__(self):
        self.base = 7
        self.buckets = [[[] for i in range(10)] for i in range(self.base)]  # list of sorting buckets, one bucket per digit

    def sort(self, list1d):
        """
        Sorts a given 1D-list using radixsort in ascending order
        #param list1d to be sorted
        #returns the sorted list as an 1D array
        #raises ValueError if the list is None
        """
        if list1d is None: raise ValueError('List mustn\'t be None')
        if len(list1d) in [0, 1]: return list1d
        for b in range(self.base):  # for each bucket
            for n in list1d:  # for every number of the input list
                digit = (n // (10 ** b)) % 10  # get last digit from the end for the first bucket (b=0), second last digit for second bucket (b=1) and so on
                self.buckets[b][digit].append(n)  # append the number to the corresponding sub-bucket based on the relevant digit
            list1d = self.itertools_chain_from_iterable(self.buckets[b])  # merge all the sub-buckets to get the list of numbers sorted by the corresponding digit
        return list1d

    def itertools_chain_from_iterable(self, arr):
        return list(chain.from_iterable(arr))
Now, I'm assessing its complexity using the big_o module (knowing that the complexity should actually be O(b*n), with b being the base of the RadixSort, i.e. the maximum number of digits in the numbers to be sorted, i.e. len(self.buckets) == self.base == b, and n being the number of numbers in the list to be sorted):
import big_o

r = RadixSort()
print('r.sort complexity:', big_o.big_o(r.sort,
                                        lambda n: big_o.datagen.integers(n, 1, 9999999),
                                        n_measures=10)[0])
And the output is as follows:
r.sort complexity: Exponential: time = -2.9 * 0.0001^n (sec)
What am I doing wrong here? I tend to think that something about the way I'm using big_o is incorrect, rather than that my RadixSort implementation actually has exponential complexity.
Any tips/ideas would be greatly appreciated!
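One thing I would double-check (this is an assumption on my part, not something stated in the question): self.buckets lives on the instance and is never cleared, so repeated calls to r.sort on the same object keep appending to the old buckets, and big_o times the same callable many times. A minimal sketch that sidesteps this by building a fresh instance inside the timed callable:
import big_o

# construct a fresh RadixSort per measured call so leftover bucket contents from
# earlier calls cannot inflate later timings
best_fit, _ = big_o.big_o(lambda data: RadixSort().sort(data),
                          lambda n: big_o.datagen.integers(n, 1, 9999999),
                          n_measures=10)
print('r.sort complexity:', best_fit)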
I'm currently working on a long-short portfolio optimization project in Python.
The thing is that I have to generate a list that sums to 1, where each element is greater than or equal to -1 and less than or equal to 5.
The length of the list must be 5 and the elements should be floats.
Is there a way that I could make such a list?
Due to the constraint that the weights must sum to 1, there are only really four sources of randomness here. So, generate potential weights for the first four assets:
weights = [random.uniform(-1, 5) for i in range(4)]
then check whether the final weight, final_weight = 1 - sum(weights), meets the requirements. So something like:
import random

def generate_weights():
    while True:
        weights = [random.uniform(-1, 5) for i in range(4)]
        weights.append(1 - sum(weights))
        if -1 <= weights[-1] <= 5:
            return weights
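For example, a quick sanity check (assuming the function above):
weights = generate_weights()
print(weights, sum(weights))  # five floats, each within [-1, 5], summing to 1 up to rounding
Because candidate lists whose final weight falls outside [-1, 5] are rejected and regenerated, the loop may run several times before returning, but each attempt is cheap.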
This question is an extension of my previous question: Fast python algorithm to find all possible partitions from a list of numbers that has subset sums equal to a ratio. I want to divide a list of numbers so that the ratios of the subset sums equal given values. The difference is that I now have a long list of 200 numbers, so full enumeration is infeasible. Note that although there are of course duplicate numbers in the list, every number is distinguishable.
import random
lst = [random.randrange(10) for _ in range(200)]
In this case, I want a function to stochastically sample a certain number of partitions whose subset sums are equal or close to the given ratios. This means that the solution can be sub-optimal, but I need the algorithm to be fast enough. I guess a greedy algorithm will do. That said, it would of course be even better if there is a relatively fast algorithm that can give the optimal solution.
For example, I want to sample 100 partitions, all with subset sum ratios of 4 : 3 : 3. Duplicate partitions are allowed but should be very unlikely for such a long list. The function should be used like this:
partitions = func(numbers=lst, ratios=[4, 3, 3], num_gen=100)
To test the solution, you can do something like:
from math import isclose
eps = 0.05
assert all([isclose(ratios[i] / sum(ratios), sum(x) / sum(lst), abs_tol=eps)
            for part in partitions for i, x in enumerate(part)])
Any suggestions?
You can use a greedy heuristic where you generate the partitions from num_gen random permutations of the list. Each random permutation is partitioned into len(ratios) contiguous sublists. The fact that the partition subsets are sublists of a permutation makes enforcing the ratio condition very easy to do during sublist generation: as soon as the sum of the sublist we are currently building reaches one of the ratios, we "complete" the sublist, add it to the partition and start creating a new sublist. We can do this in one pass through the entire permutation, giving us the following algorithm of time complexity O(num_gen * len(lst)).
M = 100
N = len(lst)
P = len(ratios)
R = sum(ratios)
S = sum(lst)
for _ in range(M):
    # get a new random permutation
    random.shuffle(lst)
    partition = []
    # starting index (in the permutation) of the current sublist
    lo = 0
    # permutation partial sum
    s = 0
    # index of sublist we are currently generating (i.e. what ratio we are on)
    j = 0
    # ratio partial sum
    rs = ratios[j]
    for i in range(N):
        s += lst[i]
        # if ratio of permutation partial sum exceeds ratio of ratio partial sum,
        # the current sublist is "complete"
        if s / S >= rs / R:
            partition.append(lst[lo:i + 1])
            # start creating new sublist from next element
            lo = i + 1
            j += 1
            if j == P:
                # done with partition
                # remaining elements will always all be zeroes
                # (i.e. assert should never fail)
                assert all(x == 0 for x in lst[i+1:])
                partition[-1].extend(lst[i+1:])
                break
            rs += ratios[j]
Note that the outer loop can be redesigned to loop indefinitely until num_gen good partitions are generated (rather than just looping num_gen times) for more robustness; a sketch of that variant is given below. This algorithm is expected to produce M good partitions in O(M) iterations (provided random.shuffle is sufficiently random) if the number of good partitions is not too small compared to the total number of partitions of the same size, so it should perform well for most inputs. For an (almost) uniformly random list like [random.randrange(10) for _ in range(200)], every iteration produces a good partition with eps = 0.05, as is evident from running the example below. Of course, how well the algorithm performs also depends on the definition of 'good': the stricter the closeness requirement (in other words, the smaller the epsilon), the more iterations it will take to find a good partition. This implementation can be found here, and will work for any input (assuming random.shuffle eventually produces all permutations of the input list).
You can find a runnable version of the code (with asserts to test how "good" the partitions are) here.
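For completeness, here is a sketch of that more robust variant (my own wrapper around the same greedy pass, not part of the original answer): it keeps shuffling until num_gen partitions pass the closeness test from the question, and it matches the func(numbers=..., ratios=..., num_gen=...) signature the question asked for.
import random
from math import isclose

def sample_good_partitions(numbers, ratios, num_gen, eps=0.05):
    numbers = list(numbers)  # shuffle a copy so the caller's list is left untouched
    P, R, S = len(ratios), sum(ratios), sum(numbers)
    results = []
    while len(results) < num_gen:
        random.shuffle(numbers)
        partition, lo, s, j, rs = [], 0, 0, 0, ratios[0]
        for i, x in enumerate(numbers):
            s += x
            if s / S >= rs / R:
                partition.append(numbers[lo:i + 1])
                lo, j = i + 1, j + 1
                if j == P:
                    partition[-1].extend(numbers[i + 1:])
                    break
                rs += ratios[j]
        # keep the partition only if every subset-sum ratio is within eps of its target
        if len(partition) == P and all(isclose(ratios[k] / R, sum(sub) / S, abs_tol=eps)
                                       for k, sub in enumerate(partition)):
            results.append(partition)
    return results

partitions = sample_good_partitions(lst, ratios=[4, 3, 3], num_gen=100)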
I was recently trying to solve a HackerEarth problem. The code worked on the sample inputs and some custom inputs that I gave, but when I submitted it, it failed for exceeding the time limit. Can someone explain how I can make the code run faster?
Problem Statement: Cyclic shift
A large binary number is represented by a string A of size N and comprises 0s and 1s. You must perform a cyclic shift on this string. The cyclic shift operation is defined as follows:
If the string A is [A0, A1,..., An-1], then after performing one cyclic shift, the string becomes [A1, A2,..., An-1, A0].
You performed the shift an infinite number of times and each time you recorded the value of the binary number represented by the string. The maximum binary number formed after performing the operation (possibly 0 times) is B. Your task is to determine the number of cyclic shifts that can be performed such that the value represented by the string A will be equal to B for the Kth time.
Input format:
First line: A single integer T denoting the number of test cases
For each test case:
First line: Two space-separated integers N and K
Second line: A denoting the string
Output format:
For each test case, print a single line containing one integer that represents the number of cyclic shift operations performed such that the value represented by string A is equal to B for the Kth time.
Code:
import math

def value(s):
    u = len(s)
    d = 0
    for h in range(u):
        d = d + (int(s[u-1-h]) * math.pow(2, h))
    return d

t = int(input())
for i in range(t):
    x = list(map(int, input().split()))
    n = x[0]
    k = x[1]
    a = input()
    v = 0
    for j in range(n):
        a = a[1:] + a[0]
        if value(a) > v:
            b = a
            v = value(a)
    ctr = 0
    cou = 0
    while ctr < k:
        a = a[1:] + a[0]
        cou = cou + 1
        if a == b:
            ctr = ctr + 1
    print(cou)
In the problem, the constraint on n is 0 <= n <= 1e5. In the function value(), you are calculating an integer from a binary string whose length can be up to 1e5, so the integer you compute can be as large as pow(2, 1e5). This is surely impractical.
As mentioned by Prune, you should use an efficient algorithm for finding a subsequence, say sub1, whose repetitions make up the given string A. If you solve this by brute force, the time complexity will be O(n*n); since the maximum value of n is 1e5, the time limit will be exceeded, so use an efficient algorithm.
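Just to illustrate the scale (my own example, not part of the answer above): a 100,000-bit number has roughly 30,000 decimal digits, so every comparison and addition on it is far from constant time.
n_bits = 10 ** 5
print(len(str(2 ** n_bits)))  # 30103 decimal digits for a 100,000-bit number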
I can't do much with the code you posted, since you obfuscated it with meaningless variable names and a lack of explanation. When I scan it, I get the impression that you've taken the straightforward approach of doing a single-digit shift in a long-running loop, counting iterations until you hit B for the Kth time.
This is easy to understand, but cumbersome and inefficient.
Since the cycle repeats every N iterations, you gain no new information from repeating that process. All you need to do is find where in the series of N iterations you encounter B ... which could be multiple times.
In order for B to appear multiple times, A must consist of a particular sub-sequence of bits, repeated 2 or more times. For instance, 101010 or 011011. You can detect this with a simple addition to your current algorithm: at each iteration, check to see whether the current string matches the original. The first time you hit this, simply compute the repetition factor as rep = len(a) / j. At this point, exit the shifting loop: the present value of b is the correct one.
Now that you have b and its position in the first j rotations, you can directly compute the needed result without further processing.
I expect that you can finish the algorithm and do the coding from here.
Ah -- taken as a requirements description, the wording of your problem suggests that B is a given. If not, then you need to detect the largest value.
To find B, append A to itself and find the A-length substring with the largest value. You can speed this up by first locating the longest runs of 1s and then applying well-known string-search techniques to break ties among those candidates using the characters after the first 0 that follows each run.
Note that, while you iterate over A, you look for the first place in which you repeat the original value: this is the desired repetition length, which drives the direct-computation phase in the first part of my answer.
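Here is a sketch of that direct computation (my reading of the approach above, not the original poster's code): scan one full cycle of shifts, record every shift that produces the maximum rotation, then extend periodically to reach the Kth occurrence.
def kth_shift_to_max(a: str, k: int) -> int:
    n = len(a)
    doubled = a + a
    # B as a string: for equal-length binary strings, lexicographic order equals numeric order
    best = max(doubled[s:s + n] for s in range(n))
    # shift counts 1..n that reproduce B (shift n restores the original string)
    hits = [s for s in range(1, n + 1) if doubled[s % n:s % n + n] == best]
    full_cycles, remainder = divmod(k - 1, len(hits))
    return full_cycles * n + hits[remainder]

print(kth_shift_to_max("101010", 3))  # 6: B reappears every 2 shifts, so the 3rd hit is at shift 6
This sketch is still quadratic in the worst case because of the slicing; the repetition trick described above (stop as soon as the string first matches the original) reduces the scan to a single period.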
I have several vectors of scores, for example:
A = [1,2,3,4]
B = [6,8,2,1]
C = [2,7,9,4]
D = [4,3,2,1]
Where each score position corresponds to a cost on a separate cost vector:
cost = [0.25,0.5,0.75,1]
For example picking '8' from vector B carries a cost of 0.5.
How do I choose exactly one item from each of A to D so as to maximise the summed value (as per the knapsack problem)? This is subject to two constraints:
1) a budget constraint. i.e. if '*' denotes the position of the optimum combination in a vector, then:
cost[A*]+cost[B*]+cost[C*]+cost[D*] == budget
2) a range constraint on each vector, such that there is a minimum and maximum spend from each vector that can be set. i.e:
0.2 < cost[C*] < 0.6
I attempted a brute force version of http://rosettacode.org/wiki/Knapsack_problem/0-1#C.2B.2B.
But I wasn't sure how to extend it to multiple lists, and I am certain there is a more elegant solution. It does not necessarily have to be a knapsack approach.
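To make the setup concrete, here is a brute-force sketch of how I read the problem (assumed names and an assumed example budget; the range constraint is only applied to C, mirroring the example above): enumerate one index per vector with itertools.product and keep the best feasible combination.
from itertools import product

A, B, C, D = [1, 2, 3, 4], [6, 8, 2, 1], [2, 7, 9, 4], [4, 3, 2, 1]
cost = [0.25, 0.5, 0.75, 1]
budget = 2.0  # assumed example budget

best_score, best_choice = None, None
for ia, ib, ic, id_ in product(range(4), repeat=4):
    total_cost = cost[ia] + cost[ib] + cost[ic] + cost[id_]
    if abs(total_cost - budget) > 1e-9:  # budget constraint (== budget, as stated)
        continue
    if not (0.2 < cost[ic] < 0.6):  # range constraint on vector C
        continue
    score = A[ia] + B[ib] + C[ic] + D[id_]
    if best_score is None or score > best_score:
        best_score, best_choice = score, (ia, ib, ic, id_)

print(best_score, best_choice)
With only four vectors of four entries each there are just 256 combinations, so brute force is instant here, though it scales exponentially with the number of vectors.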