Optimizing this recursive function (dynamic programming)


I'm solving a very simple algorithm problem that requires recursion and memoization. The code below works, but it doesn't meet the time limit. Someone advised me to optimize tail recursion, but this is not tail recursion. This is just study material, not homework.
Question
• A snail can climb 2m per 1 day if it rains, 1m otherwise.
• The probability of raining per day is 75%.
• Given the number of days (<= 1000) and the height (<= 1000), calculate the probability that the snail can get out of the well (climb at least the height of the well).
This python code is implemented with recursion and memoization.
import sys
sys.setrecursionlimit(10000)
# Probability of success that snails can climb 'targetHeight' within 'days'
def successRate(days, targetHeight):
    global cache
    # edge case
    if targetHeight <= 1:
        return 1
    if days == 1:
        if targetHeight > 2:
            return 0
        elif targetHeight == 2:
            return 0.75
        elif targetHeight == 1:
            return 0.25
    answer = cache[days][targetHeight]
    # if the answer is not previously calculated
    if answer == -1:
        answer = 0.75 * (successRate(days - 1, targetHeight - 2)) + 0.25 * (successRate(days - 1, targetHeight - 1))
        cache[days][targetHeight] = answer
    return answer
height, duration = map(int, input().split())
cache = [[-1 for j in range(height + 1)] for i in range(duration + 1)] # cache initialized as -1
print(round(successRate(duration, height),7))

It is simple, so this is just a hint.
For the initial part, set:
# suppose cache is allocated
cache[1][1] = 0.25  # overwritten with 1 by the loop below, matching the targetHeight <= 1 edge case
cache[1][2] = 0.75
for i in range(3,targetHeight+1):
    cache[1][i] = 0
for i in range(days+1):
    cache[i][1] = 1
    cache[i][0] = 1
Then try to rewrite the recursive part using the initialized values (you should iterate bottom-up, like below). Finally, return the value of cache[days][targetHeight].
for i in range(2, days+1):
    for j in range(2, targetHeight+1):
        cache[i][j] = 0.75 * cache[i-1][j-2] + 0.25 * cache[i-1][j-1]
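The hint above can be assembled into a complete bottom-up function. The following is only a sketch under a few assumptions: the name success_rate is made up for illustration, and instead of the full days x height table it keeps just the previous day's row, which is a small variation on the answer's cache layout rather than the answer's exact code.

```python
def success_rate(days, target_height):
    """Probability of climbing at least target_height metres within `days` days.

    Hypothetical helper: 2 m on a rainy day (probability 0.75), 1 m otherwise.
    """
    # prev[h] = probability of having climbed at least h metres so far.
    # With zero days elapsed, only a height of 0 has been reached.
    prev = [1.0] + [0.0] * target_height
    for _ in range(days):
        cur = [1.0] * (target_height + 1)  # cur[0] stays 1.0
        for h in range(1, target_height + 1):
            # If it rains we still need h - 2 metres from the earlier days;
            # a remaining height below zero counts as already reached.
            rain = prev[h - 2] if h >= 2 else 1.0
            dry = prev[h - 1]
            cur[h] = 0.75 * rain + 0.25 * dry
        prev = cur
    return prev[target_height]


print(success_rate(1, 2))  # one day, 2 m: only a rainy day works
print(success_rate(2, 3))
```

The loop runs days * target_height times with no recursion, so no recursion limit has to be raised.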

Related

What exactly is Stop in this question and how do I get the sum? [closed]

Problem Statement
Edit: I have transcribed the image as suggested although I think some terms are better shown in the picture if anything is unclear here;
This function takes in a positive integer n and returns the sum of the following series Sn, as long as the absolute value of each term is larger than stop.
Sn = 1 − 1/2 + 1/3 − 1/4 + ... + (−1)^(n+1)/n + ...
You can assume that stop is a float value and 0 < stop < 1.
You need not round the output.
For example, if stop = 0.249, then Sn is evaluated with only four terms.
Sn = 1 − 1/2 + 1/3 − 1/4
For example, if stop = 0.199, then Sn is evaluated with only five terms.
Sn = 1 − 1/2 + 1/3 − 1/4 + 1/5
The built-in function abs() is useful. You should use a while loop.
Test cases:
print( alternating_while(0.249) )
print( alternating_while(0.199) )
gives:
0.5833333333333333
0.7833333333333332
Now for this question, I want to get the sum of this series based on the conditions stipulated in the question.
My problem is I don't understand how to type the formula given in the question because I'm not familiar with how the while-loop works. Can someone instruct me on how to?
def alternating_while(stop):
    total = 0
    n = 1
    term = 1
    while abs(term) > stop:
        total= (-1) ** (n + 1) / n + alternating_while(n - 1)
    return total

No reason to use recursion as it wasn't mentioned as a requirement. Just check the term in the while loop for the stop condition:
Python 3.8+ (for the := operator):
def alternating_while(stop):
    n = 1
    total = 0
    while abs(term := (-1)**(n+1)/n) > stop:
        total += term
        n += 1
    return total
print(alternating_while(0.249))
print(alternating_while(0.199))
Output:
0.5833333333333333
0.7833333333333332
Pre-Python 3.8 version:
def alternating_while(stop):
    n = 1
    total = 0
    while True:
        term = (-1)**(n+1)/n
        if abs(term) <= stop:
            break
        total += term
        n += 1
    return total
Or:
def alternating_while(stop):
    n = 1
    total = 0
    term = (-1)**(n+1)/n
    while abs(term) > stop:
        total += term
        n += 1
        term = (-1)**(n+1)/n # redundant
    return total

The key is "alternating". You can just increment the current denominator one at a time. If it is odd, you add. Otherwise, you subtract. abs is not really required; I'm not sure why they would mention it.
def alternating_while(stop):
    total = 0
    denom = 1
    while 1/denom > stop:
        if denom & 1:
            total += 1/denom
        else:
            total -= 1/denom
        denom += 1
    return total
print(alternating_while(0.249))
print(alternating_while(0.199))
Output:
0.5833333333333333
0.7833333333333332

You need to cycle between adding and subtracting. The itertools module has a very helpful cycle class which you could use like this:
from itertools import cycle
from operator import add, sub
def get_term(d=2):
    while True:
        yield 1 / d
        d += 1
def calc(stop=0.199):
    c = cycle((sub, add))
    term = get_term()
    Sn = 1
    while (t := next(term)) > stop:
        Sn = next(c)(Sn, t)
    return Sn
print(calc())
Output:
0.7833333333333332
Note:
The reference in the problem statement to absolute values seems to be irrelevant here, as the magnitudes 1/d that are compared against stop are never negative

I understand you need to use while in this particular problem, and this answer won't immediately help you as it is probably a few steps ahead of the current level of your course. The hope however is that you'll find it intriguing, and will perhaps come back to it in the future when you start being interested in performance and the topics introduced here.
from math import ceil
def f(stop):
    n = ceil(1 / stop) - 1
    return sum([(2 * (k & 1) - 1) / k for k in range(1, n + 1)])
Explanation
First, we want to establish n ahead of time, so that we avoid evaluating a stopping test on every iteration. Instead, the main loop is now for k in range(1, n + 1), which goes from 1 to n inclusive.
We use the oddness of k (k & 1) to determine the sign of each term, i.e. +1 for k == 1, -1 for k == 2, etc.
We build the series of terms in a list comprehension (for speed).
(A point often missed by many Pythonistas): building the list using such a comprehension and then summing it is, counter-intuitively, slightly faster than summing directly from a generator. In other words, sum([expr for k in generator]) is faster than sum(expr for k in generator). Note: I haven't tested this with Python 3.11 and that version of Python has many speed improvements.
For fun, you can slightly change the loop above to return the sign and denominator of each term and inspect them:
def g(stop):
    n = ceil(1 / stop) - 1
    return [(2 * (k & 0x1) - 1, k) for k in range(1, n + 1)]
>>> g(.249)
[(1, 1), (-1, 2), (1, 3), (-1, 4)]
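As a quick sanity check of the closed-form term count, the sketch below compares ceil(1 / stop) - 1 with the count a plain while loop produces. The function names are made up for illustration, and only the two stop values from the problem statement are checked; values where 1 / stop is (nearly) an exact integer may be sensitive to floating-point rounding, which this sketch does not investigate.

```python
from math import ceil, isclose


def terms_by_loop(stop):
    """Count the terms 1/k that exceed stop using the while-loop test."""
    k = 1
    while 1 / k > stop:
        k += 1
    return k - 1


def terms_by_formula(stop):
    """Closed-form count: the largest integer k with 1/k > stop."""
    return ceil(1 / stop) - 1


def alternating_sum(stop):
    """Sum the alternating series over the precomputed number of terms."""
    n = terms_by_formula(stop)
    return sum((2 * (k & 1) - 1) / k for k in range(1, n + 1))


for stop in (0.249, 0.199):
    print(stop, terms_by_loop(stop), terms_by_formula(stop), alternating_sum(stop))
```

The sums are compared with a tolerance rather than exact equality in any test, because sum() may accumulate floats slightly differently from a manual running total depending on the Python version.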

Python - "Fast Exponention Algorithm" performance outperformed by "worse" algorithm

Can anyone explain to me how this code:
def pow1(a, n):
    DP = [None] * (n+1)
    DP[1] = a
    i = [1]
    while not i[-1] == n:
        if(i[-1]*2 <= n):
            DP[i[-1]*2] = DP[i[-1]]*DP[i[-1]]
            i.append(i[-1]+i[-1])
        else:
            missing = n-i[-1]
            low = 0
            high = len(i) - 1
            mid = 0
            while low <= high:
                mid = (high + low) // 2
                if i[mid] < missing:
                    if i[mid+1] > missing:
                        break
                    low = mid + 1
                elif i[mid] > missing:
                    high = mid - 1
                else:
                    break
            DP[i[-1]+i[mid]] = DP[i[-1]]*DP[i[mid]]
            i.append(i[mid]+i[-1])
    return DP[n]
out-performs this code:
def pow2(a, n):
    res = 1
    while (n > 0):
        if (n & 1):
            res = res * a
        a = a * a
        n >>= 1
    return res
Here is how I check them:
import timeit
a = 34 # just arbitrary
n = 2487665 # just arbitrary
starttime = timeit.default_timer()
pow1(a, n)
print("pow1: The time difference is :", (
    timeit.default_timer() - starttime))
starttime = timeit.default_timer()
pow2(a, n)
print("pow2: The time difference is :",
    (timeit.default_timer() - starttime))
This is the result on my MacBook Air m1:
# pow1: The time difference is : 3.71763225
# pow2: The time difference is : 6.091892
As far as I can tell they work very similarly, only the first one (pow1) stores all its intermediate results, so it has something like O(log(n)) space complexity, and then has to find all the required factors (sub-products) to get the final result. So computing them all is O(log(n)), and in the worst case it has to do log(n) binary searches over the log(n) stored entries, resulting in a runtime of O(logN*log(logN)). Whereas pow2 essentially never has to search through previous results. So pow2 has time complexity O(logN) and auxiliary space O(1), vs pow1 with time complexity O(logN*log(logN)) and auxiliary space O(logN).
Maybe (probably) I'm missing something essential but I don't see how the algorithm, the hardware (ARM), python or the testing could have this impact.
Also I just realised the space complexity for pow1 is O(n) the way I did it.

Okay, so I figured it out. In pow2 (which I implemented from cp-algorithms.com but which can also be found on geeksforgeeks.org in Python) there is a performance bug.
The problem is that this line gets executed one time too many:
res = 1
while (n > 0):
    if (n & 1):
        res = res * a
    a = a * a #<--- THIS LINE
    n >>= 1
return res
It gets executed even though the result has already been calculated, causing the function to do one more unnecessary multiplication, which has a big impact with big numbers. Here is a very quick fix:
def pow2(a, n):
    res = 1
    while (n > 0):
        if n & 1:
            res = res * a
        if n == 1:
            return res
        a = a * a
        n >>= 1
    return res
New measurements:
# a:34 n:2487665
# pow1(): The time difference is : 3.749621834
# pow2(): The time difference is : 3.072042833
# a**n: The time difference is : 2.119430791000001

Great catch. It's a weakness of while loops that they lead to wasted computations at the bottom of the loop body. Usually it doesn't matter.
Languages like Ada provide an unconditional loop expecting that an internal break (exit in Ada) will be used to leave for this reason.
So fwiw, a cleaner code using this "break in the middle" style would be:
def pow2(a, n):
    res = 1
    while True:
        if (n & 1):
            res *= a
        n >>= 1
        if n == 0:
            return res
        a *= a
With other algorithms, you might need to guard the loop for the case n = 0. But with this one, that's optional. You could check for that and return 1 explicitly before the loop if desired.
Overall, this avoids a comparison per loop wrt your solution. Maybe with big numbers this is worthwhile.
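Both loop shapes discussed in this thread can be checked against Python's built-in ** operator. The sketch below is only an illustration: the names pow2_original and pow2_break_in_middle are made up here, and the check covers correctness, not timing. Note that the squaring skipped by the second version is also the largest one, since the operand roughly doubles in size at every step.

```python
def pow2_original(a, n):
    """Binary exponentiation as in the question: squares `a` once more than needed."""
    res = 1
    while n > 0:
        if n & 1:
            res *= a
        a *= a  # on the last pass this square is computed but never used
        n >>= 1
    return res


def pow2_break_in_middle(a, n):
    """Same algorithm, leaving the loop before the unused final squaring."""
    res = 1
    while True:
        if n & 1:
            res *= a
        n >>= 1
        if n == 0:
            return res
        a *= a


# Compare both variants with the built-in operator on small inputs.
for base in (2, 3, 34):
    for exponent in range(200):
        expected = base ** exponent
        assert pow2_original(base, exponent) == expected
        assert pow2_break_in_middle(base, exponent) == expected
print("both variants agree with ** on the tested inputs")
```

For timing comparisons, timeit.repeat with several repetitions tends to give steadier numbers than a single default_timer difference.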

Python: Why does the value always return 0

I'm new to Python and I'm trying to solve this problem for school.
Airlines find that each passenger who reserves a seat fails to turn up with probability q independently of the other passengers. So Airline A always sell n tickets for their n−1 seat aeroplane while Airline B always sell 2n tickets for their n − 2 seat aeroplane. Which is more often over-booked?
This is my code:
import random
arrive = [True, False]
def airline_a(q, n, sims):
    sim_counter = 0
    reserved_seats = []
    overbooked = 0
    for x in range(n):
        reserved_seats.append(x)
    while sim_counter != sims:
        passengers = 0
        for x in range(len(reserved_seats)):
            success = random.choices(arrive, weights=(100 - (q * 100), q * 100), k=1)
            if success == True:
                passengers += 1
        if passengers > n - 1:
            overbooked += 1
        sim_counter += 1
    print("Results: ")
    print(f"Probability that Airline A is overbooked: ", overbooked / sims)
airline_a(.7, 100, 1000)
I am supposed to write a similar function for Airline B and compare both values, but I came across an issue.
The problem is that the output is always 0. I don't quite understand what I'm doing wrong.

random.choices() returns a list, so success == True will never hold: success is [True] or [False]. Try changing it to this:
success = random.choices(arrive, weights=(100 - (q * 100), q * 100), k=1)[0]
Secondly, you might need to adjust your function slightly. You sell 100 tickets, and each passenger comes back True with probability .30 and False with probability .70. To be overbooked you need ALL passengers to come back as True, so that passengers > n - 1 (which is 100 passengers > 99).
You'd have to change your q to be a very small probability (i.e., a high probability that a passenger DOES show up, so that you can get cases where all 100 show up).
Think of it this way: if you flip a coin 100 times, you need it to come up HEADS 100 times to be greater than 100-1. If the coin is weighted to give you .30 HEADS and .70 TAILS, then while THEORETICALLY it could happen (you can calculate that probability), you will "never" get 100 HEADS in an experiment or simulation. You'd need a weight of something like .999 HEADS and .001 TAILS for a chance to get 100 HEADS out of 100 tosses.
import random
arrive = [True, False]
def airline_a(q, n, sims):
    sim_counter = 0
    reserved_seats = []
    overbooked = 0
    for x in range(n):
        reserved_seats.append(x)
    alpha = 100 - (q * 100)
    beta = q * 100
    while sim_counter != sims:
        passengers = 0
        for x in range(len(reserved_seats)):
            success = random.choices(arrive, weights=(alpha, beta), k=1)[0]
            if success == True:
                passengers += 1
        if passengers > n - 1:
            overbooked += 1
        sim_counter += 1
    print("Results: ")
    print(f"Probability that Airline A is overbooked: ", overbooked / sims)
airline_a(.001, 100, 1000)
Gave me an output of:
Results:
Probability that Airline A is overbooked: 0.906
You can solve this mathematically by the way instead of a simulation.
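One way to do the calculation directly, as a sketch: the number of passengers who show up follows a binomial distribution, so the overbooking probability is a tail sum. The function names below (overbook_probability, airline_a_exact, airline_b_exact) are made up for illustration, and the plain float arithmetic is only suitable for moderate n, since very large binomial coefficients cannot be converted to floats.

```python
from math import comb


def overbook_probability(tickets, seats, q):
    """P(more than `seats` of `tickets` passengers show up).

    Each passenger independently fails to show up with probability q.
    """
    show = 1.0 - q
    return sum(
        comb(tickets, k) * show**k * q ** (tickets - k)
        for k in range(seats + 1, tickets + 1)
    )


def airline_a_exact(q, n):
    # Airline A: n tickets for n - 1 seats, overbooked only if everyone shows up.
    return overbook_probability(n, n - 1, q)


def airline_b_exact(q, n):
    # Airline B: 2n tickets for n - 2 seats.
    return overbook_probability(2 * n, n - 2, q)


print(airline_a_exact(0.001, 100))  # equals 0.999 ** 100, roughly 0.905
print(airline_a_exact(0.7, 100))    # equals 0.3 ** 100, effectively zero
```

For Airline A the sum has a single term, (1 - q)^n, which for q = 0.001 and n = 100 is in line with the simulated 0.906 above.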

The main problem with the code is that you are comparing a list to a boolean, because the random.choices() function returns a list.
So, first of all, you have to take the first element of the success list:
success = random.choices(arrive, weights=(100 - (q * 100), q * 100), k=1)
if success[0]:
    passengers += 1
or
success = random.choices(arrive, weights=(100 - (q * 100), q * 100), k=1)[0]
if success:
    passengers += 1
The second thing to keep in mind, maybe the most important, is that you will still almost always get 0 as a result. The way the logic is set up, it is statistically almost impossible that every individual shows up, and in case A a single individual who does not show up is enough for there to be no overbooking.
EXAMPLE
I performed some tests, modifying the probability that a person will not show up.
I would like to mention that 0.7 is a 70% probability, which is very high in this context: out of 100 people, on average only 30 will be present on the flight in each simulation, which is why the flight is never overbooked.
The first result other than 0.0 that I got was with q = 0.05, which gave an overbooking probability of 0.007 (which is 0.7%).
PLANE B
I will not write out the solution for simulating Airline B; it only requires modifying 1 line of code, and it is better that you do it yourself since this is an assignment. However, I did some tests for this second case too, and here the situation becomes statistically much more interesting.
By setting q = 0.4 we get an overbooking probability of 0.97 (97%) on average, and with q = 0.5 about 0.45 (45%) on average.

The problem with your code is this line: if success == True:. If you try printing success, it outputs either [True] or [False]. You can fix this by taking the first value of success in the if statement like this:
if success[0] == True:
    passengers += 1
and to simplify the code a bit:
if success[0]:
    passengers += 1
Change your code to the following to fix the error. However, the count in the variable passengers will still stay below n, so the result will still be 0; try reducing the value of n or the value of q:
import random
arrive = [True, False]
def airline_a(q, n, sims):
    sim_counter = 0
    reserved_seats = []
    overbooked = 0
    for x in range(n):
        reserved_seats.append(x)
    while sim_counter != sims:
        passengers = 0
        for x in range(len(reserved_seats)):
            success = random.choices(arrive, weights=(100 - (q * 100), q * 100), k=1)
            if success[0]:
                passengers += 1
        if passengers > n - 1:
            overbooked += 1
        sim_counter += 1
    print("Results: ")
    print(f"Probability that Airline A is overbooked: ", overbooked / sims)
airline_a(.7, 100, 1000)
For an explanation on random.choices() visit the following link
https://www.w3schools.com/python/ref_random_choices.asp

How can I improve my spiral index function?

Introduction
I am taking an online Introduction to Computer Science course for which I was given the task to create a function that takes in a coordinate and returns its corresponding "spiral index". I have a working function, but the online learning platform tells me that my code takes too long to execute (I am currently learning about code complexity and Big O notation).
The Problem Statement
All numbers on an infinite grid that extends in all four directions can be identified with a single number in the following manner:
Where 0 corresponds with the coordinate (0, 0), 5 corresponds with the coordinate (-1, 0), and 29 with (3, 2).
Create a function which returns the spiral index of any pair of coordinates that are input by the user.
Examples:
spiral_index(10, 10) returns 380.
spiral_index(10, -10) returns 440.
spiral_index(3, 15) returns 882.
spiral_index(1000, 1000) returns 3998000.
My Approach
def spiral_index(x, y):
    if x == 0 and y == 0:
        return 0
    pos = [1, 0]
    num = 1
    ring_up = 0
    ring_left = 0
    ring_down = 0
    ring_right = 0
    base = 3
    while pos != [x, y]:
        if ring_up < base - 2:
            pos[1] += 1
            ring_up += 1
            num += 1
        elif ring_left < base - 1:
            pos[0] -= 1
            ring_left += 1
            num += 1
        elif ring_down < base - 1:
            pos[1] -= 1
            ring_down += 1
            num += 1
        elif ring_right < base:
            pos[0] += 1
            ring_right += 1
            num += 1
        else:
            base = base + 2
            ring_up = 0
            ring_left = 0
            ring_down = 0
            ring_right = 0
    return num
The above code is able to find the correct index for every coordinate, and (on my laptop) computes spiral_index(1000, 1000) in just over 2 seconds (2.06).
Question
I have seen some solutions people have posted to a similar problem. However, I am wondering what is making my code so slow? To my knowledge, I believe that it is executing the function in linear time (is that right?). How can I improve the speed of my function? Can you post a faster function?
The course told me that a for loop is generally faster than a while loop, and I am guessing that the conditional statements are slowing the function down as well.
Any help on the matter is greatly appreciated!
Thanks in advance.

First of all, your solution walks the spiral one cell at a time, so for two coordinates A and B it is O(AB), which can be considered quadratic. Second of all, the similar problem involves constructing the entire spiral, which means there is no better solution for it. You only have to index the spiral, which means there's no reason to traverse all of it. You can simply find which ring of the spiral you're on, then do some math to figure out the number. The center has 1 element, the next ring has 8, the next 16, the next 24, and it always increases by 8 from there. This solution is constant time and can almost instantly calculate spiral_index(1000, 1000).
def spiral_index(x, y):
    ax = abs(x)
    ay = abs(y)
    # find loop number in spiral
    loop = max(ax, ay)
    # one less than the edge length of the current loop
    edgelen = 2 * loop
    # the numbers in the inner loops
    prev = (edgelen - 1) ** 2
    if x == loop and y > -loop:
        # right edge
        return prev + y - (-loop + 1)
    if y == loop:
        # top edge
        return prev + loop - x + edgelen - 1
    if x == -loop:
        # left edge
        return prev + loop - y + 2 * edgelen - 1
    if y == -loop:
        # bottom edge
        return prev + x + loop + 3 * edgelen - 1
    raise Exception("this should never happen")
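The ring arithmetic this answer relies on can be checked in isolation. The sketch below uses two made-up helper names, ring_size and cells_inside_ring, and only verifies that the ring sizes 1, 8, 16, 24, ... add up to the (2k - 1)^2 value that the answer stores in prev.

```python
def ring_size(k):
    """Number of cells on ring k: 1 for the centre, then 8, 16, 24, ..."""
    return 1 if k == 0 else 8 * k


def cells_inside_ring(k):
    """Number of cells strictly inside ring k; ring 0 is the origin alone."""
    return (2 * k - 1) ** 2 if k > 0 else 0


# The cells on all inner rings add up to an odd square, so the first index
# on ring k (indices start at 0) is exactly cells_inside_ring(k).
for k in range(1, 50):
    assert sum(ring_size(j) for j in range(k)) == cells_inside_ring(k)

print(cells_inside_ring(10))  # first index on ring 10, the ring containing (10, 10)
```

Because the starting index of a ring is available in closed form, only the offset along one of the four edges remains to be computed, which is what the four if branches do.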

Splitting the unit segment into two parts recursively

I would like to create a simple multifractal (Binomial Measure). It can be done as follows:
The binomial measure is a probability measure which is defined conveniently via a recursive construction. Start by splitting $ I := [0, 1] $ into two subintervals $ I_0 $ and $ I_1 $ of equal length and assign the masses $ m_0 $ and $ m_1 = 1 - m_0 $ to them. With the two subintervals one proceeds in the same manner and so forth: at stage two, e.g. the four subintervals $ I_{00}, I_{01}, I_{10}, I_{11} $ have masses $ m_0m_0, m_0m_1, m_1m_0, m_1m_1 $ respectively.
Rudolf H. Riedi. Introduction to Multifractals
And it should look like this at the 13th iteration:
I tried to implement it recursively but something went wrong: it uses the previously changed interval in both the left child and the right one
def binom_measuare(iterations, val_dct=None, interval=[0, 1], p=0.4, temp=None):
    if val_dct is None:
        val_dct = {str(0.0): 0}
    if temp is None:
        temp = 0
    temp += 1
    x0 = interval[0] + (interval[1] - interval[0]) / 2
    x1 = interval[1]
    print(x0, x1)
    m0 = interval[1] * p
    m1 = interval[1] * (1 - p)
    val_dct[str(x0)] = m0
    val_dct[str(x1)] = m1
    print('DEBUG: iter before while', iterations)
    while iterations != 0:
        if temp % 2 == 0:
            iterations -= 1
            print('DEBUG: iter after while (left)', iterations)
            # left
            interval = [interval[0] + (interval[1] - interval[0]) / 2, interval[1] / 2]
            binom_measuare(iterations, val_dct, interval, p=0.4, temp=temp)
        elif temp % 2 == 1:
            print('DEBUG: iter after while (right)', iterations)
            # right
            interval = [interval[0] + (interval[1] - interval[0]) / 2, interval[1]]
            binom_measuare(iterations, val_dct, interval, p=0.4, temp=temp)
    else:
        return val_dct
Also, I have tried to do this using a for loop, and it does a good job up to the second iteration. On the third iteration, however, it uses 2^3 multipliers rather than 3 (as in $ m_0m_0m_0 $), on the fourth 2^4 rather than 4, and so on:
iterations = 4
interval = [0, 1]
val_dct = {str(0.0): 0}
p = 0.4
for i in range(1, iterations):
    splits = 2 ** i
    step = interval[1] / splits
    print(splits)
    for k in range(1, splits + 1):
        deg0 = splits // 2 - k // 2
        deg1 = k // 2
        print(deg0, deg1)
        val_dct[str(k * step)] = p ** deg0 * (1 - p) ** deg1
print(val_dct)
The concept seems very easy to implement and probably someone has already done it. Am I just looking from another angle?
UPD: Please make sure that your suggestion can achieve the results that are illustrated in the figure above (p=0.4, iteration=13).
UPUPD: Bill Bell provided a nice idea to achieve what Riedi mentioned in the article. I used Bill's approach and wrote a function that implements it for needed number of iterations and $m_0$ (please see my answer below).

If I understand the principle correctly you could use the sympy symbolic algebra library for making this calculation, along these lines.
>>> from sympy import *
>>> var('m0 m1')
(m0, m1)
>>> layer1 = [m0, m1]
>>> layer2 = [m0*m0, m0*m1, m0*m1, m1*m1]
>>> layer3 = []
>>> for item in layer2:
...     layer3.append(m0*item)
...     layer3.append(m1*item)
...
>>> layer3
[m0**3, m0**2*m1, m0**2*m1, m0*m1**2, m0**2*m1, m0*m1**2, m0*m1**2, m1**3]
The intervals are always of equal size.
When you need to evaluate the distribution you can use the following kind of code.
>>> [_.subs(m0,0.3).subs(m1,0.7) for _ in layer2]
[0.0900000000000000, 0.210000000000000, 0.210000000000000, 0.490000000000000]

I think that the problem is in your while loop: it doesn't properly handle the base case of your recursion. It stops only when iterations is 0, but keeps looping otherwise. If you want to debug why this forms an infinite loop, I'll leave it to you. Instead, I did my best to fix the problem.
I changed the while to a simple if, made the recursion a little safer by not changing iterations within the routine, and made interval a local copy of the input parameter. You're using a mutable object as a default value, which is dangerous.
def binom_measuare(iterations, val_dct=None, span=[0, 1], p=0.4, temp=None):
    interval = span[:]
    ...
    ...
    print('DEBUG: iter before while', iterations)
    if iterations > 0:
        if temp % 2 == 0:
            print('DEBUG: iter after while (left)', iterations)
            # left
            interval = [interval[0] + (interval[1] - interval[0]) / 2, interval[1] / 2]
            binom_measuare(iterations-1, val_dct, interval, 0.4, temp)
        else:
            print('DEBUG: iter after while (right)', iterations)
            # right
            interval = [interval[0] + (interval[1] - interval[0]) / 2, interval[1]]
            binom_measuare(iterations-1, val_dct, interval, 0.4, temp)
    else:
        return val_dct
This terminates and seems to give somewhat sane results. However, I wonder about your interval computations, when the right boundary can often be less than the left. Consider [0.5, 1.0] ... the left-child recursion will be on the interval [0.75, 0.5]; is that what you wanted?

This is my adaptation of @Bill Bell's answer to my question. It generalizes the idea that he provided.
import matplotlib.pyplot as plt
from sympy import var
def binom_measuare(iterations, p=0.4, current_layer=None):
    var('m0 m1')
    if current_layer is None:
        current_layer = [1]
    next_layer = []
    for item in current_layer:
        next_layer.append(m0*item)
        next_layer.append(m1*item)
    if iterations != 0:
        # pass p along so a non-default value survives the recursion
        return binom_measuare(iterations - 1, p=p, current_layer=next_layer)
    else:
        return [i.subs(m0, p).subs(m1, 1 - p) for i in next_layer]
Let's plot the output
y = binom_measuare(iterations=12)
x = [(i+1) / len(y) for i in range(len(y))]
x = [0] + x
y = [0] + y
plt.plot(x, y)
I think we have it.
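The same layer-doubling idea also works with plain floats, without symbolic algebra. The following is a sketch under that assumption, not the code from the answers above: the name binomial_measure is made up, and here iterations counts the number of splits directly, so 13 splits give 2**13 subintervals.

```python
def binomial_measure(iterations, m0=0.4):
    """Masses of the 2**iterations equal subintervals of [0, 1], left to right."""
    m1 = 1.0 - m0
    layer = [1.0]  # stage 0: the whole interval carries mass 1
    for _ in range(iterations):
        # Every interval splits into a left child (mass * m0)
        # and a right child (mass * m1), kept in left-to-right order.
        layer = [mass * m for mass in layer for m in (m0, m1)]
    return layer


masses = binomial_measure(13)
print(len(masses), sum(masses))  # 8192 subintervals, total mass close to 1
```

The resulting list can be plotted against the interval endpoints in the same way as in the answer above.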
