How is it possible to call functions in PuLP (Python) constraints?

Use case
This is just a simple example to illustrate how and why it does not work as expected.
There is a set of processes, each of which has a start and a finish timestamp.
The start timestamp of a process must be after the finish timestamp of its predecessor. So far, so good.
Consideration
Regarding constraints: Shouldn't it be possible to carry out more complex operations than arithmetic equations (e.g. queries and case distinctions)?
This is illustrated in the code below.
The standard formulation of the constraint works properly.
But it fails, if you put a function call into the equation.
def func(p):
    if self.start_timestamps[p] >= self.end_timestamps[p-1]:
        return 1
    return 0

# constraint for precedences of processes
for process_idx in self.processes:
    if process_idx > 0:
        # works fine!
        model += self.start_timestamps[process_idx] >= self.end_timestamps[process_idx-1]
        # doesn't work, but should?!
        model += func(process_idx) == 1
Questions
Are there ways to resolve that via function call? (In a more complex case, for example, you would have to make different queries and iterations within the function.)
If it is not possible with PuLP, are there other OR libraries, which can process things like that?
Thanks!

Related

Unusual behaviour of Z3py optimization program

I'm writing a program that scrapes option-chain data off the TMX website and suggests an optimized covered-call options portfolio based on those data. For the optimization process, I used the z3py library, as discussed on this website by many users. The optimization works by maximizing the premiums and setting the portfolio delta to a user-specified amount.
Initially, I made a mistake in my calculation of the portfolio delta, which made everything work smoothly, but I'm facing issues after correcting it. The portfolio delta is calculated by taking the weighted average delta of the non-zero positions in the portfolio. To achieve this I used the following setup:
from z3 import *

eng = Optimize()
Weights = [Real(row.symbol) for row in df.itertuples()]
# to count non-zero positions
TotCount = Real("TotCount")
eng.add(TotCount == Sum([If(w > 0, 1, 0) for w in Weights]))
eng.add(TotCount == If(TotCount >= 0, TotCount, 1))
# to get portfolio delta
TotDelta = Real("TotDelta")
eng.add(TotDelta == (Sum([(w * row.delta) for w, row in zip(Weights, df.itertuples())]) / TotCount))
eng.add(TotDelta == delta)
res = eng.check()
The weird behavior happens when I run this as a function in a loop with different values for delta. The first iteration actually does the calculation and spits out an answer, but after that my code gets stuck on the last line for hours without any progress. I tried a few things, like completely reformatting it, but nothing seems to make a difference. Does anyone know what's happening here?
Unfortunately, there isn't much in your description for anyone to go on without knowing further details of your setup. All one can deduce from your text is that the constraints are hard to solve for those values of delta where it takes longer. Without seeing the rest of your program, it's impossible to opine on what else might be going on.
See if you can isolate the value of delta that it runs slowly on, and just run it for that one instance separately to see if there's interaction coming from elsewhere. That's by no means a solution of course, but it's one way to get started.
One thing I noticed, though, is this line you have:
eng.add(TotCount == If(TotCount >= 0, TotCount, 1))
Let's see what this line is saying: if TotCount >= 0, then it says TotCount == TotCount, i.e., it puts no restriction on it. Otherwise, it says TotCount == 1; i.e., if TotCount < 0, then TotCount == 1. But this latter statement is obviously false. That is, if the other constraints force TotCount < 0, then your program would be unsat. This essentially means the whole line is functionally equivalent to:
eng.add(TotCount >= 0)
It's hard for anyone to tell if that's what you intended, but I suspect it isn't, since you'd otherwise have just written the simpler form above. Perhaps you wanted to say something more along the lines of "if TotCount < 0, then make it 1." But looking at the previous line (i.e., the Sum expression), we see that this will never be the case. So something is fishy there.
Hope that helps to get you started; note that you cannot model "sequential assignment" like in a programming language in this way. (You'd need to do what's known as static single assignment, a.k.a. SSA, conversion.) But then again, without knowing your exact intention, it's hard to opine.

How does Python stop a recursion?

Imagine I calculate the Fibonacci sequence with the (obviously inefficient) recursive algorithm:
def Fibo(n):
    if n <= 1:
        return n
    else:
        return Fibo(n-2) + Fibo(n-1)
Then my question is: how does Python know it has to stop the recursion at n = 0?
After all, if I call Fibo(-12), Python answers -12, so why does it stop the recursion at n = 0 when calling Fibo(12), for instance?
Edit after a few comments:
This question has nothing to do with the mathematical concept of recurrence. I know a recurrence stops at the initialized point. I would like to understand how recursion is implemented on a computer. To me it is absolutely not clear why a computer should stop when there is no explicit stop command. What prevents Fibo(0) = Fibo(-1) + Fibo(-2) from continuing endlessly? After all, I specified that Fibo(-1) = -1, Fibo(-2) = -2, ... and I might want to sum all the negative numbers as well...
I confess that in the latter case I would rather use a while loop.
The definition is declarative: there is no loop running, so there is nothing to "stop". You are (still) thinking in terms of iterative programming and assuming some kind of loop which needs to terminate at some time. This is not the case.
Instead, in this paradigm you just state that the return value is the sum of the prior two numbers. At this point you don't care how the prior numbers are produced; you just assume that they already exist.
Of course they don't, and you will have to compute them as well, but this is still not a loop which needs a stop. Instead it is a recursion which has an anchor. With each recursion step the values become smaller and smaller, and once they drop below 2 you just return 0 or 1 without any further recursion. This is your anchor.
Feel free to think of it as a "stopping point" but be aware that there is no loop happening which you need to break out of or similar.
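The anchor can be made visible by tracing each call: once n drops below 2, the function returns immediately without recursing further. A small sketch (the depth parameter is added here only to indent the trace):

```python
def fibo(n, depth=0):
    # indent by recursion depth so the call tree is visible
    print("  " * depth + "fibo(%d)" % n)
    if n <= 1:        # the anchor: no further recursive calls
        return n
    return fibo(n - 2, depth + 1) + fibo(n - 1, depth + 1)

print(fibo(4))  # → 3
```

Running this shows every call bottoming out at fibo(0) or fibo(1); the `n <= 1` branch is the only thing standing between the program and infinite recursion.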

Python SciPy ODE solver not converging

I'm trying to use SciPy's ode solver to plot the interaction of a 2D system of equations. I'm attempting to alter the parameters passed to the solver with the following block of code:
# define maximum number of iteration steps for ode solver iteration
m = 1       # power of iteration
N = 2**m    # number of steps
# setup a try-except formulation to increase the number of steps as needed for solution to converge
while True:
    try:
        z = ode(stateEq).set_integrator("vode", nsteps=N, method='bdf', max_step=5e5)
        z.set_initial_value(x0, t0)
        for i in range(1, t.size):
            if i % 1e3 == 0:
                print 'still integrating...'
            x[i, :] = z.integrate(t[i])  # get one more value, add it to the array
            if not z.successful():
                raise RuntimeError("Could not integrate")
        break
    except:
        m += 1
        N = 2**m
        if m % 2 == 0:
            print 'increasing nsteps...'
            print 'nsteps = ', N
Running this never breaks the while loop. It keeps increasing the nsteps forever and the system never gets solved. If I don't put it in the while loop, the system gets solved, I think, because the solution gets plotted. Is the while loop necessary? Am I formulating the solver incorrectly?
The parameter nsteps regulates how many integration steps can be maximally performed during one sampling step (i.e., a call of z.integrate). Its default value is okay if your sampling step is sufficiently small to capture the dynamics. If you want to integrate over a huge time span in one large sampling step (e.g., to get rid of transient dynamics), the value can easily be too small.
The point of this parameter is to avoid problems arising from unexpectedly very long integrations. For example, if you want to perform a given integration for 100 values of a control parameter in a loop over night, you do not want to see on the next morning that the No. 14 was pathological and is still running.
If this is not relevant to you, just set nsteps to a very high value and stop worrying about it. There is certainly no point to successively increase nsteps, you are just performing the same calculations all over again.
Running this never breaks the while loop. It keeps increasing the nsteps forever and the system never gets solved.
This suggests that you have a different problem than nsteps being exceeded, most likely that the problem is not well posed. Carefully read the error message produced by the integrator. I also recommend that you check your differential equations. It may help to look at the solutions until the integration fails to see what is going wrong, i.e., plot x after running this:
z = ode(stateEq)
z.set_integrator("vode", nsteps=1e10, method='bdf', max_step=5e5)
z.set_initial_value(x0, t0)
for i, time in enumerate(t):
    x[i, :] = z.integrate(time)
    if not z.successful():
        break
Your value for max_step is very high (this should not be higher than the time scale of your dynamics). Depending on your application, this may very well be reasonable, but then it suggests that you are working with large numbers. This in turn may mean that the default values of the parameters atol and first_step are not suited for your situation and you need to adjust them.
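With scipy.integrate.ode, atol and first_step are passed through set_integrator just like the other vode options. A minimal sketch with a toy harmonic-oscillator right-hand side standing in for stateEq (the numeric values are illustrative, not recommendations):

```python
from scipy.integrate import ode

def state_eq(t, x):
    # toy 2D system: simple harmonic oscillator, x'' = -x
    return [x[1], -x[0]]

z = ode(state_eq)
# nsteps raised once and for all; atol and first_step adjusted to the
# problem's scale instead of the defaults
z.set_integrator("vode", method="bdf", nsteps=100000,
                 atol=1e-8, first_step=1e-3)
z.set_initial_value([1.0, 0.0], 0.0)
z.integrate(1.0)
print(z.successful(), z.y[0])  # x(1) should be close to cos(1)
```

For a large-magnitude system, you would scale atol up accordingly; the point is only that these knobs live in the same set_integrator call as nsteps and max_step.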

How to make a recursive program run for a long time without getting a RuntimeError in Python

This code is the recursive factorial function.
The problem is that if I want to calculate a very large number, it generates this error:
RuntimeError: maximum recursion depth exceeded
import time

def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

print "The factorial of the number is: ", factorial(1500)
time.sleep(3600)
The goal is to have the recursive function keep calculating factorials for a maximum of one hour.
This is a really bad idea. Python is not at all well-suited for recursing that many times. I'd strongly recommend you switch this to a loop which checks a timer and stops when it reaches the limit.
But, if you're seriously interested in increasing the recursion limit in CPython (the default depth is 1000), there's a sys setting for that, sys.setrecursionlimit. Note, as it says in the documentation, that "the highest possible limit is platform-dependent", meaning there's no way to know when your program will fail. Nor is there any way you, I, or CPython could ever tell whether your program will recurse for something as irrelevant to the actual execution of your code as "an hour". (Just for fun, I tried this with a method that passes an int counting how many times it's already recursed, and I got to 9755 before IDLE completely restarted itself.)
Here's an example of a way I think you should do this:
# be sure to import time
start_time = time.time()
counter = 1
# will execute for an hour
while time.time() < start_time + 3600:
    factorial(counter)  # presumably you'd want to do something with the return value here
    counter += 1
You should also keep in mind that regardless of whether you use iteration or recursion, (unless you're using a separate thread) you're still going to be blocking the entire program for the entirety of the hour.
Don't do that. There is an upper limit on how deep your recursion can get. Instead, do something like this:
def factorial(n):
    result = 1
    for i in range(1, n+1):
        result *= i
    return result
Any recursive function can be rewritten to an iterative function. If your code is fancier than this, show us the actual code and we'll help you rewrite it.
A few things to note here:
You can increase the recursion limit with:
import sys
sys.setrecursionlimit(someNumber)  # maybe 20000 or bigger
This will basically just raise your limit for recursion. Note that in order for it to run for one hour, this number would have to be so unreasonably big that it is practically impossible. This is one of the problems with recursion, and this is why people reach for iterative programs.
So basically what you want is practically impossible, and you would rather take a loop/while approach.
Moreover, your sleep function does not do what you want. Sleep just forces you to wait additional time (freezing your program).
It is a guard against a stack overflow. You can change the recursion limit with sys.setrecursionlimit(newLimit), where newLimit is an integer.
Python isn't a functional language. Rewriting the algorithm iteratively, if possible, is generally a better idea.

Fitting a function to data using pyminuit

I wrote a Python (2.7) program to evaluate some scientific data. Its main task is to fit this data to a certain function (1). Since this is quite a lot of data the program distributes the jobs (= "fit one set of data") to several cores using multiprocessing. In a first attempt I implemented the fitting process using curve_fit from scipy.optimize which works pretty well.
So far, so good. Then we saw that the data is more precisely described by a convolution of function (1) and a Gaussian distribution. The idea was to fit the data first to function (1), take the result as guess values, and then fit the data again to the convolution. Since the data is quite noisy and I am trying to fit it to a convolution with seven parameters, the results this time were rather bad. In particular, the Gaussian parameters were to some extent physically impossible.
So I tried implementing the fitting process in PyMinuit because it allows limiting the parameters to certain ranges (like a positive amplitude). As I have never worked with Minuit before and I want to start small, I rewrote the first ("easy") part of the fitting process. The code snippet doing the job looks like this (simplified):
import minuit
import numpy as np

# temps are the x-values
# pol are the y-values
def gettc(temps, pol, guess_values):
    try:
        efunc_chi2 = lambda a, b, c, d: np.sum((efunc(temps, a, b, c, d) - pol)**2)
        fit = minuit.Minuit(efunc_chi2)
        fit.values['a'] = guess_values[0]
        fit.values['b'] = guess_values[1]
        fit.values['c'] = guess_values[2]
        fit.values['d'] = guess_values[3]
        fit.fixed['d'] = True
        fit.maxcalls = 1000
        #fit.tol = 1000.0
        fit.migrad()
        param = fit.args
        popc = fit.covariance
    except minuit.MinuitError:
        return np.zeros(len(guess_values))
    return param, popc
Where efunc() is function (1). The parameter d is fixed since I don't use it at the moment.
PyMinuit function reference
Finally, here comes the actual problem: When running the script, Minuit prints for almost every fit
VariableMetricBuilder: Tolerance is not sufficient - edm is 0.000370555 requested 1e-05 continue the minimization
to stdout with different values for edm. The fit still works fine, but the printing slows the program down considerably. I tried to increase fit.tol, but there are a lot of datasets which return even higher edm values. Then I tried just hiding the output of fit.migrad() using this solution, which actually works. And now something strange happens: somewhere in the middle of the program, the processes on all cores fail simultaneously. Not at the first fit, but in the middle of my whole dataset. The only thing I changed was
with suppress_stdout_stderr():
    fit.migrad()
I know this is quite a long introduction, but I think it helps you more when you know the whole framework. I would be very grateful if someone has any idea on how to approach this problem.
Note: Function (1) is defined as
def efunc(x, a, b, c, d):
    if x < c:
        return a*np.exp(b*x**d)
    else:
        return a*np.exp(b*c**d)  # therefore constant
efunc = np.vectorize(efunc)
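For reference, the suppress_stdout_stderr helper mentioned in the question is not shown; a typical file-descriptor-based implementation of such a context manager looks like the following sketch (not necessarily the exact code used). Note that redirecting file descriptors this way affects the whole process, which is worth keeping in mind when combining it with multiprocessing.

```python
import os
import sys

class suppress_stdout_stderr(object):
    """Silence both Python-level and C-level output by pointing the
    stdout/stderr file descriptors at os.devnull for the duration
    of the with-block."""
    def __enter__(self):
        sys.stdout.flush()
        sys.stderr.flush()
        self.devnull = os.open(os.devnull, os.O_WRONLY)
        self.saved_fds = (os.dup(1), os.dup(2))   # keep originals
        os.dup2(self.devnull, 1)
        os.dup2(self.devnull, 2)
        return self

    def __exit__(self, *exc_info):
        sys.stdout.flush()
        sys.stderr.flush()
        os.dup2(self.saved_fds[0], 1)             # restore originals
        os.dup2(self.saved_fds[1], 2)
        for fd in self.saved_fds + (self.devnull,):
            os.close(fd)

with suppress_stdout_stderr():
    print("this line is swallowed")
print("this line appears")
```

Because Minuit prints from C code, a plain reassignment of sys.stdout would not silence it; only this descriptor-level redirection does, which is presumably why the asker used it.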
