Including intercept terms in piecewise linear programming - python

I am new to linear programming and am hoping to get some help in understanding how to include intercept terms in the objective for a piecewise function (see the code example below).
import pulp as pl
# Pieces
pieces = [1, 2]
# Problem
prob = pl.LpProblem('Example', pl.LpMaximize)
# Decision Vars
x_vars = pl.LpVariable.dict('x', pieces, 0, None, pl.LpInteger)
y_vars = pl.LpVariable.dict('y', pieces, 0, None, pl.LpInteger)
# Objective
prob += (-500+10*x_vars[1]) + (150+9*x_vars[2]) + (2+9.1*y_vars[1]) + (4+6*y_vars[2])
# Constraints
prob += pl.lpSum(x_vars[i] for i in pieces) + pl.lpSum(y_vars[i] for i in pieces) <= 1100
prob += x_vars[1] <= 700
prob += x_vars[2] <= 700
prob += y_vars[1] <= 400
prob += y_vars[2] <= 400
# Solve
prob.solve()
# Results
for v in prob.variables():
    print(v.name, "=", pl.value(v))
The terms included in the objective function are the piecewise intercepts and coefficients obtained from univariate piecewise regression models. For example, the linear regression model for x is yhat=-500+10*x for the first piece, and yhat=150+9*x for the second piece. Likewise, for y we have yhat=2+9.1*y and yhat=4+6*y for the first and second pieces, respectively.
If I remove and/or change any of the intercept values, I arrive at the same solution. I would have thought that each intercept is required for producing the estimates in the objective function. Have I not specified the objective function properly? Or are the intercept terms not required (and therefore not taken into account) in this type of LP formulation?

I don't exactly understand what you are trying to achieve. But let me try to explain what we normally do when we talk about piecewise linear functions.
A piecewise linear function is completely determined by its breakpoints. E.g.
The input is just these points
xbar = [1,3,6,10]
ybar = [6,2,8,7]
You have to calculate these points in advance, outside the optimization model. The intercept and slope of each segment are encoded in these points; note that an intercept cannot be ignored, as that would give a very different segment. Calculate the breakpoints with care: without correct breakpoints, your model will not behave properly.
When using such a piecewise linear function, we want to maintain a mapping between x and y (both decision variables). I.e. we always want to hold for any feasible solution:
y = f(x)
where f represents the piecewise linear function. This means that besides choosing a segment, we need to interpolate between the breakpoints (i.e. we want to trace the blue line). The formulations below essentially form the constraint y=f(x) but in such a way that it is accepted by a MIP (Mixed Integer Programming) solver.
To interpolate between the breakpoints, we can use a lot of different formulations. The simplest is to use SOS2 variables. (SOS2 stands for Special Ordered Sets of Type 2, a construct that is supported by most high-end solvers). The formulation would look like:
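In plain text, this lambda (convex combination) formulation reads:
x = sum over k of λ[k]*xbar[k]
y = sum over k of λ[k]*ybar[k]
sum over k of λ[k] = 1
λ[k] >= 0, with λ declared as an SOS2 set (at most two adjacent λ[k] can be nonzero)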
Here x,y, and λ are decision variables (and xbar,ybar are data, i.e. constants). k is the set of points (here: k=1,..,4).
Not all solvers and modeling tools support SOS2 variables. Here is another formulation using binary variables:
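Sketched in plain text, with segment lengths δ[s] and binaries z[s], it reads roughly:
x = xbar[1] + sum over s of δ[s]
y = ybar[1] + sum over s of slope[s]*δ[s], with slope[s] = (ybar[s+1]-ybar[s])/(xbar[s+1]-xbar[s])
0 <= δ[s] <= xbar[s+1] - xbar[s]
δ[s] >= (xbar[s+1]-xbar[s])*z[s] and δ[s+1] <= (xbar[s+2]-xbar[s+1])*z[s], with z[s] binary
(the last pair of constraints forces the segments to fill up in order).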
Here s is the segment index: s=1,2,3. This is sometimes called the incremental formulation.
These are just two formulations. There are many others. Note that some solvers and modeling tools have special constructs to express piecewise linear functions. But all these share the idea of providing a collection of breakpoints.
This is very different from what you did. But this is what we typically do to model piecewise linear functions in Mixed-Integer Programming models.
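For illustration, here is a minimal PuLP sketch of the lambda formulation, using explicit binary variables to enforce the SOS2 condition. The breakpoint data are just the example numbers above, not your regression model, and the objective is a toy one.
import pulp as pl

# Example breakpoints of y = f(x) (the data shown above, not the OP's model)
xbar = [1, 3, 6, 10]
ybar = [6, 2, 8, 7]
K = range(len(xbar))        # breakpoint indices
S = range(len(xbar) - 1)    # segment indices

prob = pl.LpProblem('PiecewiseExample', pl.LpMaximize)
x = pl.LpVariable('x', lowBound=xbar[0], upBound=xbar[-1])
y = pl.LpVariable('y')
lam = pl.LpVariable.dicts('lam', K, lowBound=0)    # interpolation weights
z = pl.LpVariable.dicts('z', S, cat=pl.LpBinary)   # segment selector

prob += y  # toy objective: maximize f(x)

# y = f(x) via a convex combination of the breakpoints
prob += pl.lpSum(lam[k] for k in K) == 1
prob += x == pl.lpSum(lam[k] * xbar[k] for k in K)
prob += y == pl.lpSum(lam[k] * ybar[k] for k in K)
# exactly one active segment; only its two endpoint weights may be nonzero (SOS2 behaviour)
prob += pl.lpSum(z[s] for s in S) == 1
for k in K:
    prob += lam[k] <= pl.lpSum(z[s] for s in S if s in (k - 1, k))

prob.solve()
print(pl.value(x), pl.value(y))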
A good reference is H. Paul Williams, Model Building in Mathematical Programming, Wiley; it is a very practical book and well worth consulting.

Related

How to solve a delay differential equation numerically

I would like to compute the Buchstab function numerically. It is defined by the delay differential equation:
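The Buchstab function w(u) is given by w(u) = 1/u for 1 <= u <= 2 and the delay differential equation (u*w(u))' = w(u-1) for u > 2, which is equivalent to w'(u) = (w(u-1) - w(u))/u, the form integrated in the code below.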
How can I compute this numerically efficiently?
To get a general feeling of how DDE integration works, I'll give some code, based on the low-order Heun method (to avoid uninteresting details while still being marginally useful).
In the numerical integration, the previous values are treated as a function of time like any other time-dependent term. As there is no real functional expression for it, the solution so far is used as a function table for interpolation. The interpolation error order should be as high as the error order of the ODE integrator, which is easy to arrange for low-order methods but requires extra effort for higher-order methods. The solve_ivp stepper classes provide such a "dense output" interpolation per step that can be assembled into a function for the currently existing integration interval.
So after the theory, the practice. Select the step size h=0.05 and convert the given history function into the start of the solution's function table:
import numpy as np
import matplotlib.pyplot as plt

h = 0.05            # step size, must stay smaller than the delay
u = 1
u_arr = []
w_arr = []
while u < 2 + 0.5*h:
    u_arr.append(u)
    w_arr.append(1/u)   # history: w(u) = 1/u on [1, 2]
    u += h
Then solve the equation; for the delayed value, use interpolation in the function table, here using numpy.interp. There are other functions with more options in scipy.interpolate.
Note that h needs to be smaller than the smallest delay, so that the delayed values come from a previous step, which is the case here.
u = u_arr[-1]
w = w_arr[-1]
while u < 4:
    k1 = (-w + np.interp(u-1, u_arr, w_arr))/u        # slope at the current point
    us, ws = u + h, w + h*k1                          # Euler predictor
    k2 = (-ws + np.interp(us-1, u_arr, w_arr))/us     # slope at the predicted point
    u, w = us, w + 0.5*h*(k1 + k2)                    # Heun (trapezoidal) corrector
    u_arr.append(u)
    w_arr.append(w)
Now the numerical approximation can be further processed, for instance plotted.
plt.plot(u_arr,w_arr); plt.grid(); plt.show()

PuLP: Minimizing the standard deviation of decision variables

In an optimization problem developed in PuLP I use the following objective function:
objective = p.lpSum(vec[r] for r in range(0,len(vec)))
All variables are non-negative integers, hence the sum over the vector gives the total number of units for my problem.
Now I am struggling with the fact that PuLP gives only one of many solutions, and I would like to narrow down the solution space to results that favor the solution set with the smallest standard deviation of the decision variables.
E.g. say vec is a vector with elements 6 and 12. Then 7/11, 8/10, 9/9 are equally feasible solutions and I would like PuLP to arrive at 9/9.
Then the objective
objective = p.lpSum(vec[r]*vec[r] for r in range(0,len(vec)))
would obviously create a cost function that would help the case, but alas, it is non-linear and PuLP throws an error.
Anyone who can point me to a potential solution?
Instead of minimizing the standard deviation (which is inherently non-linear), you could minimize the range or bandwidth. Along the lines of:
minimize maxv-minv
maxv >= vec[r] for all r
minv <= vec[r] for all r
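A minimal PuLP sketch of that idea; the variable names and the total of 18 units are just assumptions matching the 6/12 example above, not the original model:
import pulp as p

prob = p.LpProblem('MinRange', p.LpMinimize)
vec = [p.LpVariable(f'v{r}', lowBound=0, cat=p.LpInteger) for r in range(2)]
maxv = p.LpVariable('maxv', lowBound=0)
minv = p.LpVariable('minv', lowBound=0)

prob += maxv - minv                 # objective: minimize the bandwidth
prob += p.lpSum(vec) == 18          # keep the original requirement (total units)
for r in range(len(vec)):
    prob += maxv >= vec[r]
    prob += minv <= vec[r]

prob.solve()
print([p.value(v) for v in vec])    # expect a balanced split such as [9, 9]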

Fitting with functional parameter constraints in Python

I have some data {x_i,y_i} and I want to fit a model function y=f(x,a,b,c) to find the best fitting values of the parameters (a,b,c); however, the three of them are not totally independent but constrained by 1<b, 0<=c<1 and g(a,b,c)>0, where g is a "good" function. How could I implement this in Python, since with curve_fit one cannot impose the parametric constraints directly?
I have been reading about lmfit, but I only see simple numerical constraints like 1<b, 0<=c<1 and not the one with g(a,b,c)>0, which is the most important.
If I understand correctly, you have
def g(a, b, c):
    c1 = (1.0 - c)
    cx = 1/c1
    c2 = 2*c1
    g = a*a*b*gamma(2+cx)*gamma(cx)/gamma(1+3/c2) - b*b/(1+b**c2)**(1/c2)
    return g
If so, and if I get the math right, this could be rearranged as
a = sqrt((g+b*b/(1+b**c2)**(1/c2))*gamma(1+3/c2)/(b*gamma(2+cx)*gamma(cx)))
Which is to say that you could think about your problem as having a variable g which is > 0 and a value for a derived from b, c, and g by the above expression.
And that you can do with lmfit and its expression-based constraint mechanism. You would have to add the gamma function, as with
from lmfit import Parameters
from scipy.special import gamma
params = Parameters()
params._asteval.symtable['gamma'] = gamma
and then set up the parameters with bounds and constraints. I would probably follow the math above to allow better debugging and use something like:
params.add('b', 1.5, min=1)
params.add('c', 0.4, min=0, max=1)
params.add('g', 0.2, min=0)
params.add('c1', expr='1-c')
params.add('cx', expr='1.0/c1')
params.add('c2', expr='2*c1')
params.add('gprod', expr='b*gamma(2+cx)*gamma(cx)/gamma(1+3/c2)')
params.add('bfact', expr='(1+b**c2)**(1/c2)')
params.add('a', expr='sqrt((g + b*b/bfact)/gprod)')  # solves g = a**2*gprod - b*b/bfact for a
Note that this gives 3 actual variables (now g, b, and c) with plenty of derived values calculated from these, including a. I would certainly check all that math. It looks like you're safe from negative**fractional_power, sqrt(negative), and gamma(-1), but be aware of these possibilities that will kill the fit.
You could embed all of that into your fitting function, but using constraint expressions gives you the ability to constrain parameter values independently of how the fitting or model function is defined.
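For completeness, a hypothetical sketch of how those Parameters would then be used in a fit, continuing from the params defined above; the model function here is a made-up placeholder, not your f(x,a,b,c):
import numpy as np
from lmfit import minimize

def model(x, a, b, c):
    return a * np.exp(-b * x) + c     # placeholder model, stand-in for your f(x, a, b, c)

def residual(params, x, data):
    a = params['a'].value             # 'a' is computed from g, b, c via its expression
    b = params['b'].value
    c = params['c'].value
    return model(x, a, b, c) - data

x = np.linspace(0, 5, 200)
data = model(x, 2.0, 1.5, 0.4) + np.random.normal(scale=0.02, size=x.size)
result = minimize(residual, params, args=(x, data))
result.params.pretty_print()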
Hope that helps. Again, if this does not get to what you are trying to do, post more details about the constraint you are trying to impose.
Like James Phillips, I was going to suggest SciPy's curve_fit. But the way that you have defined your function, one of the constraints is on the function itself, and SciPy's bounds are defined only in terms of input variables.
What, exactly, are the forms of your functions? Can you transform them so that you can use a standard definition of bounds, and then reverse the transformation to give a function in the original form that you wanted?
I have encountered a related problem when trying to fit exponential regressions using SciPy's curve_fit. The parameter search algorithms vary in a linear fashion, and it's really easy to fail to establish a gradient. If I write a function which fits the logarithm of the function I want, it's much easier to make curve_fit work. Then, for my final work, I take the exponent of my fitted function.
This same strategy could work for you. Predict ln(y). The value of that function can be unbounded. Then for your final result, output exp(ln(y)) = y.
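For instance, a small illustration of that strategy with a simple exponential model and made-up data, just to show the mechanics:
import numpy as np
from scipy.optimize import curve_fit

def log_model(x, lnA, k):
    return lnA - k * x                # fit ln(y) = ln(A) - k*x instead of y = A*exp(-k*x)

x = np.linspace(0, 5, 50)
y = 3.0 * np.exp(-1.2 * x) * np.exp(np.random.normal(scale=0.05, size=x.size))

popt, pcov = curve_fit(log_model, x, np.log(y))
y_fit = np.exp(log_model(x, *popt))   # back-transform the prediction to the original scale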

How to generate random numbers with predefined probability distribution?

I would like to implement a function in Python (using NumPy) that takes a mathematical function (for example p(x) = e^(-x), as below) as input and generates random numbers distributed according to that function's probability distribution. And I need to plot them, so we can see the distribution.
I actually need a random number generator for exactly the following two mathematical functions as input, but if it could take other functions, why not:
1) p(x) = e^(-x)
2) g(x) = (1/sqrt(2*pi)) * e^(-(x^2)/2)
Does anyone have any idea how this is doable in python?
For simple distributions like the ones you need, or if you have a CDF that is easy to invert in closed form, you can find plenty of samplers in NumPy, as correctly pointed out in Olivier's answer.
For arbitrary distributions you could use Markov chain Monte Carlo (MCMC) sampling methods.
The simplest and maybe easiest to understand variant of these algorithms is Metropolis sampling.
The basic idea goes like this:
start from a random point x and take a random step xnew = x + delta
evaluate the desired probability distribution in the starting point p(x) and in the new one p(xnew)
if the new point is more probable p(xnew)/p(x) >= 1 accept the move
if the new point is less probable, randomly decide whether to accept or reject it, depending on how probable the new point is [1]
new step from this point and repeat the cycle
It can be shown, see e.g. Sokal [2], that points sampled with this method follow the desired probability distribution.
An extensive implementation of Monte Carlo methods in Python can be found in the PyMC3 package.
Example implementation
Here's a toy example just to show you the basic idea, not meant in any way as a reference implementation. Please refer to mature packages for any serious work.
import numpy as np

def uniform_proposal(x, delta=2.0):
    return np.random.uniform(x - delta, x + delta)

def metropolis_sampler(p, nsamples, proposal=uniform_proposal):
    x = 1  # start somewhere
    for i in range(nsamples):
        trial = proposal(x)  # random neighbour from the proposal distribution
        acceptance = p(trial)/p(x)
        # accept the move conditionally
        if np.random.uniform() < acceptance:
            x = trial
        yield x
Let's see if it works with some simple distributions
Gaussian mixture
def gaussian(x, mu, sigma):
    return 1./sigma/np.sqrt(2*np.pi)*np.exp(-((x-mu)**2)/2./sigma/sigma)
p = lambda x: gaussian(x, 1, 0.3) + gaussian(x, -1, 0.1) + gaussian(x, 3, 0.2)
samples = list(metropolis_sampler(p, 100000))
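To actually see the distribution, as the question asks, a quick histogram check could look like this (assuming matplotlib is available; the factor 3 just rescales the mixture, whose total mass is about 3, to compare with the normalised histogram):
import matplotlib.pyplot as plt

xs = np.linspace(-2, 4, 500)
plt.hist(samples, bins=100, density=True, alpha=0.5, label='samples')
plt.plot(xs, p(xs)/3.0, label='target (scaled)')
plt.legend()
plt.show()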
Cauchy
def cauchy(x, mu, gamma):
    return 1./(np.pi*gamma*(1.+((x-mu)/gamma)**2))
p = lambda x: cauchy(x, -2, 0.5)
samples = list(metropolis_sampler(p, 100000))
Arbitrary functions
You don't really have to sample from proper probability distributions. You might just have to enforce a limited domain in which to sample your random steps [3]
p = lambda x: np.sqrt(x)
samples = list(metropolis_sampler(p, 100000, domain=(0, 10)))
p = lambda x: (np.sin(x)/x)**2
samples = list(metropolis_sampler(p, 100000, domain=(-4*np.pi, 4*np.pi)))
Conclusions
There is still way too much to say, about proposal distributions, convergence, correlation, efficiency, applications, Bayesian formalism, other MCMC samplers, etc.
I don't think this is the proper place and there is plenty of much better material than what I could write here available online.
[1] The idea here is to favor exploration where the probability is higher but still look at low-probability regions, as they might lead to other peaks. Fundamental is the choice of the proposal distribution, i.e. how you pick new points to explore. Steps that are too small might constrain you to a limited area of your distribution, steps that are too big could lead to a very inefficient exploration.
[2] Physics oriented. The Bayesian formalism (Metropolis-Hastings) is preferred these days, but IMHO it's a little harder to grasp for beginners. There are plenty of tutorials available online, see e.g. this one from Duke University.
[3] Implementation not shown so as not to add too much confusion; it's straightforward, you just have to wrap trial steps at the domain edges or make the desired function go to zero outside the domain.
NumPy offers a wide range of probability distributions.
The first function is an exponential distribution with parameter 1.
np.random.exponential(1)
The second one is a normal distribution with mean 0 and variance 1.
np.random.normal(0, 1)
Note that in both cases the arguments are optional, as these are the default values for these distributions.
As a sidenote, you can also find those distributions in the random module as random.expovariate and random.gauss respectively.
More general distributions
While NumPy will likely cover all your needs, remember that you can always compute the inverse cumulative distribution function of your distribution and feed it values drawn from a uniform distribution.
inverse_cdf(np.random.uniform())
For example, if NumPy did not provide the exponential distribution, you could do this.
import numpy as np

def exponential():
    # inverse CDF of Exp(1): F(x) = 1 - exp(-x), so F^-1(u) = -log(1 - u)
    return -np.log(1 - np.random.uniform())
If you encounter distributions whose CDF is not easy to compute, then consider filippo's great answer.

Sympy function derivatives and sets of equations

I'm working with nonlinear systems of equations. These systems are generally a nonlinear vector differential equation.
I now want to use functions, differentiate them with respect to time and to their time derivatives, and find equilibrium points by solving the nonlinear equations 0=rhs(eqs).
Similar things are needed to calculate the Euler-Lagrange equations, where you need the derivative of L wrt. diff(x,t).
Now my question is, how do I implement this in Sympy?
My main two problems are that differentiating a Symbol x wrt. t, i.e. diff(x,t), gives 0. I can see that with
x = Symbol('x',real=True);
diff(x.subs(x,x(t)),t) # because diff(x,t) => 0
and
diff(x**2, x)
does kind of work.
However, with
x = Function('x')(t);
diff(x,t);
I get this to work, but I cannot differentiate wrt. the function x itself, like
diff(x**2, x)  # DOES NOT WORK
Since I need these things all the time, not only for scalars but also for vectors (using jacobian), I really want this to be a clean and functional workflow.
With which kind of type should I initialize my mathematical functions in Sympy in order to avoid strange substitutions?
It only gets worse for matrices, where I cannot get
eqns = Matrix([f1-5, f2+1]);
variabs = Matrix([f1,f2]);
nonlinsolve(eqns,variabs);
to work as expected, since it only allows symbols as input. Is there an easy conversion here? Like eqns.tolist() - which doesn't work either?
EDIT:
I just found this question, which was answered towards using expressions and matrices. I want to be able to solve sets of nonlinear equations, build the jacobian of a vector wrt. another vector, and differentiate wrt. functions as stated above. Can anyone point me in a direction to start a concise workflow for this purpose? I guess the most complex task is calculating the Lie derivative wrt. a vector or list of functions; the rest should be straightforward.
Edit 2:
def substi(expr, variables):
    # replace each plain Symbol v by an undefined function v(t)
    return expr.subs({v: Function(v.name)(t) for v in variables})
would automate the substitution, such that substi(vector_expr, varlist_vector).diff(t) is not all 0.
Yes, one has to insert an argument in a function before taking its derivative. But after that, differentiation with respect to x(t) works for me in SymPy 1.1.1, and I can also differentiate with respect to its derivative. Example of Euler-Lagrange equation derivation:
from sympy import Symbol, Function, diff

t = Symbol("t")
x = Function("x")(t)
L = x**2 + diff(x, t)**2  # Lagrangian
EL = -diff(diff(L, diff(x, t)), t) + diff(L, x)
Now EL is 2*x(t) - 2*Derivative(x(t), t, t) as expected.
That said, there is a built-in method for Euler-Lagrange:
EL = euler_equations(L)
would yield the same result, except presented as a differential equation with right-hand side 0: [Eq(2*x(t) - 2*Derivative(x(t), t, t), 0)]
The following defines x to be a function of t
import sympy as s
t = s.Symbol('t')
x = s.Function('x')(t)
This should solve your problem of diff(x,t) being evaluated as 0. But I think you will still run into problems later on in your calculations.
I also work with calculus of variations and Euler-Lagrange equations. In these calculations, x' needs to be treated as independent of x. So, it is generally better to use two entirely different variables for x and x' so as not to confuse Sympy with the relationship between those two variables. After we are done with the calculations in Sympy and we go back to our pen and paper we can substitute x' for the second variable.
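A minimal sketch of that two-variable workflow, reusing the toy Lagrangian from the first answer:
import sympy as s

t = s.Symbol('t')
x, xdot = s.symbols('x xdot')       # treat x and x' as independent symbols
L = x**2 + xdot**2                  # same toy Lagrangian as above

dL_dx = s.diff(L, x)
dL_dxdot = s.diff(L, xdot)

# substitute back x -> x(t), x' -> Derivative(x(t), t) before taking d/dt
xt = s.Function('x')(t)
subs = {x: xt, xdot: s.diff(xt, t)}
EL = -s.diff(dL_dxdot.subs(subs), t) + dL_dx.subs(subs)
print(EL)   # 2*x(t) - 2*Derivative(x(t), (t, 2))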
