Network Flow Optimimization (Gurobi) - python

I am trying to model and solve an optimization problem, with python and gurobi optimizer. It is my first experience to solve a problem using optimizer. firstly I wrote a really big problem and add all variables and constraints, step by step. But there was problem(S) in that. so I reduce the problem to the small version, again and again. After all, now I have a very simple code:
from gurobipy import *
m = Model('net')
x = m.addVar(name = 'x')
y = m.addVar(name = 'y')
m.addConstr(x >= 0 and x <= 9000, name = 'flow0')
m.addConstr(y >= 0 and y <= 1000, name = 'flow1')
m.addConstr(y + x == 9990, name = 'total_flow')
m.setObjective(x *(4 + 0.6*(x/9000)) + (y * (4 + 0.6*(y/1000))), GRB.MINIMIZE)
solo = m.optimize()
if solo:
print ('find!!!')
It actually is a simple network flow problem (for a graph with two nodes and two edges) I want to calculate the flow of each edge (x and y). Obviously the flow of each edge cant be negative and cant be bigger than edge capacity(x(capa) = 9000, y(capa) = 1000). and the third constraint shows the the total flow limitation on both edges. Finally, the objective function has to minimize the equation.
Now I have some question on this code:
why 'solo' is (None)?
How can I print solution variables. I used getAttr() function. but I couldn't find out the role of variables name (x, y or flow0, flow1)
3.Ive got this result. But I really cant understand this!!!!
for example: what dose it calculate in each iteration??!
Tnx in advance, and excuse for my simple question...

The optimize() method always returns None, see print(help(m.optimize)). The status of your model after calling this method is stored in m.status while the solution values are stored in the .X attribute for each variable (assumed the model was solved to optimality). To access them you can use m.getVars():
# your model ...
m.optimize()
if m.status = GRB.OPTIMAL:
for var in m.getVars():
print(var.VarName, var.X)
Your posted log shows for each iteration of the barrier method (also known as interior point method) the objective value. See here for a detailed overview.

Related

Solving natural convection equations (heat and flow) with the shooting method

TL;DR I've been implementing a python program to solve numerically equations for natural convection based on a particular similarity variable using runge-kutta 4 and the shooting method. However I don't get the right solutions when I plot it. Did I make a mistake somewhere ?
Hi !
Starting from a special case of natural convection, we get these similitude equations.
The first describe the fluid flow, the second describe the heat flow.
"Pr" is for Prandtl it's basically a dimensionless number used in fluid dynamics (Prandtl) :
These equations are subjects to the following boundary values such that the temperature near the plate is greater than the temperature outside the boundary layer and such that the fluid velocity is 0 far away from the boundary layer.
I've been trying to resolve these numerically with Runge-Kutta 4 and the shooting method to transform the boundary value problem into an initial value problem. The way the shooting method is implemented is with the newton method.
However, I don't get the right solutions.
As you can see in the following, the temperature (in red) is increasing as we are moving away from the plate whereas it should decrease exponentially.
It's more consistent for the fluid velocity (in blue), however the speed i think it should go up faster then go down faster. Here the curve is smoother.
Now, the fact is that we have a system of 2 coupled ODE. However, right now, I'm only trying to find the one of the two initials values (e.g. f''(0) = a, trying to find a) such that we have a solution to the boundary value problem (shooting method). Once found, I suppose we have the solution for the whole problem.
I guess I should maybe manage the two (f''(0) = a ; theta'(0) = b) but I don't know how to manage these two in parallel.
Last think to mention, if I try to get the initial value of theta' (so theta'(0)) I don't get the right heat profile.
Here is the code :
"""
The goal is to resolve a 3rd order non-linear ODE for the blasius problem.
It's made of 2 equations (flow / heat)
f''' = 3ff'' - 2(f')^2 + theta
3 Pr f theta' + theta'' = 0
RK4 + Shooting Method
"""
import numpy as np
import math
from scipy.integrate import odeint
from scipy.optimize import newton
from edo_solver.plot import plot
from constants import PRECISION
def blasius_edo(y, t, prandtl):
f = y[0:3]
theta = y[3:5]
return np.array([
# flow edo
f[1], # f' = df/dn
f[2], # f'' = d^2f/dn^2
- 3 * f[0] * f[2] + (2 * math.pow(f[1], 2)) - theta[0], # f''' = - 3ff'' + 2(f')^2 - theta,
# heat edo
theta[1], # theta' = dtheta/dn
- 3 * prandtl * f[0] * theta[1], # theta'' = - 3 Pr f theta'
])
def rk4(eta_range, shoot):
prandtl = 0.01
# initial values
f_init = [0, 0, shoot] # f(0), f'(0), f''(0)
theta_init = [1, shoot] # theta(0), theta'(0)
ci = f_init + theta_init # concatenate two ci
# note: tuple with single argument must have "," at the end of the tuple
return odeint(func=blasius_edo, y0=ci, t=eta_range, args=(prandtl,))
"""
if we have :
f'(t_0) = fprime_t0 ; f'(eta -> infty) = fprime_inf
we can transform it into :
f'(t_0) = fprime_t0 ; f''(t_0) = a
we define the function F(a) = f'(infty ; a) - fprime_inf
if F(a) has a root in "a",
then the solutions to the initial value problem with f''(t_0) = a
is also the solution the boundary problem with f'(eta -> infty) = fprime_inf
our goal is to find the root, we have the root...we have the solution.
it can be done with bissection method or newton method.
"""
def shooting(eta_range):
# boundary value
fprimeinf = 0 # f'(eta -> infty) = 0
# initial guess
# as far as I understand
# it has to be the good guess
# otherwise the result can be completely wrong
initial_guess = 10 # guess for f''(0)
# define our function to optimize
# our goal is to take big eta because eta should approach infty
# [-1, 1] : last row, second column => f'(eta_final) ~ f'(eta -> infty)
fun = lambda initial_guess: rk4(eta_range, initial_guess)[-1, 1] - fprimeinf
# newton method resolve the ODE system until eta_final
# then adjust the shoot and resolve again until we have a correct shoot
shoot = newton(func=fun, x0=initial_guess)
# resolve our system of ODE with the good "a"
y = rk4(eta_range, shoot)
return y
def compute_blasius_edo(title, eta_final):
ETA_0 = 0
ETA_INTERVAL = 0.1
ETA_FINAL = eta_final
# default values
title = title
x_label = "$\eta$"
y_label = "profil de vitesse $(f'(\eta))$ / profil de température $(\\theta)$"
legends = ["$f'(\eta)$", "$\\theta$"]
eta_range = np.arange(ETA_0, ETA_FINAL + ETA_INTERVAL, ETA_INTERVAL)
# shoot
y_set = shooting(eta_range)
plot(eta_range, y_set, title, legends, x_label, y_label)
compute_blasius_edo(
title="Convection naturelle - Solution de similitude",
eta_final=10
)
I could be completely off base here, but I wrote something similar to solve 1D fluid-reaction-heat equations. Try using solve_ivp and using the RADAU solver method, it helps with more difficult systems.
Also maybe try converting your system of ODES to a system of first order ODEs as that may help.
You are implementing the additional but wrong boundary condition f''(0) = theta'(0), as both slots get the same initial value in the shooting method. You need to hold them separate, giving 2 free variables and thus the need for a 2-dimensional Newton method or any other solver for non-scalar functions.
You could just as well use the solve_bvp routine with a sensible initial guess.

Why does the random seed impact on my back-propagation algorithm?

after a whole class about Machine Learning I realize I don't have the slightest idea of how to build a NN even though I passed the exam. Therefore I tried to write one from scratch following the advice of this videohttps://youtu.be/I74ymkoNTnw?t=425
In order to test the NN code I tried to overfit over the first point and, for some reason, I get the exactly opposite result (output=(0, 1); expected=(1, 0) ) where the output are probability.
I tried to change the sign of the correction in the back propagation but I still get an error of 45% even after thousands iteration. Therefore I assumed the sign was correct and the problem lies elsewhere.
I'm working with Google Collab so you can check and run the whole code: https://colab.research.google.com/drive/1j-WMk80t8mbg7vr5HscbTUJxUFxOK1yN
The function I'm assuming it's not working is the following:
def back_propagation(self, x:np.ndarray, y:np.ndarray, y_exp:np.ndarray):
error = 0.5*np.sum( (y-y_exp)**2 )
Ep = y-y_exp # d(Error) / d(y)
dfrac = np.flip( self.out)/np.sum( self.out)**2 # d( x/sum(x) )/d(x)
dsigm = self.out*(1-self.out) # d( 1/(1+exp(-x)) )/d(x) | out = sig(x)
correction = np.outer(Ep*dfrac*dsigm, x) # Correction matrix
self.NN *= 1-self.lr*correction
return error
Where y was obtained through:
def forward_propagation(self, x:np.ndarray):
Ax = self.NN.dot(x)
self.out = self.sigmoid(Ax)
y = self.out / np.sum( self.out)
return y
Can someone lend me an hand?
PS: I haven't written english in a long time, if there is any error / unreadable part tell me, I'll try to explain myself better.
EDIT: I examined more the error then I have the + sign in the back-propagation and I noticed that changing the seed change the minimum error after 10002 iteration:
seed = 1000 --> error = 0.4443457394544875
seed = 1234 --> error = 3.484945305904348e-05
seed = 1 --> error = 2.8741028650796533e-05
seed = 10000 --> error = 0.44434995711021025
seed = 12345 --> error = 3.430037390869015e-05
seed = 100 --> error = 2.851979370926932e-05
Therefore I change the question from "Why is my back-propagation algorithm maximizing error?" to "Why does the random seed impact on my back-propagation algorithm?"
I managed to fix my code. There was a couple of errors:
As I'm doing gradient descent a minus sign was needed. (i got lost in math)
The correction is independent of the weight. Therefore the right way to do it is
self.NN = self.NN - self.lr*correction
and not
self.NN *= 1-self.lr*correction
The fact that the seed changed my result was probably due to a low learning rate (when i wrote the question it was 0.1-0.5) it may be due to a local maximum (local maximum because I was wrongly trying to maximize error)
Hope my answer help someone else with a simular problem.

Pytorch: How to create an update rule that doesn't come from derivatives?

I want to implement the following algorithm, taken from this book, section 13.6:
I don't understand how to implement the update rule in pytorch (the rule for w is quite similar to that of theta).
As far as I know, torch requires a loss for loss.backwward().
This form does not seem to apply for the quoted algorithm.
I'm still certain there is a correct way of implementing such update rules in pytorch.
Would greatly appreciate a code snippet of how the w weights should be updated, given that V(s,w) is the output of the neural net, parameterized by w.
EDIT: Chris Holland suggested a way to implement, and I implemented it. It does not converge on Cartpole, and I wonder if I did something wrong.
The critic does converge on the solution to the function gamma*f(n)=f(n)-1 which happens to be the sum of the series gamma+gamma^2+...+gamma^inf
meaning, gamma=1 diverges. gamma=0.99 converges on 100, gamma=0.5 converges on 2 and so on. Regardless of the actor or policy.
The code:
def _update_grads_with_eligibility(self, is_critic, delta, discount, ep_t):
gamma = self.args.gamma
if is_critic:
params = list(self.critic_nn.parameters())
lamb = self.critic_lambda
eligibilities = self.critic_eligibilities
else:
params = list(self.actor_nn.parameters())
lamb = self.actor_lambda
eligibilities = self.actor_eligibilities
is_episode_just_started = (ep_t == 0)
if is_episode_just_started:
eligibilities.clear()
for i, p in enumerate(params):
if not p.requires_grad:
continue
eligibilities.append(torch.zeros_like(p.grad, requires_grad=False))
# eligibility traces
for i, p in enumerate(params):
if not p.requires_grad:
continue
eligibilities[i][:] = (gamma * lamb * eligibilities[i]) + (discount * p.grad)
p.grad[:] = delta.squeeze() * eligibilities[i]
and
expected_reward_from_t = self.critic_nn(s_t)
probs_t = self.actor_nn(s_t)
expected_reward_from_t1 = torch.tensor([[0]], dtype=torch.float)
if s_t1 is not None: # s_t is not a terminal state, s_t1 exists.
expected_reward_from_t1 = self.critic_nn(s_t1)
delta = r_t + gamma * expected_reward_from_t1.data - expected_reward_from_t.data
negative_expected_reward_from_t = -expected_reward_from_t
self.critic_optimizer.zero_grad()
negative_expected_reward_from_t.backward()
self._update_grads_with_eligibility(is_critic=True,
delta=delta,
discount=discount,
ep_t=ep_t)
self.critic_optimizer.step()
EDIT 2:
Chris Holland's solution works. The problem originated from a bug in my code that caused the line
if s_t1 is not None:
expected_reward_from_t1 = self.critic_nn(s_t1)
to always get called, thus expected_reward_from_t1 was never zero, and thus no stopping condition was specified for the bellman equation recursion.
With no reward engineering, gamma=1, lambda=0.6, and a single hidden layer of size 128 for both actor and critic, this converged on a rather stable optimal policy within 500 episodes.
Even faster with gamma=0.99, as the graph shows (best discounted episode reward is about 86.6).
BIG thank you to #Chris Holland, who "gave this a try"
I am gonna give this a try.
.backward() does not need a loss function, it just needs a differentiable scalar output. It approximates a gradient with respect to the model parameters. Let's just look at the first case the update for the value function.
We have one gradient appearing for v, we can approximate this gradient by
v = model(s)
v.backward()
This gives us a gradient of v which has the dimension of your model parameters. Assuming we already calculated the other parameter updates, we can calculate the actual optimizer update:
for i, p in enumerate(model.parameters()):
z_theta[i][:] = gamma * lamda * z_theta[i] + l * p.grad
p.grad[:] = alpha * delta * z_theta[i]
We can then use opt.step() to update the model parameters with the adjusted gradient.

How to define General deterministic function in PyMC

In my model, I need to obtain the value of my deterministic variable from a set of parent variables using a complicated python function.
Is it possible to do that?
Following is a pyMC3 code which shows what I am trying to do in a simplified case.
import numpy as np
import pymc as pm
#Predefine values on two parameter Grid (x,w) for a set of i values (1,2,3)
idata = np.array([1,2,3])
size= 20
gridlength = size*size
Grid = np.empty((gridlength,2+len(idata)))
for x in range(size):
for w in range(size):
# A silly version of my real model evaluated on grid.
Grid[x*size+w,:]= np.array([x,w]+[(x**i + w**i) for i in idata])
# A function to find the nearest value in Grid and return its product with third variable z
def FindFromGrid(x,w,z):
return Grid[int(x)*size+int(w),2:] * z
#Generate fake Y data with error
yerror = np.random.normal(loc=0.0, scale=9.0, size=len(idata))
ydata = Grid[16*size+12,2:]*3.6 + yerror # ie. True x= 16, w= 12 and z= 3.6
with pm.Model() as model:
#Priors
x = pm.Uniform('x',lower=0,upper= size)
w = pm.Uniform('w',lower=0,upper =size)
z = pm.Uniform('z',lower=-5,upper =10)
#Expected value
y_hat = pm.Deterministic('y_hat',FindFromGrid(x,w,z))
#Data likelihood
ysigmas = np.ones(len(idata))*9.0
y_like = pm.Normal('y_like',mu= y_hat, sd=ysigmas, observed=ydata)
# Inference...
start = pm.find_MAP() # Find starting value by optimization
step = pm.NUTS(state=start) # Instantiate MCMC sampling algorithm
trace = pm.sample(1000, step, start=start, progressbar=False) # draw 1000 posterior samples using NUTS sampling
print('The trace plot')
fig = pm.traceplot(trace, lines={'x': 16, 'w': 12, 'z':3.6})
fig.show()
When I run this code, I get error at the y_hat stage, because the int() function inside the FindFromGrid(x,w,z) function needs integer not FreeRV.
Finding y_hat from a pre calculated grid is important because my real model for y_hat does not have an analytical form to express.
I have earlier tried to use OpenBUGS, but I found out here it is not possible to do this in OpenBUGS. Is it possible in PyMC ?
Update
Based on an example in pyMC github page, I found I need to add the following decorator to my FindFromGrid(x,w,z) function.
#pm.theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar],otypes=[t.dvector])
This seems to solve the above mentioned issue. But I cannot use NUTS sampler anymore since it needs gradient.
Metropolis seems to be not converging.
Which step method should I use in a scenario like this?
You found the correct solution with as_op.
Regarding the convergence: Are you using pm.Metropolis() instead of pm.NUTS() by any chance? One reason this could not converge is that Metropolis() by default samples in the joint space while often Gibbs within Metropolis is more effective (and this was the default in pymc2). Having said that, I just merged this: https://github.com/pymc-devs/pymc/pull/587 which changes the default behavior of the Metropolis and Slice sampler to be non-blocked by default (so within Gibbs). Other samplers like NUTS that are primarily designed to sample the joint space still default to blocked. You can always explicitly set this with the kwarg blocked=True.
Anyway, update pymc with the most recent master and see if convergence improves. If not, try the Slice sampler.

Defining a gradient with respect to a subtensor in Theano

I have what is conceptually a simple question about Theano but I haven't been able to find the answer (I'll confess upfront to not really understanding how shared variables work in Theano, despite many hours with the tutorials).
I'm trying to implement a "deconvolutional network"; specifically I have a 3-tensor of inputs (each input is a 2D image) and a 4-tensor of codes; for the ith input codes[i] represents a set of codewords which together code for input i.
I've been having a lot of trouble figuring out how to do gradient descent on the codewords. Here are the relevant parts of my code:
idx = T.lscalar()
pre_loss_conv = conv2d(input = codes[idx].dimshuffle('x', 0, 1,2),
filters = dicts.dimshuffle('x', 0,1, 2),
border_mode = 'valid')
loss_conv = pre_loss_conv.reshape((pre_loss_conv.shape[2], pre_loss_conv.shape[3]))
loss_in = inputs[idx]
loss = T.sum(1./2.*(loss_in - loss_conv)**2)
del_codes = T.grad(loss, codes[idx])
delc_fn = function([idx], del_codes)
train_codes = function([input_index], loss, updates = [
[codes, T.set_subtensor(codes[input_index], codes[input_index] -
learning_rate*del_codes[input_index]) ]])
(here codes and dicts are shared tensor variables). Theano is unhappy with this, specifically with defining
del_codes = T.grad(loss, codes[idx])
The error message I'm getting is: theano.gradient.DisconnectedInputError: grad method was asked to compute the gradient with respect to a variable that is not part of the computational graph of the cost, or is used only by a non-differentiable operator: Subtensor{int64}.0
I'm guessing that it wants a symbolic variable instead of codes[idx]; but then I'm not sure how to get everything connected to get the intended effect. I'm guessing I'll need to change the final line to something like
learning_rate*del_codes) ]])
Can someone give me some pointers as to how to define this function properly? I think I'm probably missing something basic about working with Theano but I'm not sure what.
Thanks in advance!
-Justin
Update: Kyle's suggestion worked very nicely. Here's the specific code I used
current_codes = T.tensor3('current_codes')
current_codes = codes[input_index]
pre_loss_conv = conv2d(input = current_codes.dimshuffle('x', 0, 1,2),
filters = dicts.dimshuffle('x', 0,1, 2),
border_mode = 'valid')
loss_conv = pre_loss_conv.reshape((pre_loss_conv.shape[2], pre_loss_conv.shape[3]))
loss_in = inputs[input_index]
loss = T.sum(1./2.*(loss_in - loss_conv)**2)
del_codes = T.grad(loss, current_codes)
train_codes = function([input_index], loss)
train_dicts = theano.function([input_index], loss, updates = [[dicts, dicts - learning_rate*del_dicts]])
codes_update = ( codes, T.set_subtensor(codes[input_index], codes[input_index] - learning_rate*del_codes) )
codes_update_fn = function([input_index], updates = [codes_update])
for i in xrange(num_inputs):
current_loss = train_codes(i)
codes_update_fn(i)
To summarize the findings:
Assigning grad_var = codes[idx], then making a new variable such as:
subgrad = T.set_subtensor(codes[input_index], codes[input_index] - learning_rate*del_codes[input_index])
Then calling
train_codes = function([input_index], loss, updates = [[codes, subgrad]])
seemed to do the trick. In general, I try to make variables for as many things as possible. Sometimes tricky problems can arise from trying to do too much in a single statement, plus it is hard to debug and understand later! Also, in this case I think theano needs a shared variable, but has issues if the shared variable is created inside the function that requires it.
Glad this worked for you!

Categories