How to assign dummy binary variables in PYOMO - python

Suppose I have two real variables, X and Y, and two binary variables, x and y.
I want to add the following constraints in Pyomo:
when X>0 x--->1 else x-->0
when Y>0 y--->1 else y-->0
and x+y==1
My approach was
cons1:
x>=X
cons2:
y>=Y
cons3:
x+y==1
but the above doesn't seem to work and the values of x and y are random.

Your first two conditions require big-M constraints. You can try something like
M_x * x >= X, M_y * y >= Y, and x + y == 1, where M_x and M_y are constants set to values that don't unnecessarily bound X and Y (for example, known upper bounds on X and Y). These constraints won't restrict the values of X and Y to at most 1, and they force x = 1 when X > 0 and y = 1 when Y > 0.
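To see what the big-M constraints above do and do not enforce, here is a minimal sketch in plain Python (not Pyomo code). The helper name `feasible_indicators` and the sample values are made up for illustration; it simply enumerates the binary pairs (x, y) that satisfy the three constraints for fixed X and Y, assuming X and Y are nonnegative.

```python
from itertools import product

def feasible_indicators(X, Y, M_x, M_y):
    """Return all binary (x, y) with M_x*x >= X, M_y*y >= Y and x + y == 1.

    Hypothetical helper for illustration only; X and Y are assumed nonnegative.
    """
    return [
        (x, y)
        for x, y in product((0, 1), repeat=2)
        if M_x * x >= X and M_y * y >= Y and x + y == 1
    ]

print(feasible_indicators(5, 0, 100, 100))  # X > 0 forces x = 1
print(feasible_indicators(0, 7, 100, 100))  # Y > 0 forces y = 1
print(feasible_indicators(5, 7, 100, 100))  # both positive: no feasible pair
print(feasible_indicators(0, 0, 100, 100))  # both zero: either indicator may be 1
```

Note the last case: when X and Y are both zero, the constraints alone do not decide which indicator is 1, so the "else x = 0" direction would need an extra constraint or an objective term if it matters.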

Related

Linearize if statement in MILP constraint

I am trying to solve an optimization problem in which one of the constraints is
x*y=0, where x and y are decision variables and at most one of x and y can be positive. In other words, if x!=0 then y=0 and if y!=0 then x=0.
Please help
Assumption: x and y are nonnegative
Infer upper-bounds UB_x and UB_y for x, y
Introduce new boolean variable b
Add constraints:
x <= (1-b) * UB_x
y <= b * UB_y
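The two constraints above can be sanity-checked with a small sketch in plain Python. The function name `satisfies_linearization` and the bounds used are hypothetical; the check asks whether some value of the boolean b makes a given nonnegative pair (x, y) feasible.

```python
def satisfies_linearization(x, y, ub_x, ub_y):
    """True if some b in {0, 1} gives x <= (1-b)*ub_x and y <= b*ub_y.

    Hypothetical helper for illustration; assumes 0 <= x <= ub_x and 0 <= y <= ub_y.
    """
    return any(
        x <= (1 - b) * ub_x and y <= b * ub_y
        for b in (0, 1)
    )

print(satisfies_linearization(3, 0, 10, 10))  # only x positive: b = 0 works
print(satisfies_linearization(0, 4, 10, 10))  # only y positive: b = 1 works
print(satisfies_linearization(3, 4, 10, 10))  # both positive: no b works
```

With b = 0 the second constraint pins y to 0, and with b = 1 the first pins x to 0, so any feasible point has x*y = 0.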

How can I remove the corresponding Y values that correspond to the deleted outliers in X columns in Python

I have used the following method to remove outliers in my X variables prior to modelling:
z = np.abs(stats.zscore(X))
X = X[(z < 3).all(axis=1)]
How can I make it so the corresponding values in the Y column are deleted so that I can continue with my modelling?
You need to save the mask then apply it on both X and Y.
z = np.abs(stats.zscore(X))
mask = (z < 3).all(axis=1)
X = X[mask]
Y = Y[mask]
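As a self-contained illustration of the mask approach, the sketch below builds made-up data with one obvious outlier row and checks that X and Y stay aligned. To avoid a SciPy dependency it computes the z-score with NumPy using the population standard deviation, which is assumed here to match the default behaviour of stats.zscore.

```python
import numpy as np

# Made-up data: 20 rows, 2 feature columns, with an outlier planted in row 5.
X = np.column_stack([np.linspace(0, 1, 20), np.linspace(5, 6, 20)])
X[5, 0] = 1000.0
Y = np.arange(20)

# Column-wise z-scores (population std, ddof=0).
z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))

# Keep only rows where every column is within 3 standard deviations.
mask = (z < 3).all(axis=1)
X_clean = X[mask]
Y_clean = Y[mask]

print(X_clean.shape, Y_clean.shape)
```

If X and Y are pandas objects, the same boolean mask can be applied to both as long as they share the same row order.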

Test whole range of possible inputs for a given question

Is there a quick way to find the maximum value (float) from a function and the corresponding arguments x, y that are both integers between 0 and 100 (inclusive)? Do I need to use the assert function or something like that to get the range of all possible inputs?
import math

def fun_A(x, y):
    if x == y:
        return 0
    first = math.cos((y%75)*(math.pi/180))
    second = math.sin((x%30)*(math.pi/180))
    return (first + second) / (abs(x - y))
For small problems like this it is probably fast enough to evaluate every possible combination and choose the maximum. The numpy library makes this easy to write and pretty fast as well:
import numpy as np

def fun_A(x, y):
    first = np.cos((y%75)*(np.pi/180))
    second = np.sin((x%30)*(np.pi/180))
    # np.where picks 0 on the diagonal, where the division is undefined
    return np.where(x == y, 0, (first + second) / (abs(x - y)))

# x and y are 101x101 grids covering every integer pair in [0, 100]
x, y = np.mgrid[0:101, 0:101]
f = fun_A(x, y)
maxindex = np.argmax(f)
print('Max =', f.flat[maxindex], ' at x =', x.flat[maxindex], 'y =', y.flat[maxindex])
Output:
Max = 1.4591796850315724 at x = 89 y = 88
Things to note:
I've just replaced calls to math with calls to np.
x and y are matrices, which allows us to evaluate every possible combination of the two values in one function call.
I would do this for the tan function:
from math import tan

y = 0
x = 0
for x_iteration in range(0, 101):
    # keep the argument that gives the largest value seen so far
    if tan(x_iteration) > y:
        x = x_iteration
        y = tan(x_iteration)
x = int(x)
y = int(y)
It's fairly straightforward to write a program to solve this:
max_result = None
max_x = 0
max_y = 0
for x in range(0, 101):
    for y in range(0, 101):
        result = fun_A(x, y)
        if max_result is None or result > max_result:
            max_result = result
            max_x = x
            max_y = y
print(f"x={max_x} and y={max_y} produced the maximum result of {max_result}")
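The same exhaustive search can also be written more compactly with the standard library. This is only a sketch of an alternative, not something from the answers above: it uses the question's fun_A unchanged and lets max pick the best (x, y) pair from itertools.product.

```python
import math
from itertools import product

def fun_A(x, y):
    if x == y:
        return 0
    first = math.cos((y % 75) * (math.pi / 180))
    second = math.sin((x % 30) * (math.pi / 180))
    return (first + second) / abs(x - y)

# Enumerate all 101 * 101 integer pairs and keep the one with the largest value.
best_x, best_y = max(product(range(101), repeat=2), key=lambda p: fun_A(*p))
best_value = fun_A(best_x, best_y)
print(best_x, best_y, best_value)
```

No assert is needed for this kind of search; range(101) already covers every integer from 0 to 100 inclusive.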

Plotting Specific Regions

I am new to Python. Assume that we have two parameters, x and y, and four functions f_1, f_2, f_3 and f_4. Suppose that we know that:
If (x < 5 < y < 5+x) or (5 <= y < x) or (x = 5 and 5 < y < 10) then function f_1 is the maximum function.
If (5 < x < y < 5 + x) or (x <= y < 5) then function f_2 is the maximum function.
If (y < x < 5) or (y < 5 < x) or ( x = 5 and y < x) then function f_3 is the maximum function.
If y > x+5 then function f_4 is the maximum function.
I need to draw a plot with x on the x-axis and y on the y-axis that shows the regions in which each function is the maximum function.
I used the following code; however, the resulting plot, shown below, is not accurate.
import numpy as np
from matplotlib import pyplot as plt
x = np.arange(0,10,.1)
y = np.arange(0,15,.2)
x,y = np.meshgrid(x,y)
maxf = np.zeros(shape = x)
maxf.fill(-9999.99)
for i in range(len(x)):
    for j in range(len(y)):
        if j<i<5 or j<5<i:
            maxf[i,j] =1
        elif i<5<=j<i+5 or 5<=j<i:
            maxf[i,j] =2
        elif 5<i<=j<i+5 or i<=j<5:
            maxf[i,j] =3
        elif i == 5 and j<5:
            maxf[i,j]=1
        elif i == 5 and 5<=j<10:
            maxf[i,j]=2
        elif j >= 5+i:
            maxf[i,j]=4
plt.contourf(x,y,maxf)
plt.colorbar()
plt.show()
The result should have been something like the following picture:
When you initialize the array to -9999.99, you have to make sure you only contour the values you want, which are the region labels between 1 and 3. Because the fill value is so much larger in magnitude than the labels, the automatically chosen levels do not separate them in your plot. Set the contour levels for your plot explicitly:
plt.contourf(x,y,maxf,[0,1,2,3])
Yields:
Update
I didn't notice before, but you are using i and j as if they were the x and y values, when they are actually the indices of the arrays, which throws off your calculation. You need both the index and the value, so you can use enumerate. If the result is still not correct, then you need to revisit the logic in your conditions.
import numpy as np
from matplotlib import pyplot as plt
y = np.arange(0,15,.01)
x = np.arange(0,10,.01)
Y,X = np.meshgrid(y,x)
maxf = np.zeros(shape = Y.shape)
maxf.fill(-9999.99)
# i, j index the arrays; x_, y_ are the actual coordinate values
for i,x_ in enumerate(x):
    for j, y_ in enumerate(y):
        if y_<x_<5 or y_<5<x_:
            maxf[i,j] =3
        elif x_<5<=y_<(x_+5) or 5<=y_<x_:
            maxf[i,j] =1
        elif 5<x_<=y_<(x_+5) or x_<=y_<5:
            maxf[i,j] =2
        elif x_ == 5 and y_<5:
            maxf[i,j]=3
        elif x_ == 5 and y_>=5:
            maxf[i,j]=1
        elif y_ >= (5+x_):
            maxf[i,j]=4
plt.contourf(X,Y,maxf,[0,1,2,3,4])
plt.colorbar()
plt.show()
Final Note
Just because you add a condition does not mean it will be evaluated: in an if/elif chain, a branch is skipped whenever an earlier condition is met first. In this case your 4th function is never selected because one of the other conditions is always met. If you want that condition to take priority, make it your first if statement. How you arrange your logical statements matters, especially since you have many conditions, some of which overlap.
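The region labelling can also be done without explicit loops. The sketch below is one possible vectorized variant, not the code from the answer: the function name `classify_regions` is made up, the conditions are transcribed from the question's four rules, and points matching none of the rules get the label 0. Chained comparisons such as x < 5 < y do not work on arrays, so each one is split into parts joined with &.

```python
import numpy as np

def classify_regions(x, y):
    """Label each (x, y) point with 1-4 for f_1..f_4, or 0 if no rule applies.

    Hypothetical helper; conditions follow the rules stated in the question.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f1 = (((x < 5) & (5 < y) & (y < 5 + x))
          | ((5 <= y) & (y < x))
          | ((x == 5) & (5 < y) & (y < 10)))
    f2 = ((5 < x) & (x < y) & (y < 5 + x)) | ((x <= y) & (y < 5))
    f3 = (((y < x) & (x < 5))
          | ((y < 5) & (5 < x))
          | ((x == 5) & (y < x)))
    f4 = y > x + 5
    # np.select returns the choice for the first condition that is True.
    return np.select([f1, f2, f3, f4], [1, 2, 3, 4], default=0)

labels = classify_regions([2, 6, 3, 1, 7, 2, 8], [6, 8, 1, 9, 6, 3, 2])
print(labels)
```

Called on two meshgrid arrays, the result has the grid's shape and could be passed to plt.contourf with explicit levels in place of the looped maxf array.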

Fast way to compute if statements on arrays in python?

Assume three numpy arrays x, y and z
z = (x**2)/ y for each x > 2 y
z = (x**2)/y**(3/2) for each x > 3 y
z = (1/x)*sin(x) for each x > 4 y
The arrays x, y and z are of course made up, but they illustrate the point of applying multiple if statements to multiple arrays. The arrays x, y and z have about 500,000 elements each.
One possible way (much like FORTRAN) is to create a variable i to index the arrays and use it to test if x[i] > 2*y[i] or x[i] > 3*y[i]. I assume it would be slow.
I need a fast, elegant and more Pythonic way to compute the array z.
UPDATE: I have tried the two methods and here are the results:
# Fortran way of loops:
import numpy as np
x=np.random.rand(40000,1)
y=np.random.rand(40000,1)
z = np.zeros(x.shape)
for i, v in enumerate(x):
    #print(i)
    # Later conditions overwrite earlier ones.
    if x[i] > 2*y[i]:
        z[i] = x[i]**2/y[i]
    if x[i] > 3*y[i]:
        z[i] = x[i]**2/y[i]**(1.5)
    if x[i] > 4*y[i]:
        z[i] = (1/x[i])*np.sin(x[i])
print(z)
#end----
The timing results are as follows:
real 0m0.920s
user 0m0.900s
sys 0m0.016s
The other piece of code used is:
# Pythonic way
import numpy as np
x=np.random.rand(40000,1)
y=np.random.rand(40000,1)
indices1 = np.where(x > 2*y)
indices2 = np.where(x > 3*y)
indices3 = np.where(x > 4*y)
z = np.zeros(x.shape)
# Assign in order so that later conditions overwrite earlier ones
z[indices1] = x[indices1]**2/y[indices1]
z[indices2] = x[indices2]**2/y[indices2]**(1.5)
z[indices3] = (1/x[indices3])*np.sin(x[indices3])
print(z)
# end of code -----
The timing results are as follows:
real 0m0.110s
user 0m0.076s
sys 0m0.028s
So there is a large difference in the execution times. The two pieces of code were run on an Ubuntu virtual machine with Python 2.7.5.
UPDATE: I did another test using
indices1 = x > 2*y
indices2 = x > 3*y
indices3 = x > 4*y
The timing results were:
real 0m0.105s
user 0m0.084s
sys 0m0.016s
SUMMARY: Method 3 (plain boolean masks) is the most elegant and slightly faster than using np.where. Using explicit loops is very slow.
I'm not quite sure if you are looking to have your z array be the same size as x or y, but I will assume so.
Numpy has a function that can find the indices of elements based on a condition.
In the example below I am doing a calculation similar to what your first line does.
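import numpy as np
x = np.arange(4)
x[2:] += 10
print(x)
y = np.arange(4)
print(y)
# indices of the elements where the condition holds
indices = np.where(x > 2*y)
print(indices)
z = np.zeros(x.shape)
z[indices] = x[indices]**2/y[indices]
print(z)
The print statements yield the following (shown schematically; under Python 2, / on integer arrays is integer division, so 169/3 gives 56):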
x: [0 1 12 13]
y: [0 1 2 3]
indices: [2, 3]
z: [0 0 72 56]
Edit:
Upon further testing it turns out that you don't even need to use the numpy where function. You can simply set indices = x > 2*y.
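Putting the boolean-mask idea together for all three conditions, a minimal sketch might look like the following. The function name `compute_z` and the sample arrays are made up for illustration, and the code assumes Python 3 with float arrays, where / is true division.

```python
import numpy as np

def compute_z(x, y):
    """Apply the three rules in order; later masks overwrite earlier ones.

    Hypothetical helper for illustration; elements matching no rule stay 0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.zeros_like(x)
    m1 = x > 2 * y
    m2 = x > 3 * y
    m3 = x > 4 * y
    z[m1] = x[m1] ** 2 / y[m1]
    z[m2] = x[m2] ** 2 / y[m2] ** 1.5
    z[m3] = np.sin(x[m3]) / x[m3]
    return z

x = np.array([1.0, 5.0, 7.0, 9.0])
y = np.array([1.0, 2.0, 2.0, 2.0])
z = compute_z(x, y)
print(z)
```

The order of the three assignments matters: an element with x > 4*y also satisfies the first two conditions, so it must be written last to end up with the third formula, matching the sequence of if statements in the loop version.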
