I am pretty new to optimization in general and pyomo in particular, so I apologize in advance for any rookie mistakes.
I have defined a simple unit commitment exercise (example 3.1 from [1]) using [2] as a starting point. My code runs and I get the correct result, but I have a few questions about how to access parts of the solution.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import shutil
import sys
import os.path
import pyomo.environ as pyo
import pyomo.gdp as gdp # necessary if you use booleans to select active and inactive units
def bounds_rule(m, n, param='Cap_MW'):
    # m because it passes the model
    # n because it needs an index from each set; in this case there is only m.N
    return (0, Gen[n][param])  # returns lower and upper bounds
def unit_commitment():
    m = pyo.ConcreteModel()
    m.dual = pyo.Suffix(direction=pyo.Suffix.IMPORT_EXPORT)
    N = Gen.keys()
    m.N = pyo.Set(initialize=N)
    m.Pgen = pyo.Var(m.N, bounds=bounds_rule)  # amount of generation
    m.Rgen = pyo.Var(m.N, bounds=bounds_rule)  # amount of reserve
    # m.OnOff = pyo.Var(m.N, domain=pyo.Binary)  # binary on/off marker
    # objective
    m.cost = pyo.Objective(expr=sum(m.Pgen[n]*Gen[n]['energy_$MWh'] + m.Rgen[n]*Gen[n]['res_$MW'] for n in m.N), sense=pyo.minimize)
    # demand
    m.demandP = pyo.Constraint(rule=lambda m: sum(m.Pgen[n] for n in N) == Demand['ener_MWh'])
    m.demandR = pyo.Constraint(rule=lambda m: sum(m.Rgen[n] for n in N) == Demand['res_MW'])
    # machine production limits
    # m.lb = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_min']*m.OnOff[n] <= m.Pgen[n]+m.Rgen[n])
    # m.ub = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_MW']*m.OnOff[n] >= m.Pgen[n]+m.Rgen[n])
    m.lb = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_min'] <= m.Pgen[n]+m.Rgen[n])
    m.ub = pyo.Constraint(m.N, rule=lambda m, n: Gen[n]['Cap_MW'] >= m.Pgen[n]+m.Rgen[n])
    m.rc = pyo.Suffix(direction=pyo.Suffix.IMPORT)
    return m
Gen = {
    'GenA': {'Cap_MW': 100, 'energy_$MWh': 10, 'res_$MW': 0, 'Cap_min': 0},
    'GenB': {'Cap_MW': 100, 'energy_$MWh': 30, 'res_$MW': 25, 'Cap_min': 0},
}  # starting data
Demand = {
    'ener_MWh': 130, 'res_MW': 20,
}  # starting data
m = unit_commitment()
pyo.SolverFactory('glpk').solve(m).write()
m.display()
df = pd.DataFrame.from_dict([m.Pgen.extract_values(), m.Rgen.extract_values()]).T.rename(columns={0: "P", 1: "R"})
print(df)
print("Cost Function result: " + str(m.cost.expr()) + "$.")
m.rc.display()
m.dual.display()
print(m.dual[m.demandR])
da = {'duals': m.dual[m.demandP],
      'uslack': m.demandP.uslack(),
      'lslack': m.demandP.lslack(),
      }
db = {'duals': m.dual[m.demandR],
      'uslack': m.demandR.uslack(),
      'lslack': m.demandR.lslack(),
      }
duals = pd.DataFrame.from_dict([da, db]).T.rename(columns={0: "demandP", 1: "demandR"})
print(duals)
Here come my questions.
1 - Duals/shadow prices: by definition, the shadow prices are the dual variables of the constraints (m.demandP and m.demandR). Is there a way to access these values and put them into a DataFrame without the clumsy thing I did, i.e. manually defining da and db and then building the DataFrame by joining both dictionaries? I would like something cleaner, like the df that holds the P and R results for each generator in the system.
2 - Usually in the unit commitment problem, binary variables are used to mark or select active and inactive units, hence the m.OnOff variable (commented lines). From what I found in [3], duals are not available in models containing binary variables, so I rewrote the problem without binaries. That is not a problem in this simplistic exercise, in which all units run, but it will be for larger ones: I need to let the optimization decide which units will and won't run, and I still need the shadow prices. Is there a way to obtain the shadow prices/duals in a problem containing binary variables?
I left the binary-variable versions of the constraints in (commented out) in case someone finds them useful.
Note: the code also runs with the binary variables and gets the correct result; I just couldn't figure out how to get the shadow prices. Hence my question.
[1] Morales, J. M., Conejo, A. J., Madsen, H., Pinson, P., & Zugno, M. (2013). Integrating renewables in electricity markets: operational problems (Vol. 205). Springer Science & Business Media.
[2] https://jckantor.github.io/ND-Pyomo-Cookbook/04.06-Unit-Commitment.html
[3] Dual Variable Returns Nothing in Pyomo
To answer 1, you can dynamically get the constraint objects from your model using model.component_objects(pyo.Constraint), which returns an iterator over your constraints and keeps you from having to hard-code the constraint names. It gets tricky for indexed constraints, because you have to do an extra step to get the slacks for each index, not just the constraint object. For the duals, you can iterate over the suffix's keys attribute to retrieve those values.
duals_dict = {str(key): m.dual[key] for key in m.dual.keys()}

u_slack_dict = {
    # uslacks for non-indexed constraints
    **{str(con): con.uslack() for con in m.component_objects(pyo.Constraint)
       if not con.is_indexed()},
    # indexed constraint uslack:
    # loop through the indexed constraints, get all the indices,
    # then retrieve the slacks for each index of the constraint
    **{k: v for con in m.component_objects(pyo.Constraint) if con.is_indexed()
       for k, v in {'{}[{}]'.format(str(con), key): con[key].uslack()
                    for key in con.keys()}.items()}
}

l_slack_dict = {
    # lslacks for non-indexed constraints
    **{str(con): con.lslack() for con in m.component_objects(pyo.Constraint)
       if not con.is_indexed()},
    # indexed constraint lslack:
    # loop through the indexed constraints, get all the indices,
    # then retrieve the slacks for each index of the constraint
    **{k: v for con in m.component_objects(pyo.Constraint) if con.is_indexed()
       for k, v in {'{}[{}]'.format(str(con), key): con[key].lslack()
                    for key in con.keys()}.items()}
}

# combine into a single df
df = pd.concat([pd.Series(d, name=name)
                for name, d in {'duals': duals_dict,
                                'uslack': u_slack_dict,
                                'lslack': l_slack_dict}.items()],
               axis='columns')
Regarding 2, I agree with Erwin's comment about solving with the binary variables to get the optimal solution, then removing the binary restriction but fixing the variables to their optimal values to get some dual variable values.
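A minimal sketch of that fix-and-resolve idea, assuming the commented-out m.OnOff variable and constraints from the question are reinstated (once the binaries are fixed, they are treated as constants, so GLPK effectively sees a pure LP and the dual suffix gets populated):

m = unit_commitment()
pyo.SolverFactory('glpk').solve(m)  # 1) solve the MIP with the binaries free

# 2) fix each binary at its optimal value, which turns the model into an LP
for n in m.N:
    m.OnOff[n].fix(round(pyo.value(m.OnOff[n])))

# 3) re-solve the now-continuous model; the duals are available afterwards
pyo.SolverFactory('glpk').solve(m)
print(m.dual[m.demandP], m.dual[m.demandR])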
I am writing an assignment optimization (multiple knapsack) in CVXPY, with multiple items from multiple source containers going to multiple knapsacks. Each item has a value, and there is an additional cost for assigning items from each container to each knapsack (e.g. shipping cost from sources to destinations). Each item can only be assigned to one knapsack.
My question is whether it's possible to write a few constraints such that:
- items assigned to knapsacks are in multiples of 88
- each bundle of 88 items has to be from the same source container
For example, multiple bundles can be assigned from different source containers, and the capacities of the knapsacks are also multiples of 88.
example code:
import cvxpy as cp
import pandas as pd
import numpy as np
knapsack_df = pd.read_csv('knapsacks.csv')
item_df = pd.read_csv('items.csv')
assign_cost_data_df = pd.read_csv('assignment_costs.csv')
capacities = knapsack_df['capacities'] ##capacities of each knapsack
B = len(item_df) ##num_items
C = len(knapsack_df) ##num_knapsacks
tc = np.sum(knapsack_df['capacities']) ##total capacities across all the knapsacks
x = cp.Variable((B, C), boolean=True)
obj_item = cp.sum(cp.multiply(item_df['value'], cp.sum(x, axis=1)))
obj_assign = cp.sum(cp.multiply(assign_cost_data_df.values, x)) ##assign_cost_data_df has shape (B,C) as well
obj_supply = cp.sum(x)
objective = cp.Minimize(obj_item + obj_assign - 100*obj_supply) ##heavily reward algorithm assigning items
constraints = []
constraints += [cp.sum(x, axis=1) <= np.ones(B)] ##each item can only be assigned once
constraints += [cp.sum(x, axis=0) <= capacities] ##cannot exceed capacity of knapsacks
## todo: constrain assignment to multiples of 88 from same container
problem = cp.Problem(objective,constraints)
problem.solve(solver='CBC')
If the constraints are not possible, is there a way to at least minimize the number of containers being used and heavily weight that in the cost function?
something like:
obj_container = 0
for i in range(x.shape[1]):
    obj_container += len(set(cp.multiply(x[:,i], item_df['container'].values)))
objective = cp.Minimize(obj_item + obj_assign - 100*obj_supply + 8000*obj_container)
## each container is weighted more heavily than any individual item, so the optimization
## would need to source multiple items from another container for it to be "worth it".
Essentially, is there a way to find unique/distinct items in cp.multiply(x[:,i], item_df['container'].values)?
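For what it's worth, one standard MIP way to express the bundle requirement is an integer "bundle count" variable per (container, knapsack) pair. A sketch, not tested on real data; num_containers, A, z, and u are names introduced here, and each item is assumed to belong to exactly one known container given by integer codes in item_df['container']:

# A[c, i] = 1 if item i comes from container c, so A @ x counts, for every
# (container, knapsack) pair, how many of that container's items go to that knapsack.
num_containers = item_df['container'].nunique()
A = np.zeros((num_containers, B))
A[item_df['container'].values, np.arange(B)] = 1

z = cp.Variable((num_containers, C), integer=True)  # number of 88-item bundles
u = cp.Variable(num_containers, boolean=True)       # container-used indicator

constraints += [
    A @ x == 88 * z,   # per-container count in each knapsack is a multiple of 88
    z >= 0,
    cp.sum(z, axis=1) <= (tc / 88) * u,  # big-M link: z can be nonzero only if u[c] == 1
]

Minimizing a weighted cp.sum(u) in the objective then penalizes the number of distinct containers used, which also addresses the fallback question.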
I'm building a reoptimization model, and I would like to include a constraint to reduce the distance between the initial solution and the reoptimized solution. I'm doing staff scheduling, so I want to penalize each assignment in the reoptimized solution that is different from the initial solution.
Before I start: I'm new to optimization models, and the way I built the constraint may be wrong.
# 1 Extract the data from the initial solution of my main variable
ModelX_DictExtVal = model.x.extract_values()

# 2 Create a new binary variable which activates when the main variable
# ModelX_DictExtVal[n,s,d] of the initial solution is 1 (an employee n works
# shift s on day d) and the value of model.x[n,s,d] in the reoptimized
# solution is different.
model.alpha_distance = Var(model.N_S_D, within=Binary)

# 3 Model a constraint to activate my variable.
def constraint_distance(model, n, s, d):
    v = ModelX_DictExtVal[n,s,d]
    if v == 1 and ModelX_DictExtVal[n,s,d] != model.x[n,s,d]:
        return model.alpha_distance[n,s,d] == 1
    elif v == 0:
        return model.alpha_distance[n,s,d] == 0
model.constraint_distance = Constraint(model.N_S_D, rule=constraint_distance)

# 4 Penalize in my objective function every time the variable equals one
ObjFunction = Objective(expr=sum(model.alpha_distance[n,s,d] * WeightDistance
                                 for n in model.N for s in model.S for d in model.D))
Issue: I'm not sure about what I'm doing in part 3 and I get an index error when v == 1.
ERROR: Rule failed when generating expression for constraint
constraint_distance with index (0, 'E', 6): ValueError: Constraint
'constraint_distance[0,E,6]': rule returned None
Since I am reusing the same model for the re-optimization, I am also wondering whether the model keeps the values of the initial solution of model.x[n,s,d], so that the comparison ModelX_DictExtVal[n,s,d] != model.x[n,s,d] during the re-optimization phase is done against the old assignments instead of the new ones...
You are right to suspect part 3. :)
So you have some "initial values" that could be either the original schedule (before optimizing) or some other preliminary optimization. And your decision variable is binary, indexed by [n,s,d] if I understand your question.
In your constraint you cannot employ an if-else structure based on a comparison test of your decision variable. The value of that variable is unknown at the time the constraint is built, right?
You are on the right track, though. So, what you really want to do is to have your alpha_distance (or penalty) variable capture any changes, indicating 1 where there is a change. That is an absolute value operation, but can be captured with 2 constraints. Consider (in pseudocode):
penalty = |x.new - x.old| # is what you want
So introduce 2 constraints, (indexed fully by [n,s,d]):
penalty >= x.new - x.old
penalty >= x.old - x.new
Then, as you are doing now, include the penalty in your objective, optionally multiplied by a weight.
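In Pyomo, continuing with the names from your snippet, that might look like the following sketch (assuming model.N_S_D is the set of (n,s,d) index tuples and ModelX_DictExtVal holds the old 0/1 values):

model.penalty = Var(model.N_S_D, within=NonNegativeReals)

# penalty >= x.new - x.old
def pen_up_rule(model, n, s, d):
    return model.penalty[n,s,d] >= model.x[n,s,d] - ModelX_DictExtVal[n,s,d]
model.pen_up = Constraint(model.N_S_D, rule=pen_up_rule)

# penalty >= x.old - x.new
def pen_down_rule(model, n, s, d):
    return model.penalty[n,s,d] >= ModelX_DictExtVal[n,s,d] - model.x[n,s,d]
model.pen_down = Constraint(model.N_S_D, rule=pen_down_rule)

Because the objective minimizes the penalties, each penalty[n,s,d] settles at exactly |x.new - x.old| at the optimum.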
Comment back if that doesn't make sense...
I'm using pyomo to find an approach for solving an energy optimization problem. I'm trying to find the optimal time slot during the day to shift the start of a smart dishwasher to, so that the electricity cost is minimized.
I'm using an example from a paper, but unfortunately I can't embed the pictures describing the problem yet.
I know how to use pyomo to solve a simple multi-period optimization problem. If the bounds of the sum in the objective function are fixed, I can write "sum(... for t in m.T)". See the following code example for a better understanding of what I mean:
from pyomo.environ import *
model = ConcreteModel()
model.T = RangeSet(5)
...
model.y = Var(model.T, domain=Binary)
model.i_pos = Var(model.T, domain=NonNegativeReals)
model.i_neg = Var(model.T, domain=NegativeReals)
...
c = 30 # setup cost: fixed cost per time step if production must occur in that step
h_pos = 0.7 # cost per unit of holding inventory
h_neg = 1.2 # shortage cost per unit
P = 5.0 # maximum production amount per time step
...
def obj_rule(m):
    return sum(c*m.y[t] + h_pos*m.i_pos[t] + h_neg*m.i_neg[t] for t in m.T)
model.obj = Objective(rule=obj_rule, sense=minimize)
Now my issue with the mentioned energy optimization problem is that I don't know how to include a variable in the bounds of a sum in pyomo. The variable s_n,i (see link: problem description) describes the starting time slot of the dishwasher, but it is also part of the sum in my objective function, which minimizes the cost. So somehow I need to get my variable s_n,i into the "for t in m.T" part of my objective function.
I adapted the problem from the paper a bit:
- 7 time steps
- the dishwasher runs for 2 hours when it's turned on
- the dishwasher has a rated power of 1 kW
This is my current code:
from pyomo.environ import *
model = ConcreteModel()
# 7 time steps
model.T = RangeSet(7)
a = 1 # earliest starting time slot for the dishwasher
b = 7 # latest ending time slot for the dishwasher
d = 2 # duration dishwasher in h
p = 1 # power dishwasher in kW
# electricity cost per time step in €/kWh
cost = {1: 0.4, 2: 0.3, 3: 0.3, 4: 0.15, 5: 0.25, 6: 0.30, 7: 0.35}
# definition of the variable s_n,i
model.start = Var(bounds=(a, b-d), domain=NonNegativeReals)
def obj_rule(m):
    return sum(cost[t]*p for t in m.T)
model.obj = Objective(rule=obj_rule, sense=minimize)
solver = SolverFactory('glpk')
solver.solve(model)
Thanks for your help!
Welcome to the site. Your question is more of a math programming question than a coding issue, but let's give it a try...
I hope you have some familiarity with integer programming, because that is where you need to go with this problem to fix the part that you are hung up on.
Think about changing your start variable. (Let's just rename it Y for ease.) Let us decide that Y_j is a binary variable that represents the decision to start the machine in period j, where j ranges over the subset of periods T in which a start is allowed. Now we have something to work with in your objective function...
In the objective, then, each summation element should check whether Y_j is 1 (selected) in the current or previous 2 time periods (assuming the machine runs for 3 time steps).
Alternatively, you could introduce yet another variable to indicate whether the machine is running in any particular period and set up constraints to force that variable based on Y_j.
You will need a constraint or two on Y_j depending on how you set the problem up, notably, of course, sum(Y_j) = 1
Comment back if yer stuck.
EDIT : Implementation below
There are a couple of approaches to this. The one below uses 2 variables: one for the start period and one for "running". I think you could finesse it a bit with a more complicated objective function and just use the "start" variable, but this is along the lines of the comments above.
# pick the cheapest 2-consecutive periods to run an appliance
import pyomo.environ as pyo
# Data
costs = {1: 10,
         2: 5,
         3: 7,
         4: 15}
num_periods = 4
run_time = 2 # the number of periods the machine runs
m = pyo.ConcreteModel('appliance runner')
# SETS
m.T = pyo.RangeSet(num_periods)
m.T_start = pyo.RangeSet(num_periods - (run_time - 1)) # legal starts {1,2,3}
# VARS
m.Y = pyo.Var(m.T_start, domain=pyo.Binary) # the period to START
m.X = pyo.Var(m.T, domain=pyo.Binary) # machine running
# CONSTRAINTS
# must start at least once
m.C1 = pyo.Constraint(expr=sum(m.Y[t] for t in m.T_start) >=1 )
# periods after the start, the machine must be on for run_time
# (this is the tricky one... couple ways to do this...)
def running(m, t):
    return m.X[t] >= sum(m.Y[t2] for t2 in
                         range(t, t - run_time, -1)
                         if t2 in m.T_start)
m.C2 = pyo.Constraint(m.T, rule=running)
# m.pprint() <-- use this to see what was created in this constraint
# OBJ
m.OBJ = pyo.Objective(expr=sum(m.X[t]*costs[t] for t in m.T))
solver = pyo.SolverFactory('glpk')
results = solver.solve(m)
print(results)
m.display()
I'm writing a code to analyze a (8477960, 1) column vector. I am not sure if the while loops in my code are running infinitely, or if the way I've written things is just really slow.
This is a section of my code up to the first while loop, which I cannot get to run to completion.
import numpy as np
import pandas as pd
data = pd.read_csv(r'C:\Users\willo\Desktop\TF_60nm_2_2.txt')
def recursive_low_pass(rawsignal, startcoeff, endcoeff, filtercoeff):
    # The current signal length
    ni = len(rawsignal)  # signal size
    rougheventlocations = np.zeros(shape=(100000, 3))
    # The algorithm parameters
    # filter coefficient
    a = filtercoeff
    raw = np.array(rawsignal).astype(float)
    # thresholds
    s = startcoeff
    e = endcoeff  # for event start and end thresholds
    # The recursive algorithm
    # loop init
    ml = np.zeros(ni)
    vl = np.zeros(ni)
    s = np.zeros(ni)
    ml[0] = np.mean(raw)  # local mean init
    vl[0] = np.var(raw)  # local variance init
    i = 0  # sample counter
    numberofevents = 0  # number of detected events
    # main loop
    while i < (ni - 1):
        i = i + 1
        # local mean low pass filtering
        ml[i] = a * ml[i - 1] + (1 - a) * raw[i]
        # local variance low pass filtering
        vl[i] = a * vl[i - 1] + (1 - a) * np.power(raw[i] - ml[i], 2)
        # local threshold to detect event start
        sl = ml[i] - s * np.sqrt(vl[i])
I'm not getting any error messages, but I've let the program run for more than 10 minutes without any results, so I assume I'm doing something incorrectly.
You should try to vectorize this process rather than accessing/processing indexes (otherwise why use numpy).
The other thing is that you seem to be doing unnecessary work (unless we're not seeing the whole function).
the line:
sl = ml[i] - s * np.sqrt(vl[i])
assigns the variable sl which you're not using inside the loop (or anywhere else). This assignment performs a whole vector multiplication by s which is all zeroes. If you do need the sl variable, you should calculate it outside of the loop using the last encountered values of ml[i] and vl[i] which you can store in temporary variables instead of computing on every loop.
If ni is in the millions, this unnecessary vector multiplication (of millions of zeros) is going to be very costly.
You probably didn't mean to override the value of s = startcoeff with s = np.zeros(ni) in the first place.
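For example (a sketch of that suggestion, assuming sl is only needed after the loop and that the s used in it was meant to stay the startcoeff scalar):

s = startcoeff                             # keep the scalar; don't shadow it with np.zeros(ni)
# ... main loop fills ml and vl as before ...
sl = ml[ni - 1] - s * np.sqrt(vl[ni - 1])  # computed once, from the final values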
In order to vectorize these calculations you will need to use numpy's accumulate machinery (np.ufunc.accumulate) with some customized functions.
The non-numpy equivalent would be as follows (using itertools instead):
from itertools import accumulate
ml = [np.mean(raw)]+[0]*(ni-1)
mlSums = accumulate(zip(ml,raw),lambda r,d:(a*r[0] + (1-a)*d[1],0))
ml = [v for v,_ in mlSums]
vl = [np.var(raw)]+[0]*(ni-1)
vlSums = accumulate(zip(vl,raw,ml),lambda r,d:(a*r[0] + (1-a)*(d[1]-d[2])**2,0,0))
vl = [v for v,_,_ in vlSums]
In each case, the ml / vl vectors are initialized with the base value at index zero and the rest filled with zeroes.
The accumulate(zip(... function calls go through the array and call the lambda function with the current sum in r and the paired elements in d. For the ml calculation, this corresponds to r = (ml[i-1],_) and d = (0,raw[i]).
Because accumulate outputs the same data type as it is given as input (which are zipped tuples), the actual result is only the first value of the tuples in the mlSums/vlSums lists.
This took 9.7 seconds to process for 8,477,960 items in the lists.
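If SciPy is available, these first-order recursions can also be computed fully vectorized, since each one is a first-order IIR filter. A sketch, untested against the original data; it assumes raw and a are defined as in the question and that raw has been flattened to 1-D (e.g. raw.ravel()):

import numpy as np
from scipy.signal import lfilter

def low_pass(x, a, y0):
    # y[i] = a*y[i-1] + (1-a)*x[i], with y[0] forced to y0
    y = np.empty(len(x))
    y[0] = y0
    # the initial filter state zi feeds a*y0 into the first filtered sample
    y[1:], _ = lfilter([1 - a], [1.0, -a], x[1:], zi=[a * y0])
    return y

ml = low_pass(raw, a, np.mean(raw))
vl = low_pass((raw - ml) ** 2, a, np.var(raw))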
I am trying to do some analytics against a large dictionary created by reading a file from disk. The read operation results in a stable memory footprint. I then have a method which performs some calculations based on data I copy out of that dictionary into a temporary dictionary. I do this so that all the copying and data use is scoped in the method, and would, I had hoped, disappear at the end of the method call.
Sadly, I am doing something wrong. The customerdict definition is as follows (defined at the top of the .py file):
customerdict = collections.defaultdict(dict)
The format of the object is {customerid: dictionary{id: 0||1}}
There is also a similarly defined dictionary called allids.
I have a method for calculating the sim_pearson distance (modified code from the Programming Collective Intelligence book), which is below.
from math import sqrt

def sim_pearson(custID1, custID2):
    si = []
    smallcustdict = {}
    smallcustdict[custID1] = customerdict[custID1]
    smallcustdict[custID2] = customerdict[custID2]
    # a loop to round out the remaining allids object to fill in 0 values
    for customerID, catalog in smallcustdict.iteritems():
        for id in allids:
            if id not in catalog:
                smallcustdict[customerID][id] = 0.0
    # get the list of mutually rated items
    for id in smallcustdict[custID1]:
        if id in smallcustdict[custID2]:
            si.append(id)  # = 1
    # return 0 if there are no matches
    if len(si) == 0: return 0
    # add up all the preferences
    sum1 = sum([smallcustdict[custID1][id] for id in si])
    sum2 = sum([smallcustdict[custID2][id] for id in si])
    # sum up the squares
    sum1sq = sum([pow(smallcustdict[custID1][id], 2) for id in si])
    sum2sq = sum([pow(smallcustdict[custID2][id], 2) for id in si])
    # sum up the products
    psum = sum([smallcustdict[custID1][id] * smallcustdict[custID2][id] for id in si])
    # calc Pearson score
    num = psum - (sum1*sum2/len(si))
    den = sqrt((sum1sq - pow(sum1, 2)/len(si)) * (sum2sq - pow(sum2, 2)/len(si)))
    del smallcustdict
    del si
    del sum1
    del sum2
    del sum1sq
    del sum2sq
    del psum
    if den == 0:
        return 0
    return num/den
Every loop through the sim_pearson method grows the memory footprint of python.exe without bound. I tried using the del statement to explicitly delete the locally scoped variables.
Looking at taskmanager, the memory is jumping up at 6-10Mb increments. Once the initial customerdict is setup, the footprint is 137Mb.
Any ideas why I am running out of memory doing it this way?
I presume the issue is here:
smallcustdict[custID1] = customerdict[custID1]
smallcustdict[custID2] = customerdict[custID2]
# a loop to round out the remaining allids object to fill in 0 values
for customerID, catalog in smallcustdict.iteritems():
    for id in allids:
        if id not in catalog:
            smallcustdict[customerID][id] = 0.0
The dictionaries from customerdict are being referenced in smallcustdict, so when you add to them, the additions persist. This is the only point I can see where you do anything that will persist out of scope, so I would imagine this is the problem.
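A toy illustration of that aliasing (names made up for the demonstration):

customerdict = {'c1': {'item1': 1}}
smalldict = {'c1': customerdict['c1']}  # no copy: both names refer to the same inner dict
smalldict['c1']['item2'] = 0.0
print(customerdict)  # {'c1': {'item1': 1, 'item2': 0.0}} - the "temporary" write persisted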
Note that you are making a lot of work for yourself in many places by not using list comprehensions, doing the same thing repeatedly, and not writing things generically. A better version might be as follows:
import collections
import functools
import operator
from math import sqrt

customerdict = collections.defaultdict(dict)

def sim_pearson(custID1, custID2):
    # Declaring as a dict literal is nicer.
    smallcustdict = {
        custID1: customerdict[custID1],
        custID2: customerdict[custID2],
    }
    # Unchanged, as I'm not sure what the intent is here.
    for customerID, catalog in smallcustdict.iteritems():
        for id in allids:
            if id not in catalog:
                smallcustdict[customerID][id] = 0.0
    # dict views are set-like, so the easier way to do what you want is the
    # intersection of the two.
    si = smallcustdict[custID1].viewkeys() & smallcustdict[custID2].viewkeys()
    # "if not" is a cleaner way of checking for no values.
    if not si:
        return 0
    # Made more generic to avoid repetition and wasteful repeated looping.
    parts = [list(part) for part in
             zip(*((value[id] for value in smallcustdict.values()) for id in si))]
    sums = [sum(part) for part in parts]
    sumsqs = [sum(pow(i, 2) for i in part) for part in parts]
    psum = sum(functools.reduce(operator.mul, part) for part in zip(*parts))
    sum1, sum2 = sums
    sum1sq, sum2sq = sumsqs
    # Unchanged.
    num = psum - (sum1*sum2/len(si))
    den = sqrt((sum1sq - pow(sum1, 2)/len(si)) * (sum2sq - pow(sum2, 2)/len(si)))
    # Again using "if not".
    if not den:
        return 0
    else:
        return num/den
Note that this is entirely untested, as the code you gave isn't a complete example. However, it should be easy enough to use as a basis for improvement.
Try changing the following two lines:
smallcustdict[custID1] = customerdict[custID1]
smallcustdict[custID2] = customerdict[custID2]
to
smallcustdict[custID1] = customerdict[custID1].copy()
smallcustdict[custID2] = customerdict[custID2].copy()
That way the changes you make to the two dictionaries do not persist in customerdict when the sim_pearson() function returns.
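Note that dict.copy() makes a shallow copy, which is enough here because the inner values are plain numbers; if the inner dictionaries themselves held mutable objects, you would need copy.deepcopy() instead.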