How can I perform a calculation on my CVXPY variable? - python

I have a convex programming problem in which I am constrained to several periods, each representing a different part of the day in minutes. Assume we are constrained to 7 periods in the day, consisting of [480, 360, 120, 180, 90, 120, 90] minutes.
Update to my thoughts on this:
Could the 7-interval variable be translated into a binary variable of length 1440 (one entry per minute of the day)? That would let us calculate the level as often as needed.
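Something like the sketch below is what I have in mind (a rough sketch only; the names are hypothetical and it grows the problem to 1440 binaries):
import numpy as np
import cvxpy as cp

periods_minutes = [480, 360, 120, 180, 90, 120, 90]
offsets = np.concatenate(([0], np.cumsum(periods_minutes)))  # period start minutes

running_time = cp.Variable(len(periods_minutes), integer=True)
minute_on = cp.Variable(1440, boolean=True)  # 1 for each minute the pump runs

constraints = []
for i, start in enumerate(offsets[:-1]):
    end = offsets[i + 1]
    # on-minutes are contiguous from the start of each period...
    constraints += [minute_on[start:end - 1] >= minute_on[start + 1:end]]
    # ...and their count matches the period's integer running time
    constraints += [cp.sum(minute_on[start:end]) == running_time[i]]
With minute_on available, the level could then be cumulatively summed minute by minute instead of period by period.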
I use these periods as upper bounds for my integer variable X, a CVXPY variable defined as X = cp.Variable(7). To define the problem I create constraints; the constraints I want to work with are as follows:
Target >= min target
Reservoir level >= min level
Reservoir level <= max level
I understand that in order to calculate the reservoir levels I must feed in the correct data, such as the surface area and the expected outflow. The problem I am struggling with is that, because of the shape of X, my calculation only checks the level at point 0, point 1, ..., point 7, and at those points the constraint is satisfied. In the real world, however, the level exceeds the maximum in between these points. How could we refactor the code below to account for this, given the variable running times of the pumps as set by X?
Please see below the code that we currently are working with.
# Imports
import numpy as np
import cvxpy as cp
# Initial variables required to reproduce the problem
periods_minutes = [480, 360, 120, 180, 90, 120, 90]
energy_costs = np.array([0.19, 0.22, 0.22, 0.22, 0.22, 0.22, 0.19])
flow_out = np.array([84254.39998627, 106037.09985495, 35269.19992447, 47066.40017509, 26121.59963608, 33451.20002747, 20865.5999279])
# Constant variables
pump_flow = 9.6
pump_energy = 9.29
pump_flow_per_minute = pump_flow * 60
# CVXPY section
N = len(periods_minutes)
MIN_RUN_TIME = 10
running_time = cp.Variable(N, integer=True)
mins = np.ones(N) * MIN_RUN_TIME
maxs = np.ones(N) * periods_minutes
k = cp.Variable(N, boolean=True)
# Optimisation calculations
running_time_hours = running_time / 60
cost_of_running = cp.multiply(running_time_hours, energy_costs) * pump_energy
sum_of_energy = cp.sum(cost_of_running)
volume_cp = cp.sum(running_time*pump_flow_per_minute)
period_volume = running_time * pump_flow_per_minute
# Create a variable with 1440 entries that is 1 for each minute the pump is running and 0 for the remainder
# Example: running_time[0] = 231 means the first 231 entries are 1
# test = np.zeros((1, 1440))
# for i in range(N):
#     for j in range(running_time[i]):
#         test[0][j] = 1
# Reservoir information and calculations
FACTOR = 1/160.6
flow_in = running_time * pump_flow_per_minute
flow_diff = (flow_in - flow_out) / 1000
res_level = cp.cumsum(flow_diff) * FACTOR + 2.01
# Constant constraints
min_level_constraint = res_level >= 1.8
max_level_constraint = res_level <= 2.4
volume_constraint = volume_cp >= 353065.5
# Build constraints
constraints = []
# Convert the integer variables to binary variables
# constraints += [test_cp[0] == 1]
# Append common constraints
constraints += [min_level_constraint]
constraints += [max_level_constraint]
constraints += [volume_constraint]
constraints += [running_time >= cp.multiply(k, mins)]
constraints += [running_time <= cp.multiply(k, maxs)]
# Objective definition
objective = cp.Minimize(cp.sum(sum_of_energy))
# Problem declaration
prob = cp.Problem(objective, constraints)
prob.solve(solver=cp.CPLEX, verbose=False)
# Each ith element of the array represents running time in minutes
running_time.value
Note that some variables are part of our external class:
Surface Area: 160.6m²
Min Level: 1.85m
Max Level: 2.4m
Pump Flow: 9.6l/s
Pump Energy: 9kW
At the moment our outflow data for the reservoir is in 30-minute intervals. Ideally we could develop a solution that accounts for this, in the sense of an inflow matrix that reflects the various running times over each period and the corresponding volumes. For example, imagine the output for X is [231, 100, 0, 0, 30, 90, 99]. Looking at the first element, 231, with 480 minutes as the maximum running time for period 1, this yields 16 elements, as 480/30.
The expected outcome given this would be something like
[17280 17280 17280 17280 17280 17280 17280 12096 0 0 0 0 0 0 0 0]
Figures shown above are volumes: 17280 is a full 30-minute interval of running, 12096 is 21 minutes of running within the interval, and 0 is not running. A plain-NumPy sketch of this bucketing follows below. I hope to have provided enough information to entice people into looking at this problem, and I look forward to answering any queries you may have. Thanks for taking the time to read through my post.
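Here is the sketch (outside the optimisation, names hypothetical), which reproduces the expected outcome above:
import numpy as np

def volumes_per_half_hour(run_minutes, period_minutes, flow_per_minute=9.6 * 60):
    # Expand per-period running times into 30-minute volume buckets,
    # assuming the pump runs from the start of each period.
    buckets = []
    for run, total in zip(run_minutes, period_minutes):
        for start in range(0, total, 30):
            minutes_on = min(max(run - start, 0), 30)
            buckets.append(minutes_on * flow_per_minute)
    return np.array(buckets, dtype=int)

print(volumes_per_half_hour([231], [480]))
# [17280 17280 17280 17280 17280 17280 17280 12096 0 0 0 0 0 0 0 0]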

Problem
I assume that the pump starts running at the beginning of each time period and stops after running_time minutes, staying off until the next period starts. We check the level at the end of each period, but within a period the level may rise higher while the pump is working. I hope I've understood the problem correctly.
Solution
The constraint is:
res_level(t) <= 2.4
The function is piecewise smooth, the pieces being separated by time period boundaries and the event of pump shutdown within each time period.
Mathematically we know that the constraint is satisfied if the value of res_level(t) is smaller than 2.4 at all critical points—i.e. piece boundaries and interior extrema.
We also know that res_level(t) is linear in the piece intervals. So there are no interior extrema—except in case of constant function, but in that case the value is already checked at the boundaries.
So your approach of checking res_level at ends of each period is correct, except that you also need to check the level at the times of pump shutdown.
From simple mathematics:
res_level(t_shutdown) = res_level(period_start) + (flow_in - (t_shutdown / t_period) * flow_out) / 1000 * FACTOR
In CVXPY this can be implemented as:
res_level_at_beginning_of_period = res_level - flow_diff * FACTOR
flow_diff_until_pump_shutdown = (flow_in - cp.multiply(flow_out, running_time / np.array(periods_minutes))) / 1000
res_level_at_pump_shutdown = res_level_at_beginning_of_period + flow_diff_until_pump_shutdown * FACTOR
max_level_constraint_at_pump_shutdown = res_level_at_pump_shutdown <= 2.4
constraints += [max_level_constraint_at_pump_shutdown]
Running the code with this additional constraint gave me the following res_levels (levels at end of periods):
[2.0448792 1.8831538 2.09393089 1.80086488 1.96100436 1.81727335 2.0101401]
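As a quick sanity check after solving, the shutdown-time levels can be inspected directly (CVXPY expressions expose .value once the problem has been solved):
print(res_level_at_pump_shutdown.value)  # should stay <= 2.4 in every period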

Related

How to simulate a Non-Homogenous Poisson Process?

I tried to simulate the NHPP in Python. The function works, but the numbers simulated don't follow the NHPP.
The code is:
import numpy
from math import log as ln
from scipy import integrate

def nhpp(parametros, T, N):
    numeros = list()
    # the rate function λ(t) is a power law model: λ(t) = λ β t**(β−1), with t, λ, β > 0.
    funçao = lambda x: parametros[1] * parametros[0] * x ** (parametros[0] - 1)
    # calculate the maximum of the function in the interval (0, T)
    res = integrate.quad(funçao, 0, T)
    # l represents λ
    l = res[0]
    t = 0
    cont = 0
    contagem = list()
    listafinal = list()
    for i in range(1, N + 1):
        u = numpy.random.uniform(0, 1)
        # t represents the exponential times generated
        t = t - (ln(u) / l)
        # fun represents the values of λ(t) for t1, t2, t3, ..., tN
        fun = parametros[1] * parametros[0] * t ** (parametros[0] - 1)
        # if u <= λ(t)/λ we accept the time
        if u <= fun / l:
            numeros.append(t)
            # cont counts the number of times (N(T)) accepted as NHPP
            cont = cont + 1
            contagem.append(cont)
    listafinal.append(numeros)
    listafinal.append(contagem)
    print(listafinal)
    return listafinal

x = nhpp([0.5, 0.35], 500, 20000)
The output of this function is: [[6.637092201160706, 12.739051189013342, 22.89616658744735, 161.12015416135688, 386.6019409119157, 424.7928356177192, 428.48931184149734, 733.1527508780554, 886.1376014091232, 1073.653026573429, 1133.4535462717483, 1787.4258499386765, 2077.7766357676205], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]]
If I plot the points, the times between occurrences are not decreasing, but they should be, because when B (the parameter of the power-law model) is < 1 (in this case B=0.5), the times between occurrences decrease. Can anyone help me simulate the NHPP in Python correctly?
Note: in the power law process, if B < 1 the inter-failure times decrease, if B = 1 they are constant, and if B > 1 they increase.
I want to create something like this:
In the picture, the black line represents B=0.5, the blue line B=1, and the red line B=1.5.
Update up-front
A common way to generate/simulate non-homogeneous Poisson processes is to use thinning. Candidate event times are generated at the maximal rate for the interval, and then thinned out by accepting a proportion of them based on the ratio of the instantaneous rate to the maximal rate. Unfortunately, this does not work if the maximal rate in the interval of interest is not finite, which can be the case when the rate function follows a power law with b < 1. I've left the discussion of thinning below for people who find this post based on the question title.
NIST has an online manual which describes a generating algorithm specific to the power law case. According to the link given above, in order to generate power law NHPP events with parameters a=0.5 and b=0.35 you should generate exponential random variates with rate a, add that to the prior time raised to the bth power, and then take the bth root of the sum to yield the next event time:
import random
# params is a list containing rate a and power b.
# t is the amount of time to be simulated.
def nhpp(params, t):
    time = 0.0
    event = 0
    print("event time,event number")
    while True:
        # Generate b'th root of an exponential with rate "a",
        # and update the simulated time accordingly
        time = (time ** params[1] + random.expovariate(params[0])) ** (1 / params[1])
        event += 1
        if time > t:
            break
        print(f"{time},{event}")
nhpp([0.5, 0.35], 10000)
which yields output such as:
event time,event number
0.0027863666405411654,1
0.1663302577640816,2
0.3771684274752755,3
36.54675259117693,4
76.353564909201,5
260.547640677633,6
292.0182519323185,7
406.34546142065693,8
5342.127722590645,9
5472.997406321742,10
5844.439757675029,11
8521.086105482522,12
How to simulate NHPP in python using thinning
The thinning technique's principles are described here. The following is a heavily annotated sample implementation of how to do it in Python:
import math
import random
# A frequency that cycles every 20 time units, in radians
OMEGA = 0.05 * 2 * math.pi
# A sinusoidal time-varying rate, truncated at zero
rate_f = lambda t: max(0, 4 - (16 * math.cos(OMEGA * t)))
# The rate function above gives a maximum instantaneous arrival rate
# of 20 events per time unit, i.e., 4 - (16 * -1) when cos == -1
lambda_max = 20.0
# The following should generate 5 cycles of non-zero
# event epochs between time 0 and time 100
t = 0.0
print("time of event")
while True:
    # generate Poisson candidate event times using
    # exponentially distributed inter-event delays
    # at the maximal rate
    t += random.expovariate(lambda_max)
    # stop if we're past time 100
    if t > 100.0:
        break
    # (rate_f(t) / lambda_max) is the probability we
    # should accept a candidate at time t
    if random.random() <= rate_f(t) / lambda_max:
        # Accept and print this as an actual event if
        # a U(0,1) is below the threshold probability
        print(t)
This sample program generates results as described in the comments of the code.

'#Error: Solution not found' being returned when using gekko for optimization

I'm trying to complete a year-long battery optimization problem (8760 hours). "ind_1" and "ind_2" are lists of length 8760 containing 0s/1s. Certain hours of the year may earn additional revenue, so these indicator lists are used to distinguish those hours (further used in the maximization function).
m = Gekko(remote=False)
#variables
e_battery = m.Var(lb=0, ub=4000, value=2000) #energy in battery at time t, battery size 4 MWh, initial value is 2MWh
command = m.Var(lb=-1000, ub=1000) #command power -1 to 1 (in MW)
e_price = m.Param(value = price) #price is a list of 8760 values
ind_1 = m.Param(value = ind_1)
ind_2 = m.Param(value = ind_2)
m.time = np.linspace(0,8759, 8760)
m.Equation(e_battery.dt() == e_battery + command)
m.Maximize((-command)*(e_price + ind_1*ind1_price + ind_2*ind2_price))
m.options.IMODE = 6
m.solve()
When I run the above model, it runs for about 20 iterations and then returns the error: "#error: Solution Not Found". The objective of this task is to return an array of 8760 values (the command variable) that maximizes the return. Any ideas where this error comes from?
It looks like one of the equations isn't correct:
m.Equation(e_battery.dt() == e_battery + command)
This causes an exponential rise in e_battery that exceeds the upper bound. It should probably be:
m.Equation(e_battery.dt() == command)
There is a specific error code that gives insight into why the solver failed to find a solution. The two most common are Maximum Iterations and Infeasible Solution. If it is Maximum Iterations, try setting m.options.MAX_ITER to a higher number (up to 1000?). If it is an infeasible solution, look at the infeasibilities.txt file in the run directory m.path to find the constraint(s) that are causing the problem. The other thing to try is a smaller time horizon initially, to verify that it works on something with about 100 time steps.
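For the first two checks, here is a minimal sketch reusing the m object from the question (MAX_ITER is a standard Gekko option and m.path holds the run directory):
m.options.MAX_ITER = 1000  # raise the iteration cap
try:
    m.solve(disp=True)     # the solver log names the failure mode
except Exception:
    print('Inspect infeasibilities.txt in:', m.path)
The smaller-horizon test, with the corrected equation, looks like this: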
from gekko import Gekko
import numpy as np
m = Gekko(remote=False)
#variables
n = 100
price=np.ones(n)
e_battery = m.Var(lb=0, ub=4000, value=2000) #energy in battery at time t
# battery size 4 MWh, initial value is 2MWh
command = m.Var(lb=-1000, ub=1000) #command power -1 to 1 (in MW)
e_price = m.Param(value = price) #price is a list of 8760 values
ind_1=1; ind_2=1
ind1_price=1; ind2_price=1
ind_1 = m.Param(value = ind_1)
ind_2 = m.Param(value = ind_2)
m.time = np.linspace(0,n-1,n)
m.Equation(e_battery.dt() == command)
m.Maximize((-command)*(e_price + ind_1*ind1_price + ind_2*ind2_price))
m.options.IMODE = 6
m.solve()
This gives a successful solution:
EXIT: Optimal Solution Found.
The solution was found.
The final value of the objective function is -6000.00005999999
---------------------------------------------------
Solver : IPOPT (v3.12)
Solution time : 0.05 sec
Objective : -6000.00005999999
Successful solution
---------------------------------------------------

Adding constraint to limit battery cycling to 1 charge and 1 discharge every 24 hours

This enquiry is an extension to the question found in : '#Error: Solution not found' being returned when using gekko for optimization.
"ind_1" and "ind_2" are lists of length 8760 containing 0s/1s. Certain hours of the year may earn additional revenue, so these indicator lists are used to distinguish those hours (further used in the maximization function
I am trying to build onto this model by limiting the battery cycle to at MOST 1 charge and discharge every 24 hours. As an initial simplistic approach, I am attempting to sum up the battery command signals for each 24 hour segment and limiting it to at most 8000 kWh. You can find my approach below:
m = Gekko(remote=False)
#variables
e_battery = m.Var(lb=0, ub=4000, value=2000) #energy in battery at time t, battery size 4 MWh, initial value is 2MWh
command = m.Var(lb=-1000, ub=1000) #command power -1 to 1 (in MW)
e_price = m.Param(value = price) #price is a list of 8760 values
ind_1 = m.Param(value = ind_1)
ind_2 = m.Param(value = ind_2)
peak_list = m.Param(value = peak_load_list) #list of the monthly peaks (an array of length 8760)
load_list = m.Param(value = load) #hourly electric load
m.time = np.linspace(0,8759, 8760)
m.Equation(e_battery.dt() == command)
#The next 2 constraints are to ensure that the new load (original load + battery operation) is greater than 0, but less than the peak load for that month
m.Equation(load_list + command >= 0)
m.Equation(load_list + command <= peak_list)
#Here is the code to limit the cycling. "abs(command)" is used since "command" can be negative (discharge) or positive (charge), and a full charge and full discharge will equate to 8000 kWh.
daily_sum = 0
for i in range(8760):
    daily_sum += abs(command)
    if i%24==0 and i!=0: #when i=0, it's the beginning of the first day so we can skip it
        m.Equation(daily_sum <= 8000)
        daily_sum = 0 #reset to 0 in preparation for the first hour of the next day
m.Maximize((-command)*(e_price + ind_1*ind1_price + ind_2*ind2_price))
m.options.IMODE = 6
m.solve()
When adding the cycling constraint, the following output is returned:
--------- APM Model Size ------------
Each time step contains
Objects : 0
Constants : 0
Variables : 373
Intermediates: 0
Connections : 0
Equations : 368
Residuals : 368
Error: At line 1545 of file apm.f90
Traceback: not available, compile with -ftrace=frame or -ftrace=full
Fortran runtime error: Out of memory
Does this particular implementation work using gekko's framework? Would I have to initialize a different type of variable for "command"? Also, I haven't been able to find many relevant examples of using for loops for the equations, so I'm very aware that my implementation might be well off. Would love to hear anyone's thoughts and/or suggestions, thanks.
Binary variables indicate when a destination has been reached (e_battery>3999 or e_battery<1). Integrating those binary variables gives an indication of how many times in a day the limit has been reached. One possible solution is to limit the integral of the binary variable to be less than the day count.
Below are two examples with soft constraints and hard constraints. The number of time points is reduced from 8760 to 120 (5 days) for testing.
from gekko import Gekko
import numpy as np
m = Gekko(remote=False)
n = 120 # hours
price=np.ones(n)
e_battery = m.Var(lb=0, ub=4000, value=2000) #energy in battery at time t
# battery size 4 MWh, initial value is 2MWh
command = m.Var(lb=-1000, ub=1000) #command power -1 to 1 (in MW)
e_price = m.Param(value = price) #price is a list of 8760 values
ind_1=1; ind_2=1
ind1_price=1; ind2_price=1
ind_1 = m.Param(value = ind_1)
ind_2 = m.Param(value = ind_2)
m.time = np.linspace(0,n-1,n)
m.Equation(e_battery.dt() == command)
day = 24
discharge = m.Intermediate(m.integral(m.if3(e_battery+1,1,0)))
charge = m.Intermediate(m.integral(m.if3(e_battery-3999,0,1)))
x = np.ones_like(m.time)
for i in range(1,n):
    if i%day==0:
        x[i] = x[i-1] + 1
    else:
        x[i] = x[i-1]
limit = m.Param(x)
soft_constraints = True
if soft_constraints:
    derr = m.CV(value=0)
    m.Equation(derr==limit-discharge)
    derr.STATUS = 1
    derr.SPHI = 1; derr.WSPHI = 1000
    derr.SPLO = 0; derr.WSPLO = 1000
    cerr = m.CV(value=0)
    m.Equation(cerr==limit-charge)
    cerr.STATUS = 1
    cerr.SPHI = 1; cerr.WSPHI = 1000
    cerr.SPLO = 0; cerr.WSPLO = 1000
else:
    # Hard Constraints
    m.Equation(charge<=limit)
    m.Equation(charge>=limit-1)
    m.Equation(discharge<=limit)
    m.Equation(discharge>=limit-1)
m.Minimize(command*(e_price + ind_1*ind1_price + ind_2*ind2_price))
m.options.IMODE = 6
m.solve()
import matplotlib.pyplot as plt
plt.figure(figsize=(10,6))
plt.subplot(3,1,1)
plt.plot(m.time,limit.value,'g-.',label='Limit')
plt.plot(m.time,discharge.value,'b:',label='Discharge')
plt.plot(m.time,charge.value,'r--',label='Charge')
plt.legend(); plt.xlabel('Time'); plt.ylabel('Cycles'); plt.grid()
plt.subplot(3,1,2)
plt.plot(m.time,command.value,'k-',label='command')
plt.legend(); plt.xlabel('Time'); plt.ylabel('Command'); plt.grid()
plt.subplot(3,1,3)
plt.plot(m.time,e_battery.value,'g-.',label='Battery Charge')
plt.legend(); plt.xlabel('Time'); plt.ylabel('Battery'); plt.grid()
plt.show()
The application in the original question runs out of memory because 8760 equations are each simultaneously integrated over 8760 time steps. Try posing equations that are written once but are valid over the entire horizon. Also note that the current objective function minimizes electricity usage; you may need to include constraints or an objective term to meet demand, otherwise the optimal solution is simply to never use electricity (e.g. maximize(-command)). Here are similar Grid Energy Benchmark problems that may help.

Scipy confidence interval returns different bounds than manual calculation

I have the following data from 10 different people.
import pandas as pd

df = pd.DataFrame({'id': range(1, 11),
                   'x': [.7, -1.6, -.2, -1.2, -.1, 3.4, 3.7, .8, 0, 2]})
print(df)
id x
0 1 0.7
1 2 -1.6
2 3 -0.2
3 4 -1.2
4 5 -0.1
5 6 3.4
6 7 3.7
7 8 0.8
8 9 0.0
9 10 2.0
I want to calculate a 95 per cent confidence interval for the population mean of df[x].
Since the number of observations is small, the sample mean should follow the t distribution with 10 - 1 degrees of freedom. I tried the following in order to calculate a 95 per cent C.I. using scipy:
# Libraries
import numpy as np
from scipy import stats
# Number of observations
n_obs = 10
# Observed mean
m_obs = df['x'].mean()
# Observed variance (unbiased)
v_obs = df['x'].var(ddof=1) / n_obs
# Declare random variable with observed parameters
t = stats.t(df=n_obs - 1, loc=m_obs, scale=np.sqrt(v_obs))
# Calculate 95% CI
t.interval(alpha=0.95)
> (-0.5297804134938646, 2.0297804134938646) ### Correct interval
This confidence interval is correct. However, I get a completely different result when I manually calculate the interval. What is causing this?
# T such that P(t < T) = 0.975
T = t.ppf(0.975)
# Manually compute interval
(m_obs - (T * np.sqrt(v_obs)), m_obs + (T * np.sqrt(v_obs)))
> (-0.3983168630668432, 1.8983168630668432) ### Incorrect interval
It's been two weeks since I posted the answer in the comments but no one took credit for it, so here it goes:
The reason the two confidence intervals differ is that T accumulates 97.5% of the area under the probability distribution of t as you defined it, and that t has location m_obs and scale np.sqrt(v_obs). That is, it is not a standard t distribution.
The correct interval is obtained by simply making T the value that accumulates 0.975 of the probability of a standard t distribution:
# Number of observations (UNCHANGED)
n_obs = 10
# Observed mean (UNCHANGED)
m_obs = df['x'].mean()
# Observed variance (unbiased) (UNCHANGED)
v_obs = df['x'].var(ddof=1) / n_obs
# Declare *STANDARD* t distribution (CHANGED!!!)
t = stats.t(df=n_obs - 1, loc=0, scale=1)
# T such that P(x < T) = 0.975 (CHANGED!!!)
T = t.ppf(0.975)
# Manually compute interval (CORRECT ANSWER)
(m_obs - (T * np.sqrt(v_obs)), m_obs + (T * np.sqrt(v_obs)))
> (-0.5297804134938646, 2.0297804134938646)
This yields the correct answer.
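Equivalently, scipy folds the location and scale back in within a single call; interval() looks up the standard t quantiles and then applies loc and scale, which is also why the first snippet in the question worked:
stats.t.interval(0.95, df=n_obs - 1, loc=m_obs, scale=np.sqrt(v_obs))
# (-0.5297804134938646, 2.0297804134938646)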

Scipy optimize minimize always returns initial guess (SLSQP)

Like the title explains, my program always returns the initial guess.
For context, the program is trying to find the best way to allocate some product across multiple stores. Each store has a forecast of what it is expected to sell in the following days (sales_data). This forecast does not have to be integers, or above 1 (it rarely is); it is an expectation in the statistical sense. So, if a store has sales_data = [0.33, 0.33, 0.33], then it is expected that after 3 days it will sell 1 unit of product.
I want to minimize the total time it takes to sell the units I am allocating (I want to sell them as fast as possible). My constraints are that I have to allocate all the units I have available, and I cannot allocate a negative number of units to a store. I am OK with non-integer allocations for now. For my initial allocation I divide the available units equally among all stores.
Below is a shorter version of my code where I am having the problem:
import numpy, random
from scipy.optimize import curve_fit, minimize
unitsAvailable = 50
days = 15
class Store:
    def __init__(self, num):
        self.num = num
        self.sales_data = []

stores = []
for i in range(10):
    # Identifier
    stores.append(Store(random.randint(1000, 9999)))
    # Expected units to be sold that day (it's unlikely they will sell 1 every day)
    stores[i].sales_data = [random.randint(0, 100) / 100 for i in range(days)]
    print(stores[i].sales_data)

def days_to_turn(alloc, store):
    day = 0
    inventory = alloc
    while (inventory > 0 and day < days):
        inventory -= store.sales_data[day]
        day += 1
    return day

def time_objective(allocations):
    time = 0
    for i in range(len(stores)):
        time += days_to_turn(allocations[i], stores[i])
    return time

def constraint1(allocations):
    return unitsAvailable - sum(allocations)

def constraint2(allocations):
    return min(allocations) - 1

cons = [{'type':'eq', 'fun':constraint1}, {'type':'ineq', 'fun':constraint2}]

guess_allocs = []
for i in range(len(stores)):
    guess_allocs.append(unitsAvailable / len(stores))
guess_allocs = numpy.array(guess_allocs)

print('Optimizing...')
time_solution = minimize(time_objective, guess_allocs, method='SLSQP', constraints=cons, options={'disp':True, 'maxiter': 500})
time_allocationsOpt = [max([a, 0]) for a in time_solution.x]
unitsUsedOpt = sum(time_allocationsOpt)
unitsDaysProjected = time_solution.fun
for i in range(len(stores)):
    print("----------------------------------")
    print("Units to send to Store %s: %s" % (stores[i].num, time_allocationsOpt[i]))
    print("Time to turn allocated: %d" % (days_to_turn(time_allocationsOpt[i], stores[i])))
print("----------------------------------")
print("Estimated days to be sold: " + str(unitsDaysProjected))
print("----------------------------------")
print("Total units sent: " + str(unitsUsedOpt))
print("----------------------------------")
The optimization finishes successfully, with only 1 iteration, and no matter how I change the parameters, it always returns the initial guess_allocs.
Any advice?
The objective function does not have a gradient because it returns discrete multiples of days. This is easily visualized:
import numpy as np
import matplotlib.pyplot as plt
y = []
x = np.linspace(-4, 4, 1000)
for i in x:
    a = guess_allocs + [i, -i, 0, 0, 0, 0, 0, 0, 0, 0]
    y.append(time_objective(a))
plt.plot(x, y)
plt.xlabel('relative allocation')
plt.ylabel('objective')
plt.show()
If you want to optimize such a function you cannot use gradient based optimizers. There are two options: 1) Find a way to make the objective function differentiable. 2) Use a different optimizer. The first is hard. For the second, let's try dual annealing. Unfortunately, it does not allow constraints so we need to modify the objective function.
Constraining N numbers to a constant sum is the same as having N-1 unconstrained numbers and setting the Nth number to constant - sum.
import scipy.optimize as spo
bounds = [(0, unitsAvailable)] * (len(stores) - 1)
def constrained_objective(partial_allocs):
    if np.sum(partial_allocs) > unitsAvailable:
        # can't sell more than is available, so make the objective infeasible
        return np.inf
    # partial_allocs contains allocations to all but one store.
    # The final store gets allocated the remaining units.
    allocs = np.append(partial_allocs, unitsAvailable - np.sum(partial_allocs))
    return time_objective(allocs)
time_solution = spo.dual_annealing(constrained_objective, bounds, x0=guess_allocs[:-1])
print(time_solution)
This is a stochastic optimization method. You may want to run it multiple times to see if it can do better, or play with the optional parameters...
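For example, a few of its optional knobs (the values below are illustrative, not tuned; a fixed seed makes runs reproducible):
time_solution = spo.dual_annealing(
    constrained_objective,
    bounds,
    x0=guess_allocs[:-1],
    maxiter=2000,  # more annealing iterations
    seed=42,       # reproducible runs
)
print(time_solution.x, time_solution.fun)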
Finally, I think there is a problem with the objective function:
for i in range(len(stores)):
    time += days_to_turn(allocations[i], stores[i])
This says that the stores do not sell at the same time but only one after another. Does each store wait with selling until the previous store runs out of items? I think not. Instead, they will sell simultaneously and the time it takes for all units to be sold is the time of the store that takes longest. Try this instead:
for i in range(len(stores)):
    time = max(time, days_to_turn(allocations[i], stores[i]))
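Putting it together, the parallel-selling objective becomes (a sketch using the same helpers as the question):
def time_objective(allocations):
    # all stores sell simultaneously, so the sell-out time is
    # determined by the slowest store
    return max(days_to_turn(allocations[i], stores[i]) for i in range(len(stores)))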
