Python Mixed Integer Optimization - python

I'm new to mixed integer optimization problems. Currently, I'm using the PuLP Python interface with the default CBC solver to solve the problem.
The problem is to improve resource utilization in a cancer clinic model; the code with the objective function and constraints is below. After calling prob.solve(), I have 3 questions:
1. I get values of 1.0 for the BeginTreatment variables, but I never get 1.0 for the ContinueTreatment variables. Why?
2. Based on the chair continuity constraint, slots numbered above 29 should not be assigned to pat_type 8, as only 40 slots are available in total. But I still see such assignments (and not only for pat_type 8 but for others too). Why?
3. Should I try a different solver instead of PuLP's default CBC solver? If yes, how do I do that?
import pulp
# Indices and Parameters
chair = list(range(1,24))
pat_type = list(range(1,9))
slot = list(range(1,41))
type_and_demand = {1:24,2:10,3:13,4:9,5:7,6:6,7:2,8:1}
type_and_slots = {1:1,2:4,3:8,4:12,5:16,6:20,7:24,8:28}
# Decision Variables
Y = pulp.LpVariable.dicts("BeginTreatment",
                          (chair,pat_type,slot),0,1,pulp.LpBinary)
X = pulp.LpVariable.dicts("ContinueTreatment",
                          (chair,pat_type,slot),0,1,pulp.LpBinary)
# Objective Function
prob = pulp.LpProblem("ChairUtilization", pulp.LpMaximize)
prob += pulp.lpSum([Y[i][j][t] for i in chair for j in pat_type for t in slot])
# Constraints
# Patient Type 1 Continuity constraint
for i in chair:
    for t in slot:
        for j in range(1,2):
            prob += X[i][j][t] == 0
# Chair Continuity Constraint
for i in chair:
    for j in range(2,9):
        for t in range(1,(len(slot)-type_and_slots[j]+1)+1):
            prob += (pulp.lpSum([X[i][j][u] for u in range(t+1,t+type_and_slots[j])])
                     == (type_and_slots[j] - 1)*Y[i][j][t])
# No more than one patient per chair
for t in slot:
    for i in chair:
        prob += (pulp.lpSum([X[i][j][t] for j in pat_type])
                 + pulp.lpSum([Y[i][j][t] for j in pat_type]) <= 1)
# No new arrivals during lunch time period
prob += pulp.lpSum([Y[i][j][t] for i in chair for j in pat_type
                    for t in range(19,23)]) == 0
# Patient Mix
for j in pat_type:
    prob += pulp.lpSum([Y[i][j][t] for i in chair for t in slot]) == type_and_demand[j]
prob.solve()

Related

Python Gurobi - Constraints effecting decision variable

I have an optimization problem in which I minimize cost. I have a number of stations, a number of tasks and a number of types of machines. The tasks should be assigned to a station, where every station is of a certain type. My code looked like this:
import gurobipy as gp
from gurobipy import GRB, quicksum

C = 200
tasks = ['w01','w02','w03']
# Assumption: the post never defines `stations`; 1..3 matches the keys of f_cost.
stations = [1, 2, 3]
durations = {('BTC1','w01'):150, ('BTC1','w02'):115, ('BTC1','w03'):135,
             ('BOC1','w01'):150, ('BOC1','w02'):115, ('BOC1','w03'):135}
successors = {'w01':[],'w02':['w01'],'w03':['w02']}
types = ['BTC1','BOC1']
f_cost = {(1,'BTC1'): 50000,(1,'BOC1'): 40000,
          (2,'BTC1'): 50000,(2,'BOC1'): 40000,
          (3,'BTC1'): 50000,(3,'BOC1'): 40000}
edges, v_cost = gp.multidict({
    ('BTC1','w01'): [15],
    ('BTC1','w02'): [30],
    ('BTC1','w03'): [45],
    ('BOC1','w01'): [50],
    ('BOC1','w02'): [80],
    ('BOC1','w03'): [110]
})
x_arcs = [(i,j) for i in types for j in tasks]
y_arcs = [(i,j) for i in stations for j in types]
m = gp.Model()
y = m.addVars(y_arcs, vtype = GRB.BINARY, name = 'y')
x = m.addVars(x_arcs, vtype = GRB.CONTINUOUS, name = 'x',ub=1)
m.addConstrs(quicksum(x[i,j] for i in types) == 1 for j in tasks)
m.addConstrs(quicksum(durations[k,j]*x[k,j] for j in tasks) <= C*y[i,k] for i in stations for k in types)
m.addConstrs(quicksum(x[j,k] for j in types) <=
             quicksum(x[j,i] for j in types)
             for i in tasks for k in successors[i] if i!= 'w01')
m.setObjective(x.prod(v_cost) + y.prod(f_cost), GRB.MINIMIZE)
I checked the LP file and the solution, and everything was as expected except for one thing: the model was giving solutions like this in the .sol file:
y[1,BTC1] 1
y[1,BOC1] 1
y[2,BTC1] 1
y[2,BOC1] 1
y[3,BTC1] 1
y[3,BOC1] 1
But because I want every station to have only one type of machine, I added the following constraints:
m.addConstrs(quicksum(y[i,j] for j in types) == 1 for i in stations)
Looking at the solution and the LP file, these constraints also do what I want them to do.
But at the same time, the constraints force my x decision variables to take only the values 0 and 1, when they should be fractional values between 0 and 1. This works fine without the last constraints I added, but with them x is no longer a float between 0 and 1. Why do the constraints affect the fractionality of x in this manner? And how would I fix this?
Thanks in advance

How to formulate linear programming optimization in PuLP?

I am looking to formulate a (I think) complex LP problem in Python using PuLP.
The optimization goal is to maximize profit margin (aggregate acquisition cost vs aggregate sale revenue plus some future appreciation (FV)) on a basket of products for purchase.
The LP decision variables are pricing 'statistics' distinct for each product.
There are two constraints. The bid for a particular product cannot use a pricing statistic greater than some maximum value (1.0 in this case), and the aggregate ratio of sum(FV) of won items to net revenue must be <= -2.0.
The price I'd bid is a function of a theoretical price plus a theoretical future value (FV) minus a theoretical cost. These 3 inputs are static, but the pricing statistic scales (or weights) the impact of the FV on the bid, and it is this statistic that I'd like to solve for. Higher statistic -> higher bid. The trick is that once you change the statistic, you change the bid, and this changes the aggregates that PuLP is trying to optimize for. I figured this would be OK since the bid price is a closed-form linear formula, but please see below for how I tried to tackle it.
I also have the actual price the item sold for, so can compare the model's output price to the actual price to determine whether I would have bought in that case.
Concretely:
Bid[item j] = Theo price + (Statistic to be tuned[product i] * FV) - (Costs + Expenses)
There are 10 products to tune for, and j total items, non-uniformly distributed throughout a dataset.
If my output bid price based on the parameter being tuned in the LP > actual winning bid price, then consider the item purchased, and add it to the objective function.
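To make the bid rule concrete, here is a minimal plain-Python sketch of the formula and the win test described above. The function names `bid` and `is_won` and all the numbers are hypothetical, chosen only for illustration; this is not a PuLP model.

```python
def bid(theo_px, statistic, fv, cost, expense):
    """Bid = theo price + statistic * FV - (costs + expenses)."""
    return theo_px + statistic * fv - (cost + expense)


def is_won(bid_px, actual_px):
    """An item counts as purchased when the bid beats the actual winning price."""
    return bid_px > actual_px


# Hypothetical inputs: only the statistic differs between the two bids.
low = bid(theo_px=100.0, statistic=0.2, fv=10.0, cost=3.0, expense=1.0)
high = bid(theo_px=100.0, statistic=0.8, fv=10.0, cost=3.0, expense=1.0)

print(low, is_won(low, actual_px=100.0))
print(high, is_won(high, actual_px=100.0))
```

The bid itself is linear in the statistic, but the win flag is a step function of it: it flips from lost to won once the statistic crosses a threshold.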
Can someone please help me formulate this in PuLP? Maybe this is MIP? If so, I am unsure how to represent it formally.
What I have so far is the following:
from pulp import LpMaximize, LpProblem, LpStatus, lpSum, LpVariable, LpBinary
import pandas as pd
df= pd.read_excel('data.xlsx')
#create matrices and set variables
MAX_STAT = 1.0
RATIO_CONSTRAINT = -2.0
PRODUCTS = [0,1,2,3,4,5,6,7,8,9]
ITEMS = df['ITEMS'].tolist() # IDs
#1xj dicts
ITEM_PRODUCT = {ITEMS[i]:df['PRODUCT'].iloc[i] for i in range(len(df))}
ACTUAL_PX = {ITEMS[i]:df['ACTUAL_PX'].iloc[i] for i in range(len(df))}
COST = {ITEMS[i]:df['COST'].iloc[i] for i in range(len(df))}
EXPENSE = {ITEMS[i]:df['EXPENSE'].iloc[i] for i in range(len(df))}
#ixj dicts
THEO_PX = {ITEMS[i]:[df['THEO_PX'].iloc[i] if PRODUCTS[ITEMS[i]] == x else 0 for x in PRODUCTS] for i in range(len(df))}
QUANTITY = {ITEMS[i]:[df['QUANTITY'].iloc[i] if PRODUCTS[ITEMS[i]] == x else 0 for x in PRODUCTS] for i in range(len(df))}
FV = {ITEMS[i]:[df['FV'].iloc[i] if PRODUCTS[ITEMS[i]] == x else 0 for x in PRODUCTS] for i in range(len(df))}
use_vars = {j:[i if ITEM_PRODUCT[j] == i else 0 for i in PRODUCTS] for j in ITEMS}
#Define the model
model = LpProblem(name="maximize_margin", sense=LpMaximize)
#Define decision variables
strategy_statistic = LpVariable.dicts('StrategyStat', [(j,i) for j in ITEMS for i in PRODUCTS], 0, MAX_STAT)
#other variables dependent on the statistic
strategy_bid = {(j,i):strategy_statistic[(j,i)]*FV[j][i]+THEO_PX[j][i]-COST[j]-EXPENSE[j] for j in ITEMS for i in PRODUCTS}
win_loss = {(j,i):1 if strategy_bid[(j,i)] >= ACTUAL_PX[j] else 0 for j in ITEMS for i in PRODUCTS}
aggQuantity = lpSum(win_loss[(j,i)]*QUANTITY[j][i]*use_vars[j][i] for j in ITEMS for i in PRODUCTS)
aggTheo = lpSum(win_loss[(j,i)]*THEO_PX[j][i]*QUANTITY[j][i]*use_vars[j][i] for j in ITEMS for i in PRODUCTS)
aggFV = lpSum(win_loss[(j,i)]*FV[j][i]*QUANTITY[j][i]*use_vars[j][i] for j in ITEMS for i in PRODUCTS)
aggBidNotional = lpSum(win_loss[(j,i)]*strategy_bid[(j,i)]*QUANTITY[j][i]*use_vars[j][i] for j in ITEMS for i in PRODUCTS)
model += (aggTheo - aggBidNotional + aggFV)
model += (aggFV / (aggTheo - aggBidNotional)) <= RATIO_CONSTRAINT
Currently seeing an error on the last line saying that:
TypeError: Expressions cannot be divided by a non-constant expression
But I think there is more wrong with this formulation than that...

Team Allocation Optimisation with PuLP

Background: I am trying to allocate students to teams where each student will have a series of preferences to other students being in their teams. I have an objective function which I want to minimise along with 3 constraints for the function (written in the image below). In my DB I have a set of students along with their preferences (such as student i rating student j as their 3rd choice).
If student A rates student B as their 1st choice, that preference will have a weighting of 1 which is why the objective function is set to minimise.
Mathematical Formula:
Question: I am unsure whether I have written the constraints and variables correctly in PuLP, and I can't find any closely related resources that do team allocation with preferences. I'm very new to PuLP and am struggling to figure out whether what I've written is syntactically correct. Thanks for any help!
Here is the code that I have written in my file:
from pulp import *
model = LpProblem("Team Allocation Problem", LpMinimize)
############ VARIABLES ############
students = [1,...,20]
n = 20
# this will be imported from the database
r = [[...],...,[...]]
team_sizes = [5,5,5,5]
num_teams = len(z)
# x(ik) = 1 if student i in team k
x_vars = [[LpVariable("x%d%d" % (i,k), cat='Binary')
           for k in range(num_teams)]
          for i in range(num_students)]
# y(ijk) = 1 if student i and j are in team k
y_vars = [[[LpVariable("y%d%d%d" % (i,j,k), cat='Binary')
            for k in range(num_teams)]
           for j in range(num_students)]
          for i in range(num_students)]
############ OBJECTIVE FUNCTION ############
for i in range(num_students):
    for j in range(num_students):
        if i!=j:
            for k in range(num_teams):
                model += r[i][j] * y_vars[i][j][k], "Minimize the sum of rank points in the team"
############ CONSTRAINTS ############
# C1: Every student is on exactly one team
for i in range(num_students):
    for k in range(num_teams):
        model += lpSum(x_vars[i][k]) == 1
# C2: Every team has the right size
for k in range(num_teams):
    for i in range(num_students):
        model += lpSum(x_vars[i][k]) == team_sizes[k]
# C3:
for i in range(num_students):
    for j in range(num_students):
        if i != j:
            for k in range(num_teams):
                model += 1 + y_vars[i][j][k] >= x_vars[i][k] + x_vars[j][k]
# Solve and Print
model.solve()
print("Status:", model.status)
(1) Make sure the sense of the objective is correct. The way I read your problem, you should maximize.
(2) The proper linearization of
y(i,j,k) = x(i,k)*x(j,k)
is
y(i,j,k) <= x(i,k)
y(i,j,k) <= x(j,k)
y(i,j,k) >= x(i,k)+x(j,k)-1
Sometimes we can drop some of these constraints because of how the objective works. Make sure you have verified that you can indeed drop y(i,j,k) <= x(i,k) and y(i,j,k) <= x(j,k).
(3) This is (almost) the same question as Algorithms for optimal student seating arrangements.
(4) Regarding "I want to minimise the objective in this case; if someone rates someone as their first choice they'll essentially be given 1 point, 2 points for second, 3 points for third, etc.": you cannot have 0=no points, 1=best, 2=second best, ... in your formulation, because when minimizing, a pair with no rating (0 points) would then look better than a first choice. I suggest recoding your points: 0=no points, 1=ok, 2=better, 3=best (just preprocessing of the data). Then maximize instead of minimize. You can add -1, -2, ... for dislike if you want.
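The linearization in point (2) can be sanity-checked by brute force. The sketch below is plain Python rather than PuLP, and the helper names `feasible_y` and `feasible_y_lower_only` are made up for illustration. For each binary pair (x1, x2) it lists the binary values of y that the constraints allow.

```python
from itertools import product


def feasible_y(x1, x2):
    """Binary y values allowed by all three linearization constraints."""
    return [y for y in (0, 1)
            if y <= x1 and y <= x2 and y >= x1 + x2 - 1]


def feasible_y_lower_only(x1, x2):
    """Binary y values allowed when only y >= x1 + x2 - 1 is kept."""
    return [y for y in (0, 1) if y >= x1 + x2 - 1]


for x1, x2 in product((0, 1), repeat=2):
    print(x1, x2, feasible_y(x1, x2), feasible_y_lower_only(x1, x2))
```

With all three constraints the only feasible y is the product x1*x2. With the lower bound alone, y = 1 remains feasible even when the two students are not on the same team, so dropping the two upper bounds is only safe when the objective itself pushes y toward 0.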

Is there a better way to guess possible unknown variables without brute force than I am doing? Machine learning? [duplicate]

This question already has answers here:
How to approach a number guessing game (with a twist) algorithm?
(7 answers)
Closed 4 years ago.
I have a game with the following rules:
A user is given fruit prices and has a chance to buy or sell items in their fruit basket every turn.
The user cannot make more than a 10% total change in their basket on a single turn.
Fruit prices change every day and when multiplied by the quantities of items in the fruit basket, the total value of the basket changes relative to the fruit price changes every day as well.
The program is only given the current price of all the fruits and the current value of the basket (current price of fruit * quantities for all items in the basket).
Based on these 2 inputs(all fruit prices and basket total value), the program tries to guess what items are in the basket.
A basket cannot hold more than 100 items but slots can be empty
The player can play several turns.
My goal is to accurately guess as computationally inexpensively as possible (read: no brute force) and scale if there are thousands of new fruits.
I am struggling to find an answer but in my mind, it’s not hard. If I have the below table. I could study day 1 and get the following data:
Apple 1
Pears 2
Oranges 3
Basket Value = 217
I can do a back of napkin calculation and assume, the weights in the basket are: 0 apple, 83 pears, and 17 Oranges equaling a basket value of 217.
The next day, the values of the fruits and the basket change, to (apple = 2, pear = 3, orange = 5) with a basket value of 348. When I take my assumed weights above (0, 83, 17) I get a total value of 334, which is not correct! Running this through my script, I see the closest match is 0 apples, 76 pears, 24 oranges. Although that does equal 348, once the % change is factored in it's a 38% change, so it's not possible!
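The arithmetic in this example can be reproduced with a few lines of Python. The helper names `basket_value` and `units_changed` are hypothetical; the prices and quantities are the ones quoted above.

```python
def basket_value(quantities, prices):
    """Total value of a basket: sum of quantity * price per fruit."""
    return sum(q * p for q, p in zip(quantities, prices))


def units_changed(old, new):
    """Total number of fruit units bought or sold between two baskets."""
    return sum(abs(n - o) for o, n in zip(old, new))


day1_prices = (1, 2, 3)     # apples, pears, oranges
day2_prices = (2, 3, 5)
guess = (0, 83, 17)         # day 1 guess
closest = (0, 76, 24)       # closest day 2 match found by the script

print(basket_value(guess, day1_prices))    # matches the day 1 total
print(basket_value(guess, day2_prices))    # does not match the day 2 total
print(basket_value(closest, day2_prices))  # matches the day 2 total
print(units_changed(guess, closest))       # units moved between the two guesses
```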
I know I can completely brute force this, but if I have 1000 fruits it won't scale. Not to jump on any bandwagon, but can something like a neural net quickly rule out the unlikely so I can handle large volumes of data? I think there has to be a more scalable/quicker way than pure brute force. Or is there any other type of solution that could get the result?
Here is the raw data (remember program can only see prices and total basket value only):
Here's some brute force code (thank you @paul Hankin for a cleaner example than mine):
def possibilities(value, prices):
    for i in range(0, value+1, prices[0]):
        for j in range(0, value+1-i, prices[1]):
            k = value - i - j
            if k % prices[2] == 0:
                yield i//prices[0], j//prices[1], k//prices[2]
def merge_totals(last, this, r):
    ok = []
    for t in this:
        for l in last:
            f = int(sum(l) * r)
            if all(l[i] -f <= t[i] <= l[i] + f for i in range(len(l))):
                ok.append(t)
                break
    return ok
days = [
    (217, (1, 2, 3)),
    (348, (2, 3, 5)),
    (251, (1, 2, 4)),
]
ps = None
for i, d in enumerate(days):
    new_ps = list(possibilities(*d))
    if ps is None:
        ps = new_ps
    ps = merge_totals(ps, new_ps, 0.10)
    print('Day %d' % (i+1))
    for p in ps:
        print('Day %d,' % (i+1), 'apples: %s, pears: %s, oranges: %s' % p)
    print()
Update - The info so far is awesome. Does it make sense to break the problem into two problems? One is generating the possibilities, while the other is finding the relationship between the possibilities (no more than a 10% daily change). By ruling out possibilities, couldn't that also be used to generate only possibilities that are feasible to begin with? I'm still not sure of the approach, but I do feel the two problems are different but tightly related. Your thoughts?
Update 2 - There are a lot of questions about the % change. This is the total volume of items in the basket that can change. To use the game example, imagine the store says: you can sell/return/buy fruits, but they cannot be more than 10% of your last bill. So although the change in fruit prices can cause changes in your basket value, the user cannot take any action that would impact it by more than 10%. So if the value was 100, they can make changes that get it to 110 but not more.
I hate to let you down but I really don't think a neural net will help at all for this problem, and IMO the best answer to your question is the advice "don't waste your time trying neural nets".
An easy rule of thumb for deciding whether or not neural networks are applicable is to think, "can an average adult human solve this problem reasonably well in a few seconds?" For problems like "what's in this image", "respond to this question", or "transcribe this audio clip", the answer is yes. But for your problem, the answer is a most definite no.
Neural networks have limitations, and one is that they don't deal well with highly logical problems. This is because the answers are generally not "smooth". If you take an image and slightly change a handful of pixels, the content of the image is still the same. If you take an audio clip and insert a few milliseconds of noise, a neural net will probably still be able to figure out what's said. But in your problem, change a single day's "total basket value" by only 1 unit, and your answer(s) will drastically change.
It seems that the only way to solve your problem is with a "classical" algorithmic approach. As currently stated, there might not be any algorithm better than brute force, and it might not be possible to rule out much. For example, what if every day has the property that all fruits are priced the same? The count of each fruit can vary, as long as the total number of fruits is fixed, so the number of possibilities is still exponential in the number of fruits. If your goal is to "produce a list of possibilities", then no algorithm can be better than exponential time since this list can be exponentially large in some cases.
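To get a feel for how fast that list grows in the equal-price case, here is a small sketch. It assumes the total number of fruits is fixed and counts the ways to split it among the fruit kinds with the standard stars-and-bars formula; the function name `count_baskets` is made up for illustration.

```python
from math import comb


def count_baskets(total_fruits, kinds):
    """Ways to split `total_fruits` equally priced fruits among `kinds` fruit types."""
    return comb(total_fruits + kinds - 1, kinds - 1)


# A basket of 100 fruits with a growing number of fruit kinds.
for kinds in (3, 5, 10):
    print(kinds, count_baskets(100, kinds))
```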
It's interesting that part of your problem can be reduced to an integer linear program (ILP). Consider a single day, where you are given the basket total B and each fruit's cost c_i, for i=1 through i=n (if n is the total number of distinct fruits). Let's say the prices are large, so it's not obvious that you can "fill up" the basket with unit cost fruits. It can be hard in this situation to even find a single solution. Formulated as an ILP, this is equivalent to finding integer values of x_i such that:
sum_i (x_i*c_i) = x_1*c_1 + x_2*c_2 + ... + x_n*c_n = B
and x_i >= 0 for all 1 <= i <= n (can't have negative fruits), and sum_i x_i <= 100 (can have at most 100 fruits).
The good news is that decent ILP solvers exist -- you can just hand over the above formulas and the solver will do its best to find a single solution. You can even add an "objective function" that the solver will maximize or minimize -- minimizing sum_i x_i has the effect of minimizing the total number of fruits in the basket. The bad news is that ILP is NP-complete, so there is almost no hope of finding an efficient solution for a large number of fruits (which equals the number of variables x_i).
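For small basket totals, the same single-day model (non-negative integer counts, exact basket value, as few fruits as possible, at most 100 items) can be explored without a solver. The sketch below is a plain dynamic program, not an ILP solver, and the name `min_items_basket` is hypothetical. It assumes positive integer prices, and its running time grows with the basket total, so it only illustrates the formulation.

```python
def min_items_basket(prices, target, max_items=100):
    """Find counts x_i >= 0 with sum(x_i * prices[i]) == target and minimal sum(x_i).

    Returns a tuple of counts, or None when no basket with at most
    `max_items` items hits the target value exactly.
    """
    INF = float("inf")
    best = [INF] * (target + 1)   # best[v]: fewest items with total value v
    last = [None] * (target + 1)  # last[v]: index of the fruit added last
    best[0] = 0
    for v in range(1, target + 1):
        for i, p in enumerate(prices):
            if p <= v and best[v - p] + 1 < best[v]:
                best[v] = best[v - p] + 1
                last[v] = i
    if best[target] > max_items:
        return None
    # Walk back through `last` to recover the counts.
    counts = [0] * len(prices)
    v = target
    while v > 0:
        counts[last[v]] += 1
        v -= prices[last[v]]
    return tuple(counts)


print(min_items_basket((1, 2, 3), 217))
```

An ILP solver handles the same constraints declaratively and copes with extra constraints far more easily; the point here is only to show what "a single solution" looks like.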
I think the best approach forward is to try the ILP approach, but also introduce some more constraints on the scenario. For example, what if all fruits had a different prime number cost? This has the nice property that if you find one solution, you can enumerate a bunch of other related solutions. If an apple costs m and an orange costs n, where m and n are relatively prime, then you can "trade" n*x apples for m*x oranges without changing the basket total, for any integer x>0 (so long as you have enough apples and oranges to begin with). If you choose all fruits to have different prime number costs, then all of the costs will be pairwise relatively prime. I think this approach will result in relatively few solutions for a given day.
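The "trade" argument is easy to check numerically. In this sketch the prices m = 3 and n = 5 and the starting basket are hypothetical, and `trade` is an illustrative helper, not part of any library.

```python
from math import gcd


def trade(apples, oranges, apple_price, orange_price, x=1):
    """Swap orange_price*x apples for apple_price*x oranges (same total value)."""
    new_apples = apples - orange_price * x
    new_oranges = oranges + apple_price * x
    if new_apples < 0:
        raise ValueError("not enough apples to trade")
    return new_apples, new_oranges


def value(basket, apple_price, orange_price):
    return basket[0] * apple_price + basket[1] * orange_price


m, n = 3, 5                     # apple costs m, orange costs n
assert gcd(m, n) == 1           # relatively prime, as in the argument above
before = (20, 4)                # (apples, oranges)
after = trade(*before, m, n, x=2)

print(before, value(before, m, n))
print(after, value(after, m, n))
```

The basket value stays the same while the number of items changes, so any traded basket still has to respect the 100-item limit.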
You might also consider other constraints, such as "there can't be more than 5 fruits of a single kind in the basket" (add the constraint x_i <= 5), or "there can be at most 5 distinct kinds of fruits in the basket" (but this is harder to encode as an ILP constraint). Adding these kinds of constraints will make it easier for the ILP solver to find a solution.
Of course the above discussion is focused on a single day, and you have multiple days' worth of data. If the hardest part of the problem is finding any solution for any day at all (which happens if your prices are large), then using an ILP solver will give you a large boost. If solutions are easy to find (which happens if you have a very-low-cost fruit that can "fill up" your basket), and the hardest part of the problem is finding solutions that are "consistent" across multiple days, then the ILP approach might not be the best fit, and in general this problem seems much more difficult to reason about.
Edit: and as mentioned in the comments, for some interpretations of the "10% change" constraint, you can even encode the entire multi-day problem as an ILP.
It seems to me like your approach is reasonable, but whether it is depends on the size of the numbers in the actual game. Here's a complete implementation that's a lot more efficient than yours (but still has plenty of scope for improvement). It keeps a list of possibilities for the previous day, and then filters the current day amounts to those that are within 5% of some possibility from the previous day, and prints them out per day.
def possibilities(value, prices):
    for i in range(0, value+1, prices[0]):
        for j in range(0, value+1-i, prices[1]):
            k = value - i - j
            if k % prices[2] == 0:
                yield i//prices[0], j//prices[1], k//prices[2]
def merge_totals(last, this, r):
    ok = []
    for t in this:
        for l in last:
            f = int(sum(l) * r)
            if all(l[i] -f <= t[i] <= l[i] + f for i in range(len(l))):
                ok.append(t)
                break
    return ok
days = [
    (26, (1, 2, 3)),
    (51, (2, 3, 4)),
    (61, (2, 4, 5)),
]
ps = None
for i, d in enumerate(days):
    new_ps = list(possibilities(*d))
    if ps is None:
        ps = new_ps
    ps = merge_totals(ps, new_ps, 0.05)
    print('Day %d' % (i+1))
    for p in ps:
        print('apples: %s, pears: %s, oranges: %s' % p)
    print()
Problem Framing
This problem can be described as a combinatorial optimization problem. You're trying to find an optimal object (a combination of fruit items) from a finite set of objects (all possible combinations of fruit items). With the proper analogy and transformations, we can reduce this fruit basket problem to the well known, and extensively studied (since 1897), knapsack problem.
Solving this class of optimization problems is NP-hard. The decision problem of answering "Can we find a combination of fruit items with a value of X?" is NP-complete. Since you want to account for a worst case scenario when you have thousands of fruit items, your best bet is to use a metaheuristic, like evolutionary computation.
Proposed Solution
Evolutionary computation is a family of biologically inspired metaheuristics. They work by revising and mixing (evolving) the most fit candidate solutions based on a fitness function and discarding the least fit ones over many iterations. The higher the fitness of a solution, the more likely it will reproduce similar solutions and survive to the next generation (iteration). Eventually, a local or global optimal solution is found.
These methods provide a needed compromise when the search space is too large to cover with traditional closed form mathematical solutions. Due to the stochastic nature of these algorithms, different executions of the algorithms may lead to different local optima, and there is no guarantee that the global optimum will be found. The odds are good in our case since we have multiple valid solutions.
Example
Let's use the Distributed Evolutionary Algorithms in Python (DEAP) framework and retrofit their Knapsack problem example to our problem. In the code below we apply a strong penalty to baskets with more than 100 items. This severely reduces their fitness and gets them removed from the population pool within one or two generations. There are other valid ways to handle constraints.
# This file is part of DEAP.
#
# DEAP is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as
# published by the Free Software Foundation, either version 3 of
# the License, or (at your option) any later version.
#
# DEAP is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with DEAP. If not, see <http://www.gnu.org/licenses/>.
import random
import numpy as np
from deap import algorithms
from deap import base
from deap import creator
from deap import tools
IND_INIT_SIZE = 5 # Calls to `individual` function
MAX_ITEM = 100 # Max 100 fruit items in basket
NBR_ITEMS = 50 # Start with 50 items in basket
FRUIT_TYPES = 10 # Number of fruit types (apples, bananas, ...)
# Generate a dictionary of random fruit prices.
fruit_price = {i: random.randint(1, 5) for i in range(FRUIT_TYPES)}
# Create fruit items dictionary. The key is item ID, and the
# value is a (weight, price) tuple. Weight is always 1 here.
items = {}
# Create the items and store them in the items dictionary.
# Assumption: item i is a fruit of type i % FRUIT_TYPES.
for i in range(NBR_ITEMS):
    items[i] = (1, fruit_price[i % FRUIT_TYPES])
# Create fitness function and an individual (solution candidate)
# A solution candidate in our case is a collection of fruit items.
creator.create("Fitness", base.Fitness, weights=(-1.0, 1.0))
creator.create("Individual", set, fitness=creator.Fitness)
toolbox = base.Toolbox()
# Randomly initialize the population (a set of candidate solutions)
toolbox.register("attr_item", random.randrange, NBR_ITEMS)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr_item, IND_INIT_SIZE)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
def evalBasket(individual):
    """Evaluate the value of the basket and
    apply constraints penalty.
    """
    value = 0 # Total value of the basket
    for item in individual:
        value += items[item][1]
    # Heavily penalize baskets with more than MAX_ITEM items
    if len(individual) > MAX_ITEM:
        return 10000, 0
    return len(individual), value # (items in basket, value of basket)
def cxSet(ind1, ind2):
    """Apply a crossover operation on input sets.
    The first child is the intersection of the two sets,
    the second child is the difference of the two sets.
    This is one way to evolve new candidate solutions from
    existing ones. Think of it as parents mixing their genes
    to produce a child.
    """
    temp = set(ind1) # Used in order to keep type
    ind1 &= ind2 # Intersection (inplace)
    ind2 ^= temp # Symmetric Difference (inplace)
    return ind1, ind2
def mutSet(individual):
    """Mutation that pops or adds an element.
    In nature, gene mutations help offspring express new traits
    not found in their ancestors. That could be beneficial or
    harmful. Survival of the fittest at play here.
    """
    if random.random() < 0.5: # 50% chance: remove an item, otherwise add one
        if len(individual) > 0:
            individual.remove(random.choice(sorted(tuple(individual))))
    else:
        individual.add(random.randrange(NBR_ITEMS))
    return individual,
# Register evaluation, mating, mutation and selection functions
# so the framework can use them to run the simulation.
toolbox.register("evaluate", evalBasket)
toolbox.register("mate", cxSet)
toolbox.register("mutate", mutSet)
toolbox.register("select", tools.selNSGA2)
def main():
    random.seed(64)
    NGEN = 50
    MU = 50
    LAMBDA = 100
    CXPB = 0.7
    MUTPB = 0.2
    pop = toolbox.population(n=MU) # Initial population size
    hof = tools.ParetoFront() # Using Pareto front to rank fitness
    # Keep track of population fitness stats which should
    # improve over generations (iterations).
    stats = tools.Statistics(lambda ind: ind.fitness.values)
    stats.register("avg", np.mean, axis=0)
    stats.register("std", np.std, axis=0)
    stats.register("min", np.min, axis=0)
    stats.register("max", np.max, axis=0)
    algorithms.eaMuPlusLambda(pop, toolbox, MU, LAMBDA,
                              CXPB, MUTPB, NGEN, stats,
                              halloffame=hof)
    return pop, stats, hof
if __name__ == "__main__":
    main()
Not an answer, but an attempt to make the one piece of information about what "% change" might be supposed to mean (the sum of the changes in the count of each item, computed backwards) more accessible to non-believers in pixel heaps:
        |      Day 1      !      Day 2 change     !      Day 3 change     !      Day 4 change
        | $/1 |  #  |  $  ! $/1 |  #  |  %   |  $  ! $/1 |  #  |  %   |  $  ! $/1 |  #  |  %   |  $
Apples  |   1 |  20 |  20 !   2 |  21 | 4.76 |  42 !   1 |  21 | 0    |  21 !   1 |  22 | 4.55 |  22
Pears   |   2 |  43 |  86 !   3 |  42 | 2.38 | 126 !   2 |  43 | 2.33 |  86 !   2 |  43 | 0    |  86
Oranges |   3 |  37 | 111 !   5 |  36 | 2.78 | 180 !   4 |  36 | 0    | 144 !   3 |  35 | 2.86 | 105
Total   |     | 100 | 217 !     | 100 | 9.92 | 348 !     | 100 | 2.33 | 251 !     | 100 | 7.40 | 213
Integer Linear Programming Approach
This sets up naturally as a multi-step Integer Program, with the holdings in {apples, pears, oranges} from the previous step factoring in the calculation of the relative change in holdings that must be constrained. There is no notion of optimal here, but we can turn the "turnover" constraint into an objective and see what happens.
The solutions provided improve on those in your chart above, and are minimal in the sense of total change in basket holdings.
Comments -
I don't know how you calculated the "% change" column in your table. A change from Day 1 to Day 2 of 20 apples to 21 apples is a 4.76% change?
On all days, your total holdings in fruits is exactly 100. There is a constraint that the sum of holdings is <= 100. No violation, I just want to confirm.
We can set this up as an integer linear program, using the integer optimization routine from ortools. I haven't used an ILP solver for a long time, and this one is kind of flaky I think (the solver.OPTIMAL flag is never true it seems, even for toy problems. In addition the ortools LP solver fails to find an optimal solution in cases where scipy.linprog works without a hitch)
h1,d = holdings in apples (number of apples) at end of day d
h2,d = holdings in pears at end of day d
h3,d = holdings in oranges at end of day d
I'll give two proposals here, one which minimizes the l1 norm of the absolute error, the other the l0 norm.
The l1 solution finds the minimum of abs(h1,(d+1) - h1,d)/h1,d + ... + abs(h3,(d+1) - h3,d)/h3,d, hoping that the constraint that each relative change in holdings is under 10% is satisfied if the sum of the relative changes in holdings is minimized.
The only thing that prevents this from being a linear program (aside from the integer requirement) is the nonlinear objective function. No problem, we can introduce slack variables and make everything linear. For the l1 formulation, 6 additional slack variables are introduced, 2 per fruit, and 6 additional inequality constraints. For the l0 formulation, 1 slack variable is introduced, and 6 additional inequality constraints.
This is a two step process: for example, we replace (apples_new - apples_old)/apples_old with a variable e and add inequality constraints to ensure that e measures what we'd like. We then replace e with (e+ - e-), with each of e+, e- >= 0. It can be shown that at the optimum one of e+, e- will be 0, so that (e+ + e-) is the absolute value of e. That way the pair (e+, e-) can represent a positive or negative number. Standard stuff, but it adds a bunch of variables and constraints. I can explain this in a bit more detail if necessary.
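The positive/negative split can be illustrated outside of any solver. In the sketch below, `split` is a hypothetical helper that picks the pair a minimizing solver would end up with (at most one part nonzero); inside the actual program the solver chooses e+ and e- itself.

```python
def split(e):
    """Write e as e_plus - e_minus with both parts >= 0 and at most one nonzero."""
    return max(e, 0), max(-e, 0)


for e in (-7, 0, 4):
    e_plus, e_minus = split(e)
    assert e_plus >= 0 and e_minus >= 0
    assert e_plus - e_minus == e        # the pair represents e itself
    assert e_plus + e_minus == abs(e)   # and the sum of the parts is |e|
    print(e, e_plus, e_minus)
```

Because the objective minimizes e+ + e-, a solution with both parts positive can always be improved by lowering both by the same amount, which is why one of them ends up at 0.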
import numpy as np
from ortools.linear_solver import pywraplp
def fruit_basket_l1_ortools():
    UPPER_BOUND = 1000
    prices = [[2,3,5],
              [1,2,4],
              [1,2,3]]
    holdings = [20,43,37]
    values = [348, 251, 213]
    for day in range(len(values)):
        solver = pywraplp.Solver('ILPSolver',
                                 pywraplp.Solver.BOP_INTEGER_PROGRAMMING)
        # solver = pywraplp.Solver('ILPSolver',
        #                          pywraplp.Solver.CLP_LINEAR_PROGRAMMING)
        c = ([1,1] * 3) + [0,0,0]
        price = prices[day]
        value = values[day]
        A_eq = [[ 0, 0, 0, 0, 0, 0, price[0], price[1], price[2]]]
        b_eq = [value]
        A_ub = [[-1*holdings[0], 1*holdings[0], 0, 0, 0, 0, 1.0, 0, 0],
                [-1*holdings[0], 1*holdings[0], 0, 0, 0, 0, -1.0, 0, 0],
                [ 0, 0, -1*holdings[1], 1*holdings[1], 0, 0, 0, 1.0, 0],
                [ 0, 0, -1*holdings[1], 1*holdings[1], 0, 0, 0, -1.0, 0],
                [ 0, 0, 0, 0, -1*holdings[2], 1*holdings[2], 0, 0, 1.0],
                [ 0, 0, 0, 0, -1*holdings[2], 1*holdings[2], 0, 0, -1.0]]
        b_ub = [1*holdings[0], -1*holdings[0], 1*holdings[1], -1*holdings[1], 1*holdings[2], -1*holdings[2]]
        num_vars = len(c)
        num_ineq_constraints = len(A_ub)
        num_eq_constraints = len(A_eq)
        data = [[]] * num_vars
        data[0] = solver.IntVar( 0, UPPER_BOUND, 'e1_p')
        data[1] = solver.IntVar( 0, UPPER_BOUND, 'e1_n')
        data[2] = solver.IntVar( 0, UPPER_BOUND, 'e2_p')
        data[3] = solver.IntVar( 0, UPPER_BOUND, 'e2_n')
        data[4] = solver.IntVar( 0, UPPER_BOUND, 'e3_p')
        data[5] = solver.IntVar( 0, UPPER_BOUND, 'e3_n')
        data[6] = solver.IntVar( 0, UPPER_BOUND, 'x1')
        data[7] = solver.IntVar( 0, UPPER_BOUND, 'x2')
        data[8] = solver.IntVar( 0, UPPER_BOUND, 'x3')
        constraints = [0] * (len(A_ub) + len(b_eq))
        # Inequality constraints
        for i in range(0,num_ineq_constraints):
            constraints[i] = solver.Constraint(-solver.infinity(), b_ub[i])
            for j in range(0,num_vars):
                constraints[i].SetCoefficient(data[j], A_ub[i][j])
        # Equality constraints
        for i in range(num_ineq_constraints, num_ineq_constraints+num_eq_constraints):
            constraints[i] = solver.Constraint(b_eq[i-num_ineq_constraints], b_eq[i-num_ineq_constraints])
            for j in range(0,num_vars):
                constraints[i].SetCoefficient(data[j], A_eq[i-num_ineq_constraints][j])
        # Objective function
        objective = solver.Objective()
        for i in range(0,num_vars):
            objective.SetCoefficient(data[i], c[i])
        # Set up as minimization problem
        objective.SetMinimization()
        # Solve it
        result_status = solver.Solve()
        solution_set = [data[i].solution_value() for i in range(len(data))]
        print('DAY: {}'.format(day+1))
        print('======')
        print('SOLUTION FEASIBLE: {}'.format(solver.FEASIBLE))
        print('SOLUTION OPTIMAL: {}'.format(solver.OPTIMAL))
        print('VALUE OF BASKET: {}'.format(np.dot(A_eq[0], solution_set)))
        print('SOLUTION (apples,pears,oranges): {!r}'.format(solution_set[-3:]))
        print('PCT CHANGE (apples,pears,oranges): {!r}\n\n'.format([round(100*(x-y)/y,2) for x,y in zip(solution_set[-3:], holdings)]))
        # Update holdings for the next day
        holdings = solution_set[-3:]
A single run gives:
DAY: 1
======
SOLUTION FEASIBLE: 1
SOLUTION OPTIMAL: 0
VALUE OF BASKET: 348.0
SOLUTION (apples,pears,oranges): [20.0, 41.0, 37.0]
PCT CHANGE (apples,pears,oranges): [0.0, -4.65, 0.0]
DAY: 2
======
SOLUTION FEASIBLE: 1
SOLUTION OPTIMAL: 0
VALUE OF BASKET: 251.0
SOLUTION (apples,pears,oranges): [21.0, 41.0, 37.0]
PCT CHANGE (apples,pears,oranges): [5.0, 0.0, 0.0]
DAY: 3
======
SOLUTION FEASIBLE: 1
SOLUTION OPTIMAL: 0
VALUE OF BASKET: 213.0
SOLUTION (apples,pears,oranges): [20.0, 41.0, 37.0]
PCT CHANGE (apples,pears,oranges): [-4.76, 0.0, 0.0]
The l0 formulation is also presented:
def fruit_basket_l0_ortools():
    UPPER_BOUND = 1000
    prices = [[2,3,5],
              [1,2,4],
              [1,2,3]]
    holdings = [20,43,37]
    values = [348, 251, 213]
    for day in range(len(values)):
        solver = pywraplp.Solver('ILPSolver',
                                 pywraplp.Solver.BOP_INTEGER_PROGRAMMING)
        # solver = pywraplp.Solver('ILPSolver',
        #                          pywraplp.Solver.CLP_LINEAR_PROGRAMMING)
        # Variable order: e, x1, x2, x3
        c = [1, 0, 0, 0]
        price = prices[day]
        value = values[day]
        A_eq = [[0, price[0], price[1], price[2]]]
        b_eq = [value]
        A_ub = [[-1*holdings[0], 1.0, 0, 0],
                [-1*holdings[0], -1.0, 0, 0],
                [-1*holdings[1], 0, 1.0, 0],
                [-1*holdings[1], 0, -1.0, 0],
                [-1*holdings[2], 0, 0, 1.0],
                [-1*holdings[2], 0, 0, -1.0]]
        b_ub = [holdings[0], -1*holdings[0], holdings[1], -1*holdings[1], holdings[2], -1*holdings[2]]
        num_vars = len(c)
        num_ineq_constraints = len(A_ub)
        num_eq_constraints = len(A_eq)
        data = [[]] * num_vars
        data[0] = solver.IntVar(-UPPER_BOUND, UPPER_BOUND, 'e' )
        data[1] = solver.IntVar( 0, UPPER_BOUND, 'x1')
        data[2] = solver.IntVar( 0, UPPER_BOUND, 'x2')
        data[3] = solver.IntVar( 0, UPPER_BOUND, 'x3')
        constraints = [0] * (len(A_ub) + len(b_eq))
        # Inequality constraints
        for i in range(0,num_ineq_constraints):
            constraints[i] = solver.Constraint(-solver.infinity(), b_ub[i])
            for j in range(0,num_vars):
                constraints[i].SetCoefficient(data[j], A_ub[i][j])
        # Equality constraints
        for i in range(num_ineq_constraints, num_ineq_constraints+num_eq_constraints):
            constraints[i] = solver.Constraint(int(b_eq[i-num_ineq_constraints]), b_eq[i-num_ineq_constraints])
            for j in range(0,num_vars):
                constraints[i].SetCoefficient(data[j], A_eq[i-num_ineq_constraints][j])
        # Objective function
        objective = solver.Objective()
        for i in range(0,num_vars):
            objective.SetCoefficient(data[i], c[i])
        # Set up as minimization problem
        objective.SetMinimization()
        # Solve it
        result_status = solver.Solve()
        solution_set = [data[i].solution_value() for i in range(len(data))]
        print('DAY: {}'.format(day+1))
        print('======')
        print('SOLUTION FEASIBLE: {}'.format(solver.FEASIBLE))
        print('SOLUTION OPTIMAL: {}'.format(solver.OPTIMAL))
        print('VALUE OF BASKET: {}'.format(np.dot(A_eq[0], solution_set)))
        print('SOLUTION (apples,pears,oranges): {!r}'.format(solution_set[-3:]))
        print('PCT CHANGE (apples,pears,oranges): {!r}\n\n'.format([round(100*(x-y)/y,2) for x,y in zip(solution_set[-3:], holdings)]))
        # Update holdings for the next day
        holdings = solution_set[-3:]
A single run of this gives
DAY: 1
======
SOLUTION FEASIBLE: 1
SOLUTION OPTIMAL: 0
VALUE OF BASKET: 348.0
SOLUTION (apples,pears,oranges): [33.0, 79.0, 9.0]
PCT CHANGE (apples,pears,oranges): [65.0, 83.72, -75.68]
DAY: 2
======
SOLUTION FEASIBLE: 1
SOLUTION OPTIMAL: 0
VALUE OF BASKET: 251.0
SOLUTION (apples,pears,oranges): [49.0, 83.0, 9.0]
PCT CHANGE (apples,pears,oranges): [48.48, 5.06, 0.0]
DAY: 3
======
SOLUTION FEASIBLE: 1
SOLUTION OPTIMAL: 0
VALUE OF BASKET: 213.0
SOLUTION (apples,pears,oranges): [51.0, 63.0, 12.0]
PCT CHANGE (apples,pears,oranges): [4.08, -24.1, 33.33]
Summary
The l1 formulation gives more sensible results, with much lower turnover. The optimality check appears to fail on all runs, but note that the code prints the constants solver.FEASIBLE (1) and solver.OPTIMAL (0) rather than comparing result_status against them, so those two lines say nothing about the actual solve status. I included a linear solver too and that fails the feasibility check somehow, I don't know why. The Google people provide precious little documentation for the ortools lib, and most of it is for the C++ lib. But the l1 formulation may be a solution to your problem, and it may scale. ILP is in general NP-complete, and so is your problem most likely.
Also - does a solution exist on day 2? How do you define % change so that it does in your chart above? If I knew I could recast the inequalities above and we would have the general solution.
You have a logic problem on integers, not a representation problem. Neural networks are relevant to problems with complex representations (e.g., an image with pixels, objects of different shapes and colors, sometimes hidden, etc.), as they build their own set of features (descriptors) and mipmaps; they are also a good match for problems dealing with reals, not integers; and lastly, as they are today, they don't really deal with reasoning and logic, or at most with simple logic like a short succession of if/else or switch, and we don't really have control over that.
What I see is closer to a cryptographic-ish problem with constraints (10% change, max 100 articles).
Solution for all sets of fruits
There is a way to reach all solutions very quickly. We start by factoring the total into primes, then we find a few solutions through brute force. From there we can exchange sets of fruits with equal total value, e.g., we exchange 1 orange for 1 apple and 1 pear with prices = (1,2,3). This way we can navigate through solutions without having to go through brute force.
Algorithm(s): you factorize the total into prime numbers, then you split the factors into two or more groups; let's take 2 groups: let A be one common multiplier, and let B be the other(s). Then you can add your fruits to reach the total B.
Examples:
Day 1: Apple = 1, Pears = 2, Oranges = 3, Basket Value = 217
Day 2: Apple = 2, Pears = 3, Oranges = 5, Basket Value = 348
217 factorizes into [7, 31]. We pick 31 as A (common multiplier), then let's say 7=3*2+1 (2 oranges, 0 pears, 1 apple), and you get an answer: 62 oranges, 0 pears, 31 apples. 62+31<100: valid.
348 factorizes into [2, 2, 3, 29]. You have several ways to group your factors and multiply your fruits inside this. The multiplier can be 29 (or 2*29, etc.), then you pick your fruits to reach 12. Let's say 12=2*2+3+5. You get (2 apples, 1 pear, 1 orange) * 29, but that's more than 100 articles. You can recursively fuse 1 apple and 1 pear into 1 orange until you are below 100 articles, or you can go directly to the solution with a minimum of articles: (2 oranges, 1 apple)*29 = (58 oranges, 29 apples). And at last:
-- 87<100 valid;
-- the change is (-4 oranges, -2 apples), 6/93=6.45% <10% change: valid.
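The "fruit exchange" navigation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the helper names zero_value_moves and neighbours and the reach parameter are hypothetical, and moves are found by a tiny brute-force search over small swaps.

```python
from itertools import product

def zero_value_moves(prices, reach=1):
    """Return every nonzero move (one entry per fruit, each in -reach..reach)
    whose total price is 0, so applying it keeps the basket value."""
    span = range(-reach, reach + 1)
    return [d for d in product(span, repeat=len(prices))
            if any(d) and sum(p * x for p, x in zip(prices, d)) == 0]

def neighbours(basket, prices, max_items=100, reach=1):
    """Yield baskets one move away that stay non-negative and below max_items."""
    for move in zero_value_moves(prices, reach):
        candidate = tuple(b + m for b, m in zip(basket, move))
        if min(candidate) >= 0 and sum(candidate) < max_items:
            yield candidate

prices = (1, 2, 3)     # apple, pear, orange (day 1 in the example)
start = (31, 0, 62)    # 31 apples, 0 pears, 62 oranges -> value 217
for basket in neighbours(start, prices):
    value = sum(p * x for p, x in zip(prices, basket))
    print(basket, value)   # prints (32, 1, 61) 217
```

With prices (1,2,3) and reach=1 the only value-preserving moves are "1 apple + 1 pear for 1 orange" and its reverse, which matches the exchange used in the text. A larger reach would expose more moves at the cost of a bigger search.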
Code
Remark: no implementation of the 10% variation
Remark: I didn't implement the "fruit exchange" process that allows the "solution navigation"
Run with python -O solution.py to optimize and remove the debug messages.
def prime_factors(n):
    i = 2
    factors = []
    while i * i <= n:
        if n % i:
            i += 1
        else:
            n //= i
            factors.append(i)
    if n > 1:
        factors.append(n)
    return factors

def possibilities(value, prices):
    for i in range(0, value + 1, prices[0]):
        for j in range(0, value + 1-i, prices[1]):
            k = value - i - j
            if k % prices[2] == 0:
                yield i//prices[0], j//prices[1], k//prices[2]

days = [
    (217, (1, 2, 3)),
    (348, (2, 3, 5)),
    (251, (1, 2, 4)),
    (213, (1, 2, 3)),
]

for day in days:
    total = day[0]
    (priceApple, pricePear, priceOrange) = day[1]
    factors = prime_factors(total)
    if __debug__:
        print(str(total) + " -> " + str(factors))
    # remove small article to help factorize (odd helper)
    evenHelper = False
    if len(factors) == 1 :
        evenHelper = True
        t1 = total - priceApple
        factors = prime_factors(t1)
        if __debug__:
            print(str(total) + " --> " + str(factors))
    # merge factors on left
    while factors[0] < priceOrange :
        factors = [factors[0] * factors[1]] + factors[2:]
        if __debug__:
            print("merging: " + str(factors))
    # merge factors on right
    if len(factors) > 2:
        multiplier = 1
        for f in factors[1:]:
            multiplier *= f
        factors = [factors[0]] + [multiplier]
    (smallTotal, multiplier) = factors
    if __debug__:
        print("final factors: " + str(smallTotal) + " (small total) , " + str(multiplier) + " (multiplier)")
    # solutions satisfying #<100
    smallMax = 100 / multiplier
    solutions = [o for o in possibilities(smallTotal, day[1]) if sum(o) < smallMax ]
    for solution in solutions:
        (a,p,o) = [i * multiplier for i in solution]
        # if we used it, we need to add back the odd helper to reach the actual solution
        if evenHelper:
            a += 1
        print(str(a) + " apple(s), " + str(p) + " pear(s), " + str(o) + " orange(s)")
    # separating solutions
    print()
I timed the program with a 10037 total with (5, 8, 17) prices, and maximum 500 articles: it's about 2ms (on i7 6700k). The "solution navigation" process is very simple and shouldn't add significant time.
There might be a heuristic to go from day to day without having to do the factorization + navigation + validation process. I'll think about it.
I know it's a bit late, but I thought this was an interesting problem and that I might as well add my two cents.
My code:
import math

prices = [1, 2, 3]
basketVal = 217
maxFruits = 100
numFruits = len(prices)

## Get the possible baskets
def getPossibleBaskets(maxFruits, numFruits, basketVal, prices):
    possBaskets = []
    # Use maxFruits rather than a hard-coded 100 so the argument takes effect
    for i in range(maxFruits + 1):
        for j in range(maxFruits + 1):
            for k in range(maxFruits + 1):
                if i + j + k > maxFruits:
                    pass
                else:
                    possibleBasketVal = 0
                    for m in range(numFruits):
                        possibleBasketVal += (prices[m] * [i, j, k][m])
                        if possibleBasketVal > basketVal:
                            break
                    if possibleBasketVal == basketVal:
                        possBaskets.append([i, j, k])
    return possBaskets

firstDayBaskets = getPossibleBaskets(maxFruits, numFruits, basketVal, prices)

## Compare the baskets for percentage change and filter out the values
while True:
    prices = list(map(int, input("New Prices:\t").split()))
    basketVal = int(input("New Basket Value:\t"))
    maxFruits = int(input("Max Fruits:\t"))
    numFruits = len(prices)
    secondDayBaskets = getPossibleBaskets(maxFruits, numFruits, basketVal, prices)
    possBaskets = []
    for basket in firstDayBaskets:
        for newBasket in secondDayBaskets:
            if newBasket not in possBaskets:
                percentChange = 0
                for n in range(numFruits):
                    # Assumed definition: fruits swapped, as a percentage of maxFruits.
                    # (Dividing by 100 without scaling to percent let every basket pass.)
                    percentChange += 100 * abs(basket[n] - newBasket[n]) / maxFruits
                if percentChange <= 10:
                    possBaskets.append(newBasket)
    firstDayBaskets = possBaskets
    secondDayBaskets = []
    print(firstDayBaskets)
I guess this could be called a brute force solution, but it definitely works. Every day, it'll print the possible configurations of the basket.

Python/Biomolecular Physics- Trying to code a simple stochastic simulation of a system exhibiting conditional behavior!

*edited 6/17/10
I'm trying to understand how to improve my code (make it more pythonic). I'm also interested in writing more intuitive 'conditionals' to describe scenarios that are commonplace in biochemistry. I've explained the conditional criteria in the program below in Answer #2, but I am not satisfied with the code: it works fine, but it isn't obvious and isn't easy to extend to more complicated conditional scenarios. Ideas, comments, and criticisms are welcome. This is my first posting experience at stackoverflow, so please comment on etiquette if needed.
The code generates a list of values that are the solution to the following exercise:
"In a programming language of your choice, implement Gillespie’s First Reaction Algorithm to study the temporal behaviour of the reaction A--->B in which the transition from A to B can only take place if another compound, C, is present, and where C dynamically interconverts with D, as modelled in the Petri-net below. Assume that there are 100 molecules of A, 1 of C, and no B or D present at the start of the reaction. Set kAB to 0.1 s^-1 and both kCD and kDC to 1.0 s^-1. Simulate the behaviour of the system over 100 s."
import math
import random

def sim():
    # Set the rate constants for all transitions
    kAB = 0.1
    kCD = 1.0
    kDC = 1.0
    # Set up the initial state
    A = 100
    B = 0
    C = 1
    D = 0
    # Set the start and end times
    t = 0.0
    tEnd = 100.0
    print("Time\t", "Transition\t", "A\t", "B\t", "C\t", "D")
    # Compute the first interval
    transition, interval = transitionData(A, B, C, D, kAB, kCD, kDC)
    # Loop until the end time is exceeded or no transition can fire any more
    while t <= tEnd and transition >= 0:
        print(t, '\t', transition, '\t', A, '\t', B, '\t', C, '\t', D)
        t += interval
        if transition == 0:
            A -= 1
            B += 1
        if transition == 1:
            C -= 1
            D += 1
        if transition == 2:
            C += 1
            D -= 1
        transition, interval = transitionData(A, B, C, D, kAB, kCD, kDC)

def transitionData(A, B, C, D, kAB, kCD, kDC):
    """ Returns nTransition, the number of the firing transition (0: A->B,
    1: C->D, 2: D->C), and interval, the interval between the time of
    the previous transition and that of the current one. """
    RAB = kAB * A * C
    RCD = kCD * C
    RDC = kDC * D
    # -1.0 marks a transition that cannot fire (its rate is zero)
    dt = [-1.0, -1.0, -1.0]
    if RAB > 0.0:
        dt[0] = -math.log(1.0 - random.random())/RAB
    if RCD > 0.0:
        dt[1] = -math.log(1.0 - random.random())/RCD
    if RDC > 0.0:
        dt[2] = -math.log(1.0 - random.random())/RDC
    # Pick the transition with the smallest positive waiting time
    interval = 1e36
    transition = -1
    for n in range(len(dt)):
        if dt[n] > 0.0 and dt[n] < interval:
            interval = dt[n]
            transition = n
    return transition, interval

if __name__ == '__main__':
    sim()
not sure if you've seen this.
http://stompy.sourceforge.net/html/userguide_doc.html
I work on similar stuff and I happened to discover this lately.
Info on the math behind simple stochastic simulation of chemical rxns:
Typically, processes like this are simulated as discrete events, with each event occurring with probability 'P' given a specific rate constant 'k' and a number of possible events 'n' in the interval of time 'dt': P = 1 - e**(-k*n*dt). Here we are neglecting the actual duration of each event (~0) and focusing instead on the interval of time in which the event occurs. Anyone familiar with N choose K problems/Bernoulli trials will appreciate the presence of 1/e: e.g., when N=K and N->oo, the probability of not choosing a specific element from N approaches 1/e. Hence, in a stochastic chemical reaction (first order), the probability that a molecule will not undergo reaction (not be chosen) is some power of 1/e, with that power dependent on the time interval, the rate constant, and the number of molecules in question. Conversely, 1-(1/e)^(k*n*dt) gives the probability that any specific molecule will react (be chosen).
In terms of simulation, it would be logical to divide our total time interval into ever smaller intervals and use a random number generator to predict whether an event happened in a given time interval. E.g., if we divided the dt for a single event into 10 smaller intervals, a number between 0 and 0.1 would indicate an event occurred, while a number between 0.1 and 1.0 would indicate it did not. There is, however, uncertainty as to exactly when the event occurred, so we must make our intervals smaller; this quickly becomes a losing battle, as uncertainty persists with this method.
The solution to this problem is to take the natural log (‘ln’ here, log() by default in py) of both sides of the above equation and solve for dt, which gives dt = -ln(1-P)/(k*n). The probability P is then randomly generated, giving a definitive dt for each event.
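As a quick check of that formula, the sketch below draws many waiting times with dt = -ln(1-P)/(k*n) and compares their average with 1/(k*n), the mean of an exponential distribution with rate k*n. The function name waiting_time and the values of k and n are illustrative assumptions, not taken from the original program.

```python
import math
import random

def waiting_time(k, n, rng=random):
    """Draw one waiting time dt = -ln(1 - P) / (k * n), with P uniform on [0, 1)."""
    p = rng.random()
    return -math.log(1.0 - p) / (k * n)

rng = random.Random(0)     # fixed seed so the run is repeatable
k, n = 0.1, 100            # rate constant and number of molecules (illustrative)
samples = [waiting_time(k, n, rng) for _ in range(100000)]
mean_dt = sum(samples) / len(samples)
print(mean_dt)             # should be close to 1 / (k * n) = 0.1
```

Using 1 - P rather than P inside the log avoids log(0), because random() can return 0.0 but never 1.0.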
I don't know the Gillespie algorithm, but I assume that you have checked that the program converges to the correct equilibrium. Therefore I interpret your question as
"Here is a working physics program, how can I make it more pythonic"
It would probably be more pythonic to do something like the following
# The third rate is for D->C, so it is kDC * D (not a repeat of kAB * A * C)
R = [kAB * A * C, kCD * C, kDC * D]
dt = [(-math.log(1-random.random())/x, i) for i, x in enumerate(R) if x > 0]
if dt:
    interval, transition = min(dt)
else:
    interval, transition = None, None
If you want to use python in physics, then I suggest that you learn numpy, because numpy is faster for many problems. So here are some untested parts of a numpy solution. Add the following to the header of your program
from numpy import log, array, isinf, zeros
from numpy.random import rand
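Then you can replace the inside of transitionData with something like the following
# The third rate is for D->C, so it is kDC * D
R = array([kAB * A * C, kCD * C, kDC * D])
# A zero rate gives an infinite waiting time (numpy warns about the division)
dt = -log(1-rand(3))/R
transition = dt.argmin()
interval = dt[transition]
if isinf(interval):
    transition = None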
I don't know if it would be more pythonic to raise a StopIteration exception instead of returning None, but that is a detail.
You should also store your concentrations in a single data structure. If you use numpy, then I suggest that you use an array. Similarly you can use an array dABCD to store the changes in the concentration (you can probably come up with better variable names). Add the following code outside your loop
ABCD = array([A,B,C,D])
# zeros takes the shape as a tuple; integer dtype so ABCD += dABCD[...] works in place
dABCD = zeros((3,4), dtype=int)
dABCD[0,0] = -1 # A decreases when transition == 0
dABCD[0,1] = 1  # B increases when transition == 0
dABCD[1,2] = -1 # C decreases when transition == 1
dABCD[1,3] = 1  # D increases when transition == 1
..... etc
Now you can replace your main loop with something like the following
while t <= tEnd:
    print(t, '\t', transition, '\t', ABCD)
    transition, interval = transitionData(ABCD, kAB, kCD, kDC)
    if transition is not None:
        t += interval
        ABCD += dABCD[transition,:]
    else:
        break
else:
    print("Warning: Stopping due to timeout. The system did not equilibrate")
There is probably more to do. As an example dABCD should probably be a sparse array, but I hope that these ideas can be a start.
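Pulling these ideas together, here is a minimal self-contained sketch of the first-reaction loop driven by a table of state changes. It uses only the standard library (plain tuples instead of numpy arrays) so that it runs as-is; the names STOICHIOMETRY, propensities, first_reaction_step and simulate are hypothetical, and the rates and initial state are the ones from the exercise.

```python
import math
import random

# Rows: transitions A->B, C->D, D->C. Columns: change in A, B, C, D.
STOICHIOMETRY = [
    (-1, +1, 0, 0),
    (0, 0, -1, +1),
    (0, 0, +1, -1),
]

def propensities(state, kAB, kCD, kDC):
    """Rates of the three transitions for the current state."""
    A, B, C, D = state
    return [kAB * A * C, kCD * C, kDC * D]

def first_reaction_step(state, rates, rng):
    """Return (transition, interval), or (None, None) if nothing can fire."""
    draws = [(-math.log(1.0 - rng.random()) / r, i)
             for i, r in enumerate(propensities(state, *rates)) if r > 0]
    if not draws:
        return None, None
    interval, transition = min(draws)   # the earliest tentative firing wins
    return transition, interval

def simulate(t_end=100.0, rates=(0.1, 1.0, 1.0), seed=None):
    """Run until t_end and return a list of (time, transition, state)."""
    rng = random.Random(seed)
    state, t, history = (100, 0, 1, 0), 0.0, []
    while True:
        transition, interval = first_reaction_step(state, rates, rng)
        if transition is None or t + interval > t_end:
            break
        t += interval
        state = tuple(s + d for s, d in zip(state, STOICHIOMETRY[transition]))
        history.append((t, transition, state))
    return history

history = simulate(seed=1)
print(len(history), history[-1])
```

Adding a reaction then means adding one row to the table and one entry to propensities, rather than another if block in the main loop.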
****Edit**** I originally explained this wrong!!!! The following is correct though. Justin, this program uses a clever criterion to ‘weight’ each event. The RAB, RCD, and RDC values are all given a true/false parameter by multiplying kAB, kCD, and kDC by C or D, which in this case can be either one or zero. A zero value for D, and thus for RDC, leaves dt[2] at -1.0 and so prevents it from being selected in the
for n in range(len(dt)):
    if dt[n] > 0.0 and dt[n] < interval:
statement. Furthermore, the following-
if transition == 1:
    C -= 1
    D += 1
if transition == 2:
    C += 1
    D -= 1
dictates that when the event C->D occurs (transition 1), the next event necessarily must be D->C (transition 2), since of the three values in dt[], only dt[2] is then positive (C is 0, so RAB and RCD are 0 and dt[0] and dt[1] keep their initial value of -1.0) and thus meets the aforementioned criteria. So, how are we weighting the likelihood that transition 0 or transition 1 occurs? It's a little tricky, but is inherent in the following lines:
interval = 1e36
transition = -1
for n in range(len(dt)):
    if dt[n] > 0.0 and dt[n] < interval:
        interval = dt[n]
        transition = n
return transition, interval
"for n in range(len(dt)):" iterates over the indices of the list dt[]. The next line specifies the criteria that must be met, namely that each value has to be greater than 0 and less than interval. For transition 0, interval is 1e36 (which is supposed to be representative of infinity). The rub is that interval is then set to dt[0], so for the second value in dt[], transition 1, the criteria state that it must be smaller than the dt for transition 0 in order to occur, or in other words that it must have happened faster to have happened at all, which agrees with chemical logic. My biggest concern is that the accumulated t values mandated by the "t += interval" line might not be entirely fair: since t1 firing is independent of t0 firing, t0 firing and taking, say, .1 sec shouldn't exclude t1 from using the same .1 sec to fire. I'm working on a fix for this, and suggestions are welcome! This is a verbose printout from the script, including a firing of transition 1 and 2:
Time Transition A B C D
dt0= 0.0350693547214
dt1= 2.26710773787
interval= 1e+36
dt= 0.0350693547214
transition= 0
0.0 0 100 0 1 0
dt0= 0.000339596342313
dt1= 0.21083283004
interval= 1e+36
dt= 0.000339596342313
transition= 0
0.0350693547214 0 99 1 1 0
dt0= 0.0510125874767
dt1= 1.26127048627
interval= 1e+36
dt= 0.0510125874767
transition= 0
0.0354089510637 0 98 2 1 0
dt0= 0.0809691957218
dt1= 0.593246425076
interval= 1e+36
dt= 0.0809691957218
transition= 0
0.0864215385404 0 97 3 1 0
dt0= 0.00205040633531
dt1= 1.70623338677
interval= 1e+36
dt= 0.00205040633531
transition= 0
0.167390734262 0 96 4 1 0
dt0= 0.106140534256
dt1= 0.0915160378053
interval= 1e+36
dt= 0.106140534256
transition= 0
interval= 0.106140534256
dt= 0.0915160378053
transition= 1
0.169441140598 1 95 5 1 0
dt2= 0.0482892532952
interval= 1e+36
dt= 0.0482892532952
transition= 2
0.260957178403 2 95 5 0 1
dt0= 0.112545351421
dt1= 1.84936696832
interval= 1e+36
dt= 0.112545351421
transition= 0
0.309246431698 0 95 5 1 0
Justin, I'm not sure what you mean by dt[2] being less than 1e36 making it 'stay' on transition 2? That doesn't happen because of the
if transition == 2:
    C += 1
    D -= 1
statement. Anyone know of a more direct way to accomplish this?
Haha, let the flaming begin- you guys are awesome though-I really appreciate the feedback! Stackoverflow is soooooo legit.
