I would like to reduce the computation time of the code posted below. In essence, the code calculates the array Tf as the result of the following nested loop:
Af = lambda x: Approximationf(f, x)
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
        integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                      posterior(y*np.exp(mu[0]) * z)])))
        integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                      posterior(y * np.exp(mu[1])*z)])))
        Tf[idxy, idxp] = (h[idxy, idxp] +
                          beta * ((prior * integrateL) +
                                  (1-prior)*integrateH))
The objects posterior, integrate and Af are functions that are repeatedly called while iterating over the loop. The function posterior calculates a scalar called posterior. The function Af approximates the function f at sample points x and passes the result on to the function integrate, which calculates the conditional expectation of the function f.
The code posted below is a simplification of a more difficult problem. Instead of running the nested loop once, I have to run it multiple times to solve a fixed point problem. This problem is initialized with an arbitrary function f and a function Tf is created. This array is then used in the next iteration over the nested loop to calculate another array Tf. The process continues until convergence.
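The fixed-point iteration described here can be sketched as follows. This is only an illustration under stated assumptions: `iterate_to_fixed_point` and `toy_T` are hypothetical names, and `toy_T` is a trivial contraction standing in for the nested loop that maps f to Tf.

```python
import numpy as np

def iterate_to_fixed_point(T, f0, tol=1e-8, max_iter=1000):
    """Apply the operator T repeatedly until successive arrays agree within tol."""
    f = np.asarray(f0, dtype=float)
    for n in range(1, max_iter + 1):
        f_new = T(f)
        # sup-norm distance between two successive iterates
        if np.max(np.abs(f_new - f)) < tol:
            return f_new, n
        f = f_new
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical stand-in for the real operator: its fixed point is 2 everywhere.
toy_T = lambda f: 0.5 * f + 1.0

f_star, n_iter = iterate_to_fixed_point(toy_T, np.zeros((3, 2)))
print(f_star, n_iter)
```

Because the operator is applied once per iteration, any saving inside the nested loop is multiplied by the number of iterations needed for convergence.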
I decided not to report the results of the cProfile module. Without the iteration over the nested loop until convergence, a lot of internal Python executions take up a relatively long time. However, when iterating until convergence, these internal executions lose their relative importance and are relegated to lower positions in the cProfile output.
I tried to mimic different suggestions I found online for lowering the computation time of loops in slightly different problems. Unfortunately, I could not get them to work, nor could I figure out a common approach to these problems. Does somebody have an idea how to lower the computation time of this loop? I am grateful for any help!
import numpy as np
from scipy import interpolate
from scipy.stats import lognorm
from scipy.integrate import fixed_quad
# == The following lines define the parameters for the problem == #
gamma, beta, sigma, mu = 2, 0.95, 0.0255, np.array([0.0113, -0.0016])
grid_y, grid_prior = np.linspace(7, 10, 15), np.linspace(0, 1, 5)
int_min, int_max = np.exp(- 7 * sigma), np.exp(+ 7 * sigma)
phi = lognorm(sigma)
f = np.array([[ 1.29824564, 1.29161017, 1.28379398, 1.2676886, 1.15320819],
[ 1.26290108, 1.26147364, 1.24755837, 1.23819851, 1.11912802],
[ 1.22847276, 1.23013194, 1.22128198, 1.20996971, 1.0864706 ],
[ 1.19528104, 1.19645792, 1.19056084, 1.17980572, 1.05532966],
[ 1.16344832, 1.16279841, 1.15997191, 1.15169942, 1.02564429],
[ 1.13301675, 1.13109952, 1.12883038, 1.1236645, 0.99730795],
[ 1.10398195, 1.10125013, 1.0988554, 1.09612933, 0.97019688],
[ 1.07630046, 1.07356297, 1.07126087, 1.06878758, 0.94417658],
[ 1.04989686, 1.04728542, 1.04514962, 1.04289665, 0.91910765],
[ 1.02467087, 1.0221532, 1.02011384, 1.01797238, 0.89485162],
[ 1.00050447, 0.99795025, 0.99576917, 0.99330549, 0.87127677],
[ 0.97726849, 0.97443288, 0.97190614, 0.96861352, 0.84826362],
[ 0.95482612, 0.94783816, 0.94340077, 0.93753641, 0.82569922],
[ 0.93302433, 0.91985497, 0.9059118, 0.88895196, 0.80348449],
[ 0.91165997, 0.88253486, 0.86126688, 0.84769975, 0.78147382]])
# == Calculate function h, Used in the loop below == #
E0 = np.exp((1-gamma)*mu + (1-gamma)**2*sigma**2/2)
h = np.outer(beta*grid_y**(1-gamma), grid_prior*E0[0] + (1-grid_prior)*E0[1])
def integrate(g):
    """
    This function is repeatedly called in the loop below
    """
    integrand = lambda z: g(z) * phi.pdf(z)
    result = fixed_quad(integrand, int_min, int_max, n=15)[0]
    return result
def Approximationf(f, x):
    """
    This function approximates the function f and is repeatedly called in
    the loop
    """
    # == simplify notation == #
    fApprox = np.empty((x.shape[1]))
    lower, middle = (x[0] < grid_y[0]), (x[0] >= grid_y[0]) & (x[0] <= grid_y[-1])
    upper = (x[0] > grid_y[-1])
    # == Calculate Polynomial == #
    y_tile = np.tile(grid_y, len(grid_prior))
    prior_repeat = np.repeat(grid_prior, len(grid_y))
    s = interpolate.SmoothBivariateSpline(y_tile, prior_repeat,
                                          f.T.flatten(), kx=5, ky=5)
    # == interpolation == #
    fApprox[middle] = s(x[0, middle], x[1, middle])[:, 0]
    # == Extrapolation == #
    if any(lower):
        s0 = s(lower[lower]*grid_y[0], x[1, lower])[:, 0]
        s1 = s(lower[lower]*grid_y[1], x[1, lower])[:, 0]
        slope_lower = (s0 - s1)/(grid_y[0] - grid_y[1])
        fApprox[lower] = s0 + slope_lower*(x[0, lower] - grid_y[0])
    if any(upper):
        sM1 = s(upper[upper]*grid_y[-1], x[1, upper])[:, 0]
        sM2 = s(upper[upper]*grid_y[-2], x[1, upper])[:, 0]
        slope_upper = (sM1 - sM2)/(grid_y[-1] - grid_y[-2])
        fApprox[upper] = sM1 + slope_upper*(x[0, upper] - grid_y[-1])
    return fApprox
def updated_posterior(prior, y, yPrime):
    """
    This function calculates the posterior weights put on each distribution.
    It is the third function repeatedly called in the loop below.
    """
    z_0 = yPrime/(y * np.exp(mu[0]))
    z_1 = yPrime/(y * np.exp(mu[1]))
    l0, l1 = phi.pdf(z_0), phi.pdf(z_1)
    posterior = l0*prior / (l0*prior + l1*(1-prior))
    return posterior
Tf = np.empty_like(f)
Af = lambda x: Approximationf(f, x)
# == Apply the T operator to f == #
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
        integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                      posterior(y*np.exp(mu[0]) * z)])))
        integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                      posterior(y * np.exp(mu[1])*z)])))
        Tf[idxy, idxp] = (h[idxy, idxp] +
                          beta * ((prior * integrateL) +
                                  (1-prior)*integrateH))
Some experience with multiprocessing
Following reptilicus' comment, I decided to investigate how to use the multiprocessing module. My idea was to begin by parallelizing the computation of integrateL. To do so, I fixed the outer loop at prior = 0.5 and wanted to iterate over the inner loop, grid_y. However, I still have to take into account that integrateL integrates a lambda function in z. I tried to follow the advice of the Stack Overflow question "How to let Pool.map take a lambda function" and wrote the following code:
prior = 0.5
Af = lambda x: Approximationf(f, x)
class Iteration(object):
    def __init__(self,state):
        self.y = state
    def __call__(self,z):
        Af(np.array([self.y*np.exp(mu[0])*z,
                     updated_posterior(prior,
                                       self.y,self.y*np.exp(mu[0])*z)]))
with Pool(processes=4) as pool:
    out = pool.map(Iteration(y), np.nditer(grid_y))
Unfortunately, Python raises the following error when I run the program:
IndexError: tuple index out of range
At first sight this looks like a trivial error, but I cannot fix it. Does somebody have an idea how to tackle the problem? Again, I'm grateful for any advice I receive!
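One plausible source of this error, offered as an assumption rather than a confirmed diagnosis: Approximationf reads `x.shape[1]`, which only exists when z is a vector of quadrature nodes. When z is a single number, as it is when `pool.map` passes one grid value at a time, `np.array([...])` is one-dimensional. The sketch below reproduces that behaviour in isolation; `n_points` is a hypothetical helper that mirrors only the shape lookup.

```python
import numpy as np

def n_points(x):
    # mirrors the `x.shape[1]` lookup at the top of Approximationf
    return x.shape[1]

# Vector-valued z (what fixed_quad supplies): x is 2-D, so shape[1] exists.
z_vec = np.linspace(0.9, 1.1, 3)
x_2d = np.array([8.0 * z_vec, 0.5 * np.ones_like(z_vec)])
count_2d = n_points(x_2d)

# Scalar z (what pool.map supplies per item): x is 1-D, so shape[1] is missing.
x_1d = np.array([8.0 * 1.0, 0.5])
try:
    n_points(x_1d)
    message = None
except IndexError as err:
    message = str(err)

# Reshaping to two rows makes the scalar case look like a one-point vector.
count_fixed = n_points(x_1d.reshape(2, -1))
print(count_2d, message, count_fixed)
```

Note also that `__call__` in the class above has no return statement, so even without the error every entry of `out` would be None.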
I would target that nested loop with something like this. This is pseudo-code, but it should get you started.
import multiprocessing
def do_calc(idxp, idxy, y, prior):
    posterior = lambda yPrime: updated_posterior(prior, y, yPrime)
    integrateL = integrate(lambda z: Af(np.array([y*np.exp(mu[0])*z,
                                                  posterior(y*np.exp(mu[0]) * z)])))
    integrateH = integrate(lambda z: Af(np.array([y*np.exp(mu[1])*z,
                                                  posterior(y * np.exp(mu[1])*z)])))
    # return only picklable values (the posterior lambda cannot be pickled)
    return (idxp, idxy, prior, integrateL, integrateH)
pool = multiprocessing.Pool(8) # or however many cores you have
results = []
# This is the part that I would try to parallelize
for idxp, prior in enumerate(grid_prior):
    for idxy, y in enumerate(grid_y):
        results.append(pool.apply_async(do_calc, args=(idxp, idxy, y, prior)))
pool.close()
pool.join()
results = [r.get() for r in results]
# Tf is indexed [idxy, idxp], as in the original loop
for idxp, idxy, prior, integrateL, integrateH in results:
    Tf[idxy, idxp] = (h[idxy, idxp] +
                      beta * ((prior * integrateL) +
                              (1-prior)*integrateH))
Here is my code.
import numpy as np
from scipy.integrate import odeint
# Constants
R0=1.475
gamma=2.
ScaleMeVfm3toEskm3 = 8.92*np.power(10.,-7.)
def EOSe(p):
    return np.power((p/450.785),(1./gamma))
def M(r,p):
    return (4./3.)*np.pi*np.power(r,3.)*p
# function that returns dz/dr
def model(z,r):
    p, m = z
    dpdr = -((R0*EOSe(p)*m)/(np.power(r,2.)))*(1+(p/EOSe(p)))*(1+((4*np.pi*(np.power(r,3))*p)/(m)))*((1-((2*R0)*m)/(r))**(-1.))
    dmdr = 4.*np.pi*(r**2.)*EOSe(p)
    dzdr = [dpdr,dmdr]
    return dzdr
# initial condition
r0=10.**-12.
p0=10**-6.
z0 = [p0, M(r0, p0)]
# radius
r = np.linspace(r0, 15, 100000)
# solve ODE
z = odeint(model,z0,r)
The values in z[:,0] keep decreasing, as I expected. But I only want the positive values. If you run the code and print(z[69306]), it shows [2.89636405e-11 5.46983202e-01]. That is the last point, and it is where I want odeint to stop integrating.
Of course, the code above shows
RuntimeWarning: invalid value encountered in power
return np.power((p/450.785),(1./gamma))
because p becomes negative. For all further points, odeint returns [nan nan].
I could use np.nanmin() to find the minimum of z[:,0] that is not nan. But I have a set of p0 values for my work, so I need to call odeint in a loop like
P=np.linspace(10**-8.,10**-2.,10000)
for p0 in P:
    #the code for solving ode provided above.
which takes more time.
Would it reduce the execution time if I could just stop the integration before z[:,0] becomes negative?
Here is the modified code using solve_ivp:
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pylab as plt
# Constants
R0 = 1.475
gamma = 2.
def EOSe(p):
    return np.power(np.abs(p)/450.785, 1./gamma)
def M(r, p):
    return (4./3.)*np.pi*np.power(r,3.)*p
# function that returns dz/dr
# note: the argument order is reversed compared to `odeint`
def model(r, z):
    p, m = z
    dpdr = -R0*EOSe(p)*m/r**2*(1 + p/EOSe(p))*(1 + 4*np.pi*r**3*p/m)*(1 - 2*R0*m/r)**(-1)
    dmdr = 4*np.pi * r**2 * EOSe(p)
    dzdr = [dpdr, dmdr]
    return dzdr
# initial condition
r0 = 1e-3
r_max = 50
p0 = 1e-6
z0 = [p0, M(r0, p0)]
# Define the event function
# from the doc: "The solver will find an accurate value
# of t at which event(t, y(t)) = 0 using a root-finding algorithm. "
def stop_condition(r, z):
    return z[0]
stop_condition.terminal = True
# solve ODE
r_span = (r0, r_max)
sol = solve_ivp(model, r_span, z0,
                events=stop_condition)
print(sol.message)
print('last p, m = ', sol.y[:, -1], 'for r_event=', sol.t_events[0][0])
r_sol = sol.t
p_sol = sol.y[0, :]
m_sol = sol.y[1, :]
# Graph
plt.subplot(2, 1, 1)
plt.plot(r_sol, p_sol, '.-b')
plt.xlabel('r'); plt.ylabel('p')
plt.subplot(2, 1, 2)
plt.plot(r_sol, m_sol, '.-r')
plt.xlabel('r'); plt.ylabel('m')
plt.show()
Actually, using events in this case does not prevent the warning caused by negative p. The reason is that the solver is going to evaluate the model for p<0 anyway. A solution is to take the absolute value of p in the square root (as in the code above). Using np.sign(p)*np.power(np.abs(p)/450.785, 1./gamma) gives an interesting result too.
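To see the terminal-event mechanism in isolation, here is a minimal sketch on a toy problem whose answer is known in closed form. The ODE dy/dt = -0.5 and the names `decay`, `hit_zero` and `stop_time` are assumptions made for illustration; they are not part of the stellar model above.

```python
from scipy.integrate import solve_ivp

def decay(t, y):
    # toy right-hand side: y(t) = y0 - 0.5*t, which reaches zero at t = 2*y0
    return [-0.5]

def hit_zero(t, y):
    # the solver looks for a root of this function along the solution
    return y[0]

hit_zero.terminal = True   # stop the integration at the event
hit_zero.direction = -1    # only trigger when y crosses zero from above

def stop_time(y0, t_max=10.0):
    sol = solve_ivp(decay, (0.0, t_max), [y0], events=hit_zero)
    return sol, sol.t_events[0][0]

sol, t_stop = stop_time(1.0)
print(sol.status, t_stop)

# The same call inside a loop over initial values, as in the question.
stop_times = [stop_time(y0)[1] for y0 in (0.5, 1.0, 2.0)]
print(stop_times)
```

A `status` of 1 indicates that a terminal event ended the integration, so no time is spent on the part of the interval after the zero crossing.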
I am trying to solve this differential equation as part of my assignment. I do not understand how to put the condition for u into the code. In the code shown below, I arbitrarily set u = 5.
$$2\,\frac{dx(t)}{dt} = -x(t) + u(t)$$
$$5\,\frac{dy(t)}{dt} = -y(t) + x(t)$$
$$u = 2\,S(t-5), \qquad x(0) = 0, \qquad y(0) = 0$$
where S(t−5) is a step function that changes from zero to one at t=5. When it is multiplied by two, it changes from zero to two at that same time, t=5.
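One way to write the input u(t) = 2 S(t − 5) in NumPy is sketched below. The helper name `step_input` and its parameters are illustrative assumptions; the value of the step exactly at t = 5 is taken to be one here, which is a convention rather than something the problem statement fixes.

```python
import numpy as np

def step_input(t, amplitude=2.0, t_step=5.0):
    """Return amplitude * S(t - t_step) for scalar or array t (illustrative helper)."""
    shifted = np.asarray(t, dtype=float) - t_step
    # the second argument of np.heaviside is the value used exactly at shifted == 0
    return amplitude * np.heaviside(shifted, 1.0)

samples = step_input([0.0, 4.9, 5.0, 5.1, 10.0])
print(samples)
```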
def model(x,t,u):
    dxdt = (-x+u)/2
    return dxdt
def model2(y,x,t):
    dydt = -(y+x)/5
    return dydt
x0 = 0
y0 = 0
u = 5
t = np.linspace(0,40)
x = odeint(model,x0,t,args=(u,))
y = odeint(model2,y0,t,args=(u,))
plt.plot(t,x,'r-')
plt.plot(t,y,'b*')
plt.show()
I do not know the SciPy library very well, but going by the example in the documentation I would try something like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
def model(x, t, K, PT):
    """
    The model consists of the state x in R^2, the time in R and the two
    parameters K and PT describing the input u as a step function, where K
    is the height of the step and PT is the time at which the step occurs.
    """
    x1, x2 = x # Split the state into two variables
    u = K if t>=PT else 0 # This is the system input
    # Here comes the differential equation in vectorized form
    dx = [(-x1 + u)/2,
          (-x2 + x1)/5]
    return dx
x0 = [0, 0]
K = 2
PT = 5
t = np.linspace(0,40)
x = odeint(model, x0, t, args=(K, PT))
plt.plot(t, x[:, 0], 'r-')
plt.plot(t, x[:, 1], 'b*')
plt.show()
You have a couple of issues here, and the step function is only a small part of them. You can define a step function with a simple lambda and capture it from the outer scope without even passing it to your function. Because that won't always be possible, we'll be explicit and pass it.
Your next problem is the order of arguments in the function to integrate. As per the docs, its signature is func(y, t, ...): first the state being integrated, then the time, then any extra arguments supplied through args. So for the first part we get:
u = lambda t : 2 if t>5 else 0
def model(x,t,u):
    dxdt = (-x+u(t))/2
    return dxdt
x0 = 0
y0 = 0
t = np.linspace(0,40)
x = odeint(model,x0,t,args=(u,))
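The argument-order point can be checked on a problem with a known solution. The sketch below is an illustration with assumed names (`rhs_state_first`, `rhs_time_first`, rate `k`): it integrates dy/dt = -k y once with the default (y, t, ...) order and once with the (t, y, ...) order, which odeint accepts when `tfirst=True` is passed.

```python
import numpy as np
from scipy.integrate import odeint

def rhs_state_first(y, t, k):
    # default odeint convention: state first, then time, then extra args
    return -k * y

def rhs_time_first(t, y, k):
    # solve_ivp-style convention; odeint needs tfirst=True for this order
    return -k * y

t = np.linspace(0.0, 2.0, 21)
sol_a = odeint(rhs_state_first, 1.0, t, args=(0.5,))
sol_b = odeint(rhs_time_first, 1.0, t, args=(0.5,), tfirst=True)

exact = np.exp(-0.5 * t)
print(np.max(np.abs(sol_a[:, 0] - exact)))
```

Both calls return an array of shape (len(t), 1), which is why the later snippets flatten x before interpolating it.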
Moving to the next part, the trouble is, you can't feed x as an arg to y because it's a vector of values for x(t) for particular times and so y+x doesn't make sense in the function as you wrote it. You can follow your intuition from math class if you pass an x function instead of the x values. Doing so requires that you interpolate the x values using the specific time values you are interested in (which scipy can handle, no problem):
from scipy.interpolate import interp1d
xfunc = interp1d(t.flatten(),x.flatten(),fill_value="extrapolate")
# flatten because x has shape (n, 1); extrapolate because odeint may evaluate outside the range of t
def model2(y,t,x):
    dydt = (-y+x(t))/5 # from 5*dy/dt = -y + x
    return dydt
y = odeint(model2,y0,t,args=(xfunc,))
Then you get:
@Sven's answer is more idiomatic for vector programming like scipy/numpy. But I hope my answer provides a clearer path from what you know already to a working solution.
I have the same problem as in this question, but I want to add several constraints to the optimization problem rather than only one.
So, for example, I want to maximize x1 + 5 * x2 with the constraints that the sum of x1 and x2 is smaller than 5 and x2 is smaller than 3 (needless to say, the actual problem is far more complicated and cannot just be thrown into scipy.optimize.minimize like this one; it just serves to illustrate the problem...).
I can do an ugly hack like this:
from scipy.optimize import differential_evolution
import numpy as np
def simple_test(x, more_constraints):
    # check whether all constraints evaluate to True
    if all(map(eval, more_constraints)):
        return -1 * (x[0] + 5 * x[1])
    # if not all constraints evaluate to True, return a positive number
    return 10
bounds = [(0., 5.), (0., 5.)]
additional_constraints = ['x[0] + x[1] <= 5.', 'x[1] <= 3']
result = differential_evolution(simple_test, bounds, args=(additional_constraints, ), tol=1e-6)
print(result.x, result.fun, sum(result.x))
This will print
[ 1.99999986 3. ] -16.9999998396 4.99999985882
as one would expect.
Is there a better/ more straightforward way to add several constraints than using the rather 'dangerous' eval?
An example would be something like this:
additional_constraints = [lambda x: x[0] + x[1] <= 5., lambda x: x[1] <= 3]
def simple_test(x, more_constraints):
    # check whether all constraints evaluate to True
    if all(constraint(x) for constraint in more_constraints):
        return -1 * (x[0] + 5 * x[1])
    # if not all constraints evaluate to True, return a positive number
    return 10
There is a proper solution to the problem described in the question, namely enforcing multiple nonlinear constraints with scipy.optimize.differential_evolution.
The proper way is to use the scipy.optimize.NonlinearConstraint class.
Below I give a non-trivial example of optimizing the classic Rosenbrock function inside a region defined by the intersection of two circles.
import numpy as np
from scipy import optimize
# Rosenbrock function
def fun(x):
    return 100*(x[1] - x[0]**2)**2 + (1 - x[0])**2
# Function defining the nonlinear constraints:
# 1) x^2 + (y - 3)^2 < 4
# 2) (x - 1)^2 + (y + 1)^2 < 13
def constr_fun(x):
    r1 = x[0]**2 + (x[1] - 3)**2
    r2 = (x[0] - 1)**2 + (x[1] + 1)**2
    return r1, r2
# No lower limit on constr_fun
lb = [-np.inf, -np.inf]
# Upper limit on constr_fun
ub = [4, 13]
# Bounds are irrelevant for this problem, but are needed
# for differential_evolution to compute the starting points
bounds = [[-2.2, 1.5], [-0.5, 2.2]]
nlc = optimize.NonlinearConstraint(constr_fun, lb, ub)
sol = optimize.differential_evolution(fun, bounds, constraints=nlc)
# Accurate solution by Mathematica
true = [1.174907377273171, 1.381484428610871]
print(f"nfev = {sol.nfev}")
print(f"x = {sol.x}")
print(f"err = {sol.x - true}\n")
This prints the following with default parameters:
nfev = 636
x = [1.17490808 1.38148613]
err = [7.06260962e-07 1.70116282e-06]
Here is a visualization of the function (contours) and the feasible region defined by the nonlinear constraints (shading inside the green line). The constrained global minimum is indicated by the yellow dot, while the magenta one shows the unconstrained global minimum.
This constrained problem has an obvious local minimum at (x, y) ~ (-1.2, 1.4) on the boundary of the feasible region which will make local optimizers fail to converge to the global minimum for many starting locations. However, differential_evolution consistently finds the global minimum as expected.
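The same approach can be applied to the toy problem from the question (maximize x1 + 5 x2 subject to x1 + x2 <= 5 and x2 <= 3). The following is a sketch under assumptions: the names `objective` and `constraint_values` are illustrative, and a fixed seed is passed only to make the run repeatable. Both constraints happen to be linear, so NonlinearConstraint is more general than this problem strictly needs.

```python
import numpy as np
from scipy.optimize import NonlinearConstraint, differential_evolution

def objective(x):
    # differential_evolution minimizes, so negate the quantity to maximize
    return -(x[0] + 5.0 * x[1])

def constraint_values(x):
    # one entry per constraint: x1 + x2 and x2
    return np.array([x[0] + x[1], x[1]])

# Each constraint value must lie between its lower and upper limit.
constraints = NonlinearConstraint(constraint_values,
                                  [-np.inf, -np.inf], [5.0, 3.0])
bounds = [(0.0, 5.0), (0.0, 5.0)]

result = differential_evolution(objective, bounds, constraints=constraints,
                                seed=1, tol=1e-6)
print(result.x, result.fun)
```

Compared with returning a fixed penalty from the objective, this leaves the handling of infeasible candidates to the optimizer and needs neither eval nor strings.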
I started with this code to calculate a simple matrix multiplication. It runs with %timeit in around 7.85 s on my machine.
To speed this up I tried Cython, which reduced the time to 0.4 s. I also want to try the Numba JIT compiler to see if I can get similar speed-ups (with less effort). But adding the @jit decorator appears to give exactly the same timings (~7.8 s). I know it can't figure out the types of the calculate_z_numpy() call, but I'm not sure what I can do to coerce it. Any ideas?
from numba import jit
import numpy as np
@jit('f8(c8[:],c8[:],uint)')
def calculate_z_numpy(q, z, maxiter):
    """use vector operations to update all zs and qs to create new output array"""
    output = np.resize(np.array(0, dtype=np.int32), q.shape)
    for iteration in range(maxiter):
        z = z*z + q
        done = np.greater(abs(z), 2.0)
        q = np.where(done, 0+0j, q)
        z = np.where(done, 0+0j, z)
        output = np.where(done, iteration, output)
    return output
def calc_test():
    w = h = 1000
    maxiter = 1000
    # make a list of x and y values which will represent q
    # xx and yy are the co-ordinates, for the default configuration they'll look like:
    # if we have a 1000x1000 plot
    # xx = [-2.13, -2.1242,-2.1184000000000003, ..., 0.7526000000000064, 0.7584000000000064, 0.7642000000000064]
    # yy = [1.3, 1.2948, 1.2895999999999999, ..., -1.2844000000000058, -1.2896000000000059, -1.294800000000006]
    x1, x2, y1, y2 = -2.13, 0.77, -1.3, 1.3
    x_step = (float(x2 - x1) / float(w)) * 2
    y_step = (float(y1 - y2) / float(h)) * 2
    # np.complex was removed from recent NumPy releases; np.complex128 is the explicit equivalent
    y = np.arange(y2,y1-y_step,y_step,dtype=np.complex128)
    x = np.arange(x1,x2,x_step)
    q1 = np.empty(y.shape[0],dtype=np.complex128)
    q1.real = x
    q1.imag = y
    # Transpose y
    x_y_square_matrix = x+y[:, np.newaxis] # it is np.complex128
    # convert square matrix to a flatted vector using ravel
    q2 = np.ravel(x_y_square_matrix)
    # create z as a 0+0j array of the same length as q
    # note that it defaults to reals (float64) unless told otherwise
    z = np.zeros(q2.shape, np.complex128)
    output = calculate_z_numpy(q2, z, maxiter)
    print(output)
calc_test()
I figured out how to do this with some help from someone else.
@jit('i4[:](c16[:],c16[:],i4,i4[:])',nopython=True)
def calculate_z_numpy(q, z, maxiter,output):
    """use explicit loops to update all zs and qs and fill the output array"""
    for iteration in range(maxiter):
        for i in range(len(z)):
            # same update as the vectorized version: z = z*z + q
            z[i] = z[i]*z[i] + q[i]
            # complex numbers cannot be ordered, so compare the magnitude
            if abs(z[i]) > 2.0:
                output[i] = iteration
                z[i] = 0+0j
                q[i] = 0+0j
    return output
What I learnt is to use NumPy data structures as inputs (for typing), but to use C-like paradigms for looping inside the function.
This runs in 402 ms, which is a touch faster than the Cython code (0.45 s), so for the fairly minimal work of rewriting the loop explicitly we have a Python version that is (just) faster than C.
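A small, self-contained sketch of that pattern is given below. It is a variation rather than the exact function above: it loops over points first and stops each point as soon as it escapes. The name `escape_iterations` is hypothetical, and the fallback decorator is an assumption added so the snippet still runs (slowly, as plain Python) when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # assumption of this sketch: without Numba, run the same code uncompiled
    def njit(func):
        return func

@njit
def escape_iterations(q, maxiter):
    """Return, per point, the iteration at which |z| first exceeds 2 (0 if never)."""
    output = np.zeros(q.shape[0], dtype=np.int32)
    for i in range(q.shape[0]):
        z = 0j
        for iteration in range(maxiter):
            z = z * z + q[i]
            if abs(z) > 2.0:
                output[i] = iteration
                break
    return output

points = np.array([0j, 1 + 0j, 0.5 + 0j], dtype=np.complex128)
print(escape_iterations(points, 50))
```

The inputs are still NumPy arrays, so Numba can infer the types on the first call without an explicit signature string.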
So basically I want to grab the number of iterations it takes my Newton's method to find the root, and then apply that number to my color scheme: the more iterations, the darker the color, and the fewer iterations, the fuller the color.
So here's my code:
from numpy import *
import pylab as pl
def myffp(x):
    return x**3 - 1, 3*(x**2)
def newton( ffp, x, nits):
    for i in range(nits):
        #print i,x
        f,fp = ffp(x)
        x = x - f/fp
    return x
q = sqrt(3)/2
def leggo(xmin=-1,xmax=1,jmin=-1,jmax=1,pts=1000,nits=30):
    x = linspace(xmin, xmax, pts)
    y = linspace(jmin, jmax, pts)*complex(0,1)
    x1,y1 = meshgrid(x,y)
    n = newton(myffp,x1+y1,nits) #**here is where i wanna see the number of iterations newton's method takes to find my root**
    r1 = complex(1,0)
    r2 = complex(-.5, q)
    r3 = complex(-.5,-q)
    data = zeros((pts,pts,3))
    data[:,:,0] = abs(n-r1) #**and apply it here**
    data[:,:,2] = abs(n-r2)
    data[:,:,1] = abs(n-r3)
    pl.imshow(data)
    pl.show()
leggo()
The main problem is finding the number of iterations, I can then figure out how to apply that to darkening the color, but for now it's just finding the number of iterations it takes for each value ran through newton's method.
Perhaps the simplest way is to refactor your newton function so that it keeps track of the total number of iterations and returns it (along with the result, of course), e.g.,
def newton( ffp, x, nits):
    c = 0 # initialize iteration counter
    for i in range(nits):
        c += 1 # increment counter for each iteration
        f, fp = ffp(x)
        x = x - f/fp
    return x, c # return the counter when the function is called
So in the main body of your code, change your call to newton like so:
res, tot_iter = newton(myffp, x1+y1, nits)
The number of iterations in the last call to newton is then stored in tot_iter.
As an aside, your implementation of Newton's method seems to be incomplete; for instance, it's missing a test against some convergence criterion.
Here's a simple implementation in Python that works:
def newtons_method(x_init, fn, max_iter=100):
    """
    returns: approx. val of root of the function passed in, fn;
    pass in: x_init, initial value for the root;
    max_iter, total iteration count not exceeded;
    fn, a function of the form:
    def f(x): return x**3 - 2*x
    """
    x = x_init
    eps = .0001
    # set initial value different from x_init so at least 1 loop
    x_old = x + 10 * eps
    step = .1
    c = 0
    # (x - x_old) is convergence criterion
    while (abs(x - x_old) > eps) and (c < max_iter):
        c += 1
        fval = fn(x)
        # forward-difference approximation of the derivative
        dfdx = (fn(x + step) - fn(x)) / step
        x_old = x
        x = x_old - fval / dfdx
    return x, c
The code you're currently using for newton() has a fixed number of iterations (nits - which is being passed in as 30), so the results would be kind of trivial and uninteresting.
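To get a count that actually differs from point to point, the iteration has to stop separately for each grid point once that point has converged. The following is a sketch under assumptions: `newton_counts` is a hypothetical name, f(z) = z**3 - 1 is hard-coded to match the question, and "converged" is taken to mean that the Newton step is smaller than `tol`.

```python
import numpy as np

def newton_counts(z0, max_iter=30, tol=1e-8):
    """Newton's method for z**3 - 1 on an array, with per-point iteration counts."""
    z = np.array(z0, dtype=complex).ravel()      # flat working copy
    counts = np.full(z.shape, max_iter, dtype=int)
    active = np.arange(z.size)                   # indices still iterating
    for it in range(1, max_iter + 1):
        za = z[active]
        step = (za**3 - 1) / (3 * za**2)
        z[active] = za - step
        done = np.abs(step) < tol
        counts[active[done]] = it                # record when each point stopped
        active = active[~done]
        if active.size == 0:
            break
    shape = np.shape(z0)
    return z.reshape(shape), counts.reshape(shape)

# An even number of points keeps z = 0 (zero derivative) off the grid.
x = np.linspace(-1.0, 1.0, 200)
y = np.linspace(-1.0, 1.0, 200)
grid = x[np.newaxis, :] + 1j * y[:, np.newaxis]
roots, counts = newton_counts(grid)
print(counts.min(), counts.max())
```

The `counts` array has the same shape as the grid, so it can be used directly to scale the brightness of the color assigned to each pixel.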
It looks like you're trying to generate a Newton fractal -- the method you're trying to use is incorrect; the typical coloring mode is based on the output of the function, not the number of iterations. See the Wikipedia article for a full explanation.