It seems that multiprocessing.Pool takes the same argument twice - Python

I'm using multiprocessing.Pool to perform multiple integrations in parallel.
In this program I integrate an equation of motion for different realizations of noise by generating the dW 3D array. The first part of the program is just the definition of parameters and the generation of the arrays needed for the calculation.
I generate dW outside the function since I know that otherwise I would have to reseed each process to avoid obtaining the same random sequence.
The Euler(replica) function is the function that has to be parallelized. It contains a for loop over a single process for the numerical integration. The argument replica is the index of the replica of my system as stored in the "replicas" array, which is the iterable passed to pool.map.
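For reference, a minimal sketch of that per-process seeding alternative (draw_dW and seed_base are hypothetical names, not used in the program below):

import numpy as np

def draw_dW(replica, n_steps, N, dt, seed_base=1234):
    # give each replica its own reproducible stream instead of one big dW array
    rng = np.random.default_rng([seed_base, replica])
    return rng.normal(0.0, 1.0, (n_steps, N)) * np.sqrt(dt)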
import numpy as np
from multiprocessing import Pool
# parameters
N = 30 # number of sites
T = 1 # total time
dt = 0.1 # time step
l = 0 # initially localized state on site l
e = 0.0 # site energy
v = 1.0 # hopping coefficient
mu, sigma = 0, 1.0 # average and variance of the gaussian distribution
num_replicas = 8 # number of replicas of the system
processes=2 # number of processes
# identity vector which represents the diagonal of the Hamiltonian
E = np.ones(N) * e
# vector which represents the upper/lower diagonal terms of the Hopping Matrix and the Hamiltonian
V = np.ones(N-1) * v
# definition of the tight-binding Hamiltonian (tridiagonal)
H = np.diag(E) + np.diag(V, k=1) + np.diag(V, k=-1)
# corner elements of the Hamiltonian
H[0, -1] = v
H[-1, 0] = v
# time array
time_array = np.arange(0, T, dt)
# site array
site_array = np.arange(N)
# initial state
psi_0 = np.zeros((N), dtype=complex)
psi_0[l] = 1. + 0.j
#initialization of the state array
Psi = np.zeros((len(time_array), N), dtype=complex)
Psi[0,:] = psi_0
# replicas 1D array
replicas = np.arange(0, num_replicas)
# random 3D array of Wiener increments (time, replica, site)
dW = np.random.normal(mu, 1.0, (len(time_array), num_replicas, N)) * np.sqrt(dt)
def Euler(replica):
    psi_0 = np.zeros((N), dtype=complex)
    psi_0[l] = 1. + 0.j
    psi = psi_0
    for i in np.arange(1, len(time_array)):
        psi += -1.j * (H @ psi) * dt - 1.j * sigma * psi * dW[i,replica,:] - 0.5 * (sigma**2) * psi * dt
        psi /= np.sqrt(psi @ np.conj(psi))
        Psi[i,:] = psi
    return Psi
pool = Pool(processes)
Psi = pool.map(Euler, replicas)
Psi = np.asarray(Psi)
Psi = np.swapaxes(Psi,0,1)
print(Psi)
Empirically I found that if num_replicas > 4 * processes (as passed to the Pool), two processes seem to take the same argument, as if the same calculation were repeated twice. With num_replicas <= 4 * processes I get the expected result: each result is different from the others.
This is not due to the generation of the random matrix dW, since its rows are uncorrelated, so I ascribe this behavior to my use of multiprocessing.Pool.

I think you should initialize psi_0 and Psi inside the Euler function.
I tried to reproduce your results and, indeed, I found that when num_replicas > 4 * processes you get the same results from multiple processes. But I think this is because Psi, in your case, is a global variable.
Modifying the code as follows gives different results for each replica (by the way, why do you use site_array? It is not used anywhere):
import numpy as np
from multiprocessing import Pool
# parameters
N = 3 # number of sites
T = 1 # total time
dt = 0.1 # time step
l = 0 # initially localized state on site l
e = 0.0 # site energy
v = 1.0 # hopping coefficient
mu, sigma = 0, 1.0 # average and variance of the gaussian distribution
num_replicas = 10 # number of replicas of the system
processes=2 # number of processes
# identity vector which represents the diagonal of the Hamiltonian
E = np.ones(N) * e
# vector which represents the upper/lower diagonal terms of the Hopping Matrix and the Hamiltonian
V = np.ones(N-1) * v
# definition of the tight-binding Hamiltonian (tridiagonal)
H = np.diag(E) + np.diag(V, k=1) + np.diag(V, k=-1)
# corner elements of the Hamiltonian
H[0, -1] = v
H[-1, 0] = v
# time array
time_array = np.arange(0, T, dt)
## site array
#site_array = np.arange(N)
# replicas 1D array
replicas = np.arange(0, num_replicas)
# random 3D array of Wiener increments (time, replica, site)
dW = np.random.normal(mu, 1.0, (len(time_array), num_replicas, N)) * np.sqrt(dt)
def Euler(replica):
    # initial state
    psi_0 = np.zeros((N), dtype=complex)
    psi_0[l] = 1. + 0.j
    # initialization of the state array, now local to each process
    Psi = np.zeros((len(time_array), N), dtype=complex)
    Psi[0,:] = psi_0
    psi = psi_0
    # print(dW[0,replica,0])
    for i in np.arange(1, len(time_array)):
        psi += -1.j * (H @ psi) * dt - 1.j * sigma * psi * dW[i,replica,:] - 0.5 * (sigma**2) * psi * dt
        psi /= np.sqrt(psi @ np.conj(psi))
        Psi[i,:] = psi
    return Psi
pool = Pool(processes)
Psi = pool.map(Euler, replicas)
Psi = np.asarray(Psi)
Psi = np.swapaxes(Psi,0,1)
print(Psi)

As @Fabrizio pointed out, Psi is shared between invocations of Euler. This is generally a bad thing to do and another example of why it's a bad idea to have "global mutable state": it's just too easy for things to break in unexpected ways.
The reason it fails in this case is subtle and due to the way Pool.map accumulates several results in each process before pickling them and returning them to the parent/controlling process. You can see this by setting the chunksize parameter of map to 1, causing it to return results immediately and hence not overwrite intermediate results.
it's equivalent to the following minimal working example:
from multiprocessing import Pool

arr = [None]

def mutate_global(i):
    arr[0] = i
    return arr

with Pool(2) as pool:
    out = pool.map(mutate_global, range(10), chunksize=5)

print(out)
the last time I ran this I got:
[[4], [4], [4], [4], [4], [9], [9], [9], [9], [9]]
you can change the chunksize parameter to get an idea of what it's doing, or maybe run with the following version:
def mutate_local(i):
    arr = [None]
    arr[0] = i
    return arr
which "just works", and is equivalent to doing what @Fabrizio describes, where you create Psi inside Euler rather than using a single global variable.
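Following the same reasoning, the global-state version above should behave with chunksize=1, since every result is pickled before the worker mutates arr again; a minimal sketch (the printed output is what the pickling order predicts, not a recorded run):

from multiprocessing import Pool

arr = [None]

def mutate_global(i):
    arr[0] = i
    return arr

if __name__ == "__main__":
    with Pool(2) as pool:
        # chunksize=1: each task is its own chunk, so each result is
        # serialized immediately after the call returns
        out = pool.map(mutate_global, range(10), chunksize=1)
    print(out)   # expected: [[0], [1], [2], ..., [9]]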


making python code faster using numba, cache or any other optimization

I have defined the following function
def laplacian_2D_array(func_2D):
    func_2nd_derv_x_fin2D = np.zeros((N,N))
    for j in range(0,N):
        func_CD2_x_list = []
        for i in range(0,N):
            value = func_2D[i][j]  # func_2D is an N x N matrix
            func_CD2_x_list.append(value)
        func_CD2_x_array = np.array(func_CD2_x_list)[np.newaxis]
        func_2nd_derv_x = matrix_R @ (np.transpose(func_CD2_x_array))
        func_2nd_derv_x_fin2D[j] = np.transpose(np.reshape(func_2nd_derv_x,[N,]))
    func_2nd_derv_y_fin2D = np.zeros((N,N))
    for i in range(0,N):
        func_CD2_y_list = []
        for j in range(0,N):
            value = func_2D[i][j]
            func_CD2_y_list.append(value)
        func_CD2_y_array = np.array(func_CD2_y_list)[np.newaxis]
        func_2nd_derv_y = matrix_S @ (np.transpose(func_CD2_y_array))
        func_2nd_derv_y_fin2D[i] = (np.reshape(func_2nd_derv_y,[N,]))
    return (np.add(func_2nd_derv_x_fin2D, func_2nd_derv_y_fin2D))
In the above code, each column j of the 2D matrix func_2D is extracted as a column vector, multiplied by matrix_R and stored in the j-th row of func_2nd_derv_x_fin2D; the second block of code does the analogous row-wise operation with matrix_S, and finally the function returns the sum of func_2nd_derv_x_fin2D and func_2nd_derv_y_fin2D.
Here N = 401, and matrix_S and matrix_R are also N x N matrices. This function is called multiple times in a while loop, and the execution of a single iteration takes a lot of time. I have tried @njit to make it faster, but I was not successful and got errors. I have also tried using cache.
How can we optimize the lists and arrays in this, and what are other ways to optimize the defined function?
I am showing the code where the function is used.
while (time < timemax):
    # Analytical Solution --------------------------------------------------
    exact_time = time/a_sec
    omega_t = np.zeros((N,N))
    psi_t = np.zeros((N,N))
    for i in range(0,N):
        for j in range(0,N):
            psi_t[i][j] = np.sin(x_list[i]) * np.sin(y_list[j]) * np.exp((-2*exact_time)/Re)
            omega_t[i][j] = 2*np.sin(x_list[i]) * np.sin(y_list[j]) * np.exp((-2*exact_time)/Re)
    # ...
    # BiCGSTAB algo
    x0 = psi_0  # initial guess --> psi of previous time step
    r0 = omega_0 - laplacian_2D_array(x0)  # r0 = b - A x0
    r0_hat = r0
    rho_0 = 1
    alpha = 1
    w0 = 1
    v0 = np.zeros((N,N))
    P_0 = np.zeros((N,N))
    tol = 10 ** (-7)
    iteration = 0
    while ((np.max(np.abs(laplacian_2D_array(x0) - omega_0))) < tol):
        rho_prev = rho_0
        rho = np.dot((np.reshape(r0_hat,(N**2,1))),(np.reshape(r0,(N**2,1))))
        beta = (rho/rho_prev) * (alpha/w0)
        P_0 = r0 + beta * (P_0 - w0 * v0)
        v0 = laplacian_2D_array(P_0)
        alpha = rho / np.dot((np.reshape(r0_hat,(N**2,1))),(np.reshape(v0,(N**2,1))))
        s = r0 - alpha * v0
        t = laplacian_2D_array(s)
Any suggestions to speed up the code are highly appreciated.
It turns out the complicated code of laplacian_2D_array can be simplified as the following implementation:
def laplacian_2D_array(func_2D):
    return (matrix_R @ func_2D + matrix_S @ func_2D.T).T
This is 63 times faster on my machine on random matrices based on your provided inputs. Most of the time should be spent in the matrix multiplication (performed very efficiently in parallel if the installation of Numpy/Python/BLAS is done correctly on the target platform).
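As a quick sanity check, a hedged sketch with random inputs (F stands in for the question's func_2D; the loop version is condensed to its effective row/column operations):

import numpy as np

N = 401
rng = np.random.default_rng(0)
matrix_R, matrix_S, F = (rng.random((N, N)) for _ in range(3))

# condensed from the question's loops: row j of the x-part is matrix_R @ F[:, j],
# row i of the y-part is matrix_S @ F[i, :]
slow = np.zeros((N, N))
for j in range(N):
    slow[j] += matrix_R @ F[:, j]
for i in range(N):
    slow[i] += matrix_S @ F[i, :]

fast = (matrix_R @ F + matrix_S @ F.T).T
print(np.allclose(slow, fast))   # expected: True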

Using python built-in functions for coupled ODEs

THIS PART IS JUST BACKGROUND IF YOU NEED IT
I am developing a numerical solver for the Second-Order Kuramoto Model. The functions I use to find the derivatives of theta and omega are given below.
# n-dimensional change in theta
def d_theta(omega):
    return omega

# n-dimensional change in omega
def d_omega(K,A,P,alpha,mask,n):
    def layer1(theta,omega):
        T = theta[:,None] - theta
        A[mask] = K[mask] * np.sin(T[mask])
        return - alpha*omega + P - A.sum(1)
    return layer1
These equations return vectors.
QUESTION 1
I know how to use odeint for two dimensions, (y,t). For my research I want to use a built-in Python function that works for higher dimensions.
QUESTION 2
I do not necessarily want to stop after a predetermined amount of time. I have other stopping conditions in the code below that will indicate whether the system of equations converges to the steady state. How do I incorporate these into a built-in Python solver?
WHAT I CURRENTLY HAVE
This is the code I am currently using to solve the system. I just implemented RK4 with constant time stepping in a loop.
# This function randomly samples initial values in the domain and returns whether the solution converged
# Inputs:
# f change in theta (d_theta)
# g change in omega (d_omega)
# tol when step size is lower than tolerance, the solution is said to converge
# h size of the time step
# max_iter maximum number of steps Runge-Kutta will perform before giving up
# max_laps maximum number of laps the solution can do before giving up
# fixed_t vector of fixed points of theta
# fixed_o vector of fixed points of omega
# n number of dimensions
# theta initial theta vector
# omega initial omega vector
# Outputs:
#   converges   true if the nodes restabilize, false otherwise
def kuramoto_rk4_wss(f,g,tol_ss,tol_step,h,max_iter,max_laps,fixed_o,fixed_t,n):
    def layer1(theta,omega):
        lap = np.zeros(n, dtype = int)
        converges = False
        i = 0
        tau = 2 * np.pi
        while(i < max_iter): # perform RK4 with constant time step
            p_omega = omega
            p_theta = theta
            T1 = h*f(omega)
            O1 = h*g(theta,omega)
            T2 = h*f(omega + O1/2)
            O2 = h*g(theta + T1/2,omega + O1/2)
            T3 = h*f(omega + O2/2)
            O3 = h*g(theta + T2/2,omega + O2/2)
            T4 = h*f(omega + O3)
            O4 = h*g(theta + T3,omega + O3)
            theta = theta + (T1 + 2*T2 + 2*T3 + T4)/6 # take theta time step
            mask2 = np.array(np.where(np.logical_or(theta > tau, theta < 0))) # find which nodes left [0, 2pi]
            lap[mask2] = lap[mask2] + 1 # increment the lap counter
            theta[mask2] = np.mod(theta[mask2], tau) # take the modulus
            omega = omega + (O1 + 2*O2 + 2*O3 + O4)/6
            if(max_laps in lap): # if any generator rotates this many times it probably won't converge
                break
            elif(np.any(omega > 12)): # if any generator is rotating this fast, it probably won't converge
                break
            elif(np.linalg.norm(omega) < tol_ss and # nodes sufficiently close to the equilibrium
                 np.linalg.norm(omega - p_omega) < tol_step and # change in omega is small
                 np.linalg.norm(theta - p_theta) < tol_step): # change in theta is small
                converges = True
                break
            i = i + 1
        return converges
    return layer1
Thanks for your help!
You can wrap your existing functions into a function accepted by odeint (with option tfirst=True) and solve_ivp as

import numpy as np
from scipy.integrate import odeint, solve_ivp

def odesys(t, u):
    theta, omega = u[:n], u[n:]   # or theta, omega = u.reshape(2,-1)
    return [*f(omega), *g(theta, omega)]   # or np.concatenate([f(omega), g(theta,omega)])

u0 = [*theta0, *omega0]
t = np.linspace(t0, tf, timesteps+1)
u = odeint(odesys, u0, t, tfirst=True)
# or
res = solve_ivp(odesys, [t0, tf], u0, t_eval=t)

The scipy methods pass numpy arrays and convert the return value into the same, so you do not have to care about that in the ODE function. The variant in the comments uses explicit numpy functions.
While solve_ivp does have event handling, using it for a systematic collection of events is rather cumbersome. It would be easier to advance some fixed step, do the normalization and termination detection, and then repeat this.
If you want to later increase efficiency somewhat, use directly the stepper classes behind solve_ivp.
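For illustration, a minimal sketch of that advance-normalize-check loop (odesys and u0 as above; n, max_iter, tol_ss and tol_step as in the question; dt is an assumed step size):

import numpy as np
from scipy.integrate import solve_ivp

t, u, dt = 0.0, np.asarray(u0, dtype=float), 0.1
for _ in range(max_iter):
    res = solve_ivp(odesys, [t, t + dt], u)
    u_new = res.y[:, -1]
    # check termination before normalizing, so the wrap-around of theta
    # does not inflate the step-difference norm
    if (np.linalg.norm(u_new[n:]) < tol_ss and
            np.linalg.norm(u_new - u) < tol_step):
        break
    u = u_new
    u[:n] = np.mod(u[:n], 2*np.pi)   # normalize theta into [0, 2*pi)
    t += dt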

Translating from Matlab: New problems (complex Ginzburg-Landau equation)

Thank you for all of your constructive criticism on my last post. I have made some changes, but alas my code is still not working and I can't figure out why. When I run this version, I get a runtime warning about invalid values encountered in matmul.
My code is given as
from __future__ import division
import numpy as np
from scipy.linalg import eig
from scipy.linalg import toeplitz
def poldif(*arg):
    """
    Calculate differentiation matrices on arbitrary nodes.

    Returns the differentiation matrices D1, D2, .. DM corresponding to the
    M-th derivative of the function f at arbitrarily specified nodes. The
    differentiation matrices can be computed with unit weights or
    with specified weights.

    Parameters
    ----------
    x : ndarray
        vector of N distinct nodes
    M : int
        maximum order of the derivative, 0 < M <= N - 1

    OR (when computing with specified weights)

    x : ndarray
        vector of N distinct nodes
    alpha : ndarray
        vector of weight values alpha(x), evaluated at x = x_j
    B : ndarray
        matrix of size M x N, where M is the highest derivative required.
        It should contain the quantities B[l,j] = beta_{l,j} =
        l-th derivative of log(alpha(x)), evaluated at x = x_j.

    Returns
    -------
    DM : ndarray
        M x N x N array of differentiation matrices

    Notes
    -----
    This function returns M differentiation matrices corresponding to the
    1st, 2nd, ... M-th derivatives on arbitrary nodes specified in the array
    x. The nodes must be distinct but are, otherwise, arbitrary. The
    matrices are constructed by differentiating the N-th order Lagrange
    interpolating polynomial that passes through the specified points.

    The M-th derivative of the grid function f is obtained by the matrix-
    vector multiplication

    .. math::

    f^{(m)}_i = D^{(m)}_{ij}f_j

    This function is based on code by Rex Fuzzle
    https://github.com/RexFuzzle/Python-Library

    References
    ----------
    ..[1] B. Fornberg, Generation of Finite Difference Formulas on Arbitrarily
    Spaced Grids, Mathematics of Computation 51, no. 184 (1988): 699-706.
    ..[2] J. A. C. Weideman and S. C. Reddy, A MATLAB Differentiation Matrix
    Suite, ACM Transactions on Mathematical Software, 26, (2000): 465-519
    """
    if len(arg) > 3:
        raise Exception('number of arguments is either two OR three')
    if len(arg) == 2:
        # unit weight function : arguments are nodes and derivative order
        x, M = arg[0], arg[1]
        N = np.size(x)
        # assert M < N, "Derivative order cannot be larger or equal to number of points"
        if M >= N:
            raise Exception("Derivative order cannot be larger or equal to number of points")
        alpha = np.ones(N)
        B = np.zeros((M, N))
    elif len(arg) == 3:
        # specified weight function : arguments are nodes, weights and B matrix
        x, alpha, B = arg[0], arg[1], arg[2]
        N = np.size(x)
        M = B.shape[0]

    I = np.eye(N)                        # identity matrix
    L = np.logical_or(I, np.zeros(N))    # logical identity matrix
    XX = np.transpose(np.array([x, ] * N))
    DX = XX - np.transpose(XX)           # DX contains entries x(k)-x(j)
    DX[L] = np.ones(N)                   # put 1's on the main diagonal
    c = alpha * np.prod(DX, 1)           # quantities c(j)
    C = np.transpose(np.array([c, ] * N))
    C = C / np.transpose(C)              # matrix with entries c(k)/c(j)
    Z = 1 / DX                           # Z contains entries 1/(x(k)-x(j))
    Z[L] = 0                             # with zeros on the diagonal
    X = np.transpose(np.copy(Z))         # X is same as Z', but with ...
    Xnew = X
    for i in range(0, N):
        Xnew[i:N - 1, i] = X[i + 1:N, i]
    X = Xnew[0:N - 1, :]                 # ... diagonal entries removed
    Y = np.ones([N - 1, N])              # initialize Y and D matrices
    D = np.eye(N)                        # Y is matrix of cumulative sums
    DM = np.empty((M, N, N))             # differentiation matrices
    for ell in range(1, M + 1):
        Y = np.cumsum(np.vstack((B[ell - 1, :], ell * (Y[0:N - 1, :]) * X)), 0)  # diagonals
        D = ell * Z * (C * np.transpose(np.tile(np.diag(D), (N, 1))) - D)        # off-diagonals
        D[L] = Y[N - 1, :]
        DM[ell - 1, :, :] = D
    return DM
def herdif(N, M, b=1):
    """
    Calculate differentiation matrices using Hermite collocation.

    Returns the differentiation matrices D1, D2, .. DM corresponding to the
    M-th derivative of the function f, at the N Hermite nodes.

    Parameters
    ----------
    N : int
        number of grid points
    M : int
        maximum order of the derivative, 0 < M < N
    b : float, optional
        scale parameter, real and positive

    Returns
    -------
    x : ndarray
        N x 1 array of Hermite nodes which are zeros of the N-th degree
        Hermite polynomial, scaled by b
    DM : ndarray
        M x N x N array of differentiation matrices

    Notes
    -----
    This function returns M differentiation matrices corresponding to the
    1st, 2nd, ... M-th derivatives on a Hermite grid of N points. The
    matrices are constructed by differentiating N-th order Hermite
    interpolants.

    The M-th derivative of the grid function f is obtained by the matrix-
    vector multiplication

    .. math::

    f^{(m)}_i = D^{(m)}_{ij}f_j

    References
    ----------
    ..[1] B. Fornberg, Generation of Finite Difference Formulas on Arbitrarily
    Spaced Grids, Mathematics of Computation 51, no. 184 (1988): 699-706.
    ..[2] J. A. C. Weideman and S. C. Reddy, A MATLAB Differentiation Matrix
    Suite, ACM Transactions on Mathematical Software, 26, (2000): 465-519
    ..[3] R. Baltensperger and M. R. Trummer, Spectral Differencing With A
    Twist, SIAM Journal on Scientific Computing 24, (2002): 1465-1487
    """
    if M >= N - 1:
        raise Exception('number of nodes must be greater than M - 1')
    if M <= 0:
        raise Exception('derivative order must be at least 1')

    x = herroots(N)              # compute Hermite nodes
    alpha = np.exp(-x * x / 2)   # compute Hermite weights
    beta = np.zeros([M + 1, N])
    # construct beta(l,j) = d^l/dx^l (alpha(x)/alpha'(x))|x=x_j recursively
    beta[0, :] = np.ones(N)
    beta[1, :] = -x
    for ell in range(2, M + 1):
        beta[ell, :] = -x * beta[ell - 1, :] - (ell - 1) * beta[ell - 2, :]
    # remove initialising row from beta
    beta = np.delete(beta, 0, 0)
    # compute differentiation matrix (b=1)
    DM = poldif(x, alpha, beta)
    # scale nodes by the factor b
    x = x / b
    # scale the matrix by the factor b
    for ell in range(M):
        DM[ell, :, :] = (b ** (ell + 1)) * DM[ell, :, :]
    return x, DM
def herroots(N):
    """
    Compute the roots of the Hermite polynomial of degree N.

    Parameters
    ----------
    N : int
        degree of the Hermite polynomial

    Returns
    -------
    x : ndarray
        N x 1 array of Hermite roots
    """
    # Jacobi matrix
    d = np.sqrt(np.arange(1, N))
    J = np.diag(d, 1) + np.diag(d, -1)
    # compute eigenvalues
    mu = eig(J)[0]
    # return sorted, normalised eigenvalues
    # real part only since all roots must be real
    return np.real(np.sort(mu) / np.sqrt(2))
a = 1 - 1j
b = 2 + 0.2j
c1 = 0.34
c2 = 0.005
alpha1 = (4*c2/a)**0.25
alpha2 = b/2*a
Nx = 220
# hermite differentiation matrices
x, D = herdif(Nx, 2, np.real(alpha1))
D1 = D[0, :]
D2 = D[1, :]
# integration weights
diff = np.diff(x)
#print(len(diff))
p = np.concatenate([np.zeros(1), diff])
q = np.concatenate([diff, np.zeros(1)])
w = (p + q)/2
Q = np.diag(w)
# discretised operator
const = c1*np.diag(np.ones(len(x))) - c2*(np.diag(x)*np.diag(x))
#print(const)
A = a*D2 - b*D1 + const
##### Timestepping
tmax = 200
tmin = 0
dt = 1
n = int((tmax - tmin)/dt)   # linspace needs an integer count
tvec = np.linspace(0, tmax, n, endpoint=True)
#print(len(tvec))
q = np.zeros((Nx, len(tvec)), dtype=complex)
f = np.zeros((Nx, len(tvec)), dtype=complex)
q0 = np.ones(Nx)*10**4
q[:, 0] = q0
#print(q[:,0])
#print(q0)
# qnew - qold = dt*A*qold + dt*N(qold,qold,qold)
# qnew - qold = dt*A*qnew - dt*N(qold,qold,qold)
# therefore qnew - qold = 0.5*dt*A*qold + 0.5*dt*A*qnew + dt*N(qold,qold,qold)
# rearranging gives qnew*(1 - 0.5*A*dt) = (1 + 0.5*A*dt)*qold + dt*N(qold,qold,qold)
from numpy.linalg import inv
inverted = inv(np.eye(Nx) - 0.5*A*dt)
forqold = (np.eye(Nx) + 0.5*A*dt)
firstterm = np.matmul(inverted, forqold)
for t in range(0, len(tvec)-1):
    nl = abs(np.square(q[:, t]))*q[:, t]
    q[:, t+1] = np.matmul(firstterm, q[:, t]) - dt*np.matmul(inverted, nl)
where the Hermite differentiation matrices can be found online and are in a different file. This code blows up after five iterations, which I cannot understand, as I don't see how it differs from the MATLAB code found here: https://www.bagherigroup.com/research/open-source-codes/
I would really appreciate any help.
Error in:
q[:,t+1] = inverted*forqold*np.array(q[:,t]) + inverted*dt*np.array(nl)
q[:, t+1] indexes a 2D array (probably not a np.matrix, which is more MATLAB-like). This indexing reduces the number of dimensions by 1, hence the (220,) shape in the error message.
The error message says the RHS is (220,220). That shape probably comes from inverted and forqold. np.array(q[:,t]) is 1D. Multiplying a (220,220) by a (220,) is OK, but you can't put that square array into a 1D slot.
Both uses of np.array in the error line are superfluous; their arguments are already ndarray.
As for the loop, it may be necessary. It looks like q[:,t+1] is a function of q[:,t], a serial rather than parallel operation. Those are harder to render as 'vectorized' (unless you can use cumsum-like operations).
Note that in numpy * is elementwise multiplication, the .* of MATLAB. np.dot and @ are used for matrix multiplication.
q[:,t+1] = inverted @ q[:,t]
would work.
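To make the * vs @ distinction concrete, a tiny sketch with hypothetical values:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
v = np.array([1.0, 10.0])

print(A * v)   # elementwise with broadcasting (MATLAB .*): [[1, 20], [3, 40]]
print(A @ v)   # matrix-vector product (MATLAB *): [21, 43]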

How to simulate a reaction with an order < 1 in pyomo?

I am simulating a chemical reaction of the form A --> B --> C using a chemical batch reactor model. The corresponding ODE system is as follows:

dcA/dt = - kA * cA(t) ** nA1
dcB/dt = kA * cA(t) ** nA1 - kB * cB(t) ** nB2
dcC/dt = kB * cB(t) ** nB2

Pyomo solves the ODE system fine if the exponents nA1 and nB2 are 1 or higher. But in my case they are below 1, and as the concentrations approach zero the ODE integration fails, producing only NaNs. The reason is that once the concentrations approach zero they numerically become values like cA(t) = -1e-20, and then the expression cA(t)**nA1 is no longer solvable.
I tried to implement a workaround of the form:
if cA < 0:
    R1 = 0
else:
    R1 = kA * cA(t) ** nA1

but I wasn't able to do it properly, as I had a hard time with the Pyomo syntax.
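For reference, Pyomo ships a conditional expression, Expr_if, which can state this branch declaratively; whether the scipy-based Simulator accepts such expressions is an assumption I have not verified:

from pyomo.environ import Expr_if

# hedged sketch: clamp the rate to zero for non-positive concentrations,
# with m, kA and nA1 as in the model below
R1 = lambda m, t: Expr_if(IF=m.cA[t] <= 0, THEN=0.0,
                          ELSE=kA * m.cA[t] ** nA1)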
This is the minimal working example:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from pyomo.environ import *
from pyomo.dae import *

V = 40       # l
kA = 0.5     # 1/min
kB = 0.1     # 1/min
nA1 = 0.5
nB2 = 0.5
cAf = 2.0    # mol/l

def batch_plot(t, y):
    plt.plot(t, y[:, 0], label = "cA")
    plt.plot(t, y[:, 1], label = "cB")
    plt.plot(t, y[:, 2], label = "cC")
    plt.legend()

def batch():
    m = ConcreteModel()
    m.t = ContinuousSet(bounds = (0, 500))
    m.cA = Var(m.t, domain = NonNegativeReals)
    m.cB = Var(m.t, domain = NonNegativeReals)
    m.cC = Var(m.t, domain = NonNegativeReals)
    m.dcA = DerivativeVar(m.cA, wrt = m.t)
    m.dcB = DerivativeVar(m.cB, wrt = m.t)
    m.dcC = DerivativeVar(m.cC, wrt = m.t)
    m.cA[0] = cAf
    m.cB[0] = 0
    m.cC[0] = 0
    R1 = lambda m, t: kA * m.cA[t] ** nA1
    R2 = lambda m, t: kB * m.cB[t] ** nB2
    m.odeA = Constraint(m.t, rule = lambda m, t: m.dcA[t] == - R1(m, t))
    m.odeB = Constraint(m.t, rule = lambda m, t: m.dcB[t] == R1(m, t) - R2(m, t))
    m.odeC = Constraint(m.t, rule = lambda m, t: m.dcC[t] == R2(m, t))
    return m

tsim, profiles = Simulator(batch(), package = "scipy").simulate(numpoints = 100)
batch_plot(tsim, profiles)
I expect the ODE integration to work even with reaction orders below 1.
Does anybody have an idea how to achieve this?
There are two aims in modifying the power function x^n:

- extend it to negative x in a smooth way, so that the numerical method does not hiccup close to x=0, and
- have a small slope for small x, so that the numerical integration for very small x has a greater chance of being stable.

The first condition is satisfied by constructs like

x*max(eps,abs(x))^(n-1),
x*(eps+abs(x-eps))^(n-1), or
x*(eps^2+abs(x-eps)^2)^(0.5*(n-1)),

which all have exactly the value x^n for x > eps and are continuous and piecewise smooth. But the slope at x=0 is of size eps^(n-1), which will require very small step sizes even after the system stabilizes.
The solution is to extract even more integer power from the rational power, in the form
x*abs(x) * max(eps,abs(x))^(n-2)
or one of the other variants for the last factor. For 0 < x < eps and n=0.5 this results in the value r(x) = x^2 * eps^(-1.5), so that the equation x' = -k*r(x) has the solution x(t) = x1/(1 + x1*k*eps^(-1.5)*(t - t1)) after it falls to a point 0 < x1 < eps at t=t1. The slope of r at x=0 is zero, which is nice for numerical integrators.
This was implemented for scipy.integrate.solve_ivp, using method LSODA and rather strict tolerances, with the following ODE right-side function:

# your original function, stabilizes at negative values
power0 = lambda x, n: max(0, x) ** n

# linear at x=0, small step sizes
def power1(x, n): eps = 1e-4; return x * max(eps, abs(x)) ** (n - 1)
def power2(x, n): eps = 1e-4; return x * (eps**2 + (x - eps)**2) ** (0.5*(n - 1))

# quadratic at x=0, large step sizes on the tail
eps = 1e-8
power3 = lambda x, n: x * abs(x) * max(eps, abs(x)) ** (n - 2)
power4 = lambda x, n: x * abs(x) * (eps**2 + (x - eps)**2) ** (0.5*n - 1)

# select the power approximation used
power = power3

def model(t, u):
    cA, cB, cC = u
    R1 = kA * power(cA, nA1)
    R2 = kB * power(cB, nB2)
    return [-R1, R1 - R2, R2]
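For completeness, a sketch of the driver call described above (parameter values from the question; the exact tolerances are my assumption):

from scipy.integrate import solve_ivp

kA, kB, nA1, nB2, cAf = 0.5, 0.1, 0.5, 0.5, 2.0   # values from the question

res = solve_ivp(model, (0, 500), [cAf, 0.0, 0.0], method="LSODA",
                rtol=1e-10, atol=1e-12)
print(res.message)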
The integration runs successfully, using step sizes of 20-30 in the tail end. The resulting plot looks qualitatively correct, and in the zoom for small values it is smooth and remains positive.

Efficiently compute n-body gravitation in python

I am trying to compute the accelerations due to gravity for an n-body problem in 3-space (I'm using symplectic Euler).
I have position and velocity vectors for each time step, and am using the below (working) code to calculate accelerations and update velocity and position. Note that the accelerations are vectors in 3-space, not just magnitudes.
I would like to know if there's a more efficient way to compute this with numpy to avoid the loops.
def accelerations(positions, masses):
    '''Params:
    - positions: numpy array of size (n,3)
    - masses: numpy array of size (n,)
    Returns:
    - accelerations: numpy array of size (n,3), the acceleration vectors in 3-space
    '''
    n_bodies = len(masses)
    accelerations = numpy.zeros([n_bodies,3]) # n_bodies * (x,y,z)
    # vectors from mass(i) to mass(j)
    D = numpy.zeros([n_bodies,n_bodies,3]) # n_bodies * n_bodies * (x,y,z)
    for i, j in itertools.product(range(n_bodies), range(n_bodies)):
        D[i][j] = positions[j]-positions[i]
    # Acceleration due to gravitational force between each pair of bodies
    A = numpy.zeros((n_bodies, n_bodies,3))
    for i, j in itertools.product(range(n_bodies), range(n_bodies)):
        if numpy.linalg.norm(D[i][j]) > epsilon:
            A[i][j] = gravitational_constant * masses[j] * D[i][j] \
                      / numpy.linalg.norm(D[i][j])**3
    # Calculate net acceleration of each body (vectors in 3-space)
    accelerations = numpy.sum(A, axis=1) # sum of accel vectors for each body, shape (n_bodies,3)
    return accelerations
Here is an optimized version using blas. blas has special routines for linear algebra on symmetric or Hermitian matrices. These use specialized, packed storage, keeping only the upper or lower triangle and leaving out the (redundant) mirrored entries. That way blas saves not only ~half the storage but also ~half the flops.
I've put quite a few comments to make it readable.
import numpy as np
import itertools
from scipy.linalg.blas import zhpr, dspr2, zhpmv

def acc_vect(pos, mas):
    n = mas.size
    # squared distances via |p_i - p_j|^2 = |p_i|^2 + |p_j|^2 - 2 p_i.p_j
    d2 = pos @ (-2*pos.T)
    diag = -0.5 * np.einsum('ii->i', d2)   # diag holds |p_i|^2
    d2 += diag + diag[:, None]
    # set the diagonal to 1 to avoid division by zero (self-interaction)
    np.einsum('ii->i', d2)[...] = 1
    return np.nansum((pos[:, None, :] - pos) * (mas[:, None] * d2**-1.5)[..., None], axis=0)

def acc_blas(pos, mas):
    n = mas.size
    # trick: use complex Hermitian to get the packed anti-symmetric
    # outer difference in the imaginary part of the zhpr answer
    # don't want to sum over dimensions yet, therefore must do them one-by-one
    trck = np.zeros((3, n * (n + 1) // 2), complex)
    for a, p in zip(trck, pos.T - 1j):
        zhpr(n, -2, p, a, 1, 0, 0, 1)
        # does a -> a + alpha x x^H
        # parameters: n -- matrix dimension
        #             alpha -- real scalar
        #             x -- complex vector
        #             ap -- packed Hermitian n x n matrix a
        #                   i.e. an n(n+1)/2 vector
        #             incx -- x stride
        #             offx -- x offset
        #             lower -- is storage of ap lower or upper
        #             overwrite_ap -- whether to change a inplace
    # as a by-product we get pos pos^T:
    ppT = trck.real.sum(0) + 6
    # now compute matrix of squared distances ...
    # ... using (A-B)^2 = A^2 + B^2 - 2AB
    # ... that and the outer sum X (+) X.T equals X ones^T + ones X^T
    dspr2(n, -0.5, ppT[np.r_[0, 2:n+1].cumsum()], np.ones((n,)), ppT,
          1, 0, 1, 0, 0, 1)
    # does a -> a + alpha x y^T + alpha y x^T in packed symmetric storage
    # scale anti-symmetric differences by distance^-3
    np.divide(trck.imag, ppT*np.sqrt(ppT), where=ppT.astype(bool),
              out=trck.imag)
    # it remains to scale by mass and sum
    # this can be done by matrix multiplication with the vector of masses ...
    # ... unfortunately because we need anti-symmetry we need to work
    # with Hermitian storage, i.e. complex numbers, even though the actual
    # computation is only real:
    out = np.zeros((3, n), complex)
    for a, o in zip(trck, out):
        zhpmv(n, 0.5, a, mas*-1j, 1, 0, 0, o, 1, 0, 0, 1)
        # multiplies packed Hermitian matrix by vector
    return out.real.T
def accelerations(positions, masses, epsilon=1e-6, gravitational_constant=1.0):
    '''Params:
    - positions: numpy array of size (n,3)
    - masses: numpy array of size (n,)
    '''
    n_bodies = len(masses)
    accelerations = np.zeros([n_bodies,3]) # n_bodies * (x,y,z)
    # vectors from mass(i) to mass(j)
    D = np.zeros([n_bodies,n_bodies,3]) # n_bodies * n_bodies * (x,y,z)
    for i, j in itertools.product(range(n_bodies), range(n_bodies)):
        D[i][j] = positions[j]-positions[i]
    # Acceleration due to gravitational force between each pair of bodies
    A = np.zeros((n_bodies, n_bodies,3))
    for i, j in itertools.product(range(n_bodies), range(n_bodies)):
        if np.linalg.norm(D[i][j]) > epsilon:
            A[i][j] = gravitational_constant * masses[j] * D[i][j] \
                      / np.linalg.norm(D[i][j])**3
    # Calculate net acceleration of each body
    accelerations = np.sum(A, axis=1) # sum of accel vectors for each body
    return accelerations
from numpy.linalg import norm

def acc_pm(positions, masses, G=1):
    '''Params:
    - positions: numpy array of size (n,3)
    - masses: numpy array of size (n,)
    '''
    mass_matrix = masses.reshape((1, -1, 1))*masses.reshape((-1, 1, 1))
    disps = positions.reshape((1, -1, 3)) - positions.reshape((-1, 1, 3)) # displacements
    dists = norm(disps, axis=2)
    dists[dists == 0] = 1 # avoid divide-by-zero warnings
    forces = G*disps*mass_matrix/np.expand_dims(dists, 2)**3
    return forces.sum(axis=1)/masses.reshape(-1, 1)
n = 500
pos = np.random.random((n, 3))
mas = np.random.random((n,))
from timeit import timeit
print(f"loops: {timeit('accelerations(pos, mas)', globals=globals(), number=1)*1000:10.3f} ms")
print(f"pmende: {timeit('acc_pm(pos, mas)', globals=globals(), number=10)*100:10.3f} ms")
print(f"vectorized: {timeit('acc_vect(pos, mas)', globals=globals(), number=10)*100:10.3f} ms")
print(f"blas: {timeit('acc_blas(pos, mas)', globals=globals(), number=10)*100:10.3f} ms")
A = accelerations(pos, mas)
AV = acc_vect(pos, mas)
AB = acc_blas(pos, mas)
AP = acc_pm(pos, mas)
assert np.allclose(A, AV) and np.allclose(AB, AV) and np.allclose(AP, AV)
Sample run, comparing the OP's loops, my pure numpy vectorization, and @P Mende's version:
loops: 3213.130 ms
pmende: 41.480 ms
vectorized: 43.860 ms
blas: 7.726 ms
We can see that:
1) P Mende's version is slightly better than mine at vectorizing;
2) blas is ~5 times as fast; note that my blas is not very good, and I suspect an optimized blas may do even better (numpy would be expected to run faster on a better blas, too);
3) any of the answers is much faster than the loops.
A follow up to my comments on your original post:
from numpy.linalg import norm

def accelerations(positions, masses):
    '''Params:
    - positions: numpy array of size (n,3)
    - masses: numpy array of size (n,)
    '''
    mass_matrix = masses.reshape((1, -1, 1))*masses.reshape((-1, 1, 1))
    disps = positions.reshape((1, -1, 3)) - positions.reshape((-1, 1, 3)) # displacements
    dists = norm(disps, axis=2)
    dists[dists == 0] = 1 # avoid divide-by-zero warnings
    forces = G*disps*mass_matrix/np.expand_dims(dists, 2)**3
    return forces.sum(axis=1)/masses.reshape(-1, 1)
Some things to consider:
- You only need half the distances; once you've calculated D[i][j], that's the same as -D[j][i] (see the sketch after this list).
- You can do df2 = df.apply(lambda x: gravitational_constant/x**3).
- You can generate a dataframe that records, for each pair of bodies, the product of their masses. You only have to do that once, and then you can pass it to accelerations every time you call it.
- Then df.product(df2).product(mass_products).sum().div(masses) gives you the accelerations.
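A hedged sketch of the first point, doing each pair once and mirroring via antisymmetry (acc_half is a hypothetical name; np.add.at performs the scatter-accumulation):

import numpy as np

def acc_half(positions, masses, G=1.0):
    n = len(masses)
    i, j = np.triu_indices(n, k=1)      # each unordered pair exactly once
    d = positions[j] - positions[i]     # D[i][j]; D[j][i] is just -d
    inv_r3 = np.linalg.norm(d, axis=1)**-3
    acc = np.zeros((n, 3))
    # body i feels G*m_j*d/r^3; body j feels the mirrored term
    np.add.at(acc, i,  G * (masses[j] * inv_r3)[:, None] * d)
    np.add.at(acc, j, -G * (masses[i] * inv_r3)[:, None] * d)
    return acc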
