I know that Python allows fast evaluation of a real-valued, one-variable function f(x) on a numpy array xarr = np.array([x0, x1, ..., xN]):
f(xarr) = np.array([f(x0), f(x1), ..., f(xN)])
However, this doesn't seem to work syntax-wise for a multivariable function. Say I have a real-valued function f(x, y), where x and y are two real numbers. Is there a correct syntax to evaluate the function on, say, [(0, 0), (0, 1), (1, 0), (1, 1)], avoiding a loop (which is always slow in Python)?
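To make the setup concrete, here is a minimal example of the one-variable case that does work (with a hypothetical f built from arithmetic operations):
import numpy as np

def f(x):
    return x**2 + 1

xarr = np.array([0.0, 1.0, 2.0])
print(f(xarr))  # [1. 2. 5.]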
Edit: below are the functions involved:
The 5-variable function I refer to is:
def chisqr(BigOmega, inc, taustar, Q0, U0):
    QU = QandU(nusdata, BigOmega, inc, taustar, Q0, U0)
    Q = QU[:, 0]
    U = QU[:, 1]
    return 1./(2.*N) * (np.sum(((Q - Qs)/errQs)**2.) + np.sum(((U - Us)/errUs)**2.))
where nusdata, Qs, and Us are arrays defined before calling the function. The function calls the following function:
def QandU(nu, BigOmega, inc, taustar, Q0, U0):
    lambdalong = nu + omega - np.pi/2.
    tau3 = taustar * ((1 + ecc*np.cos(nu))/(1 - ecc**2.))**gamma
    delQ = -tau3 * (1 + np.cos(inc)*np.cos(inc)) * (np.cos(2.*lambdalong) - np.sin(inc)*np.sin(inc))
    delU = -2*tau3*np.cos(inc)*np.sin(2*lambdalong)
    Q = Q0 + delQ*np.cos(BigOmega) - delU * np.sin(BigOmega)
    U = U0 + delQ*np.sin(BigOmega) + delU * np.cos(BigOmega)
    bounds = (inc < 0) or (inc > np.pi/2.) or (BigOmega < -2*np.pi) or (BigOmega > 2*np.pi) or (taustar < 0.) or (taustar > 1.)
    if bounds:
        Q = 10E10
        U = 10E10
    #return U
    return np.column_stack((Q, U))
All variables which aren't the function's arguments are defined outside the function.
Using a simple example:
def add_squares(x, y):
    return x*x + y*y

xs = np.array([0, 0, 1, 1])
ys = np.array([0, 1, 1, 0])
res = add_squares(xs, ys)
# res is now array([0, 1, 2, 1])
Your statement that f(xarr) = np.array([f(x0), f(x1), ..., f(xN)]) is not true in general; it depends entirely on the definition of f. If f consists only of arithmetic operations, it holds, but in general it does not.
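For example (a minimal sketch with hypothetical functions, to illustrate the point):
import numpy as np

def g_arith(x):
    return x**2 + 1   # arithmetic only: broadcasts element-wise

def g_branch(x):
    if x > 0:         # scalar control flow: fails on arrays
        return x
    return -x

xarr = np.array([-1.0, 2.0])
print(g_arith(xarr))                    # [2. 5.]
# g_branch(xarr) raises "ValueError: the truth value of an array ... is ambiguous"
print(np.where(xarr > 0, xarr, -xarr))  # vectorized branch: [1. 2.]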
Your QandU function should almost work as intended:
def QandU(nu, BigOmega, inc, taustar, Q0, U0):
    # left unchanged
    lambdalong = nu + omega - np.pi/2.
    tau3 = taustar * ((1 + ecc*np.cos(nu))/(1 - ecc**2.))**gamma
    delQ = -tau3 * (1 + np.cos(inc)*np.cos(inc)) * (np.cos(2.*lambdalong) - np.sin(inc)*np.sin(inc))
    delU = -2*tau3*np.cos(inc)*np.sin(2*lambdalong)
    Q = Q0 + delQ*np.cos(BigOmega) - delU * np.sin(BigOmega)
    U = U0 + delQ*np.sin(BigOmega) + delU * np.cos(BigOmega)
    # `or` doesn't vectorize; use bitwise or
    bounds = (inc < 0) | (inc > np.pi/2.) | (BigOmega < -2*np.pi) | (BigOmega > 2*np.pi) | (taustar < 0.) | (taustar > 1.)
    # if statements also don't vectorize
    Q[bounds] = 10E10
    U[bounds] = 10E10
    # stacking is more trouble than it's worth
    return Q, U
Your chisqr function will probably need an axis= argument passed to sum, depending on exactly which dimensions you want to sum over.
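For example (a sketch with hypothetical shapes, not taken from the question):
import numpy as np

Q = np.ones((5, 3))    # say, 5 parameter sets evaluated at 3 nu values
Qs = np.zeros(3)       # data arrays broadcast against Q
errQs = np.ones(3)
print(np.sum(((Q - Qs)/errQs)**2.))           # scalar: sums over all axes
print(np.sum(((Q - Qs)/errQs)**2., axis=1))   # shape (5,): one term per parameter set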
This is more of a computational physics problem; I've asked it on Physics Stack Exchange, but got no answers there. It is, I suppose, a mix of the disciplines here and there (and maybe even Mathematics Stack Exchange), so finding the right place to post is a task in and of itself, apparently...
I'm attempting to use the Crank-Nicolson scheme to solve the TDSE in 1D. The initial wave is a real Gaussian that has been normalised with respect to its probability density. As the solution evolves, a depression grows in the central peak of the real part of the wave, and the imaginary part's central trough is perhaps a bit higher than I expect (image below).
Does this behaviour seem reasonable? I have searched around and not seen questions/figures that are similar. I've tested another person's code from GitHub and it exhibits the same behaviour, which makes me feel a bit better. But I still think the central peak should simply decrease in height and increase in width. The likelihood of me getting a physics-based explanation here is relatively low, I'd assume, but a computation-based explanation of errors I may have made is more likely.
I'm happy to give more information, for example my code, or the matrices used in the scheme, etc. Thanks in advance!
Here's a link to a GIF of the time evolution:
And the part of my code relevant to solving the 1D TDSE:
(pretty much the entire thing except the plotting)
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
# Define function for norm.
def normf(dxc, uc, ic):
    return sum(dxc * np.square(np.abs(uc[ic, :])))

# Define function for expectation value of position.
def xexpf(dxc, xc, uc, ic):
    return sum(dxc * xc * np.square(np.abs(uc[ic, :])))

# Define function for expectation value of squared position.
def xexpsf(dxc, xc, uc, ic):
    return sum(dxc * np.square(xc) * np.square(np.abs(uc[ic, :])))

# Define function for standard deviation.
def sdaf(xexpc, xexpsc, ic):
    return np.sqrt(xexpsc[ic] - np.square(xexpc[ic]))
# Time t: t0 <= t <= tf. Have N steps at which to evaluate the CN scheme. The
# time interval is dt. decp: variable for plotting to a certain number of decimal
# places.
t0 = 0
tf = 20
N = 200
dt = tf / N
t = np.linspace(t0, tf, num = N + 1, endpoint = True)
decp = str(dt)[::-1].find('.')
# Initialise array for filling with norm values at each time step.
norm = np.zeros(len(t))
# Initialise array for expectation value of position.
xexp = np.zeros(len(t))
# Initialise array for expectation value of squared position.
xexps = np.zeros(len(t))
# Initialise array for alternate standard deviation.
sda = np.zeros(len(t))
# Position x: -a <= x <= a. M is an even number. There are M + 1 total discrete
# positions, so that the points are symmetric and centred at x = 0.
a = 100
M = 1200
dx = (2 * a) / M
x = np.linspace(-a, a, num = M + 1, endpoint = True)
# The gaussian function u diffuses over time. sd sets the width of the gaussian.
# u0 is the initial gaussian at t0.
sd = 1
var = np.power(sd, 2)
mu = 0
u0 = np.sqrt(1 / np.sqrt(np.pi * var)) * np.exp(-np.power(x - mu, 2) / (2 * var))
u = np.zeros([len(t), len(x)], dtype = 'complex_')
u[0, :] = u0
# Normalise u.
u[0, :] = u[0, :] / np.sqrt(normf(dx, u, 0))
# Set coefficients of CN scheme.
alpha = dt * -1j / (4 * np.power(dx, 2))
beta = dt * 1j / (4 * np.power(dx, 2))
# Tridiagonal matrices Al and AR. Al to be solved using Thomas algorithm.
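# (Added note: the CN update advances u by solving Al @ u[i+1] = Ar @ u[i] at each step.)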
Al = np.zeros([len(x), len(x)], dtype = 'complex_')
for i in range(0, M):
    Al[i + 1, i] = alpha
    Al[i, i] = 1 - (2 * alpha)
    Al[i, i + 1] = alpha
# Corner elements for BC's.
Al[M, M], Al[0, 0] = 1 - alpha, 1 - alpha
Ar = np.zeros([len(x), len(x)], dtype = 'complex_')
for i in range(0, M):
    Ar[i + 1, i] = beta
    Ar[i, i] = 1 - (2 * beta)
    Ar[i, i + 1] = beta
# Corner elements for BC's.
Ar[M, M], Ar[0, 0] = 1 - 2*beta, 1 - beta
# Thomas algorithm variables. Naming follows the Wikipedia article.
a = np.diag(Al, -1)
b = np.diag(Al)
c = np.diag(Al, 1)
NT = len(b)
cp = np.zeros(NT - 1, dtype = 'complex_')
for n in range(0, NT - 1):
    if n == 0:
        cp[n] = c[n] / b[n]
    else:
        cp[n] = c[n] / (b[n] - (a[n - 1] * cp[n - 1]))
d = np.zeros(NT, dtype = 'complex_')
dp = np.zeros(NT, dtype = 'complex_')
# Iterate over each time step to solve CN method. Maintain boundary
# conditions. Keep track of standard deviation.
for i in range(0, N):
    # BC's.
    u[i, 0], u[i, M] = 0, 0
    # Find RHS.
    d = np.dot(Ar, u[i, :])
    for n in range(0, NT):
        if n == 0:
            dp[n] = d[n] / b[n]
        else:
            dp[n] = (d[n] - (a[n - 1] * dp[n - 1])) / (b[n] - (a[n - 1] * cp[n - 1]))
    nc = NT - 1
    while nc > -1:
        if nc == NT - 1:
            u[i + 1, nc] = dp[nc]
            nc -= 1
        else:
            u[i + 1, nc] = dp[nc] - (cp[nc] * u[i + 1, nc + 1])
            nc -= 1
    norm[i] = normf(dx, u, i)
    xexp[i] = xexpf(dx, x, u, i)
    xexps[i] = xexpsf(dx, x, u, i)
    sda[i] = sdaf(xexp, xexps, i)
# Fill in final norm value.
norm[N] = normf(dx, u, N)
# Fill in final position expectation value.
xexp[N] = xexpf(dx, x, u, N)
# Fill in final squared position expectation value.
xexps[N] = xexpsf(dx, x, u, N)
# Fill in final standard deviation value.
sda[N] = sdaf(xexp, xexps, N)
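One cheap diagnostic worth adding (my suggestion, not part of the original post): Crank-Nicolson is unitary, so the norm array computed above should stay at 1 up to rounding; a drifting norm points at the matrices or boundary handling rather than at the physics.
# Assumes the arrays from the script above are in scope.
print(norm.min(), norm.max())  # both should be very close to 1
# A noticeable drift would suggest checking the corner elements of Al and Ar.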
The equations in question are:
And I defined them in the following function:
def lorenz_coupled(sigma = 10, r = 28, b = 8/3, C1 = 1, C1x = 1, S = 0.5, O = -11, C2 = 1, C2x = 1, tau = 0.1):
    def rhs(t, X):
        return [sigma * (X[1] - X[0]) - C1 * (S*X[3] - O),
                r*X[0] - X[1] - X[0]*X[2] + C1 * (S*X[4] - O),
                X[0]*X[1] - b*X[2] + C1x*S*X[5],
                (sigma * (X[4] - X[3]) - C2 * (X[0] + O)) * tau,
                (r*X[3] - X[4] - S*X[3]*X[5] + C2 * (X[1] + O)) * tau,
                (S*X[3]*X[5] - b*X[5] + C2x*X[2]) * tau]
    return rhs
where X[0] corresponds to x_1, X[1] = y_1, X[2] = z_1, X[3] = x_2, X[4] = y_2, and X[5] = z_2.
And I then try to solve the system of equations with solve_ivp, in the following way:
import numpy as np
from scipy.integrate import solve_ivp

# Initial conditions
X0 = np.array([1, 1, 1, 1, 1, 1])
# maximum time
t_max = 20
t = np.linspace(0, t_max, 10*t_max + 1)
#solve function
rhs_fun = lorenz_coupled(sigma = 10, r = 28, b = 8/3, C1 = 1, C1x = 1, S = 0.5, O = -11, C2 = 1, C2x = 1, tau = 0.1)
sol6 = solve_ivp(rhs_fun, (0, t_max), X0, t_eval=t)
But this takes forever to run and has never given me any output. Is there another way to do this that would be quicker?
Also, ideally I would want to run this up until t_max = 5000 or so, but it can't even run for 20 at the moment.
It's possible that your problem is sufficiently stiff that RK45 (the default solver) is having trouble. Have you tried alternative timestepping schemes? The scipy documentation suggests:
If it makes unusually many iterations, diverges, or fails, your problem is likely to be stiff and you should use ‘Radau’ or ‘BDF’. ‘LSODA’ can also be a good universal choice, but it might be somewhat less convenient to work with as it wraps old Fortran code.
You can also pass an arbitrary class derived from OdeSolver which implements the solver.
You could also try vectorizing your rhs evaluation function and setting the vectorized flag to True. This should increase performance. Supplying an exact Jacobian should also be fairly straightforward for this problem and would cut the number of calls to your rhs function by about an order of magnitude.
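For example (a sketch only, reusing rhs_fun, X0, t_max and t as defined in the question):
from scipy.integrate import solve_ivp

# Same call as in the question, but with an implicit solver suited to stiff problems.
sol6 = solve_ivp(rhs_fun, (0, t_max), X0, t_eval=t, method='Radau')
# method='BDF' or method='LSODA' are also worth trying.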
I have 2 Points A and B and their xyz-coordinates. I need a list of all xyz points that are on the line between those 2 points. The Bresenham's line algorithm was too slow for my case.
Example xyz for A and B:
p = np.array([[ 275.5, 244.2, -27.3],
[ 153.2, 184.3, -0.3]])
Expected output: points satisfying the parametric equation of the line,
x3 = p[0,0] + t*(p[1,0]-p[0,0])
y3 = p[0,1] + t*(p[1,1]-p[0,1])
z3 = p[0,2] + t*(p[1,2]-p[0,2])
p3 = [x3,y3,z3]
There was a very fast approach for 2D:
def connect(ends):
    d0, d1 = np.diff(ends, axis=0)[0]
    if np.abs(d0) > np.abs(d1):
        return np.c_[np.arange(ends[0, 0], ends[1, 0] + np.sign(d0), np.sign(d0), dtype=np.int32),
                     np.arange(ends[0, 1] * np.abs(d0) + np.abs(d0)//2,
                               ends[0, 1] * np.abs(d0) + np.abs(d0)//2 + (np.abs(d0)+1) * d1, d1, dtype=np.int32) // np.abs(d0)]
    else:
        return np.c_[np.arange(ends[0, 0] * np.abs(d1) + np.abs(d1)//2,
                               ends[0, 0] * np.abs(d1) + np.abs(d1)//2 + (np.abs(d1)+1) * d0, d0, dtype=np.int32) // np.abs(d1),
                     np.arange(ends[0, 1], ends[1, 1] + np.sign(d1), np.sign(d1), dtype=np.int32)]
It is impossible to give "all" points on a line, as there are infinitely many. You could do that for a discrete data type like integers, though.
My answer assumes floating point numbers, as in your example.
Technically, floating point numbers are stored in a binary format with a fixed width, so they are discrete, but I will disregard that fact, as it is most likely not what you want.
As you already typed in your question, every point P on that line satisfies this equation:
P = P1 + t * (P2 - P1), 0 < t < 1
My version uses numpy broadcasting to circumvent explicit loops.
import numpy as np
p = np.array([[ 275.5, 244.2, -27.3],
[ 153.2, 184.3, -0.3]])
def connect(points, n_points):
    p1, p2 = points
    diff = p2 - p1
    t = np.linspace(0, 1, n_points + 2)[1:-1]
    return p1[np.newaxis, :] + t[:, np.newaxis] * diff[np.newaxis, :]
print(connect(p, n_points=4))
# [[251.04 232.22 -21.9 ]
# [226.58 220.24 -16.5 ]
# [202.12 208.26 -11.1 ]
# [177.66 196.28 -5.7 ]]
Maybe I misunderstand something here, but couldn't you just create a function (like you already did for one single point) and then build a list of points, depending on how many points you want?
I mean, between 2 points there is an infinite number of other points, so you either have to define a number or use the function directly, which describes where the points are.
import numpy as np
p = np.array([[ 275.5, 244.2, -27.3],
[ 153.2, 184.3, -0.3]])
def gen_line(p, n):
    points = []
    stepsize = 1/n
    # Direction components and origin are constant; compute them once.
    x = p[1, 0] - p[0, 0]
    y = p[1, 1] - p[0, 1]
    z = p[1, 2] - p[0, 2]
    px = p[0, 0]
    py = p[0, 1]
    pz = p[0, 2]
    for t in np.arange(0, 1, stepsize):
        x3 = px + t*x
        y3 = py + t*y
        z3 = pz + t*z
        points.append([x3, y3, z3])
    return points
# generates list of 30k points
gen_line(p, 30000)
I have checked python non linear ODE with 2 variables, which is not my case. Maybe my case is not called a nonlinear ODE; correct me please.
The question is actually about the Frenet frame, in which there are 3 vectors T(s), N(s) and B(s); the parameter s >= 0. There are also 2 scalar functions with known formulas, t(s) and k(s). I have the initial values T(0), N(0) and B(0).
diff(T(s), s) = k(s)*N(s)
diff(N(s), s) = -k(s)*T(s) + t(s)*B(s)
diff(B(s), s) = -t(s)*N(s)
Then how can I get T(s), N(s) and B(s) numerically or symbolically?
I have checked scipy.integrate.ode, but I don't know how to pass k(s)*N(s) into its first parameter at all:
def model(z, tspan):
    T = z[0]
    N = z[1]
    B = z[2]
    dTds = k(s) * N  # how to express function k(s)?
    dNds = -k(s) * T + t(s) * B
    dBds = -t(s) * N
    return [dTds, dNds, dBds]

z = scipy.integrate.ode(model, [T0, N0, B0])
Here is code using the solve_ivp interface from SciPy (instead of odeint) to obtain a numerical solution:
import numpy as np
from scipy.integrate import solve_ivp
from scipy.integrate import cumtrapz
import matplotlib.pylab as plt
# Define the parameters as regular Python function:
def k(s):
    return 1

def t(s):
    return 0

# The equations: dz/ds = model(s, z):
def model(s, z):
    T = z[:3]  # z is a (9,) shaped array, the concatenation of T, N and B
    N = z[3:6]
    B = z[6:]
    dTds = k(s) * N
    dNds = -k(s) * T + t(s) * B
    dBds = -t(s) * N
    return np.hstack([dTds, dNds, dBds])
T0, N0, B0 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
z0 = np.hstack([T0, N0, B0])
s_span = (0, 6)  # start and final "time"
t_eval = np.linspace(*s_span, 100)  # number of points wanted in-between;
# not strictly necessary, as the solver automatically
# chooses its points, but used here to obtain a
# relatively accurate integration of the
# coordinates, see the graph
# Solve:
sol = solve_ivp(model, s_span, z0, t_eval=t_eval, method='RK45')
print(sol.message)
# >> The solver successfully reached the end of the integration interval.
# Unpack the solution:
T, N, B = np.split(sol.y, 3) # another way to unpack the z array
s = sol.t
# Bonus: integration of the normal vector in order to get the coordinates
# to plot the curve (there is certainly better way to do this)
coords = cumtrapz(T, x=s)
plt.plot(coords[0, :], coords[1, :])
plt.axis('equal'); plt.xlabel('x'); plt.ylabel('y')
T, N and B are vectors. Therefore, there are 9 equations to solve: z is a (9,) array.
For constant curvature and no torsion, the result is a circle:
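A quick check one can add (my addition, assuming sol from the code above): each Frenet vector should keep unit length, e.g.
print(np.linalg.norm(T, axis=0))  # T has shape (3, n); every column norm should stay near 1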
Thanks for your example. I thought about it again and realised that, since there is a formula for dZ, where Z is the matrix (T, N, B), we can calculate Z[i] = Z[i-1] + dZ[i-1]*deltaS according to the definition of the derivative. I coded this up and found that the idea solves the circle example. So:
Is Z[i] = Z[i-1] + dZ[i-1]*deltaS suitable for other ODEs? Will it fail in some situations, or does scipy.integrate.solve_ivp/scipy.integrate.ode offer an advantage over the direct use of Z[i] = Z[i-1] + dZ[i-1]*deltaS?
In my code, I have to normalize Z[i] because ||Z[i]|| is not always 1. Why does that happen? A floating-point error?
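A minimal illustration of what is being asked (my addition, not from the thread): the update Z[i] = Z[i-1] + dZ[i-1]*deltaS is the explicit Euler method, and its per-step error accumulates. For the scalar ODE dz/ds = -z with exact solution exp(-s):
import numpy as np

deltaS = 0.1
s = np.arange(0, 5, deltaS)
z = np.empty_like(s)
z[0] = 1.0
for i in range(len(s) - 1):
    z[i + 1] = z[i] + (-z[i]) * deltaS  # Euler step
print(z[-1], np.exp(-s[-1]))  # Euler drifts from the exact value
The drift in ||Z[i]|| is the same effect; the higher-order adaptive methods in solve_ivp keep it much smaller for the same work.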
My answer to my own question; at least it works for the circle:
import numpy as np
from scipy.integrate import cumtrapz
import matplotlib.pylab as plt
# Define the parameters as regular Python function:
def k(s):
    return 1

def t(s):
    return 0

def dZ(s, Z):
    return np.array(
        [k(s) * Z[1], -k(s) * Z[0] + t(s) * Z[2], -t(s) * Z[1]]
    )
T0, N0, B0 = np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([0, 0, 1])
deltaS = 0.1 # step to calculate dZ/ds
num = int(2*np.pi*1/deltaS) + 1 # how many points on the curve we have to calculate
T = np.zeros([num, ], dtype=object)
N = np.zeros([num, ], dtype=object)
B = np.zeros([num, ], dtype=object)
T[0] = T0
N[0] = N0
B[0] = B0
for i in range(num-1):
    temp_dZ = dZ(i*deltaS, np.array([T[i], N[i], B[i]]))
    T[i+1] = T[i] + temp_dZ[0]*deltaS
    T[i+1] = T[i+1]/np.linalg.norm(T[i+1])  # have to do this
    N[i+1] = N[i] + temp_dZ[1]*deltaS
    N[i+1] = N[i+1]/np.linalg.norm(N[i+1])
    B[i+1] = B[i] + temp_dZ[2]*deltaS
    B[i+1] = B[i+1]/np.linalg.norm(B[i+1])
coords = cumtrapz(
    [
        [i[0] for i in T], [i[1] for i in T], [i[2] for i in T]
    ],
    x=np.arange(num)*deltaS
)
plt.figure()
plt.plot(coords[0, :], coords[1, :])
plt.axis('equal'); plt.xlabel('x'); plt.ylabel('y')
plt.show()
I found that the equations I listed in the first post do not work for my curve. So I read Gray A., Abbena E., Salamon S., Modern Differential Geometry of Curves and Surfaces with Mathematica (2006), and found that for an arbitrary curve, the Frenet equations should be written as
diff(T(s), s) = ||r'|| * k(s)*N(s)
diff(N(s), s) = ||r'|| * (-k(s)*T(s) + t(s)*B(s))
diff(B(s), s) = ||r'|| * (-t(s)*N(s))
where ||r'|| (or ||r'(s)||) is the norm of diff([x(s), y(s), z(s)], s).
The problem is now somewhat different from that in the first post, because there is no r'(s) function or discrete data array, so I think this warrants a new reply rather than a comment.
I ran into 2 questions while trying to solve the new equations:
How can we program with r'(s) if scipy's solve_ivp is used?
I tried to modify my earlier solution, but the result is totally wrong.
Thanks again.
import numpy as np
from scipy.integrate import cumtrapz
import matplotlib.pylab as plt
# Define the parameters as regular Python function:
def k(s):
    return 1

def t(s):
    return 0

def dZ(s, Z, r_norm):
    return np.array([
        r_norm * k(s) * Z[1],
        r_norm * (-k(s) * Z[0] + t(s) * Z[2]),
        r_norm * (-t(s) * Z[1])
    ])
T0, N0, B0 = np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([0, 0, 1])
deltaS = 0.1 # step to calculate dZ/ds
num = int(2*np.pi*1/deltaS) + 1 # how many points on the curve we have to calculate
T = np.zeros([num, ], dtype=object)
N = np.zeros([num, ], dtype=object)
B = np.zeros([num, ], dtype=object)
R0 = N0
T[0] = T0
N[0] = N0
B[0] = B0
for i in range(num-1):
    r_norm = np.linalg.norm(R0)
    temp_dZ = dZ(i*deltaS, np.array([T[i], N[i], B[i]]), r_norm)
    T[i+1] = T[i] + temp_dZ[0]*deltaS
    T[i+1] = T[i+1]/np.linalg.norm(T[i+1])
    N[i+1] = N[i] + temp_dZ[1]*deltaS
    N[i+1] = N[i+1]/np.linalg.norm(N[i+1])
    B[i+1] = B[i] + temp_dZ[2]*deltaS
    B[i+1] = B[i+1]/np.linalg.norm(B[i+1])
    R0 = R0 + T[i]*deltaS
coords = cumtrapz(
    [
        [i[0] for i in T], [i[1] for i in T], [i[2] for i in T]
    ],
    x=np.arange(num)*deltaS
)
plt.figure()
plt.plot(coords[0, :], coords[1, :])
plt.axis('equal'); plt.xlabel('x'); plt.ylabel('y')
plt.show()
Are there any pure-python implementations of the inverse error function?
I know that SciPy has scipy.special.erfinv(), but that relies on some C extensions. I'd like a pure python implementation.
I've tried writing my own using the Wikipedia and Wolfram references, but it always seems to diverge from the true value when the arg is > 0.9.
I've also attempted to port the underlying C code that Scipy uses (ndtri.c and the cephes polevl.c functions) but that's also not passing my unit tests.
Edit: As requested, I've added the ported code.
Docstrings (and doctests) have been removed because they're longer than the functions. I haven't yet put much effort into making the port more pythonic - I'll worry about that once I get something that passes unit tests.
Supporting functions from cephes polevl.c
def polevl(x, coefs, N):
    ans = 0
    power = len(coefs) - 1
    for coef in coefs[:N]:
        ans += coef * x**power
        power -= 1
    return ans

def p1evl(x, coefs, N):
    return polevl(x, [1] + coefs, N)
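(For reference: the cephes convention lists coefficients from the highest power down, and p1evl evaluates polynomials whose leading coefficient is 1, which is therefore omitted from the coefficient array.)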
Main Inverse Error Function
import math

def inv_erf(z):
    if z < -1 or z > 1:
        raise ValueError("`z` must be between -1 and 1 inclusive")
    if z == 0:
        return 0
    if z == 1:
        return math.inf
    if z == -1:
        return -math.inf

    # From scipy special/cephes/ndtri.c
    def ndtri(y):
        # approximation for 0 <= abs(z - 0.5) <= 3/8
        P0 = [
            -5.99633501014107895267E1,
            9.80010754185999661536E1,
            -5.66762857469070293439E1,
            1.39312609387279679503E1,
            -1.23916583867381258016E0,
        ]
        Q0 = [
            1.95448858338141759834E0,
            4.67627912898881538453E0,
            8.63602421390890590575E1,
            -2.25462687854119370527E2,
            2.00260212380060660359E2,
            -8.20372256168333339912E1,
            1.59056225126211695515E1,
            -1.18331621121330003142E0,
        ]
        # Approximation for interval z = sqrt(-2 log y) between 2 and 8,
        # i.e., y between exp(-2) = .135 and exp(-32) = 1.27e-14.
        P1 = [
            4.05544892305962419923E0,
            3.15251094599893866154E1,
            5.71628192246421288162E1,
            4.40805073893200834700E1,
            1.46849561928858024014E1,
            2.18663306850790267539E0,
            -1.40256079171354495875E-1,
            -3.50424626827848203418E-2,
            -8.57456785154685413611E-4,
        ]
        Q1 = [
            1.57799883256466749731E1,
            4.53907635128879210584E1,
            4.13172038254672030440E1,
            1.50425385692907503408E1,
            2.50464946208309415979E0,
            -1.42182922854787788574E-1,
            -3.80806407691578277194E-2,
            -9.33259480895457427372E-4,
        ]
        # Approximation for interval z = sqrt(-2 log y) between 8 and 64,
        # i.e., y between exp(-32) = 1.27e-14 and exp(-2048) = 3.67e-890.
        P2 = [
            3.23774891776946035970E0,
            6.91522889068984211695E0,
            3.93881025292474443415E0,
            1.33303460815807542389E0,
            2.01485389549179081538E-1,
            1.23716634817820021358E-2,
            3.01581553508235416007E-4,
            2.65806974686737550832E-6,
            6.23974539184983293730E-9,
        ]
        Q2 = [
            6.02427039364742014255E0,
            3.67983563856160859403E0,
            1.37702099489081330271E0,
            2.16236993594496635890E-1,
            1.34204006088543189037E-2,
            3.28014464682127739104E-4,
            2.89247864745380683936E-6,
            6.79019408009981274425E-9,
        ]
        s2pi = 2.50662827463100050242
        code = 1

        if y > (1.0 - 0.13533528323661269189):  # 0.135... = exp(-2)
            y = 1.0 - y
            code = 0

        if y > 0.13533528323661269189:
            y = y - 0.5
            y2 = y * y
            x = y + y * (y2 * polevl(y2, P0, 4) / p1evl(y2, Q0, 8))
            x = x * s2pi
            return x

        x = math.sqrt(-2.0 * math.log(y))
        x0 = x - math.log(x) / x
        z = 1.0 / x
        if x < 8.0:  # y > exp(-32) = 1.2664165549e-14
            x1 = z * polevl(z, P1, 8) / p1evl(z, Q1, 8)
        else:
            x1 = z * polevl(z, P2, 8) / p1evl(z, Q2, 8)
        x = x0 - x1
        if code != 0:
            x = -x
        return x

    result = ndtri((z + 1) / 2.0) / math.sqrt(2)
    return result
I think the error in your code is in the for loop over coefficients in the polevl function. If you replace what you have with the function below, everything seems to work.
def polevl(x, coefs, N):
    ans = 0
    power = len(coefs) - 1
    for coef in coefs:
        ans += coef * x**power
        power -= 1
    return ans
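As an aside (my own sketch, not part of the original answer): with the same highest-power-first coefficient convention, Horner's rule gives an equivalent evaluation without explicit powers:
def polevl_horner(x, coefs, N):
    # Horner's rule: ans accumulates (...((c0*x + c1)*x + c2)...)
    # N is kept only for signature compatibility with polevl.
    ans = 0.0
    for coef in coefs:
        ans = ans * x + coef
    return ans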
I have tested it against scipy's implementation with the following code:
import numpy as np
from scipy.special import erfinv
N = 100000
x = np.random.rand(N) - 1.
# Calculate the inverse of the error function
y = np.zeros(N)
for i in range(N):
    y[i] = inv_erf(x[i])
assert np.allclose(y, erfinv(x))
What about sympy? Some digging may be needed to see how it's implemented internally: http://docs.sympy.org/latest/modules/functions/special.html#sympy.functions.special.error_functions.erfinv
from sympy import erfinv
erfinv(0.9).evalf(30)
# 1.16308715367667425688580351562