What I want:
I want to apply a 1D function to an arbitrarily shaped ndarray so that it acts along a chosen axis, similar to the axis argument in numpy.fft.fft.
Take the following example:
import numpy as np

def transf1d(f, x, y, out):
    """Transform `f(x)` to `g(y)`.

    This function is actually a C-function that is far more complicated
    and should not be modified. It only takes 1D arrays as parameters.
    """
    out[...] = (f[None,:]*np.exp(-1j*x[None,:]*y[:,None])).sum(-1)

def transf_all(F, x, y, axis=-1, out=None):
    """General N-D transform.

    Perform `transf1d` along the given `axis`.

    Given the following:
        F.shape == (2, 3, 100, 4, 5)
        x.shape == (100,)
        y.shape == (50,)
        axis == 2

    Then the output shape would be:
        out.shape == (2, 3, 50, 4, 5)

    This function should wrap `transf1d` such that it works on arbitrarily
    shaped (compatible) arrays `F`, and `out`.
    """
    if out is None:
        shape = list(np.shape(F))
        shape[axis] = np.size(y)
        out = np.empty(shape, dtype=complex)  # allocate the output array

    # `magic_iterator` is the missing piece this question asks about.
    for f, o in magic_iterator(F, out):
        # Given above shapes:
        #   f.shape == (100,)
        #   o.shape == (50,)
        transf1d(f, x, y, o)

    return out
The function transf1d takes a 1D ndarray f and two more 1D arrays, x and y. It performs a Fourier transform of f(x) from the x-axis to the y-axis. The result is stored in the out argument.
Now I want to wrap this in a more general function, transf_all, that can take ndarrays of arbitrary shape along with an axis argument that specifies along which axis to transform.
Notes:
My code is actually written in Cython. Ideally, the magic_iterator would be fast in Cython.
The function transf1d is actually a C function that returns its output in the out argument. Hence, I couldn't get it to work with numpy.apply_along_axis.
Because transf1d is actually a pretty complicated C function, I cannot rewrite it to work on arbitrary arrays. I need to wrap it in a Cython function that deals with the additional dimensions.
Note that the arrays x and y can differ in their lengths.
My question:
How can I do this? How can I iterate over arbitrary dimensions of an ndarray such that at each iteration I will get a 1D array containing the specified axis?
I had a look at nditer, but I'm not sure if that is actually the right tool for this job.
Cheers!
import numpy as np

def transf1d(f, x, y, out):
    """Transform `f(x)` to `g(y)`.

    This function is actually a C-function that is far more complicated
    and should not be modified. It only takes 1D arrays as parameters.
    """
    out[...] = (f[None,:]*np.exp(-1j*x[None,:]*y[:,None])).sum(-1)

def transf_all(F, x, y, axis=-1, out=None):
    """General N-D transform.

    Perform `transf1d` along the given `axis`.

    Given the following:
        F.shape == (2, 3, 100, 4, 5)
        x.shape == (100,)
        y.shape == (50,)
        axis == 2

    Then the output shape would be:
        out.shape == (2, 3, 50, 4, 5)

    This function should wrap `transf1d` such that it works on arbitrarily
    shaped (compatible) arrays `F`, and `out`.
    """
    def wrapper(f):
        """
        wrap transf1d for apply_along_axis compatibility
        that is, having a signature of F.shape[axis] -> out.shape[axis]
        """
        # The result is complex, so do not inherit the float dtype of `y`.
        out = np.empty(y.shape, dtype=complex)
        transf1d(f, x, y, out)
        return out
    return np.apply_along_axis(wrapper, axis, F)
I believe this should do what you want, although I haven't tested it. Note that the loop inside apply_along_axis runs at Python speed, so this vectorizes the operation only in terms of style, not in terms of performance. That is quite probably of no concern, though, assuming the decision to resort to external C code for the inner loop is justified by it being a nontrivial operation in the first place.
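Since the wrapper above is described as untested, a quick shape and value check can be sketched as follows. This is a minimal sketch: `transf1d` is the pure-NumPy stand-in from the question, not the real C function, and the comparison against a single direct 1D call is my own addition.

```python
import numpy as np

def transf1d(f, x, y, out):
    # Pure-NumPy stand-in for the 1D C routine (writes into `out`).
    out[...] = (f[None, :] * np.exp(-1j * x[None, :] * y[:, None])).sum(-1)

def transf_all(F, x, y, axis=-1):
    def wrapper(f):
        # The output is complex, so allocate a complex buffer per slice.
        out = np.empty(y.shape, dtype=complex)
        transf1d(f, x, y, out)
        return out
    return np.apply_along_axis(wrapper, axis, F)

rng = np.random.default_rng(0)
F = rng.random((2, 3, 100, 4, 5))
x = np.linspace(0, 1, 100)
y = np.linspace(-1, 1, 50)

G = transf_all(F, x, y, axis=2)

# Compare one arbitrary slice against a direct 1D call.
ref = np.empty(50, dtype=complex)
transf1d(F[1, 2, :, 3, 4], x, y, ref)
print(G.shape, np.allclose(G[1, 2, :, 3, 4], ref))
```

The transformed axis keeps its position and takes the length of `y`, which is the behaviour asked for in the question.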
To answer your question:
If you really just want to iterate over all but a given axis, you can use:
import itertools

# assumes a non-negative `axis`
for s in itertools.product(*map(range, arr.shape[:axis] + arr.shape[axis+1:])):
    arr[s[:axis] + (slice(None),) + s[axis:]]
Maybe there's a more elegant way to do it, but this should work.
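One possibly more elegant variant of the iteration above can be sketched with `np.moveaxis` and `np.ndindex`. The generator name `iter_along_axis` and the toy 1D operation are hypothetical, chosen only for illustration; the sketch assumes `F` and `out` agree on every axis except `axis`.

```python
import numpy as np

def iter_along_axis(F, out, axis):
    """Yield matching 1D views of `F` and `out` along `axis`."""
    # Move the working axis to the end; moveaxis returns views, so writes
    # through `o` land in the original `out` array.
    Fv = np.moveaxis(F, axis, -1)
    Ov = np.moveaxis(out, axis, -1)
    for idx in np.ndindex(*Fv.shape[:-1]):
        yield Fv[idx], Ov[idx]

F = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
out = np.empty((2, 5, 4))

for f, o in iter_along_axis(F, out, axis=1):
    # Hypothetical 1D operation: fill the output slice with the input sum.
    o[...] = f.sum()
```

The yielded slices are generally strided rather than contiguous, so a C routine that requires contiguous input would need a copy (for example via `np.ascontiguousarray`) and a write-back afterwards.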
But, don't iterate:
For your problem, I would just rewrite your function to work on a given axis of an ndarray. I think this should work:
def transfnd(f, x, y, axis, out):
    s = list(f.shape)
    s.insert(axis, 1)
    yx = [y.size, x.size] + [1]*(f.ndim - axis - 1)
    out[...] = np.sum(f.reshape(*s)*np.exp(-1j*x[None,:]*y[:,None]).reshape(*yx), axis+1)
It's really just a generalization of your current implementation, but instead of inserting a new axis in F at the beginning, it inserts it at axis (there might be a better way to do this than with the list(shape) method, but that was all I could come up with). Finally, you have to add trailing new axes to the yx outer product, one for each trailing index of F. Note that this assumes a non-negative axis.
I didn't really know how to test this, but the shapes all work out, so please test it and let me know whether it works.
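The answer says the shapes work out but the values were not tested. One way to check them is sketched below, assuming a matrix-product reference is acceptable as ground truth. `transf_ref` is a hypothetical helper and not part of the original answer; `transfnd` is copied from above.

```python
import numpy as np

def transfnd(f, x, y, axis, out):
    # Broadcast version from the answer above (non-negative `axis` only).
    s = list(f.shape)
    s.insert(axis, 1)
    yx = [y.size, x.size] + [1] * (f.ndim - axis - 1)
    kernel = np.exp(-1j * x[None, :] * y[:, None]).reshape(*yx)
    out[...] = np.sum(f.reshape(*s) * kernel, axis + 1)

def transf_ref(f, x, y, axis):
    # Reference: move `axis` last, apply the (Ny, Nx) kernel, move it back.
    kernel = np.exp(-1j * np.outer(y, x))        # shape (Ny, Nx)
    g = np.moveaxis(f, axis, -1) @ kernel.T      # shape (..., Ny)
    return np.moveaxis(g, -1, axis)

rng = np.random.default_rng(1)
f = rng.random((2, 3, 10, 4, 5))
x = np.linspace(0, 1, 10)
y = np.linspace(-1, 1, 7)

out = np.empty((2, 3, 7, 4, 5), dtype=complex)
transfnd(f, x, y, 2, out)
ref = transf_ref(f, x, y, 2)
print(np.allclose(out, ref))
```

Note that the broadcast version builds an intermediate array with both the `x` and `y` axes present, so its memory use grows with `x.size * y.size` times the remaining dimensions.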
I found a way of iterating over all but one axis in Cython using the Numpy C-API (Code down below). However, it's not pretty. Whether it's worth the effort depends on the inner function and the size of data.
If anyone knows a more elegant way to do this in Cython, please let me know.
I compared to Eelco's solution and they run at a comparable speed for large arguments. For smaller arguments the C-API solution is faster:
In [5]: y=linspace(-1,1,100);
In [6]: %timeit transf.apply_along(f, x, y, axis=1)
1 loops, best of 3: 5.28 s per loop
In [7]: %timeit transf.transfnd(f, x, y, axis=1)
1 loops, best of 3: 5.16 s per loop
As you can see, for this input both functions are roughly at the same speed.
In [8]: f=np.random.rand(10,20,50);x=linspace(0,1,20);y=linspace(-1,1,10);
In [9]: %timeit transf.apply_along(f, x, y, axis=1)
100 loops, best of 3: 15.1 ms per loop
In [10]: %timeit transf.transfnd(f, x, y, axis=1)
100 loops, best of 3: 8.55 ms per loop
However, for smaller input arrays the C-API approach is faster.
The code
#cython: boundscheck=False
#cython: wraparound=False
#cython: cdivision=True

import numpy as np
cimport numpy as np

np.import_array()

cdef extern from "complex.h":
    double complex cexp(double complex z) nogil

cdef void transf1d(double complex[:] f,
                   double[:] x,
                   double[:] y,
                   double complex[:] out,
                   int Nx,
                   int Ny) nogil:
    cdef int i, j
    for i in xrange(Ny):
        out[i] = 0
        for j in xrange(Nx):
            out[i] = out[i] + f[j]*cexp(-1j*x[j]*y[i])

def transfnd(F, x, y, axis=-1, out=None):
    # Make sure everything is a numpy array.
    F = np.asanyarray(F, dtype=complex)
    x = np.asanyarray(x, dtype=float)
    y = np.asanyarray(y, dtype=float)

    # Calculate absolute axis.
    cdef int ax = axis
    if ax < 0:
        ax = np.ndim(F) + ax

    # Calculate lengths of the axes `x`, and `y`.
    cdef int Nx = np.size(x), Ny = np.size(y)

    # Output array.
    if out is None:
        shape = list(np.shape(F))
        shape[axis] = Ny
        out = np.empty(shape, dtype=complex)
    else:
        out = np.asanyarray(out, dtype=complex)

    # Error check.
    assert np.shape(F)[axis] == Nx, \
        'Array length mismatch between `F`, and `x`!'
    assert np.shape(out)[axis] == Ny, \
        'Array length mismatch between `out`, and `y`!'
    f_shape = list(np.shape(F))
    o_shape = list(np.shape(out))
    f_shape[axis] = 0
    o_shape[axis] = 0
    assert f_shape == o_shape, 'Array shape mismatch between `F`, and `out`!'

    # Construct iterator over all but one axis.
    cdef np.flatiter itf = np.PyArray_IterAllButAxis(F, &ax)
    cdef np.flatiter ito = np.PyArray_IterAllButAxis(out, &ax)
    cdef int f_stride = F.strides[axis]
    cdef int o_stride = out.strides[axis]

    # Memoryview to access one slice per iteration.
    cdef double complex[:] fdat
    cdef double complex[:] odat
    cdef double[:] xdat = x
    cdef double[:] ydat = y

    while np.PyArray_ITER_NOTDONE(itf):
        # View the current `x`, and `y` axes.
        fdat = <double complex[:Nx]> np.PyArray_ITER_DATA(itf)
        fdat.strides[0] = f_stride
        odat = <double complex[:Ny]> np.PyArray_ITER_DATA(ito)
        odat.strides[0] = o_stride

        # Perform the 1D-transformation on one slice.
        transf1d(fdat, xdat, ydat, odat, Nx, Ny)

        # Go to next step.
        np.PyArray_ITER_NEXT(itf)
        np.PyArray_ITER_NEXT(ito)

    return out
# For comparison
def apply_along(F, x, y, axis=-1):
    # Make sure everything is a numpy array.
    F = np.asanyarray(F, dtype=complex)
    x = np.asanyarray(x, dtype=float)
    y = np.asanyarray(y, dtype=float)

    # Calculate absolute axis.
    cdef int ax = axis
    if ax < 0:
        ax = np.ndim(F) + ax

    # Calculate lengths of the axes `x`, and `y`.
    cdef int Nx = np.size(x), Ny = np.size(y)

    # Error check.
    assert np.shape(F)[axis] == Nx, \
        'Array length mismatch between `F`, and `x`!'

    def wrapper(f):
        out = np.empty(Ny, complex)
        transf1d(f, x, y, out, Nx, Ny)
        return out

    return np.apply_along_axis(wrapper, axis, F)
Build with the following setup.py:
from distutils.core import setup
from Cython.Build import cythonize
import numpy as np

setup(
    name = 'transf',
    ext_modules = cythonize('transf.pyx'),
    include_dirs = [np.get_include()],
)
Related
I was wondering how I could fix the error in the following code
import numpy as np
import matplotlib.pyplot as plt
from sympy.functions.special.polynomials import assoc_legendre
from scipy.misc import factorial, derivative
import sympy as sym

def main():
    t = 36000
    a=637000000
    H=200
    g=9.81
    x = sym.symbols('x')
    for l in range(1, 6):
        ω=np.sqrt(g*H*l*(l+1))/a
        for n in range(l+1):
            nθ, nφ = 128, 256
            θ, φ = np.linspace(0, np.pi, nθ), np.linspace(0, 2*np.pi, nφ)
            legfun_sym = sym.functions.special.polynomials.assoc_legendre(l, n, x)
            legfun_num = sym.lambdify(x,legfun_sym)
            X, Y = np.meshgrid(θ, φ)
            uθ = (g/(a*ω))*Der_Assoc_Legendre(legfun_num, l, n, X)*np.sin(n*Y-ω*t)
            uφ = (g/(a*ω*np.sin(X)))*Assoc_Legendre(l, n, X)*np.cos(n*Y-ω*t)
            #speed = np.sqrt(uθ**2 + uφ**2)
            fig0, ax = plt.subplots()
            strm = ax.streamplot(φ, θ, uφ, uθ, linewidth=2, cmap=plt.cm.autumn)
            fig0.colorbar(strm.lines)
            plt.show()

def Assoc_Legendre(m, n, X):
    L=[]
    for i in X:
        k=[]
        for j in i:
            k.append(assoc_legendre(m, n, np.cos(j)))
        L.append(k)
    return np.array(L)

def Der_Assoc_Legendre(legfun_num, m, n, X):
    L=[]
    for i in X:
        k=[]
        for j in i:
            k.append(derivative(legfun_num, j, dx=1e-7))
        L.append(k)
    return np.array(L)

if __name__=='__main__':
    main()
The error message 'u' and 'v' must be of shape 'Grid(x,y)' comes up with regard to the strm = ax.streamplot(φ, θ, uφ, uθ, linewidth=2, cmap=plt.cm.autumn) line. How should I fix this?
For reference, I am trying to do a streamplot of $u_{\theta}$ and $u_{\phi}$, where $u_{\theta}=\frac{g}{\omega a}\frac{d}{d\theta}\left(P^n_l\left(\cos\theta\right)\right)\sin\left(n\phi-\omega t\right)$ and $u_{\phi}=\frac{gn}{\omega a \sin\theta}P^n_l\left(\cos\theta\right)\cos\left(n\phi-\omega t\right)$.
EDIT:
This is the current code I have:
import numpy as np
import matplotlib.pyplot as plt
from sympy.functions.special.polynomials import assoc_legendre
from scipy.misc import factorial, derivative
import sympy as sym

def main():
    t = 36000
    a=637000000
    H=200
    g=9.81
    x = sym.symbols('x')
    X, Y = np.mgrid[0.01:np.pi-0.01:100j,0:2*np.pi:100j]
    for l in range(1, 6):
        ω=np.sqrt(g*H*l*(l+1))/a
        for n in range(l+1):
            #nθ, nφ = 128, 256
            #θ, φ = np.linspace(0.001, np.pi-0.001, nθ), np.linspace(0, 2*np.pi, nφ)
            legfun_sym = sym.functions.special.polynomials.assoc_legendre(l, n, x)
            legfun_num = sym.lambdify(x, legfun_sym)
            uθ = (g/(a*ω*np.sin(X)))*Der_Assoc_Legendre(legfun_num, l, n, X)*np.sin(n*Y-ω*t)
            uφ = (g/(a*ω))*Assoc_Legendre(l, n, X)*np.cos(n*Y-ω*t)
            #speed = np.sqrt(uθ**2 + uφ**2)
            fig0, ax = plt.subplots()
            strm = ax.streamplot(Y, X, uθ,uφ, linewidth=0.5, cmap=plt.cm.autumn)
            #fig0.colorbar(strm.lines)
            plt.show()
            print("next")

def Assoc_Legendre(m, n, X):
    L=[]
    for i in X:
        k=[]
        for j in i:
            k.append(assoc_legendre(m, n, np.cos(j)))
        L.append(k)
    return np.float64(np.array(L))

def Der_Assoc_Legendre(legfun_num, m, n, X):
    L=[]
    for i in X:
        k=[]
        for j in i:
            k.append(derivative(legfun_num, j, dx=0.001))
        L.append(k)
    return np.float64(np.array(L))

if __name__=='__main__':
    main()
The current issue seems to be with the derivative function in Der_Assoc_Legendre, which raises ValueError: math domain error after the first plot is drawn, while the second is being computed.
While python 3 allows you to use Greek characters as/in variable names, I can assure you that most programmers will find your code unreadable, and it will be a nightmare for others to maintain/develop your code riddled with φ and θ.
Secondly, your code quickly throws a RuntimeWarning concerning a division by zero, which you should most certainly trace down and fix safely.
As for your question, the problem is two-fold. The first problem is that the dimensions of your input don't match in the call to streamplot:
>>> print(φ.shape, θ.shape, uφ.shape, uθ.shape)
(256,) (128,) (256, 128) (256, 128)
The trick is that a lot of matplotlib plot functions expect their 2d array arguments transposed, which is closely related to the weird definition of numpy.meshgrid:
>>> i,j = np.meshgrid(range(3),range(4))
>>> print(i.shape)
(4, 3)
Probably due to this reason, the definition of streamplot is as follows:
Axes.streamplot(ax, *args, **kwargs)
Draws streamlines of a vector flow.
x, y : 1d arrays
an evenly spaced grid.
u, v : 2d arrays
x and y-velocities. Number of rows should match length of y, and the number of columns should match x.
Note the last bit about dimensions. All you need to do is swap x/y, or transpose the angles; you need to check which one of these will lead to a more meaningful plot in your application.
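The shape convention described above can be sketched with NumPy alone. The ASCII names `theta` and `phi` and the velocity field are placeholders chosen for illustration, not the fields from the question; the sketch assumes θ is plotted on the y axis and φ on the x axis.

```python
import numpy as np

n_theta, n_phi = 128, 256
theta = np.linspace(0.01, np.pi - 0.01, n_theta)   # plotted on the y axis
phi = np.linspace(0, 2 * np.pi, n_phi)             # plotted on the x axis

# meshgrid(x, y) with the default indexing='xy' returns (len(y), len(x)) arrays.
PHI, THETA = np.meshgrid(phi, theta)

# Hypothetical velocity field, only to illustrate the shapes.
u_phi = np.cos(PHI) * np.sin(THETA)
u_theta = np.sin(PHI)

# Shapes that streamplot(phi, theta, u_phi, u_theta) expects:
# rows follow y (theta), columns follow x (phi).
print(u_phi.shape, u_theta.shape)
```

Building the grids as `np.meshgrid(x_values, y_values)` in the same order as the first two streamplot arguments avoids having to transpose afterwards.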
Now, if you fix this, the following happens:
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Now this is fishy. All input types should be numeric...right? Well, yeah, but they aren't:
>>> print(φ.dtype, θ.dtype, uφ.dtype, uθ.dtype)
float64 float64 object float64
What's that third object-typed array about?
>>> print(uφ[0,0],uθ[0,0])
+inf -0.00055441014491
>>> print(type(uφ[0,0]),type(uθ[0,0]))
<class 'sympy.core.numbers.Float'> <class 'numpy.float64'>
As @Jelmes noted in a comment, the above type of uφ is a direct consequence of its construction using sympy. If one converts these sympy floats to python or numpy floats before constructing the resulting array, the dtype issue should go away. Whether the remaining infinities (the consequence of division by 0 in 1/sin(X)) will be handled gracefully by streamplot is another question.
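The dtype problem and its fix can be reproduced without sympy. In this sketch `decimal.Decimal` is a stand-in for sympy's Float (an assumption made only to keep the example dependency-free); both are Python objects that NumPy stores in an object array.

```python
import numpy as np
from decimal import Decimal

# Stand-in for an array filled with sympy Float objects (assumption:
# Decimal behaves analogously for the purpose of this illustration).
obj = np.array([[Decimal("1.5"), Decimal("2.0")],
                [Decimal("0.25"), Decimal("4.0")]])
print(obj.dtype)                   # object, not float64

try:
    np.isfinite(obj)               # the check that fails inside streamplot
    failed = False
except TypeError:
    failed = True

as_float = obj.astype(float)       # explicit conversion to float64
finite = np.isfinite(as_float)
```

For the code in the question, the equivalent step would be converting each value with `float(...)` before appending it, or converting the finished array with `.astype(float)`.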
Usually I use scipy.optimize.curve_fit to fit custom functions to data.
Data in this case was always a one-dimensional array.
Is there a similar function for a two-dimensional array?
So, for example, I have a 10x10 numpy array. Then I have a function that does some stuff and creates a 10x10 numpy array, and I want to fit the function, so that the resulting 10x10 array has the best fit to the input array.
Maybe an example is better :)
data = pyfits.getdata('data.fits') #fits is an image format, this gives me a NxM numpy array
mod1 = pyfits.getdata('mod1.fits')
mod2 = pyfits.getdata('mod2.fits')
mod3 = pyfits.getdata('mod3.fits')
mod1_1D = numpy.ravel(mod1)
mod2_1D = numpy.ravel(mod2)
mod3_1D = numpy.ravel(mod3)
def dostuff(a,b): #originally this is a function for 2D arrays
    newdata = (mod1_1D*12)+(mod2_1D)**a - mod3_1D/b
    return newdata
Now a and b should be fitted, so that newdata is as close as possible to data.
What I got so far:
data1D = numpy.ravel(data)
data_X = numpy.arange(data1D.size)
fit = curve_fit(dostuff,data_X,data1D)
But print fit only gives me
(array([ 1.]), inf)
I do have some NaNs in the arrays, maybe that's a problem?
The goal is to express the 2D function as a 1D function: g(x, y, ...) --> f(xy, ...)
Converting the coordinate pair (x, y) into a single number xy may seem tricky at first, but it's actually quite simple. Just enumerate all data points and you have a single number that uniquely defines each coordinate pair. The fitted function simply has to reconstruct the original coordinates, do its calculations and return the result.
Example that fits a 2D linear gradient in a 20x10 image:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

n, m = 10, 20

# noisy example data
x = np.arange(m).reshape(1, m)
y = np.arange(n).reshape(n, 1)
z = x + y * 2 + np.random.randn(n, m) * 3

def f(xy, a, b):
    i = xy // m  # reconstruct y coordinates
    j = xy % m  # reconstruct x coordinates
    out = i * a + j * b
    return out

xy = np.arange(z.size)  # 0 is the top left pixel and 199 is the bottom right pixel
res = curve_fit(f, xy, np.ravel(z))
z_est = f(xy, *res[0])
z_est2d = z_est.reshape(n, m)
plt.subplot(2, 1, 1)
plt.plot(np.ravel(z), label='original')
plt.plot(z_est, label='fitted')
plt.legend()
plt.subplot(2, 2, 3)
plt.imshow(z)
plt.xlabel('original')
plt.subplot(2, 2, 4)
plt.imshow(z_est2d)
plt.xlabel('fitted')
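A related variant, sketched here as an alternative to enumerating the pixels: curve_fit also accepts a (k, M)-shaped xdata, so the row and column coordinates can be passed directly. The sketch also masks NaN pixels before fitting, since the question mentions NaNs in the data. The model name `plane` and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

n, m = 10, 20
yy, xx = np.mgrid[0:n, 0:m]                  # row and column coordinates
rng = np.random.default_rng(0)
z = 2.0 * yy + 1.0 * xx + rng.normal(scale=0.1, size=(n, m))
z[3, 7] = np.nan                             # pretend some pixels are missing

def plane(coords, a, b):
    i, j = coords                            # unpack the stacked coordinates
    return a * i + b * j

mask = np.isfinite(z)                        # keep only valid pixels
coords = np.vstack((yy[mask], xx[mask]))     # shape (2, M)
popt, pcov = curve_fit(plane, coords, z[mask])
print(popt)
```

Dropping the non-finite pixels from both the coordinates and the data keeps the two arrays aligned, which is what the fit relies on.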
I would recommend using symfit for this. I wrote it to take care of all of the magic for you automatically.
In symfit you would just write the equation pretty much as you would on paper, and then you can run the fit.
I would do something like this:
from symfit import parameters, variables, Fit
# Assuming all this data is in the form of NxM arrays
data = pyfits.getdata('data.fits')
mod1 = pyfits.getdata('mod1.fits')
mod2 = pyfits.getdata('mod2.fits')
mod3 = pyfits.getdata('mod3.fits')
a, b = parameters('a, b')
x, y, z, u = variables('x, y, z, u')
model = {u: (x * 12) + y**a - z / b}
fit = Fit(model, x=mod1, y=mod2, z=mod3, u=data)
fit_result = fit.execute()
print(fit_result)
Unfortunately I have not yet included examples of the kind you need in the docs, but if this doesn't work out of the box, I think you can figure it out from the docs.
What is a good way to produce a numpy array containing the values of a function evaluated on an n-dimensional grid of points?
For example, suppose I want to evaluate the function defined by
def func(x, y):
    return <some function of x and y>
Suppose I want to evaluate it on a two dimensional array of points with the x values going from 0 to 4 in ten steps, and the y values going from -1 to 1 in twenty steps. What's a good way to do this in numpy?
P.S. This has been asked in various forms on StackOverflow many times, but I couldn't find a concisely stated question and answer. I posted this to provide a concise simple solution (below).
A shorter, faster and clearer answer, avoiding meshgrid:
import numpy as np
def func(x, y):
    return np.sin(y * x)
xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
result = func(xaxis[:,None], yaxis[None,:])
This will be faster and use less memory if your function is something like x^2+y, since then x^2 is computed on a 1D array (instead of a 2D one), and the increase in dimension only happens when you do the "+". With meshgrid, x^2 is computed on a 2D array in which essentially every row is the same, which can increase the run time considerably.
Edit: the "x[:,None]" turns x into a 2D array with an "empty" second dimension (of length 1). This "None" is the same as using "x[:,numpy.newaxis]". The same thing is done with y, but there the first dimension is the empty one.
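As a quick check that the broadcasting form gives the same values as a dense grid, here is a minimal sketch. It compares against `np.meshgrid` with `indexing="ij"`, which I chose so that the axis order matches; the comparison itself is my addition.

```python
import numpy as np

def func(x, y):
    return np.sin(y * x)

xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)

# Broadcasting: (10, 1) against (1, 20) gives a (10, 20) result.
broadcast = func(xaxis[:, None], yaxis[None, :])

# Dense grids with matrix-style indexing for comparison.
X, Y = np.meshgrid(xaxis, yaxis, indexing="ij")
dense = func(X, Y)

print(broadcast.shape, np.allclose(broadcast, dense))
```

With the default `indexing="xy"` the meshgrid result would be the transpose, with shape (20, 10).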
Edit: in 3 dimensions:
def func2(x, y, z):
    return np.sin(y * x)+z
xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
zaxis = np.linspace(0, 1, 20)
result2 = func2(xaxis[:,None,None], yaxis[None,:,None],zaxis[None,None,:])
This way you can easily extend to n dimensions, using as many None or : as you have dimensions. Each : keeps a full dimension, and each None makes an "empty" dimension (of length 1). The next example shows how these empty dimensions work. The shape changes if you use None, so the array is a 3D object, but the empty dimensions only get filled up when you combine it with an array that actually has something in those dimensions:
In [1]: import numpy
In [2]: a = numpy.linspace(-1,1,20)
In [3]: a.shape
Out[3]: (20,)
In [4]: a[None,:,None].shape
Out[4]: (1, 20, 1)
In [5]: b = a[None,:,None] # this is a 3D array, but with the first and third dimension being "empty"
In [6]: c = a[:,None,None] # same, but last two dimensions are "empty" here
In [7]: d=b*c
In [8]: d.shape # only the last dimension is "empty" here
Out[8]: (20, 20, 1)
Edit: without needing to type the None yourself:
def ndm(*args):
    return [x[(None,)*i+(slice(None),)+(None,)*(len(args)-i-1)] for i, x in enumerate(args)]
x2,y2,z2 = ndm(xaxis,yaxis,zaxis)
result3 = func2(x2,y2,z2)
This way, the None-slicing creates the extra empty dimensions: the first argument you give to ndm becomes the first full dimension, the second becomes the second full dimension, and so on. It does the same as the hardcoded None syntax used before.
Short explanation: doing x2, y2, z2 = ndm(xaxis, yaxis, zaxis) is the same as doing
x2 = xaxis[:,None,None]
y2 = yaxis[None,:,None]
z2 = zaxis[None,None,:]
but the ndm method should also work for more dimensions, without needing to hardcode the None-slices in multiple lines like just shown. This will also work in numpy versions before 1.8, while numpy.meshgrid only works for higher than 2 dimensions if you have numpy 1.8 or higher.
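NumPy also ships a helper that produces the same reshaped arrays as ndm. A minimal sketch using `np.ix_`, which is documented for index arrays but reshapes any 1D inputs the same way (using it here is my suggestion, not part of the original answer):

```python
import numpy as np

xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
zaxis = np.linspace(0, 1, 30)

# np.ix_ reshapes each 1D input so that it varies along its own axis only.
x2, y2, z2 = np.ix_(xaxis, yaxis, zaxis)
print(x2.shape, y2.shape, z2.shape)

# Same computation as func2 above, evaluated on the open grid.
result = np.sin(y2 * x2) + z2
print(result.shape)
```

Like the None-slicing, this only creates reshaped views of the inputs, so the full grid is materialized only in the final result.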
import numpy as np
def func(x, y):
    return np.sin(y * x)
xaxis = np.linspace(0, 4, 10)
yaxis = np.linspace(-1, 1, 20)
x, y = np.meshgrid(xaxis, yaxis)
result = func(x, y)
I use this function to get X, Y, Z values ready for plotting:
def npmap2d(fun, xs, ys, doPrint=False):
    Z = np.empty(len(xs) * len(ys))
    i = 0
    for y in ys:
        for x in xs:
            Z[i] = fun(x, y)
            if doPrint: print([i, x, y, Z[i]])
            i += 1
    X, Y = np.meshgrid(xs, ys)
    Z.shape = X.shape
    return X, Y, Z
Usage:
def f(x, y):
    # ...some function that can't handle numpy arrays
    ...
X, Y, Z = npmap2d(f, np.linspace(0, 0.5, 21), np.linspace(0.6, 0.4, 41))
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(X, Y, Z)
The same result can be achieved using map:
xs = np.linspace(0, 4, 10)
ys = np.linspace(-1, 1, 20)
X, Y = np.meshgrid(xs, ys)
Z = np.fromiter(map(f, X.ravel(), Y.ravel()), X.dtype).reshape(X.shape)
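Another way to write the same element-by-element evaluation is `np.vectorize`, sketched below. The scalar-only function `f` is a hypothetical example chosen because `math.sin` rejects arrays; `np.vectorize` is a convenience wrapper around a Python loop, so it is not expected to be faster than the `map` version.

```python
import math
import numpy as np

def f(x, y):
    # Scalar-only function: math.sin rejects arrays with more than one element.
    return math.sin(x * y)

xs = np.linspace(0, 4, 10)
ys = np.linspace(-1, 1, 20)
X, Y = np.meshgrid(xs, ys)

# np.vectorize is a convenience loop, not a speed-up; otypes fixes the dtype.
Z = np.vectorize(f, otypes=[float])(X, Y)
print(Z.shape)
```

Passing `otypes` avoids the extra trial call that `np.vectorize` otherwise makes on the first element to determine the output dtype.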
In the case your function actually takes a tuple of d elements, i.e. f((x1,x2,x3,...xd)) (for example the scipy.stats.multivariate_normal function), and you want to evaluate f on N^d combinations/grid of N variables, you could also do the following (2D case):
x=np.arange(-1,1,0.2) # each variable is instantiated N=10 times
y=np.arange(-1,1,0.2)
Z=f(np.dstack(np.meshgrid(x,y))) # result is an NxN (10x10) matrix, whose entries are f((xi,yj))
Here np.dstack(np.meshgrid(x,y)) creates an 10x10 "matrix" (technically a 10x10x2 numpy array) whose entries are the 2-dimensional tuples to be evaluated by f.
My two cents:
import numpy as np
x = np.linspace(0, 4, 10)
y = np.linspace(-1, 1, 20)
X, Y = np.meshgrid(x, y, indexing='ij', sparse=True)

def func(x, y):
    return x*y/(x**2 + y**2 + 4)

# I have defined a function of x and y.
func(X, Y)
The problem
I'm trying to Cythonize two small functions that mostly deal with numpy ndarrays for some scientific purpose. These two small functions are called millions of times in a genetic algorithm and account for the majority of the time taken by the algorithm.
I made some progress on my own and both work nicely, but I get only a tiny speed improvement (10%). More importantly, cython --annotate shows that the majority of the code is still going through Python.
The code
First function:
The aim of this function is to get back slices of data and it is called millions of times in an inner nested loop. Depending on the bool in data[1][1], we either get the slice in the forward or reverse order.
#Ipython notebook magic for cython
%%cython --annotate
import numpy as np
from scipy import signal as scisignal
cimport cython
cimport numpy as np
def get_signal(data):
    #data[0] contains the data structure containing the numpy arrays
    #data[1][0] contains the position to slice
    #data[1][1] contains the orientation to slice, forward = 0, reverse = 1

    cdef int halfwinwidth = 100
    cdef int midpoint = data[1][0]
    cdef int strand = data[1][1]
    cdef int start = midpoint - halfwinwidth
    cdef int end = midpoint + halfwinwidth

    #the arrays we want to slice
    cdef np.ndarray r0 = data[0]['normals_forward']
    cdef np.ndarray r1 = data[0]['normals_reverse']
    cdef np.ndarray r2 = data[0]['normals_combined']

    if strand == 0:
        normals_forward = r0[start:end]
        normals_reverse = r1[start:end]
        normals_combined = r2[start:end]
    else:
        normals_forward = r1[end - 1:start - 1: -1]
        normals_reverse = r0[end - 1:start - 1: -1]
        normals_combined = r2[end - 1:start - 1: -1]

    #return the result as a tuple
    row = (normals_forward,
           normals_reverse,
           normals_combined)
    return row
Second function
This one gets a list of tuples of numpy arrays, and we want to add up the arrays element wise, then normalize them and get the integration of the intersection.
def calculate_signal(list signal):
    cdef int halfwinwidth = 100
    cdef np.ndarray profile_normals_forward = np.zeros(halfwinwidth * 2, dtype='f')
    cdef np.ndarray profile_normals_reverse = np.zeros(halfwinwidth * 2, dtype='f')
    cdef np.ndarray profile_normals_combined = np.zeros(halfwinwidth * 2, dtype='f')

    #b is a tuple of 3 np.ndarrays containing 200 floats
    #here we add them up elementwise
    for b in signal:
        profile_normals_forward += b[0]
        profile_normals_reverse += b[1]
        profile_normals_combined += b[2]

    #normalize the arrays
    cdef int count = len(signal)
    #print "Normalizing to number of elements"
    profile_normals_forward /= count
    profile_normals_reverse /= count
    profile_normals_combined /= count

    intersection_signal = scisignal.detrend(np.fmin(profile_normals_forward, profile_normals_reverse))
    intersection_signal[intersection_signal < 0] = 0
    intersection = np.sum(intersection_signal)

    results = {"intersection": intersection,
               "profile_normals_forward": profile_normals_forward,
               "profile_normals_reverse": profile_normals_reverse,
               "profile_normals_combined": profile_normals_combined,
               }
    return results
Any help is appreciated - I tried using memory views but for some reason the code got much, much slower.
After fixing the array cdef (as has been indicated, with the dtype specified), you should probably put the routine in a cdef function (which will only be callable from a def function in the same module).
In the declaration of the function, you'll need to provide the type (and the number of dimensions if it's a NumPy array):
cdef get_signal(numpy.ndarray[DTYPE_t, ndim=3] data):
I'm not sure using a dict is a good idea though. You could make use of numpy's column or row slices like data[:, 0].
I have two 2D arrays, x(ni, nj) and y(ni, nj), that I need to interpolate over one axis. I want to interpolate along the last axis for every ni.
I wrote
import numpy as np
from scipy.interpolate import interp1d
z = np.asarray([200,300,400,500,600])
out = []
for i in range(ni):
    f = interp1d(x[i,:], y[i,:], kind='linear')
    out.append(f(z))
out = np.asarray(out)
However, I think this method is inefficient and slow because of the loop when the arrays are large. What is the fastest way to interpolate a multi-dimensional array like this? Is there any way to perform linear and cubic interpolation without a loop? Thanks.
The method you propose does have a python loop, so for large values of ni it is going to get slow. That said, unless you are going to have large ni you shouldn't worry much.
I have created sample input data with the following code:
def sample_data(n_i, n_j, z_shape) :
    x = np.random.rand(n_i, n_j) * 1000
    x.sort()
    x[:,0] = 0
    x[:, -1] = 1000
    y = np.random.rand(n_i, n_j)
    z = np.random.rand(*z_shape) * 1000
    return x, y, z
And I have tested it with these two versions of linear interpolation:
def interp_1(x, y, z) :
    rows, cols = x.shape
    out = np.empty((rows,) + z.shape, dtype=y.dtype)
    for j in range(rows) :
        out[j] = interp1d(x[j], y[j], kind='linear', copy=False)(z)
    return out

def interp_2(x, y, z) :
    rows, cols = x.shape
    row_idx = np.arange(rows).reshape((rows,) + (1,) * z.ndim)
    col_idx = np.argmax(x.reshape(x.shape + (1,) * z.ndim) > z, axis=1) - 1
    ret = y[row_idx, col_idx + 1] - y[row_idx, col_idx]
    ret /= x[row_idx, col_idx + 1] - x[row_idx, col_idx]
    ret *= z - x[row_idx, col_idx]
    ret += y[row_idx, col_idx]
    return ret
interp_1 is an optimized version of your code, following Dave's answer. interp_2 is a vectorized implementation of linear interpolation that avoids any python loop whatsoever. Coding something like this requires a sound understanding of broadcasting and indexing in numpy, and some things are going to be less optimized than what interp1d does. A prime example is finding the bin in which to interpolate a value: interp1d will surely stop searching once it finds the bin, whereas the above function compares the value to all bins.
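To see that the vectorized approach computes the same values as a row-by-row loop, here is a minimal check. It is a sketch restricted to a 1D `z` lying strictly inside the range of every row of `x`; the function names are hypothetical, and `np.interp` is used as the loop baseline instead of interp1d to keep the example NumPy-only.

```python
import numpy as np

def interp_rows_loop(x, y, z):
    # One np.interp call per row; simple but loops in Python.
    return np.array([np.interp(z, xr, yr) for xr, yr in zip(x, y)])

def interp_rows_vectorized(x, y, z):
    # Same idea as interp_2 above, restricted to 1D `z`.
    rows = np.arange(x.shape[0])[:, None]
    hi = np.argmax(x[:, :, None] > z, axis=1)   # first knot right of each z
    lo = hi - 1
    slope = (y[rows, hi] - y[rows, lo]) / (x[rows, hi] - x[rows, lo])
    return y[rows, lo] + slope * (z - x[rows, lo])

rng = np.random.default_rng(0)
x = np.sort(rng.random((6, 8)) * 1000, axis=1)
x[:, 0], x[:, -1] = 0, 1000                     # common end points
y = rng.random((6, 8))
z = np.array([200.0, 300.0, 400.0, 500.0, 600.0])

a = interp_rows_loop(x, y, z)
b = interp_rows_vectorized(x, y, z)
print(a.shape, np.allclose(a, b))
```

The comparison `x[:, :, None] > z` builds an array of shape (n_i, n_j, len(z)), which is the all-bins comparison the paragraph above refers to.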
So the result is going to be very dependent on what n_i and n_j are, and even how long your array z of values to interpolate is. If n_j is small and n_i is large, you should expect an advantage from interp_2, and from interp_1 if it is the other way around. Smaller z should be an advantage to interp_2, longer ones to interp_1.
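I have actually timed both approaches with a variety of n_i and n_j, for z of shape (5,) and (50,). Here are the graphs:
[Graphs: contour plots of the timing difference between interp_1 and interp_2 over n_i and n_j, for z.shape == (5,) and z.shape == (50,)]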
So it seems that for z of shape (5,) you should go with interp_2 whenever n_j < 1000, and with interp_1 elsewhere. Not surprisingly, the threshold is different for z of shape (50,), now being around n_j < 100. It seems tempting to conclude that you should stick with your code if n_j * len(z) > 5000, but change it to something like interp_2 above if not, but there is a great deal of extrapolating in that statement! If you want to further experiment yourself, here's the code I used to produce the graphs.
import timeit
import matplotlib.pyplot as plt

n_s = np.logspace(1, 3.3, 25)
int_1 = np.empty((len(n_s),) * 2)
int_2 = np.empty((len(n_s),) * 2)
z_shape = (5,)

for i, n_i in enumerate(n_s) :
    print(int(n_i))
    for j, n_j in enumerate(n_s) :
        x, y, z = sample_data(int(n_i), int(n_j), z_shape)
        int_1[i, j] = min(timeit.repeat('interp_1(x, y, z)',
                                        'from __main__ import interp_1, x, y, z',
                                        repeat=10, number=1))
        int_2[i, j] = min(timeit.repeat('interp_2(x, y, z)',
                                        'from __main__ import interp_2, x, y, z',
                                        repeat=10, number=1))

cs = plt.contour(n_s, n_s, np.transpose(int_1-int_2))
plt.clabel(cs, inline=1, fontsize=10)
plt.xlabel('n_i')
plt.ylabel('n_j')
plt.title('timeit(interp_2) - timeit(interp_1), z.shape=' + str(z_shape))
plt.show()
One optimization is to allocate the result array once like so:
import numpy as np
from scipy.interpolate import interp1d
z = np.asarray([200,300,400,500,600])
out = np.zeros( [ni, len(z)], dtype=np.float32 )
for i in range(ni):
    f = interp1d(x[i,:], y[i,:], kind='linear')
    out[i,:]=f(z)
This will save you some of the memory copying that occurs in your implementation, in the calls to out.append(...).