I would like to apply a (more complex?) function on my 3d numpy array with the shape x,y,z = (4,4,3).
Let's assume I have the following array:
array = np.arange(48)
array = array.reshape([4,4,3])
Now I would like to call the following function on each point of the array:
p(x,y,z) = a(z) + b(z)*ps(x,y)
Let's assume a and b are the following 1D arrays, and ps is a 2D array:
a = np.random.randint(1,10, size=3)
b = np.random.randint(1,10, size=3)
ps = np.arange(16)
ps = ps.reshape([4,4])
My intuitive approach was to loop over my array and call the function on each point. It works, but of course it's way too slow:
def calcP(a, b, ps, x, y, z):
    p = a[z] + b[z] * ps[x, y]
    return p

def stupidLoop(array, a, b, ps):
    dummy = array.copy()
    for z in range(0, 3):
        for x in range(0, 4):
            for y in range(0, 4):
                dummy[x, y, z] = calcP(a, b, ps, x, y, z)
    return dummy

updatedArray = stupidLoop(array, a, b, ps)
Is there a faster way? I know it works with vectorized functions, but I cannot figure it out with mine.
I didn't actually try it with these numbers. It's just to exemplify my problem. It comes from the Meteorology world and is a little more complex.
Vectorize the loop, and use broadcasting:
a.reshape([1,1,-1]) + b.reshape([1,1,-1]) * ps.reshape([4,4,1])
EDIT:
Thanks @NilsWerner for offering a more common way in a comment:
a + b * ps[:, :, None]
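As a quick sanity check (a minimal sketch of mine, reusing the arrays from the question), the broadcast expression matches the explicit loop:
import numpy as np

a = np.random.randint(1, 10, size=3)
b = np.random.randint(1, 10, size=3)
ps = np.arange(16).reshape([4, 4])

# ps[:, :, None] has shape (4, 4, 1); a and b broadcast along the last axis
p = a + b * ps[:, :, None]

expected = np.empty((4, 4, 3))
for z in range(3):
    for x in range(4):
        for y in range(4):
            expected[x, y, z] = a[z] + b[z] * ps[x, y]

assert np.allclose(p, expected)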
You can do this using numpy.fromfunction():
import numpy as np

a = np.random.randint(1, 10, size=3)
b = np.random.randint(1, 10, size=3)
ps = np.arange(16)
ps = ps.reshape([4, 4])

def calcP(x, y, z, a=a, b=b, ps=ps):
    p = a[z] + b[z] * ps[x, y] + 0.0
    return p

array = np.arange(48)
array = array.reshape([4, 4, 3])
updatedArray = np.fromfunction(calcP, (4, 4, 3), a=a, b=b, ps=ps, dtype=int)
print(updatedArray)
Notice that I've modified your function calcP slightly to accept keyword arguments. Also, I've added 0.0 to ensure that the output array holds floats rather than ints.
Also, notice that the second argument to fromfunction() merely specifies the shape of the index grid over which the function calcP() is to be invoked.
Output (will vary each time due to randint):
[[[ 8. 5. 3.]
[ 9. 6. 12.]
[ 10. 7. 21.]
[ 11. 8. 30.]]
[[ 12. 9. 39.]
[ 13. 10. 48.]
[ 14. 11. 57.]
[ 15. 12. 66.]]
[[ 16. 13. 75.]
[ 17. 14. 84.]
[ 18. 15. 93.]
[ 19. 16. 102.]]
[[ 20. 17. 111.]
[ 21. 18. 120.]
[ 22. 19. 129.]
[ 23. 20. 138.]]]
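Note that fromfunction() does not call calcP() once per point: it builds integer index grids of the requested shape and invokes the function a single time, so the arithmetic stays vectorized. Roughly, it is equivalent to this sketch:
# roughly what np.fromfunction(calcP, (4, 4, 3), dtype=int) does internally
x, y, z = np.indices((4, 4, 3), dtype=int)
updatedArray = calcP(x, y, z, a=a, b=b, ps=ps)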
Related
I'm using scipy.integrate's solve_ivp method to solve an IVP, and I want to evaluate a function at the time steps I give for the integration, but I don't know how to do it.
I could loop back through each element of the solution, but that would take a ridiculous amount of time on top of the time the IVP solve already takes, so I would much rather calculate them at the same time the method computes the values during the integration.
import scipy.integrate
import numpy

class Foo:
    def __init__(self):
        self.foo_vector_1 = numpy.zeros(3)
        self.foo_vector_2 = numpy.zeros(3)
        self.foo_vector_3 = numpy.zeros(3)

foo = Foo()

d_vector_1 = lambda foo: ...  # gets the derivative of foo_vector_1
d_vector_2 = lambda foo: ...  # gets the derivative of foo_vector_2

def get_foo_vector_3_value(foo):
    return ...  # returns the ACTUAL VALUE of foo_vector_3, NOT its derivative

def dy(t, y):
    foo.foo_vector_1 = numpy.array((y[0], y[1], y[2]))
    foo.foo_vector_2 = numpy.array((y[3], y[4], y[5]))
    return numpy.array((d_vector_1(foo), d_vector_2(foo))).flatten().tolist()

foo.foo_vector_1 = numpy.array((1, 2, 3))
foo.foo_vector_2 = numpy.array((4, 5, 6))
y0 = numpy.array((foo.foo_vector_1, foo.foo_vector_2)).flatten().tolist()
sol = scipy.integrate.solve_ivp(dy, (0, 10), y0, t_eval=numpy.arange(0, 10, 0.01))

foo_vectors_1 = numpy.column_stack((sol.y[0], sol.y[1], sol.y[2]))
foo_vectors_2 = numpy.column_stack((sol.y[3], sol.y[4], sol.y[5]))
foo_vectors_3 = ????????
Ideally, I would be able to get the value of foo_vectors_3 without having to reset foo in a loop over the whole list of foo vectors, because for me that would take a significant amount of computation time.
I think the friction here is avoiding the use of the 1D numpy ndarray as the base object for the computation. You can mentally apportion the 1D array into your two separate foo attributes. Then the computation of foo_vectors_3 will be trivial compared to the ODE integration. You could also add helper functions to map from the 1D ndarray that solve_ivp uses to your foo vectors and back; a sketch of such helpers follows the example below.
In [65]: import scipy.integrate
    ...: import numpy as np
    ...:
    ...: def d_vec1(t, y):
    ...:     # put in your function here instead of just returning 1
    ...:     return 1 * np.ones_like(y)
    ...:
    ...: def d_vec2(t, y):
    ...:     # put in your function here instead of just returning 2
    ...:     return 2 * np.ones_like(y)
    ...:
    ...: def eval_foo3(t, y):
    ...:     return y[0:3, :] + y[3:, :]  # use your own function instead
    ...:
    ...: def dy(t, y):
    ...:     return np.array((d_vec1(t, y[0:3]), d_vec2(t, y[3:]))).flatten()
    ...:
    ...: v1 = np.array([1, 2, 3])
    ...: v2 = np.array([4, 5, 6])
    ...: y0 = np.array((v1, v2)).flatten()
    ...: t_eval = np.linspace(0, 10, 11)
    ...: sol = scipy.integrate.solve_ivp(dy, (0, 10), y0, t_eval=t_eval)
    ...:
    ...: foo3 = eval_foo3(sol.t, sol.y)
    ...: print(sol.y[0:3])
    ...: print(sol.y[3:])
    ...: print(foo3)
[[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11.]
[ 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12.]
[ 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13.]]
[[ 4. 6. 8. 10. 12. 14. 16. 18. 20. 22. 24.]
[ 5. 7. 9. 11. 13. 15. 17. 19. 21. 23. 25.]
[ 6. 8. 10. 12. 14. 16. 18. 20. 22. 24. 26.]]
[[ 5. 8. 11. 14. 17. 20. 23. 26. 29. 32. 35.]
[ 7. 10. 13. 16. 19. 22. 25. 28. 31. 34. 37.]
[ 9. 12. 15. 18. 21. 24. 27. 30. 33. 36. 39.]]
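A minimal sketch of such helper functions (the names pack and unpack are my own, not from the question):
def pack(v1, v2):
    # flatten the two 3-vectors into the 1D state vector solve_ivp expects
    return np.concatenate((v1, v2))

def unpack(y):
    # split a state vector y (or the (6, n) solution matrix sol.y) back into its parts
    return y[0:3], y[3:]

y0 = pack(v1, v2)
sol = scipy.integrate.solve_ivp(dy, (0, 10), y0, t_eval=t_eval)
v1_traj, v2_traj = unpack(sol.y)   # each has shape (3, n_times)
foo3 = eval_foo3(sol.t, sol.y)     # evaluated over the whole solution at once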
Given a vector, [1,2,3,4,5] for example, how do you upsample the vector with linear interpolation to a certain length, such as 45, in Python?
If it is linear, there should be a constant increase or decrease between each new element. In your case it is one. So take the difference between two adjacent elements, then repeatedly add it to the last element as many times as needed.
a = [1, 2, 3, 4, 5]
num_add = 45 - len(a)
b = a[1] - a[0]
for z in range(num_add):
    a.append(a[-1] + b)
This extends a to exactly 45 elements, continuing the same constant step.
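Alternatively, if the goal is to stretch the existing series onto 45 samples (rather than extend it past its last value), numpy's np.interp does the linear interpolation in one call; a sketch:
import numpy as np

a = [1, 2, 3, 4, 5]
target_len = 45

old_x = np.linspace(0, 1, len(a))        # positions of the original samples
new_x = np.linspace(0, 1, target_len)    # positions of the upsampled points
upsampled = np.interp(new_x, old_x, a)   # 45 values running from 1.0 to 5.0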
Well, I interpreted your list of [1, 2, 3, 4, 5] as simply an example. If you want a script that will actually interpolate the series you give it, try this:
from scipy.optimize import curve_fit
import numpy as np

# Line equation - doesn't have to be linear
def lin_eq(x, m, b):
    return x * m + b

# Your actual data
std_y = np.array([1, 2, 3, 4, 5])
# Index of data
std_x = np.arange(1, len(std_y) + 1)

popt, pcov = curve_fit(lin_eq, std_x, std_y)

top = 45
# Index of projected data
proj_x = np.arange(1, top + 1)
# Interpolated data
proj_y = lin_eq(proj_x, *popt)
print(proj_y)
[ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15.
16. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. 28. 29. 30.
31. 32. 33. 34. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45.]
I'm looking for a way to calculate the cumulative sum with numpy, but I want to reset the running sum to zero (rather than roll the near-zero value forward) whenever the cumulative sum comes out very close to zero and negative.
For instance
a = np.asarray([0, 4999, -5000, 1000])
np.cumsum(a)
returns [0, 4999, -1, 999]
but I'd like to set the [2]-value (-1) to zero during the calculation. The problem is that this decision can only be made during the calculation, since the intermediate result isn't known a priori.
The expected array is: [0, 4999, 0, 1000]
The reason for this is that I'm getting very small values (floating point, not integers as in the example) which are due to floating point calculations which should in reality be zero. Calculating the cumulative sum compounds those values which leads to errors.
The Kahan summation algorithm could solve the problem. Unfortunately, it is not implemented in numpy. This means a custom implementation is required:
import numpy as np

def kahan_cumsum(x):
    x = np.asarray(x, dtype=float)  # compensation only makes sense for floats
    cumulator = np.zeros_like(x)
    compensation = 0.0
    cumulator[0] = x[0]
    for i in range(1, len(x)):
        y = x[i] - compensation
        t = cumulator[i - 1] + y
        compensation = (t - cumulator[i - 1]) - y
        cumulator[i] = t
    return cumulator
I have to admit, this is not exactly what was asked for in the question. (A value of -1 at the third output of the cumsum is mathematically correct in the example.) However, I hope this solves the actual problem behind the question, which is related to floating point precision.
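A quick usage sketch on the question's example (cast to float, since the compensation only helps floating point data):
a = np.asarray([0.0, 4999.0, -5000.0, 1000.0])
print(kahan_cumsum(a))
# [   0. 4999.   -1.  999.]  -- the -1 is the exact partial sum;
# Kahan only removes rounding error, it does not clamp near-zero values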
I wonder if rounding will do what you are asking for:
np.cumsum(np.around(a,-1))
# the -1 means it rounds to the nearest 10
gives
array([ 0, 5000, 0, 1000])
It is not exactly the expected array from your question, but using around, perhaps with the decimals parameter set to 0, might work when you apply it to the real problem with floats.
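For example, with hypothetical float data carrying tiny numerical noise (an illustration of mine, not from the question), rounding before the cumsum removes the noise:
a = np.array([0.0, 4999.0, -4999.0000000001, 1000.0])
print(np.cumsum(np.around(a, decimals=6)))
# [   0. 4999.    0. 1000.]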
Probably the best way to go is to write this bit in Cython (name the file cumsum_eps.pyx):
cimport numpy as cnp
import numpy as np

cdef inline _cumsum_eps_f4(float *A, int ndim, int dims[], float *out, float eps):
    cdef float sum
    cdef size_t ofs
    N = 1
    for i in xrange(0, ndim - 1):
        N *= dims[i]
    ofs = 0
    for i in xrange(0, N):
        sum = 0
        for k in xrange(0, dims[ndim - 1]):
            sum += A[ofs]
            if abs(sum) < eps:
                sum = 0
            out[ofs] = sum
            ofs += 1

def cumsum_eps_f4(cnp.ndarray[cnp.float32_t, mode='c'] A, shape, float eps):
    cdef cnp.ndarray[cnp.float32_t] _out
    cdef cnp.ndarray[cnp.int_t] _shape
    N = np.prod(shape)
    out = np.zeros(N, dtype=np.float32)
    _out = <cnp.ndarray[cnp.float32_t]> out
    _shape = <cnp.ndarray[cnp.int_t]> np.array(shape, dtype=np.int)
    _cumsum_eps_f4(&A[0], len(shape), <int*> &_shape[0], &_out[0], eps)
    return out.reshape(shape)

def cumsum_eps(A, axis=None, eps=np.finfo('float').eps):
    A = np.array(A)
    if axis is None:
        A = np.ravel(A)
    else:
        axes = list(xrange(len(A.shape)))
        axes[axis], axes[-1] = axes[-1], axes[axis]
        A = np.transpose(A, axes)
    if A.dtype == np.float32:
        out = cumsum_eps_f4(np.ravel(np.ascontiguousarray(A)), A.shape, eps)
    else:
        raise ValueError('Unsupported dtype')
    if axis is not None:
        out = np.transpose(out, axes)
    return out
then you can compile it like this (Windows, Visual C++ 2008 Command Line):
\Python27\Scripts\cython.exe cumsum_eps.pyx
cl /c cumsum_eps.c /IC:\Python27\include /IC:\Python27\Lib\site-packages\numpy\core\include
link /dll cumsum_eps.obj C:\Python27\libs\python27.lib /OUT:cumsum_eps.pyd
or like this (on Linux use the .so extension, on Cygwin the .dll extension; gcc):
cython cumsum_eps.pyx
gcc -c cumsum_eps.c -o cumsum_eps.o -I/usr/include/python2.7 -I/usr/lib/python2.7/site-packages/numpy/core/include
gcc -shared cumsum_eps.o -o cumsum_eps.so -lpython2.7
and use it like this:
from cumsum_eps import *
import numpy as np
x = np.array([[1,2,3,4], [5,6,7,8]], dtype=np.float32)
>>> print cumsum_eps(x)
[ 1. 3. 6. 10. 15. 21. 28. 36.]
>>> print cumsum_eps(x, axis=0)
[[ 1. 2. 3. 4.]
[ 6. 8. 10. 12.]]
>>> print cumsum_eps(x, axis=1)
[[ 1. 3. 6. 10.]
[ 5. 11. 18. 26.]]
>>> print cumsum_eps(x, axis=0, eps=1)
[[ 1. 2. 3. 4.]
[ 6. 8. 10. 12.]]
>>> print cumsum_eps(x, axis=0, eps=2)
[[ 0. 2. 3. 4.]
[ 5. 8. 10. 12.]]
>>> print cumsum_eps(x, axis=0, eps=3)
[[ 0. 0. 3. 4.]
[ 5. 6. 10. 12.]]
>>> print cumsum_eps(x, axis=0, eps=4)
[[ 0. 0. 0. 4.]
[ 5. 6. 7. 12.]]
>>> print cumsum_eps(x, axis=0, eps=8)
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 8.]]
>>> print cumsum_eps(x, axis=1, eps=3)
[[ 0. 0. 3. 7.]
[ 5. 11. 18. 26.]]
and so on. Of course, normally eps would be some small value; integers are used here just for the sake of demonstration and ease of typing.
If you need this for doubles as well, the _f8 variants are trivial to write, and another case has to be handled in cumsum_eps().
When you're happy with the implementation you should make it a proper part of your setup.py (see the Cython documentation on building with setup.py).
Update #1: If you have good compiler support in your run environment you could try Theano to implement either the compensation algorithm or your original idea:
import numpy as np
import theano
import theano.tensor as T
from theano.ifelse import ifelse
A=T.vector('A')
sum=T.as_tensor_variable(np.asarray(0, dtype=np.float64))
res, upd=theano.scan(fn=lambda cur_sum, val: ifelse(T.lt(cur_sum+val, 1.0), np.asarray(0, dtype=np.float64), cur_sum+val), outputs_info=sum, sequences=A)
f=theano.function(inputs=[A], outputs=res)
f([0.9, 2, 3, 4])
will give [0 2 3 4] as output. Either way, with Cython or Theano, you get roughly the performance of native code.
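For reference, the same clamping rule is easy to state in plain Python; this is slow, but handy as a correctness check against the compiled versions (a sketch of mine, with the eps semantics assumed to match the question):
import numpy as np

def cumsum_clamp(x, eps=np.finfo(float).eps):
    # cumulative sum that resets the running total to zero whenever |sum| < eps
    out = np.empty(len(x))
    s = 0.0
    for i, v in enumerate(np.asarray(x, dtype=float)):
        s += v
        if abs(s) < eps:
            s = 0.0
        out[i] = s
    return out

print(cumsum_clamp([0, 4999, -5000, 1000], eps=10))
# [   0. 4999.    0. 1000.]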
My aim is to interpolate some data. To do that I have to create a meshgrid.
For this step, I have an array with my 2D coordinates, coord (first column: element number, second: X, third: Y).
I build a meshgrid with np.meshgrid as you can see below.
But my results seem strange, so I would like to know if I have made
a mistake. Do I have to reorganize my data before the meshgrid step?
import numpy as np
coord = np.array([[ 1. , -1.38888667, -1.94444333],
[ 2. , -1.94444333, -1.38888667],
[ 3. , 0.27777667, -1.94444333],
[ 4. , -0.27777667, -1.38888667],
[ 5. , 1.94444333, -1.94444333],
[ 6. , 1.38888667, -1.38888667],
[ 7. , -1.38888667, -0.27777667],
[ 8. , -1.94444333, 0.27777667],
[ 9. , 0.27777667, -0.27777667],
[ 10. , -0.27777667, 0.27777667],
[ 11. , 1.94444333, -0.27777667],
[ 12. , 1.38888667, 0.27777667],
[ 13. , -1.38888667, 1.38888667],
[ 14. , -1.94444333, 1.94444333],
[ 15. , 0.27777667, 1.38888667],
[ 16. , -0.27777667, 1.94444333],
[ 17. , 1.94444333, 1.38888667],
[ 18. , 1.38888667, 1.94444333]])
[Y,X]=np.meshgrid(coord[:,2],coord[:,1])
If I plot Y, I get this:
plt.imshow(Y);plt.colorbar();plt.show()
---- EDIT LATER -----
I'm wondering, for example, whether the coordinates passed to meshgrid have to be strictly increasing, and whether there is a better way when my coordinates are not organized.
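For reference, here is what I tried in order to get strictly increasing axes out of my scattered points (using np.unique, and assuming the points actually lie on a regular grid):
# build sorted, unique 1D axes from the scattered coordinate columns
xs = np.unique(coord[:, 1])
ys = np.unique(coord[:, 2])
X, Y = np.meshgrid(xs, ys)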
For the interpolation, I would like to use:
def interpolate(values, tri, uv, d=2):
    simplex = tri.find_simplex(uv)
    vertices = np.take(tri.simplices, simplex, axis=0)
    temp = np.take(tri.transform, simplex, axis=0)
    delta = uv - temp[:, d]
    bary = np.einsum('njk,nk->nj', temp[:, :d, :], delta)
    return np.einsum('nj,nj->n', np.take(values, vertices),
                     np.hstack((bary, 1.0 - bary.sum(axis=1, keepdims=True))))
which was used on Stack Overflow before, in Speedup scipy griddata for multiple interpolations between two irregular grids, to limit the calculation time.
EDIT: Paul has solved this one below. Thanks!
I'm trying to resample (upscale) a 3x3 matrix to 5x5, filling in the intermediate points with either interpolate.interp2d or interpolate.RectBivariateSpline (or whatever works).
If there's a simple, existing function to do this, I'd like to use it, but I haven't found it yet. For example, a function that would work like:
# upscale 2x2 to 4x4
matrixSmall = ([[-1,8],[3,5]])
matrixBig = matrixSmall.resample(4,4,cubic)
So, if I start with a 3x3 matrix / array:
0,-2,0
-2,11,-2
0,-2,0
I want to compute a new 5x5 matrix ("I" meaning interpolated value):
0, I[1,0], -2, I[3,0], 0
I[0,1], I[1,1], I[2,1], I[3,1], I[4,1]
-2, I[1,2], 11, I[3,2], -2
I[0,3], I[1,3], I[2,3], I[3,3], I[4,3]
0, I[1,4], -2, I[3,4], 0
I've been searching and reading up and trying various different test code, but I haven't quite figured out the correct syntax for what I'm trying to do. I'm also not sure if I need to be using meshgrid, mgrid or linspace in certain lines.
EDIT: Fixed and working, thanks to Paul:
import numpy, scipy
from scipy import interpolate

kernelIn = numpy.array([[0, -2, 0],
                        [-2, 11, -2],
                        [0, -2, 0]])
inKSize = len(kernelIn)
outKSize = 5
kernelOut = numpy.zeros((outKSize, outKSize), numpy.uint8)

x = numpy.array([0, 1, 2])
y = numpy.array([0, 1, 2])
z = kernelIn
xx = numpy.linspace(x.min(), x.max(), outKSize)
yy = numpy.linspace(y.min(), y.max(), outKSize)

newKernel = interpolate.RectBivariateSpline(x, y, z, kx=2, ky=2)
kernelOut = newKernel(xx, yy)
print(kernelOut)
Only two small problems:
1) Your xx,yy is outside the bounds of x,y (you can extrapolate, but I'm guessing you don't want to.)
2) Your sample size is too small for a kx and ky of 3 (default). Lower it to 2 and get a quadratic fit instead of cubic.
import numpy, scipy
from scipy import interpolate

kernelIn = numpy.array([
    [0, -2, 0],
    [-2, 11, -2],
    [0, -2, 0]])
inKSize = len(kernelIn)
outKSize = 5
kernelOut = numpy.zeros((outKSize, outKSize), numpy.uint8)

x = numpy.array([0, 1, 2])
y = numpy.array([0, 1, 2])
z = kernelIn
xx = numpy.linspace(x.min(), x.max(), outKSize)
yy = numpy.linspace(y.min(), y.max(), outKSize)

newKernel = interpolate.RectBivariateSpline(x, y, z, kx=2, ky=2)
kernelOut = newKernel(xx, yy)
print(kernelOut)
##[[ 0. -1.5 -2. -1.5 0. ]
## [ -1.5 5.4375 7.75 5.4375 -1.5 ]
## [ -2. 7.75 11. 7.75 -2. ]
## [ -1.5 5.4375 7.75 5.4375 -1.5 ]
## [ 0. -1.5 -2. -1.5 0. ]]
If you are using scipy already, I think scipy.ndimage.interpolation.zoom can do what you need:
import numpy
import scipy.ndimage
a = numpy.array([[0.,-2.,0.], [-2.,11.,-2.], [0.,-2.,0.]])
out = numpy.round(scipy.ndimage.interpolation.zoom(input=a, zoom=5./3, order=2), 1)
print(out)
#[[ 0. -1. -2. -1. 0. ]
# [ -1. 1.8 4.5 1.8 -1. ]
# [ -2. 4.5 11. 4.5 -2. ]
# [ -1. 1.8 4.5 1.8 -1. ]
# [ 0. -1. -2. -1. 0. ]]
Here the "zoom factor" is 5./3 because we are going from a 3x3 array to a 5x5 array. If you read the docs, it says that you can also specify the zoom factor independently for the two axes, which means you can upscale non-square matrices as well. By default, it uses third order spline interpolation, which I am not sure is best.
I tried it on some images and it works nicely.