I have a meshgrid in NumPy. I perform some calculations on the points and want to filter out points that could not be calculated for some reason (e.g. division by zero).
import numpy
from numpy import arange

Xout = arange(-400, 400, 20)
Yout = arange(0, 400, 20)
Zout = arange(0, 400, 20)
Xout_3d, Yout_3d, Zout_3d = numpy.meshgrid(Xout, Yout, Zout)

# some calculations, for example:
b = z / (y - x)
To perform z / (y - x) using those 3D mesh arrays, you can create a mask of the valid points, which are the ones where the Y and X values differ. This mask would be of shape (M, N), where M and N are the lengths of the Y and X axes respectively. To get such a mask spanning all combinations between X and Y, we can use NumPy's broadcasting. Thus, we would have such a mask like so -
mask = Yout[:,None] != Xout
Finally, and again using broadcasting to extend the mask along the third axis of the 3D arrays, we can perform the division and choose between an invalid specifier and the actual division result using np.where, like so -
invalid_spec = 0
out = np.where(mask[..., None], Zout_3d / (Yout_3d - Xout_3d), invalid_spec)
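Note that np.where evaluates both branches, so the invalid divisions still execute and NumPy would emit a RuntimeWarning for them. A minimal sketch to silence that, reusing the names defined above, with np.errstate -
import numpy as np

# np.where computes both branches; errstate suppresses the warnings
# from the divisions at the masked-out (y == x) positions
with np.errstate(divide='ignore', invalid='ignore'):
    out = np.where(mask[..., None], Zout_3d / (Yout_3d - Xout_3d), invalid_spec)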
Alternatively, we can get to such an output directly using broadcasting and thus avoid meshgrid and keeping those heavy 3D arrays in the workspace. The idea is to perform the subtraction and division on the fly, letting broadcasting populate the 3D grid implicitly. The implementation would look something like this -
np.where(mask[..., None], Zout / (Yout[:, None, None] - Xout[:, None]), invalid_spec)
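As a quick sanity check (reusing mask, invalid_spec and the out array computed from the meshgrid version above), both versions produce the same output -
out_bc = np.where(mask[..., None], Zout / (Yout[:, None, None] - Xout[:, None]), invalid_spec)
print(np.allclose(out, out_bc))  # True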
I'm trying to compare a 2D array to the product of two 1D arrays (joint probability density vs. the product of its individual probability densities) in order to determine if variables x and y are independent, where independence is given as ρ(x,y) = ρ(x)*ρ(y).
Let's say I called the 2D array h, and the 1D arrays n and m. How would I go about iterating over h to check if its elements are equivalent to n*m?
To test for exact equality, just use np.all()
import numpy as np
m = np.random.rand(10)
n = np.random.rand(20)
h = m.reshape(1, -1) * n.reshape(-1, 1)
print(np.all(h == m.reshape(1, -1) * n.reshape(-1, 1))) # True
To test whether the numbers are all close, you could use:
print(np.all(np.isclose(h, m.reshape(1, -1) * n.reshape(-1, 1))))
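The all-close test can also be written more compactly with np.allclose, which is equivalent here -
print(np.allclose(h, m.reshape(1, -1) * n.reshape(-1, 1)))  # True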
I want to vectorise the dot product of several 3x3 matrices (rotation matrices around the x-axis) with several 3x1 vectors. The application is the transformation of points (approx. 500k per array) from one coordinate system to another.
Here in the example there are only four of each. Hence, the result should again be four 3x1 vectors, i.e. the single components x, y, z should each be a length-4 vector. But I cannot get the dimensions figured out: here the dot product with tensordot results in a shape of (4, 3, 4), from which I need the diagonals again:
import numpy as np

x, y, z = np.zeros((3, 4, 1))
rota = np.arange(4 * 3 * 3).reshape((4, 3, 3))
v = np.arange(4 * 3).reshape((4, 3))
result = np.zeros_like(v, dtype=np.float64)

vec_rotated = np.tensordot(rota, v, axes=([-1], [1]))
for i in range(result.shape[0]):
    result[i, :] = vec_rotated[i, :, i]
x, y, z = result.T
How can I vectorise the complete thing?
Use np.einsum for an efficient solution -
x,y,z = np.einsum('ijk,ik->ji',rota,v)
Alternative with np.matmul/@ operator in Python 3.x -
x, y, z = np.matmul(rota, v[:, :, None])[..., 0].T
x, y, z = (rota @ v[..., None])[..., 0].T
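A quick check (a small sketch reusing rota and v from the question) that both one-liners match the loop-based result -
vec_rotated = np.tensordot(rota, v, axes=([-1], [1]))
reference = np.array([vec_rotated[i, :, i] for i in range(v.shape[0])]).T  # shape (3, 4)
print(np.allclose(np.einsum('ijk,ik->ji', rota, v), reference))  # True
print(np.allclose((rota @ v[..., None])[..., 0].T, reference))   # True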
Alternatively, the tensordot result can be handled via a transpose to obtain one component per diagonal:
vec_rotated = vec_rotated.transpose((1, 0, 2))
x, y, z = np.diag(vec_rotated[0, :, :]), np.diag(vec_rotated[1, :, :]), np.diag(vec_rotated[2, :, :])
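If you prefer to keep the tensordot route, np.diagonal can pull out all three diagonals in one vectorized call instead of three np.diag calls -
vec_rotated = np.tensordot(rota, v, axes=([-1], [1]))  # shape (4, 3, 4)
x, y, z = np.diagonal(vec_rotated, axis1=0, axis2=2)   # each of shape (4,)
Note that this still computes the full (4, 3, 4) product and discards most of it, so for ~500k points the einsum/matmul versions remain the efficient choice.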
I've an image of about 8000x9000 size as a numpy matrix. I also have a list of indices in a numpy 2xN matrix. These indices are fractional and may also fall outside the image bounds. I need to interpolate the image and find the values at the given indices. If an index falls outside the image, I need to return numpy.nan for it. Currently I'm doing this in a for loop as below
import numpy

def interpolate_image(image: numpy.ndarray, indices: numpy.ndarray) -> numpy.ndarray:
    """
    :param image: the image to sample
    :param indices: 2xN matrix. 1st row is dim1 (rows) indices, 2nd row is dim2 (cols) indices
    :return: 1xN array of sampled values (numpy.nan where out of bounds)
    """
    # Todo: Vectorize this
    M, N = image.shape
    num_indices = indices.shape[1]
    interpolated_image = numpy.zeros((1, num_indices))
    for i in range(num_indices):
        x, y = indices[:, i]
        if (x < 0 or x > M - 1) or (y < 0 or y > N - 1):
            interpolated_image[0, i] = numpy.nan
        else:
            # Todo: Do Bilinear Interpolation. For now nearest neighbor is implemented
            interpolated_image[0, i] = image[int(round(x)), int(round(y))]
    return interpolated_image
But the for loop is taking a huge amount of time (as expected). How can I vectorize this? I found scipy.interpolate.interp2d, but I'm not able to use it. Can someone explain how to use this? Any other method is also fine. I also found this, but again it does not match my requirements: given x and y indices, it generates interpolated matrices. I don't want that. For the given indices, I just want the interpolated values, i.e. I need a vector output, not a matrix.
I tried it like this, but as said above, it attempts to produce a matrix output and fails:
f = interpolate.interp2d(numpy.arange(image.shape[0]), numpy.arange(image.shape[1]), image, kind='linear')
interp_image_vect = f(indices[:,0], indices[:,1])
RuntimeError: Cannot produce output of size 73156608x73156608 (size too large)
For now, I've implemented nearest-neighbor interpolation. scipy interp2d doesn't have nearest neighbor. It would be good if the library function had nearest neighbor (so I can compare), but if not, that's also fine.
It looks like scipy.interpolate.RectBivariateSpline will do the trick:
import numpy
from scipy.interpolate import RectBivariateSpline

image =    # as given
indices =  # as given

M, N = image.shape
spline = RectBivariateSpline(numpy.arange(M), numpy.arange(N), image)
interpolated = spline(indices[0], indices[1], grid=False)
This gets you the interpolated values, but it doesn't give you nan where you need it. You can get that with where:
nans = numpy.zeros(interpolated.shape) + numpy.nan
x_in_bounds = (0 <= indices[0]) & (indices[0] <= M - 1)
y_in_bounds = (0 <= indices[1]) & (indices[1] <= N - 1)
bounded = numpy.where(x_in_bounds & y_in_bounds, interpolated, nans)
I tested this with a 2624x2624 image and 100,000 points in indices and all told it took under a second.
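If you also want a vectorized version of your current nearest-neighbor behaviour to compare against, here is a sketch following your 2xN index layout and bounds checks -
def interpolate_image_nn(image: numpy.ndarray, indices: numpy.ndarray) -> numpy.ndarray:
    """Vectorized nearest-neighbor sampling; out-of-bounds indices yield numpy.nan."""
    M, N = image.shape
    x, y = indices
    in_bounds = (0 <= x) & (x <= M - 1) & (0 <= y) & (y <= N - 1)
    # clip before rounding so the dummy lookups at out-of-bounds spots stay legal
    xr = numpy.round(numpy.clip(x, 0, M - 1)).astype(int)
    yr = numpy.round(numpy.clip(y, 0, N - 1)).astype(int)
    return numpy.where(in_bounds, image[xr, yr], numpy.nan)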
What I am trying to do is take a numpy array representing 3D image data and calculate the hessian matrix for every voxel. My input is a matrix of shape (Z,X,Y) and I can easily take a slice along z and retrieve a single original image.
gx, gy, gz = np.gradient(imgs)
gxx, gxy, gxz = np.gradient(gx)
gyx, gyy, gyz = np.gradient(gy)
gzx, gzy, gzz = np.gradient(gz)
And I can access the hessian for an individual voxel as follows:
x = 100
y = 100
z = 63
H = [[gxx[z][x][y], gxy[z][x][y], gxz[z][x][y]],
     [gyx[z][x][y], gyy[z][x][y], gyz[z][x][y]],
     [gzx[z][x][y], gzy[z][x][y], gzz[z][x][y]]]
But this is cumbersome and I can't easily slice the data.
I have tried using reshape as follows
H = H.reshape(Z, X, Y, 3, 3)
But when I test this by retrieving the hessian for a specific voxel, the value returned from the reshaped array is completely different from the one in the original array.
I think I could use zip somehow but I have only been able to find that for making lists of tuples.
Bonus: If there's a faster way to accomplish this, please let me know. I essentially need to calculate the three eigenvalues of the hessian matrix for every voxel in the 3D data set. Calculating the hessian values is really fast, but finding the eigenvalues for a single 2D image slice takes about 20 seconds. Are there any GPU- or TensorFlow-accelerated libraries for image processing?
We can use a list comprehension to get the hessians -
H_all = np.array([np.gradient(i) for i in np.gradient(imgs)]).transpose(2,3,4,0,1)
Just to give it a bit of explanation: [np.gradient(i) for i in np.gradient(imgs)] loops through the two levels of outputs from the np.gradient calls, resulting in a (3 x 3) structure on the outer two axes. We need these two as the last two axes in the final output, so we push them to the end with the transpose.
Thus, H_all holds all the hessians and hence we can extract our specific hessian given x,y,z, like so -
x = 100
y = 100
z = 63
H = H_all[z, x, y]  # matching the gxx[z][x][y] indexing above
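For the bonus part: with the hessians in the last two axes, np.linalg.eigvalsh computes the eigenvalues for every voxel in a single vectorized call, since it broadcasts over the leading (Z, X, Y) axes. A small sketch, symmetrizing first since the mixed partials from np.gradient only agree up to numerical error -
H_sym = 0.5 * (H_all + np.swapaxes(H_all, -1, -2))  # enforce exact symmetry
eigvals = np.linalg.eigvalsh(H_sym)                 # shape (Z, X, Y, 3)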
I tried to translate a piece of code from MATLAB to Python and I'm running into some errors:
MATLAB:
function [beta] = linear_regression_train(traindata)
    y = traindata(:,1); % output
    ind2 = find(y == 2);
    ind3 = find(y == 3);
    y(ind2) = -1;
    y(ind3) = 1;
    X = traindata(:,2:257); % X matrix, with size 1389x256
    beta = inv(X'*X)*X'*y;
Python:
def linear_regression_train(traindata):
    y = traindata[:, 0]  # This is the output
    ind2 = (labels == 2).nonzero()
    ind3 = (labels == 3).nonzero()
    y[ind2] = -1
    y[ind3] = 1
    X = traindata[:, 1:256]
    X_T = numpy.transpose(X)
    beta = inv(X_T*X)*X_T*y
    return beta
I am receiving an error: operands could not be broadcast together with shapes (257,0,1389) (1389,0,257) on the line where beta is calculated.
Any help is appreciated!
Thanks!
The problem is that you are working with numpy arrays, not matrices as in MATLAB. Matrices, by default, do matrix mathematical operations. So X*Y does a matrix multiplication of X and Y. With arrays, however, the default is to use element-by-element operations. So X*Y multiplies each corresponding element of X and Y. This is the equivalent of MATLAB's .* operation.
But just like how MATLAB's matrices can do element-by-element operations, Numpy's arrays can do matrix multiplication. So what you need to do is use numpy's matrix multiplication instead of its element-by-element multiplication. For Python 3.5 or higher (which is the version you should be using for this sort of work), that is just the @ operator. So your line becomes:
beta = inv(X_T @ X) @ X_T @ y
Or, better yet, you can use the simpler .T transpose, which is the same as np.transpose but much more concise (you can get rid of the np.transpose line entirely):
beta = inv(X.T @ X) @ X.T @ y
For Python 3.4 or earlier, you will need to use np.dot since those versions of Python don't have the @ matrix multiplication operator:
beta = np.dot(np.dot(inv(np.dot(X.T, X)), X.T), y)
Numpy has a matrix object that uses matrix operations by default like the MATLAB matrix. Do not use it! It is slow, poorly-supported, and almost never what you really want. The Python community has standardized around arrays, so use those.
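To see the two kinds of multiplication side by side, here is a minimal example -
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A * B)  # element-wise, like MATLAB's .*: [[ 5 12] [21 32]]
print(A @ B)  # matrix product, like MATLAB's *: [[19 22] [43 50]]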
There may also be some issues with the dimensions of traindata: for this to work properly with the indexing above, and for y and X to come out 2D, traindata would need to be 3D.
This could be an issue if traindata is 2D and you want y to be a MATLAB-style "vector" (what MATLAB calls "vectors" aren't really vectors). In numpy, using a single index like traindata[:, 0] reduces the number of dimensions, while taking a slice like traindata[:, :1] doesn't. So to keep y 2D when traindata is 2D, just take a length-1 slice, traindata[:, :1]. This gives exactly the same values but keeps the same number of dimensions as traindata.
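A quick illustration of that dimension-dropping behaviour, using a hypothetical array with the shapes you describe -
traindata = np.zeros((1389, 257))
print(traindata[:, 0].shape)   # (1389,)   single index drops a dimension
print(traindata[:, :1].shape)  # (1389, 1) length-1 slice stays 2D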
Notes: Your code can be significantly simplified using logical indexing:
def linear_regression_train(traindata):
    y = traindata[:, 0]  # This is the output
    y[y == 2] = -1
    y[y == 3] = 1
    X = traindata[:, 1:257]
    return inv(X.T @ X) @ X.T @ y
Also, your slice is wrong when defining X. Python slicing excludes the last value, so to get a 256-column slice you need to do 1:257, as I did above.
Finally, please keep in mind that modifications to arrays inside functions carry over outside the functions, since slicing does not make a copy (it returns a view into the same data). So your changes to y (setting some values to -1 and others to 1) will affect traindata outside of your function. If you want to avoid that, you need to make a copy before you make your changes:
y = traindata[:, 0].copy()
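A minimal demonstration of the difference -
a = np.arange(6.0).reshape(3, 2)
y = a[:, 0]         # a view into a
y[0] = -1
print(a[0, 0])      # -1.0: the original array changed

y = a[:, 0].copy()  # an independent copy
y[0] = 99
print(a[0, 0])      # still -1.0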