xarray multiplication is creating incorrect output shape

I am multiplying two xarrays in Python as follows:
z = x * y where x and y are xarrays with dimensions [30,33,720,1440] for x and [33,720,1440] for y. The coordinates along dimensions 1, 2 and 3 for x match the coordinates along dimensions 0, 1 and 2 for y, and have the same dimension names (pressure, latitude, longitude). Strangely, the output z has dimensions [30,33,630,1237].
I have figured out this is caused by one of the latitude arrays differing from the other by an extremely small amount, -1.42108547e-14, at 90 points, which is practically no difference. A similar thing happens for the longitude. Any ideas on how to eliminate this difference or force xarray to ignore it? (I'd rather not do the multiplication with numpy).
In case you are wondering, I did try
x.assign_coords(lat=y.lat,lon=y.lon)
I don't know why that didn't work. Maybe it's because one coordinate array has a different type (float32 vs float64)?

This works.
y_new = x[0,:,:,:].copy()
y_new.values = y.values
z = x * y_new
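
As far as I can tell, this works because xarray aligns operands on their coordinate labels before arithmetic, and the default join is an inner join, so labels that differ even in the last bit are dropped from the result. Copying a slice of x and overwriting its values gives y_new exactly x's coordinates. A lighter alternative, sketched below with small made-up arrays, is to overwrite y's coordinates with x's. Note that assign_coords returns a new object instead of modifying in place, so its result has to be assigned; the attempt in the question discards it, which may be why it appeared not to work. The dimension names (time, lat, lon) and the sizes here are assumptions for illustration.

```python
import numpy as np
import xarray as xr

# Small stand-ins for the real arrays (names and sizes are assumptions)
lat = np.linspace(-90.0, 90.0, 5)
lon = np.linspace(0.0, 360.0, 8, endpoint=False)
x = xr.DataArray(
    np.ones((2, 5, 8)),
    dims=("time", "lat", "lon"),
    coords={"lat": lat, "lon": lon},
)

# y has one latitude label that differs by a tiny floating-point amount
lat_noisy = lat.copy()
lat_noisy[1] += 1e-12
y = xr.DataArray(
    np.full((5, 8), 2.0),
    dims=("lat", "lon"),
    coords={"lat": lat_noisy, "lon": lon},
)

# Default arithmetic uses an inner join on labels, so the mismatched row is dropped
z_bad = x * y

# assign_coords returns a NEW object; keep the result
y_fixed = y.assign_coords(lat=x["lat"].values, lon=x["lon"].values)
z = x * y_fixed
```

Here z_bad loses one latitude row, while z keeps the full (time, lat, lon) shape of x. Overwriting coordinates like this is only appropriate when you are sure the two grids are meant to be identical.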

Related

Squaring every single item in a 2D list

So I have a large list of points.
I have split those points up into the x coordinates and the y coordinates and then further split them into groups of 1000.
x = [points_Cartesian[x: x + 1000, 0] for x in range(0, len(points_Cartesian), 1000)]
(The y coordinates look the same but with y instead of x.)
I am trying to turn the cartesian points into polar and to do so I must square every item in x and every item in y.
for sublist1 in x:
    temp1 = []
    for inte1 in sublist1:
        temp1.append(inte1**2)
    xSqua.append(temp1)
After that I add both of the Squared values together and square root them to get rad.
rad = np.sqrt(xSqua + ySqua)
The problem is, I started with 10,000 points and somewhere in this code it gets trimmed down to 1,000.
Does anyone know what the error is and how I fix it?

You're already using numpy. You can reshape matrices using numpy.reshape() and square the entire array elementwise using the ** operator on the entire array and your code will be much faster than iterating.
For example, let's say we have a 10000x2 points_Cartesian
points_Cartesian = np.random.random((10000,2))
# reshape to 1000 columns, as many rows as required
xpts = points_Cartesian[:, 0].reshape((-1, 1000))
ypts = points_Cartesian[:, 1].reshape((-1, 1000))
# elementwise square using **
rad = np.sqrt(xpts**2 + ypts**2)
ang = np.arctan2(ypts, xpts)
Now rad and ang are 10x1000 arrays.
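
As for where points may go missing in the original loop: assuming xSqua and ySqua are plain Python lists (as the append calls suggest), the + operator concatenates them instead of adding element by element, so the result no longer has the shape you expect. This may not be the only cause, but it is worth ruling out. A small sketch with made-up values:

```python
import numpy as np

# Made-up stand-ins for the squared coordinates, stored as nested lists
x_squared = [[1.0, 4.0], [9.0, 16.0]]
y_squared = [[0.0, 9.0], [16.0, 9.0]]

# On lists, + concatenates: four sublists instead of an elementwise sum
concatenated = x_squared + y_squared

# On arrays, + adds element by element and keeps the (2, 2) shape
summed = np.asarray(x_squared) + np.asarray(y_squared)
rad = np.sqrt(summed)
```

Converting to arrays before adding (or never leaving numpy, as above) avoids the problem.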

scipy RegularGridInterpolator The points in dimension 0 must be strictly ascending

When calling:
interpolator = scipy.interpolate.RegularGridInterpolator((X, Y, Z), data, method='linear')
I get the error "The points in dimension 0 must be strictly ascending".
Why must the points have strictly ascending x values? Surely I can create an interpolator from data in which several points share the same x value, for example with these coordinates into the data array:
0,0,0 and 0,0,1
(or X = [0,0], Y = [0,0] and Z = [0,1])
I must be missing something about the input format, but can't see what.

Ok, it looks like RegularGridInterpolator isn't what I need, because it requires all values in the grid to be defined. LinearNDInterpolator is what I need.
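
To spell out the difference in input format: RegularGridInterpolator takes one 1-D axis vector per dimension (each strictly ascending, which is what the error message is about) plus a data array defined at every grid node, whereas LinearNDInterpolator takes a list of (x, y, z) points with one value each. The sketch below uses a made-up linear function so both results can be checked by hand; the grid sizes and the dropped point are assumptions for illustration.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator, RegularGridInterpolator

# Regular grid: one strictly ascending axis vector per dimension
xg = np.array([0.0, 1.0, 2.0])
yg = np.array([0.0, 1.0])
zg = np.array([0.0, 1.0, 2.0, 3.0])
X, Y, Z = np.meshgrid(xg, yg, zg, indexing="ij")
data = X + 2.0 * Y + 3.0 * Z  # shape (3, 2, 4): a value at every grid node

grid_interp = RegularGridInterpolator((xg, yg, zg), data, method="linear")

# Scattered points: an (N, 3) list of coordinates, repeated x values are fine.
# Dropping one node mimics data that does not fill the whole grid.
points = np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])[:-1]
values = data.ravel()[:-1]
scattered_interp = LinearNDInterpolator(points, values)

query = np.array([[0.5, 0.5, 1.5]])
grid_value = grid_interp(query)            # 0.5 + 2*0.5 + 3*1.5 = 6.0
scattered_value = scattered_interp(query)  # also 6.0 for a linear function
```

LinearNDInterpolator triangulates the points first, so it is slower to build than the grid version and returns NaN outside the convex hull of the input points by default.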

Reshaping numpy array

What I am trying to do is take a numpy array representing 3D image data and calculate the hessian matrix for every voxel. My input is a matrix of shape (Z,X,Y) and I can easily take a slice along z and retrieve a single original image.
gx, gy, gz = np.gradient(imgs)
gxx, gxy, gxz = np.gradient(gx)
gyx, gyy, gyz = np.gradient(gy)
gzx, gzy, gzz = np.gradient(gz)
And I can access the hessian for an individual voxel as follows:
x = 100
y = 100
z = 63
H = [[gxx[z][x][y], gxy[z][x][y], gxz[z][x][y]],
[gyx[z][x][y], gyy[z][x][y], gyz[z][x][y]],
[gzx[z][x][y], gzy[z][x][y], gzz[z][x][y]]]
But this is cumbersome and I can't easily slice the data.
I have tried using reshape as follows
H = H.reshape(Z, X, Y, 3, 3)
But when I test this by retrieving the hessian for a specific voxel, the value returned from the reshaped array is completely different from the one in the original array.
I think I could use zip somehow but I have only been able to find that for making lists of tuples.
Bonus: If there's a faster way to accomplish this please let me know. I essentially need to calculate the three eigenvalues of the hessian matrix for every voxel in the 3D data set. Calculating the hessian values is really fast but finding the eigenvalues for a single 2D image slice takes about 20 seconds. Are there any GPU or TensorFlow accelerated libraries for image processing?

We can use a list comprehension to get the hessians -
H_all = np.array([np.gradient(i) for i in np.gradient(imgs)]).transpose(2,3,4,0,1)
Just to give it a bit of explanation : [np.gradient(i) for i in np.gradient(imgs)] loops through the two levels of outputs from np.gradient calls, resulting in a (3 x 3) shaped tensor at the outer two axes. We need these two as the last two axes in the final output. So, we push those at the end with the transpose.
Thus, H_all holds all the hessians and hence we can extract our specific hessian given x,y,z, like so -
x = 100
y = 100
z = 63
H = H_all[z,x,y]
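
A quick way to convince yourself of the axis order, and a possible answer to the bonus question, is sketched below on a small random volume standing in for imgs. np.linalg.eigvalsh accepts a stack of matrices, so it can process every voxel in one call; it assumes each 3x3 matrix is symmetric and reads only one triangle, which should be a reasonable assumption for a Hessian built this way but is worth checking on real data.

```python
import numpy as np

rng = np.random.default_rng(0)
imgs = rng.random((6, 7, 8))  # small stand-in for the (Z, X, Y) volume

# Same construction as above: result has shape (Z, X, Y, 3, 3)
H_all = np.array([np.gradient(g) for g in np.gradient(imgs)]).transpose(2, 3, 4, 0, 1)

# Cross-check one entry against the explicit second derivatives
g0, g1, g2 = np.gradient(imgs)   # derivatives along axes 0, 1, 2
g00, g01, g02 = np.gradient(g0)  # second derivatives of the axis-0 gradient
z, x, y = 3, 2, 5
entry_matches = np.isclose(H_all[z, x, y, 0, 1], g01[z, x, y])

# Eigenvalues for every voxel at once; output shape is (Z, X, Y, 3)
eigvals = np.linalg.eigvalsh(H_all)
```

The last axis of eigvals holds the three eigenvalues per voxel in ascending order.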

contour plot - 2D shape of X and Y values [duplicate]

This question already has answers here:
Why does pyplot.contour() require Z to be a 2D array?
I plot a contour plot which indicates the separating hyperplane of an SVC estimator in a 2D axes using the following code.
X,y= make_circles(n_samples=50,factor=.1,noise=.1)
x_fit=np.linspace(-1.5,1.5,10)
y_fit=np.linspace(-1.5,1.5,10)
Y,XX=np.meshgrid(x_fit,y_fit)
xy=np.vstack([XX.ravel(),Y.ravel()]).T
P=clf.decision_function(xy).reshape(XX.shape)
plt.contour(XX,Y,P,colors="k",levels=[-1,0,1],alpha=0.5,linestyles=["--","-","--"])
Question
Based on this question and the answer of Ilya V. Schurov, there is still one issue for me. I understand that X and Y provide the x and y values and Z provides the "depth" for each xy coordinate and thus has to be 2-dimensional. Further, the X and Y values of the plt.contour() function can be either 1D or 2D (if 1D, the meshgrid gets computed internally).
BUT what is the benefit/ reason for X and Y to be 2D?
Because actually the "second dimension" of X and Y can not be plotted on a 2D axes. So has it some "algorithmic performance" reasons for X and Y to be 2D or what is the reason?

A contour plot is not designed just for plotting hyperplanes of a classifier. It represents a 3-D surface in a 2-D format; in other words, it plots elevations over a 2-D area. Therefore, plt.contour() has to know the elevations covering the whole area. One way, and the current way, is to provide a set of elevations for a set of points covering the 2-D area. The more points you provide, the finer the final contour plot is. A 1-D x and y taken on their own describe a line rather than an area, which cannot be used to interpolate over a 2-D area.
Another way to plot hyperplanes is to calculate the exact planes yourself. Then you can plot hyperplanes with a 1-D linspace. But I don't think this will be easier than using plt.contour(), since plt.contour() does the hard calculation for you by interpolating.
Edit: How does Z work with X and Y in plt.contour()?
plt.contour() makes some assumptions about how Z relates to X and Y.
If X and Y are 2-D, a value in Z is the depth for the point specified by the corresponding (same index) values in X and Y.
If X and Y are 1-D, they are converted to a meshgrid first, as you can see in the source code. Then the rest works the same way as explained above.
So for your case specifically, using x_fit and y_fit can give you the same result because plt.contour() makes the meshgrid for you. As long as you understand the mechanism, either way is fine. The only thing I would say is that if you end up making the meshgrid for calculating P anyway, why not use the meshgrid to avoid assumptions and ambiguity?
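
To make the two call styles concrete, here is a minimal sketch using a made-up function in place of the decision function. It also shows one case where 2-D X and Y are needed: a grid that is not rectangular in x and y (here a polar grid), which a pair of 1-D vectors cannot describe. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-1.5, 1.5, 30)
y = np.linspace(-1.5, 1.5, 20)
XX, YY = np.meshgrid(x, y)   # both have shape (len(y), len(x)) = (20, 30)
Z = XX**2 + YY**2            # stand-in for the decision function

levels = [0.5, 1.0, 1.5]
fig, (ax1, ax2, ax3) = plt.subplots(1, 3)

# 1-D x and y: contour builds the meshgrid internally
cs_1d = ax1.contour(x, y, Z, levels=levels)
# 2-D X and Y: same picture, with the grid passed explicitly
cs_2d = ax2.contour(XX, YY, Z, levels=levels)

# A curvilinear (polar) grid can only be expressed with 2-D X and Y
r = np.linspace(0.2, 1.5, 20)
theta = np.linspace(0.0, 2.0 * np.pi, 40)
R, T = np.meshgrid(r, theta)
cs_polar = ax3.contour(R * np.cos(T), R * np.sin(T), R**2, levels=levels)
```

For a rectangular grid like yours, the first two calls draw the same contours, so the choice is mostly about clarity.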

Python numpy grid transformation using universal functions

Here is my problem : I manipulate 432*46*136*136 grids representing time*(space) encompassed in numpy arrays with numpy and python. I have one array alt, which encompasses the altitudes of the grid points, and another array temp which stores the temperature of the grid points.
It is problematic for a comparison : if T1 and T2 are two results, T1[t0,z0,x0,y0] and T2[t0,z0,x0,y0] represent the temperature at H1[t0,z0,x0,y0] and H2[t0,z0,x0,y0] meters, respectively. But I want to compare the temperature of points at the same altitude, not at the same grid point.
Hence I want to modify the z-axis of my matrices to represent the altitude and not the grid point. I create a function conv(alt[t,z,x,y]) which attributes a number between -20 and 200 to each altitude. Here is my code :
def interpolation_extended(self,temp,alt):
    [t,z,x,y]=temp.shape
    new=np.zeros([t,220,x,y])
    for l in range(0,t):
        for j in range(0,z):
            for lat in range(0,x):
                for lon in range(0,y):
                    new[l,conv(alt[l,j,lat,lon]),lat,lon]=temp[l,j,lat,lon]
    return new
But this definitely takes too much time, and I can't work with it. I tried to write it using universal functions with numpy :
def interpolation_extended(self,temp,alt):
    [t,z,x,y]=temp.shape
    new=np.zeros([t,220,x,y])
    for j in range(0,z):
        new[:,conv(alt[:,j,:,:]),:,:]=temp[:,j,:,:]
    return new
But that does not work. Do you have any idea of doing this in python/numpy without using 4 nested loops ?
Thank you

I can't really try the code since I don't have your matrices, but something like this should do the job.
First, instead of declaring conv as a function, get the whole altitude projection for all your data:
conv = np.round(alt / 500.).astype(int)
np.round, NumPy's version of round, rounds all the elements of the matrix with vectorized operations in C, so you get a new array very quickly (at C speed). The following line aligns the altitudes to start at 0, by shifting the whole array by its minimum value (in your case, -20):
conv -= conv.min()
The line above transforms your altitude matrix from [-20, 200] to [0, 220] (better for indexing).
With that, interpolation can be done easily by getting multidimensional indices:
t, z, y, x = np.indices(temp.shape)
the vectors above contain all the indices needed to index your original matrix. You can then create the new matrix by doing:
new_matrix[t, conv[t, z, y, x], y, x] = temp[t, z, y, x]
without any loop at all.
Let me know if it works. It might give you some errors, since it is hard for me to test it without data, but it should do the job.
The following toy example works fine:
A = np.random.randn(3,4,5) # Random 3x4x5 matrix -- your temp matrix
B = np.random.randint(0, 10, 3*4*5).reshape(3,4,5) # your conv matrix with altitudes from 0 to 9
C = np.zeros((3,10,5)) # your new matrix
z, y, x = np.indices(A.shape)
C[z, B[z, y, x], x] = A[z, y, x]
C contains your results by altitude.
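
If you want to check the 4-D version against your nested loops before trusting it, something like the sketch below should do. The array sizes and the bins array (standing in for the precomputed conv indices) are made up, and the bins are built so that no two levels in a column land in the same altitude bin; when they do collide, which value ends up stored is not something I would rely on.

```python
import numpy as np

rng = np.random.default_rng(0)
nt, nz, nx, ny, nbins = 2, 4, 3, 3, 10
temp = rng.standard_normal((nt, nz, nx, ny))

# Hypothetical altitude bins: for each (t, x, y) column, nz distinct bins in [0, nbins)
bins = np.argsort(rng.random((nt, nbins, nx, ny)), axis=1)[:, :nz]

# Reference result using the four nested loops
expected = np.zeros((nt, nbins, nx, ny))
for t in range(nt):
    for z in range(nz):
        for i in range(nx):
            for j in range(ny):
                expected[t, bins[t, z, i, j], i, j] = temp[t, z, i, j]

# Vectorized version: index arrays with the same shape as temp
ti, zi, xi, yi = np.indices(temp.shape)
new = np.zeros((nt, nbins, nx, ny))
new[ti, bins, xi, yi] = temp

same = np.array_equal(new, expected)
```

Also make sure the new array has enough slots along the altitude axis: indices running from 0 to 220 inclusive need a length of 221.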
