Indexing Arrays in NumPy - python

New here. I have a 100x100x100 array. I want to select only the first 10x10x10 values in the array, then average across the first two dimensions for a 100x10 array. Then I need to pull out 20 of the values from the 100x10 array and plot those 20 numbers on a scatterplot. Any help?
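One way to sketch out the mechanics (slicing a block, averaging over axes, picking values, scattering them); the array below is random placeholder data, and the exact averaging axis depends on how the 100x10 shape is meant:

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
arr = rng.random((100, 100, 100))        # stand-in for the real 100x100x100 array

block = arr[:10, :10, :10]               # first 10x10x10 values
avg = block.mean(axis=(0, 1))            # average over the first two axes -> shape (10,)

# note: averaging a 10x10x10 block over two axes leaves only 10 values;
# a (100, 10) result would come from averaging the full cube over one axis, e.g.:
avg_full = arr[:, :, :10].mean(axis=1)   # shape (100, 10)

vals = avg_full.ravel()[:20]             # pull out 20 of the values
plt.scatter(np.arange(vals.size), vals)  # plot those 20 numbers on a scatterplot
plt.show()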

Related

Changing a 2D array into a 3D array

I have an array that has 8450 rows and 16 columns. I want to feed these data points into an RNN with each 50 points being an entry. So 0-49 is z=0, 1-50 is z=1, and so forth. The columns need to remain unchanged so that I can still have the same data in each z-axis entry. So basically I am taking every chunk of 50 points and moving it into a third axis. Is there a simple way to do this in Python? I tried reshape but I may not have been doing it correctly. Currently the data is in a pandas dataframe.
points = 50
for i in range(len(data_prepped_dataframe) - points):
    x_data = data_prepped_dataframe.iloc[i:i+points, :]
So far I have this, but all it does is give me the last 50 points in the data set. I tried adding indices to the x_data term, but that threw an error.
I tried
x_data[:,:,i] = data_prepped_dataframe.iloc[i:i+points,:]
but the error said x_data wasn't defined.
If you change the dataframe to an array using np.array() and then add .copy() at the end, this will output the 3D array; specifically, in this case, an (8400 x 50 x 16) array.
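As a rough sketch of that approach (assuming data_prepped_dataframe is the (8450 x 16) pandas DataFrame from the question):

import numpy as np

points = 50
data = np.array(data_prepped_dataframe)        # shape (8450, 16)

# collect each 50-row window, then stack the windows into a 3D array
windows = [data[i:i + points, :].copy() for i in range(len(data) - points)]
x_data = np.array(windows)                     # shape (8400, 50, 16)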

Efficiently filter 3D matrix in numpy with variable 2D masks

I have a 3D numpy array points of dimensions [10000x3000x128], where the first dimension is the number of frames, the second dimension is the number of points in each frame, and the third dimension is a 128-element feature vector associated with each point. What I want to do is efficiently filter the points in each frame using a boolean 2D mask of dimensions [10000x3000], and for each of the selected points also take the related 128-dim feature vector. Moreover, the output still needs to be a 3D array rather than a merged 2D array, and I would like to avoid any for loops if possible.
Actually what I'm doing is:
import numpy as np

# example of points (the real array has shape (10000, 3000, 128))
points = np.zeros((10000, 3000, 128))
# fg, bg = 2D boolean np.array masks of shape (10000, 3000)
# init empty lists
fg_points, bg_points = [], []
for i in range(points.shape[0]):
    fg_mask_tmp, bg_mask_tmp = fg[i], bg[i]
    fg_points.append(points[i, fg_mask_tmp, :])
    bg_points.append(points[i, bg_mask_tmp, :])
fg_features, bg_features = np.array(fg_points), np.array(bg_points)
But this is quite a naive solution that can surely be improved in a more numpy-like way.
In addition, I also tried other solutions, such as:
fg_features = points[fg,:]
But this solution does not preserve the dimensions of the array: it merges the first two dimensions, since the number of filtered points can vary from frame to frame.
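For reference, a tiny self-contained example of that dimension collapse (the names here are made up for illustration):

import numpy as np

pts = np.arange(2 * 3 * 4).reshape(2, 3, 4)    # (frames, points, features)
mask = np.array([[True, False, True],
                 [False, True, True]])         # (frames, points), counts differ per frame
print(pts[mask, :].shape)                      # (4, 4): the frame and point axes are merged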
Another solution I tried was to enlarge the 2D masks by appending a [128] true value along the last dimension, but without any success.
Does anyone know a possible efficient solution?
Thank you in advance for any help!

1D plots from 3D array

I have a 3D data cube and I am trying to make a plot of the first axis at a specific value of the other two axes. The goal is to make a velocity plot at given coordinates in the sky.
I have tried to create a 1D array from the 3D array by putting in my values for the last two axes. This is what I have tried:
achan = 50
dchan = 200
lmcdata[:][achan][dchan]  # this array has three axes: vchan, achan, dchan
I am expecting an array of size 120 as there are 120 velocity channels that make up the vchan axis. When trying the code above I keep getting an array of size 655 which is the number of entries for the dchan axis.
Python slicing works from left to right. In this case, lmcdata[:] is returning the whole lmcdata list. So, lmcdata[:][achan][dchan] is equivalent to just lmcdata[achan][dchan].
For higher level indexing and slicing tasks like this, I highly recommend the numpy package. You will be able to slice lmcdata as expected after turning it into a numpy array: lmcdata = np.asarray(lmcdata).
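A rough sketch of what that looks like (assuming the axis order vchan, achan, dchan given in the question):

import numpy as np

lmcdata = np.asarray(lmcdata)         # axes: (vchan, achan, dchan)

achan = 50
dchan = 200
spectrum = lmcdata[:, achan, dchan]   # all velocity channels at that sky position
print(spectrum.shape)                 # (120,) if there are 120 velocity channels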

Concatenating numpy arrays of different shapes

I have several N-dimensional arrays of different shapes and want to combine them into a new (N+1)-dimensional array, where the new axis has a length corresponding to the number of initial N-d arrays.
This answer is sufficient if the original arrays are all the same shape; however, it does not work if they have different shapes.
I don't really want to reshape the arrays to a congruent size and fill with empty elements due to the subsequent analysis I need to perform on the final array.
Specifically, I have four 4D arrays. One of the things I want to do with the resulting 5D array is plot parts of the four arrays on the same matplotlib figure. Obviously I could plot each one separately; however, I will soon have more than four 4D arrays and am looking for a dynamic solution.
While I was writing this, Sven gave the same answer in the comments...
Put the arrays in a Python list in the following manner (note that a name like 5d_list is not a valid Python identifier, so call it something like list_5d):
list_5d = []
list_5d.append(array_4d_1)
list_5d.append(array_4d_2)
...
Then you can unpack them:
for array_4d in list_5d:
    # plot this 4D array on the figure
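A minimal runnable sketch of that approach, with two made-up 4D arrays of different shapes standing in for the real ones:

import numpy as np
import matplotlib.pyplot as plt

a = np.random.rand(2, 3, 4, 5)         # hypothetical stand-ins for the real 4D arrays
b = np.random.rand(2, 3, 4, 7)

list_5d = [a, b]                       # a plain list tolerates the differing shapes

fig, axes = plt.subplots(len(list_5d))
for ax, array_4d in zip(axes, list_5d):
    ax.plot(array_4d[0, 0, 0, :])      # e.g. plot one 1D slice of each 4D array
plt.show()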

Numpy Index values for each element in 3D array

I have a 3D array created using the numpy mgrid command so that each element has a certain value and the indexes retain the spatial information. For example, if one summed over the z-axis (3rd dimension), then the resultant 2D array could be used in matplotlib with the function imshow() to obtain an image with different binned pixel values.
My question is: How can I obtain the index values for each element in this grid (a,b,c)?
I need to use the index values to calculate the relative angle of each point to the origin of the grid (e.g. theta = arcsin(sqrt(x^2 + y^2) / sqrt(x^2 + y^2 + z^2))).
Maybe this can be translated to another 3D grid where each element is the array [a,b,c]?
I'm not exactly clear on your meaning, but if you are looking for 3d arrays that contain the indices x, y, and z, then the following may suit your needs; assume your data is held in a 3D array called "abc":
import numpy as nm

# x[i, j, k] == i, y[i, j, k] == j, z[i, j, k] == k for every element of abc
x, y, z = nm.mgrid[[slice(dm) for dm in abc.shape]]
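And, as a rough sketch of how those index grids could feed the angle calculation from the question (abc here is just placeholder data):

import numpy as np

abc = np.zeros((4, 5, 6))                             # placeholder data cube
x, y, z = np.mgrid[[slice(dm) for dm in abc.shape]]   # index grids matching abc

r_xy = np.sqrt(x**2 + y**2)
r = np.sqrt(x**2 + y**2 + z**2)
with np.errstate(invalid="ignore"):                   # the origin gives 0/0 -> nan
    theta = np.arcsin(r_xy / r)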
