I have an image of size M*N whose pixel coordinates have been flattened to a 1D array according to a space-filling curve (i.e. not a classical rasterization, where I could have used reshape).
I process my 1D array (the flattened image) and would then like to reshape it back to an M*N array (its initial size).
So far, I have done this with a for-loop:
for i in range(img_flat.size):
    img_res[x[i], y[i]] = img_flat[i]
x and y being the pixel coordinates according to my scan path.
However, I am wondering how to do this in a single line of code.
If x and y are 1D numpy arrays of length n, img_flat also has length n, and img_res is a 2D numpy array of shape (h, w) such that h*w = n, then:
img_res[x, y] = img_flat
Should suffice
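For example, a minimal self-contained sketch, using made-up coordinates for a hypothetical 2x3 image:

import numpy as np

# Hypothetical scan path of a 2x3 image: x[i], y[i] are the row/column of pixel i.
img_flat = np.array([10, 20, 30, 40, 50, 60])
x = np.array([0, 1, 1, 0, 0, 1])
y = np.array([0, 0, 1, 1, 2, 2])

img_res = np.empty((2, 3), dtype=img_flat.dtype)
img_res[x, y] = img_flat  # one fancy-indexing assignment replaces the loop
print(img_res)
# [[10 40 50]
#  [20 30 60]]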
In fact, it was easy:
vec = np.arange(0, seg.size, dtype=np.uint)
img_res[x[vec], y[vec]] = seg[vec]
(Since vec is just np.arange(seg.size), indexing with it is redundant; img_res[x, y] = seg gives the same result.)
I'm currently trying to fill a matrix K where each entry in the matrix is just a function applied to two entries of an array x.
At the moment I'm using the most obvious method of running through rows and columns one at a time using a double for-loop:
K = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    for j in range(x.shape[0]):
        K[i, j] = f(x[i], x[j])
While this works fine, the resulting matrix is 10,000 by 10,000 and takes very long to calculate. I was wondering if there is a more efficient way to do this built into NumPy?
EDIT: The function in question here is a gaussian kernel:
def gaussian(a, b, sigma):
    vec = a - b
    return np.exp(-np.dot(vec, vec) / (2 * sigma**2))
where I set sigma in advance before calculating the matrix.
The array x is an array of shape (10000, 8). So the scalar product in the gaussian is between two vectors of dimension 8.
You can use a single for loop together with broadcasting. This requires changing the implementation of the gaussian function to accept 2D inputs:
def gaussian(a, b, sigma):
    vec = a - b
    return np.exp(-np.sum(vec**2, axis=-1) / (2 * sigma**2))

K = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    K[i] = gaussian(x[i:i+1], x, sigma)
Theoretically you could accomplish this even without any for loop, again by using broadcasting, but this creates an intermediate array of size len(x)**2 * x.shape[1], which might run out of memory for your array sizes:
K = gaussian(x[None, :, :], x[:, None, :], sigma)
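As a quick sanity check, a minimal sketch on a smaller random array than the (10000, 8) one, with an assumed sigma, shows both versions agree:

import numpy as np

sigma = 1.0
x = np.random.randn(100, 8).astype(np.float32)

# Row-by-row broadcasting
K_loop = np.zeros((x.shape[0], x.shape[0]), dtype=np.float32)
for i in range(x.shape[0]):
    K_loop[i] = gaussian(x[i:i+1], x, sigma)

# Full broadcasting (creates a (100, 100, 8) intermediate)
K_full = gaussian(x[None, :, :], x[:, None, :], sigma)
print(np.allclose(K_loop, K_full, atol=1e-6))  # True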
I am trying to normalize some Nx3 data. If X is an Nx3 array and D is an Nx1 array, in MATLAB I can do
Y = X./D
If I do the following in Python, I get an error
X = np.random.randn(100,3)
D = np.linalg.norm(X,axis=1)
Y = X/D
ValueError: operands could not be broadcast together with shapes (100,3) (100,)
Any suggestions?
Edit: Thanks to dm2.
Y = X/D.reshape((100,1))
Another way is to use scikit-learn:
from sklearn import preprocessing
Y = preprocessing.normalize(X)
From numpy documentation on array broadcasting:
When operating on two arrays, NumPy compares their shapes
element-wise. It starts with the trailing (i.e. rightmost) dimensions
and works its way left. Two dimensions are compatible when:
- they are equal, or
- one of them is 1.
Both of your arrays have the same first dimension, but your X array is 2-dimensional, while your D array is 1-dimensional, which means the shapes of these two arrays do not meet the requirements to be broadcast together.
To make sure they do, you could reshape your D array into a 2-dimensional array of shape (100,1), which would satisfy the requirements to broadcast: rightmost dimensions are 3 and 1 (one of them is 1) and the other dimensions are equal (100 and 100).
So:
Y = X/D.reshape((-1,1))
or
Y = X/D.reshape((100,1))
or
Y = X/D[:,np.newaxis]
Should give you the result you're after.
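As a side note, np.linalg.norm also accepts a keepdims argument, which avoids the reshape entirely; a minimal sketch:

import numpy as np

X = np.random.randn(100, 3)
D = np.linalg.norm(X, axis=1, keepdims=True)  # shape (100, 1), broadcastable against (100, 3)
Y = X / D
print(np.allclose(np.linalg.norm(Y, axis=1), 1.0))  # True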
I am trying to calculate the average of a 3D array between two indices on the 1st axis. The start and end indices vary from cell to cell and are represented by two separate 2D arrays that are the same shape as a slice of the 3D array.
I have managed to implement a piece of code that loops through the pixels of my 3D array, but this method is painfully slow in the case of my array with a shape of (70, 550, 350). Is there a way to vectorise the operation using numpy or xarray (the arrays are stored in an xarray dataset)?
Here is a snippet of what I would like to optimise:
# My 3D raster containing values; shape = (time, x, y)
values = np.random.rand(10, 55, 60)
# A 2D raster containing start indices for the averaging
start_index = np.random.randint(0, 4, size=(values.shape[1], values.shape[2]))
# A 2D raster containing end indices for the averaging
end_index = np.random.randint(5, 9, size=(values.shape[1], values.shape[2]))
# Initialise an array that will contain results
mean_array = np.zeros_like(values[0, :, :])
# Loop over 3D raster to calculate the average between indices on axis 0
for i in range(0, values.shape[1]):
    for j in range(0, values.shape[2]):
        mean_array[i, j] = np.mean(values[start_index[i, j]:end_index[i, j], i, j], axis=0)
One way to do this without loops is to zero-out the entries you don't want to use, compute the sum of the remaining items, then divide by the number of nonzero entries. For example:
i = np.arange(values.shape[0])[:, None, None]
mean_array_2 = np.where((i >= start_index) & (i < end_index), values, 0).sum(0) / (end_index - start_index)
np.allclose(mean_array, mean_array_2)
# True
Note that this assumes that the indices are in the range 0 <= i < values.shape[0]; if this is not the case you can use np.clip or other means to standardize the indices before computation.
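For example, a minimal sketch of that standardization step, assuming out-of-range indices should simply be clamped:

# Clamp indices into the valid range [0, values.shape[0]] before the masked sum.
start_index = np.clip(start_index, 0, values.shape[0])
end_index = np.clip(end_index, 0, values.shape[0])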
I have a 2D numpy array input_array and two lists of indices (x_coords and y_coords). I'd like to slice a 3x3 subarray for each (x, y) pair, centered around the (x, y) coordinates. The end result will be an array of 3x3 subarrays where the number of subarrays is equal to the number of coordinate pairs I have, preferably obtained without for loops.
Currently I use a modification of the game-of-life strides example from the scipy cookbook:
http://wiki.scipy.org/Cookbook/GameOfLifeStrides
shape = (input_array.shape[0] - 2, input_array.shape[1] - 2, 3, 3)
strides = input_array.strides + input_array.strides
strided = np.lib.stride_tricks.as_strided(input_array, shape=shape, strides=strides).\
    reshape(shape[0]*shape[1], shape[2], shape[3])
This creates a view of the original array as a (flattened) array of all possible 3x3 subarrays. I then convert the x,y coordinate pairs to be able to select the subarrays I want from strided:
coords = x_coords - 1 + (y_coords - 1)*shape[1]
sub_arrays = strided[coords]
Although this works perfectly fine, I do feel it is a bit cumbersome. Is there a more direct approach? Also, in the future I would like to extend this to the 3D case: slicing nx3x3 subarrays from an nxmxk array. It might also be possible using strides, but so far I haven't been able to make it work in 3D.
Here is a method that uses array broadcasting:
x = np.random.randint(1, 63, 10)
y = np.random.randint(1, 63, 10)
dy, dx = [grid.astype(int) for grid in np.mgrid[-1:1:3j, -1:1:3j]]
Y = dy[None, :, :] + y[:, None, None]
X = dx[None, :, :] + x[:, None, None]
Then you can use a[Y, X] to select blocks from a. Here is some example code:
img = np.zeros((64, 64))
img[Y, X] = 1
(Figure omitted: the selected blocks visualized with pyplot.imshow(img).)
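A self-contained check that the broadcast indexing picks out the same windows as explicit slicing, using a random array as a stand-in for input_array:

import numpy as np

a = np.random.rand(64, 64)
x = np.random.randint(1, 63, 10)
y = np.random.randint(1, 63, 10)
dy, dx = [grid.astype(int) for grid in np.mgrid[-1:1:3j, -1:1:3j]]
Y = dy[None, :, :] + y[:, None, None]
X = dx[None, :, :] + x[:, None, None]

blocks = a[Y, X]  # shape (10, 3, 3)
print(blocks.shape)
print(np.allclose(blocks[0], a[y[0]-1:y[0]+2, x[0]-1:x[0]+2]))  # True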
A very straightforward solution would be a list comprehension with itertools.product:
import itertools
sub_arrays = [input_array[x-1:x+2, y-1:y+2]
for x, y in itertools.product(x_coords, y_coords)]
This creates all possible tuples of coordinates and then slices the 3x3 arrays from input_array. Note that itertools.product yields every combination of x_coords and y_coords; if you want only the given (x, y) pairs, use zip(x_coords, y_coords) instead.
But this is sort of a for loop, and you will have to take care that x_coords and y_coords are not on the border of the matrix.
I'm searching for an algorithm to merge a given number of multidimensional arrays (each of the same shape) according to a given proportion (x, y, z).
For example, 4 arrays with the shape (128,128,128) and the proportion (1,1,4) to an array of the shape (128,128,512).
Or 2 arrays with the shape (64,64,64) and the proportion (1,2,1) to an array of the shape (64,128,64).
I know how to do it manually with np.concatenate, but I need a general algorithm to do this. (np.reshape doesn't work; it will mess up the order.)
Edit: It's possible that the proportion is (1,2,3); then it is necessary to compare the left_edge of each box to know where to place it. Every array has a corresponding block with the attribute left_edge (xmin, ymin, zmin). Can I solve this with an if-condition?
If your proportion is always one-dimensional (i.e. concatenate in one dimension only), you can use this:
arrays = [...]
proportion = (1,1,4)
np.concatenate(arrays, axis=next(i for i,p in enumerate(proportion) if p>1))
Otherwise you have to explain what to do with proportion = (1,2,3).
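A quick check of the one-liner against the first example from the question:

import numpy as np

arrays = [np.random.rand(128, 128, 128) for _ in range(4)]
proportion = (1, 1, 4)
merged = np.concatenate(arrays, axis=next(i for i, p in enumerate(proportion) if p > 1))
print(merged.shape)  # (128, 128, 512)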
Okay, I programmed it this way and it seems to work. Maybe not the nicest way, but it does what I want.
from itertools import product

blocks.sort(key=lambda b: (b.left_edge[2], b.left_edge[1], b.left_edge[0]))
proportion = (Nx * nblockx, Ny * nblocky, Nz * nblockz)
arrays = np.zeros((nblockx, nblocky, nblockz, Nx, Ny, Nz))

for block, (x, y, z) in zip(root_list,
                            product(range(nblockx),
                                    range(nblocky),
                                    range(nblockz))):
    array = np.zeros((Nx, Ny, Nz), dtype=np.float64)
    # this is only the function to fill the array
    writearray(array, ...)
    arrays[x, y, z] = array

shape = arrays.shape
array = np.zeros((shape[0]*shape[3], shape[1]*shape[4], shape[2]*shape[5]))
for x, y, z in product(range(shape[0]), range(shape[1]), range(shape[2])):
    slicex = slice(x*shape[3], (x+1)*shape[3])
    slicey = slice(y*shape[4], (y+1)*shape[4])
    slicez = slice(z*shape[5], (z+1)*shape[5])
    array[slicex, slicey, slicez] = arrays[x, y, z]
return array
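As an aside, the final assembly loop can also be expressed as a single transpose-and-reshape; a minimal sketch with made-up block counts and sizes:

import numpy as np

nblockx, nblocky, nblockz = 2, 3, 4
Nx, Ny, Nz = 5, 6, 7
arrays = np.random.rand(nblockx, nblocky, nblockz, Nx, Ny, Nz)

# Interleave each block axis with its matching inner axis, then merge each pair.
merged = arrays.transpose(0, 3, 1, 4, 2, 5).reshape(nblockx*Nx, nblocky*Ny, nblockz*Nz)
print(np.allclose(merged[:Nx, :Ny, :Nz], arrays[0, 0, 0]))  # True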