rebuild numpy array, need faster method - python

I have an image as a NumPy array of shape (640, 480, 3), where x and y are the pixel coordinates and (255, 255, 255) is the color mask for a point.
I am trying to build a new 2D array of (x, y) pairs holding the coordinates of every point whose color is greater than zero.
I tried the code below.
It works, but it takes too much processor time:
for x in range(edges.shape[0]):
    for y in range(edges.shape[1]):
        if edges[[x],[y],[0]]!=0:
            new.append([x,y])

You could slice the first element of last axis, compare it against 0 and then use np.argwhere to get those indices, which would be the x, y coordinates in a (N,2) shaped array.
Thus, the implementation would be simply -
new = np.argwhere( edges[...,0]!=0 )
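A minimal sketch comparing the loop with the np.argwhere call on a small synthetic array. The array contents and the name loop_result are made up for illustration, and the loop uses plain scalar indexing rather than the list-based indexing in the question:

```python
import numpy as np

# Hypothetical stand-in for the real image: mostly zeros, a few white points.
edges = np.zeros((6, 4, 3), dtype=np.uint8)
edges[1, 2] = 255
edges[4, 0] = 255
edges[5, 3] = 255

# Loop version, visiting pixels in row-major order.
loop_result = []
for x in range(edges.shape[0]):
    for y in range(edges.shape[1]):
        if edges[x, y, 0] != 0:
            loop_result.append([x, y])

# Vectorized version: one row per nonzero pixel, columns are (x, y).
new = np.argwhere(edges[..., 0] != 0)
```

Both versions visit the pixels in the same order, so the rows of new should line up with the entries of loop_result.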

Related

How to Convert a 2D List of Tuples to a 3D Numpy Array Quickly?

I have a stream of images coming in from a source which, unfortunately, delivers each image as a list of lists of tuples of RGB values. I want to perform real-time processing on the image, but that code expects a Numpy array of shape (X,Y,3), where X and Y are the image height and width.
X = [[(R, G, B)...]]
img_arr = np.array([*X])
The above works fine but takes nearly a quarter of a second with my images, which is too slow for real-time use. Interestingly, I also need to go back in the other direction after the processing is done, and that code (which seems to work) is not so slow:
imgout = map(tuple, flipped_image)
Some relevant other questions:
why is converting a long 2D list to numpy array so slow?
Convert List of List of Tuples Into 2d Numpy Array
To answer the title of your question, numpy automatically converts lists and tuples to numpy arrays, so you can just use np.array(X), which will be about as fast as you can get:
img_arr = np.array(X)
A simple list comprehension will convert it back to the list-list-tuple form:
imgout = [[tuple(Z) for Z in Y] for Y in img_arr]
Code to generate a sample 10x10 X array:
X = [[tuple(Z) for Z in Y] for Y in np.random.randint(0,255,(10,10,3))]
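A small round-trip sketch of the conversion in both directions. The uint8 dtype and the use of tolist() are assumptions added here, not part of the answer above; tolist() is used so that the tuples hold plain Python ints instead of NumPy scalars:

```python
import numpy as np

rng = np.random.default_rng(0)
# Sample list-of-lists-of-tuples image: 10 x 10 pixels, RGB values in 0..255.
X = [[tuple(int(v) for v in px) for px in row]
     for row in rng.integers(0, 256, size=(10, 10, 3))]

# Nested lists and tuples are converted directly; dtype is an assumption here.
img_arr = np.array(X, dtype=np.uint8)   # shape (10, 10, 3)

# Back to list-list-tuple form. tolist() yields nested lists of Python ints.
imgout = [[tuple(px) for px in row] for row in img_arr.tolist()]
```

If the processing code accepts it, fixing the dtype up front avoids a second conversion later.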

Given a 3D image array, return a list of indices with a value above a threshold and a minimum distance between all selected indices?

I have a 3D numpy array that represents a 3D image, and I want to create a list from it with all the (x,y,z) coordinate/index tuples that are both above a certain value and at least a certain distance away from all other selected coordinates. So if coords (3,4,5) and (3,3,3) were both above the value, but the minimum distance apart was 4, then only one of these coords would be added to the new array (it doesn't matter which).
I thought about doing something like this:
arr = [(x,y,z) for x in range(x_dim) for y in range(y_dim) for z in range(z_dim) if original_arr[z][y][x]>threshold]
This gives arr, which contains all coordinates above the threshold. I'm stuck on how to remove the coordinates in arr that are too close to other coordinates also in it. Checking each coordinate against every other coordinate isn't feasible: the image is very large, so it would take too long.
Any ideas? Thanks
You can replace your threshold checking with:
import numpy as np
arr = np.argwhere(original_array> threshold)
The rest depends on the size and data type of arr (please provide the image size and dtype so I can assist better). If the number of points above the threshold is not too high, you can use:
from sklearn.metrics.pairwise import euclidean_distances
euclidean_distances(arr,arr)
Then check the result against the distance threshold. If there is a high number of points, you can check via a loop (I usually try to avoid changing the array being iterated over inside the loop, but this will save you a lot of memory and time for a large image):
arr = np.argwhere(original_array>threshold)
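for i in range(arr.shape[0]):
    try:
        # Note: this compares the sum of signed coordinate differences,
        # not the Euclidean distance, against the threshold.
        diff = np.argwhere(np.sum(arr[i+1:,:]-arr[i,:], axis=1)<=distance)
        arr = np.delete(arr, diff+i+1, axis=0)
    except IndexError as e:
        # arr shrinks inside the loop, so i eventually runs past its end
        break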
Your arr will then contain the coordinates you want.
Output for the sample code:
original_array = np.arange(40).reshape(10,2,2).astype(np.int32)
threshold = 5
distance = 3
arr:
[[1 1 0]
 [4 1 1]
 [8 1 1]]
distance matrix between final points:
[[0.         3.16227766 7.07106781]
 [3.16227766 0.         4.        ]
 [7.07106781 4.         0.        ]]
EDIT: per comment, if you want to ignore distance along z axis, replace this line:
diff = np.argwhere(np.sum((arr[i+1:,:]-arr[i,:])[:,0:2], axis=1)<=distance)
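The loop in this answer compares a sum of signed coordinate differences against the threshold. As an alternative, here is a minimal sketch of the same greedy idea using squared Euclidean distance. The helper name thin_by_distance is hypothetical, and the result may differ from the sample output above because the distance measure is different:

```python
import numpy as np

def thin_by_distance(coords, min_dist):
    """Greedily keep points so every kept pair is farther apart than min_dist."""
    coords = np.asarray(coords, dtype=float)
    keep = np.ones(len(coords), dtype=bool)
    for i in range(len(coords)):
        if not keep[i]:
            continue  # already discarded by an earlier kept point
        # Squared Euclidean distance from point i to all later points.
        d2 = np.sum((coords[i + 1:] - coords[i]) ** 2, axis=1)
        # Drop later points that are within min_dist of point i.
        keep[i + 1:] &= d2 > min_dist ** 2
    return keep

original_array = np.arange(40).reshape(10, 2, 2).astype(np.int32)
threshold = 5
distance = 3

candidates = np.argwhere(original_array > threshold)
selected = candidates[thin_by_distance(candidates, distance)]
```

This keeps memory use linear in the number of candidates, since it never builds the full pairwise distance matrix, but the running time can still grow quadratically when few points are discarded.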

Assigning multiple different slices of a numpy/torch axis simultaneously

I have a batch of images (a 4D tensor/array with dimensions "batchsize x channels x height x width") and I would like to draw horizontal bars of zeros of size s on each image, but across different rows for each image. I can do this trivially with a for loop, but I haven't been able to figure out a vectorized implementation.
Ideally I would generate a 1-D tensor r of "batchsize" random starting points, and do something like
t[:,:,r:r+s,:] = 0. If I try this I get TypeError: only integer scalar arrays can be converted to a scalar index
If I do a toy example and just try to pull out two different sections of a batch with only two images, doing something like t[:,:,torch.tensor(([1,2],[2,3])),:] I get back a 5D tensor because it is pulling both of those sections from both images in the batch. How do I grab those different sections but only one for each image? In this case if the input were 2xCxHxW I would want 2xCx2xW where the first item corresponds to rows 1 and 2 of the first image, and the second item corresponds to rows 2 and 3 of the second image. Thank you.
You can use the following code, which creates a mask so that you can operate along the y or x axis by index. It builds a tensor that holds, for every element, its index along the last axis, and compares that index against a random start position drawn for each image.
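# sgs is assumed to be the image batch and size the bar width; both come from the caller
bsg = sgs.data
device = sgs.device
bs, _, x, y = bsg.shape
max_y = y-size-1
# one random start index per image, shape (bs, 1)
rs = torch.randint(0, max_y, (bs,1), device=device)
# index along the last axis for every element, shape (bs, x*y)
m = torch.arange(y,device=device).repeat(bs, x)
# False for indices in [rs, rs+size] along the last axis, True elsewhere
gpumask = ((m < rs) | (m > (rs+size))).view(bs, 1, x, -1)
# multiplying by the mask zeros the bar in every channel
gpumask*bsg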
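For the layout described in the question (bars spanning s rows, starting at a different row in each image), here is a minimal NumPy sketch of the same masking idea. The sizes and variable names are illustrative assumptions; the comparison-and-broadcast pattern should carry over to torch tensors, though that is not shown here:

```python
import numpy as np

rng = np.random.default_rng(0)
batch, channels, height, width = 4, 3, 8, 6   # illustrative sizes
s = 2                                          # bar thickness in rows

t = rng.random((batch, channels, height, width)) + 1.0  # strictly positive values
r = rng.integers(0, height - s + 1, size=batch)         # one start row per image

# Gather rows r[b] .. r[b]+s-1 from image b: result is (batch, channels, s, width).
row_idx = r[:, None] + np.arange(s)                      # shape (batch, s)
picked = np.take_along_axis(t, row_idx[:, None, :, None], axis=2)

# Boolean mask of shape (batch, height): True where a row lies inside the bar.
rows = np.arange(height)
in_bar = (rows >= r[:, None]) & (rows < r[:, None] + s)

# Broadcast the mask over channels and width, then zero the bars.
t_barred = np.where(in_bar[:, None, :, None], 0.0, t)
```

The mask only has batch x height entries; broadcasting expands it over the channel and width axes, so no per-image loop is needed.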

Numpy filter pixel indices

I want to count all pixels of an RGB image where R=0 and B=0 and whose x,y coordinates lie on the border of the image.
First I get the coordinates of the pixels with R=0 and B=0:
import cv2
import numpy as np
i = cv2.imread("test2.png")
indices = np.where((i[:, :, 0] == 0) & (i[:, :, 2] == 0))
Which gives me a list of the coordinates. Now I want to get the sum of all pixels where the x position is 0 or the image width (in this case 21).
I could sort the list, but I would like to stick to numpy arrays if possible. Is there a fancy way to do it?
Approach #1
With X along the second axis, here's one fancy way -
(i[...,[0,2]]==0).all(-1)[:,[0,-1]].sum()
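Approach #2
With multi-dim indexing, giving the column index as a column vector so that it broadcasts against the channel index instead of pairing with it element by element -
(i[:,[[0],[-1]],[0,2]]==0).all(-1).sum()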
Approach #3
For performance, use more of slicing -
mask = (i[...,0]==0) & (i[...,2]==0)
out_x = mask[:,0].sum() + mask[:,-1].sum()
On older NumPy versions, np.count_nonzero might be better than .sum().
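A small sanity check of the border count against a plain loop, on a synthetic array standing in for the image. The array shape and values, and the names expected and border_count, are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the image: values 0 or 1 so that zeros are common.
i = rng.integers(0, 2, size=(5, 22, 3), dtype=np.uint8)

# Reference count with a plain loop over the first and last columns.
expected = 0
for row in range(i.shape[0]):
    for col in (0, i.shape[1] - 1):
        if i[row, col, 0] == 0 and i[row, col, 2] == 0:
            expected += 1

# Vectorized count: mask of matching pixels, then keep only the border columns.
mask = (i[..., 0] == 0) & (i[..., 2] == 0)
border_count = int(np.count_nonzero(mask[:, [0, -1]]))
```

Counting the two border columns separately matters when both ends of the same row match, since each of those pixels should contribute one to the total.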

Reshape from flattened indices in Python

I have an image of size M*N whose pixel coordinates have been flattened to a 1D array according to a space-filling curve (i.e. not a classical rasterization, where I could have used reshape).
I then process my 1D array (the flattened image) and would like to reshape it back to an M*N array (the initial size).
So far, I have done this with a for-loop:
for i in range(img_flat.size):
    img_res[x[i], y[i]] = img_flat[i]
x and y being the x and y pixel coordinates according to my scan path.
However, I am wondering how to do this in a single line of code.
If x and y are 1-D numpy arrays of length n, img_flat also has length n, and img_res is a 2-D numpy array of shape (h, w) such that h*w = n, then:
img_res[x, y] = img_flat
should suffice.
In fact, it was easy:
vec = np.arange(0, seg.size, dtype=np.uint)
img_res[x[vec], y[vec]] = seg[vec]
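A minimal sketch checking the single fancy-indexed assignment against the loop. The scan path here is a random permutation of the pixel positions, chosen only for illustration, and the sizes are arbitrary:

```python
import numpy as np

M, N = 3, 4
rng = np.random.default_rng(0)

# Hypothetical scan path: a random permutation of all pixel positions.
order = rng.permutation(M * N)
x, y = np.unravel_index(order, (M, N))    # row and column of each sample

img_flat = np.arange(M * N, dtype=float)  # processed 1D values, in scan order

# Loop version from the question.
img_loop = np.empty((M, N))
for k in range(img_flat.size):
    img_loop[x[k], y[k]] = img_flat[k]

# Single fancy-indexed assignment.
img_res = np.empty((M, N))
img_res[x, y] = img_flat
```

Because every (x, y) pair appears exactly once along the path, the assignment fills the whole array and no position is written twice.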
