Numpy filter pixel indices - python

I want to count all pixels of an RGB image where R=0 and B=0 and whose x,y coordinates lie on the border of the image.
First I get the coordinates of the pixels with R=0 and B=0:
import cv2
import numpy as np
i = cv2.imread("test2.png")
indices = np.where((i[:, :, 0] == 0) & (i[:, :, 2] == 0))
This gives me the coordinates. Now I want to count the pixels where the x position is 0 or the image width (in this case 21).
I could sort the list but I would like to stick to numpy arrays if possible. Is there a fancy way to do it?

Approach #1
With X along the second axis, here's one fancy way -
(i[...,[0,2]]==0).all(-1)[:,[0,-1]].sum()
Approach #2
With multi-dim indexing -
(i[:,[[0],[-1]],[0,2]]==0).all(-1).sum()
Approach #3
For performance, use more of slicing -
mask = (i[...,0]==0) & (i[...,2]==0)
out_x = mask[:,0].sum() + mask[:,-1].sum()
On older NumPy versions, np.count_nonzero might be better than .sum().
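As a quick sanity check, the slicing and fancy-indexing approaches can be compared on a small synthetic image. This is only a sketch: the array `img` and its contents are made up for illustration, and each matching pixel in the first or last column is counted once.

```python
import numpy as np

# Hypothetical 4x5 image with 3 channels, filled with non-zero values.
img = np.full((4, 5, 3), 7, dtype=np.uint8)

# Zero out channels 0 and 2 for two pixels in the first column,
# one pixel in the last column, and one interior pixel.
img[0, 0, [0, 2]] = 0
img[2, 0, [0, 2]] = 0
img[2, -1, [0, 2]] = 0
img[1, 2, [0, 2]] = 0  # interior pixel, must not be counted

# Boolean mask of pixels whose channels 0 and 2 are both zero.
mask = (img[..., 0] == 0) & (img[..., 2] == 0)

# Count matching pixels in the first and last columns, two ways.
count_slicing = np.count_nonzero(mask[:, 0]) + np.count_nonzero(mask[:, -1])
count_fancy = (img[..., [0, 2]] == 0).all(-1)[:, [0, -1]].sum()

print(count_slicing, count_fancy)
```

Row 2 matches on both the left and the right edge, so it contributes two pixels to the count.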

Related

NumPy: How to check if a tuple is in a 1D numpy array

I'm having some trouble trying to check if a python tuple is in a one dimensional numpy array. I'm working on a loop that will record all the colors present in an image and store them into an array. It worked well using normal lists, but the image is very large and I think NumPy Arrays will speed up the loop as it took several minutes to complete the loop.
Here's what the code looks like:
from PIL import Image
import numpy as np
img = Image.open("bg.jpg").convert("RGB")
pixels = img.load()
colors = np.array([])
for h in range(img.size[1]):
    for w in range(img.size[0]):
        if pixels[w,h] not in colors:
            colors = np.append(colors, pixels[w,h])
        else:
            continue
When I run this, I get the following error:
DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
if pixels[w,h] in colors:
Thanks in advance, and if you know a faster way to do this please let me know.
I'm not sure exactly what you need, but I hope the following code helps.
import numpy as np
image = np.arange(75).reshape(5, 5, 3) % 8
# Get the set of unique pixels
pixel_list = image.reshape(-1, 3)
unique_pixels = np.unique(pixel_list, axis = 0)
# Test whether a pixel is in the list of pixels:
i = 0
pixel_in_list = (unique_pixels[i] == pixel_list).all(1).any(0)
# all(1) - all the dimensions (rgb) of the pixels need to match
# any(0) - test if any of the pixels match
# Test whether any of the pixels in the set is in the list of pixels:
compare_all = unique_pixels.reshape(-1, 1, 3) == pixel_list.reshape(1, -1, 3)
pixels_in_list = compare_all.all(2).any()
# all(2) - all the dimensions (rgb) of the pixels need to match
# any() - test if any of the pixels in the set matches any of the pixels in the list
I found a way to speed up my loop without NumPy: using a set, which is much faster here than using lists or NumPy arrays. This is what the code looks like now:
from PIL import Image
img = Image.open("bg.jpg").convert("RGB")
pixels = img.load()
colors = set()
for h in range(img.size[1]):
    for w in range(img.size[0]):
        if pixels[w,h] in colors:
            continue
        else:
            colors.add(pixels[w,h])
This solves my initial problem of the lists being too slow to loop through, and it sidesteps the second problem of NumPy being unable to compare the tuples. Thanks for all the replies, have a good day.
Assuming pixels is of shape (3, w, h) or (3, h, w) (i.e., the color channels are along the first axis), and assuming all you're after are the unique colors in the image:
channels = (channel.flatten() for channel in pixels)
colors = set(zip(*channels))
If you want a list instead of a set, colors = list(set(zip(*channels))).
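A small sketch of this idea on made-up data, with a NumPy-only route next to it for comparison. The array contents are hypothetical, and the channels are converted with `tolist()` so that the set holds plain Python ints.

```python
import numpy as np

# Hypothetical channels-first image of shape (3, h, w) with one repeated color.
pixels = np.array([
    [[255, 255], [0, 0]],    # R
    [[0, 0], [128, 0]],      # G
    [[0, 0], [64, 0]],       # B
], dtype=np.uint8)

# Pure-Python route: zip the flattened channels into (r, g, b) tuples.
channels = (channel.flatten().tolist() for channel in pixels)
colors = set(zip(*channels))

# NumPy route: move channels last, flatten to (N, 3), and deduplicate rows.
unique_rows = np.unique(np.moveaxis(pixels, 0, -1).reshape(-1, 3), axis=0)

print(sorted(colors))
print(unique_rows)
```

Note that `channels` is a generator, so it can be consumed only once. Rebuild it before zipping a second time.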
You seem to be misunderstanding where numpy comes in handy. A numpy array of tuples is not going to be any faster than a Python list of tuples. The speed of numpy comes into play in numerical computation on matrices and vectors. A numpy array of tuples cannot take advantage of any of the things that make numpy so fast.
What you're trying to do is simply not appropriate for numpy, and won't help speed up your code at all.

Find an array inside a matrix with tolerance

Summary
Using numpy, I'm trying to find a pixel value (expressed as [r,g,b]) in a matrix of size N by 3. I want to find the row containing that array, but with a tolerance, because the values may not match exactly.
With np.all (see the line below) this is possible, but only when the values match exactly.
result_primo_check = np.all(element_2_find==matrix, axis=1)
Example
The problem is that I have element_2_find = [144, 0, 256] but in matrix the most similar row is [148, 0, 250]. Is there a command that adds a tolerance or something similar?
Just compute whatever distance (e.g. Euclidean) you want to use between your pixel and the rest of the image and select the image location that matches most closely (perhaps only if the distance is below some threshold).
import numpy as np
img = np.random.rand(100, 3)
pixel = np.random.rand(1, 3)
dists = ((img - pixel) ** 2).sum(-1)
min_idx = np.unravel_index(dists.argmin(), dists.shape)
min_dist = dists[min_idx]
# img[min_idx] is the closest pixel in the image to your target pixel
print(min_idx, min_dist, img[min_idx])
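If an explicit tolerance is preferred over picking the single nearest row, one option is to compare per-channel absolute differences against a threshold. The sketch below reuses the two rows mentioned in the question; the other rows and the value of `tol` are assumptions made for illustration.

```python
import numpy as np

# The target and the closest row come from the question;
# the other rows and the tolerance are made up for illustration.
matrix = np.array([[10, 20, 30],
                   [148, 0, 250],
                   [200, 200, 200]])
element_2_find = np.array([144, 0, 256])
tol = 6  # hypothetical per-channel tolerance

# A row matches when every channel differs by at most `tol`.
close = np.all(np.abs(matrix - element_2_find) <= tol, axis=1)
matching_rows = np.flatnonzero(close)

print(matching_rows)
```

If the data is stored as an unsigned type such as uint8, cast it to a signed type before subtracting, otherwise the differences wrap around.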

Given a 3D image array, return a list of indices with a value above a threshold and a minimum distance between all selected indices?

I have a 3D numpy array that represents a 3D image and I want to create a list from it with all the (x,y,z) coordinate/index tuples that are above a certain value and at least a certain distance apart from each other. So if coords (3,4,5) and (3,3,3) were both above the value, but closer together than the minimum distance of 4, then only one of these coords would be added to the new array (it doesn't matter which).
I thought about doing something like this:
arr = [(x,y,z) for x in range(x_dim) for y in range(y_dim) for z in range(z_dim) if original_arr[z][y][x]>threshold]
This gives arr, which contains all coordinates above the threshold. I'm stuck on how to remove all coordinates from 'arr' which are too close to other coordinates also inside it. Checking each coordinate against every other coordinate isn't possible, because the image is very large and it would take too long.
Any ideas? Thanks
You can replace your threshold checking with:
import numpy as np
arr = np.argwhere(original_array> threshold)
The rest depends on your arr size and data type(please provide image size and dtype to assist better). If the number of points above the threshold is not too high you can use:
from sklearn.metrics.pairwise import euclidean_distances
euclidean_distances(arr,arr)
And check for distance threshold. If it is a high number of points, you can check via a loop iteration (I usually try to avoid changing loop variable array inside the loop, but this will save you a lot of memory space and time in case of large image):
arr = np.argwhere(original_array>threshold)
for i in range(arr.shape[0]):
    try:
        diff = np.argwhere(np.sum(arr[i+1:,:]-arr[i,:], axis=1)<=distance)
        arr = np.delete(arr, diff+i+1, axis=0)
    except IndexError as e:
        break
Your arr will then contain the coordinates you want.
Output for sample code:
original_array = np.arange(40).reshape(10,2,2).astype(np.int32)
threshold = 5
distance = 3
arr:
[[1 1 0]
 [4 1 1]
 [8 1 1]]
distance matrix between final points:
[[0.         3.16227766 7.07106781]
 [3.16227766 0.         4.        ]
 [7.07106781 4.         0.        ]]
EDIT: per comment, if you want to ignore distance along z axis, replace the diff line with:
diff = np.argwhere(np.sum((arr[i+1:,:]-arr[i,:])[:,0:2], axis=1)<=distance)
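Note that the loop above thresholds a signed sum of coordinate differences, which is not a geometric distance. If a Euclidean minimum distance is what is needed, a greedy variant could be sketched as follows. The function name `thin_points` and its parameters are hypothetical, and the sketch simply keeps the first point it sees and drops every later point within `min_dist` of a kept one, which is acceptable here because the question says it does not matter which point survives.

```python
import numpy as np

def thin_points(points, min_dist):
    """Greedily pick indices so that all kept points are more than min_dist apart."""
    points = np.asarray(points, dtype=float)
    remaining = np.ones(len(points), dtype=bool)  # candidates not yet discarded
    kept = []
    for i in range(len(points)):
        if not remaining[i]:
            continue
        kept.append(i)
        # Squared Euclidean distance from point i to every point.
        d2 = ((points - points[i]) ** 2).sum(axis=1)
        # Discard everything within min_dist of the point just kept (itself included).
        remaining &= d2 > min_dist ** 2
    return np.array(kept, dtype=int)

original_array = np.arange(40).reshape(10, 2, 2)
coords = np.argwhere(original_array > 5)
kept_coords = coords[thin_points(coords, 3)]
print(kept_coords)
```

Each step only needs the distances from one point to all others, so memory stays linear in the number of points instead of requiring the full pairwise distance matrix.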

rebuild numpy array, need faster method

I have an image as a numpy array of shape (640,480,3), where X,Y are the coordinates and (255 255 255) is the color mask for a point.
I want to get a new 2D array of (x,y) pairs, where X and Y are the coordinates of the points whose color is greater than zero.
I tried the code below. It works, but it takes too much processor time:
new = []
for x in range(edges.shape[0]):
    for y in range(edges.shape[1]):
        if edges[x, y, 0] != 0:
            new.append([x, y])
You could slice the first element of last axis, compare it against 0 and then use np.argwhere to get those indices, which would be the x, y coordinates in a (N,2) shaped array.
Thus, the implementation would be simply -
new = np.argwhere( edges[...,0]!=0 )
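To see that this matches the loop, here is a minimal check on a tiny made-up mask image. The array shape and the two marked points are assumptions for illustration only.

```python
import numpy as np

# Hypothetical 3x4 mask image: two white points on a black background.
edges = np.zeros((3, 4, 3), dtype=np.uint8)
edges[0, 1] = 255
edges[2, 3] = 255

# Loop version, as in the question.
new_loop = []
for x in range(edges.shape[0]):
    for y in range(edges.shape[1]):
        if edges[x, y, 0] != 0:
            new_loop.append([x, y])

# Vectorized version: one (x, y) row per non-zero pixel.
new = np.argwhere(edges[..., 0] != 0)

print(new)
```

Both versions visit the pixels in row-major order, so the coordinates come out in the same sequence.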

Remove transparent pixels from an image numpy array

I have an image's numpy array of shape (224,224,4). Each pixel has 4 channels - r,g,b,alpha. I need to extract the (r,g,b) values for each pixel where its alpha channel is 255.
I thought to first delete all elements in the array where alpha value is <255, and then extract only the first 3 values(r,g,b) of these remaining elements, but doing it in simple loops in Python is very slow. Is there a fast way to do it using numpy operations?
Something similar to this? https://stackoverflow.com/a/21017621/4747268
This should work: arr[arr[:,:,3]==255][:,:3]
something like this?
import numpy as np
x = np.random.random((255,255,4))
y = np.where(x[:,:,3] >0.5)
res = x[y][:,0:3]
Adapt the > 0.5 condition to your needs (e.g. == 255). The result will be an (N, 3) matrix with all selected pixels stacked vertically.
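For the exact condition in the question (alpha equal to 255), a minimal sketch on a made-up 2x2 RGBA image might look like this. The pixel values are hypothetical.

```python
import numpy as np

# Hypothetical 2x2 RGBA image; two pixels are fully opaque (alpha == 255).
arr = np.array([
    [[10, 20, 30, 255], [40, 50, 60, 0]],
    [[70, 80, 90, 128], [1, 2, 3, 255]],
], dtype=np.uint8)

# 2D boolean mask over the pixel grid.
opaque = arr[:, :, 3] == 255

# Boolean indexing flattens the two pixel axes, giving shape (N, 4);
# slicing the last axis then keeps only r, g, b.
rgb = arr[opaque][:, :3]

print(rgb.shape)
print(rgb)
```

Writing `arr[opaque, :3]` should give the same result in a single indexing step.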
