I'm trying to find peaks on the left and right halves of the image (basically this is a binary image of a road with left and right lanes).
For some reason, the left argmax is giving a value to the right of midpoint, and right is giving beyond the size of the image.
Here's my code
import numpy as np
import cv2
import matplotlib.pyplot as plt
binary_warped = cv2.imread(r'data\Sobel\warped-example.jpg')
histogram = np.sum(binary_warped[binary_warped.shape[0]//2:,:], axis=0)
plt.plot(histogram)
midpoint = histogram.shape[0]//2
leftx_base = np.argmax(histogram[:midpoint])
rightx_base = np.argmax(histogram[midpoint:]) + midpoint
print('Shape {} midpoint {} left peak {} right peak {}'.format(histogram.shape, midpoint, leftx_base, rightx_base))
This is my input (image: input with shapes on the axes).
Ideally the left peak should be around 370 and the right peak around 1000, but here is my result:
Shape (1280, 3) midpoint 640 left peak 981 right peak 1633
Where was the mistake?
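The clue is in the shape of your histogram: it is 2-dimensional, with shape (1280, 3). cv2.imread returned a 3-channel image, and summing over axis 0 leaves the column and channel axes.
When you call np.argmax(histogram[:midpoint]) without an axis, the 2-d array is flattened before the index of the largest value is found. The returned index is a position in the flattened array, which is why it can exceed the midpoint or even the image width.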
You can see an example of this in the numpy docs:
>>> a = np.arange(6).reshape(2,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
>>> np.argmax(a)
5
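Applied to the question's code, one way to avoid the flattening is to collapse the color channels before building the histogram, so that the histogram is 1-D. The sketch below uses a synthetic 3-channel array standing in for the loaded image; the array contents and the lane columns (370 and 1000) are made up for illustration. Reading the file with cv2.IMREAD_GRAYSCALE should have the same effect as collapsing the channels.

```python
import numpy as np

# Synthetic stand-in for the array returned by cv2.imread: 720 x 1280, 3 channels.
binary_warped = np.zeros((720, 1280, 3), dtype=np.uint8)
binary_warped[:, 370] = 255   # hypothetical left lane column
binary_warped[:, 1000] = 255  # hypothetical right lane column

# Collapse the channel axis so that each pixel is a single value.
gray = binary_warped.max(axis=2)

# Sum the lower half of the image column by column; the result is 1-D.
histogram = np.sum(gray[gray.shape[0] // 2:, :], axis=0, dtype=np.int64)

midpoint = histogram.shape[0] // 2
leftx_base = int(np.argmax(histogram[:midpoint]))
rightx_base = int(np.argmax(histogram[midpoint:])) + midpoint

print(histogram.shape, midpoint, leftx_base, rightx_base)
```

With a 1-D histogram, both peaks are column indices within the image width.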
Suppose I have an image with motion artifacts such that each column of the image should be rotated by a known amount. What would be the best way to get a motion-corrected output image? For example, take a 2x3 image like
import numpy as np
image = np.array([[1, 2, 3],
                  [1, 2, 3]])
column_rotation = np.array([0, 45, -45])  # degrees by which to rotate each column
rotation_pivot = np.array([0, 0, 0])
# compare element-wise; a bare == between arrays cannot be used directly in an assert
assert np.array_equal(dewarp(image, column_rotation, rotation_pivot),
                      np.array([[1, 2, 3],
                                [1, 3, 2]]))
Notice how the first column remained unchanged, second column was rotated around the 0th element (item in first row) by 45 degrees, and the third column was rotated around the 0th element by -45 degrees.
The best approach I have so far is to use griddata.
from scipy.interpolate import griddata
def dewarp(image, rotation_degrees, rotation_pivot) -> np.ndarray:
    coordinates = get_pixel_coordinates(rotation_degrees, rotation_pivot)
    rows, cols = np.meshgrid(np.arange(image.shape[0]), np.arange(image.shape[1]), indexing='ij')
    # griddata expects (points, values, xi)
    return griddata(coordinates, image.flatten(), (rows, cols), method='nearest')
where get_pixel_coordinates() finds the pixel coordinate for each pixel in the original image.
My problem with this approach is that it's too slow for the image sizes I'm working with (which are actually 3D of shape (200, 500, 1000), but I would settle for a fast 2D solution).
If only griddata was implemented in cupyx.scipy with GPU support I suspect this approach would be fast enough.
griddata might also be suboptimal in that it doesn't take advantage of the fact that columns are rotated in bulk, and just treats each pixel as independently warped.
Summary
Using numpy, I'm trying to find a pixel value (expressed as [r,g,b]) in a matrix whose size is N by 3. I want to find the row that holds this array, but with a tolerance, because the values may not match exactly.
With np.all (see the line below) this is possible, but only if the values are exactly the same.
result_primo_check = np.all(element_2_find==matrix, axis=1)
Example
The problem is that I have element_2_find = [144, 0, 256], but in matrix the most similar row is [148, 0, 250]. Is there a command that adds a tolerance, or something similar?
Just compute whatever distance (e.g. Euclidean) you want to use between your pixel and the rest of the image and select the image location that matches most closely (perhaps only if the distance is below some threshold).
import numpy as np
img = np.random.rand(100, 3)
pixel = np.random.rand(1, 3)
dists = ((img - pixel) ** 2).sum(-1)  # squared Euclidean distance to every row
min_idx = np.unravel_index(dists.argmin(), dists.shape)
min_dist = dists[min_idx]
# img[min_idx] is the closest pixel in the image to your target pixel
print(min_idx, min_dist, img[min_idx])
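If a per-channel tolerance is preferred over a nearest-neighbour search, the exact comparison from the question can be relaxed to an absolute-difference test. The following is a minimal sketch, assuming integer pixel values; the example matrix and the tolerance of 6 are made up for illustration, and the cast to a signed type is there so that uint8 input cannot wrap around on subtraction.

```python
import numpy as np

matrix = np.array([[10, 20, 30],
                   [148, 0, 250],
                   [0, 255, 0]])
element_2_find = np.array([144, 0, 256])
tolerance = 6  # hypothetical per-channel tolerance

# Signed arithmetic so that the difference is correct even for uint8 input.
diff = np.abs(matrix.astype(np.int64) - element_2_find.astype(np.int64))

# A row matches when every channel is within the tolerance.
within_tol = np.all(diff <= tolerance, axis=1)
matching_rows = np.flatnonzero(within_tol)

print(matching_rows)
```

Unlike the distance-based search, this can return zero, one, or several rows, so the caller has to decide what to do in each case.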
I need to stack many images that are represented by 2D numpy arrays of the same shape (i.e., take the sum or the median of them all). However, as I stack them, they need to be aligned properly -- each image, while the same shape, is all black with a small circular object around the center, but not exactly at the center. I can find the coordinates of the centroid for each image (using the module SourceProperties.centroid through the package photutils), but these coordinates will be different for each image -- they are also subpixel coordinates (example: (y, x) = (203.018, 207.397)).
I do not know of a way to simply move the objects to the center of the arrays, given the centroids have subpixel coordinates, so it seems like it would be more straightforward if there was a way to align each one by their unique centroid coordinates as I stack them... in other words:
import numpy as np
# First image = array1, shape = (400, 400)
centroid1 = (203.018, 207.397)
# Second image = array2, shape = (400, 400)
centroid2 = (205.256, 199.312)
array_list = [array1, array2]
stacked = np.median(array_list, axis=0) # but while setting centroid1 = centroid2 so that the two centroid points exactly overlap while computing median
But I'm not really sure how this would look in code. Is this possible?
Step 1: round away the subpixel/fractional part. np.roll shifts by whole elements only; shifting by 0.34 elements to the right would require interpolation.
Step 2: roll arrays to place the centroids consistently.
Step 3: stack them.
The code below illustrates this, placing the centroids at the geometric center of the array.
centroid1 = (203.018, 207.397)
centroid2 = (205.256, 199.312)
centroid1 = np.round(centroid1).astype(int)
centroid2 = np.round(centroid2).astype(int)
center = np.array(array1.shape)//2
array1_rolled = np.roll(array1, center-centroid1, (0, 1))
array2_rolled = np.roll(array2, center-centroid2, (0, 1))
array_list = [array1_rolled, array2_rolled]
stacked = np.median(array_list, axis=0)
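The three steps can be tried end to end on synthetic data. This is only a sketch under stated assumptions: make_image is a hypothetical helper that produces an all-black image with a single bright pixel, and the subpixel centroids are made up so that they round to the bright pixel's position.

```python
import numpy as np

def make_image(peak, shape=(41, 41)):
    """Hypothetical test image: all black with one bright pixel at `peak`."""
    img = np.zeros(shape)
    img[peak] = 1.0
    return img

array1 = make_image((18, 23))
array2 = make_image((22, 19))
centroids = [(18.2, 22.7), (21.6, 19.3)]  # made-up subpixel centroids (y, x)

center = np.array(array1.shape) // 2
rolled = []
for arr, centroid in zip([array1, array2], centroids):
    # Whole-pixel shift that moves the rounded centroid onto the array center.
    shift = center - np.round(centroid).astype(int)
    rolled.append(np.roll(arr, tuple(int(s) for s in shift), axis=(0, 1)))

stacked = np.median(rolled, axis=0)
peak_position = np.unravel_index(stacked.argmax(), stacked.shape)

print(peak_position)
```

Note that np.roll wraps pixels around the array edges, which is harmless here only because the object is small and the borders are black.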
I want to extract the sum count of all pixels of an RGB image where R=0 and B=0 and where the x,y coordinates of those pixels are lying on the border of an image.
First I get the coordinates of the pixels with R=0 and B=0:
import cv2
import numpy as np
i = cv2.imread("test2.png")
indices = np.where((i[:, :, 0] == 0) & (i[:, :, 2] == 0))
Which gives me a list of the coordinates. Now I want to get the sum of all pixels where the x position is 0 or the image width (in this case 21).
I could sort the list, but I would like to stick to numpy arrays if possible. Is there a fancy way to do it?
Approach #1
With X along the second axis, here's one fancy way -
(i[...,[0,2]]==0).all(-1)[:,[0,-1]].sum()
Approach #2
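With multi-dim indexing (the column indices are given as a column vector so that they broadcast against the channel indices instead of being paired with them element by element) -
(i[:,[[0],[-1]],[0,2]]==0).all(-1).sum()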
Approach #3
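For performance, rely more on slicing. The two border columns are counted separately, because adding two boolean arrays acts as a logical OR -
mask = (i[...,0]==0) & (i[...,2]==0)
out_x = mask[:,0].sum() + mask[:,-1].sum()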
On older NumPy versions, np.count_nonzero might be better than .sum().
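The border count can be cross-checked against a plain loop on a small synthetic array. In this sketch the array i is a made-up stand-in for the cv2.imread output (8 rows, 22 columns, values drawn from {0, 1} so that zeros are common), and the vectorized count builds the per-pixel mask first and then keeps only the first and last columns.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the loaded image: 8 rows, 22 columns, 3 channels.
i = rng.integers(0, 2, size=(8, 22, 3), dtype=np.uint8)

# Reference count with explicit loops over the left and right border columns.
expected = 0
for row in range(i.shape[0]):
    for col in (0, i.shape[1] - 1):
        if i[row, col, 0] == 0 and i[row, col, 2] == 0:
            expected += 1

# Vectorized count: per-pixel mask, then keep only the two border columns.
mask = (i[..., 0] == 0) & (i[..., 2] == 0)
count = int(np.count_nonzero(mask[:, [0, -1]]))

print(expected, count)
```

The same mask can be sliced along rows (mask[[0, -1], :]) to count matches on the top and bottom borders.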
I have a lot of 750x750 images. I want to take the geometric mean of non-overlapping 5x5 patches from each image, and then for each image, average those geometric means to create one feature per image. I wrote the code below, and it seems to work just fine. But, I know it's not very efficient. Running it on 300 or so images takes around 60 seconds. I have about 3000 images. So, while it works for my purpose, it's not efficient. How can I improve this code?
#each sublist of gmeans will contain a list of 22500 geometric means
#corresponding to the non-overlapping 5x5 patches for a given image.
gmeans = [[],[],[],[],[],[],[],[],[],[],[],[]]
#the loop here populates gmeans.
for folder in range(len(subfolders)):
    just_thefilename, colorsourceimages, graycroppedfiles = get_all_images(folder)
    for items in graycroppedfiles:
        myarray = misc.imread(items)
        area_of_big_matrix=750*750
        area_of_small_matrix= 5*5
        how_many = area_of_big_matrix / area_of_small_matrix
        n = 0
        p = 0
        mylist=[]
        while len(mylist) < how_many:
            mylist.append(gmean(myarray[n:n+5,p:p+5],None))
            n=n+5
            if n == 750:
                p = p+5
                n = 0
        gmeans[folder].append(mylist)
#each sublist of mean_of_gmeans will contain just one feature per image, the mean of the geometric means of the 5x5 patches.
mean_of_gmeans = [[],[],[],[],[],[],[],[],[],[],[],[]]
for folder in range(len(subfolders)):
    for items in range(len(gmeans[0])):
        mean_of_gmeans[folder].append((np.mean(gmeans[folder][items],dtype=np.float64)))
I can understand the suggestion to move this to the code review site,
but this problem provides a nice example of the power of using vectorized
numpy and scipy functions, so I'll give an answer.
The function below, cleverly called func, computes the desired value.
The key is to reshape the image into a four-dimensional array. Then
it can be interpreted as a two-dimensional array of two-dimensional
arrays, where the inner arrays are the 5x5 blocks.
scipy.stats.gmean can compute the geometric mean over more than one
dimension, so that is used to reduce the four-dimensional array to the
desired two-dimensional array of geometric means. The return value is the
(arithmetic) mean of those geometric means.
import numpy as np
from scipy.stats import gmean

def func(img, blocksize=5):
    # img must be a 2-d array whose dimensions are divisible by blocksize.
    if (img.shape[0] % blocksize) != 0 or (img.shape[1] % blocksize) != 0:
        raise ValueError("blocksize does not divide the shape of img.")
    # Reshape 'img' into a 4-d array 'blocks', so blocks[i, :, j, :] is
    # the subarray with shape (blocksize, blocksize).
    blocks_nrows = img.shape[0] // blocksize
    blocks_ncols = img.shape[1] // blocksize
    blocks = img.reshape(blocks_nrows, blocksize, blocks_ncols, blocksize)
    # Compute the geometric mean over axes 1 and 3 of 'blocks'. This results
    # in the array of geometric means with size (blocks_nrows, blocks_ncols).
    gmeans = gmean(blocks, axis=(1, 3), dtype=np.float64)
    # The return value is the average of 'gmeans'.
    avg = gmeans.mean()
    return avg
For example, here the function is applied to an array with shape (750, 750).
In [358]: np.random.seed(123)
In [359]: img = np.random.randint(1, 256, size=(750, 750)).astype(np.uint8)
In [360]: func(img)
Out[360]: 97.035648309350179
It isn't easy to verify that that is the correct result, so here is a much smaller example:
In [365]: np.random.seed(123)
In [366]: img = np.random.randint(1, 4, size=(3, 6))
In [367]: img
Out[367]:
array([[3, 2, 3, 3, 1, 3],
       [3, 2, 3, 2, 3, 2],
       [1, 2, 3, 2, 1, 3]])
In [368]: func(img, blocksize=3)
Out[368]: 2.1863131342986666
Here is the direct calculation:
In [369]: 0.5*(gmean(img[:,:3], axis=None) + gmean(img[:, 3:], axis=None))
Out[369]: 2.1863131342986666
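The same reshape idea can also be written with NumPy alone, using the identity that a geometric mean is exp(mean(log(x))). This is a sketch rather than a drop-in replacement: block_gmean_mean is a hypothetical name, and the approach assumes strictly positive pixel values, since log(0) is -inf. It is checked here against the small 3x6 example above and against a direct product-based computation over the two 3x3 blocks.

```python
import numpy as np

def block_gmean_mean(img, blocksize):
    """Mean of the geometric means of non-overlapping blocksize x blocksize blocks."""
    rows, cols = img.shape
    if rows % blocksize or cols % blocksize:
        raise ValueError("blocksize does not divide the shape of img.")
    blocks = img.reshape(rows // blocksize, blocksize, cols // blocksize, blocksize)
    # Geometric mean = exp(mean(log(x))); requires strictly positive values.
    log_means = np.log(blocks.astype(np.float64)).mean(axis=(1, 3))
    return float(np.exp(log_means).mean())

img = np.array([[3, 2, 3, 3, 1, 3],
                [3, 2, 3, 2, 3, 2],
                [1, 2, 3, 2, 1, 3]])

# Direct reference: ninth root of the product of each 3x3 block, then averaged.
reference = float(np.mean([np.prod(img[:, c:c + 3].astype(np.float64)) ** (1 / 9)
                           for c in (0, 3)]))

result = block_gmean_mean(img, blocksize=3)
print(result, reference)
```

For 8-bit images containing zeros, any block with a zero pixel has a geometric mean of zero, so it may be worth deciding up front how such blocks should be handled.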