To be more specific, here is the exact requirement. I'm not sure how to word the question.
I have an image of spatial size, say, (500, 500). I extract only the r and g channels:
r = image[:, :, 0]
g = image[:, :, 1]
Then I compute the 2D histogram of r and g:
hist2d = np.histogram2d(r.ravel(), g.ravel(), bins=256, range=[(0, 255), (0, 255)])
Now, hist2d[0].shape is (256, 256), since it holds a count for every pair of the 256 x 256 possible (r, g) values. Fine.
The main requirement is: in a separate image called result, with the same spatial shape as the original, i.e. (500, 500), I want to populate each element of result with the corresponding value of the 2D histogram of the r and g channels.
For example, if r[200, 200] is 23 and g[200, 200] is 26, I want result[200, 200] = hist2d[0][23, 26].
The naive method for doing this is a simple Python loop:
for i in range(r.shape[0]):
    for j in range(r.shape[1]):
        result[i, j] = hist2d[0][r[i, j], g[i, j]]
But for a large image, this takes a significant time to compute. Is there a numpy way of doing this?
Thanks
just use hist2d[0][r, g]:
import numpy as np
r, g, b = np.random.randint(0, 256, size=(3, 500, 500)).astype(np.uint8)
hist2d = np.histogram2d(r.ravel(), g.ravel(), bins=256, range=[[0, 256], [0, 256]])
hist2d[0][r, g]
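This is plain integer (fancy) indexing: for every pixel it looks up hist2d[0][r[i, j], g[i, j]] in one vectorized step. Continuing the snippet above, a quick spot-check against the loop from the question (my own addition, not part of the original answer):
result = hist2d[0][r, g]
# Compare a few pixels against the per-pixel lookup from the question's loop.
for i, j in [(0, 0), (123, 45), (499, 499)]:
    assert result[i, j] == hist2d[0][r[i, j], g[i, j]]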
I have a 2D numpy array arr of shape (m,n) with nonnegative values. I would like to find a pair (k,l) such that
the difference between sum(arr[:k, :]) and sum(arr[k:, :]) is minimal
similarly, the difference between sum(arr[:, :l]) and sum(arr[:, l:]) is minimal
If you can come up with an algorithm only for k, the rest is actually easy. We simply transpose the matrix to find l.
A note for the skeptical: We may assume that sum(arr[:k, :]) and sum(arr[:,:l]) are strictly increasing functions of k and l, respectively.
This works:
sum_to_k = np.pad(np.cumsum(np.sum(arr, axis=1)), (1, 0))
sum_to_l = np.pad(np.cumsum(np.sum(arr, axis=0)), (1, 0))
k = np.argmin(np.abs(sum_to_k - (sum_to_k[-1] - sum_to_k)))
l = np.argmin(np.abs(sum_to_l - (sum_to_l[-1] - sum_to_l)))
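A quick brute-force sanity check of the prefix-sum approach on random data (my own sketch with an assumed small array, not part of the original answer):
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(0, 10, size=(6, 8))

# sum_to_k[k] equals arr[:k, :].sum(), thanks to the leading zero added by np.pad.
sum_to_k = np.pad(np.cumsum(np.sum(arr, axis=1)), (1, 0))
k = np.argmin(np.abs(sum_to_k - (sum_to_k[-1] - sum_to_k)))

# Brute force over every candidate split point finds the same minimal difference.
diffs = [abs(arr[:kk, :].sum() - arr[kk:, :].sum()) for kk in range(arr.shape[0] + 1)]
assert abs(arr[:k, :].sum() - arr[k:, :].sum()) == min(diffs)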
SQDIFF is defined as in the OpenCV documentation: R(x, y) = sum over (x', y') of (T(x', y') - I(x + x', y + y'))^2. (I believe they omit the channel dimension in that formula.)
In plain, beginner-level NumPy this is roughly:
import cv2 as cv
import numpy as np

A = np.arange(27, dtype=np.float32)
A = A.reshape(3, 3, 3)  # The "image"
B = np.ones([2, 2, 3], dtype=np.float32)  # window
rw, rh = A.shape[0] - B.shape[0] + 1, A.shape[1] - B.shape[1] + 1  # End result size
result = np.zeros([rw, rh])
for i in range(rw):
    for j in range(rh):
        w = A[i:i + B.shape[0], j:j + B.shape[1]]
        res = B - w
        result[i, j] = np.sum(res ** 2)

cv_result = cv.matchTemplate(A, B, cv.TM_SQDIFF)  # this result is the same as the simple for loops
assert np.allclose(cv_result, result)
This is a comparatively slow solution. I have read about sliding_window_view but cannot get it right.
# This will fail with these large arrays but is ok for smaller ones
A = np.random.rand(1028, 1232, 3).astype(np.float32)
B = np.random.rand(248, 249, 3).astype(np.float32)
locations = np.lib.stride_tricks.sliding_window_view(A, B.shape)
sqdiff = np.sum((B - locations) ** 2, axis=(-1,-2, -3, -4)) # This will fail with normal sized images
This fails with a MemoryError even though the result itself easily fits in memory. How can I reproduce the cv2.matchTemplate result in this faster way?
As a last resort, you may perform the computation in tiles, instead of computing "all at once".
np.lib.stride_tricks.sliding_window_view returns a view of the data, so it doesn't consume a lot of RAM.
The expression B - locations can't be a view; it requires RAM for storing an array of shape (781, 984, 1, 248, 249, 3) of float32 elements.
The total RAM for storing B - locations is 781*984*1*248*249*3*4 = 569,479,908,096 bytes (about 530 GiB).
To avoid storing all of B - locations in RAM at once, we can compute sqdiff in tiles, where each "tile" computation requires much less RAM.
A simple tiling is to use every row as a tile: loop over the rows of sqdiff and compute the output row by row.
Example:
sqdiff = np.zeros((locations.shape[0], locations.shape[1]), np.float32) # Allocate an array for storing the result.
# Compute sqdiff row by row instead of computing all at once.
for i in range(sqdiff.shape[0]):
    sqdiff[i, :] = np.sum((B - locations[i, :, :, :, :, :]) ** 2, axis=(-1, -2, -3, -4))
Executable code sample:
import numpy as np
import cv2
A = np.random.rand(1028, 1232, 3).astype(np.float32)
B = np.random.rand(248, 249, 3).astype(np.float32)
locations = np.lib.stride_tricks.sliding_window_view(A, B.shape)
cv_result = cv2.matchTemplate(A, B, cv2.TM_SQDIFF) # this result is the same as the simple for loops
#sqdiff = np.sum((B - locations) ** 2, axis=(-1, -2, -3, -4)) # This will fail with normal sized images
sqdiff = np.zeros((locations.shape[0], locations.shape[1]), np.float32) # Allocate an array for storing the result.
# Compute sqdiff row by row instead of computing all at once.
for i in range(sqdiff.shape[0]):
    sqdiff[i, :] = np.sum((B - locations[i, :, :, :, :, :]) ** 2, axis=(-1, -2, -3, -4))
assert np.allclose(cv_result, sqdiff)
I know the solution is a bit disappointing... But it is the only generic solution I could find.
The SQDIFF definition
R(k, l) = sum over (m, n) of (T(m, n) - I(k + m, l + n))^2
is equivalent to
R = T^2 star 1_[k, l] - 2 * (I star T) + I^2 star 1_[m, n]
where the 'star' operation is a cross-correlation, 1_[m, n] is a window of ones the size of the template, and 1_[k, l] is a window of ones the size of the image.
You can compute the cross-correlation terms using 'scipy.signal.correlate' and find the matches by looking for local minima in the square difference map.
You might want to do some non-minimum suppression too.
This solution requires orders of magnitude less memory.
For more help, please post a reproducible example with an image and template that are valid for the algorithm. Using noise will result in meaningless outputs.
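For a single-channel float image A and template B, the identity above can be sketched roughly as follows (my own illustration of the approach, not code from the original answer; it reproduces cv2.TM_SQDIFF up to floating-point error):
import numpy as np
from scipy.signal import correlate

def sqdiff_via_correlation(A, B):
    # Windowed sum of A**2: correlate A**2 with a ones kernel the size of the template.
    sum_A2 = correlate(A ** 2, np.ones_like(B), mode='valid', method='fft')
    # Cross-correlation term between image and template.
    cross = correlate(A, B, mode='valid', method='fft')
    # The sum of B**2 is the same constant for every window position.
    sum_B2 = np.sum(B ** 2)
    return sum_A2 - 2.0 * cross + sum_B2
For a 3-channel image you would evaluate this per channel and sum the three maps.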
I have some code here (used for a gradient calculation); example values are given in the comments:
dE_dx_strided = np.einsum('wxyd,ijkd->wxyijk', dE_dy, f)
# dE_dx_strided.shape = (64, 25, 25, 4, 4, 3)
imax, jmax, di, dj = dE_dx_strided.shape[1:5]
# imax, jmax, di, dj = (25, 25, 4, 4)
dE_dx = np.zeros_like(x)
# dE_dx.shape = (64, 28, 28, 3)
for i in range(imax):
    for j in range(jmax):
        dE_dx[:, i:i+di, j:j+dj, :] += dE_dx_strided[:, i, j, ...]
where dE_dx is the object of interest and dE_dx_strided is a rank-6 tensor that is effectively being summed 'piecewise'; the pattern is reminiscent of a convolution along axes 1 and 2:
# Verbose convolution operation (not my actual implementation)
for i in range(imax):
    for j in range(jmax):
        # Vaguely similar, but with filter multiplication, and = instead of +=
        y[i, j] = np.sum(x[i:i+di, j:j+dj] * f)
My original idea was to make all elements of dE_dx_strided that are to be added to a single dE_dx[:, i:i+di, j:j+dj, :] lie along one axis, and then sum over it; but I couldn't get this to work.
Now I know that for loops aren't inherently slow, but is there a numpy-esque way to optimise this further, perhaps by reshaping, summing, strides, etc.?
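For reference, one way to express this accumulation without the Python loops is a scatter-add with np.add.at; a minimal sketch under the shapes given above (correct for overlapping windows, though np.add.at is itself not always faster than the loop):
import numpy as np

batch, H, W, C = 64, 28, 28, 3
di, dj = 4, 4
imax, jmax = H - di + 1, W - dj + 1  # 25, 25

rng = np.random.default_rng(0)
dE_dx_strided = rng.standard_normal((batch, imax, jmax, di, dj, C)).astype(np.float32)

# Loop reference from the question.
dE_dx_loop = np.zeros((batch, H, W, C), dtype=np.float32)
for i in range(imax):
    for j in range(jmax):
        dE_dx_loop[:, i:i+di, j:j+dj, :] += dE_dx_strided[:, i, j]

# Vectorized: the (b, i, j, p, q, c) entry is accumulated at spatial position (i + p, j + q).
rows = np.arange(imax)[:, None, None, None] + np.arange(di)[None, None, :, None]
cols = np.arange(jmax)[None, :, None, None] + np.arange(dj)[None, None, None, :]
tmp = np.zeros((H, W, batch, C), dtype=np.float32)
np.add.at(tmp, (rows, cols), np.transpose(dE_dx_strided, (1, 2, 3, 4, 0, 5)))
dE_dx_vec = np.moveaxis(tmp, 2, 0)

assert np.allclose(dE_dx_loop, dE_dx_vec, atol=1e-4)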
I'm trying to assign some values to a torch tensor. In the sample code below, I initialize a tensor U and try to assign a tensor b at a fixed position in its last two dimensions. In reality, this is a loop over i and j that solves some relation for a number of training samples (here 10) and assigns the result to the corresponding location.
import torch
U = torch.zeros([10, 1, 4, 4])
b = torch.rand([10, 1, 1, 1])
i = 2
j = 2
U[:, :, i, j] = b
I was expecting b to be assigned at indices i and j for every training sample (the data shape being (10, 1)), but it gives me an error. The error I get is the following:
RuntimeError: expand(torch.FloatTensor{[10, 1, 1, 1]}, size=[10, 1]): the number of sizes provided (2) must be greater or equal to the number of dimensions in the tensor (4)
Any suggestions on how to fix it would be appreciated.
As an example, you can think of this as if '[10, 1]' is the shape of my data. Imagine it is 10 images, each of which has one channel. Then imagine each image is of shape '[4, 4]'. In each iteration of the loop, pixel '[i, j]' for all images and channels is being calculated.
Your b tensor has too many dimensions.
U[:, :, i, j] has a [10, 1] shape (try U[:, :, i, j].shape)
Use b = torch.rand([10, 1]) instead.
Thanks to @Khoyo's tip on the source of the problem, I used reshape to fix this as follows:
import torch
U = torch.zeros([10, 1, 4, 4])
b = torch.rand([10, 1, 1, 1])
i = 2
j = 2
U[:, :, i, j] = b.reshape((-1, 1))  # shape (10, 1) matches U[:, :, i, j]
There is a shape mismatch in your assignment. Alternatively, indexing with nested lists, U[:, :, [[i]], [[j]]], keeps the last two singleton dimensions for you, so you can assign b without reshaping it.
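For reference, a quick shape check of the indexing variants involved (an illustrative sketch, not from the original answers):
import torch

U = torch.zeros([10, 1, 4, 4])
b = torch.rand([10, 1, 1, 1])

print(U[:, :, 2, 2].shape)          # torch.Size([10, 1])       -> needs b reshaped to (10, 1)
print(U[:, :, [[2]], [[2]]].shape)  # torch.Size([10, 1, 1, 1]) -> matches b directly
U[:, :, [[2]], [[2]]] = b           # works without reshaping b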
I need to vectorize two nested for loops and don't know how to do it. One is for grayscale images and one is for color images. I want to filter an image with the Kuwahara filter. The code below is the last step I need to vectorize to get a fast function.
The array img_kuwahara has shape (m, n) or (m, n, 3) (color image).
The array index_min has shape (m, n).
The array mean has shape (4, m, n) (grayscale) or (3, 4, m, n) (color).
I need to get the right value out of the mean array into the img_kuwahara array.
As sample data you can use the following arrays:
index_min = np.array([[0, 1, 1, 2, 3],[3, 3, 2, 2, 2],[2, 3, 3, 0, 2],[0, 1, 1, 0, 3],[2, 1, 3, 0, 0]])
mean = np.random.randint(0, 256, size=(4, 5, 5))  # gray scale images
mean = np.random.randint(0, 256, size=(3, 4, 5, 5))  # color images
row = 5, columns = 5
Thank you for your help
# Edit gray scale image
if len(image.shape) == 2:
    # Set result image
    img_kuwahara = np.zeros((row, columns), dtype=imgtyp)
    for k in range(0, row):
        for i in range(0, columns):
            img_kuwahara[k, i] = mean[index_min[k, i], k, i]

# Edit color image
if len(image.shape) == 3:
    # Set result image
    img_kuwahara = np.zeros((row, columns, 3), dtype=imgtyp)
    for k in range(0, row):
        for i in range(0, columns):
            img_kuwahara[k, i, 0] = mean[0][index_min[k, i], k, i]
            img_kuwahara[k, i, 1] = mean[1][index_min[k, i], k, i]
            img_kuwahara[k, i, 2] = mean[2][index_min[k, i], k, i]
The first loop can be vectorized using np.meshgrid:
j, i = np.meshgrid(range(columns), range(rows))
img_kuwahara = mean[index_min[i, j], i, j]
The second loop can be vectorized by using an additional np.moveaxis (assuming that mean is actually a 4D array in that case, not a list of 3D arrays; otherwise just convert it):
j, i = np.meshgrid(range(columns), range(rows))
img_kuwahara = np.moveaxis(mean, 0, -1)[index_min[i, j], i, j]
Alternatively to np.meshgrid you can also use np.mgrid (which supports a more natural syntax):
i, j = np.mgrid[:rows, :columns]
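A minimal check of the grayscale case using the sample data from the question (a sketch I put together, not part of the original answer):
import numpy as np

rows, columns = 5, 5
index_min = np.array([[0, 1, 1, 2, 3],
                      [3, 3, 2, 2, 2],
                      [2, 3, 3, 0, 2],
                      [0, 1, 1, 0, 3],
                      [2, 1, 3, 0, 0]])
mean = np.random.randint(0, 256, size=(4, rows, columns))

# Loop reference from the question (grayscale case).
ref = np.zeros((rows, columns), dtype=mean.dtype)
for k in range(rows):
    for i in range(columns):
        ref[k, i] = mean[index_min[k, i], k, i]

# Vectorized lookup from the answer.
i, j = np.mgrid[:rows, :columns]
vec = mean[index_min[i, j], i, j]
assert np.array_equal(ref, vec)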