I need to write a function that convolves an image with a kernel.
In other words, the function receives an image with a single color channel (i.e. a two-dimensional list, for example [[1,1,1],[1,1,1],[1,1,1]]) and a kernel (also a two-dimensional list), and returns an image of the same size as the original, where each pixel in the new image is calculated by applying the kernel to it.
That is: align the pixel image[row][column] with the central entry of the kernel matrix, and sum the values of its neighbors (including the pixel itself), each multiplied by the corresponding entry in the kernel.
When calculating the value for a pixel x that lies on the image boundary, pixel values outside the image boundaries should be treated as if they had the same value as pixel x.
For example- for input:
image = [[0,128,255]]
kernel =[[1/9 ,1/9 ,1/9],[1/9 ,1/9 ,1/9] ,[1/9 ,1/9 ,1/9]]
output: [[14,128,241]]
The function starts with the value 0 and places it in the center of a kernel-sized matrix, along with the adjacent values that fall within the boundaries of that matrix.
In the example this is a 3 * 3 matrix, so the matrix we get after entering the values is [[0,0,0],[0,0,128],[0,0,0]].
After entering the corresponding values, we multiply this matrix by the kernel matrix element-wise (pixels at the same coordinates in the two matrices are multiplied by each other), sum everything together, and store the result in the output list in place of the value 0.
Then we do the same with the next value, 128, and so on.
Eventually we return a new image containing the newly calculated pixels, as in the example above.
Another explanation-
https://towardsdatascience.com/types-of-convolution-kernels-simplified-f040cb307c37
According to the instructions I received, I cannot use numpy.
import copy

def new_image(image,kernel):
    new_image= copy.deepcopy(image)
    rows = len(image)
    columns = len(image[0])
    kernel_h = len(kernel)
    kernel_w = len(kernel[0])
    for i in range(rows):
        for j in range(columns):
            sum = 0
            h = (-1 * (kernel_h // 2))
            w = (-1 * (kernel_w // 2))
            for m in range(kernel_h):
                for n in range(kernel_w):
                    if 0 <= j+w < columns:
                        sum += round(kernel[m][n] * new_image[i][j+h])
                    if j + h < 0 or j + h >= columns:
                        sum += round(kernel[m][n] * new_image[i][j])
                    h+=1
                    w+=1
            new_image[i][j] = sum
    return new_image
This is what I have written so far, but it does not work as required: it does not return the expected image.
Output: [[42, 131, 239]]
instead of: [[14,128,241]]
Input: [[0,128,255]]
I have no idea how to fix it; I would appreciate help.
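One possible fix, offered as a sketch rather than a definitive answer. The posted code reads from and writes to the same list (so later pixels see already-filtered values), never applies the kernel's row offset, rounds every product instead of the final sum, and advances h and w together in the innermost loop. The version below keeps the input untouched, derives the neighbour coordinates from the kernel indices, and rounds once per pixel. The name convolve is my own, and the sketch assumes a kernel with odd height and width.

```python
def convolve(image, kernel):
    """Apply `kernel` to a single-channel image given as a list of lists.

    Neighbours that fall outside the image are replaced by the value of
    the pixel currently being computed, as the task statement requires.
    """
    rows, cols = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0.0
            for m in range(kh):
                for n in range(kw):
                    # Image coordinates covered by kernel[m][n]
                    r = i + m - kh // 2
                    c = j + n - kw // 2
                    if 0 <= r < rows and 0 <= c < cols:
                        value = image[r][c]
                    else:
                        value = image[i][j]  # out of bounds: reuse centre pixel
                    total += kernel[m][n] * value
            out[i][j] = round(total)  # round once, after summing
    return out


blur = [[1 / 9] * 3 for _ in range(3)]
print(convolve([[0, 128, 255]], blur))  # [[14, 128, 241]]
```

Reading only from image and writing only to out is what keeps each output pixel independent of the others.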
A piece of equipment outputs a heatmap with a scale bar as an image, but has no option to save the data as a .csv or in some other form that can easily be imported into Python for analysis.
I have used PIL to pull in the image, then create an array of the heatmap, frame1, with dimensions 680, 900, 3 (an XY array with the 3 RGB values for each pixel). I then made an array from the scalebar, scale1, with dimensions 254, 3 (a line with the 3 RGB values for each point on the scale). To relate this to the actual scale values I create a linear space scaleval = np.linspace(maxval,minval, 254), where maxval and minval are the max and min of the scalebar, which I transcribe from the image.
I want to match each pixel in frame1 to its closest colour match in scale1, and then store the corresponding scale value from scaleval into a dataframe df. In terms of for loops, what I want to do is:
# function returning the distance between two RGB values
def distance(c1, c2):
    (r1,g1,b1) = c1
    (r2,g2,b2) = c2
    return math.sqrt((r1 - r2)**2 + (g1 - g2) ** 2 + (b1 - b2) **2)

#cycle through columns in frame1
for j in range(frame1.shape[1]):
    #cycle through rows in frame1
    for k in range(frame1.shape[0]):
        # create empty list for the distances between the selected pixel and the values in scale1
        distances = []
        # cycle through scale1 creating list of distances with current pixel
        for i in range(len(scale1)):
            distances.append(distance(scale1[i], frame1[k,j,:]))
        # find the index position of the minimum value, and store the scale value to a dataframe in the current XY position
        distarr = np.asarray(distances)
        idx = distarr.argmin()
        df.loc[k,j] = scaleval[idx]
    print("Column " + str(j+1) + " completed")
However this would be quite slow. Any advice on how to avoid using for loops here?
In case anyone with a similar problem finds this while searching later:
I was able to vectorise the inner-most loop. The function cdist in scipy.spatial.distance lets you compute the distances between one point and an array of points without iterating in Python.
So this portion:
distances = []
# cycle through scale1 creating list of distances with current pixel
for i in range(len(scale1)):
    distances.append(distance(scale1[i], frame1[k,j,:]))
# find the index position of the minimum value, and store the scale value to a dataframe in the current XY position
distarr = np.asarray(distances)
idx = distarr.argmin()
df.loc[k,j] = scaleval[idx]
became
# create list of distances from current pixel to values in scale1 and store index of minimum distance
idx = cdist([frame1[k,j,:]],scale1).argmin()
df.loc[k,j] = scaleval[idx]
While there are still two for loops iterating through each pixel in frame1, the above change cut the run time to less than a third of what it was.
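The two remaining pixel loops can probably be removed as well. The following is a minimal sketch using only NumPy broadcasting instead of cdist; the function name map_to_scale and the chunk_rows parameter are my own inventions, not part of the original post. It processes a few image rows at a time so the intermediate distance array stays small, and it converts to float first because subtracting uint8 values would wrap around.

```python
import numpy as np

def map_to_scale(frame, scale, scaleval, chunk_rows=16):
    """Replace every RGB pixel in `frame` (H x W x 3) with the scale value
    whose colour in `scale` (S x 3) is nearest in Euclidean RGB distance."""
    frame = np.asarray(frame, dtype=np.float64)   # avoid uint8 wrap-around
    scale = np.asarray(scale, dtype=np.float64)
    scaleval = np.asarray(scaleval)
    h, w, _ = frame.shape
    out = np.empty((h, w), dtype=scaleval.dtype)
    for start in range(0, h, chunk_rows):
        # (P, 1, 3) pixels against (1, S, 3) scale colours -> (P, S) squared distances
        block = frame[start:start + chunk_rows].reshape(-1, 1, 3)
        d2 = ((block - scale[None, :, :]) ** 2).sum(axis=2)
        idx = d2.argmin(axis=1)                   # nearest scale entry per pixel
        out[start:start + chunk_rows] = scaleval[idx].reshape(-1, w)
    return out
```

With the names from the question this would be called as map_to_scale(frame1, scale1, scaleval), and the resulting 2D array can be wrapped in a DataFrame in one step instead of filling df cell by cell.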
I have the following skeleton:
From this image I'd like to eliminate lines that are not part of loops.
I imagine this as a process in which ends of lines are found (marked with red dots) and the lines are gobbled up until there's a point where they branch (marked with blue dots).
I haven't found an operation for this in OpenCV or Scikit-Image.
Is there a name for such a transform? Is there a way to implement it in Python that would work efficiently?
I've also uploaded the image here in case the above image doesn't load correctly.
I haven't found a good way to do this in Python using existing libraries (though I hope someone is able to point me to one), nor a name for this operation.
So I've decided to call this the Fuse Transform, since the action of the algorithm is similar to burning the lines away like fuses until they split.
I've implemented the Fuse Transform below as a Cython function for efficiency.
The algorithm requires a single O(N) sweep over the matrix, where N is the number of cells, to identify seed cells (those cells that are at the start of a fuse), and then time linear in the total length of the fuses to eliminate the lines in question.
The Fuse Transform Algorithm
%%cython -a --cplus
import numpy as np
import cv2
import skimage.morphology as skm
import cython
from libcpp.queue cimport queue
cimport numpy as np

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(True)
#Richard's Fuse Transform
#https://stackoverflow.com/a/51738867/752843
cpdef void FuseTransform(unsigned char [:, :] image):
    # set the variable extension types
    cdef int c, x, y, nx, ny, width, height, neighbours
    cdef queue[int] q

    # grab the image dimensions
    height = image.shape[0]
    width = image.shape[1]

    cdef int dx[8]
    cdef int dy[8]

    #Offsets to neighbouring cells
    dx[:] = [-1,-1,0,1,1,1,0,-1]
    dy[:] = [0,-1,-1,-1,0,1,1,1]

    #Find seed cells: those with only one neighbour
    for y in range(1, height-1):
        for x in range(1, width-1):
            if image[y,x]==0: #Seed cells cannot be blank cells
                continue
            neighbours = 0
            for n in range(0,8): #Looks at all neighbours
                nx = x+dx[n]
                ny = y+dy[n]
                if image[ny,nx]>0: #This neighbour has a value
                    neighbours += 1
            if neighbours==1: #Was there only one neighbour?
                q.push(y*width+x) #If so, this is a seed cell

    #Starting with the seed cells, gobble up the lines
    while not q.empty():
        c = q.front()
        q.pop()
        y = c//width #Convert flat index into 2D x-y index
        x = c%width
        image[y,x] = 0 #Gobble up this part of the fuse
        neighbour = -1 #No neighbours yet
        for n in range(0,8): #Look at all neighbours
            nx = x+dx[n] #Find coordinates of neighbour cells
            ny = y+dy[n]
            #If the neighbour would be off the side of the matrix, ignore it
            if nx<0 or ny<0 or nx==width or ny==height:
                continue
            if image[ny,nx]>0: #Is the neighbouring cell active?
                if neighbour!=-1: #If we've already found an active neighbour
                    neighbour=-1 #Then pretend we found no neighbours
                    break #And stop looking. This is the end of the fuse.
                else: #Otherwise, make a note of the neighbour's index.
                    neighbour = ny*width+nx
        if neighbour!=-1: #If there was only one neighbour
            q.push(neighbour) #Continue burning the fuse
#Read in image
img = cv2.imread('part.jpg')
ShowImage('Original',img,'bgr')
#Convert image to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
#Apply Otsu's method to eliminate pixels of intermediate colour
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_OTSU)
#Apply the Fuse Transform
#(ShowImage and skelhuman are not defined in this excerpt; skelhuman is presumably the skeletonised binary image)
skh_dilated = skelhuman.copy()
FuseTransform(skh_dilated)
Input
Result
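For readers without a Cython toolchain, here is a rough pure-Python sketch of the same burning idea on a list-of-lists image. It is slower and not a drop-in replacement: the names fuse_transform and active_neighbours are hypothetical, it returns a copy instead of modifying the image in place, and unlike the Cython version it also looks for seed cells on the image border.

```python
from collections import deque

# Offsets to the 8 neighbouring cells, as (dy, dx) pairs
OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1)]

def active_neighbours(img, y, x):
    """Coordinates of the non-zero 8-neighbours of (y, x) inside the image."""
    h, w = len(img), len(img[0])
    return [(y + dy, x + dx) for dy, dx in OFFSETS
            if 0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx] > 0]

def fuse_transform(img):
    """Burn away lines that end in a free tip; loops are left intact."""
    img = [list(row) for row in img]              # work on a copy
    h, w = len(img), len(img[0])
    # Seed cells: active cells with exactly one active neighbour
    queue = deque((y, x) for y in range(h) for x in range(w)
                  if img[y][x] > 0 and len(active_neighbours(img, y, x)) == 1)
    while queue:
        y, x = queue.popleft()
        img[y][x] = 0                             # gobble up this part of the fuse
        nbrs = active_neighbours(img, y, x)
        if len(nbrs) == 1:                        # a single way onward: keep burning
            queue.append(nbrs[0])
    return img


skeleton = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 1, 1],   # a tail hanging off a closed loop
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
for row in fuse_transform(skeleton):
    print(row)
```

In this small example the three tail cells in the third row are removed and the eight loop cells survive.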
In the following algorithm I first normalize the image so that pixel values are zeros and ones. Then I examine the 8-connected neighbors of each non-zero pixel by applying a 3x3 unnormalized box filter. If we multiply the filter output (pixelwise) by the input image, we are left with only the non-zero pixels, and their values now tell us how many 8-connected neighbors each one has, plus 1, because the center pixel counts itself as a neighbor.
Red is the center pixel. Yellow is its 8-connected neighborhood.
We should eliminate the pixels whose resulting value is less than 3.
The code will make things clearer. It may not be very efficient. I didn't try to dig into Richard's code; maybe he's doing a similar thing more efficiently.
import cv2
import numpy as np

im = cv2.imread('USqDW.png', 0)
# set max pixel value to 1
s = np.uint8(im > 0)
count = 0
i = 0
while count != np.sum(s):
    # non-zero pixel count
    count = np.sum(s)
    # examine 3x3 neighborhood of each pixel
    filt = cv2.boxFilter(s, -1, (3, 3), normalize=False)
    # if the center pixel of 3x3 neighborhood is zero, we are not interested in it
    s = s*filt
    # now we have pixels where the center pixel of 3x3 neighborhood is non-zero
    # if a pixels' 8-connectivity is less than 2 we can remove it
    # threshold is 3 here because the boxfilter also counted the center pixel
    s[s < 3] = 0
    # set max pixel value to 1
    s[s > 0] = 1
    i = i + 1
After pruning:
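To see what the box-filter step computes, the neighbour count can be reproduced with NumPy alone. This is only an illustrative sketch with made-up names (count_with_self, prune), and it pads with zeros, whereas cv2.boxFilter uses a reflecting border by default, so results at the very edge of an image may differ from the OpenCV code.

```python
import numpy as np

def count_with_self(s):
    """For each non-zero pixel of the 0/1 array `s`, count the non-zero
    pixels in its 3x3 neighbourhood (itself included); zero elsewhere."""
    h, w = s.shape
    padded = np.pad(s, 1)                        # zero border (assumption)
    total = np.zeros((h, w), dtype=np.int32)
    for dy in range(3):
        for dx in range(3):
            total += padded[dy:dy + h, dx:dx + w]
    return total * s

def prune(image):
    """Repeatedly drop pixels with fewer than two 8-connected neighbours."""
    s = (np.asarray(image) > 0).astype(np.int32)
    while True:
        kept = (count_with_self(s) >= 3).astype(np.int32)
        if kept.sum() == s.sum():                # nothing removed: done
            return kept
        s = kept


skeleton = np.array([
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
])
print(prune(skeleton))
```

Note that on this input a one-pixel stub next to the loop survives, because that pixel touches three loop pixels and so never drops below the threshold.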
I want to reduce one size of an image by keeping only the maximum pixel value of each pixel set. I implemented this in python :
def pixel_max_resize(img, h, w):
    imr = np.zeros((h,w), dtype=np.uint8)
    r = int(h/w)
    for j in range(0,w):
        imr[:,j] = np.amax(img[:,j*r:j*r+r], axis = 1)
    return imr
This function is a lot slower than a cv2.resize to the same size (by a factor of 5-10). Does anyone have an idea how to optimize the speed of this function? Is there a list comprehension formulation that could speed up the process?
I'm not 100% sure what you are trying to achieve since your code throws an error if the target height is not equal to the source height. Anyway, here is a function that resizes an image based on the maximum value of each subsample area. It's about 3-5 times faster than your code.
def pixel_max_resize(img, h, w):
    source_h, source_w = img.shape
    return img.reshape(h,source_h // h,-1,source_w // w).swapaxes(1,2).reshape(h,w,-1).max(axis=2)
(Caveat: source width and height must be integer multiples of target width and height respectively.)
Explanation:
The source 2D image is partitioned into a 3D array so that the first and second axes have the size of the target height and width, and the third axis contains the values of all pixels to be subsampled for one target pixel. max() over this axis returns the maximum value for each subsample.
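A small worked example may make the reshape easier to follow. This sketch is a variant under my own name block_max_resize: it keeps the blocks as a 4D array and takes the maximum over both block axes at once, and it raises an error when the caveat above is violated.

```python
import numpy as np

def block_max_resize(img, h, w):
    """Shrink a 2D array to (h, w) by taking the maximum of each block."""
    source_h, source_w = img.shape
    if source_h % h or source_w % w:
        raise ValueError("source size must be an integer multiple of target size")
    # Axes: (target row, row inside block, target column, column inside block)
    blocks = img.reshape(h, source_h // h, w, source_w // w)
    return blocks.max(axis=(1, 3))


img = np.arange(16).reshape(4, 4)
print(block_max_resize(img, 2, 2))
# [[ 5  7]
#  [13 15]]
```

Each output pixel is the largest value of one 2x2 block of the 4x4 input.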
I have a numpy 2D array of values. Each element in the array represents a grid point from a grid where each box is 13km on a side. I need to determine the average value of all points within 50 miles of a specific point on the grid.
My current solution determines a bounding box and then references items in the array within that box using their indices, which is slow with numpy. I'm trying to determine a faster solution.
Current solution:
num_x = 400 #horizontal dimension of the 2D array
num_y = 300 #vertical dimension of the 2D array
num_dx = 6 #maximum number of horizontal grid points that fit within 50 miles
num_dy = 6 #same as above but for vertical (square grid)
radius_m = 80467.2 #50 miles expressed in meters
values = [] # stores the extracted values
for ix in range(-num_dx,num_dx+1):
    for jy in range(-num_dy,num_dy+1):
        # Determine distance to this point
        dist = ((ix*dx)**2+(jy*dy)**2)**0.5
        if dist <= radius_m:
            # Ensure this grid point actually exists within the grid
            if (j+jy) < num_y and (i+ix) < num_x:
                value = myarray[i+ix,j+jy]
                if value is not masked and value >= 0:
                    values.append(float(value))
average = sum(values) / float(len(values))
This is slow (takes about 1.5 seconds) due to accessing myarray over 100 times to extract the value of a single element. Is there a vector method that would work better here? I can't seem to figure out a way to do this with a mask since the conditional is based on the location of the grid point relative to another, not the value of the element itself.
Your code isn't runnable and seems to contain a bug when i < num_dx or j < num_dy (the negative index then wraps around to the other side of the array). But making some assumptions about your variable names, this is how I would do it:
# First make sure we stay in the grid
i1, i2 = max(i-num_dx, 0), min(i+num_dx+1, num_x)
j1, j2 = max(j-num_dy, 0), min(j+num_dy+1, num_y)
# Get the radius in blocks, grid should be homogeneous
radius_i = radius_m / 13000.0
# Calc distances per element by broadcasting
# (DX varies along the first axis and DY along the second, matching myarray[i, j])
DX = np.arange(i1, i2)[:, None] - i
DY = np.arange(j1, j2) - j
mask = DX*DX + DY*DY <= radius_i*radius_i
# Get block of interest and apply mask
values = myarray[i1:i2, j1:j2][mask]
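Put together as one function, the approach might look like the sketch below. The function name mean_within_radius and its parameters are hypothetical; it assumes the array is indexed as grid[i, j], that the grid spacing is the same in both directions, and, as in the question, that masked and negative values are excluded from the average.

```python
import numpy as np

def mean_within_radius(grid, i, j, radius_m=80467.2, cell_m=13000.0):
    """Average the valid values of `grid` within `radius_m` of cell (i, j)."""
    n = int(radius_m // cell_m)                   # cells that fit in the radius
    # Clip the bounding box to the grid
    i1, i2 = max(i - n, 0), min(i + n + 1, grid.shape[0])
    j1, j2 = max(j - n, 0), min(j + n + 1, grid.shape[1])
    # Offsets from (i, j), broadcast to the shape of the block
    di = np.arange(i1, i2)[:, None] - i
    dj = np.arange(j1, j2)[None, :] - j
    inside = (di * di + dj * dj) * cell_m ** 2 <= radius_m ** 2
    block = np.ma.asarray(grid[i1:i2, j1:j2])
    # Keep cells that are inside the circle, not masked, and non-negative
    valid = inside & ~np.ma.getmaskarray(block) & (block.filled(-1) >= 0)
    if not valid.any():
        return float("nan")
    return float(block.filled(0)[valid].mean())
```

A call such as mean_within_radius(myarray, i, j) then replaces the double loop for one grid point.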
For interior points (where the radius doesn't extend outside your image), you can just compute a single mask that is used for any interior point. Start with an array of zeros:
mask = np.zeros((2 * num_dx + 1, 2 * num_dy + 1), dtype=int)
Assuming your point of interest is at the center of that array, set each element that falls within the radius to 1 (not shown here). Then,
indices = np.argwhere(mask.ravel() == 1)
Then for any interior element (i, j) in myarray, you would get the values within the radius like:
values = myarray[i-num_dx: i+num_dx+1, j-num_dy: j+num_dy+1].ravel()[indices]
For points near the border, you would make a copy of mask and set rows/cols outside the image to zero before setting indices.
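The step marked "not shown here" could be filled in along these lines. This is only a sketch with a hypothetical helper name, disk_mask, and it assumes a square grid so that the radius can be expressed in cells.

```python
import numpy as np

def disk_mask(n, radius_cells):
    """(2n+1) x (2n+1) array with 1 where a cell lies within `radius_cells`
    of the centre cell and 0 elsewhere."""
    offsets = np.arange(-n, n + 1)
    dist2 = offsets[:, None] ** 2 + offsets[None, :] ** 2
    return (dist2 <= radius_cells ** 2).astype(int)


# 50 miles on a 13 km grid: six cells on either side of the centre
mask = disk_mask(6, 80467.2 / 13000.0)
indices = np.flatnonzero(mask)        # flat positions of the cells to keep
print(disk_mask(1, 1))
# [[0 1 0]
#  [1 1 1]
#  [0 1 0]]
```

np.flatnonzero gives a 1D index array, so indexing the ravelled block with it returns a flat array of values.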
I have a range image of a scene. I traverse the image and calculate the average change in depth under the detection window. The detection window changes size based on the average depth of the pixels surrounding the current location. I accumulate the average change to produce a simple response image.
Most of the time is spent in the for loop, it is taking about 40+s for a 512x52 image on my machine. I was hoping for some speed up. Is there a more efficient/faster way to traverse the image? Is there a better pythonic/numpy/scipy way to visit each pixel? Or shall I go learn cython?
EDIT: I have reduced running time to about 18s by using scipy.misc.imread() instead of skimage.io.imread(). Not sure what the difference is, I will try to investigate.
Here is a simplified version of the code:
import matplotlib.pylab as plt
import numpy as np
from skimage.io import imread
from skimage.transform import integral_image, integrate
import time

def intersect(a, b):
    '''Determine the intersection of two rectangles'''
    rect = (0,0,0,0)
    r0 = max(a[0],b[0])
    c0 = max(a[1],b[1])
    r1 = min(a[2],b[2])
    c1 = min(a[3],b[3])
    # Do we have a valid intersection?
    if r1 > r0 and c1 > c0:
        rect = (r0,c0,r1,c1)
    return rect

# Setup data
depth_src = imread("test.jpg", as_grey=True)
depth_intg = integral_image(depth_src) # integrate to find sum depth in region
depth_pts = integral_image(depth_src > 0) # integrate to find num points which have depth
boundary = (0,0,depth_src.shape[0]-1,depth_src.shape[1]-1) # rectangle to intersect with

# Image to accumulate response
out_img = np.zeros(depth_src.shape)

# Average dimensions of bbox/detection window per unit length of depth
model = (0.602,2.044) # width, height

start_time = time.time()
for (r,c), junk in np.ndenumerate(depth_src):
    # Find points around current pixel
    r0, c0, r1, c1 = intersect((r-1, c-1, r+1, c+1), boundary)
    # Calculate average of depth of points around current pixel
    scale = integrate(depth_intg, r0, c0, r1, c1) * 255 / 9.0
    # Based on average depth, create the detection window
    r0 = r - (model[0] * scale/2)
    c0 = c - (model[1] * scale/2)
    r1 = r + (model[0] * scale/2)
    c1 = c + (model[1] * scale/2)
    # Used scale optimised detection window to extract features
    r0, c0, r1, c1 = intersect((r0,c0,r1,c1), boundary)
    depth_count = integrate(depth_pts,r0,c0,r1,c1)
    if depth_count:
        depth_sum = integrate(depth_intg,r0,c0,r1,c1)
        avg_change = depth_sum / depth_count
        # Accumulate response
        out_img[r0:r1,c0:c1] += avg_change
print(time.time() - start_time, " seconds")

plt.imshow(out_img)
plt.gray()
plt.show()
Michael, interesting question. It seems that the main performance problem you have is that each pixel in the image has two integrate() functions computed on it, one of size 3x3 and the other of a size which is not known in advance. Calculating individual integrals in this way is extremely inefficient, regardless of what numpy functions you use; it's an algorithmic issue, not an implementation issue. Consider an image of size N*N. You can calculate all integrals of any size K*K in that image using only approximately 4*N*N operations, not (as one might naively expect) N*N*K*K. The way you do that is first calculate an image of sliding sums over a window K in each row, and then sliding sums over the result in each column. Updating each sliding sum to move to the next pixel requires only adding the newest pixel in the current window and subtracting the oldest pixel in the previous window, thus two operations per pixel regardless of window size. We do have to do that twice (for rows and columns), therefore 4 operations per pixel.
I am not sure if there is a sliding window sum built into numpy, but this answer suggests a couple of ways to do it, using stride tricks: https://stackoverflow.com/a/12713297/1828289. You can certainly accomplish the same with one loop over columns and one loop over rows (taking slices to extract a row/column).
Example:
import numpy

# img is a 2D ndarray with a dtype wide enough to hold the sums (e.g. float or int64)
# K is the size of sums to calculate using sliding window
row_sums = numpy.zeros_like(img)
for i in range( img.shape[0] ):
    if i >= K:
        # drop the row that just left the window, add the row that entered it
        row_sums[i,:] = row_sums[i-1,:] - img[i-K,:] + img[i,:]
    elif i > 0:
        row_sums[i,:] = row_sums[i-1,:] + img[i,:]
    else: # i == 0
        row_sums[i,:] = img[i,:]
col_sums = numpy.zeros_like(img)
for j in range( img.shape[1] ):
    if j >= K:
        col_sums[:,j] = col_sums[:,j-1] - row_sums[:,j-K] + row_sums[:,j]
    elif j > 0:
        col_sums[:,j] = col_sums[:,j-1] + row_sums[:,j]
    else: # j == 0
        col_sums[:,j] = row_sums[:,j]
# here col_sums[i,j] should be equal to numpy.sum(img[i-K+1:i+1, j-K+1:j+1]) if i >= K-1 and j >= K-1
# the first K-1 rows and columns in col_sums contain partial sums and can be ignored
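As an alternative sketch of the same idea without Python-level loops, all K x K window sums can be read off a cumulative-sum table (an integral image). This swaps the running-sum update for cumsum, so it is a related technique rather than the exact procedure described above, and the name window_sums is my own.

```python
import numpy as np

def window_sums(img, K):
    """Sum of every K x K window of a 2D array.

    Returns an array of shape (H-K+1, W-K+1) where entry [a, b] equals
    img[a:a+K, b:b+K].sum(). Requires 1 <= K <= min(H, W).
    """
    img = np.asarray(img, dtype=np.float64)
    # Integral image with an extra leading row and column of zeros
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    # Four corner look-ups per window, for all windows at once
    return ii[K:, K:] - ii[:-K, K:] - ii[K:, :-K] + ii[:-K, :-K]


img = np.arange(20).reshape(4, 5)
print(window_sums(img, 2)[0, 0])  # 12.0, i.e. 0 + 1 + 5 + 6
```

The cost is a fixed number of array operations regardless of K, which is the property the answer relies on.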
How do you best apply that to your case? I think you might want to pre-compute the integrals for 3x3 (average depth) and also for several larger sizes, and use the value of the 3x3 to select one of the larger sizes for the detection window (assuming I understand the intent of your algorithm). The range of larger sizes you need might be limited, or artificially limiting it might still work acceptably well, just pick the nearest size. Calculating all integrals together using sliding sums is so much more efficient that I am almost certain it is worth calculating them for a lot of sizes you would never use at a particular pixel, especially if some of the sizes are large.
P.S. This is a minor addition, but you may want to avoid calling intersect() for every pixel: either (a) only process pixels which are farther from the edge than the max integral size, or (b) add margins to the image of the max integral size on all sides, filling the margins with either zeros or nans, or (c) (best approach) use slices to take care of this automatically: a slice index outside the boundary of an ndarray is automatically limited to the boundary, except of course negative indexes are wrapped around.
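Option (c) can be illustrated with a tiny sketch. The helper name window is hypothetical; it clamps only the lower bounds by hand, because NumPy already truncates slice ends that run past the array, while a negative start would be interpreted as counting from the end.

```python
import numpy as np

def window(img, r, c, half):
    """Return the (2*half+1)-sized neighbourhood of (r, c), cut off at the edges."""
    r0 = max(r - half, 0)   # negative starts would wrap around, so clamp them
    c0 = max(c - half, 0)
    # Upper bounds may exceed the shape: slicing limits them automatically
    return img[r0:r + half + 1, c0:c + half + 1]


img = np.arange(25).reshape(5, 5)
print(window(img, 2, 2, 1).shape)  # (3, 3)
print(window(img, 0, 0, 1).shape)  # (2, 2)
print(img[3:100].shape)            # (2, 5): the end is clipped to the boundary
print(img[-1:2].shape)             # (0, 5): -1 wraps to the last row, giving an empty slice
```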
EDIT: added example of sliding window sums