I have a range image of a scene. I traverse the image and calculate the average change in depth under the detection window. The detection window changes size based on the average depth of the pixels surrounding the current location. I accumulate the average change to produce a simple response image.
Most of the time is spent in the for loop; it takes about 40+ seconds for a 512x52 image on my machine. I was hoping for some speed-up. Is there a more efficient/faster way to traverse the image? Is there a better pythonic/numpy/scipy way to visit each pixel? Or shall I go learn Cython?
EDIT: I have reduced the running time to about 18s by using scipy.misc.imread() instead of skimage.io.imread(). Not sure what the difference is; I will try to investigate.
Here is a simplified version of the code:
import matplotlib.pylab as plt
import numpy as np
from skimage.io import imread
from skimage.transform import integral_image, integrate
import time
def intersect(a, b):
    '''Determine the intersection of two rectangles'''
    rect = (0, 0, 0, 0)
    r0 = max(a[0], b[0])
    c0 = max(a[1], b[1])
    r1 = min(a[2], b[2])
    c1 = min(a[3], b[3])
    # Do we have a valid intersection?
    if r1 > r0 and c1 > c0:
        rect = (r0, c0, r1, c1)
    return rect
# Setup data
depth_src = imread("test.jpg", as_grey=True)
depth_intg = integral_image(depth_src) # integrate to find sum depth in region
depth_pts = integral_image(depth_src > 0) # integrate to find num points which have depth
boundary = (0,0,depth_src.shape[0]-1,depth_src.shape[1]-1) # rectangle to intersect with
# Image to accumulate response
out_img = np.zeros(depth_src.shape)
# Average dimensions of bbox/detection window per unit length of depth
model = (0.602,2.044) # width, height
start_time = time.time()
for (r, c), junk in np.ndenumerate(depth_src):
    # Find points around current pixel
    r0, c0, r1, c1 = intersect((r-1, c-1, r+1, c+1), boundary)
    # Calculate average depth of points around current pixel
    scale = integrate(depth_intg, r0, c0, r1, c1) * 255 / 9.0
    # Based on average depth, create the detection window
    r0 = r - (model[0] * scale / 2)
    c0 = c - (model[1] * scale / 2)
    r1 = r + (model[0] * scale / 2)
    c1 = c + (model[1] * scale / 2)
    # Use the scale-optimised detection window to extract features
    r0, c0, r1, c1 = intersect((r0, c0, r1, c1), boundary)
    depth_count = integrate(depth_pts, r0, c0, r1, c1)
    if depth_count:
        depth_sum = integrate(depth_intg, r0, c0, r1, c1)
        avg_change = depth_sum / depth_count
        # Accumulate response
        out_img[r0:r1, c0:c1] += avg_change
print time.time() - start_time, " seconds"
plt.imshow(out_img)
plt.gray()
plt.show()
Michael, interesting question. It seems that the main performance problem is that each pixel in the image has two integrate() calls computed on it, one over a 3x3 region and the other over a region whose size is not known in advance. Calculating individual integrals this way is extremely inefficient, regardless of which numpy functions you use; it's an algorithmic issue, not an implementation issue.

Consider an image of size N*N. You can calculate all integrals of any fixed size K*K in that image using only approximately 4*N*N operations, not (as one might naively expect) N*N*K*K. The way to do that is to first compute an image of sliding sums over a window of K pixels along each row, and then sliding sums over that result along each column. Updating a sliding sum to move to the next pixel requires only adding the newest pixel of the current window and subtracting the oldest pixel of the previous window, i.e. two operations per pixel regardless of window size. We have to do that twice (for rows and for columns), hence 4 operations per pixel.
I am not sure if there is a sliding window sum built into numpy, but this answer suggests a couple of ways to do it, using stride tricks: https://stackoverflow.com/a/12713297/1828289. You can certainly accomplish the same with one loop over columns and one loop over rows (taking slices to extract a row/column).
Example:
import numpy

# img is a 2D ndarray
# K is the size of the sums to calculate using a sliding window
row_sums = numpy.zeros_like(img)
for i in range(img.shape[0]):
    if i >= K:
        # window full: drop the row that left it, add the new row
        row_sums[i, :] = row_sums[i-1, :] - img[i-K, :] + img[i, :]
    elif i > 0:
        # window not yet full: just accumulate
        row_sums[i, :] = row_sums[i-1, :] + img[i, :]
    else:  # i == 0
        row_sums[i, :] = img[i, :]

col_sums = numpy.zeros_like(img)
for j in range(img.shape[1]):
    if j >= K:
        col_sums[:, j] = col_sums[:, j-1] - row_sums[:, j-K] + row_sums[:, j]
    elif j > 0:
        col_sums[:, j] = col_sums[:, j-1] + row_sums[:, j]
    else:  # j == 0
        col_sums[:, j] = row_sums[:, j]

# here col_sums[i, j] equals numpy.sum(img[i-K+1:i+1, j-K+1:j+1]) for i >= K-1 and j >= K-1
# the first K-1 rows and columns of col_sums contain partial sums and can be ignored
How do you best apply that to your case? I think you might want to pre-compute the integrals for 3x3 (average depth) and also for several larger sizes, and use the value of the 3x3 to select one of the larger sizes for the detection window (assuming I understand the intent of your algorithm). The range of larger sizes you need might be limited, or artificially limiting it might still work acceptably well, just pick the nearest size. Calculating all integrals together using sliding sums is so much more efficient that I am almost certain it is worth calculating them for a lot of sizes you would never use at a particular pixel, especially if some of the sizes are large.
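For instance, a minimal sketch of that precomputation, assuming scipy is acceptable here and reusing depth_src from the question (the candidate sizes are made-up placeholders), might look like this:

import numpy as np
from scipy.ndimage import uniform_filter

# hypothetical candidate window sizes; tune to the range of scales you expect
sizes = [3, 5, 9, 15, 25, 41]

# box sums for every size at every pixel: uniform_filter computes the mean
# over a size x size window, so multiply by size**2 to get the sum
box_sums = {K: uniform_filter(depth_src, size=K, mode='constant') * K * K
            for K in sizes}

avg3 = box_sums[3] / 9.0  # 3x3 average depth at every pixel, computed in one shot
# per pixel, map avg3 (via your width/height model) to the nearest size in `sizes`
# and read the detection-window sum out of box_sums at that size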
P.S. This is a minor addition, but you may want to avoid calling intersect() for every pixel: either (a) only process pixels which are farther from the edge than the max integral size, or (b) add margins of the max integral size to all sides of the image, filling them with either zeros or nans, or (c) (best approach) use slices to take care of this automatically: slice indices beyond the bounds of an ndarray are automatically clipped to the boundary, though negative indices of course wrap around.
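A quick illustration of that slice behaviour:

import numpy as np

a = np.arange(10)
print(a[7:15])  # [7 8 9] -- the stop index is clipped to the boundary, no IndexError
print(a[-3:5])  # []      -- a negative start wraps around to index 7, so the slice is empty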
EDIT: added example of sliding window sums
A piece of equipment outputs a heatmap with a scale bar as an image, but has no option to save the data as a .csv or another format that can easily be imported into Python for analysis.
I have used PIL to pull in the image and created an array of the heatmap, frame1, with dimensions (680, 900, 3) (an XY array with the 3 RGB values for each pixel). I then made an array from the scale bar, scale1, with dimensions (254, 3) (a line with the 3 RGB values for each point on the scale). To relate this to the actual scale values I create a linear space, scaleval = np.linspace(maxval, minval, 254), where maxval and minval are the max and min of the scale bar, which I transcribe from the image.
I want to match each pixel in frame1 to its closest colour match in scale1, and then store the corresponding scale value from scaleval into a dataframe df. In terms of for loops, what I want to do is:
import math

# function returning the distance between two RGB values
def distance(c1, c2):
    (r1, g1, b1) = c1
    (r2, g2, b2) = c2
    return math.sqrt((r1 - r2)**2 + (g1 - g2)**2 + (b1 - b2)**2)

# cycle through columns in frame1
for j in range(frame1.shape[1]):
    # cycle through rows in frame1
    for k in range(frame1.shape[0]):
        # create empty list for the distances between the selected pixel and the values in scale1
        distances = []
        # cycle through scale1 creating list of distances to the current pixel
        for i in range(len(scale1)):
            distances.append(distance(scale1[i], frame1[k, j, :]))
        # find the index of the minimum distance and store the corresponding
        # scale value in the dataframe at the current XY position
        distarr = np.asarray(distances)
        idx = distarr.argmin()
        df.loc[k, j] = scaleval[idx]
    print("Column " + str(j+1) + " completed")
However this would be quite slow. Any advice on how to avoid using for loops here?
In case anyone with a similar problem finds this while searching later:
I was able to vectorise the innermost loop. The function cdist in SciPy (scipy.spatial.distance.cdist) lets you generate a list of distances between one point and an array of points without iterating.
So this portion:
distances = []
# cycle through scale1 creating list of distances with the current pixel
for i in range(len(scale1)):
    distances.append(distance(scale1[i], frame1[k, j, :]))
# find the index position of the minimum value, and store the scale value
# to the dataframe at the current XY position
distarr = np.asarray(distances)
idx = distarr.argmin()
df.loc[k, j] = scaleval[idx]
became
# create list of distances from current pixel to values in scale1 and store index of minimum distance
idx = cdist([frame1[k,j,:]],scale1).argmin()
df.loc[k,j] = scaleval[idx]
While there are still two for loops iterating through each pixel in frame1, the above change cut the run time to less than a third of what it was.
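For what it's worth, the remaining two pixel loops can be eliminated as well by handing cdist every pixel at once. A hedged sketch, reusing the names from the question and assuming pandas is available as pd (note the full distance matrix is (680*900) x 254 entries, on the order of a gigabyte as float64, so it may need to be processed in chunks of rows on smaller machines):

import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

h, w = frame1.shape[:2]
pixels = frame1.reshape(-1, 3).astype(float)    # flatten to (h*w, 3)
idx = cdist(pixels, scale1).argmin(axis=1)      # index of nearest scalebar colour per pixel
df = pd.DataFrame(scaleval[idx].reshape(h, w))  # back to image shape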
I would like to find the distance transform of a binary image in the fastest way possible without using the scipy function distance_transform_edt(). The image is 256 by 256. The reason I don't want to use scipy is that it is difficult to use within tensorflow: every time I want to use this function I need to start a new session, and this takes a lot of time. So I would like to make a custom function that only utilizes numpy.
My approach is as follows: find the coordinates of all the ones and all the zeros in the image, then compute the Euclidean distance between each zero pixel (a) and each one pixel (b); the value at each (a) position is the minimum distance to a (b) pixel. I do this for each 0 pixel. The resulting image has the same dimensions as the original binary map. My attempt at doing this is shown below.
I tried to do this as fast as possible, using no loops and only vectorization, but my function still can't match the speed of the scipy package. When I timed the code, the assignment to the variable "a" took the longest, but I do not know if there is a way to speed that up.
If anyone has any other suggestions for different algorithms to solve this problem of distance transforms or can direct me to other implementations in python, it would be very appreciated.
import numpy as np
import matplotlib.pyplot as plt

def get_dst_transform_img(og):  # og is a numpy array of the original image
    ones_loc = np.where(og == 1)
    ones = np.asarray(ones_loc).T    # coords of all ones in og
    zeros_loc = np.where(og == 0)
    zeros = np.asarray(zeros_loc).T  # coords of all zeros in og

    # squared distances via the expansion |z - o|^2 = z.z - 2*z.o + o.o
    a = -2 * np.dot(zeros, ones.T)
    b = np.sum(np.square(ones), axis=1)
    c = np.sum(np.square(zeros), axis=1)[:, np.newaxis]
    dists = a + b + c
    dists = np.sqrt(dists.min(axis=1))  # min dist of each zero pixel to a one pixel

    x = og.shape[0]
    y = og.shape[1]
    dist_transform = np.zeros((x, y))
    dist_transform[zeros[:, 0], zeros[:, 1]] = dists

    plt.figure()
    plt.imshow(dist_transform)
The implementation in the OP is a brute-force approach to the distance transform. This algorithm is O(n²) in the number of pixels n, as it computes the distance from each background pixel to each foreground pixel. Furthermore, because of the way it is vectorized, it requires a lot of memory; on my computer it couldn't compute the distance transform of a 256x256 image without thrashing. Many other algorithms are described in the literature; below I'll discuss two O(n) algorithms.
Note: Typically, the distance transform is computed for object pixels (value 1) to the nearest background pixel (value 0). The code in the OP does the reverse, and so the code I've pasted below follows OP's convention, not the more common convention.
The easiest to implement, IMO, is the chamfer distance algorithm. This is a recursive algorithm that does two passes over the image: one left to right and top to bottom, and one right to left and bottom to top. In each pass, the distance computed for previous pixels is propagated. This algorithm can be implemented using integer distances or floating-point distances between neighbors. The latter yields smaller errors, of course. But in both cases the errors can be reduced significantly by increasing the number of neighbors queried in this propagation. The algorithm is older, but G. Borgefors analyzed it and proposed suitable neighbor distances (G. Borgefors, Distance Transformations in Digital Images, Computer Vision, Graphics, and Image Processing 34:344-371, 1986).
Here is an implementation using 3-4 distance (distance to edge-connected neighbors is 3, distance to vertex-connected neighbors is 4):
def chamfer_distance(img):
    w, h = img.shape
    dt = np.zeros((w, h), np.uint32)
    # Forward pass
    x = 0
    y = 0
    if img[x, y] == 0:
        dt[x, y] = 65535  # some large value
    for x in range(1, w):
        if img[x, y] == 0:
            dt[x, y] = 3 + dt[x-1, y]
    for y in range(1, h):
        x = 0
        if img[x, y] == 0:
            dt[x, y] = min(3 + dt[x, y-1], 4 + dt[x+1, y-1])
        for x in range(1, w-1):
            if img[x, y] == 0:
                dt[x, y] = min(4 + dt[x-1, y-1], 3 + dt[x, y-1], 4 + dt[x+1, y-1], 3 + dt[x-1, y])
        x = w-1
        if img[x, y] == 0:
            dt[x, y] = min(4 + dt[x-1, y-1], 3 + dt[x, y-1], 3 + dt[x-1, y])
    # Backward pass
    for x in range(w-2, -1, -1):
        y = h-1
        if img[x, y] == 0:
            dt[x, y] = min(dt[x, y], 3 + dt[x+1, y])
    for y in range(h-2, -1, -1):
        x = w-1
        if img[x, y] == 0:
            dt[x, y] = min(dt[x, y], 3 + dt[x, y+1], 4 + dt[x-1, y+1])
        for x in range(w-2, 0, -1):
            if img[x, y] == 0:
                dt[x, y] = min(dt[x, y], 4 + dt[x+1, y+1], 3 + dt[x, y+1], 4 + dt[x-1, y+1], 3 + dt[x+1, y])
        x = 0
        if img[x, y] == 0:
            dt[x, y] = min(dt[x, y], 4 + dt[x+1, y+1], 3 + dt[x, y+1], 3 + dt[x+1, y])
    return dt
Note that a lot of the complication here is to avoid indexing out of bounds, but still computing distances all the way to the edges of the image. If we simply skip the pixels around the border of the image, the code becomes much simpler.
Because it is a recursive algorithm, it is not possible to vectorize its implementation, so the Python code will not be very efficient. But programmed in C or the like, it yields a very fast algorithm that gives a fairly good approximation to the Euclidean distance.
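One usage note: with the 3-4 weights, the propagated values are roughly three times the Euclidean distance (the weight for a unit step is 3), so divide the result by 3 to get back to pixel units:

approx_euclidean = chamfer_distance(og) / 3.0  # og as in the question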
OpenCV's cv.distanceTransform implements this algorithm.
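A hedged usage sketch following the OP's convention (OpenCV measures the distance of non-zero pixels to the nearest zero pixel, so the image is inverted first; the mask size of 3 selects the 3x3 chamfer approximation):

import numpy as np
import cv2 as cv

# invert og so its zero pixels (the ones we want distances for) become non-zero
dt = cv.distanceTransform((og == 0).astype(np.uint8), cv.DIST_L2, 3)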
Another very efficient algorithm computes the square of the distance transform. The square distance is separable (i.e. can be computed independently for each axis and added). This leads to an algorithm that is easy to parallelize. For each image row, the algorithm does a forward and a backward pass. For each column in the result, the algorithm then does another forward and backward pass. This process leads to an exact Euclidean distance transform.
This algorithm was first proposed by R. van den Boomgaard in his Ph.D. thesis in 1992. Unfortunately this went unnoticed. The algorithm was then again proposed by A. Meijster, J.B.T.M. Roerdink and W.H. Hesselink (A General Algorithm for Computing Distance Transforms in Linear Time, Mathematical Morphology and its Applications to Image and Signal Processing, pp 331-340, 2002), and again by P. Felzenszwalb and D. Huttenlocher (Distance transforms of sampled functions, Technical report, Cornell University, 2004).
This is the most efficient algorithm known, in part because it is the only one that can be easily and efficiently parallelized (computation on each image row, and later on each image column, is independent of other rows/columns).
Unfortunately I don't have optimized Python code for this one to share, but you can find implementations online. For example OpenCV's cv.distanceTransform implements this algorithm, and DIPlib's dip.EuclideanDistanceTransform does too.
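That said, the core of the algorithm is short enough to sketch. Below is a minimal, unoptimized transcription of the 1D squared-distance pass from the Felzenszwalb & Huttenlocher pseudocode, plus a 2D driver following the OP's convention (a large finite constant stands in for infinity to avoid inf - inf = nan in the intersection formula); treat it as an illustration, not production code:

import numpy as np

def edt_1d_squared(f):
    # 1D squared-distance transform of a sampled function f
    # (Felzenszwalb & Huttenlocher; v and z follow the paper's notation)
    n = len(f)
    d = np.zeros(n)
    v = np.zeros(n, dtype=int)  # locations of parabolas in the lower envelope
    z = np.full(n + 1, np.inf)  # boundaries between parabolas
    z[0] = -np.inf
    k = 0
    for q in range(1, n):
        # intersection of the parabola rooted at q with the rightmost one kept so far
        s = ((f[q] + q*q) - (f[v[k]] + v[k]*v[k])) / (2*q - 2*v[k])
        while s <= z[k]:
            k -= 1
            s = ((f[q] + q*q) - (f[v[k]] + v[k]*v[k])) / (2*q - 2*v[k])
        k += 1
        v[k] = q
        z[k] = s
        z[k+1] = np.inf
    k = 0
    for q in range(n):
        while z[k+1] < q:
            k += 1
        d[q] = (q - v[k])**2 + f[v[k]]
    return d

def edt_2d(og):
    # distance at each 0 pixel to the nearest 1 pixel, per the OP's convention
    LARGE = 1e10  # finite stand-in for infinity
    g = np.where(og == 1, 0.0, LARGE)
    g = np.apply_along_axis(edt_1d_squared, 1, g)  # per row
    g = np.apply_along_axis(edt_1d_squared, 0, g)  # per column
    return np.sqrt(g)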
I want to generate random points on the surface of a cylinder such that the distance between the points falls in the range of 230 to 250. I used the following code to generate random points on the surface of a cylinder:
import random, math

H = 300
R = 20
s = random.random()
#theta = random.random()*2*math.pi
for i in range(0, 300):
    theta = random.random()*2*math.pi
    z = random.random()*H
    r = math.sqrt(s)*R
    x = r*math.cos(theta)
    y = r*math.sin(theta)
    print 'C', x, y, z
How can I generate random points such that they fall within that range (on the surface of the cylinder)?
This is not a complete solution, but an insight that should help. If you "unroll" the surface of the cylinder into a rectangle of width w = 2*pi*r and height h, the task of finding the distance between points is simplified. You have not explained how to measure "distance along the surface" between points on the top of the cylinder and points on the side; this is a slightly tricky bit of geometry.
As for computing the distance along the surface across the artificial "seam" we created: just use both (x1-x2) and (w - x1 + x2); whichever gives the shorter distance is the one you want.
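In code, a minimal sketch of that seam-aware distance (assuming each point is an (x, z) pair on the unrolled rectangle) could be:

import math

def surface_distance(p1, p2, r):
    # p1, p2: (x, z) points on the unrolled rectangle, with x in [0, 2*pi*r)
    w = 2 * math.pi * r
    dx = abs(p1[0] - p2[0])
    dx = min(dx, w - dx)  # take the shorter way around the seam
    return math.hypot(dx, p1[1] - p2[1])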
I do think that @VincentNivoliers' suggestion to use Poisson disk sampling is very good, but with the constraints of h=300 and r=20 you will get terrible results no matter what.
The basic way of creating a set of random points with constraints on the distances between them is to have a function that modulates the probability of a point being placed at a certain location. This function starts out constant, and whenever a point is placed, the forbidden areas surrounding it are set to zero. That is difficult to do with continuous variables, but reasonably easy if you discretize your problem.
The other thing to be careful about is the "being on a cylinder" part. It may be easier to think of it as random points on a rectangular area that repeats periodically. This can be handled in two different ways:

The simplest is to take into consideration not only the rectangular tile where you are placing the points, but also its neighbouring ones. Whenever you place a point in your main tile, you also place one in the neighbouring tiles and compute their effect on the probability function inside your tile.

A more sophisticated approach considers the probability function as the convolution of a kernel encoding the forbidden areas with a sum of delta functions corresponding to the points already placed. If this is computed using FFTs, the periodicity is a natural by-product.
The first approach can be coded as follows:
from __future__ import division
import numpy as np

r, h = 20, 300
w = 2*np.pi*r
int_w = int(np.rint(w))
mult = 10
pdf = np.ones((h*mult, int_w*mult), bool)  # probability function, True = allowed
points = []
min_d, max_d = 230, 250
available_locs = pdf.sum()
while available_locs:
    # draw a random location among the still-allowed ones
    new_idx = np.random.randint(available_locs)
    new_idx = np.nonzero(pdf.ravel())[0][new_idx]
    new_point = np.array(np.unravel_index(new_idx, pdf.shape))
    points += [new_point]
    min_mask = np.ones_like(pdf)
    if max_d is not None:
        max_mask = np.zeros_like(pdf)
    else:
        max_mask = True
    # the point itself, plus copies in the neighbouring tiles to handle wrap-around
    for p in [new_point - [0, int_w*mult], new_point + [0, int_w*mult],
              new_point]:
        rows = ((np.arange(pdf.shape[0]) - p[0]) / mult)**2
        cols = ((np.arange(pdf.shape[1]) - p[1]) * 2*np.pi*r/int_w/mult)**2
        dist2 = rows[:, None] + cols[None, :]
        min_mask &= dist2 > min_d*min_d
        if max_d is not None:
            max_mask |= dist2 < max_d*max_d
    pdf &= min_mask & max_mask
    available_locs = pdf.sum()
points = np.array(points) / [mult, mult*int_w/(2*np.pi*r)]
If you run it with your values, the output is usually just one or two points, as the large minimum distance forbids all others. But if you run it with more reasonable values, e.g.
min_d, max_d = 50, 200
Here's how the probability function looks after placing each of the first 5 points:
Note that the points are returned as pairs of coordinates, the first being the height, the second the distance along the cylinder's circumference.
I have code which creates a square image, 4x4 arcsec in size, running from -2 arcsec to +2 arcsec on an 80x80 grid. To this I want to add another image.
This second image is created through a FFT of an 80x80 grid and thus starts out in Fourier space. After the FFT, I want the image to have exactly the same dimensions in real space as the first image.
Because Fourier space represents the scales, and the wavenumber is defined as k = 2*pi/x (although in this case numpy.fft uses the convention where, I think, k = 1/x), I thought the largest scale would have to have the smallest k-value and the smallest scale the largest k-value.
So if x_max = 2 (the dimensions in the x-direction of the first image) and dim_x = 80 (the number of columns in the grid):
k_x,max = 1/(2*x_max/dim_x)
k_x,min = 1/(2*x_max)
and let the grid in Fourier-space run from k_x,min to k_x,max (same for the y-direction)
I hope I explained this clearly enough, but I haven't been able to find any confirmation or explanation for this in the literature about FFTs, and would really like to know if this is correct.
Thanks in advance
This is not correct. The k-space values will range from -N/2*omega_0 to (N-1)/2*omega_0, where omega_0 is 2*pi divided by the sample length, i.e. omega_0 = 2*pi/(max(x)-min(x)), and N is the number of samples. So for your case you get something along the lines of this:
import numpy as np

N = len(x)
L = x[-1] - x[0]                     # total sample length
omega_0 = 2*np.pi/L
k = omega_0 * (np.arange(N) - N//2)  # integer multiples of omega_0, centred on 0
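numpy can also generate these values directly; a small equivalent sketch (note that np.fft.fftfreq takes the sample spacing, not the total span, and returns frequencies in cycles per unit, hence the factor of 2*pi for angular wavenumbers):

import numpy as np

d = (x[-1] - x[0]) / (len(x) - 1)  # sample spacing
k = 2*np.pi*np.fft.fftshift(np.fft.fftfreq(len(x), d=d))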
I'm working on a piece of software which needs to quantify the "wiggliness" of a set of data. Here's a sample of the input I would receive, merged with the lightness plot of each vertical pixel strip:
It is easy to see that the left margin is really wiggly (i.e. has a ton of minima/maxima), and I want to generate a set of critical points of the image. I've applied a Gaussian smoothing function to the data ~ 10 times, but it seems to be pretty wiggly to begin with.
Any ideas?
Here's my original code, but it does not produce very nice results (for the wiggliness):
def local_maximum(list, center, delta):
    maximum = [0, 0]
    for i in range(delta):
        if list[center + i] > maximum[1]: maximum = [center + i, list[center + i]]
        if list[center - i] > maximum[1]: maximum = [center - i, list[center - i]]
    return maximum

def count_maxima(list, start, end, delta, threshold=10):
    count = 0
    for i in range(start + delta, end - delta):
        if abs(list[i] - local_maximum(list, i, delta)[1]) < threshold: count += 1
    return count

def wiggliness(list, start, end, delta, threshold=10):
    return float(abs(start - end) * delta) / float(count_maxima(list, start, end, delta, threshold))
Take a look at lowpass/highpass/notch/bandpass filters, Fourier transforms, or wavelets. The basic idea is that there are lots of different ways to figure out the frequency content of a signal quantized over different time periods.
If we can figure out what wiggliness means, that would help. I would say the leftmost margin is wiggly because it has more high-frequency content, which you could visualize with a Fourier transform.
If you take a highpass filter of that red signal, you'll get just the high-frequency content, and then you can measure the amplitudes and apply thresholds to determine wiggliness. But I guess wiggliness just needs more formalism behind it.
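As a hedged sketch of that idea (the filter order and cutoff below are arbitrary placeholders, and scipy is assumed to be available):

import numpy as np
from scipy.signal import butter, filtfilt

def wiggliness_highpass(signal, cutoff=0.1, order=4):
    # keep only the high-frequency content; cutoff is a fraction of Nyquist
    b, a = butter(order, cutoff, btype='highpass')
    high = filtfilt(b, a, signal)
    # the RMS amplitude of what remains is one possible wiggliness score
    return np.sqrt(np.mean(high**2))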
For things like these, numpy makes things much easier, as it provides useful functions for manipulating vector data, e.g. adding a scalar to each element, calculating the average value etc.
For example, you might try the zero-crossing rate of either the original data (wiggliness1 below) or the first difference (wiggliness2 below), depending on what wiggliness is supposed to be, exactly; if global trends are to be ignored, you should probably use the difference data. For x you would take the slice or window of interest from the original data, getting a sort of measure of local wiggliness.
If you use the original data, after removing the bias you might also want to set all values smaller than some threshold to 0 to ignore low-amplitude wiggles.
import numpy as np

def wiggliness1(x):
    # remove bias:
    x = x - np.average(x)
    # calculate zero-crossing rate of the data itself:
    return np.sum(np.abs(np.sign(np.diff(np.sign(x)))))

def wiggliness2(x):
    # calculate zero-crossing rate of the first difference:
    return np.sum(np.abs(np.sign(np.diff(np.sign(np.diff(x))))))
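For instance, if the lightness data from the question were stored in a 2D array (lightness is a hypothetical name here, one column per vertical pixel strip), a per-strip score could be computed as:

scores = [wiggliness2(lightness[:, col]) for col in range(lightness.shape[1])]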