OpenCV - Managing thresholds in image processing with python - python

I am new to image processing and I am processing the following image and applying threshold to identify edges with the following code
import cv2
import numpy as np
img = cv2.imread("box.jpg")
img_gray = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
noise_removal = cv2.bilateralFilter(img_gray,9,75,75)
ret,thresh_image = cv2.threshold(noise_removal,0,255,cv2.THRESH_OTSU)
On the left is the original image. In the middle is the gray image calculated by img_gray in the code. On the right is the threshold image calculated by thresh_imgage.
My question is from image 1 and 2 we can see that there is a significant change in the gradient at the corners but in the threshold image it is also including shadow as the part of box object.
I have run the code several times by changing threshold values but did not succeed to get only the box. What am I doing wrong ? Can someone help in this ? Thanks.

You should have considered trying adaptive threshold
adp_th = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,cv2.THRESH_BINARY, 5, 1.8)
This is what I got:
Now playing with the morphological operations mentioned on THIS PAGE you can obtain your desired object.

I just came across another solution regarding selection of optimal thresholds for edge detection. My previous answer was about adaptive threshold of which you know very well.
By optimal I mean choosing a two values (lower and upper thresholds) based on the median value of the gray scale image. The following code shows you how its done:
v = np.median(gray_img)
sigma = 0.33
#---- apply optimal Canny edge detection using the computed median----
lower_thresh = int(max(0, (1.0 - sigma) * v))
upper_thresh = int(min(255, (1.0 + sigma) * v))
edge_img = cv2.Canny(gray_img, lower_thresh, upper_thresh)
cv2.imshow('Edge_of_box',edge_img)
The sigma value of 0.33 is the most optimal value in the field of data science.
Illustration: If you observe a Gaussian curve in statistics, values between 0.33 from both sides of the curve are considered in the distribution. Any value outside these points are assumed to be outliers. Since images are considered to be data, this concept is assumed here as well.
Have a look at this:
Now the second box which you so frequently post:
How can you improve this?
I always wanted to try out the following. Give it a try and do let me know:
First try replacing median value with mean and observe the
results.
Change the sigma value and observe how edge detection changes.
Try performing the above mentioned technique for a small patch of the image. Divide the image into small patches and work your way through. (My way of saying 'Localized edge detection')
There might be better ways to detect out there which I have not come across yet. But this is a great and fun way to start.

Related

Generating a segmentation mask for circular particles from threshold mask?

I am trying to find all the circular particles in the image attached. This is the only image I am have (along with its inverse).
I have read this post and yet I can't use hsv values for thresholding. I have tried using Hough Transform.
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=0.01, minDist=0.1, param1=10, param2=5, minRadius=3,maxRadius=6)
and using the following code to plot
names =[circles]
for nums in names:
color_img = cv2.imread(path)
blue = (211,211,211)
for x, y, r in nums[0]:
cv2.circle(color_img, (x,y), r, blue, 1)
plt.figure(figsize=(15,15))
plt.title("Hough")
plt.imshow(color_img, cmap='gray')
The following code was to plot the mask:
for masks in names:
black = np.zeros(img_gray.shape)
for x, y, r in masks[0]:
cv2.circle(black, (x,y), int(r), 255, -1) # -1 to draw filled circles
plt.imshow(black, gray)
Yet I am only able to get the following mask which if fairly poor.
This is an image of what is considered a particle and what is not.
One simple approach involves slightly eroding the image, to separate touching circular objects, then doing a connected component analysis and discarding all objects larger than some chosen threshold, and finally dilating the image back so the circular objects are approximately of the original size again. We can do this dilation on the labelled image, such that you retain the separated objects.
I'm using DIPlib because I'm most familiar with it (I'm an author).
import diplib as dip
a = dip.ImageRead('6O0Oe.png')
a = a(0) > 127 # the PNG is a color image, but OP's image is binary,
# so we binarize here to simulate OP's condition.
separation = 7 # tweak these two parameters as necessary
size_threshold = 500
b = dip.Erosion(a, dip.SE(separation))
b = dip.Label(b, maxSize=size_threshold)
b = dip.Dilation(b, dip.SE(separation))
Do note that the image we use here seems to be a zoomed-in screen grab rather than the original image OP is dealing with. If so, the parameters must be made smaller to identify the smaller objects in the smaller image.
My approach is based on a simple observation that most of the particles in your image have approximately same perimeter and the "not particles" have greater perimeter than them.
First, have a look at the RANSAC algorithm and how does it find inliers and outliers. It basically is for 2D data but we will have to transform it to 1D data in our case.
In your case, I am calling inliers to the correct particles and Outliers to incorrect particles.
Our data on which we have to work on will be the perimeter of these particles. To get the perimeter, find contours in this image and get the perimeter of each contour. Refer this for information about Contours.
Now we have the data, knowledge about RANSAC algo and our simple observation mentioned above. Now in this data, we have to find the most dense and compact cluster which will contain all the inliers and others will be outliers.
Now let's assume the inliers are in the range of 40-60 and the outliers are beyond 60. Let's define a threshold value T = 0. We say that for each point in the data, inliers for that point are in the range of (value of that point - T, value of that point + T).
Now first iterate over all the points in the data and count number of inliers to that point for a T and store this information. Find the maximum number of inliers possible for a value of T. Now increment the value of T by 1 and again find the maximum number of inliers possible for that T. Repeat these steps by incrementing value of T one by one.
There will be a range of values of T for which Maximum number of inliers are the same. These inliers are the particles in your image and the particles having perimeter greater than these inliers are the outliers thus the "not particles" in your image.
I have tried this algorithm in my test cases which are similar to your and it works. I am always able to determine the outliers. I hope it works for you too.
One last thing, I see that boundary of your particles are irregular and not smooth, try to make them smooth and use this algorithm if this doesn't work for you in this image.

Linear Inverted Image on Opencv

I'm trying to emulate the linear invert function in GIMP with opencv and python. I can't find more information on how that function has been implemented, besides it being used under linear light.Since I read that opencv imports linear BGR images, I proceeded to trying normal inversion on a RGB opencv but I can only replicate the common inversion method on GIMP.
Inversion function:
def negative(image):
img_negative = (255-image)
return img_negative
Original
Linear Inverted (Negative) Image on GIMP
Inverted (Negative) Image on GIMP
Any insight would be appreciated.
It took a little bit of trial and error, but in addition to you inverting the image, you also have to do some scaling and translation as well.
What I did specifically was in addition to inverting the image, I truncated any values that were beyond intensity 153 for every channel and saturated them to 153. After using this intermediate output, I shift the range such that the lowest value gets mapped to 102 and the highest value gets mapped to 255. This is simply done by adding 102 to every value in the intermediate output.
When I did that, I got a similar image to the one you're after.
In other words:
import cv2
import numpy as np
im = cv2.imread('input.png') # Your image goes here
im_neg = 255 - im
im_neg[im_neg >= 153] = 153 # Step #1
im_neg = im_neg + 102 # Step #2
cv2.imshow('Output', np.hstack((im, im_neg)))
cv2.waitKey(0)
cv2.destroyWindow('Output')
Thankfully, the minimum and maximum values are 0 and 255 which makes this process simpler. I get this output and note that I'm concatenating both images together:
Take note your desired image is stored in im_neg. If you want to see just the image by itself:
Compared to yours:
It's not exactly what you see in the output image you provide, especially because there seems to be some noise around the coloured squares, but this is the closest I could get it to and one could argue that the result I produced is better perceptually.
Hope this helps!
Gimp as of 2.10 works in linear color space and if you look at the original source code for the function it's just bitwise not. SO, here's what the code should look like in opencv-python:
import numpy as np
import cv2
#https://www.pyimagesearch.com/2015/10/05/opencv-gamma-correction/
def adjust_gamma(image, gamma=1.0):
# build a lookup table mapping the pixel values [0, 255] to
# their adjusted gamma values
invGamma = 1.0 / gamma
table = np.array([((i / 255.0) ** invGamma) * 255
for i in np.arange(0, 256)]).astype("uint8")
# apply gamma correction using the lookup table
return cv2.LUT(image, table)
def invert_linear(img):
x= adjust_gamma(img, 1/2.2)
x= cv2.bitwise_not(x)
y= adjust_gamma(x, 2.2)
return y

Python OpenCV HoughLinesP Fails to Detect Lines

I am using OpenCV HoughlinesP to find horizontal and vertical lines. It is not finding any lines most of the time. Even when it finds a lines it is not even close to actual image.
import cv2
import numpy as np
img = cv2.imread('image_with_edges.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
flag,b = cv2.threshold(gray,0,255,cv2.THRESH_OTSU)
element = cv2.getStructuringElement(cv2.MORPH_CROSS,(1,1))
cv2.erode(b,element)
edges = cv2.Canny(b,10,100,apertureSize = 3)
lines = cv2.HoughLinesP(edges,1,np.pi/2,275, minLineLength = 100, maxLineGap = 200)[0].tolist()
for x1,y1,x2,y2 in lines:
for index, (x3,y3,x4,y4) in enumerate(lines):
if y1==y2 and y3==y4: # Horizontal Lines
diff = abs(y1-y3)
elif x1==x2 and x3==x4: # Vertical Lines
diff = abs(x1-x3)
else:
diff = 0
if diff < 10 and diff is not 0:
del lines[index]
gridsize = (len(lines) - 2) / 2
cv2.line(img,(x1,y1),(x2,y2),(0,0,255),2)
cv2.imwrite('houghlines3.jpg',img)
Input Image:
Output Image: (see the Red Line):
#ljetibo Try this with:
c_6.jpg
There's quite a bit wrong here so I'll just start from the beginning.
Ok, first thing you do after opening an image is tresholding. I recommend strongly that you have another look at the OpenCV manual on tresholding and the exact meaning of the treshold methods.
The manual mentions that
cv2.threshold(src, thresh, maxval, type[, dst]) → retval, dst
the special value THRESH_OTSU may be combined with one of the above
values. In this case, the function determines the optimal threshold
value using the Otsu’s algorithm and uses it instead of the specified
thresh .
I know it's a bit confusing because you don't actully combine THRESH_OTSU with any of the other methods (THRESH_BINARY etc...), unfortunately that manual can be like that. What this method actually does is it assumes that there's a "foreground" and a "background" that follow a bi-modal histogram and then applies the THRESH_BINARY I believe.
Imagine this as if you're taking an image of a cathedral or a high building mid day. On a sunny day the sky will be very bright and blue, and the cathedral/building will be quite a bit darker. This means the group of pixels belonging to the sky will all have high brightness values, that is will be on the right side of the histogram, and the pixels belonging to the church will be darker, that is to the middle and left side of the histogram.
Otsu uses this to try and guess the right "cutoff" point, called thresh. For your image Otsu's alg. supposes that all that white on the side of the map is the background, and the map itself the foreground. Therefore your image after thresholding looks like this:
After this point it's not hard to guess what goes wrong. But let's go on, What you're trying to achieve is, I believe, something like this:
flag,b = cv2.threshold(gray,160,255,cv2.THRESH_BINARY)
Then you go on, and try to erode the image. I'm not sure why you're doing this, was your intention to "bold" the lines, or was your intention to remove noise. In any case you never assigned the result of erosion to something. Numpy arrays, which is the way images are represented, are mutable but it's not the way the syntax works:
cv2.erode(src, kernel, [optionalOptions] ) → dst
So you have to write:
b = cv2.erode(b,element)
Ok, now for the element and how the erosion works. Erosion drags a kernel over an image. Kernel is a simple matrix with 1's and 0's in it. One of the elements of that matrix, usually centre one, is called an anchor. An anchor is the element that will be replaced at the end of the operation. When you created
cv2.getStructuringElement(cv2.MORPH_CROSS, (1, 1))
what you created is actually a 1x1 matrix (1 column, 1 row). This makes erosion completely useless.
What erosion does, is firstly retrieves all the values of pixel brightness from the original image where the kernel element, overlapping the image segment, has a "1". Then it finds a minimal value of retrieved pixels and replaces the anchor with that value.
What this means, in your case, is that you drag [1] matrix over the image, compare if the source image pixel brightness is larger, equal or smaller than itself and then you replace it with itself.
If your intention was to remove "noise", then it's probably better to use a rectangular kernel over the image. Think of it this way, "noise" is that thing that "doesn't fit in" with the surroundings. So if you compare your centre pixel with it's surroundings and you find it doesn't fit, it's most likely noise.
Additionally, I've said it replaces the anchor with the minimal value retrieved by the kernel. Numerically, minimal value is 0, which is coincidentally how black is represented in the image. This means that in your case of a predominantly white image, erosion would "bloat up" the black pixels. Erosion would replace the 255 valued white pixels with 0 valued black pixels if they're in the reach of the kernel. In any case it shouldn't be of a shape (1,1), ever.
>>> cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
array([[0, 1, 0],
[1, 1, 1],
[0, 1, 0]], dtype=uint8)
If we erode the second image with a 3x3 rectangular kernel we get the image bellow.
Ok, now we got that out of the way, next thing you do is you find edges using Canny edge detection. The image you get from that is:
Ok, now we look for EXACTLY vertical and EXACTLY horizontal lines ONLY. Of course there are no such lines apart from the meridian on the left of the image (is that what it's called?) and the end image you get after you did it right would be this:
Now since you never described your exact idea, and my best guess is that you want the parallels and meridians, you'll have more luck on maps with lesser scale because those aren't lines to begin with, they are curves. Additionally, is there a specific reason to get a Probability Hough done? The "regular" Hough doesn't suffice?
Sorry for the too-long post, hope it helps a bit.
Text here was added as a request for clarification from the OP Nov. 24th. because there's no way to fit the answer into a char limited comment.
I'd suggest OP asks a new question more specific to the detection of curves because you are dealing with curves op, not horizontal and vertical lines.
There are several ways to detect curves but none of them are easy. In the order of simplest-to-implement to hardest:
Use RANSAC algorithm. Develop a formula describing the nature of the long. and lat. lines depending on the map in question. I.e. latitude curves will almost be a perfect straight lines on the map when you're near the equator, with the equator being the perfectly straight line, but will be very curved, resembling circle segments, when you're at high latitudes (near the poles). SciPy already has RANSAC implemented as a class all you have to do is find and the programatically define the model you want to try to fit to the curves. Of course there's the ever-usefull 4dummies text here. This is the easiest because all you have to do is the math.
A bit harder to do would be to create a rectangular grid and then try to use cv findHomography to warp the grid into place on the image. For various geometric transformations you can do to the grid you can check out OpenCv manual. This is sort of a hack-ish approach and might work worse than 1. because it depends on the fact that you can re-create a grid with enough details and objects on it that cv can identify the structures on the image you're trying to warp it to. This one requires you to do similar math to 1. and just a bit of coding to compose the end solution out of several different functions.
To actually do it. There are mathematically neat ways of describing curves as a list of tangent lines on the curve. You can try to fit a bunch of shorter HoughLines to your image or image segment and then try to group all found lines and determine, by assuming that they're tangents to a curve, if they really follow a curve of the desired shape or are they random. See this paper on this matter. Out of all approaches this one is the hardest because it requires a quite a bit of solo-coding and some math about the method.
There could be easier ways, I've never actually had to deal with curve detection before. Maybe there are tricks to do it easier, I don't know. If you ask a new question, one that hasn't been closed as an answer already you might have more people notice it. Do make sure to ask a full and complete question on the exact topic you're interested in. People won't usually spend so much time writing on such a broad topic.
To show you what you can do with just Hough transform check out bellow:
import cv2
import numpy as np
def draw_lines(hough, image, nlines):
n_x, n_y=image.shape
#convert to color image so that you can see the lines
draw_im = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
for (rho, theta) in hough[0][:nlines]:
try:
x0 = np.cos(theta)*rho
y0 = np.sin(theta)*rho
pt1 = ( int(x0 + (n_x+n_y)*(-np.sin(theta))),
int(y0 + (n_x+n_y)*np.cos(theta)) )
pt2 = ( int(x0 - (n_x+n_y)*(-np.sin(theta))),
int(y0 - (n_x+n_y)*np.cos(theta)) )
alph = np.arctan( (pt2[1]-pt1[1])/( pt2[0]-pt1[0]) )
alphdeg = alph*180/np.pi
#OpenCv uses weird angle system, see: http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
if abs( np.cos( alph - 180 )) > 0.8: #0.995:
cv2.line(draw_im, pt1, pt2, (255,0,0), 2)
if rho>0 and abs( np.cos( alphdeg - 90)) > 0.7:
cv2.line(draw_im, pt1, pt2, (0,0,255), 2)
except:
pass
cv2.imwrite("/home/dino/Desktop/3HoughLines.png", draw_im,
[cv2.IMWRITE_PNG_COMPRESSION, 12])
img = cv2.imread('a.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
flag,b = cv2.threshold(gray,160,255,cv2.THRESH_BINARY)
cv2.imwrite("1tresh.jpg", b)
element = np.ones((3,3))
b = cv2.erode(b,element)
cv2.imwrite("2erodedtresh.jpg", b)
edges = cv2.Canny(b,10,100,apertureSize = 3)
cv2.imwrite("3Canny.jpg", edges)
hough = cv2.HoughLines(edges, 1, np.pi/180, 200)
draw_lines(hough, b, 100)
As you can see from the image bellow, straight lines are only longitudes. Latitudes are not as straight therefore for each latitude you have several detected lines that behave like tangents on the line. Blue drawn lines are drawn by the if abs( np.cos( alph - 180 )) > 0.8: while the red drawn lines are drawn by rho>0 and abs( np.cos( alphdeg - 90)) > 0.7 condition. Pay close attention when comparing the original image with the image with lines drawn on it. The resemblance is uncanny (heh, get it?) but because they're not lines a lot of it only looks like junk. (especially that highest detected latitude line that seems like it's too "angled" but in reality those lines make a perfect tangent to the latitude line on its thickest point, just as hough algorithm demands it). Acknowledge that there are limitations to detecting curves with a line detection algorithm

Cross Correlation in numpy, with FFT - strange result?

I'm attempting to perform a cross-correlation of two images using numpy's FFT.
As far as I'm aware, we have that the cross-correlation of two images is equal to the inverseFFT of the multiplication of - Fourier transform of image A, and the complex conjugate of the Fourier transform of image B.
Thus, I have the following code:
img1 = cv2.imread("...jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_RGB2GRAY)
fft1 = numpy.fft.fft2(img1)
# I'm cross correlating the same image with itself
fft2 = fft1.copy()
fft2 = numpy.conj(fft2)
#Element wise multiplication
result = fft1*fft2
result_img = numpy.fft.ifft2(result)
result_img = numpy.abs(result_img) #Remove complex values
#Following images are attached
image_shifted = normalize(numpy.fft.fftshift(result_img))
image_nonshifted = normalize(result_img)
However, my results are rather strange. In order to obtain what I believe to be the actual correlation-result I have to fftshift the result. Here are some example images:
Image, not shifted, you can see bright parts at each corner
Image, shifted, looks much more like what an auto-correlation result should look like (centre point is maximal)
I'm not sure if my code, or expected mathematics is wrong, but I can't quite figure out what's going on!
Any help would be greatly appreciated, thanks.
FFTSHIFT shifts the zero-frequency component to the center of the signal. In this case the signal is an image. A good visual guide is this. If you expand the original output image, you will see something akin to this:
So all the FFTSHIFT is doing is centering around a zero frequency component. Mostly used for visualization purposes.Your original results are mathematically correct, but the axis just aren't centered where you expected them.

Python - matplotlib - imshow - How to influence displayed value of unzoomed image

I need to search outliers in more or less homogeneous images representing some physical array. The images have a resolution which is much higher than the screen resolution. Thus every pixel on screen originates from a block of image pixels. Is there the possibility to customize the algorithm which calculates the displayed value for such a block? Especially the possibility to either use the lowest or the highest value would be helpful.
Thanks in advance
Scipy provides several such filters. To get a new image (new) whose pixels are the maximum/minimum over a w*w block of an original image (img), you can use:
new = scipy.ndimage.filters.maximum_filter(img, w)
new = scipy.ndimage.filters.minimum_filter(img, w)
scipy.ndimage.filters has several other filters available.
If the standard filters don't fit your requirements, you can roll your own. To get you started here is an example that shows how to get the minimum in each block in the image. This function reduces the size of the full image (img) by a factor of w in each direction. It returns a smaller image (new) in which each pixel is the minimum pixel in a w*w block of pixels from the original image. The function assumes the image is in a numpy array:
import numpy as np
def condense(img, w):
new = np.zeros((img.shape[0]/w, img.shape[1]/w))
for i in range(0, img.shape[1]//w):
col1 = i * w
new[:, i] = img[:, col1:col1+w].reshape(-1, w*w).min(1)
return new
If you wanted the maximum, replace min with max.
For the condense function to work well, the size of the full image must be a multiple of w in each direction. The handling of non-square blocks or images that don't divide exactly is left as an exercise for the reader.

Categories