I'm using OpenCV's findHomography function (with RANSAC) in Python to find the transformation between two sets of points.
Looking at the documentation, the output is a mask and a transformation matrix.
The documentation is not clear about what the mask represents, and how the matrix is structured.
Is a 1 in the output mask a point that fits the found transformation or a point that was ignored?
And could you explain the makeup of the 3x3 output transformation matrix?
Thanks in advance and sorry if I missed some documentation which explains this.
Based on my limited search, the mask returned by findHomography() holds the inlier/outlier status of each pair of matched points: a non-zero entry marks a match that fits the found homography (an inlier), and a 0 marks a match that was rejected (an outlier).
This answer addresses your first question.
This answer addresses what a mask is and what are its dimensions.
Well, what do you need to do with the mask? That field is optional, so you don't have to use the mask if you don't need it.
As for the resulting matrix: it is called a homography matrix, or H matrix, and it represents the transformation of one point in an image plane to the same point in another image plane.
X1 = H * X2
The point X1 is the same point (X2) in a different plane. Both points are written in homogeneous coordinates (x, y, 1), and the result is defined up to scale, so you divide by the third component to get pixel coordinates back.
So the H matrix basically describes how a point in, let's say, image 1 maps to the corresponding point in image 2.
I am trying to find all the circular particles in the image attached. This is the only image I have (along with its inverse).
I have read this post and yet I can't use hsv values for thresholding. I have tried using Hough Transform.
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=0.01, minDist=0.1, param1=10, param2=5, minRadius=3,maxRadius=6)
and using the following code to plot
names = [circles]
for nums in names:
    color_img = cv2.imread(path)
    blue = (211, 211, 211)
    for x, y, r in nums[0]:
        # cv2.circle expects integer pixel coordinates and radius
        cv2.circle(color_img, (int(x), int(y)), int(r), blue, 1)
    plt.figure(figsize=(15, 15))
    plt.title("Hough")
    plt.imshow(color_img, cmap='gray')
The following code was to plot the mask:
for masks in names:
    black = np.zeros(img_gray.shape)
    for x, y, r in masks[0]:
        cv2.circle(black, (int(x), int(y)), int(r), 255, -1)  # -1 to draw filled circles
    plt.imshow(black, cmap='gray')
Yet I am only able to get the following mask, which is fairly poor.
This is an image of what is considered a particle and what is not.
One simple approach involves slightly eroding the image, to separate touching circular objects, then doing a connected component analysis and discarding all objects larger than some chosen threshold, and finally dilating the image back so the circular objects are approximately of the original size again. We can do this dilation on the labelled image, such that you retain the separated objects.
I'm using DIPlib because I'm most familiar with it (I'm an author).
import diplib as dip
a = dip.ImageRead('6O0Oe.png')
a = a(0) > 127 # the PNG is a color image, but OP's image is binary,
# so we binarize here to simulate OP's condition.
separation = 7 # tweak these two parameters as necessary
size_threshold = 500
b = dip.Erosion(a, dip.SE(separation))
b = dip.Label(b, maxSize=size_threshold)
b = dip.Dilation(b, dip.SE(separation))
Do note that the image we use here seems to be a zoomed-in screen grab rather than the original image OP is dealing with. If so, the parameters must be made smaller to identify the smaller objects in the smaller image.
My approach is based on a simple observation: most of the particles in your image have approximately the same perimeter, and the "not particles" have a greater perimeter than them.
First, have a look at the RANSAC algorithm and how it finds inliers and outliers. It is usually applied to 2D data, but in our case we have to adapt it to 1D data.
In your case, I am calling the correct particles inliers and the incorrect particles outliers.
The data we have to work on will be the perimeters of these particles. To get them, find the contours in this image and compute the perimeter of each contour. Refer to this for information about contours.
Now we have the data, knowledge of the RANSAC algorithm and the simple observation mentioned above. In this data, we have to find the densest, most compact cluster, which will contain all the inliers; everything else will be outliers.
Now let's assume the inliers are in the range of 40-60 and the outliers are beyond 60. Let's define a threshold value T = 0. We say that for each point in the data, inliers for that point are in the range of (value of that point - T, value of that point + T).
Now first iterate over all the points in the data and count number of inliers to that point for a T and store this information. Find the maximum number of inliers possible for a value of T. Now increment the value of T by 1 and again find the maximum number of inliers possible for that T. Repeat these steps by incrementing value of T one by one.
There will be a range of values of T for which Maximum number of inliers are the same. These inliers are the particles in your image and the particles having perimeter greater than these inliers are the outliers thus the "not particles" in your image.
I have tried this algorithm on my test cases, which are similar to yours, and it works. I am always able to determine the outliers. I hope it works for you too.
One last thing: I see that the boundaries of your particles are irregular and not smooth. If this doesn't work for you on this image, try smoothing them and then run the algorithm again.
This is the continuation of my previous question. I now have an image like this
Here the corners are detected. Now I am trying to estimate the dimensions of the bigger box while smaller black box dimensions are known.
Can anyone guide me what is the best way to estimate the dimensions of the box? I can do it with simple Euclidean distance but I don't know if it is the correct way or not. Or even if it is the correct way then from a list of tuples (coordinates) how can I find distances like A-B or A-D or G-H but not like A-C or A-F?
The sequence has to be preserved in order to get correct dimensions. Also I have two boxes here so when I create list of corners coordinates then it should contain all coordinates from A-J and I don't know which coordinates belong to which box. So how can I preserve that for two different boxes because I want to run this code for more similar images.
Note: The corners in this image are not single points but sets of points, so I clustered each set and averaged it to get a single (x,y) coordinate for each corner.
I have tried my best to explain my questions. Will be extremely glad to have some answers :) Thanks.
For the "How can I find distances like A-B or A-D or G-H but not like A-C or A-F" part:
Here's a quick code, not efficient for images with lots of corners, but for your case it's OK. The idea is to start from the dilated edge image you got in your other question (with only the big box, but the idea is the same for the image where there is also the small box)
then for every possible combination of corners, you look at a few points on an imaginary line between them, and then you check if these points actually fall on a real line in the image.
import cv2
import numpy as np
# get an intermediate point on the line between p1 and p2
# for example, calling this function with (p1, p2, 3) returns the point
# on the line between p1 and p2, at 1/3 of the distance from p1
def get_intermediate_point(p1, p2, ratio):
    # round to integers so the result can be used as a pixel index
    return [int(round(p1[0] + (p2[0] - p1[0]) / ratio)),
            int(round(p1[1] + (p2[1] - p1[1]) / ratio))]
# open the dilated edge image (dilated_edges holds the path to that file)
img = cv2.imread(dilated_edges, 0)
# corners you got from your segmentation and other question
corners = [[29, 94], [102, 21], [184, 52], [183, 547], [101, 576], [27, 509]]
nb_corners = len(corners)
# intermediate points between corners you are going to test
ratios = [2, 4, 6, 8]  # in this example, the middle point, the quarter point, etc
nb_ratios = len(ratios)
# list which will contain all connected corners
connected_corners = []
# double loop for going through all possible pairs of corners
for i in range(nb_corners - 1):
    for j in range(i + 1, nb_corners):
        cpt = 0
        c1 = corners[i]; c2 = corners[j]
        # testing every intermediate point between the selected corners
        for ratio in ratios:
            p = get_intermediate_point(c1, c2, ratio)
            # checking if these points fall on a white pixel in the image
            # (img is indexed [row, col]; swap p[0] and p[1] if your corners are stored as (x, y))
            if img[p[0], p[1]] == 255:
                cpt += 1
        # if enough of the intermediate points fall on a white pixel
        if cpt >= int(nb_ratios * 0.75):
            # then we assume that the 2 corners are indeed connected by a line
            connected_corners.append([i, j])
print(connected_corners)
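Once the connected pairs are known, the edge lengths in pixels are plain Euclidean distances. A minimal sketch, where edge_lengths is a hypothetical helper and the pairs in connected_pairs are made up for illustration (they are not the output of the code above):

```python
import numpy as np

def edge_lengths(corners, connected):
    """Euclidean length in pixels of every connected corner pair."""
    pts = np.asarray(corners, dtype=float)
    return {(i, j): float(np.linalg.norm(pts[i] - pts[j])) for i, j in connected}

corners = [[29, 94], [102, 21], [184, 52], [183, 547], [101, 576], [27, 509]]
connected_pairs = [[0, 1], [1, 2], [2, 3]]  # hypothetical pairs, for illustration only
lengths = edge_lengths(corners, connected_pairs)
```

These are image distances only. Turning them into real dimensions still needs a scale, for example from the small box of known size, and that only holds for edges that are affected by perspective in a comparable way.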
In general you cannot, since any reconstruction is only up to scale.
Basically, given a calibrated camera and 6 2D-points (6x2=12) you want to find 6 3D points + scale = 6x3+1=19. There aren't enough equations.
In order to do so, you will have to make some assumptions and insert them into the equations.
For example:
The box edges are perpendicular to each other (which means that every 2 neighboring points share at least one coordinate value).
You need to assume that you know the height of the bottom points, i.e. they are on the same plane as your calibration box (this will give you the Z of the visible bottom points).
Hopefully, these constraints are enough to give you at least as many equations as unknowns, so that you can solve the linear equation set.
I want to detect the rectangle from an image.
I used cv2.findContours() with cv2.convexHull() to filter out the irregular polygon.
Afterwards, I will use the length of hull to determine whether the contour is a rectangle or not.
hull = cv2.convexHull(contour, returnPoints=True)
if len(hull) == 4:
    return True
However, sometimes, the convexHull() will return an array with length 5.
If I am using the criterion above, I will miss this rectangle.
For example,
After using cv2.Canny()
By using the methods above, I will get the hull :
[[[819 184]]
[[744 183]]
[[745 145]]
[[787 145]]
[[819 146]]]
Here is my question: Given an array (Convex Hull) with length 5, how can I determine whether it is actually referring to a quadrilateral? Thank you.
=====================================================================
updated:
After using Sobel X and Y direction,
sobelxy = cv2.Sobel(img_inversion, cv2.CV_8U, 1, 1, ksize=3)
I got:
Well,
This is not the right way to extract rectangles. Since we are talking basics here, I would suggest that you invert the image, apply Sobel in the X and Y directions, and then run the findContours function. With this you will get a lot of rectangles that you can filter. You will have to apply a lot of checks to identify the rectangle containing text. Also, I don't understand why you want to force-select a rectangle with length 5. You are limiting the scale.
Secondly, another way is to use the Sobel X and Y image and then apply OpenCV's LineSegmentDetector. Once you have all the line segments, apply RANSAC for a quad fit: the condition here should be that all the angles on a set of randomly chosen intersecting lines are (roughly) acute. Finally, filter out the quad ROI with text (for this use SWT or other reliable techniques).
As for your query, you should ideally select quads with length 4 (points).
Ref: Crop the largest rectangle using OpenCV
This link will give you the gist of detecting the rectangle in a very simple way.
The images below give you a sort of walkthrough for the inversion and Sobel of the image. Inverting the image eliminates the double boundaries you get from Sobel.
For inversion, use the tilde operator.
Also, before taking the inversion, it is better to suppress the illumination artifacts. This can be done using homomorphic filtering or by taking the log of the image.
It isn't so easy to fit a rectangle to a convex polygon.
You can try to find the minimum area or minimum perimeter rectangle by rotating calipers (https://en.wikipedia.org/wiki/Rotating_calipers).
Then by comparing the areas/perimeters of the hull and the rectangle, you can assess "rectangularity".
I'm attempting to perform a cross-correlation of two images using numpy's FFT.
As far as I'm aware, the cross-correlation of two images is equal to the inverse FFT of the product of the Fourier transform of image A and the complex conjugate of the Fourier transform of image B.
Thus, I have the following code:
img1 = cv2.imread("...jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR, not RGB
fft1 = numpy.fft.fft2(img1)
# I'm cross correlating the same image with itself
fft2 = fft1.copy()
fft2 = numpy.conj(fft2)
#Element wise multiplication
result = fft1*fft2
result_img = numpy.fft.ifft2(result)
result_img = numpy.abs(result_img) #Remove complex values
#Following images are attached
image_shifted = normalize(numpy.fft.fftshift(result_img))
image_nonshifted = normalize(result_img)
However, my results are rather strange. In order to obtain what I believe to be the actual correlation-result I have to fftshift the result. Here are some example images:
Image, not shifted, you can see bright parts at each corner
Image, shifted, looks much more like what an auto-correlation result should look like (centre point is maximal)
I'm not sure if my code, or expected mathematics is wrong, but I can't quite figure out what's going on!
Any help would be greatly appreciated, thanks.
FFTSHIFT shifts the zero-frequency component to the center of the signal. In this case the signal is an image. A good visual guide is this. If you expand the original output image, you will see something akin to this:
So all the FFTSHIFT is doing is centering around the zero component (index [0, 0], which for your correlation result is the zero-shift term). It is mostly used for visualization purposes. Your original results are mathematically correct, but the axes just aren't centered where you expected them.
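A small sketch on a random array (standing in for a real image, which is an assumption made only to keep the example self-contained) shows the effect: the auto-correlation peak sits at index [0, 0] before the shift and at the center afterwards.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8))  # random stand-in for a grayscale image

f = np.fft.fft2(img)
corr = np.fft.ifft2(f * np.conj(f)).real  # circular auto-correlation

# Location of the maximum before and after fftshift.
peak_raw = tuple(int(v) for v in np.unravel_index(corr.argmax(), corr.shape))
peak_shifted = tuple(int(v) for v in
                     np.unravel_index(np.fft.fftshift(corr).argmax(), corr.shape))
```

The values themselves are unchanged by the shift; only their positions in the array move.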
I would like to determine an angle from an image (2D array).
I can get the coordinates of the point whose intensity is maximum with "unravel_index" and "argmax", but I would like to know how to get another point whose intensity is high in order to calculate my angle.
I have to automate this because I have a great number of images for post-processing.
So for the first coordinates, I can do this:
import numpy as np
from numpy import unravel_index
t = unravel_index(eyy.argmax(), eyy.shape)
And I need another set of coordinates in order to calculate my angle...
t2 = ....
theta = np.arctan2(t[0]-t2[0],t[1]-t2[1])
What you could try is to look into the Hough Transform (Wikipedia - Hough Transform). The Hough Transform is a tool developed for finding lines and their orientation in images.
There is a Python implementation of the Hough Transform over at Rosetta Code.
I'm not sure if the lines in your data are distinct enough for the Hough Transform to yield good results but I hope it helps.
You can put your array in a masked array, find the pixel with the maximum intensity, then mask it, then find the next pixel with the maximum intensity.
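A minimal sketch of the masked-array idea, using a small made-up array in place of your eyy data:

```python
import numpy as np

# Made-up intensity array standing in for eyy.
eyy = np.array([[0.1, 0.2, 0.3],
                [0.9, 0.0, 0.4],
                [0.2, 0.8, 0.1]])

masked = np.ma.masked_array(eyy, mask=np.zeros(eyy.shape, dtype=bool))

# Brightest pixel.
t = tuple(int(v) for v in np.unravel_index(masked.argmax(), masked.shape))

# Mask it, then look for the next brightest pixel.
masked[t] = np.ma.masked
t2 = tuple(int(v) for v in np.unravel_index(masked.argmax(), masked.shape))

theta = np.arctan2(t[0] - t2[0], t[1] - t2[1])
```

On real images the second brightest pixel is often a direct neighbour of the first, so you may want to mask a small neighbourhood around t instead of a single pixel before searching again.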