I am trying to use opencv with python. I wrote a descriptor (SIFT, SURF, or ORB) matching code in C++ version of opencv 2.4. I want to convert this code to opencv with python. I found some documents about how to use opencv functions in c++ but many of the opencv function in python I could not find how to use them. Here is my python code, and my current problem is that I don't know how to use "drawMatches" of opencv c++ in python. I found cv2.DRAW_MATCHES_FLAGS_DEFAULT but I have no idea how to use it. Here is my python code of matching using ORB descriptors:
im1 = cv2.imread(r'C:\boldt.jpg')
im2 = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
im3 = cv2.imread(r'C:\boldt_resize50.jpg')
im4 = cv2.cvtColor(im3, cv2.COLOR_BGR2GRAY)
orbDetector2 = cv2.FeatureDetector_create("ORB")
orbDescriptorExtractor2 = cv2.DescriptorExtractor_create("ORB")
orbDetector4 = cv2.FeatureDetector_create("ORB")
orbDescriptorExtractor4 = cv2.DescriptorExtractor_create("ORB")
keypoints2 = orbDetector2.detect(im2)
(keypoints2, descriptors2) = orbDescriptorExtractor2.compute(im2,keypoints2)
keypoints4 = orbDetector4.detect(im4)
(keypoints4, descriptors4) = orbDescriptorExtractor4.compute(im4,keypoints4)
matcher = cv2.DescriptorMatcher_create('BruteForce-Hamming')
raw_matches = matcher.match(descriptors2, descriptors4)
img_matches = cv2.DRAW_MATCHES_FLAGS_DEFAULT(im2, keypoints2, im4, keypoints4, raw_matches)
cv2.namedWindow("Match")
cv2.imshow( "Match", img_matches);
Error message of the line "img_matches = cv2.DRAW_MATCHES_FLAGS_DEFAULT(im2, keypoints2, im4, keypoints4, raw_matches)"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'long' object is not callable
I spent much time search documentation and examples of using opencv functions with python. However, I am very frustrated because there is very little information of using opencv functions in python. It will be extremely helpful if anyone can teach me where I can find the documentation of how to use every function of the opencv module in python. I appreciate your time and help.
I've also written something myself that just uses the OpenCV Python interface and I didn't use scipy. drawMatches is part of OpenCV 3.0.0 and isn't part of OpenCV 2, which is what I'm currently using. Even though I'm late to the party, here's my own implementation that mimics drawMatches to the best of my ability.
I've provided my own images where one is of a camera man, and the other one is the same image but rotated by 55 degrees counter-clockwise.
The basic premise of what I wrote is that I allocate an output RGB image where the amount of rows is the maximum of the two images to accommodate for placing both of the images in the output image and the columns are simply the summation of both the columns together. I place each image in their corresponding spots, then run through a loop of all of the matched keypoints. I extract which keypoints matched between the two images, then extract their (x,y) co-ordinates. I then draw circles at each of the detected locations, then draw a line connecting these circles together.
Bear in mind that the detected keypoint in the second image is with respect to its own co-ordinate system. If you want to place this in the final output image, you need to offset the column co-ordinate by the amount of columns from the first image so that the column co-ordinate is with respect to the co-ordinate system of the output image.
Without further ado:
import numpy as np
import cv2
def drawMatches(img1, kp1, img2, kp2, matches):
"""
My own implementation of cv2.drawMatches as OpenCV 2.4.9
does not have this function available but it's supported in
OpenCV 3.0.0
This function takes in two images with their associated
keypoints, as well as a list of DMatch data structure (matches)
that contains which keypoints matched in which images.
An image will be produced where a montage is shown with
the first image followed by the second image beside it.
Keypoints are delineated with circles, while lines are connected
between matching keypoints.
img1,img2 - Grayscale images
kp1,kp2 - Detected list of keypoints through any of the OpenCV keypoint
detection algorithms
matches - A list of matches of corresponding keypoints through any
OpenCV keypoint matching algorithm
"""
# Create a new output image that concatenates the two images together
# (a.k.a) a montage
rows1 = img1.shape[0]
cols1 = img1.shape[1]
rows2 = img2.shape[0]
cols2 = img2.shape[1]
out = np.zeros((max([rows1,rows2]),cols1+cols2,3), dtype='uint8')
# Place the first image to the left
out[:rows1,:cols1,:] = np.dstack([img1, img1, img1])
# Place the next image to the right of it
out[:rows2,cols1:cols1+cols2,:] = np.dstack([img2, img2, img2])
# For each pair of points we have between both images
# draw circles, then connect a line between them
for mat in matches:
# Get the matching keypoints for each of the images
img1_idx = mat.queryIdx
img2_idx = mat.trainIdx
# x - columns
# y - rows
(x1,y1) = kp1[img1_idx].pt
(x2,y2) = kp2[img2_idx].pt
# Draw a small circle at both co-ordinates
# radius 4
# colour blue
# thickness = 1
cv2.circle(out, (int(x1),int(y1)), 4, (255, 0, 0), 1)
cv2.circle(out, (int(x2)+cols1,int(y2)), 4, (255, 0, 0), 1)
# Draw a line in between the two points
# thickness = 1
# colour blue
cv2.line(out, (int(x1),int(y1)), (int(x2)+cols1,int(y2)), (255, 0, 0), 1)
# Show the image
cv2.imshow('Matched Features', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
To illustrate that this works, here are the two images that I used:
I used OpenCV's ORB detector to detect the keypoints, and used the normalized Hamming distance as the distance measure for similarity as this is a binary descriptor. As such:
import numpy as np
import cv2
img1 = cv2.imread('cameraman.png') # Original image
img2 = cv2.imread('cameraman_rot55.png') # Rotated image
# Create ORB detector with 1000 keypoints with a scaling pyramid factor
# of 1.2
orb = cv2.ORB(1000, 1.2)
# Detect keypoints of original image
(kp1,des1) = orb.detectAndCompute(img1, None)
# Detect keypoints of rotated image
(kp2,des2) = orb.detectAndCompute(img2, None)
# Create matcher
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Do matching
matches = bf.match(des1,des2)
# Sort the matches based on distance. Least distance
# is better
matches = sorted(matches, key=lambda val: val.distance)
# Show only the top 10 matches
drawMatches(img1, kp1, img2, kp2, matches[:10])
This is the image I get:
you can visualize the feature matching in Python as following. Note the use of scipy library.
# matching features of two images
import cv2
import sys
import scipy as sp
if len(sys.argv) < 3:
print 'usage: %s img1 img2' % sys.argv[0]
sys.exit(1)
img1_path = sys.argv[1]
img2_path = sys.argv[2]
img1 = cv2.imread(img1_path, cv2.CV_LOAD_IMAGE_GRAYSCALE)
img2 = cv2.imread(img2_path, cv2.CV_LOAD_IMAGE_GRAYSCALE)
detector = cv2.FeatureDetector_create("SURF")
descriptor = cv2.DescriptorExtractor_create("BRIEF")
matcher = cv2.DescriptorMatcher_create("BruteForce-Hamming")
# detect keypoints
kp1 = detector.detect(img1)
kp2 = detector.detect(img2)
print '#keypoints in image1: %d, image2: %d' % (len(kp1), len(kp2))
# descriptors
k1, d1 = descriptor.compute(img1, kp1)
k2, d2 = descriptor.compute(img2, kp2)
print '#keypoints in image1: %d, image2: %d' % (len(d1), len(d2))
# match the keypoints
matches = matcher.match(d1, d2)
# visualize the matches
print '#matches:', len(matches)
dist = [m.distance for m in matches]
print 'distance: min: %.3f' % min(dist)
print 'distance: mean: %.3f' % (sum(dist) / len(dist))
print 'distance: max: %.3f' % max(dist)
# threshold: half the mean
thres_dist = (sum(dist) / len(dist)) * 0.5
# keep only the reasonable matches
sel_matches = [m for m in matches if m.distance < thres_dist]
print '#selected matches:', len(sel_matches)
# #####################################
# visualization of the matches
h1, w1 = img1.shape[:2]
h2, w2 = img2.shape[:2]
view = sp.zeros((max(h1, h2), w1 + w2, 3), sp.uint8)
view[:h1, :w1, :] = img1
view[:h2, w1:, :] = img2
view[:, :, 1] = view[:, :, 0]
view[:, :, 2] = view[:, :, 0]
for m in sel_matches:
# draw the keypoints
# print m.queryIdx, m.trainIdx, m.distance
color = tuple([sp.random.randint(0, 255) for _ in xrange(3)])
cv2.line(view, (int(k1[m.queryIdx].pt[0]), int(k1[m.queryIdx].pt[1])) , (int(k2[m.trainIdx].pt[0] + w1), int(k2[m.trainIdx].pt[1])), color)
cv2.imshow("view", view)
cv2.waitKey()
As the error message says, DRAW_MATCHES_FLAGS_DEFAULT is of type 'long'. It is a constant defined by the cv2 module, not a function. Unfortunately, the function you want, 'drawMatches' only exists in OpenCV's C++ interface.
Related
I'm a novice at openCV, currently i'm following this tutorial on image alignment, i have the following image and template for testing
scanned image(test_image.jpg):
template image(template.jpg):
and the following python code:
from __future__ import print_function
import cv2
import numpy as np
MAX_FEATURES = 500
GOOD_MATCH_PERCENT = 0.15
def alignImages(im1, im2):
# Convert images to grayscale
im1Gray = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
im2Gray = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)
# Detect ORB features and compute descriptors.
orb = cv2.ORB_create(MAX_FEATURES)
keypoints1, descriptors1 = orb.detectAndCompute(im1Gray, None)
keypoints2, descriptors2 = orb.detectAndCompute(im2Gray, None)
# Match features.
matcher = cv2.DescriptorMatcher_create(
cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
matches = list(matcher.match(descriptors1, descriptors2, None))
# Sort matches by score
matches.sort(key=lambda x: x.distance, reverse=False)
# Remove not so good matches
numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
matches = matches[:numGoodMatches]
# Draw top matches
imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None)
cv2.imwrite("matches.jpg", imMatches)
# Extract location of good matches
points1 = np.zeros((len(matches), 2), dtype=np.float32)
points2 = np.zeros((len(matches), 2), dtype=np.float32)
for i, match in enumerate(matches):
points1[i, :] = keypoints1[match.queryIdx].pt
points2[i, :] = keypoints2[match.trainIdx].pt
# Find homography
h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
# Use homography
height, width, channels = im2.shape
im1Reg = cv2.warpPerspective(im1, h, (width, height))
return im1Reg, h
if __name__ == '__main__':
# Read reference image
refFilename = "template.jpg"
print("Reading reference image : ", refFilename)
imReference = cv2.imread(refFilename, cv2.IMREAD_COLOR)
# Read image to be aligned
imFilename = "test_image.jpg"
print("Reading image to align : ", imFilename)
im = cv2.imread(imFilename, cv2.IMREAD_COLOR)
print("Aligning images ...")
# Registered image will be resotred in imReg.
# The estimated homography will be stored in h.
imReg, h = alignImages(im, imReference)
# Write aligned image to disk.
outFilename = "aligned.jpg"
print("Saving aligned image : ", outFilename)
cv2.imwrite(outFilename, imReg)
# Print estimated homography
print("Estimated homography : \n", h)
I get the following results after i ran the script:
matches.jpg:
UPDATE:
I was able to get the image when i increase the amount of orb features to 2000
aligned.jpg
But the homography is still not rotating the image, how can i rotate the image to the same position as the template?
There are two types of forms to finding a homography (forward and backward), but if you already found the homography, applying it can be done without using opencv as follows:
import numpy as np
from scipy.interpolate import griddata
# creating the homogenious coordinates
src_h, src_w, _ = src_image.shape
values = np.matrix.reshape(src_image, (-1, 3), order='F')
yy, xx = np.meshgrid(np.arange(src_h), np.arange(src_w))
input_flat = np.concatenate((xx.reshape((1, -1)), yy.reshape((1, -1)), np.ones_like(xx.reshape((1, -1)))), axis=0)
# applying the homography and converting back to homogenious coordinates
points = np.matmul(homography, input_flat)
points_homogeneous = points[0:2, :] / points[2, :]
# interpolating the result to nicely fit the grid coordinates
dst_image_shape = [400, 400] # could be any number here
yy, xx = np.meshgrid(np.arange(dst_image_shape[1]), np.arange(dst_image_shape[0]))
src_image_warp = griddata(np.transpose(points_homogeneous ), values_relevant, (yy, xx), method='linear')
#numerical rounding
src_image_warp[np.isnan(src_image_warp)] = 0
src_image_warp[src_image_warp > 255] = 255
src_image_warp = np.uint8(src_image_warp)
Note that this is done for a 1 channel image, for RGB image this has to be done for each channel searately. In addition, this could be made to run faster by interpolating only the relevant coordinates since the interpolation is the most time-consuming operation.
With opencv this can be done by:
import cv2
image_dst = cv2.warpPerspective(image_src, homography, size) # size is a tuple (width, height) of the destination image
Read more on homographies and the opencv implementation here.
Finding the homography
The homography can be found without using opencv but that requires knowlage in linear algebra adn the explanation is a bit lengthy, if needed I will post it as an edit. For any practical case however, the homography can be found using opencv as follows:
homography, status = cv2.findHomography(pts_src, pts_dst)
where pts_src are coordinates in the original image and pts_dst are their matching location in the destination image. Since you already found the point pairs, this will yield you the homography (opencv optimizes the hmography for minimal distortion in the backward operation which is the correct way to perform homography computations).
You have a homography h calculated from findHomography and you can use warpPerspective to transform the template to have the same perspective as the photo.
Now you just need to invert the homography, and apply it to the photo instead of the template.
Either use np.linalg.inv for that, or pass the WARP_INVERSE_MAP flag to warpPerspetive instead.
Hi I'm trying to create an OCR where the model should be able to read an uploaded document. However, lot of times, the documents uploaded are skewed or tilted. I plan to straighten and/or resize the document based on a template.
To achieve this, I intend to use feature mapping and homography. However, whenever I calculate my keypoints and descriptors (using ORB), and try to match them using Brute Force Matching, none of the features seem to match. Here's the code that I've used so far and the results with it. Can someone point me in the right direction if I'm missing something or doing it in a certain incorrect way?
def straighten_image(ORIG_IMG, IMG2):
# read both the images:
orig_image = cv2.imread(ORIG_IMG)
img_input = cv2.imread(IMG2)
orig_gray_scale = cv2.cvtColor(orig_image, cv2.COLOR_BGR2GRAY)
gray_scale_img = cv2.cvtColor(img_input, cv2.COLOR_BGR2GRAY)
#Detect ORB features and compute descriptors
MAX_NUM_FEATURES = 100
orb = cv2.ORB_create(MAX_NUM_FEATURES)
keypoints1, descriptors1 = orb.detectAndCompute(orig_gray_scale, None)
keypoints2, descriptors2= orb.detectAndCompute(gray_scale_img, None)
#display image with keypoints
orig_wid_decriptors = cv2.drawKeypoints(orig_gray_scale, keypoints1, outImage = np.array([]), color= (255, 0, 0), flags= cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
inp_wid_decriptors = cv2.drawKeypoints(img_input, keypoints2, outImage = np.array([]), color= (255, 0, 0), flags= cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
#Match features
matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
matches = matcher.match(descriptors1, descriptors2, None)
print(type(matches))
#sort matches
# matches.sort(key=lambda x: x.distance, reverse=False)
#Remove not-so-good matches
numGoodMatches = int(len(matches)*0.1)
matches = matches[:numGoodMatches]
#Draw Top matches
im_matches = cv2.drawMatches(orig_gray_scale, keypoints1, gray_scale_img, keypoints2, matches, None)
cv2.imshow("", im_matches)
cv2.waitKey(0)
#Homography
points1 = np.zeros((len(matches), 2), dtype = np.float32)
points2 = np.zeros((len(matches), 2), dtype = np.float32)
for i, match in enumerate(matches):
points1[i, :] = keypoints1[match.queryIdx].pt
points2[i, :] = keypoints2[match.trainIdx].pt
#Find homography:
h, mask = cv2.findHomography(points2, points1, cv2.RANSAC)
#Warp image
# Use homography to warp image
height, width = orig_gray_scale.shape
inp_reg = cv2.warpPerspective(gray_scale_img, h, (width, height), borderValue = 255)
return inp_reg
import cv2
import matplotlib.pyplot as plt
import numpy as np
template = "template_aadhaar.jpg"
test = "test.jpeg"
str_img = straighten_image(template, test)
cv2.imshow("", str_img)
cv2.waitKey(0)
EDIT: If I use my own ID-card (perfectly straight) as the template and try to align the same ID-card that is tilted, it matches the features and re-aligns the tilted image perfectly. However, I need the model to be able to re-align any other ID-card based on the template. By any ID, I mean the details could be different but the location and font would be exactly the same.
EDIT#2: As suggested by #Olli, I tried using a template with only those features that are same for all Aadhaar cards. Image attached. But still the feature matching is a bit arbitrary.
Feature mapping tries to detect the most significant features on an image and tries to match them. This only works if the features really are the same. If the features are similar but different, it will fail.
If you have some features that are always the same (e.g. the logo on the top left), you could try to create a template with only these features and blank in all other areas, i.e. remove the person and the name and the QR code and...
But because there are more differences ("Government of India inside the green area on image and above on the other,...) than similarities, I would try to find the rotation based on the corners and/or the edges of the shape.
For example:
convert to grayscale
perform canny edge detection
detect corners, e.g. using cv2.goodFeaturesToTrack. If some corners are hidden, try finding the sides using Hough lines instead.
undistort
If some images are rotated 90, 180 or 270 degrees after undistortion, you could use a filter to find the orange and green areas and rotate so that this area is at the top again.
I am trying to learn OpenCV in order to improve a script I wrote for comparing engineering drawings. I am using the code (see below) found on this tutorial but I am having zero success with it. In the tutorial the author uses the example of a blank form for the reference image and a photo of the completed form as the image to align. My situation is very similar because I am attempting to use a blank drawing title block as my reference image and a scanned image of a drawing as my image to align.
My goal is to use OpenCV to clean up the scanned engineering drawings so that they are aligned properly but no matter what I try in the MAX_FEATURES and GOOD_MATCH_PERCENT parameters, I get an image that looks like a black and white star burst. Also, when I review the "matches.jpg" file generated by the script, it appears that there are no correct matches. I have tried multiple drawings and I get the same results.
Can anyone see a reason why this script would not work in the way I am trying to use it?
from __future__ import print_function
import cv2
import numpy as np
MAX_FEATURES = 500
GOOD_MATCH_PERCENT = 0.15
def alignImages(im1, im2):
# Convert images to grayscale
im1Gray = cv2.cvtColor(im1, cv2.COLOR_BGR2GRAY)
im2Gray = cv2.cvtColor(im2, cv2.COLOR_BGR2GRAY)
# Detect ORB features and compute descriptors.
orb = cv2.ORB_create(MAX_FEATURES)
keypoints1, descriptors1 = orb.detectAndCompute(im1Gray, None)
keypoints2, descriptors2 = orb.detectAndCompute(im2Gray, None)
# Match features.
matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
matches = matcher.match(descriptors1, descriptors2, None)
# Sort matches by score
matches.sort(key=lambda x: x.distance, reverse=False)
# Remove not so good matches
numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
matches = matches[:numGoodMatches]
# Draw top matches
imMatches = cv2.drawMatches(im1, keypoints1, im2, keypoints2, matches, None)
cv2.imwrite("matches.jpg", imMatches)
# Extract location of good matches
points1 = np.zeros((len(matches), 2), dtype=np.float32)
points2 = np.zeros((len(matches), 2), dtype=np.float32)
for i, match in enumerate(matches):
points1[i, :] = keypoints1[match.queryIdx].pt
points2[i, :] = keypoints2[match.trainIdx].pt
# Find homography
h, mask = cv2.findHomography(points1, points2, cv2.RANSAC)
# Use homography
height, width, channels = im2.shape
im1Reg = cv2.warpPerspective(im1, h, (width, height))
return im1Reg, h
if __name__ == '__main__':
# Read reference image
refFilename = "form.jpg"
print("Reading reference image : ", refFilename)
imReference = cv2.imread(refFilename, cv2.IMREAD_COLOR)
# Read image to be aligned
imFilename = "scanned-form.jpg"
print("Reading image to align : ", imFilename);
im = cv2.imread(imFilename, cv2.IMREAD_COLOR)
print("Aligning images ...")
# Registered image will be resotred in imReg.
# The estimated homography will be stored in h.
imReg, h = alignImages(im, imReference)
# Write aligned image to disk.
outFilename = "aligned.jpg"
print("Saving aligned image : ", outFilename);
cv2.imwrite(outFilename, imReg)
# Print estimated homography
print("Estimated homography : \n", h)
Template Image:
Image to Align:
Expected output Image:
Here is one way in Python/OpenCV using a Rigid Affine Transformation (scale, rotation and translation only - no skew or perspective) to warp one image to match the other. It uses findTransformECC() -- Enhanced Correlation Coefficient Maximization) -- to get the rotation matrix and then uses warpAffine to do the rigid warping.
Template:
Image to be warped:
import cv2
import numpy as np
import math
import sys
# Get the image files from the command line arguments
# These are full paths to the images
# image2 will be warped to match image1
# argv[0] is name of script
image1 = sys.argv[1]
image2 = sys.argv[2]
outfile = sys.argv[3]
# Read the images to be aligned
# im2 is to be warped to match im1
im1 = cv2.imread(image1);
im2 = cv2.imread(image2);
# Convert images to grayscale for computing the rotation via ECC method
im1_gray = cv2.cvtColor(im1,cv2.COLOR_BGR2GRAY)
im2_gray = cv2.cvtColor(im2,cv2.COLOR_BGR2GRAY)
# Find size of image1
sz = im1.shape
# Define the motion model - euclidean is rigid (SRT)
warp_mode = cv2.MOTION_EUCLIDEAN
# Define 2x3 matrix and initialize the matrix to identity matrix I (eye)
warp_matrix = np.eye(2, 3, dtype=np.float32)
# Specify the number of iterations.
number_of_iterations = 5000;
# Specify the threshold of the increment
# in the correlation coefficient between two iterations
termination_eps = 1e-3;
# Define termination criteria
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, number_of_iterations, termination_eps)
# Run the ECC algorithm. The results are stored in warp_matrix.
(cc, warp_matrix) = cv2.findTransformECC (im1_gray, im2_gray, warp_matrix, warp_mode, criteria, None, 1)
# Warp im2 using affine
im2_aligned = cv2.warpAffine(im2, warp_matrix, (sz[1],sz[0]), flags=cv2.INTER_LINEAR + cv2.WARP_INVERSE_MAP);
# write output
cv2.imwrite(outfile, im2_aligned)
# Print rotation angle
row1_col0 = warp_matrix[0,1]
angle = math.degrees(math.asin(row1_col0))
print(angle)
Result:
Resulting Angle of Rotation (in deg):
-0.3102187026194794
Note, you can change the background color in the affineWarp to white if desired.
Also make the termination epsilon smaller by an order of magnitude or two for more accuracy, but longer processing times.
The other Rigid Affine approach that I mentioned in my comments earlier is to use ORB feature matching, filter the key points, then use estimateAffinePartial2D() to get the rigid affine matrix. Then use that to warp the image. For large angles this seems to me to be more reliable than the ECC method. But the ECC method seems more accurate for small rotations.
import cv2
import numpy as np
import math
import sys
MAX_FEATURES = 10000
GOOD_MATCH_PERCENT = 0.15
DIFFY_THRESH = 2
# Get the image files from the command line arguments
# These are full paths to the images
# image[2] will be warped to match image[1]
# argv[0] is name of script
file1 = sys.argv[1]
file2 = sys.argv[2]
outFile = sys.argv[3]
# Read image1
image1 = cv2.imread(file1, cv2.IMREAD_COLOR)
# Read image2 to be warped to match image1
image2 = cv2.imread(file2, cv2.IMREAD_COLOR)
# Convert images to grayscale
image1Gray = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
image2Gray = cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY)
# Detect ORB features and compute descriptors.
orb = cv2.ORB_create(MAX_FEATURES)
keypoints1, descriptors1 = orb.detectAndCompute(image1Gray, None)
keypoints2, descriptors2 = orb.detectAndCompute(image2Gray, None)
# Match features.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(descriptors1, descriptors2, None)
# Sort matches by score
matches.sort(key=lambda x: x.distance, reverse=False)
# Remove not so good matches
numGoodMatches = int(len(matches) * GOOD_MATCH_PERCENT)
matches = matches[:numGoodMatches]
#print('numgood',numGoodMatches)
# Extract location of good matches and filter by diffy if rotation is small
points1 = np.zeros((len(matches), 2), dtype=np.float32)
points2 = np.zeros((len(matches), 2), dtype=np.float32)
for i, match in enumerate(matches):
points1[i, :] = keypoints1[match.queryIdx].pt
points2[i, :] = keypoints2[match.trainIdx].pt
# initialize empty arrays for newpoints1 and newpoints2 and mask
newpoints1 = np.empty(shape=[0, 2], dtype=np.float32)
newpoints2 = np.empty(shape=[0, 2], dtype=np.float32)
matches_Mask = [0] * len(matches)
count=0
for i in range(len(matches)):
pt1 = points1[i]
pt2 = points2[i]
pt1x, pt1y = zip(*[pt1])
pt2x, pt2y = zip(*[pt2])
diffy = np.float32( np.float32(pt2y) - np.float32(pt1y) )
if abs(diffy) < DIFFY_THRESH:
newpoints1 = np.append(newpoints1, [pt1], axis=0).astype(np.uint8)
newpoints2 = np.append(newpoints2, [pt2], axis=0).astype(np.uint8)
matches_Mask[i]=1
count += 1
# Find Affine Transformation
# note swap of order of newpoints here so that image2 is warped to match image1
m, inliers = cv2.estimateAffinePartial2D(newpoints2,newpoints1)
# Use affine transform to warp im2 to match im1
height, width, channels = image1.shape
image2Reg = cv2.warpAffine(image2, m, (width, height))
# Write aligned image to disk.
cv2.imwrite(outFile, image2Reg)
# Print angle
row1_col0 = m[1,0]
print('row1_col0:',row1_col0)
angle = math.degrees(math.asin(row1_col0))
print('angle', angle)
Result Image:
Result Rotation Angle:
-0.6123936361765413
After some trial and error I determined that I don't need to find a homography in order to align my images properly. Since my images only need to be scaled and rotated slightly, my best option is to find the outer most points of the drawing title block and align one image to the other with a transform.
My approach is to use the Harris corner finding function to find all of the corners on the drawing, then do a simple calculation to find the points that are the shortest distance to the corners of the drawing canvas (these are the outside corners of the drawing title block). I then take 3 of the points (top left, top right, and bottom left) and use a transform to scale/rotate one drawing to the other.
Below is the code that I used:
import cv2
import numpy as np
import math
img1 = cv2.imread('reference.jpg')
img2 = cv2.imread('to-be-aligned.jpg')
#Find the corner points of img1
h1,w1,c=img1.shape
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray1 = np.float32(gray1)
dst1 = cv2.cornerHarris(gray1,5,3,0.04)
ret1, dst1 = cv2.threshold(dst1,0.1*dst1.max(),255,0)
dst1 = np.uint8(dst1)
ret1, labels1, stats1, centroids1 = cv2.connectedComponentsWithStats(dst1)
criteria1 = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.001)
corners1 = cv2.cornerSubPix(gray1,np.float32(centroids1),(5,5),(-1,-1),criteria1)
#Find the corner points of img2
h2,w2,c=img2.shape
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
gray2 = np.float32(gray2)
dst2 = cv2.cornerHarris(gray2,5,3,0.04)
ret2, dst2 = cv2.threshold(dst2,0.1*dst2.max(),255,0)
dst2 = np.uint8(dst2)
ret2, labels2, stats2, centroids2 = cv2.connectedComponentsWithStats(dst2)
criteria2 = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.001)
corners2 = cv2.cornerSubPix(gray2,np.float32(centroids2),(5,5),(-1,-1),criteria2)
#Find the top left, top right, and bottom left outer corners of the drawing frame for img1
a1=[0,0]
b1=[w1,0]
c1=[0,h1]
a1_dist=[]
b1_dist=[]
c1_dist=[]
for i in corners1:
temp_a1=math.sqrt((i[0]-a1[0])**2+(i[1]-a1[1])**2)
temp_b1=math.sqrt((i[0]-b1[0])**2+(i[1]-b1[1])**2)
temp_c1=math.sqrt((i[0]-c1[0])**2+(i[1]-c1[1])**2)
a1_dist.append(temp_a1)
b1_dist.append(temp_b1)
c1_dist.append(temp_c1)
print("Image #1 (reference):")
print("Top Left:")
print(corners1[a1_dist.index(min(a1_dist))])
print("Top Right:")
print(corners1[b1_dist.index(min(b1_dist))])
print("Bottom Left:")
print(corners1[c1_dist.index(min(c1_dist))])
#Find the top left, top right, and bottom left outer corners of the drawing frame for img2
a2=[0,0]
b2=[w2,0]
c2=[0,h2]
a2_dist=[]
b2_dist=[]
c2_dist=[]
for i in corners2:
temp_a2=math.sqrt((i[0]-a2[0])**2+(i[1]-a2[1])**2)
temp_b2=math.sqrt((i[0]-b2[0])**2+(i[1]-b2[1])**2)
temp_c2=math.sqrt((i[0]-c2[0])**2+(i[1]-c2[1])**2)
a2_dist.append(temp_a2)
b2_dist.append(temp_b2)
c2_dist.append(temp_c2)
print("Image #2 (image to align):")
print("Top Left:")
print(corners2[a2_dist.index(min(a2_dist))])
print("Top Right:")
print(corners2[b2_dist.index(min(b2_dist))])
print("Bottom Left:")
print(corners2[c2_dist.index(min(c2_dist))])
#Create the points for img1
point1 = np.zeros((3,2), dtype=np.float32)
point1[0][0]=corners1[a1_dist.index(min(a1_dist))][0]
point1[0][1]=corners1[a1_dist.index(min(a1_dist))][1]
point1[1][0]=corners1[b1_dist.index(min(b1_dist))][0]
point1[1][1]=corners1[b1_dist.index(min(b1_dist))][1]
point1[2][0]=corners1[c1_dist.index(min(c1_dist))][0]
point1[2][1]=corners1[c1_dist.index(min(c1_dist))][1]
#Create the points for img2
point2 = np.zeros((3,2), dtype=np.float32)
point2[0][0]=corners2[a2_dist.index(min(a2_dist))][0]
point2[0][1]=corners2[a2_dist.index(min(a2_dist))][1]
point2[1][0]=corners2[b2_dist.index(min(b2_dist))][0]
point2[1][1]=corners2[b2_dist.index(min(b2_dist))][1]
point2[2][0]=corners2[c2_dist.index(min(c2_dist))][0]
point2[2][1]=corners2[c2_dist.index(min(c2_dist))][1]
#Make sure points look ok:
print(point1)
print(point2)
#Transform the image
m = cv2.getAffineTransform(point2,point1)
image2Reg = cv2.warpAffine(img2, m, (w1, h1), borderValue=(255,255,255))
#Highlight found points in red:
img1[dst1>0.1*dst1.max()]=[0,0,255]
img2[dst2>0.1*dst2.max()]=[0,0,255]
#Output the images:
cv2.imwrite("output-img1-harris.jpg", img1)
cv2.imwrite("output-img2-harris.jpg", img2)
cv2.imwrite("output-harris-transform.jpg",image2Reg)
I am trying to use template matching to find an equation inside a given pdf document that is generated from LaTeX. When I use the code over here, I get only a very good matching when I crop the picture from the original page (converted to jpeg or png), however when I compile the equation code separately and generate an jpg/png output of it the matching goes wrong tremendously.
I believe the reason is relevant to the resolution, but since I am an amateur in this field, I cannot reasonably make the jpg generated out of standalone equation to have the same pixel structure of the jpg for the whole page. Here is the code that is copied (more or less) from the above-mentioned website of OpenCV, which is an implementation for python:
import cv2
from PIL import Image
img = cv2.imread('location of the original image', 0)
img2 = img.copy()
template = cv2.imread('location of the patch I look for',0)
w, h = template.shape[::-1]
# All the 6 methods for comparison in a list
methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR',
'cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED']
method = eval(methods[0])
# Apply template Matching
res = cv2.matchTemplate(img,template,method)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
# If the method is TM_SQDIFF or TM_SQDIFF_NORMED, take minimum
if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
top_left = min_loc
else:
top_left = max_loc
bottom_right = (top_left[0] + w, top_left[1] + h)
print top_left, bottom_right
img = Image.open('location of the original image')
#cropping the original image with the found coordinates to make a qualitative comparison
cropped = img.crop((top_left[0], top_left[1], bottom_right[0], bottom_right[1]))
cropped.save('location to save the cropped image using the coordinates found by template matching')
Here is a sample page that I look for the first equation:
The code to generate a specific standalone equation is as follows:
\documentclass[preview]{standalone}
\usepackage{amsmath}
\begin{document}\begin{align*}
(\mu_1+\mu_2)(\emptyset) = \mu_1(\emptyset) + \mu_2(\emptyset) = 0 + 0 =0
\label{eq_0}
\end{align*}
\end{document}
Which I compile and later trim the white space around the equation either using pdfcrop or using .image() method in PythonMagick. Template matching generated with this trimmed output on the original page does not give a reasonable result. Here is the trimmed/converted output using pdfcrop/Mac's Preview.app:
.
Cropping directly the equation from the above page works perfectly. I would appreciate some explanation and help.
EDIT:
I also found the following which uses template matching by bruteforcing different possible scales:
http://www.pyimagesearch.com/2015/01/26/multi-scale-template-matching-using-python-opencv/
However since I am willing to process as many as 1000 of documents, this seems a very slow method to go for. Plus I imagine there should be a more logical way of handling it, by somehow finding the relevant scales.
Instead of template matching you could use features, i.e. keypoints with descriptors. They are scale- invariant, so you do not need to iterate over different scaled versions of the image.
The python example find_obj.py
provieded with OpenCV works with ORB features for your given example.
python find_obj.py --feature=brisk rB4Yy_big.jpg ZjBAA.jpg
Note that I did not use the cropped version of the formula to search for, but a version with some white pixels around it, so the keypoint detection can work correctly. There needs to be some space around it, because keypoints have to be completely inside the image. Otherwise the descriptors can not be calculated.
The big image is the original from your post.
One additional remark: You will always get some matches. If the formula image you are searching for is not present in the big images, the matches will be nonsensical. If you need to sort out these false positives, you have the following options:
Check if the average distance of the resulting DMatches is small enough.
Check if the transformation matrix can be calculated.
Edit: Since you asked for it, here is a version that draws the bounding box around the found formula instead of the matches:
#!/usr/bin/env python
# Python 2/3 compatibility
from __future__ import print_function
import numpy as np
import cv2
def init_feature():
detector = cv2.BRISK_create()
norm = cv2.NORM_HAMMING
matcher = cv2.BFMatcher(norm)
return detector, matcher
def filter_matches(kp1, kp2, matches, ratio = 0.75):
mkp1, mkp2 = [], []
for m in matches:
if len(m) == 2 and m[0].distance < m[1].distance * ratio:
m = m[0]
mkp1.append( kp1[m.queryIdx] )
mkp2.append( kp2[m.trainIdx] )
p1 = np.float32([kp.pt for kp in mkp1])
p2 = np.float32([kp.pt for kp in mkp2])
kp_pairs = zip(mkp1, mkp2)
return p1, p2, kp_pairs
def explore_match(win, img1, img2, kp_pairs, status = None, H = None):
h1, w1 = img1.shape[:2]
h2, w2 = img2.shape[:2]
vis = np.zeros((max(h1, h2), w1+w2), np.uint8)
vis[:h1, :w1] = img1
vis[:h2, w1:w1+w2] = img2
vis = cv2.cvtColor(vis, cv2.COLOR_GRAY2BGR)
if H is not None:
corners = np.float32([[0, 0], [w1, 0], [w1, h1], [0, h1]])
corners = np.int32( cv2.perspectiveTransform(corners.reshape(1, -1, 2), H).reshape(-1, 2) + (w1, 0) )
cv2.polylines(vis, [corners], True, (0, 0, 255))
cv2.imshow(win, vis)
return vis
if __name__ == '__main__':
img1 = cv2.imread('rB4Yy_big.jpg' , 0)
img2 = cv2.imread('ZjBAA.jpg', 0)
detector, matcher = init_feature()
kp1, desc1 = detector.detectAndCompute(img1, None)
kp2, desc2 = detector.detectAndCompute(img2, None)
raw_matches = matcher.knnMatch(desc1, trainDescriptors = desc2, k = 2)
p1, p2, kp_pairs = filter_matches(kp1, kp2, raw_matches)
if len(p1) >= 4:
H, status = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)
print('%d / %d inliers/matched' % (np.sum(status), len(status)))
vis = explore_match('find_obj', img1, img2, kp_pairs, status, H)
cv2.waitKey()
cv2.destroyAllWindows()
else:
print('%d matches found, not enough for homography estimation' % len(p1))
The problem with template matching is that it only works in very controlled environments. Meaning that it will work perfectly if you take the template from the actual image, but it won't work if the resolution is different or even if the image is a little bit turned.
I would suggest you finding another algorithm more suitable for this problem. In OpenCV docs you can find some specific algorithms for your problem.
I have two images that I want to align by using openCV. One of the images is a green band of true color imagery, the other is a NIR image of almost the same area (offset is about 180 pixels). For this alignment I want to use python-opencv 3.0 and the ORB algorithm. I use the following script to create the KNNmatches:
img1 = cv2.imread('rgb.png',1)
img2 = cv2.imread('nir.png',0)
img1=img1[:,:,1]
orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1,None)
kp2, des2 = orb.detectAndCompute(img2,None)
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1,des2, k=2)
good = []
for m,n in matches:
if m.distance < 0.75*n.distance:
good.append([m])
img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,good,flags=2,outImg=None)
However, when I do that with my images I get just a few matches with the following images:
Would anyone of you know how I could best align these images? Thank you in advance and apologies if this was posted in the wrong forum.
The next step is to extract the keypoint locations from your "good matches", as use these to calculate a 3x3 transformation matrix that will transform the corners of one image to the other.
For this case, lets say that we want to transform img2 to align with img1. First we extract locations of good matches:
pts1 = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
pts2 = np.float32([kp2[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
Then we find the transformation matrix:
M = cv2.findHomography(pts2, pts1)
Finally, we can apply the transformation:
warpedImg2 = cv2.warpPerspective(img2, M, img1.shape)
Here is a great resource on feature detection in OpenCV using Python.