I'm chasing a little assistance with an idea I'm playing with. I want to take the features located in an image with code similar to the example on
See sample image at bottom of page here
Last section/Example is the one I'm talking about
In particular, for my issue I want to use the matches indicated in the image to find the target in the scene image, as illustrated, with a seemingly simple addition: I want to draw a bounding box around the target when it is located in the scene frame.
Example of output I'm after
Rather than just putting a bounding box around the matched features, I would rather have a list of the four contour points that represent the transformed target on the scene frame, if that makes sense.
Big picture: I want to take the subsection of the scene image containing my target, crop it out of the scene image, mask the non-target areas out of the remaining image and then use this as the source for a further process.
At this point I've managed to do everything I need with a hard-coded set of points representing the corners of the target image as rotated and transformed in the scene image, so everything works; I just need an example of how to determine the x,y coordinates of each corner of the target in that scene.
I didn't want to post my code as it's a bit clunky, and it's the concept I'm after, not a complete 'do it for me please' fix.
Any advice much appreciated. If you could show me, using the example code attached, how to do this I'd be very grateful. Cheers.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img1 = cv2.imread('box.png',0) # queryImage
img2 = cv2.imread('box_in_scene.png',0) # trainImage
# Initiate SIFT detector
sift = cv2.SIFT()  # OpenCV 2.4.x API; on 3.x with opencv-contrib this would be cv2.xfeatures2d.SIFT_create()
# find the keypoints and descriptors with SIFT
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)
# FLANN parameters
FLANN_INDEX_KDTREE = 0
index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
search_params = dict(checks=50) # or pass empty dictionary
flann = cv2.FlannBasedMatcher(index_params,search_params)
matches = flann.knnMatch(des1,des2,k=2)
# Need to draw only good matches, so create a mask
matchesMask = [[0, 0] for i in range(len(matches))]
# ratio test as per Lowe's paper
for i, (m, n) in enumerate(matches):
    if m.distance < 0.7 * n.distance:
        matchesMask[i] = [1, 0]
draw_params = dict(matchColor = (0,255,0),
                   singlePointColor = (255,0,0),
                   matchesMask = matchesMask,
                   flags = 0)
img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,matches,None,**draw_params)
plt.imshow(img3,),plt.show()
You need to find the perspective transform between the two images.
Create a set of corresponding coordinates from the matched features.
For example, if you find that feature FtI1 in image 1 corresponds to FtJ1 in image 2, then you know that the coordinate of FtI1 (xi, yi) corresponds to the coordinate of FtJ1 (xj, yj), and you have this for all the corresponding features.
Once you have the list of corresponding coordinates between the two images you can calculate the perspective transform with OpenCV's getPerspectiveTransform (or, since you will have many noisy correspondences, findHomography with RANSAC).
Finally, apply the transformation you found to the 4 corner coordinates of the enclosing shape in the first image to get the coordinates of the enclosing shape in the second image. The OpenCV function for transforming a set of points is perspectiveTransform (warpPerspective is its counterpart for warping a whole image).
An example of how to do that in OpenCV is in:
http://docs.opencv.org/3.1.0/da/d6e/tutorial_py_geometric_transformations.html
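To make that concrete, here is a minimal sketch of the last two steps, continuing from the matching code above (it reuses kp1, kp2, matches and the grayscale img1/img2; findHomography with RANSAC is used rather than getPerspectiveTransform so that bad matches are rejected):

# keep only the matches that passed Lowe's ratio test
good = [m for m, n in matches if m.distance < 0.7 * n.distance]

if len(good) >= 4:
    # corresponding coordinates in the query (target) and train (scene) images
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # robust perspective transform mapping the target into the scene
    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)

    # the four corners of the target image...
    h, w = img1.shape
    corners = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)

    # ...projected into the scene: these are the four x,y points you are after
    scene_corners = cv2.perspectiveTransform(corners, M)
    print(scene_corners.reshape(-1, 2))

    # optional: draw the outline of the located target on the scene image
    outlined = cv2.polylines(img2.copy(), [np.int32(scene_corners)], True, 255, 3)

From there, cv2.boundingRect(np.int32(scene_corners)) gives a rectangle for the crop, and cv2.fillPoly with the same four points can build the mask that blanks out the non-target areas.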
Related
Hi, I'm trying to create an OCR pipeline where the model should be able to read an uploaded document. However, a lot of the time the uploaded documents are skewed or tilted. I plan to straighten and/or resize the document based on a template.
To achieve this, I intend to use feature matching and homography. However, whenever I compute keypoints and descriptors (using ORB) and try to match them with brute-force matching, none of the features seem to match. Here's the code I've used so far and the results it gives. Can someone point me in the right direction if I'm missing something or doing it incorrectly?
def straighten_image(ORIG_IMG, IMG2):
    # read both the images:
    orig_image = cv2.imread(ORIG_IMG)
    img_input = cv2.imread(IMG2)

    orig_gray_scale = cv2.cvtColor(orig_image, cv2.COLOR_BGR2GRAY)
    gray_scale_img = cv2.cvtColor(img_input, cv2.COLOR_BGR2GRAY)

    # Detect ORB features and compute descriptors
    MAX_NUM_FEATURES = 100
    orb = cv2.ORB_create(MAX_NUM_FEATURES)
    keypoints1, descriptors1 = orb.detectAndCompute(orig_gray_scale, None)
    keypoints2, descriptors2 = orb.detectAndCompute(gray_scale_img, None)

    # display image with keypoints
    orig_wid_decriptors = cv2.drawKeypoints(orig_gray_scale, keypoints1, outImage=np.array([]), color=(255, 0, 0), flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
    inp_wid_decriptors = cv2.drawKeypoints(img_input, keypoints2, outImage=np.array([]), color=(255, 0, 0), flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

    # Match features
    matcher = cv2.DescriptorMatcher_create(cv2.DESCRIPTOR_MATCHER_BRUTEFORCE_HAMMING)
    matches = matcher.match(descriptors1, descriptors2, None)
    print(type(matches))

    # sort matches by descriptor distance so the best ones come first
    matches = sorted(matches, key=lambda x: x.distance)

    # Remove not-so-good matches
    numGoodMatches = int(len(matches) * 0.1)
    matches = matches[:numGoodMatches]

    # Draw top matches
    im_matches = cv2.drawMatches(orig_gray_scale, keypoints1, gray_scale_img, keypoints2, matches, None)
    cv2.imshow("", im_matches)
    cv2.waitKey(0)

    # Homography
    points1 = np.zeros((len(matches), 2), dtype=np.float32)
    points2 = np.zeros((len(matches), 2), dtype=np.float32)
    for i, match in enumerate(matches):
        points1[i, :] = keypoints1[match.queryIdx].pt
        points2[i, :] = keypoints2[match.trainIdx].pt

    # Find homography (maps the input image onto the template):
    h, mask = cv2.findHomography(points2, points1, cv2.RANSAC)

    # Use homography to warp the input image into the template's frame
    height, width = orig_gray_scale.shape
    inp_reg = cv2.warpPerspective(gray_scale_img, h, (width, height), borderValue=255)
    return inp_reg
import cv2
import matplotlib.pyplot as plt
import numpy as np
template = "template_aadhaar.jpg"
test = "test.jpeg"
str_img = straighten_image(template, test)
cv2.imshow("", str_img)
cv2.waitKey(0)
EDIT: If I use my own ID-card (perfectly straight) as the template and try to align the same ID-card that is tilted, it matches the features and re-aligns the tilted image perfectly. However, I need the model to be able to re-align any other ID-card based on the template. By any ID, I mean the details could be different but the location and font would be exactly the same.
EDIT#2: As suggested by @Olli, I tried using a template with only those features that are the same for all Aadhaar cards. Image attached. But the feature matching is still a bit arbitrary.
Feature matching tries to detect the most significant features on an image and tries to match them. This only works if the features really are the same. If the features are similar but different, it will fail.
If you have some features that are always the same (e.g. the logo on the top left), you could try to create a template with only these features and blank out all other areas, i.e. remove the person and the name and the QR code and...
But because there are more differences ("Government of India" is inside the green area in one image and above it in the other, ...) than similarities, I would try to find the rotation based on the corners and/or the edges of the card shape.
For example (a rough sketch follows the list):
convert to grayscale
perform canny edge detection
detect corners, e.g. using cv2.goodFeaturesToTrack. If some corners are hidden, try finding the sides using Hough lines instead.
undistort
If some images are rotated 90, 180 or 270 degrees after undistortion, you could use a filter to find the orange and green areas and rotate so that this area is at the top again.
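To illustrate those steps, here is a rough sketch of the grayscale/Canny/corner/undistort sequence. The file name, the Canny thresholds, the goodFeaturesToTrack parameters and the output size are all placeholders, and whether the four strongest corners really are the card's corners depends heavily on the image, so treat this as a starting point rather than a working solution:

import cv2
import numpy as np

img = cv2.imread('card.jpg')                     # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# edge map so that the corner detector reacts to the card outline
edges = cv2.Canny(gray, 50, 150)

# four strongest, well separated corners on the edge map
# (if some corners are hidden, cv2.HoughLinesP on `edges` plus line intersections would be the fallback)
corners = cv2.goodFeaturesToTrack(edges, maxCorners=4, qualityLevel=0.01, minDistance=100)
if corners is None or len(corners) < 4:
    raise RuntimeError("could not find four corners; tune the parameters")
corners = corners.reshape(-1, 2).astype(np.float32)

# order them top-left, top-right, bottom-right, bottom-left
s = corners.sum(axis=1)
d = np.diff(corners, axis=1).ravel()
ordered = np.float32([corners[np.argmin(s)],     # top-left: smallest x + y
                      corners[np.argmin(d)],     # top-right: smallest y - x
                      corners[np.argmax(s)],     # bottom-right: largest x + y
                      corners[np.argmax(d)]])    # bottom-left: largest y - x

# undistort onto the template geometry (output size is an assumption)
W, H = 856, 540
target = np.float32([[0, 0], [W - 1, 0], [W - 1, H - 1], [0, H - 1]])
M = cv2.getPerspectiveTransform(ordered, target)
undistorted = cv2.warpPerspective(img, M, (W, H))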
I have to detect same-sized and same-colored rectangles with the same areas in an image for a project. This is an example image.
I don't know how to go about it. I am using OpenCV and Python, which I am new to.
I tried the SIFT and SURF feature descriptors to find the similar features. I also tried template matching, but it is not feasible in this case as the trainImage could change. The main idea is to get those similar rectangles from the image provided.
I am using python3 and openCV3.
I took this code from the opencv tutorial site.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img1 = cv2.imread('template.jpg',0) # queryImage
img2 = cv2.imread('input.jpg',0) # trainImage
sift=cv2.xfeatures2d.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1,None)
kp2, des2 = sift.detectAndCompute(img2,None)
# BFMatcher with default params
bf = cv2.BFMatcher()
matches = bf.knnMatch(des1,des2, k=2)
# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good.append([m])
# cv2.drawMatchesKnn expects list of lists as matches.
img3 = cv2.drawMatchesKnn(img1,kp1,img2,kp2,good,None,flags=2)
Result:
Here's a simple approach (a rough sketch follows the outline below).
generate a list of the unique colours in the image
for each unique colour
make everything that colour in the image white and everything else black
run findContours() and compare shapes and sizes
end for
For increased fun, do each colour in a separate thread :-)
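As a sketch of that loop (the file name comes from the question's code, the area and similarity thresholds are guesses, and it assumes the rectangles are drawn in flat colours as in the example image):

import cv2
import numpy as np

img = cv2.imread('input.jpg')                    # trainImage from the question

# every unique BGR colour in the image (reasonable for flat-colour images)
colours = np.unique(img.reshape(-1, 3), axis=0)

shapes = []                                      # (colour, contour, area) triples
for colour in colours:
    # everything of this colour becomes white, everything else black
    mask = cv2.inRange(img, colour, colour)
    # [-2] keeps this working with both the OpenCV 3 and OpenCV 4 return signatures
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    for cnt in contours:
        area = cv2.contourArea(cnt)
        if area > 100:                           # ignore tiny specks (threshold is a guess)
            shapes.append((tuple(colour), cnt, area))

# compare every pair: same colour, similar area, similar shape
for i in range(len(shapes)):
    for j in range(i + 1, len(shapes)):
        c1, cnt1, a1 = shapes[i]
        c2, cnt2, a2 = shapes[j]
        same_colour = c1 == c2
        similar_area = abs(a1 - a2) / max(a1, a2) < 0.05
        similar_shape = cv2.matchShapes(cnt1, cnt2, 1, 0.0) < 0.01   # method 1 = CONTOURS_MATCH_I1
        if same_colour and similar_area and similar_shape:
            print("shapes %d and %d match" % (i, j))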
I want to find dim edges using Python.
Input images (100 X 100) :
It consists of several horizontal boards: top, middle, bottom.
I want to find the middle board's bounding box, like:
I used several edge detection methods (prewitt_x, sobel_x, cv2.findContours) but they don't detect it well, because the edge between the black region and the board region is dim.
How can I find a bounding box like the red box?
The code below is an example using prewitt_x and cv2.findContours:
import cv2
import numpy as np
img = cv2.imread('my_dir/my_img.bmp', 0)  # loaded as grayscale

# prewitt_x
kernelx = np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]])
img_prewittx = cv2.filter2D(img, -1, kernelx)
img_prewittx_gray = img_prewittx  # already single-channel, so no BGR2GRAY conversion is needed

cv2.imwrite('my_outdir/my_outimg.bmp', img_prewittx)

# cv2.findContours (OpenCV 3.x signature with three return values)
image, contours, hierarchy = cv2.findContours(img_prewittx_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
rects = [cv2.boundingRect(cnt) for cnt in contours]
print(rects)
In fact, I'd rather not use slower methods like the Canny detector.
Help me :)
My suggestion (a short sketch follows below):
use a simple edge detection filter such as Prewitt
project horizontally (sum of the pixels in every row)
analyze the resulting profile to detect the regions of low/high activity and delimit the desired slabs.
You can also try the maximum along rows instead of the sum.
But don't expect miracles, this is a hard problem.
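A small sketch of that profile idea, reusing the image path from the question; the threshold on the profile is a guess and the final step of turning the active rows into the middle board's box is left as pseudocode, since it depends on the actual profile:

import cv2
import numpy as np

img = cv2.imread('my_dir/my_img.bmp', 0)         # path from the question, grayscale

# horizontal Prewitt filter emphasises horizontal edges
kernelx = np.array([[1, 1, 1], [0, 0, 0], [-1, -1, -1]])
edges = cv2.filter2D(img, -1, kernelx)

# horizontal projection: one activity value per row
profile = edges.sum(axis=1).astype(np.float64)
# profile = edges.max(axis=1)                    # alternative: maximum along each row

# rows whose edge activity is high (threshold is a guess, tune per image)
active_rows = np.where(profile > 0.5 * profile.max())[0]
print("candidate edge rows:", active_rows)

# from here: group the active rows, pick the pair that brackets the middle board,
# and use (0, top_row, width, bottom_row - top_row) as the bounding box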
I'm building an image similarity program and, as I am a beginner in CV, I talked with an expert who gave me the following recommended steps to get the really basic functionality:
Extract keypoints (DoG, Harris, etc.) and local invariant descriptors (SIFT, SURF, etc.) from all images.
Cluster them to form a codebook (bag of visual words dictionary; BOVW)
Quantize the features from each image into a BOVW histogram
Compare the BOVW histograms for each image (typically using chi-squared, cosine, or euclidean distance)
Step number one is easy, but I start getting confused at step 2. This is the code I've written so far:
import cv2
import numpy as np
dictionarySize = 20
BOW = cv2.BOWKMeansTrainer(dictionarySize)
for imgpath in ['testimg/testcropped1.jpg', 'testimg/testcropped2.jpg', 'testimg/testcropped3.jpg']:
    img = cv2.imread(imgpath)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dst = cv2.cornerHarris(gray, 2, 3, 0.04)

    sift = cv2.xfeatures2d.SIFT_create()
    kp = sift.detect(gray, None)
    kp, des = sift.compute(img, kp)

    img = cv2.drawKeypoints(gray, kp, img)
    cv2.imwrite('%s_keypoints.jpg' % imgpath, img)

    BOW.add(des)
I extract some features using SIFT and then try to add each image's descriptors to the BOVW trainer. The problem is that I have no idea whether this is correct, or how to get the histograms.
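For what it's worth, here is a hedged sketch of how steps 2-4 could continue from the code above, using BOW.cluster() to build the codebook and cv2.BOWImgDescriptorExtractor to turn each image into a histogram (the image list is the same as above; the choice of L2 matching and Euclidean distance is just one option):

# step 2: cluster all collected descriptors into the visual vocabulary (codebook)
vocabulary = BOW.cluster()                       # shape: (dictionarySize, 128) for SIFT

# step 3: quantize each image's descriptors into a BOVW histogram
sift = cv2.xfeatures2d.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)
bow_extractor = cv2.BOWImgDescriptorExtractor(sift, matcher)
bow_extractor.setVocabulary(vocabulary)

histograms = {}
for imgpath in ['testimg/testcropped1.jpg', 'testimg/testcropped2.jpg', 'testimg/testcropped3.jpg']:
    gray = cv2.cvtColor(cv2.imread(imgpath), cv2.COLOR_BGR2GRAY)
    kp = sift.detect(gray, None)
    histograms[imgpath] = bow_extractor.compute(gray, kp)   # shape: (1, dictionarySize)

# step 4: compare two histograms, e.g. with Euclidean distance (smaller = more similar)
d = np.linalg.norm(histograms['testimg/testcropped1.jpg'] - histograms['testimg/testcropped2.jpg'])
print(d)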
I need to precisely align two images. To do that I am using the Enhanced Correlation Coefficient (ECC), which gives me great results except for images that are rotated a lot. For example, if the reference image (base image) and the tested image (the one I want to align) are rotated by 90 degrees relative to each other, the ECC method doesn't work, which is expected according to the documentation of findTransformECC(), which says:
Note that if images undergo strong displacements/rotations, an initial transformation that roughly aligns the images is necessary (e.g., a simple euclidean/similarity transform that allows for the images showing the same image content approximately).
So I have to use a feature-point-based alignment method to do some rough alignment first. I tried both SIFT and ORB and I am facing the same problem with both: it works fine for some images, while for others the resulting transformation is shifted or rotated to the wrong side.
These are input images:
I thought the problem was caused by wrong matches, but if I use just the 10 keypoints with the smallest distance, all of them look like good matches to me (I get exactly the same result when I use 100 keypoints).
This is the result of matching:
This is the result:
If you compare it with the rotated image, it is shifted to the right and upside down.
What am I missing?
This is my code:
# Initiate detector
orb = cv2.ORB_create()
# find the keypoints with ORB
kp_base = orb.detect(base_gray, None)
kp_test = orb.detect(test_gray, None)
# compute the descriptors with ORB
kp_base, des_base = orb.compute(base_gray, kp_base)
kp_test, des_test = orb.compute(test_gray, kp_test)
# Debug print
base_keypoints = cv2.drawKeypoints(base_gray, kp_base, color=(0, 0, 255), flags=0, outImage=base_gray)
test_keypoints = cv2.drawKeypoints(test_gray, kp_test, color=(0, 0, 255), flags=0, outImage=test_gray)
output.debug_show("Base image keypoints",base_keypoints, debug_mode=debug_mode,fxy=fxy,waitkey=True)
output.debug_show("Test image keypoints",test_keypoints, debug_mode=debug_mode,fxy=fxy,waitkey=True)
# find matches
# create BFMatcher object
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Match descriptors.
matches = bf.match(des_base, des_test)
# Sort them in the order of their distance.
matches = sorted(matches, key=lambda x: x.distance)
# Debug print - Draw first 10 matches.
number_of_matches = 10
matches_img = cv2.drawMatches(base_gray, kp_base, test_gray, kp_test, matches[:number_of_matches], flags=2, outImg=base_gray)
output.debug_show("Matches", matches_img, debug_mode=debug_mode,fxy=fxy,waitkey=True)
# calculate transformation matrix
base_keypoints = np.float32([kp_base[m.queryIdx].pt for m in matches[:number_of_matches]]).reshape(-1, 1, 2)
test_keypoints = np.float32([kp_test[m.trainIdx].pt for m in matches[:number_of_matches]]).reshape(-1, 1, 2)
# Calculate Homography
h, status = cv2.findHomography(base_keypoints, test_keypoints)
# Warp source image to destination based on homography
im_out = cv2.warpPerspective(test_gray, h, (base_gray.shape[1], base_gray.shape[0]))
output.debug_show("After rotation", im_out, debug_mode=debug_mode, fxy=fxy)
The answer to this problem is both mundane and irritating. Assuming this is the same issue as what I've encountered (I think it is):
Problem and Explanation
Images are saved by most cameras with EXIF tags that include an "Orientation" value. Beginning with OpenCV 3.2, this orientation tag is automatically read in when an image is loaded with cv.imread(), and the image is oriented based on the tag (there are 8 possible orientations, which include 90° rotations, mirroring and flipping).
Some image viewing applications (such as Image Viewer in Linux Mint Cinnamon, and Adobe Photoshop) will display images rotated in the direction of the EXIF Orientation tag. Other applications (such as QGIS and OpenCV < 3.2) ignore the tag. If your Image 1 has an orientation tag, and Image 2 has an orientation tag, and you perform the alignment with ORB (I haven't tried SIFT for this) in OpenCV, your aligned Image 2 will appear with the correct orientation (that of Image 1) when opened in an application that reads the EXIF Orientation tag. However, if you open both images in an application that ignores the EXIF Orientation tag, then they will not appear to have the same orientation. This problem becomes even more pronounced when one image has an orientation tag and the other does not.
One Possible Solution
Remove the EXIF Orientation tags prior to reading the images into OpenCV. As of OpenCV 3.4 (maybe 3.3?) there is an option to load an image while ignoring the tag, cv.imread('image.jpg', 128), where 128 means "ignore orientation", but when this is done the image is loaded as grayscale (1 channel), which is not helpful if you NEED color. So, I use pyexiv2 in Python to remove the offending EXIF Orientation tag from my images:
import pyexiv2

image = path_to_image
imageMetadata = pyexiv2.ImageMetadata(image)
imageMetadata.read()
try:
    del imageMetadata['Exif.Image.Orientation']
    imageMetadata.write()
except KeyError:
    pass  # the image has no Orientation tag, nothing to remove
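As a usage note, the snippet above can be wrapped in a small helper and run on both images before they are read into OpenCV; the helper name and file names below are hypothetical:

import cv2
import pyexiv2

def strip_exif_orientation(path):
    # remove the EXIF Orientation tag in place, if the image has one
    metadata = pyexiv2.ImageMetadata(path)
    metadata.read()
    try:
        del metadata['Exif.Image.Orientation']
        metadata.write()
    except KeyError:
        pass                                     # no Orientation tag, nothing to do

for path in ('base.jpg', 'test.jpg'):            # hypothetical file names
    strip_exif_orientation(path)

base_gray = cv2.imread('base.jpg', 0)
test_gray = cv2.imread('test.jpg', 0)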