OpenCV Python Feature Detection: how to provide a mask? (SIFT)

I am building a simple project in Python 3, using OpenCV 3, trying to match jigsaw pieces to the "finished" jigsaw image. I started my tests with SIFT.
I can extract the contour of the jigsaw piece and crop the image, but since most of the high frequencies reside, of course, around the piece (where the piece ends and the floor starts), I want to pass a mask to the SIFT detectAndCompute() method, forcing the algorithm to look for keypoints only within the piece.
test_mask = np.ones(img1.shape, np.uint8)
kp1, des1 = sift.detectAndCompute(img1, mask = test_mask)
After passing a test mask (to make sure it's uint8), I get the following error:
kp1, des1 = sift.detectAndCompute(img1,mask = test_mask)
cv2.error: /home/pyimagesearch/opencv_contrib/modules/xfeatures2d/src/sift.cpp:772: error: (-5) mask has incorrect type (!=CV_8UC1) in function detectAndCompute
From my research, uint8 is just an alias for CV_8U, which is the same as CV_8UC1. I couldn't find any code sample that passes a mask to a feature detection algorithm in Python.

Thanks to Miki, I've managed to find the bug.
It turned out that my original mask, created with threshold operations, looked binary but was actually a 3-channel image ([rows], [cols], 3), so it couldn't be accepted as a mask.
Check the dtype and shape (the mask has to be uint8 and single-channel, i.e. shape [rows, cols]):
print(mask.dtype)
print(mask.shape)
Convert the mask to gray if it's still 3-channel:
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
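For completeness, a minimal sketch of the whole flow, assuming an OpenCV 3 build with the contrib modules (the file name and ROI below are hypothetical placeholders):
import cv2
import numpy as np
img1 = cv2.imread('piece.jpg')  # hypothetical file name
# build the mask as single-channel uint8: shape (rows, cols), not (rows, cols, 3)
mask = np.zeros(img1.shape[:2], np.uint8)
mask[100:400, 150:450] = 255    # placeholder ROI; fill your piece contour here instead
sift = cv2.xfeatures2d.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, mask)  # keypoints only where mask == 255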

Related

Segment text from bad lighting images using Python

I have three types of images and want to segment the text from them, so that I get a clean binarized image like the first one below. The three types of images are shown below.
I've tried various techniques, but each one fails on some cases. I first tried thresholding the image with Otsu's algorithm, but it gave bad results on the images below.
I tried Gaussian, bilateral, and plain box blur kernels, but they didn't improve the results much.
Can anyone help?
The code that gave me the best results:
import cv2
# read as grayscale and apply Otsu's global threshold
gray = cv2.imread("/home/shrouk/Pictures/f2.png", 0)
thresholded = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cv2.imshow("img", thresholded)
cv2.waitKey()
This is the final result I need:
This is the first type of image that fails. It fails because the text's gray level is lighter on the right side of the image.
The result of Otsu on it is here; I just need a way to enhance the words in the third line from the right:
The second type fails because of the darker background.
The Otsu result is not very good, as the words on the left look dilated.
This is the type that Otsu thresholds correctly, as there is no noise.
Try using cv2.adaptiveThreshold(), which computes a threshold per pixel from its local neighborhood instead of one global value:
import cv2
image = cv2.imread("2.png", 0)
# threshold each pixel against a Gaussian-weighted mean of its 11x11
# neighborhood, minus the constant C=5 to suppress faint background noise
adaptive = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 5)
cv2.imshow("adaptive", adaptive)
cv2.waitKey()
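If adaptive thresholding still struggles on the uneven-lighting cases, another common trick is to estimate the illumination with a large blur and divide it out before running Otsu again. A rough sketch, assuming the same "2.png" input (the 51x51 blur size is a guess you would tune):
import cv2
gray = cv2.imread("2.png", 0)
background = cv2.GaussianBlur(gray, (51, 51), 0)  # rough estimate of the illumination
flat = cv2.divide(gray, background, scale=255)    # divide the lighting away
thresholded = cv2.threshold(flat, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
cv2.imshow("flattened + otsu", thresholded)
cv2.waitKey()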

(scikit-image) HOG visualization image appears black when saved

I am new to computer vision and image processing and am using this code
import matplotlib.pyplot as plt
from skimage.feature import hog
hog_list, hog_img = hog(test_img_gray,
                        orientations=8,
                        pixels_per_cell=(16, 16),
                        cells_per_block=(1, 1),
                        block_norm='L1',
                        visualise=True,
                        feature_vector=True)
plt.figure(figsize=(15, 10))
plt.imshow(hog_img)
to get this HOG visualization image
I have 2 questions at this point:
When I try to save this image (as a .pdf or .jpg) the resulting image is pure black. Converting this image to PIL format and examining it with
hog_img_pil = Image.fromarray(hog_img)
hog_img_pil.show()
still shows the image as pure black. Why is this happening and how can I fix it?
When I try to run this code
hog_img = cv2.cvtColor(hog_img, cv2.COLOR_BGR2GRAY)
to convert the image to grayscale, I get the error error: (-215) depth == CV_8U || depth == CV_16U || depth == CV_32F in function cvtColor. What do I need to do to get this image in grayscale, and why is this happening?
As additional information, running hog_img.shape returns (1632, 1224), which is just the size of the image. I had initially interpreted that to mean the image is already in grayscale (since it appears to lack a dimension for a color channel). However, when I then tried to run
test_img_bw = cv2.adaptiveThreshold(src=hog_img,
                                    maxValue=255,
                                    adaptiveMethod=cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                    thresholdType=cv2.THRESH_BINARY,
                                    blockSize=115,
                                    C=4)
I got the error error: (-215) src.type() == CV_8UC1 in function adaptiveThreshold, which this answer seems to indicate means that the image is not in grayscale.
Finally, another bit of useful information is that running print(hog_img.dtype) on the image returns float64.
I will continue to debug; in the meantime, thanks for any thoughts :)
Inverting the image with hog_img_inv = cv2.bitwise_not(hog_img) and using
plt.figure(figsize=(15, 10))
plt.imshow(hog_img_inv)
showed that the lines were in fact there, but very faint (I've included the image here for completeness, but you can barely see it (trust me, it's there)). I will have to do some more processing of the image to make the lines more distinguishable.
Running print(hog_img.dtype) showed that the dtype was float64 when (I think) it should have been uint8. I fixed this by running hog_img_uint8 = hog_img.astype(np.uint8), which seems to have fixed the problem with passing the image to other algorithms (e.g. cv2.adaptiveThreshold).
I had the same problem. But if you look in the docs, they also use this code for better visualisation:
from skimage import exposure
# Rescale histogram for better display
hog_image_rescaled = exposure.rescale_intensity(hog_image, in_range=(0, 0.02))
But I still have the same problem: visualisation with matplotlib works fine, but saving the image with OpenCV (or skimage) saves only a black image...
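A sketch of a save path that should avoid the black image, assuming hog_img is the float64 visualization returned by skimage.feature.hog: a plain astype(np.uint8) truncates values that live roughly in [0, 0.1] down to zero, so stretch the range first and only then convert to 8-bit:
import cv2
from skimage import exposure, img_as_ubyte
# stretch the faint HOG lines to the full [0, 1] range
hog_rescaled = exposure.rescale_intensity(hog_img, in_range=(0, 0.02))
# convert to 8-bit only after rescaling, then save with OpenCV
hog_u8 = img_as_ubyte(hog_rescaled)
cv2.imwrite("hog_visualization.png", hog_u8)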

Python OpenCV: inRange() stopped working without change

I am currently doing real-time object detection of an orange ball with a Raspberry Pi 3 Model B. The code below is supposed to take a frame, then filter the image with the cv2.inRange() function using RGB (BGR) bounds. Then I apply erosion and dilation to remove noise, find the contours, and draw them. This code worked until now; however, when I ran it today without changing it, I got the following error:
Traceback (most recent call last):
File "/home/pi/Desktop/maincode.py", line 12, in <module>
mask = cv2.inRange(frame, lower, upper)
error: /build/opencv-ISmtkH/opencv-2.4.9.1+dfsg/modules/core/src/arithm.cpp:2701: error: (-209) The lower bounary is neither an array of the same size and same type as src, nor a scalar in function inRange
Any help would be really awesome, because I am new to OpenCV, spent a lot of time programming this, and have a robotics competition in 5 days.
Thank you in advance.
import cv2
import cv2.cv as cv
import numpy as np

capture = cv2.VideoCapture(0)
while capture.isOpened:
    ret, frame = capture.read()
    im = frame
    lower = np.array([0, 100, 150], dtype='uint8')
    upper = np.array([10, 180, 255], dtype='uint8')
    mask = cv2.inRange(frame, lower, upper)
    eroded = cv2.erode(mask, np.ones((7, 7)))
    dilated = cv2.dilate(eroded, np.ones((7, 7)))
    contours, hierarchy = cv2.findContours(dilated, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(im, contours, -1, (0, 255, 0), 3)
    cv2.imshow('colors', im)
    cv2.waitKey(1)
The error you receive almost certainly means you passed an empty image (or mixed up the sizes of your input images).
Webcam captures in OpenCV often start with one or a couple of black/empty frames (crappy drivers). It goes by too fast to notice, but it will affect your application as soon as you try to process those frames. Therefore, I recommend checking each frame before running calculations on it. Just add this after your capture.read() line:
if ret == True:
Note: make sure (by printing to the console or similar) that this only happens when you start capturing. If it happens regularly (empty frames from your webcam), maybe something else is wrong, possibly with the webcam itself. Check it on another computer as well.
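A minimal sketch of where the check slots into the loop above (this is just the answer's suggestion spelled out, with the rest of the processing going inside the if block):
import cv2
import numpy as np
capture = cv2.VideoCapture(0)
while capture.isOpened():
    ret, frame = capture.read()
    if ret == True:  # skip the empty frames some drivers hand back at startup
        lower = np.array([0, 100, 150], dtype='uint8')
        upper = np.array([10, 180, 255], dtype='uint8')
        mask = cv2.inRange(frame, lower, upper)
        cv2.imshow('colors', mask)
    cv2.waitKey(1)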

Extract external contour or silhouette of image in Python

I want to extract the silhouette of an image, and I'm trying to do it using the contour function of MatplotLib. This is my code:
from PIL import Image
from pylab import *
# read image to array
im = array(Image.open('HOJA.jpg').convert('L'))
# create a new figure
figure()
# show contours with origin upper left corner
contour(im, origin='image')
axis('equal')
show()
This is my original image:
And this is my result:
But I just want to show the external contour, the silhouette. Just the red lines in this example.
How can I do it? I read the documentation of the contour function, but I can't get what I want.
If you know a better way to do this in Python, please tell me! (MatplotLib, OpenCV, etc.)
If you want to stick with your contour approach, you can simply add a levels argument with a value that 'thresholds' the image between the white background and the leaf.
You could use the histogram to find an appropriate value, but in this case any value slightly lower than 255 will do.
So:
contour(im, levels=[245], colors='black', origin='image')
Make sure you check out Scikit-Image if you want to do some serious image processing. It contains several edge detection algorithms, etc.
http://scikit-image.org/docs/dev/auto_examples/
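For instance, a minimal sketch of the same silhouette extraction with skimage's find_contours, assuming the HOJA.jpg image and the 245 level from above:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from skimage import measure
im = np.array(Image.open('HOJA.jpg').convert('L'))
contours = measure.find_contours(im, 245)          # iso-contours at gray level 245
silhouette = max(contours, key=len)                # the longest one is the outer outline
plt.imshow(im, cmap='gray')
plt.plot(silhouette[:, 1], silhouette[:, 0], 'r')  # (row, col) -> plot as (x, y)
plt.show()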
For those who want the OpenCV solution, here it is:
import cv2

# 'image' must already be a single-channel array (see the note below)
ret, thresh = cv2.threshold(image, 245, 255, 0)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
# keep only the biggest contour
tam = 0
for contorno in contours:
    if len(contorno) > tam:
        contornoGrande = contorno
        tam = len(contorno)
cv2.drawContours(image, contornoGrande.astype('int'), -1, (0, 255, 0), 2)
cv2.imshow('My image', image)
cv2.waitKey()
cv2.destroyAllWindows()
In this example, I only draw the biggest contour. Remember that 'image' must be a single-channel array.
You should change the parameters of the threshold function, the findContours function and the drawContours function to get what you want.
threshold Documentation
findContours Documentation
drawContours Documentation
I do the conversion to 'int' in the drawContours function because there is a bug in OpenCV 2.4.3: if you don't do this conversion, the program breaks.
This is the bug.
I would recommend using OpenCV for performance.
It has a findContours function, accessible from Python via the cv2 bindings.
This function can be set to return only the external contour.
You will have to threshold your image as well.

Using PIL to fill empty image space with nearby colors (aka inpainting)

I create an image with PIL:
I need to fill in the empty space (depicted as black). I could easily fill it with a static color, but what I'd like to do is fill the pixels in with nearby colors. For example, the first pixel after the border might be a Gaussian blur of the filled-in pixels. Or perhaps a push-pull type algorithm as described in The Lumigraph by Gortler et al.
I need something that is not too slow because I have to run this on many images. I have access to other libraries, like numpy, and you can assume that I know the borders or a mask of the outside region or inside region. Any suggestions on how to approach this?
UPDATE:
As suggested by belisarius, OpenCV's inpaint method is perfect for this. Here's some Python code that uses OpenCV to achieve what I wanted:
import Image, ImageDraw, cv

im = Image.open("u7XVL.png")
pix = im.load()

# create a mask of the background colors
# this is slow, but easy for example purposes
mask = Image.new('L', im.size)
maskdraw = ImageDraw.Draw(mask)
for x in range(im.size[0]):
    for y in range(im.size[1]):
        if pix[(x,y)] == (0,0,0):
            maskdraw.point((x,y), 255)

# convert image and mask to opencv format
cv_im = cv.CreateImageHeader(im.size, cv.IPL_DEPTH_8U, 3)
cv.SetData(cv_im, im.tostring())
cv_mask = cv.CreateImageHeader(mask.size, cv.IPL_DEPTH_8U, 1)
cv.SetData(cv_mask, mask.tostring())

# do the inpainting
cv_painted_im = cv.CloneImage(cv_im)
cv.Inpaint(cv_im, cv_mask, cv_painted_im, 3, cv.CV_INPAINT_NS)

# convert back to PIL
painted_im = Image.fromstring("RGB", cv.GetSize(cv_painted_im), cv_painted_im.tostring())
painted_im.show()
And the resulting image:
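For readers on the modern cv2 API, a rough equivalent of the update above, assuming the same u7XVL.png and that pure black marks the region to fill:
import cv2
import numpy as np
im = cv2.imread("u7XVL.png")
# mask is 255 exactly where the pixel is (0, 0, 0)
black = np.array([0, 0, 0])
mask = cv2.inRange(im, black, black)
painted = cv2.inpaint(im, mask, 3, cv2.INPAINT_NS)  # Navier-Stokes inpainting
cv2.imwrite("painted.png", painted)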
A method with nice results is Navier-Stokes image restoration. I know OpenCV has it; I don't know about PIL.
Your example:
I did it with Mathematica.
Edit
As per your request, the code is:
i = Import["http://i.stack.imgur.com/uEPqc.png"];
Inpaint[i, ColorNegate@Binarize@i, Method -> "NavierStokes"]
The ColorNegate@ ... part creates the replacement mask.
The filling is done with just the Inpaint[] command.
Depending on how you're deploying this application, another option might be to use GIMP's Python interface to do the image manipulation.
The doc page I linked to is oriented more towards writing GIMP plugins in Python than interacting with a background GIMP instance from a Python app, but I'm pretty sure that's also possible (it's been a while since I played with the GIMP/Python interface, so I'm a little hazy).
You can also create the mask with the function CreateImage(), for instance:
inpaint_mask = cv.CreateImage(cv.GetSize(im), 8, 1)
