I'd like to be able to detect the presence of a black box in an image using OpenCV. The black box itself may contain various kinds of text and can vary in size. The rest of the image won't be black. Here are a few examples of the type of black boxes I'm referring to:
Obviously, template matching is a no-go, and using feature extraction also seems completely wrong, since the only real features of the box are the lines of text (which are irrelevant). To my inexperienced computer-vision senses, it would seem that corner detection is perhaps the best approach. However, this still seems somewhat crude and imprecise for something as generic as a rectangle. Can someone suggest a more rigorous method for detection?
I always wanted to have a device that, from a live camera feed, could detect an object, create a 3D model of it, and then identify it. It would work a lot like the Scanner tool from Subnautica. Imagine my surprise when I found OpenCV, a free-to-use computer vision tool for Python!
My first step is to get the computer to recognize that there is an object at the center of the camera feed. To do this, I found a Canny() function that could detect edges and display them as white lines in a black image, which should make a complete outline of the object in the center. I also used the floodFill() function to fill in the black zone between the white lines with gray, which would show that the computer recognizes that there is an object there. My attempt is in the following image.
The red dot is the center of the live video.
The issue is that the edge lines can have holes in them due to a blur between two colors, which can range from individual pixels to entire missing lines. As a result, the gray gets out and doesn't highlight me as the only object, and instead highlights the entire wall as well. Is there a way to fill those missing pixels in or is there a better way of doing this?
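For reference, here is a minimal sketch of the pipeline I'm describing (the file name, Canny thresholds and seed point are placeholders):

```python
import cv2
import numpy as np

# Placeholder frame; in practice this comes from the live camera feed.
frame = cv2.imread("frame.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# White edge lines on a black image.
edges = cv2.Canny(gray, 50, 150)

# Flood-fill with gray (128) from the image center (the red dot).
# floodFill requires a mask two pixels larger than the image.
h, w = edges.shape
mask = np.zeros((h + 2, w + 2), np.uint8)
seed = (w // 2, h // 2)
cv2.floodFill(edges, mask, seed, 128)

cv2.imwrite("filled.png", edges)
```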
Welcome to SO and the exciting world of machine vision!
What you are describing is a very classical problem in the field, and not a trivial one at all. It depends heavily on the shape and appearance of what you define as the object of interest and on the overall structure, homogeneity and color of the background. Remember, the computer has no concept of what an "object" is; the only thing it 'knows' is a matrix of numbers.
In your example, you might start out by selecting the background area by color (or hue; look up HSV). Everything else is your object. This is what classical green-screening techniques do, and it only works with (a) a homogeneous background that does not share a color with your object and (b) a single object, or multiple non-overlapping objects.
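As a rough illustration of that background-by-color idea (the HSV range below is just a placeholder you would tune to your own wall):

```python
import cv2
import numpy as np

img = cv2.imread("frame.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Hypothetical hue/saturation/value range covering the background color;
# tune these to your own scene.
lower = np.array([0, 0, 120])
upper = np.array([180, 60, 255])

background = cv2.inRange(hsv, lower, upper)  # white where the background is
obj_mask = cv2.bitwise_not(background)       # everything else is "object"
```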
The problem with your edge-based approach is that you are not guaranteed to get a closed edge, and deciding where the inside and outside of the object are can get tricky.
Advanced ways to do this would get you into Neural Network territory, but maybe try to get the basics down first.
Here are two links to tutorials on converting color spaces and extracting contours:
https://docs.opencv.org/4.x/df/d9d/tutorial_py_colorspaces.html
https://docs.opencv.org/3.4/d4/d73/tutorial_py_contours_begin.html
Once you have that figured out, look into stereo vision or 3D imaging in general, and that Subnautica scanner might just become reality some day ;)
Good luck!
I would like to get the coordinates of framed text in an image. The paragraphs have thin black borders. The rest of the image contains regular paragraphs and sketches.
Here is an example:
Do you have any idea what kind of algorithms I should use in Python with an image library to achieve this? Thanks.
A few ideas for detecting framed text, which largely comes down to searching for boxes/rectangles of substantial size:
find contours with OpenCV and analyze the shapes using the cv2.approxPolyDP() polygon approximation (the Ramer–Douglas–Peucker algorithm); see the sketch after this list. You could additionally check the aspect ratio of the bounding box to make sure the shape is a rectangle, as well as check the page width, since this seems to be a known metric in your case. PyImageSearch has an excellent article on this:
OpenCV shape detection
in a related question, there is also a suggestion to look into Hough Lines to detect a horizontal line, then detect vertical lines the same way. Not 100% sure how reliable this approach would be.
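A minimal sketch of the contour + cv2.approxPolyDP() idea from the first bullet (the Otsu threshold, the 0.02 epsilon and the size filter are placeholders, and the two-value findContours() unpacking assumes OpenCV 4.x):

```python
import cv2

img = cv2.imread("page.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Invert-threshold so the thin black borders become white, then find contours.
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

boxes = []
for cnt in contours:
    peri = cv2.arcLength(cnt, True)
    approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)
    x, y, w, h = cv2.boundingRect(approx)
    # Keep shapes with 4 corners and a minimum size; both limits are guesses to tune.
    if len(approx) == 4 and w > 100 and h > 40:
        boxes.append((x, y, w, h))
```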
Once you find the box frames, the next step would be to check whether there is any text inside them. Detecting text is a broader problem in general and there are many ways of doing it; here are a few examples:
apply EAST text detector
PixelLink
tesseract (e.g. via pytesseract), although this might produce too many false positives
if it is the simpler case of deciding whether boxes are empty or not, you could check the pixel content inside them, e.g. with cv2.countNonZero() (see the sketch after these examples). Examples:
How to identify empty rectangle using OpenCV
Count the black pixels using OpenCV
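For that simpler empty-or-not check, something along these lines could work (box_has_content is a made-up helper name and the thresholds are guesses to tune):

```python
import cv2

def box_has_content(gray, box, min_ratio=0.01):
    """Rough check whether a detected frame contains any dark pixels (text).

    `box` is (x, y, w, h) from the rectangle-detection step; `min_ratio`
    is a guessed threshold. You may also want to shrink the ROI by a few
    pixels so the black frame border itself is not counted.
    """
    x, y, w, h = box
    roi = gray[y:y + h, x:x + w]
    # Dark pixels (text/ink) become non-zero after the inverted threshold.
    _, roi_bin = cv2.threshold(roi, 200, 255, cv2.THRESH_BINARY_INV)
    filled = cv2.countNonZero(roi_bin) / float(w * h)
    return filled > min_ratio
```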
Additional references:
ideas on quadrangle/rectangle detection using convolutional neural networks
I was asked to recognize a logo in an image using OpenCV. The lecturer told me that I don't have to do logo detection, only logo recognition. I am using OpenCV in C++. What is the easiest way to do it?
PS: I'm a newbie in computer vision.
It largely depends on your kind of images.
If your logo occupies, say, 90% of the image, you don't need detection, since you are probably good with color histograms.
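As a rough sketch of the histogram route (shown in Python for brevity; the bin counts and the correlation metric are arbitrary choices):

```python
import cv2

def hist_similarity(img_a, img_b):
    # Compare hue-saturation histograms with correlation; higher means more similar.
    hsv_a = cv2.cvtColor(img_a, cv2.COLOR_BGR2HSV)
    hsv_b = cv2.cvtColor(img_b, cv2.COLOR_BGR2HSV)
    hist_a = cv2.calcHist([hsv_a], [0, 1], None, [30, 32], [0, 180, 0, 256])
    hist_b = cv2.calcHist([hsv_b], [0, 1], None, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist_a, hist_a)
    cv2.normalize(hist_b, hist_b)
    return cv2.compareHist(hist_a, hist_b, cv2.HISTCMP_CORREL)
```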
If the logo is small compared to the image, you should "find" the logo, in order to focus your comparison on that and not on the background clutter.
There could be multiple logos on the same image?
The logo is always fully visible?
The logo is rigid? Or could be deformed? (think for example of a logo on a shirt or a small bottle)
Assuming that you have a single complete rigid logo to find, the simplest thing to try is template matching.
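A minimal sketch of that (the file names and the 0.8 confidence threshold are placeholders):

```python
import cv2

scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
logo = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)

# Normalized cross-correlation; other methods (e.g. TM_SQDIFF) work similarly.
result = cv2.matchTemplate(scene, logo, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

# 0.8 is a guessed confidence threshold to tune.
if max_val > 0.8:
    print("logo found at", max_loc)
```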
A more accurate approach is to match descriptors.
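And a descriptor-matching sketch along those lines, using ORB as one free option (the match-count threshold is a guess):

```python
import cv2

scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
logo = cv2.imread("logo.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(logo, None)
kp2, des2 = orb.detectAndCompute(scene, None)

# Hamming distance for binary ORB descriptors, cross-check for stability.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# "Enough good matches" is a heuristic; 20 is a guess to tune.
print("match" if len(matches) > 20 else "no match")
```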
You can also see a related topic on SO here
Other, more robust approaches would require building constellations of keypoints on your reference logo and matching those constellations on the target image. See here and here for an example.
Last, but not least, have fun on Google!
I agree with @Miki: you need to do template matching. My recommendation is to use the sum of squared differences and only allow a rigid transformation; you can find a lot of information here. The last link is one of the best books I've read: it is simple to understand and covers most of the equations step by step.
I am writing a simple fly tracking software and I would love some input from opencv experts.
The image I have looks pretty much like:
I used to do the tracking with k-means and PIL/numpy, but I rewrote everything to use blob detection in OpenCV. Tracking works OK, but I would also like to automate the division of ROIs.
What I need to do is find each of the 32 grooves that appear in the picture, where the flies live. See the black rectangle on the image as an example of what I mean.
I think cornerHarris may be what I need, but how do I restrict it to only the grooves and not every single rectangle found in the image? All the grooves have proportions of roughly 10:1.
Thanks!
I don't think cvCornerHarris is even close to what you need.
A much better start would be to experiment with the demo available at OpenCV-2.3.0/samples/cpp/squares.cpp. This technique uses Canny(), dilate() and findContours().
Right out of the box, this demo outputs:
I believe that with a few tweaks here and there you can have your party started.
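For reference, a rough Python sketch of the same Canny/dilate/findContours idea, keeping only contours whose bounding boxes are close to the 10:1 groove proportions mentioned in the question (the thresholds, file name and ratio bounds are placeholders):

```python
import cv2

img = cv2.imread("tray.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Same building blocks as squares.cpp: Canny -> dilate -> findContours.
edges = cv2.Canny(gray, 50, 150)
edges = cv2.dilate(edges, None)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

grooves = []
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    ratio = max(w, h) / float(min(w, h) or 1)
    # Keep only elongated shapes close to the 10:1 groove proportions.
    if 8 <= ratio <= 12:
        grooves.append((x, y, w, h))
```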
Hi, I want to use the Python Imaging Library to crop images to a specific size for a website. I have a problem: these images are meant to show people's faces, so I need to crop automatically based on them.
I know face detection is a difficult concept so I'm thinking of using the face.com API http://developers.face.com/tools/#faces/detect which is fine for what I want to do.
I'm just a little stuck on how I would use this data to crop a select area based on the majority of faces.
Can anybody help?
Joe
There is a library for Python that has a concept of smart cropping and that, among other options, can use face detection to do smarter cropping.
It uses OpenCV under the hood, but you are isolated from it.
https://github.com/globocom/thumbor
If you have some rectangle that you want to excise from an image, here's what I might try first:
(optional) If the image is large, do a rough square crop centered on the face, with sides sqrt(2) times the longer edge of the face box (if rectangular). Worst case (a 45° rotation), it will still grab everything important.
Rotate based on the face orientation (something like rough_crop.rotate(math.degrees(math.atan(ydiff/xdiff))); trig is fun).
Do a final crop. If you did the initial crop, the face should be centered, otherwise you'll have to transform (rotate) all your old coordinates to the new image (more trig!).
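A rough PIL sketch of those three steps, with made-up face coordinates standing in for what the face-detection API would return:

```python
import math
from PIL import Image

# Hypothetical face data; in practice you would read the face center, eye
# positions and bounding-box size from the detection API's response.
cx, cy = 420, 310                          # face center
left_eye, right_eye = (390, 300), (450, 295)
face_size = 180                            # longer edge of the face box

img = Image.open("photo.jpg")

# Step 1: rough square crop, sqrt(2) times the longer edge, centered on the face.
half = int(face_size * math.sqrt(2) / 2)
rough = img.crop((cx - half, cy - half, cx + half, cy + half))

# Step 2: rotate so the eyes are level (atan2 avoids a division by zero).
xdiff = right_eye[0] - left_eye[0]
ydiff = right_eye[1] - left_eye[1]
rough = rough.rotate(math.degrees(math.atan2(ydiff, xdiff)))

# Step 3: final crop to the site's target size, centered in the rough crop.
target = 256
w, h = rough.size
final = rough.crop(((w - target) // 2, (h - target) // 2,
                    (w + target) // 2, (h + target) // 2))
```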