How to use opencv for poster detection - python

Is it possible to use OpenCV for better detection of posters (see example image)? I have tried the following approach (sketched in code below):
create a mask of higher-intensity pixels (higher V value)
apply erosion and dilation to remove noise and smooth the mask
findContours and draw bounding boxes
The result of this approach is only good if there are lights behind the posters (the posters glow). However, my goal is to detect posters even when they are not among the highest intensities. Can anyone please guide me on this?
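A minimal sketch of that pipeline, assuming an HSV value threshold (the filename, threshold value, and kernel size are placeholders):
import cv2
import numpy as np

img = cv2.imread("posters.jpg")                       # placeholder input image

# 1. Mask of higher-intensity pixels (high V channel in HSV).
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
_, mask = cv2.threshold(hsv[:, :, 2], 200, 255, cv2.THRESH_BINARY)  # 200 is an arbitrary cut-off

# 2. Erosion and dilation to remove noise and smooth the mask.
kernel = np.ones((5, 5), np.uint8)
mask = cv2.erode(mask, kernel, iterations=1)
mask = cv2.dilate(mask, kernel, iterations=2)

# 3. Find contours and draw bounding boxes (OpenCV 4.x return convention).
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)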

My first thought was neural nets... and OpenCV has an implementation:
http://docs.opencv.org/2.4/modules/ml/doc/neural_networks.html
They call them 'Multi Layer Perceptrons'.
Other machine learning examples in OpenCV are here:
http://bytefish.de/blog/machine_learning_opencv/

If you know what the posters you want to detect look like, you could search for keypoints and match them by their descriptors.
See the example "Features2D + Homography to find a known object" in the documentation for code.
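A rough sketch of that idea (ORB is used here instead of SIFT purely for illustration; the filenames are placeholders):
import cv2
import numpy as np

poster = cv2.imread("poster_template.jpg", cv2.IMREAD_GRAYSCALE)  # the known poster
scene = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)             # image to search in

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(poster, None)
kp2, des2 = orb.detectAndCompute(scene, None)

# Match descriptors and keep the best matches.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:50]

# Estimate a homography from poster to scene with RANSAC.
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Project the poster's corners into the scene to outline the detection.
h, w = poster.shape
corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
outline = cv2.perspectiveTransform(corners, H)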


What is the best way to track an object in a video?

I'm trying to learn computer vision and more specifically OpenCV in Python.
I want to make a program that would track my barbell in a video and show me its path. (I know apps like this exist, but I want to make it myself.) I tried using the Canny edge detection and HoughCircles functions, but I seem to get everything but a good result.
I have been using this code to find the edges of my image:
gray = cv.cvtColor(src=img, code=cv.COLOR_BGR2GRAY)
blur = cv.blur(gray, (2,2))
canny = cv.Canny(blur, 60, 60)
And then this code to find the circle:
circles = cv.HoughCircles(canny, cv.HOUGH_GRADIENT, dp=2, minDist=1000, circles=None, maxRadius=50)
This is the result:
left = original image with the detected circle // right = Canny image
Is this the right way to go or should I use another method?
Training a YOLO model to detect the barbell will work better than anything you have tried with OpenCV so far. You need at least 500 images, which can easily be found on the internet. This tutorial is a kick-start tutorial on YOLO. Give it a try.
If you tweak the parameters of HoughCircles it may recognize the barbell [EDIT: but only with more preprocessing such as gamma correction and blurring, so better not]. However, OpenCV has many algorithms for this kind of object tracking; only a region of the image has to be specified first (if that's acceptable).
In your case the object is always visible and is not changing much, so I guess many of the available algorithms would work fine.
OpenCV has a built-in function for selection:
initBB = cv2.selectROI("Frame", frame, fromCenter=False, showCrosshair=True)
See this tutorial for tracking: https://www.pyimagesearch.com/2018/07/30/opencv-object-tracking/
The summary of the author's suggestions is:
CSRT Tracker: Discriminative Correlation Filter (with Channel and Spatial Reliability). Tends to be more accurate than KCF but slightly slower. (minimum OpenCV 3.4.2)
Use CSRT when you need higher object tracking accuracy and can tolerate slower FPS throughput
I guess accuracy is what you want, if it is for offline usage.
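A minimal sketch of combining selectROI with the CSRT tracker (the video path is a placeholder; depending on your OpenCV build the factory may be cv2.TrackerCSRT_create or cv2.legacy.TrackerCSRT_create):
import cv2

cap = cv2.VideoCapture("barbell.mp4")            # placeholder video path
ok, frame = cap.read()

# Let the user draw a box around the barbell plate on the first frame.
bbox = cv2.selectROI("Frame", frame, fromCenter=False, showCrosshair=True)

tracker = cv2.TrackerCSRT_create()               # cv2.legacy.TrackerCSRT_create() on some builds
tracker.init(frame, bbox)

path = []                                        # centre of the box in every frame
while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)
    if ok:
        x, y, w, h = [int(v) for v in bbox]
        path.append((x + w // 2, y + h // 2))
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    for p, q in zip(path, path[1:]):             # draw the accumulated bar path
        cv2.line(frame, p, q, (0, 0, 255), 2)
    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == 27:              # Esc to quit
        break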
Can you share a sample video?
What's your problem exactly? Why do you track the barbell? Do you need semantic segmentation or normal detection? These are important questions. Canny is a very basic approach; it needs a very stable background to be usable. That's why deep learning is used to handle this kind of problem. If we're talking about deep learning, you can use Mask R-CNN, YOLOv4, etc.; there are many available solutions out there.

Convert Edge Detection to Mask

Given an image to which I applied an edge detection filter, what would be the way (hopefully an efficient/performant one) to obtain a mask of the "sum" of the points in a marked segment?
Image for illustration:
Thank you in advance.
UPDATE:
Added an example of a lighter image (https://imgur.com/a/MN0t3pH).
As you'll see in the below image, we assume that when the user marks a region (ROI), there will be an object that will "stand out" from its background. Our end goal is to get the most accurate "mask" of this object, so we can use it for ML processing.
From the two examples you've uploaded I can assume you are thresholding based on a difference in color/intensity. I can suggest GrabCut as a basic foreground separation; use the edges in the mask in that ROI as input to the algorithm.
Even better: if your thresholding is as good as in the first image, just skip the edge detection part and use that as the input to GrabCut.
======= EDIT =======
@RoiMulia, if you need production-level results, I suggest you leave the threshold + edge detection direction completely and try background removal techniques (the current SOTA are neural networks such as Background Matting: The World is Your Green Screen (example)).
You can also try some ready made background removal APIs such as https://www.remove.bg/ or https://clippingmagic.com/
1.
Given the "ROI" supervision you have, I strongly recommend you explore GrabCut (as proposed by YoniChechnik):
Rother C, Kolmogorov V, Blake A. "GrabCut" interactive foreground extraction using iterated graph cuts. ACM transactions on graphics (TOG). 2004.
To get a feeling for how this works, you can use PowerPoint's "background removal" tool, which is based on the GrabCut algorithm.
GrabCut segments the foreground object in a selected ROI mainly based on its foreground/background color distributions, and less on edge/boundary information, though this extra information can be integrated into the formulation.
It seems like OpenCV has a basic implementation of GrabCut; see here.
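A minimal sketch of OpenCV's GrabCut initialized with a rectangular ROI (the filename and rectangle are placeholders; a mask built from your thresholding/edge result can be used instead via cv2.GC_INIT_WITH_MASK):
import cv2
import numpy as np

img = cv2.imread("object.jpg")                           # placeholder input image
mask = np.zeros(img.shape[:2], np.uint8)
bgd_model = np.zeros((1, 65), np.float64)
fgd_model = np.zeros((1, 65), np.float64)

rect = (50, 50, 300, 400)                                # the user-marked ROI (x, y, w, h)
cv2.grabCut(img, mask, rect, bgd_model, fgd_model, 5, cv2.GC_INIT_WITH_RECT)

# Pixels labelled (probable) foreground become the object mask.
fg_mask = np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 255, 0).astype(np.uint8)
result = cv2.bitwise_and(img, img, mask=fg_mask)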
2.
If you are seeking a method that uses only the boundary information, you might find this answer useful.
3.
An alternative method is to use NCuts:
Shi J, Malik J. Normalized cuts and image segmentation. IEEE Transactions on pattern analysis and machine intelligence. 2000.
If you have a very reliable edge map, you can modify the "affinity matrix" NCuts works with to be a binary matrix:
w_ij = 1  if there is no boundary between pixels i and j
w_ij = 0  if there is a boundary between i and j
w_ij = 0  if i and j are not neighbors of each other
NCuts can be viewed as a way to estimate "robust connected components".

Computer Vision: Removing noise from an image

I need to remove the noise from this image. My problem is that I need a neat contour without all the lines, like in this image.
Do you have any suggestions how to do that using python?
Looking at your example images, I suppose you are looking for an image processing algorithm that finds the edges of your image (in your case, the border lines of the ground plan).
Have a look at the Canny edge detection algorithm, which might be well-suited for this task. A tutorial with an example implementation in Python can be found here.
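A minimal Canny example for the floor-plan image (the filename and thresholds are placeholders to experiment with):
import cv2

img = cv2.imread("floorplan.png", cv2.IMREAD_GRAYSCALE)   # placeholder filename
blurred = cv2.GaussianBlur(img, (5, 5), 0)                 # suppress fine noise first
edges = cv2.Canny(blurred, 50, 150)                        # tune the two hysteresis thresholds
cv2.imwrite("floorplan_edges.png", edges)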

Live Iris Detection with OpenCV - Thresholding vs HoughTransform

I am trying to create an application that is able to detect and track the iris of an eye in a live video stream. In order to do that, I want to use Python and OpenCV. While researching for this on the internet, it seemed to me that there are multiple possible ways to do that.
First Way:
Run a Canny filter to get the edges, and then use HoughCircles to find the iris.
Second Way:
Use Otsu's algorithm to find the optimal threshold and then use cv2.findContours() to find the iris (sketched below).
Since I want this to run on a Raspberry Pi (4B), my question is which of these methods is better, especially in terms of reliability and performance?
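For reference, a rough sketch of the second approach, assuming an already-cropped grayscale eye region (the filename and blur size are placeholders):
import cv2

eye = cv2.imread("eye_crop.png", cv2.IMREAD_GRAYSCALE)    # placeholder crop of one eye
blur = cv2.GaussianBlur(eye, (7, 7), 0)

# Otsu picks the threshold automatically; the iris/pupil is darker than the sclera.
_, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    largest = max(contours, key=cv2.contourArea)
    (cx, cy), radius = cv2.minEnclosingCircle(largest)     # iris centre and radius estimate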
I would take a third path and start from a well-established method for facial landmark detection (e.g. dlib). You can use a pre-trained model to get a reliable estimate of the position of the eye.
This is an example output from a facial landmark detector:
Then you go ahead from there to find the iris, either using edge detection, Hough transforms or whatever.
Probably you can simply use a heuristic, as you can assume the iris is always near the center of mass of the keypoints around each eye.
There are also some good tutorials online in a similar setting (even for Raspberry) for example this one or this other one from PyImageSearch.
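A minimal sketch of that heuristic with dlib's 68-point predictor (the model file must be downloaded separately; indices 36-41 and 42-47 are the eye landmarks in that model; the image filename is a placeholder):
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # downloaded separately

frame = cv2.imread("face.jpg")                     # placeholder; use a video frame in practice
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

for face in detector(gray):
    shape = predictor(gray, face)
    # 68-point model: landmarks 36-41 = right eye, 42-47 = left eye.
    for lo, hi in ((36, 42), (42, 48)):
        pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(lo, hi)])
        cx, cy = pts.mean(axis=0)                  # centre of mass ~ rough iris position
        cv2.circle(frame, (int(cx), int(cy)), 2, (0, 255, 0), -1)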

OpenCV-Python dense SIFT

OpenCV has very good documentation on generating SIFT descriptors, but this is a version of "weak SIFT", where the key points are detected by the original Lowe algorithm. The OpenCV example reads something like:
img = cv2.imread('home.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
sift = cv2.SIFT()
kp = sift.detect(gray, None)
kp, des = sift.compute(gray, kp)
What I'm looking for is strong/dense SIFT, which does not detect keypoints but instead calculates SIFT descriptors for a set of patches (e.g. 16x16 pixels, 8 pixels padding) covering an image as a grid. As I understand it, there are two ways to do this in OpenCV:
I could divide the image in a grid myself, and somehow convert those patches to KeyPoints
I could use a grid-based feature detector
In other words, I'd have to replace the sift.detect() line with something that gives me the keypoints I require.
My problem is that the rest of the OpenCV documentation, especially wrt Python, is severely lacking, so I have no idea how to achieve either of these things. I see in the C++ documentation that there are keypoint detectors for grid, but I don't know how to use these from Python.
The alternative is to switch to VLFeat, which has a very good DSift/PHOW implementation but means that I'll have to switch from python to matlab.
Any ideas? Thanks.
You can use dense SIFT in OpenCV 2.4.6 and later 2.4.x versions (note that the FeatureDetector_create factory was removed in OpenCV 3.x).
Create a feature detector by its name:
cv2.FeatureDetector_create(detectorType)
Then pass the string "Dense" as detectorType, e.g.:
dense = cv2.FeatureDetector_create("Dense")
kp = dense.detect(imgGray)
kp, des = sift.compute(imgGray, kp)
I'm not sure what your goal is here, but be warned, the SIFT descriptor calculation is extremely slow and was never designed to be used in a dense fashion. That being said, OpenCV makes it fairly trivial to do so.
Basically, instead of using sift.detect(), you just fill in the keypoint array yourself by making a grid of keypoints, however dense you want them. Then a descriptor will be calculated for each keypoint when you pass the keypoints to sift.compute().
Depending on the size of your image and the speed of your machine, this might take a very long time. If computational time is a factor, I suggest you look at some of the binary descriptors OpenCV has to offer.
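A sketch of that grid-of-keypoints idea with the current API (cv2.SIFT_create is the modern factory name; the step and patch size are arbitrary choices):
import cv2

gray = cv2.imread("home.jpg", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()              # cv2.xfeatures2d.SIFT_create() on older contrib builds

step = 8                              # grid spacing in pixels
size = 16.0                           # keypoint diameter used for the descriptor patch
kps = [cv2.KeyPoint(float(x), float(y), size)
       for y in range(step // 2, gray.shape[0], step)
       for x in range(step // 2, gray.shape[1], step)]

kps, des = sift.compute(gray, kps)    # one 128-dimensional descriptor per grid point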
In spite of the OpenCV way being the standard, it was too slow for me. So I used pyvlfeat, which is basically Python bindings to VLFeat. The functions carry similar syntax to the Matlab functions.
