Is there a simple, ready-made method for human silhouette extraction in OpenCV? It is fine if the method works only on video.
Here is a sample frame:
For images like these, OpenCV's HOG (Histogram of Oriented Gradients) detector works very well. An example can be found here. The example is in Python, but it is not hard to create a C++ version if you want. The trained parameters are already included, so you can use it immediately.
If you are interested in deep learning based approaches, both SSD (Single Shot MultiBox Detector) and YOLO (You Only Look Once) can detect people.
All these methods only give you a bounding box. To extract the exact silhouette, you will need to combine the results with image differencing or background subtraction.
Related
I started studying machine learning in python a few days ago and I was trying some examples online when I decided to try it for myself using a custom dataset.
However, I noticed that most datasets consist of hundreds, if not thousands, of camera photos of the same target object.
If I create a custom icon in Photoshop, do I need to take a picture of my monitor a thousand times to achieve this? Is it possible to train an AI using only a single PNG file?
My goal right now is to let the AI do object detection on another, bigger image, where it needs to find the custom icon inside the image, kind of like Finding Waldo. All of these are digital images straight from Photoshop, though, so I don't know if it is possible.
Right now, I am using a python-based Computer Vision library called ImageAI.
You can use a data preparation strategy called Data Augmentation.
There are mainly two types of Augmentation
Linear Transformation
Affine Transformation
Here is a good white paper
http://cs231n.stanford.edu/reports/2017/pdfs/300.pdf
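To make the idea concrete for the single-icon case, here is a small sketch that generates labelled training samples by pasting the icon at a random position on a background image, with a little brightness jitter. Everything in it (`synthesize_sample`, the 0.8 to 1.2 gain range) is an assumption for illustration and is not part of ImageAI.

```python
import numpy as np


def synthesize_sample(icon, background, rng):
    """Paste `icon` at a random spot on a copy of `background`.

    Returns the composite image and the icon's box as (x, y, w, h).
    """
    ih, iw = icon.shape[:2]
    bh, bw = background.shape[:2]
    if ih > bh or iw > bw:
        raise ValueError("icon must fit inside the background")
    # Random top-left corner that keeps the icon fully inside the image.
    x = int(rng.integers(0, bw - iw + 1))
    y = int(rng.integers(0, bh - ih + 1))
    # Mild brightness jitter so the samples are not pixel-identical.
    gain = rng.uniform(0.8, 1.2)
    patch = np.clip(icon.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    out = background.copy()
    out[y:y + ih, x:x + iw] = patch
    return out, (x, y, iw, ih)
```

Calling this in a loop with `rng = np.random.default_rng()` and a handful of different backgrounds gives you as many image and box pairs as you like from one PNG.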
I'm doing object detection for text in images and want to use YOLO to draw a bounding box where the text is in the image.
How do you do data augmentation in that case? Also, how does it differ from the augmentation (contrast adjustment, gamma conversion, smoothing, noise, inversion, scaling, etc.) used in ordinary image recognition?
If you have any useful website links, would you tell me plz :)
If you are asking what you should use: this is just a regular object detection task, so the common augmentations, like flips or crops, work fine.
If you are asking what the output images will look like, take a look at this repo: https://github.com/albumentations-team/albumentations
But if you are asking about the difference in model performance, there is probably no general answer; you can only try several options and see which works best.
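One practical difference from plain image classification is that geometric augmentations have to move the bounding boxes along with the pixels, while photometric ones (contrast, gamma, noise) leave the boxes alone. A small sketch, assuming YOLO-style normalized `(x_center, y_center, width, height)` boxes; both function names are made up for this example.

```python
import numpy as np


def hflip_with_boxes(image, boxes):
    """Flip an image left-right and mirror its YOLO-style boxes."""
    flipped = image[:, ::-1].copy()
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4).copy()
    # Only the normalized x centre changes; width and height stay the same.
    boxes[:, 0] = 1.0 - boxes[:, 0]
    return flipped, boxes


def adjust_gamma(image, gamma):
    """Photometric change: pixel values change, boxes are untouched."""
    norm = image.astype(np.float32) / 255.0
    return np.clip(255.0 * norm ** gamma, 0, 255).astype(np.uint8)
```

Libraries such as albumentations handle this box bookkeeping for you, so the sketch is mainly meant to show what happens under the hood.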
I am trying to create an application that is able to detect and track the iris of an eye in a live video stream. In order to do that, I want to use Python and OpenCV. While researching for this on the internet, it seemed to me that there are multiple possible ways to do that.
First Way:
Run a Canny filter to get the edges, and then use cv2.HoughCircles() to find the iris.
Second Way:
Use Otsu's algorithm to find a good threshold and then use cv2.findContours() to find the iris.
Since I want this to run on a Raspberry Pi (4B), my question is which of these methods is better, especially in terms of reliability and performance?
I would take a third path and start from a well-established method for facial landmark detection (e.g. dlib). You can use a pre-trained model to get a reliable estimate of the position of the eye.
This is an example output from a facial landmark detector:
From there you can go on to find the iris, using edge detection, a Hough transform or whatever else works for you.
You can probably even get away with a simple heuristic: assume the iris is always at the center of mass of the keypoints around each eye.
There are also some good tutorials online for a similar setting (even for the Raspberry Pi), for example this one or this other one from PyImageSearch.
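A sketch of that heuristic, assuming the landmarks come from dlib's 68-point model, where (0-based) points 36 to 41 outline one eye and 42 to 47 the other. The function name and the `(68, 2)` input layout are assumptions for illustration.

```python
import numpy as np

# Eye point ranges in the 68-point landmark layout (0-based, end exclusive).
RIGHT_EYE = slice(36, 42)
LEFT_EYE = slice(42, 48)


def eye_centers(landmarks):
    """Return the centroid of each eye's keypoints as (x, y) arrays."""
    pts = np.asarray(landmarks, dtype=np.float32)
    if pts.shape != (68, 2):
        raise ValueError("expected 68 (x, y) landmark points")
    return pts[RIGHT_EYE].mean(axis=0), pts[LEFT_EYE].mean(axis=0)
```

The centroid tracks the eye opening rather than the gaze direction, so it is only a rough stand-in for the iris position when the person looks sideways.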
Imagine someone taking a burst shot with a camera. They end up with multiple images, but since no tripod or stand was used, each image is slightly different.
How can I align them so that they overlay neatly, and then crop out the edges?
I have searched a lot, but most of the solutions either build a 3D reconstruction or use MATLAB.
e.g. https://github.com/royshil/SfM-Toy-Library
Since I'm very new to OpenCV, I would prefer an easy-to-implement solution.
I have generated many datasets by manually rotating and cropping images in MSPaint, but any link to corresponding datasets (slightly rotated and translated images) would also be helpful.
EDIT: I found a solution here
http://www.codeproject.com/Articles/24809/Image-Alignment-Algorithms
which gives close approximations to rotation and translation vectors.
How can I do better than this?
It depends on what you mean by "better" (accuracy, speed, low memory requirements, etc). One classic approach is to align each frame #i (with i>1) with the first frame, as follows:
Local feature detection, for instance via SIFT or SURF (link)
Descriptor extraction (link)
Descriptor matching (link)
Alignment estimation via perspective transformation (link)
Transform image #i to match image 1 using the estimated transformation (link)
I am trying to detect a vehicle in an image (actually a sequence of frames in a video). I am new to OpenCV and Python and work under Windows 7.
Is there a way to get horizontal edges and vertical edges of an image and then sum up the resultant images into respective vectors?
Is there Python code or a function available for this?
I looked at this and this but could not work out how to do it.
You may use the following image for illustration.
EDIT
I was inspired by the idea presented in the following paper (sorry if you do not have access).
Betke, M.; Haritaoglu, E. & Davis, L. S. "Real-time multiple vehicle detection and tracking from a moving vehicle." Machine Vision and Applications, Springer-Verlag, 2000, 12, 69-83.
I would take a look at the squares example for OpenCV, posted here. It uses Canny and then does a contour find to return the sides of each square. You should be able to modify this code to get the horizontal and vertical lines you are looking for. Here is a link to the documentation for the Python call of Canny. It is rather helpful for all-around edge detection. In about an hour I can get home and give you a working example of what you are after.
Do some reading on Sobel filters.
http://en.wikipedia.org/wiki/Sobel_operator
You can basically get vertical and horizontal gradients at each pixel.
Here is the OpenCV function for it.
http://docs.opencv.org/modules/imgproc/doc/filtering.html?highlight=sobel#sobel
Once you have these filtered images, you can collect statistics column-wise or row-wise, decide whether there is an edge, and get its location.
Typically geometrical approaches to object detection are not hugely successful as the appearance model you assume can quite easily be violated by occlusion, noise or orientation changes.
Machine learning approaches typically work much better in my opinion and would probably provide a more robust solution to your problem. Since you appear to be working with OpenCV, you could take a look at Cascade Classifiers, for which OpenCV provides both Haar wavelet and local binary pattern (LBP) feature based classifiers.
The link I have provided is to a tutorial with very complete steps explaining how to create a classifier with several prewritten utilities. Basically you will create a directory with 'positive' images of cars and a directory with 'negative' images of typical backgrounds. The utility opencv_createsamples can be used to create training images, warped to simulate different orientations and average intensities, from a small set of images. You then run the utility opencv_traincascade, setting a few command line parameters to select different training options, and it outputs a trained classifier for you.
Detection can be performed using either the C++ or the Python interface with this trained classifier.
For instance, using Python you can load the classifier and perform detection on an image getting back a selection of bounding rectangles using:
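import cv2
# Both paths are placeholders: an input image and the trained cascade XML file.
image = cv2.imread('path/to/image')
cc = cv2.CascadeClassifier('path/to/classifierfile')
# Each detection is returned as a bounding rectangle (x, y, w, h).
objs = cc.detectMultiScale(image)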