Steps I have followed:
Background subtraction with preprocessing.
Contour detection.
With these two steps, I am able to draw contours around all moving cars in the video. But how do I track the contours to count the number of cars in the video?
I searched around a bit and there seem to be different techniques, like the Kalman filter, Lucas-Kanade, and optical flow, but I don't know which one to use for my use case. I am using opencv3-python.
Actually, this seems like a general question, but I am going to give a point of view. (I had the same problem myself, but with point clouds; although that may differ from what you asked, I hope it gives you an idea of how to proceed.)
Most of the time, once your contours are detected, tracking moving objects in the scene involves three main steps:
Feature Matching
This step is about detecting features of your object in frame N and matching them to features of objects in frame N+1. OpenCV provides standard algorithms and descriptors for the detection part (SURF, SIFT, ORB, ...), as well as for the feature matching part.
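For illustration, here is a minimal sketch of frame-to-frame matching with ORB (a free alternative to SURF/SIFT that ships with opencv3-python); the frame file names are placeholders, and in practice the frames would come from cv2.VideoCapture:

import cv2

# Two consecutive frames (placeholder file names)
frame_n = cv2.imread('frame_n.png', cv2.IMREAD_GRAYSCALE)
frame_n1 = cv2.imread('frame_n1.png', cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(frame_n, None)
kp2, des2 = orb.detectAndCompute(frame_n1, None)

# Brute-force matching with Hamming distance (fits ORB's binary descriptors)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# Each match links a feature in frame N to its position in frame N+1
for m in matches[:20]:
    print(kp1[m.queryIdx].pt, '->', kp2[m.trainIdx].pt)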
Kalman Filter
The Kalman filter is used to get an initial prediction (generally by applying a constant-velocity model to your objects). For each appearance point of the track, a correspondence search is executed; if the average distance is above a specified threshold, feature matching is applied to get a better initial estimate.
In order to do that, you need to model your problem in a way that it can be solved by a Kalman filter.
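As a minimal sketch of such a model (state = position plus velocity, measurement = the contour centroid; the noise values and coordinates below are placeholders to tune):

import cv2
import numpy as np

# Constant-velocity model: state [x, y, vx, vy], measurement [x, y]
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], np.float32)
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1

prediction = kf.predict()                 # where the object should be in frame N+1

# Correct with the observed contour centroid (illustrative coordinates)
measurement = np.array([[150.0], [200.0]], np.float32)
kf.correct(measurement)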
Dynamic Mapping
After the motion estimation, the appearance of each track is updated. In contrast to standard mapping techniques, dynamic mapping is an approach that tries to accumulate appearance details of both static and dynamic objects, thus refining your motion estimation and tracking process.
There are a lot of papers out there; you may want to take a further look at these:
Robust Visual Tracking and Vehicle Classification via Sparse Representation
Motion Estimation from Range Images in Dynamic Outdoor Scenes
Multiple Objects Tracking using CAMshift Algorithm in OpenCV
Hope it helps!
There are many tutorials on how to calculate the distance between a camera and an object. Is it possible to calculate the approximate distance between a detected person and the camera using OpenCV?
Yes, it is possible. As mentioned by #hkchengrex, consider your face an object. There are plenty of methods; of those described following that link, I'd recommend SIFT feature matching.
Here are roughly the required steps:
1. Take a picture of the person and measure the distance manually.
2. Crop this picture to contain only the person.
3. Extract the image features (e.g. a SIFT descriptor).
4. Take a second picture with the same person, but at an unknown distance.
5. Detect the person via SIFT matching (see the link above).
6. Compute a transformation between those two SIFT feature vectors (see the sketch after this list).
7. Apply the transformation to the distance measured in step 1.
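A minimal sketch of steps 5-6, assuming OpenCV >= 4.4 where SIFT lives in the main module (older builds need cv2.xfeatures2d.SIFT_create() from opencv-contrib); file names are placeholders:

import cv2
import numpy as np

ref = cv2.imread('person_cropped.png', cv2.IMREAD_GRAYSCALE)   # from step 2
new = cv2.imread('new_scene.png', cv2.IMREAD_GRAYSCALE)        # from step 4

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(ref, None)
kp2, des2 = sift.detectAndCompute(new, None)

# Ratio-test matching, as in the classic SIFT tutorials
bf = cv2.BFMatcher()
good = [m for m, n in bf.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

# A similarity transform captures the scale change of the person
M, _ = cv2.estimateAffinePartial2D(src, dst)
scale = np.sqrt(M[0, 0] ** 2 + M[0, 1] ** 2)
print('the person appears %.2f times the reference size' % scale)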
Best start at the link provided and further SIFT tutorials in OpenCV. The required approach is a very simple one and will only work if the person in the picture being examined is very similar to the person in picture one. For more advanced approaches, I'd refer to scientific papers; search for "person detection".
In reply to the comments
TL;DR: a person with the same real-world height/width who is displayed smaller/larger in the image can be measured with regard to distance.
The depicted approach works under the hood as follows. The person (= cropped image) captured at step 2 can be found in any future image as long as he/she appears very similar. In the new image it will give you the rectangular region where the person is located. As the dimensions of this rectangle are now smaller/larger, you can use that change to compute the transformation (which is basically the intercept theorem) and thereby the new distance.
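In numbers (illustrative values only), the intercept theorem says the apparent size is inversely proportional to the distance:

# Person measured 400 px tall at the manually measured 2.0 m (step 1)
ref_height_px, ref_distance_m = 400.0, 2.0
# In the new image the matched rectangle is 200 px tall
new_height_px = 200.0
new_distance_m = ref_distance_m * ref_height_px / new_height_px   # -> 4.0 m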
What does this mean for a general approach measuring ANY person?
In case the person has the same width/height as the person from step 2, this process works flawlessly. In case they are of similar but not identical height/width, there will be calculation errors, but the results MAY still suffice for your use case. (You can define a generic human, e.g. 1.8 m of height and XX of width.) Nevertheless, SIFT might be a bit too specific here; sorry, I'd just refer you to Google to see what works best.
If your camera is fixed and the recorded scene doesn't change too much, I'd just define a ground plane and manually annotate every pixel projected onto this plane with a depth value. Then you only have to detect the arbitrary person, see where their feet touch the ground plane, and look up that pixel's predefined depth value.
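A minimal sketch of that lookup, assuming a hand-annotated depth image stored alongside the application (the file name and the detector producing the bounding box are placeholders):

import numpy as np

# Same size as the camera frame; each ground pixel stores its distance in
# metres (0 where no annotation exists)
depth_map = np.load('ground_plane_depths.npy')

def distance_of_person(bbox, depth_map):
    # bbox = (x, y, w, h) from any person detector; take the feet point,
    # i.e. the bottom-centre of the box, and look up its annotated depth
    x, y, w, h = bbox
    row = min(y + h, depth_map.shape[0] - 1)
    col = min(x + w // 2, depth_map.shape[1] - 1)
    return depth_map[row, col]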
If the use case has higher demands, you'd have to measure depth in a more complex fashion. This can be done using a stereo camera rig, a depth sensor, or an image sequence via structure from motion.
So there is no "one can do all" method in OpenCV; it always depends on the use case, the environment, and a combination of quite elaborate methods.
I have a grid in my pictures (they come from a camera). After binarization they look like this (red is 255, blue is 0):
What is the best way to detect the grid nodes (crosses) in these pictures?
Note: the grid is distorted non-uniformly from cell to cell.
Update:
Some examples of different grids and their distortions before binarization:
In cases like this I first try to find the best starting point.
So, first I thresholded your image (I could also have skeletonized it and only then thresholded, but that way some data would be lost irrecoverably):
Then I tried loads of tools to get the most prominent features emphasized in bulk. Finally, playing with GIMP's G'MIC plugin, I found this:
Based on the above I prepared a universal pattern that looks like this:
Then I just took a part of this image:
To help determine the angle, I made a local Fourier frequency graph; this way you can obtain the pattern's local angle:
Then you can use a simple trick that works fast on modern GPUs: take the difference, like this (a missed case):
When there is a hit, the difference is minimal. What I had in mind when talking about local maxima refers more or less to how the resulting difference should be treated: it wouldn't be wise to weight the difference outside the pattern's circle the same as the difference inside it, due to scale-factor sensitivity, so the inside with the cross should be weighted more in the algorithm used. Nevertheless, the pattern differenced with the image looks like this:
As you can see, it's possible to differentiate between a hit and a miss. What is crucial is to set a proper tolerance and to use the Fourier frequencies to obtain the angle (with thresholded images, the Fourier spectrum usually follows the overall orientation of the analyzed image).
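To give a concrete flavour of the pattern-difference idea, here is a rough OpenCV sketch using normalized cross-correlation instead of G'MIC; the cross size, thickness, and tolerance are assumptions to tune, and for rotated grids the pattern would first be rotated by the locally estimated Fourier angle:

import cv2
import numpy as np

img = cv2.imread('grid_binarized.png', cv2.IMREAD_GRAYSCALE)   # placeholder

# Synthetic cross pattern, roughly matching the local grid scale
size, thickness = 21, 3
pattern = np.zeros((size, size), np.uint8)
pattern[size // 2 - thickness // 2:size // 2 + thickness // 2 + 1, :] = 255
pattern[:, size // 2 - thickness // 2:size // 2 + thickness // 2 + 1] = 255

# High response = hit (a grid node), low response = miss
response = cv2.matchTemplate(img, pattern, cv2.TM_CCOEFF_NORMED)
ys, xs = np.where(response > 0.6)          # tolerance threshold, tune per image
print(len(xs), 'candidate node pixels')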
The above way can later be complemented by Harris detection, or Harris detection can be modified using the above patterns to distinguish two to four closely placed corners.
Unfortunately, all of these techniques are scale-dependent in such a case and should be adjusted to the scale properly.
There are also other approaches to your problem, for instance watershedding it first, then extracting regions, then disregarding the foreground, then simplifying the curves, then checking whether their corners form a consecutive equidistant pattern. But my gut feeling is that this would not produce correct results.
One more thing: libgmic is the G'MIC library, from which you can use the transformations shown above directly or through bindings, or take the algorithms and rewrite them in your app.
I suppose that this can be a potential answer (it was actually mentioned in the comments): http://opencv.itseez.com/2.4/modules/imgproc/doc/feature_detection.html?highlight=hough#houghlinesp
There may also be other approaches, e.g. using skimage tools for feature detection.
But actually, instead of the Hough transform, which could contribute to huge bloat and a lack of precision (it fits straight lines), I would suggest trying Harris corner detection - http://docs.opencv.org/2.4/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html .
This can be further adjusted to your specific issue (cross corners, so the local maxima should depend on the cross distribution). Then some curve approximation can be done based on the points obtained.
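A minimal sketch of that adjustment (the parameter values are assumptions; the dilation comparison is a simple non-maximum suppression so that each cross yields one point):

import cv2
import numpy as np

img = cv2.imread('grid_binarized.png', cv2.IMREAD_GRAYSCALE)   # placeholder

harris = cv2.cornerHarris(np.float32(img), blockSize=5, ksize=3, k=0.04)

# Keep only strong responses that are also local maxima
strong = harris > 0.01 * harris.max()
local_max = harris == cv2.dilate(harris, np.ones((11, 11), np.uint8))
nodes = np.argwhere(strong & local_max)    # (row, col) of candidate grid nodes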
Maybe you could calculate the Hough lines and determine their intersections. The OpenCV documentation can be found here.
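A rough sketch of that idea (the HoughLinesP parameters are assumptions to tune; note that with a strongly distorted grid the straight-line assumption only holds locally):

import cv2
import numpy as np

img = cv2.imread('grid_binarized.png', cv2.IMREAD_GRAYSCALE)   # placeholder
lines = cv2.HoughLinesP(img, 1, np.pi / 180, threshold=80,
                        minLineLength=30, maxLineGap=5)
segments = [] if lines is None else [l[0] for l in lines]

def intersection(l1, l2):
    # Intersection of the two infinite lines through segments (x1, y1, x2, y2)
    x1, y1, x2, y2 = map(float, l1)
    x3, y3, x4, y4 = map(float, l2)
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-9:
        return None                        # (nearly) parallel
    px = ((x1 * y2 - y1 * x2) * (x3 - x4) - (x1 - x2) * (x3 * y4 - y3 * x4)) / d
    py = ((x1 * y2 - y1 * x2) * (y3 - y4) - (y1 - y2) * (x3 * y4 - y3 * x4)) / d
    return px, py

nodes = []
for i in range(len(segments)):
    for j in range(i + 1, len(segments)):
        p = intersection(segments[i], segments[j])
        if p is not None:
            nodes.append(p)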
I am trying to detect a vehicle in an image (actually in a sequence of frames from a video). I am new to OpenCV and Python and work under Windows 7.
Is there a way to get the horizontal edges and vertical edges of an image and then sum up the resulting images into respective vectors?
Is there Python code or a function available for this?
I looked at this and this but could not figure out how to do it.
You may use the following image for illustration.
EDIT
I was inspired by the idea presented in the following paper (sorry if you do not have access):
Betke, M.; Haritaoglu, E. & Davis, L. S.: Real-time multiple vehicle detection and tracking from a moving vehicle. Machine Vision and Applications, Springer-Verlag, 2000, 12, 69-83.
I would take a look at the squares example for OpenCV, posted here. It uses Canny and then does a contour find to return the sides of each square. You should be able to modify this code to get the horizontal and vertical lines you are looking for. Here is a link to the documentation for the Python call of Canny; it is rather helpful for all-around edge detection. In about an hour I can get home and give you a working example of what you want.
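Until then, a minimal sketch of that squares-style pipeline (the thresholds are assumptions; the [-2] index keeps it working across OpenCV versions, whose findContours return values differ):

import cv2

img = cv2.imread('vehicle.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path
edges = cv2.Canny(img, 50, 150)                         # hysteresis thresholds to tune

contours = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

for c in contours:
    # Approximate each contour; 4 vertices suggests a rectangular shape
    approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
    if len(approx) == 4:
        print(cv2.boundingRect(approx))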
Do some reading on Sobel filters.
http://en.wikipedia.org/wiki/Sobel_operator
You can basically get vertical and horizontal gradients at each pixel.
Here is the OpenCV function for it.
http://docs.opencv.org/modules/imgproc/doc/filtering.html?highlight=sobel#sobel
Once you have these filtered images, you can collect statistics column-wise and row-wise, decide whether there is an edge, and get its location.
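A minimal sketch of exactly that (the file name is a placeholder):

import cv2
import numpy as np

img = cv2.imread('vehicle.jpg', cv2.IMREAD_GRAYSCALE)   # placeholder path

# dx=1 responds to vertical edges, dy=1 to horizontal edges
grad_x = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)
grad_y = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)

# Column-wise sum of |grad_x|: vertical-edge energy per column;
# row-wise sum of |grad_y|: horizontal-edge energy per row
vertical_profile = np.abs(grad_x).sum(axis=0)
horizontal_profile = np.abs(grad_y).sum(axis=1)

# Peaks in these profiles are candidate vehicle boundaries
print(vertical_profile.argmax(), horizontal_profile.argmax())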
Typically, geometric approaches to object detection are not hugely successful, as the appearance model you assume can quite easily be violated by occlusion, noise, or orientation changes.
Machine learning approaches typically work much better in my opinion and would probably provide a more robust solution to your problem. Since you appear to be working with OpenCV, you could take a look at cascade classifiers, for which OpenCV provides both a Haar wavelet and a local binary pattern feature based classifier.
The link I have provided is a tutorial with very complete steps explaining how to create a classifier with several prewritten utilities. Basically, you will create a directory with 'positive' images of cars and a directory with 'negative' images of typical backgrounds. The utility opencv_createsamples can be used to create training images, warped to simulate different orientations and average intensities, from a small set of images. You then use the utility opencv_traincascade, setting a few command-line parameters to select different training options, and it outputs a trained classifier for you.
Detection can be performed using either the C++ or the Python interface with this trained classifier.
For instance, using Python you can load the classifier and perform detection on an image, getting back a selection of bounding rectangles, using:
import cv2

image = cv2.imread('path/to/image')
cc = cv2.CascadeClassifier('path/to/classifierfile')
# detectMultiScale returns a list of (x, y, w, h) bounding rectangles
objs = cc.detectMultiScale(image)
I am working with Python and OpenCV on a piece of software that should compare two images and return a value representing their similarity.
I tried first with histograms, and then with SIFT and SURF, but the first method is not localized, while the second and third are slow and do not fit my dataset content very well (mostly pictures of crowds).
I would like to avoid a people detector, so I would like to apply some algorithm based on edge and texture comparison. Can you give some hints or online resources?
This is an interesting, although challenging problem! Recently, I came across an article by the University of California, San Diego's Vision Group about classifying scenes of crowds. Here is the link: Urban Tribes: Analyzing Group Photos from a Social Perspective.
As you can see, there is no one-size-fits-all solution, but I would think that this should provide a good place to start.
What you're asking for is a general image classification framework.
Try googling: image classification, scene classification, image indexing and retrieval.
In most cases, you'll have to use a multimodal descriptor: use color, texture, entropy, keypoints, and edge histograms.
You can read this and try that.
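As a toy illustration of such a multimodal descriptor, here is a sketch that concatenates a hue/saturation histogram with an edge-orientation histogram and compares two images by cosine similarity (the bin counts and the similarity measure are assumptions; file names are placeholders):

import cv2
import numpy as np

def describe(path):
    img = cv2.imread(path)
    # Color part: 2D hue/saturation histogram
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    color = cv2.calcHist([hsv], [0, 1], None, [18, 8],
                         [0, 180, 0, 256]).flatten()
    # Texture/edge part: histogram of gradient orientations
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1)
    edges, _ = np.histogram(np.arctan2(gy, gx), bins=16, range=(-np.pi, np.pi))
    d = np.concatenate([color, edges]).astype(np.float64)
    return d / (np.linalg.norm(d) + 1e-9)

similarity = float(np.dot(describe('a.jpg'), describe('b.jpg')))  # 1.0 = identical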
I'm learning the basics of OpenCV, and I thought a good project would help make the studying more fun. After considering some ideas, I came up with a material recognition project. Let's say I have a conveyor transporting material for the production of some product (the product doesn't really matter, though). There are three materials, and the illumination conditions will vary (natural light from the morning through the afternoon, and a light bulb at night). That is the problem description.
I was thinking of using sand, wood, and rocks, which are easy to get, and placing them on a plastic surface. After taking a picture, I'll compute a histogram to get the color, and use this color to identify the material. But since the lighting conditions change over time, when I take the photograph and compute the histogram, the color will change and the material won't be recognized properly. And I thought: what if I were to use sand and dust? They have very similar colors but different textures. Is there something that can help me with that?
I just want some ideas, and maybe some expert in the field could guide me.
Quite an advanced idea for a starting project. The differences in lighting could be tackled by using the HSV or another color space and taking the hue component. The matter of "texture", however, can be handled in two ways:
Feature descriptors: If you deal with the grey-level image, there is a set of texture descriptors based on the Grey Level Co-occurrence Matrix (GLCM) that gives a measure of the textures of different regions in the image. This is present in Matlab; for OpenCV there is the following code: in C.
So you could take several standard shots of the sand, wood, and rocks and use them as training samples for a classifier - NN, SVM, OpenCV's Haar classifier, whatever - and then train it with negative samples. The feature vector for the classifier will be the GLCM output for each picture. Then run it on the actual pictures and see how accurate it is. (A rough Python sketch of the GLCM features follows at the end of this answer.)
Texture Roughness: I came across this useful paper that shows a single-valued measure of the 'roughness' of a texture, called the Eigen transform. The calculations are quite simple, especially if you use OpenCV's SVD() for the eigenvalue calculations. The result of the Eigen transform gives a value corresponding to the roughness of that portion, which can be used to separate out the required portions.
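For completeness, OpenCV itself has no GLCM built in, but scikit-image does; a rough sketch of extracting a GLCM feature vector per training image (the distances, angles, and chosen properties are assumptions):

import numpy as np
from skimage.feature import graycomatrix, graycoprops  # 'greycomatrix' in old versions

def glcm_features(gray_img):
    # Texture feature vector for one 8-bit grayscale image or patch
    glcm = graycomatrix(gray_img, distances=[1], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return np.array([graycoprops(glcm, p).mean()
                     for p in ('contrast', 'homogeneity', 'energy', 'correlation')])

And one plausible reading of the Eigen transform idea in plain NumPy (the fraction of singular values to average is an assumption; see the paper for the exact definition):

import numpy as np

def roughness(patch, keep=0.5):
    # Smooth patches are close to low rank, so the mean of the smaller
    # singular values grows with texture roughness
    s = np.linalg.svd(patch.astype(np.float64), compute_uv=False)
    k = max(1, int(len(s) * keep))
    return s[-k:].mean()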