I'm trying to extract the black region from an image using TensorFlow. Until now I have been using OpenCV, but it fails to get the whole region because the grayscale is very complicated.
The image I'm using is a photo of an electric meter. The whole meter is (normally) white except for the part with the numbers, which is black. I want to isolate this part in order to read the numbers later on.
To date, I have been using the findContours function from OpenCV with a fixed threshold.
I have seen that TensorFlow is very powerful, so I think this should not be a problem, but I can't find any documentation. Any hints? Thanks!
Tensorflow is a general purpose math library that is unique in two respects:
It provides automatic differentiation.
It has efficient kernels built to run on either the CPU or GPU.
It does have a library of image functions, but it's nowhere near as extensive as OpenCV, and will never be. Those are mostly for data augmentation (as it pertains to ML) and data loading.
Note that you can run OpenCV code on the GPU in many cases (I'm not sure about findContours in particular), so sticking with OpenCV should be considered.
But within tensorflow you would have to re-write that function yourself. In looking at the code (which I provided a link to in your question) it doesn't look very hard to do. You could replicate that in symbolic tensorflow operations in relatively short order, but nothing like that exists pre-built in tensorflow. Nor is it likely to in the future.
I'm trying to learn computer vision and more specifically open-cv in python.
I want to make a program that tracks my barbell in a video and shows me its path. (I know apps like this exist, but I want to make it myself.) I tried using Canny edge detection and the HoughCircles function, but I seem to get everything except a good result.
I have been using this code to find the edges of my image:
gray = cv.cvtColor(src=img, code=cv.COLOR_BGR2GRAY)
blur = cv.blur(gray, (2,2))
canny = cv.Canny(blur, 60, 60)
And then this code to find the circle:
circles = cv.HoughCircles(canny, cv.HOUGH_GRADIENT, dp=2, minDist=1000, circles=None,maxRadius=50)
This is the result:
left = original image with detected circle // right = canny image
Is this the right way to go or should I use another method?
Training a YOLO model to detect the barbell will work better than anything you have tried with OpenCV. You need at least 500 images, which can easily be found on the internet. This tutorial is a kick-start tutorial on YOLO. Give it a try.
If you tweak the parameters of HoughCircles it may recognize the barbell [EDIT: but with more preprocessing, gamma correction, blurring etc., so better not], however OpenCV has many algorithms for such object tracking - only a region from the image has to be specified first (if that's OK).
In your case the object is always visible and is not changing much, so I guess many of the available algorithms would work fine.
OpenCV has a built-in function for selection:
initBB = cv2.selectROI("Frame", frame, fromCenter=False, showCrosshair=True)
See this tutorial for tracking: https://www.pyimagesearch.com/2018/07/30/opencv-object-tracking/
The author's summary of the suggested tracker is:
CSRT Tracker: Discriminative Correlation Filter (with Channel and Spatial Reliability). Tends to be more accurate than KCF but slightly slower. (minimum OpenCV 3.4.2)
Use CSRT when you need higher object tracking accuracy and can tolerate slower FPS throughput
I guess accuracy is what you want, if it is for offline usage.
Can you share a sample video?
What exactly is your problem? Why do you want to track the barbell? Do you need semantic segmentation or ordinary detection? These are important questions. Canny is a very basic approach and needs a very stable background to work. That is why deep learning is used for this kind of problem. If we are talking about deep learning, you can use Mask R-CNN, YOLOv4, etc.; there are many solutions available.
I'm working on a Raspberry Pi, an embedded Linux platform running Raspbian Jessie with Python 2.7 already installed. I have OpenCV algorithms that must run in real time and must apply several Haar cascade classifiers to the same image. Is there any way to reduce the time these operations take, such as multithreading?
I have also heard about GPU computation, but I don't know where to start.
Thank you for the help.
If you haven't already done so, you should consider the following:
Reduce image size to the minimum required size for recognizing the target object for each classifier. If different objects require different resolutions, you can even use a set of copies of the original image, with different sizes.
Identify search regions for each classifier and thereby reduce the search area. For example, if you are searching for face landmarks, you can define search regions for the left eye, right eye, nose, and mouth after running the face detector and finding the rectangle that contains the face.
I am not sure that optimization is going to help much, because OpenCV already does some hardware optimization.
OpenCV with OpenCL for parallel processing may be of use to you, as the Raspberry Pi 3 is quad-core. I do not think the GPUs that come with these boards are powerful enough. You could try Qualcomm's DSP for computer vision and neural networks. Nvidia's Tegra GPUs are another option.
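On the multithreading part of the question, a thread pool is one simple way to try using the four cores. The sketch below makes several assumptions: it needs Python 3 (on Python 2.7, `concurrent.futures` is only available as a backport), `run_classifiers` is a made-up helper name, and it relies on OpenCV's Python bindings generally releasing the GIL inside native calls. Whether it actually gives a speedup on a given board is something you would have to measure.

```python
from concurrent.futures import ThreadPoolExecutor

def run_classifiers(classifiers, gray, max_workers=4):
    """Apply every classifier (name -> object with detectMultiScale) to one image concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one detection job per classifier; all jobs share the same read-only image.
        futures = {name: pool.submit(clf.detectMultiScale, gray)
                   for name, clf in classifiers.items()}
        # Collect the results; result() blocks until the corresponding job has finished.
        return {name: future.result() for name, future in futures.items()}
```

Each classifier object should only be used by one thread at a time, which is the case here because every classifier gets exactly one job per call.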
I am trying to compute a rough "quality" metric for a video, which takes the following into consideration:
"Smoothness" of video; i.e., the opposite of how "choppy" it is
Image quality; i.e., if there are a lot of compression artifacts, the quality score should decrease
I came across https://github.com/aizvorski/scikit-video, but the code seems to be littered with FIXMEs and TODOs, and on top of that there are barely any comments or documentation.
Is there a Python library, or even a program with a CLI, for computing video quality, or perhaps a set of libraries that will help me compute the above two metrics separately?
Image Quality
I would think that "Image Quality" is largely a function of bit-depth (or effective bit-depth) and bit-rate.
You can parse ffmpeg output to get this information. PIL or PyQt/PySide can also do this.
Smoothness
For smoothness, you may need to use some type of optical flow algorithm and get deltas from frame to frame.
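As a rough illustration of the frame-to-frame deltas, here is a sketch that uses plain frame differencing as a cheap stand-in for real optical flow. The names `frame_deltas` and `choppiness` are hypothetical, and treating the spread of the deltas as a choppiness score is an assumption of this example, not an established metric.

```python
import numpy as np

def frame_deltas(frames):
    """Mean absolute pixel change between each pair of consecutive frames."""
    # Convert to float so that subtracting uint8 frames cannot wrap around.
    frames = [np.asarray(f, dtype=np.float32) for f in frames]
    return np.array([np.mean(np.abs(b - a)) for a, b in zip(frames, frames[1:])])

def choppiness(frames):
    """Standard deviation of the deltas: near 0 for steady change, larger for stutter."""
    return float(np.std(frame_deltas(frames)))
```

A video whose content changes by a similar amount every frame scores close to zero, while dropped or repeated frames show up as alternating small and large deltas and push the score up.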
OpenCV looks like a project that does many of these things.
I am trying to detect a vehicle in an image (actually a sequence of frames in a video). I am new to OpenCV and Python, and I work under Windows 7.
Is there a way to get the horizontal edges and vertical edges of an image and then sum up the resulting images into respective vectors?
Is there Python code or a function available for this?
I looked at this and this but could not work out how to do it.
You may use the following image for illustration.
EDIT
I was inspired by the idea presented in the following paper (sorry if you do not have access).
Betke, M.; Haritaoglu, E. & Davis, L. S. Real-time multiple vehicle detection and tracking from a moving vehicle Machine Vision and Applications, Springer-Verlag, 2000, 12, 69-83
I would take a look at the squares example for opencv, posted here. It uses canny and then does a contour find to return the sides of each square. You should be able to modify this code to get the horizontal and vertical lines you are looking for. Here is a link to the documentation for the python call of canny. It is rather helpful for all around edge detection. In about an hour I can get home and give you a working example of what you are wanting.
Do some reading on Sobel filters.
http://en.wikipedia.org/wiki/Sobel_operator
You can basically get vertical and horizontal gradients at each pixel.
Here is the OpenCV function for it.
http://docs.opencv.org/modules/imgproc/doc/filtering.html?highlight=sobel#sobel
Once you have these filtered images, you can collect statistics column-wise and row-wise, decide whether a given column or row contains an edge, and get its location.
Typically geometrical approaches to object detection are not hugely successful as the appearance model you assume can quite easily be violated by occlusion, noise or orientation changes.
In my opinion, machine learning approaches typically work much better and would probably provide a more robust solution to your problem. Since you appear to be working with OpenCV, you could take a look at Cascade Classifiers, for which OpenCV provides both Haar wavelet and local binary pattern feature-based classifiers.
The link I have provided is to a tutorial with very complete steps explaining how to create a classifier with several prewritten utilities. Basically, you create a directory with 'positive' images of cars and a directory with 'negative' images of typical backgrounds. The utility opencv_createsamples can be used to create training images, warped to simulate different orientations and average intensities, from a small set of images. You then run the utility opencv_traincascade, setting a few command line parameters to select different training options, and it outputs a trained classifier for you.
Detection can be performed using either the C++ or the Python interface with this trained classifier.
For instance, using Python you can load the classifier and perform detection on an image getting back a selection of bounding rectangles using:
import cv2

image = cv2.imread('path/to/image')
cc = cv2.CascadeClassifier('path/to/classifierfile')
objs = cc.detectMultiScale(image)  # bounding rectangles as (x, y, w, h)
I have a camera that will be stationary, pointed at an indoors area. People will walk past the camera, within about 5 meters of it. Using OpenCV, I want to detect individuals walking past - my ideal return is an array of detected individuals, with bounding rectangles.
I've looked at several of the built-in samples:
None of the Python samples really apply
The C blob tracking sample looks promising, but doesn't accept live video, which makes testing difficult. It's also the most complicated of the samples, making extracting the relevant knowledge and converting it to the Python API problematic.
The C 'motempl' sample also looks promising, in that it calculates a silhouette from subsequent video frames. Presumably I could then use that to find strongly connected components and extract individual blobs and their bounding boxes - but I'm still left trying to figure out a way to identify blobs found in subsequent frames as the same blob.
Is anyone able to provide guidance or samples for doing this - preferably in Python?
The latest SVN version of OpenCV contains an (undocumented) implementation of HOG-based pedestrian detection. It even comes with a pre-trained detector and a python wrapper. The basic usage is as follows:
from cv import *
storage = CreateMemStorage(0)
img = LoadImage(file) # or read from camera
found = list(HOGDetectMultiScale(img, storage, win_stride=(8,8),
                                 padding=(32,32), scale=1.05, group_threshold=2))
So instead of tracking, you might just run the detector in each frame and use its output directly.
See src/cvaux/cvhog.cpp for the implementation and samples/python/peopledetect.py for a more complete python example (both in the OpenCV sources).
Nick,
What you are looking for is not people detection, but motion detection. If you tell us a lot more about what you are trying to solve/do, we can answer better.
Anyway, there are many ways to do motion detection, depending on what you are going to do with the results. The simplest would be frame differencing followed by thresholding, while a more complex one would be proper background modeling -> foreground subtraction -> morphological ops -> connected component analysis, followed by blob analysis if required. Download the OpenCV code and look in the samples directory. You might find what you are looking for. Also, there is an O'Reilly book on OpenCV.
Hope this helps,
Nand
This is clearly a non-trivial task. You'll have to look into scientific publications for inspiration (Google Scholar is your friend here). Here's a paper about human detection and tracking: Human tracking by fast mean shift mode seeking
This is similar to a project we did as part of a Computer Vision course, and I can tell you right now that it is a hard problem to get right.
You could use foreground/background segmentation, find all blobs, and then decide that each one is a person. The problem is that this will not work very well, since people tend to walk together, pass each other, and so on, so a blob might very well consist of two people, and you will see that blob splitting and merging as they walk along.
You will need some method of discriminating between multiple people in one blob. This is not a problem I expect anyone to be able to answer in a single SO post.
My advice is to dive into the available research and see if you can find anything there. The problem is not unsolvable, considering that there exist products which do this: Autoliv has a product that detects pedestrians using an IR camera on a car, and I have seen other products that count customers entering and exiting stores.