I'm currently an intern at a quality inspection company. My job is to write a program that can detect faulty products (for example, a missing screw). They take a picture of every single product. My idea is to choose one image to serve as a benchmark, compare the other images to it using the SSIM score, and maybe mark the faulty part with a rectangle. Is this a viable idea? (It's a strange internship, because it seems like I'm the only one there who can code, which is why I'm asking here.)
It sounds like a good idea if your goal is to classify different objects within images by comparing them with a benchmark image.
But in my experience, the SSIM score is sensitive to changes in angle, lighting, and environment.
So in conclusion: if your goal is to classify different objects in images, your idea would work. But if your goal is to tell apart instances of exactly the same object, it might not be able to classify them.
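To make the idea in the question concrete, here is a minimal sketch of comparing a product photo to a benchmark image and boxing the region that differs. It is a simplified, block-wise variant of SSIM written with NumPy only (non-overlapping 8x8 windows rather than the sliding Gaussian window of the standard definition), and it assumes both images are grayscale, the same size, and already aligned. The function names `block_ssim` and `defect_box` and the 0.5 threshold are my own placeholders, not part of any library.

```python
import numpy as np

def block_ssim(ref, img, block=8, data_range=255.0):
    """Per-block SSIM map for two aligned, equal-sized grayscale images."""
    ref = ref.astype(np.float64)
    img = img.astype(np.float64)
    h, w = ref.shape
    h, w = h - h % block, w - w % block  # crop to a whole number of blocks

    def blocks(a):
        # (rows, cols) -> (block_rows, block_cols, pixels_per_block)
        tiles = a[:h, :w].reshape(h // block, block, w // block, block)
        return tiles.swapaxes(1, 2).reshape(h // block, w // block, -1)

    x, y = blocks(ref), blocks(img)
    mx, my = x.mean(-1), y.mean(-1)
    vx, vy = x.var(-1), y.var(-1)
    cov = ((x - mx[..., None]) * (y - my[..., None])).mean(-1)
    c1 = (0.01 * data_range) ** 2  # usual SSIM stabilising constants
    c2 = (0.03 * data_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

def defect_box(ssim_map, block=8, threshold=0.5):
    """Rectangle (x, y, w, h) in pixels around all low-SSIM blocks, or None."""
    rows, cols = np.nonzero(ssim_map < threshold)
    if rows.size == 0:
        return None
    x0, y0 = cols.min() * block, rows.min() * block
    x1, y1 = (cols.max() + 1) * block, (rows.max() + 1) * block
    return int(x0), int(y0), int(x1 - x0), int(y1 - y0)
```

The mean of the map can serve as the overall score. Because of the sensitivity to angle and lighting mentioned above, this only has a chance of working if the camera, product position, and illumination are fixed between shots.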
Related
Most neural nets for images are great at detecting objects and labeling them. They can take a picture and label some of the objects in it. -- think YOLOv5
Template matching, on the other hand, looks for a template that is mostly the same in a larger image. -- opencv2.templateMatching
What I hope to have is something "in between" the two. Given a manually entered template image, give me the rectangles in a larger picture where this template occurs, but the matching must be scale invariant and transform invariant (within reason).
The opencv2 version is too strict in what it counts as a match: a 10% size change can make matches fail, and slight rotations can cause it to fail. This makes it not robust enough to be useful.
Take for instance the following (below), where we see highlighted pictures of airplanes.
This would be the ideal output.
The input would be 1 of the small green squares, ideally any 1 of them would work.
Are there things out there that can do this already?
Essentially, an opencv2.templateMatching that is more reasonably "fuzzy".
Or, if I were doing this with balls, I would use a picture of a ball (or even a baseball) as a clean template, and then highlight the 3 balls in the following image.
I don't need image recognition, I need image... similarity with a given template (something better than opencv2.templateMatching, because that one is terrible).
For those interested in the future, I ultimately had to go with a full YOLOv5 network to do a custom training.
I was unable to find a "cheaper" solution.
I'm new to object detection and computer vision, but I'm working on a project where I'm taking pictures of disks and I'm hoping to receive a confidence level. For example, if the disk is kind of round but slightly jagged on the edges it can return "80% circle". Is this possible?
I would check out Hough Circles if I were you. I used this technique for several projects during my MS degree. It works really well, and you can set the parameters to give you different margins for what does and doesn't count as a circle. It won't give you a specific confidence level, but there are ways of doing that if that's what you're trying to accomplish. That would be more of a classification problem, and you could approach it in different ways. Anyway, here's the resource on Hough Circles...
https://www.pyimagesearch.com/2014/07/21/detecting-circles-images-using-opencv-hough-circles/
For my school project, I need to find images in a large dataset. I'm working with Python and OpenCV. So far, I've managed to find an exact match of an image in the dataset, but it takes a lot of time even though the test code only had 20 images. So I've searched a few pages of Google and tried the code on these pages:
image hashing
building an image hashing search engine
feature matching
I've also been thinking of searching through the hashed dataset first, saving the paths of the candidates, and then finding the best feature-matching image among them. But most of the time, the narrowed-down candidate set looks very different from my query image.
Image hashing is really great. It looks like what I need, but there is a problem: I need to find an exact match, not similar photos. So I'm asking you: if you have any suggestion, or a piece of code that might help or improve the reference code I've linked, can you share it with me? I'd be really happy to try or research whatever you send or suggest.
opencv is probably the wrong tool for this. The algorithms there are geared towards finding similar matches, not exact ones. The general idea is to use machine learning to teach the code to recognize what a car looks like so it can detect cars in videos, even when the color or form changes (driving in the shadow, different make, etc).
I've found two approaches work well when trying to build an image database.
Use a normal hash algorithm like SHA-256 plus maybe some metadata (file or image size) to find matches
Resize the image down to 4x4 or even 2x2. Use the pixel RGB values as "hash".
The first approach is to reduce the image to a number. You can then put the number in a look up table. When searching for the image, apply the same hashing algorithm to the image you're looking for. Use the new number to look in the table. If it's there, you have a match.
Note: In all cases, hashing can produce the same number for different pictures. So you have to compare all the pixels of two pictures to make sure it's really an exact match. That's why it sometimes helps to add information like the picture size (in pixels, not file size in bytes).
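A minimal sketch of the first approach, using only the standard library. I'm assuming each image is available as a `(width, height, pixel_bytes)` tuple (for example from `img.shape` and `img.tobytes()` after decoding); the function names are made up for illustration. The key combines the size in pixels with the SHA-256 digest, and the final comparison of the raw pixels guards against the collisions mentioned in the note.

```python
import hashlib
from collections import defaultdict

def image_key(width, height, pixels):
    """Lookup key: picture size in pixels plus a SHA-256 digest of the pixel data."""
    return (width, height, hashlib.sha256(pixels).hexdigest())

def build_index(images):
    """images: dict mapping name -> (width, height, pixel_bytes)."""
    index = defaultdict(list)
    for name, (w, h, px) in images.items():
        index[image_key(w, h, px)].append(name)
    return index

def find_exact(index, images, query):
    """Return the names of all indexed images identical to the query."""
    w, h, px = query
    candidates = index.get(image_key(w, h, px), [])
    # Confirm pixel by pixel so a hash collision cannot produce a false match
    return [name for name in candidates if images[name][2] == px]
```

Building the index costs one pass over the dataset; after that each query is a single hash plus a dictionary lookup instead of a comparison against every image.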
The second approach lets you find pictures which look very similar to the eye but are in fact slightly different. Imagine cropping off a single pixel column on the left or tilting the image by 0.01°. To you, the image will be the same, but to a computer the two will be totally different. The second approach tries to average such small changes out. The cost is that you will get more collisions, especially for B&W pictures.
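A rough sketch of the second approach with NumPy. Instead of a library resize it averages the image over a 4x4 grid of cells and quantises each average coarsely, so small shifts land in the same bucket. The name `tiny_hash` and the `grid` and `levels` parameters are my own choices, and the image is assumed to be at least `grid` pixels on each side with values in 0..255.

```python
import numpy as np

def tiny_hash(img, grid=4, levels=16):
    """Average the image over a grid x grid layout and quantise each cell."""
    img = np.asarray(img, dtype=np.float64)
    if img.ndim == 2:
        img = img[:, :, None]  # treat grayscale as a single channel
    h, w, _ = img.shape
    ys = np.linspace(0, h, grid + 1).astype(int)  # cell boundaries (rows)
    xs = np.linspace(0, w, grid + 1).astype(int)  # cell boundaries (columns)
    cells = np.array([
        img[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].mean(axis=(0, 1))
        for i in range(grid)
        for j in range(grid)
    ])
    # Coarse quantisation so tiny brightness changes map to the same value
    quantized = np.floor(cells * levels / 256.0).astype(np.uint8)
    return quantized.tobytes()
```

The returned bytes can be used as a dictionary key in the same way as a SHA-256 digest. A cell average that sits right on a quantisation boundary can still flip between two buckets, so near-duplicates are not guaranteed to collide.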
Finding exact image matches using hash functions can be done with the undouble library (Disclaimer: I am also the author). It works using a multi-step process of pre-processing the images (grayscaling, normalizing, and scaling), computing the image hash, and the grouping of images based on a threshold value.
I am generating images (thumbnails) from a video every 3 seconds. Now I need to discard/remove all the similar images. Is there a way I could do this?
I generate thumbnails using FFMPEG. I read about various image-diff solutions like given in this SO post, but I do not want to do this manually. How and what parameters should be considered that could tell if a particular image is similar to other images present.
You can calculate the Structural Similarity Index between images and keep or discard an image based on the score. There are other measures you can use; basically, you need any method that returns a score. Try PIL or OpenCV.
https://pillow.readthedocs.io/en/3.1.x/reference/ImageChops.html?highlight=difference
https://www.pyimagesearch.com/2017/06/19/image-difference-with-opencv-and-python/
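As a sketch of the "score, then keep or discard" loop: the version below uses mean absolute pixel difference as the score because it needs nothing beyond NumPy; SSIM or any other measure could be swapped in for `mean_abs_diff`. The threshold of 10 grey levels is an arbitrary assumption that you would tune on your own thumbnails, and all frames are assumed to have the same shape.

```python
import numpy as np

def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two equal-sized images."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def drop_similar(frames, threshold=10.0):
    """Return indices of frames that differ enough from the last kept frame."""
    kept = []
    last = None
    for i, frame in enumerate(frames):
        if last is None or mean_abs_diff(frame, last) > threshold:
            kept.append(i)
            last = frame
    return kept
```

This compares each thumbnail only with the most recently kept one, which suits frames in video order. If the same scene can come back later and should also be dropped, compare against every kept frame instead.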
I don't have enough reputation to comment on your question, so I will just go ahead and post my idea as an answer in the hope of helping you.
I am quite confused about the term "similar", but since you are referring to video frames, I am going to assume that you want to avoid "similar" frames that were captured because the camera barely moved. If that's the case, you might want to consider using salient point descriptors.
To be more specific, you can detect salient points (using Harris, for instance), then use a point descriptor algorithm (such as SURF), and discard the frames that are found to have "too many" points in common with a pre-selected frame.
Keep in mind that for the above process to be successful, the frames must be as sharp as possible. I guess you don't want to extract a blurred frame as a thumbnail anyway, so applying blurred-image detection might be useful in your case.
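For the blur check, one common heuristic (my suggestion; the answer above does not name a method) is the variance of the Laplacian: sharp images have strong second derivatives at edges, blurred ones do not. A NumPy-only sketch on a grayscale frame, with a threshold that is purely a placeholder to be tuned per camera and resolution:

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian; higher means sharper."""
    g = gray.astype(np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())

def is_blurry(gray, threshold=100.0):
    """Flag a frame as blurred when its sharpness score is below the threshold."""
    return laplacian_variance(gray) < threshold
```

Frames flagged by `is_blurry` can be skipped before the salient-point comparison, so that descriptors are only computed on usable thumbnails.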
I have a camera that will be stationary, pointed at an indoors area. People will walk past the camera, within about 5 meters of it. Using OpenCV, I want to detect individuals walking past - my ideal return is an array of detected individuals, with bounding rectangles.
I've looked at several of the built-in samples:
None of the Python samples really apply
The C blob tracking sample looks promising, but doesn't accept live video, which makes testing difficult. It's also the most complicated of the samples, making extracting the relevant knowledge and converting it to the Python API problematic.
The C 'motempl' sample also looks promising, in that it calculates a silhouette from subsequent video frames. Presumably I could then use that to find strongly connected components and extract individual blobs and their bounding boxes - but I'm still left trying to figure out a way to identify blobs found in subsequent frames as the same blob.
Is anyone able to provide guidance or samples for doing this - preferably in Python?
The latest SVN version of OpenCV contains an (undocumented) implementation of HOG-based pedestrian detection. It even comes with a pre-trained detector and a python wrapper. The basic usage is as follows:
from cv import *
storage = CreateMemStorage(0)
img = LoadImage(file) # or read from camera
found = list(HOGDetectMultiScale(img, storage, win_stride=(8,8),
                                 padding=(32,32), scale=1.05, group_threshold=2))
So instead of tracking, you might just run the detector in each frame and use its output directly.
See src/cvaux/cvhog.cpp for the implementation and samples/python/peopledetect.py for a more complete python example (both in the OpenCV sources).
Nick,
What you are looking for is not people detection, but motion detection. If you tell us a lot more about what you are trying to solve/do, we can answer better.
Anyway, there are many ways to do motion detection, depending on what you are going to do with the results. The simplest one would be frame differencing followed by thresholding, while a more complex one could be proper background modeling -> foreground subtraction -> morphological ops -> connected component analysis, followed by blob analysis if required. Download the OpenCV code and look in the samples directory. You might see what you are looking for. Also, there is an O'Reilly book on OpenCV.
Hope this helps,
Nand
This is clearly a non-trivial task. You'll have to look into scientific publications for inspiration (Google Scholar is your friend here). Here's a paper about human detection and tracking: Human tracking by fast mean shift mode seeking
This is similar to a project we did as part of a Computer Vision course, and I can tell you right now that it is a hard problem to get right.
You could use foreground/background segmentation, find all blobs, and then decide that each one is a person. The problem is that this will not work very well, since people tend to walk together, pass each other, and so on, so a blob might very well consist of two persons, and you will then see that blob splitting and merging as they walk along.
You will need some method of discriminating between multiple persons in one blob. This is not a problem I expect anyone to be able to answer in a single SO post.
My advice is to dive into the available research and see if you can find anything there. The problem is not unsolvable, considering that there exist products which do this: Autoliv has a product that detects pedestrians using an IR camera on a car, and I have seen other products that count customers entering and exiting stores.
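For the simpler sub-problem raised in the question, recognising a blob in the next frame as the same blob, a naive starting point is greedy nearest-centroid matching between consecutive frames. The sketch below is pure Python, with hypothetical names and an arbitrary distance limit, and it deliberately does nothing about the merging and splitting described above, which is exactly where it will break.

```python
import math

def centroid(box):
    """Centre point of an (x, y, w, h) rectangle."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def associate(tracks, boxes, max_distance=50.0, next_id=0):
    """Match last frame's tracks to this frame's boxes by centroid distance.

    tracks: dict mapping track id -> box from the previous frame.
    Returns (new_tracks, next_id); unmatched boxes start new tracks and
    unmatched tracks are dropped.
    """
    pairs = []
    for tid, old in tracks.items():
        ox, oy = centroid(old)
        for j, new in enumerate(boxes):
            nx, ny = centroid(new)
            pairs.append((math.hypot(nx - ox, ny - oy), tid, j))
    pairs.sort()  # closest pairs are assigned first

    new_tracks, used = {}, set()
    for dist, tid, j in pairs:
        if dist > max_distance:
            break
        if tid in new_tracks or j in used:
            continue
        new_tracks[tid] = boxes[j]
        used.add(j)
    for j, box in enumerate(boxes):
        if j not in used:
            new_tracks[next_id] = box
            next_id += 1
    return new_tracks, next_id
```

Calling `associate` once per frame with the boxes from whatever detector you use keeps ids stable as long as people stay well separated and move less than `max_distance` pixels between frames.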