I am trying to find frames that match an image using OpenCV. I also want to find the timestamp at which the image is found. The video is a masked video. The code so far:
import cv2
from skimage import color

def occurence_counter(self):
    # read the reference image, shrink it, and convert to greyscale
    img = cv2.imread('ref_img.jpg', cv2.IMREAD_COLOR)
    img = cv2.resize(img, (10, 10))
    img = color.rgb2gray(img)
    similarities = []
    result = self.parse_video(img, str(self.lineEdit.text()).strip(), 1, False)
    print(result)

def parse_video(self, image, video, n_matches, break_point=False, verbose=False):
    similarities = [{'frame': 0, 'similarity': 0}]
    frame_count = 0
    cap = cv2.VideoCapture(video)
    while cap.isOpened():
        ret, frame = cap.read()
        if frame is None:
            break
        # increment frame counter
        frame_count += 1
        # resize current video frame
        small_frame = cv2.resize(frame, (10, 10))
        # convert to greyscale
        small_frame_bw = color.rgb2gray(small_frame)
Finding the same frame is not an easy problem. There are many possible solutions, and I will describe them here in a very general way.
Template Matching
Template matching is an algorithm that calculates the similarity of corresponding pixels in the images. So if you are looking for a very similar image (without rotation, translation, or large intensity changes) it is not a bad algorithm. It is not very fast for whole images. It is rather used to find the same fragment in several images or a smaller image within a bigger image, not to check the similarity of two images.
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_template_matching/py_template_matching.html
For whole images it is easier to just subtract the images than to use template matching. It is much faster. There must be an assumption that they are really similar to each other.
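A minimal sketch of that subtraction idea, assuming the frames have already been resized to the same shape; the file names below are placeholders:

import cv2
import numpy as np

# mean absolute difference of two equally sized greyscale images;
# 0 means identical, larger values mean less similar
ref = cv2.imread('a.jpg', cv2.IMREAD_GRAYSCALE)
frame = cv2.imread('b.jpg', cv2.IMREAD_GRAYSCALE)
frame = cv2.resize(frame, (ref.shape[1], ref.shape[0]))

score = np.mean(cv2.absdiff(ref, frame))
print(score)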
Histogram Comparison
You can use histogram comparision. It is the fastest way, but it is not accurate. Grass and apples are both green, but dissimilar to each other. It's usually better to use the HSV color space when it comes to color.
https://docs.opencv.org/3.4.1/d8/dc8/tutorial_histogram_comparison.html
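A rough sketch of histogram comparison in HSV; the file names are placeholders and the bin counts are just typical values in the spirit of the linked tutorial:

import cv2

img1 = cv2.imread('a.jpg')
img2 = cv2.imread('b.jpg')

hsv1 = cv2.cvtColor(img1, cv2.COLOR_BGR2HSV)
hsv2 = cv2.cvtColor(img2, cv2.COLOR_BGR2HSV)

# 2-D histogram over hue and saturation
hist1 = cv2.calcHist([hsv1], [0, 1], None, [50, 60], [0, 180, 0, 256])
hist2 = cv2.calcHist([hsv2], [0, 1], None, [50, 60], [0, 180, 0, 256])
cv2.normalize(hist1, hist1, 0, 1, cv2.NORM_MINMAX)
cv2.normalize(hist2, hist2, 0, 1, cv2.NORM_MINMAX)

# correlation method: 1.0 means identical histograms
print(cv2.compareHist(hist1, hist2, cv2.HISTCMP_CORREL))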
Feature matching
The algorithm searches for similar characteristic points in the images. There are many algorithms for finding features in images. They should be insensitive to scale changes, rotation, etc., but that depends on the feature extraction algorithm.
https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_feature2d/py_features_meaning/py_features_meaning.html#features-meaning
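For illustration, a minimal ORB plus brute-force matching sketch; ORB is just one possible feature extractor, and the file names and distance threshold are assumptions:

import cv2

img1 = cv2.imread('a.jpg', cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread('b.jpg', cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance is the right metric for ORB's binary descriptors
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = bf.match(des1, des2)

# more low-distance matches -> more similar images
good = [m for m in matches if m.distance < 40]
print(len(good), 'good matches')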
Other algorithms
Other algorithms are PSNR and SSIM. I have never used them, but they are used to calculate the similarity between an original and a blurred image, or the similarity of a whole video sequence.
https://docs.opencv.org/3.4.2/d5/dc4/tutorial_video_input_psnr_ssim.html
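As an illustration (a generic sketch, not taken from the linked tutorial), PSNR can be computed directly from the mean squared error of two equally sized images; higher means more similar:

import cv2
import numpy as np

def psnr(a, b):
    # mean squared error over all pixels and channels
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10 * np.log10(255.0 ** 2 / mse)

img1 = cv2.imread('a.jpg')
img2 = cv2.imread('b.jpg')
print(psnr(img1, img2))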
You can also try to compare hashes of images. It is a very interesting algorithm (for me), but it is not well documented.
https://www.pyimagesearch.com/2017/11/27/image-hashing-opencv-python/
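A small difference-hash (dHash) sketch in the spirit of that article, assuming colour input images; the file names are placeholders:

import cv2

def dhash(image, hash_size=8):
    # resize to (hash_size+1) x hash_size greyscale and compare neighbours
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (hash_size + 1, hash_size))
    diff = resized[:, 1:] > resized[:, :-1]
    # pack the boolean matrix into one integer
    return sum(2 ** i for i, v in enumerate(diff.flatten()) if v)

img1 = cv2.imread('a.jpg')
img2 = cv2.imread('b.jpg')

# Hamming distance between the hashes: small means similar
print(bin(dhash(img1) ^ dhash(img2)).count('1'))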
Feature matching is the most commonly used algorithm for this type of task. The reason is that feature matching algorithms can detect similar fragments of images when the images are taken from a different angle, in different conditions, or only partially overlap. Structure from Motion algorithms often use feature matching. https://hub.packtpub.com/exploring-structure-motion-using-opencv/
The solution to the problem always depends on the data we have, so there is no single answer.
If I am not mistaken, what you want to do is called Template Matching; you can find the OpenCV tutorial for that feature here. Also this thread might be useful for you, especially Sam's answer, which beyond Template Matching also describes Comparing Histograms and Feature Matching.
I am performing calibration as shown in this tutorial.
Instead of visualizing manually to decide, I want my calibration routine to decide whether an image is a good fit for calibration, since there are a few images where the detected chessboard pattern is crooked and they might have a bad influence on the calibration.
I have around 400 images so it is not possible to visualize and decide for each image.
The following is a possible solution, but it is really slow considering the huge number of images.
import cv2 as cv

def calculate_error(img_points_p, obj_points_p, rot_vectors_p, tr_vectors_p, mtx_p, dist_p):
    # mean reprojection error for each calibration view
    error_data = []
    for i in range(len(obj_points_p)):
        img_points_2, _ = cv.projectPoints(obj_points_p[i], rot_vectors_p[i], tr_vectors_p[i], mtx_p, dist_p)
        error = cv.norm(img_points_p[i], img_points_2, cv.NORM_L2) / len(img_points_2)
        error_data.append(error)
    return error_data

# perform calibration
# call calculate_error(...)
# remove from img_points (2d points in image plane) the views whose error_data value is greater than 0.1
# perform calibration again with only the remaining low-error views
Is a faster alternative to this possible? For example, checking whether an image is a good one right after we detect the chessboard pattern, for all images?
In algorithm form:
Run findChessboardCorners() on each image, reject those where a chessboard is not detected.
On the surviving images, run findHomography() on the detected chessboard corners, using the RANSAC or LMEDS estimators; reject those that fail or find fewer than N inliers. Use a reasonable value for N, say 16 or 36 (meaning you want to "see" the equivalent of at least 4x4 or 6x6 inliers). Don't be too tight on the maximum acceptable reprojection error, since you are not correcting for lens distortion yet.
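A hedged sketch of that filter; the board size (9x6 inner corners), the RANSAC reprojection threshold, and N are assumptions you would adapt to your own setup:

import cv2
import numpy as np

pattern_size = (9, 6)
N = 36

# ideal (x, y) grid of the inner corners, in the same ordering the detector uses
board_xy = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2).astype(np.float32)

def is_good_view(gray):
    found, corners = cv2.findChessboardCorners(gray, pattern_size)
    if not found:
        return False
    # homography from the ideal grid to the detected corners, with a loose
    # reprojection threshold since lens distortion is not corrected yet
    H, inlier_mask = cv2.findHomography(board_xy, corners, cv2.RANSAC, 5.0)
    if H is None:
        return False
    return int(inlier_mask.sum()) >= N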
I'm trying to build a footage filter that sends only "good" frames to the database.
Here is my current rating function:
def rateImg(img):
    # convert to greyscale if the frame is not already single-channel
    try:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    except cv2.error:
        gray = img
    # edge map and contour count
    edges = cv2.Canny(gray, 0, 255)
    contours, _ = cv2.findContours(
        edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
    num_of_contours = len(contours)
    # variance of the Laplacian as a sharpness measure
    lap = cv2.Laplacian(gray, cv2.CV_64F).var()
    lap = round(lap, 2)
    return [lap, num_of_contours]
First off, I use variance of Laplacian to calculate the sharpness of an image from a particular time window.
It should technically provide me a "good" frame, but that's not always the case.
The camera I have to use isn't great and sometimes glitches out like this, and such frames have the highest variance of Laplacian.
So, my current solution is to count the number of contours in an image and, if that count crosses a particular threshold, classify the image as "glitched". But with this approach the algorithm also rates images with a lot of objects as "glitched".
Also, I have tried detecting squares and rectangles, but that proved to be much less effective than the contour approach.
Is there any way to detect obvious glitches in an image?
I feel like there should be, because as a human I can easily classify glitched and normal images at a glance. I just can't seem to pin-point what exactly makes them different.
Is there any way to detect obvious glitches in an image?
Yes, but probably not for complex random glitches; have a look at this similar question.
In that case, you can detect whether there is a large area of the image containing the same color. A photo taken by a camera would almost never contain exactly the same RGB values, even where the pixels look similar. However, this would be perfectly normal if the images are art drawn on a digital device.
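A possible sketch of that check, assuming a colour (BGR) frame; the 20% threshold is a guess you would tune:

import numpy as np

def has_large_flat_area(img, max_fraction=0.2):
    # count how often the single most frequent exact BGR value occurs
    pixels = img.reshape(-1, img.shape[-1])
    _, counts = np.unique(pixels, axis=0, return_counts=True)
    return counts.max() / pixels.shape[0] > max_fraction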
As a human I can easily classify glitched and normal images at a glance... What exactly makes them (me and a program) different? Is there any way to detect obvious glitches in an image?
In fact, you can't directly identify a glitched image. You try to recognize the objects in it, and when you see something "weird" that you don't recognize, you consider it a glitched image. A machine can't achieve this either. You can train an AI that reports images with unrecognizable "parts" as glitched, but it will never be 100% accurate.
Converting your image to HSV and running the brightness channel through an edge filter in ImageJ gives me this:
As you can see, the "glitched" region appears fairly uniformly brighter than the rest of the image, and should be detectable in some form. How often do you get a picture from your camera? Depending on how much changes between two pictures, you might get away with subtracting the current one from the previous one to just look at the changes.
You have not shown what an image with "a lot of objects" looks like, so you'd have to try whether this works for those cases.
OpenCV functions needed for this workflow would be (a rough sketch follows the list):
cvtColor() with COLOR_BGR2GRAY for the colour conversion (there might be faster ways to get a good grey value than going through HSV)
one of the edge detectors; Canny() or Sobel() would be the first I'd try
some grey-value statistics: threshold() and countNonZero() for a simple approach, which you could refine per sector of the image or such
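Here is a rough sketch of that workflow; the 4x4 sector grid, the Canny thresholds, and the idea of using the spread of per-sector edge density as a glitch indicator are all assumptions to experiment with:

import cv2

def glitch_score(img, grid=4):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    h, w = edges.shape
    densities = []
    for r in range(grid):
        for c in range(grid):
            sector = edges[r * h // grid:(r + 1) * h // grid,
                           c * w // grid:(c + 1) * w // grid]
            densities.append(cv2.countNonZero(sector) / sector.size)
    # a glitched band tends to make a few sectors wildly denser than the rest
    return max(densities) - min(densities)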
PS:
I feel like there should be, because as a human I can easily classify glitched and normal images at a glance.
That's a common fallacy: us humans (the sight-centric beings that we are) are fantastic at pattern recognition and interpolation and are rarely aware how much of that (including a lot of error correction) is happening every microsecond. A puny 2D camera cannot hope to live up to that. (Obligatory XKCD: https://xkcd.com/1425/)
So I'm trying to create a program that can see what number is in an image and print that integer to the console. (I'm using Python 3.)
For example, the program should recognize that the following image (an actual image the program has to check) is the number 2:
I've tried to just compare it with another image containing a 2 using cv2.matchTemplate(), but each time the blue pixels' RGB values are a little different for each image, and the image could be a bit larger or smaller. For example the following image:
It also has to tell it apart from all the other blue number images (0-9), for example the following one:
I've tried multiple match-template codes and made a folder with number 0-9 images as templates, but each time almost every single number is "recognized" in the number that needs to be recognized. For example, the number 5 gets recognized in an image that is the number 2. And if it doesn't recognize all of them, it recognizes the wrong one(s).
The ones I've tried:
Answer from this question
Both codes from this tutorial
And the one from this tutorial
but like I said before, it comes with those problems.
I've also tried to see what percentage of each image is blue, but those numbers were too close to tell the numbers apart.
Does anyone have a solution? Am I being stupid for using cv2.matchTemplate(), and is there a much simpler option? (I don't mind using a library for it, because this is part of a bigger piece of code, but I prefer to code it myself instead of using libraries.)
Instead of using Template Matching, a better approach is to use Pytesseract OCR to read the number with image_to_string(). But before performing OCR, you need to preprocess the image. For optimal OCR performance, the preprocessed image should have the desired text/number/characters to OCR in black with the background in white. A simple preprocessing step is to convert the image to grayscale, Otsu's threshold to obtain a binary image, then invert the image. Here's a visualization of the preprocessing step:
Input image -> Grayscale -> Otsu's threshold -> Inverted image ready for OCR
Result from Pytesseract OCR
2
Here's the results with the other images:
2
5
We use the --psm 6 configuration option to assume a single uniform block of text. See here for more configuration options.
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold, then invert
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
invert = 255 - thresh
# Perform OCR with Pytesseract
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()
Note: If you insist on using Template Matching, you need to use scale-variant template matching. Take a look at how to isolate everything inside of a contour, scale it, and test the similarity to an image? and Python OpenCV line detection to detect X symbol in image for some examples. If you know for certain that your images are blue, then another approach would be to use color thresholding with cv2.inRange() to obtain a binary mask image, then apply OCR on that image.
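A minimal sketch of that colour-thresholding idea; the HSV bounds below are guesses for "blue" that you would tune to your exact shade, and '1.png' is just the input file name from the code above:

import cv2
import numpy as np

image = cv2.imread('1.png')
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# keep only pixels whose hue falls in a blue-ish range
lower_blue = np.array([90, 50, 50])
upper_blue = np.array([130, 255, 255])
mask = cv2.inRange(hsv, lower_blue, upper_blue)

# OCR wants dark text on a white background, so invert the mask
ready_for_ocr = 255 - mask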
Given the lovely regular input, I expect that all you need is a simple comparison against templates. Since you neglected to supply your code and output, it's hard to tell what might have gone wrong.
Very simply ...
Rescale your input to the size or your templates.
Calculate any straightforward matching evaluation of the input against each of the 10 templates. A simple matching count should suffice: how many pixels match between the two images.
The template with the highest score is the identification.
You might also want to set a lower threshold for declaring a match, perhaps based on how well that template matches each of the other templates: any identification has to clearly exceed the match between two different templates.
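A sketch of this recipe, assuming you have ten same-sized template images named 0.png through 9.png (those file names are placeholders):

import cv2
import numpy as np

templates = {d: cv2.imread('%d.png' % d, cv2.IMREAD_GRAYSCALE) for d in range(10)}
h, w = templates[0].shape

def identify(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (w, h))
    # binarise both sides so "matching pixels" is a plain equality count
    _, gray_bin = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    scores = {}
    for digit, tmpl in templates.items():
        _, tmpl_bin = cv2.threshold(tmpl, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        scores[digit] = int(np.count_nonzero(gray_bin == tmpl_bin))
    # the template with the highest score is the identification
    return max(scores, key=scores.get)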
If you don't have access to an OCR engine, just know that you can build your own OCR system via a KNN classifier. In this case the implementation should not be very difficult, as you are only classifying numbers. OpenCV provides a very straightforward implementation of KNN.
The classifier is trained using features calculated from samples from known instances of classes. In this case, you have 10 classes (if you are working with digits 0 - 9), so you can prepare a "template" with your digits, extract some features, train the classifier and use it to classify new instances.
All can be done in OpenCV without the need of extra libraries and the KNN (for this kind of application) has a more than acceptable accuracy rate.
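A minimal sketch of such a classifier with OpenCV's ml module; train_data is assumed to be a list of (small greyscale digit image, label) pairs that you prepare from your own templates:

import cv2
import numpy as np

def train_knn(train_data):
    # flatten each digit image into one feature row
    samples = np.array([img.reshape(-1) for img, _ in train_data], dtype=np.float32)
    labels = np.array([label for _, label in train_data], dtype=np.float32)
    knn = cv2.ml.KNearest_create()
    knn.train(samples, cv2.ml.ROW_SAMPLE, labels)
    return knn

def classify(knn, digit_img):
    sample = digit_img.reshape(1, -1).astype(np.float32)
    _, result, _, _ = knn.findNearest(sample, k=3)
    return int(result[0][0])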
I am working on a project to have multiple cameras, each taking an image and then the images will be stitched together. Currently I am trying to use the cv2.createStitcher().stitch(images) function. Below is the code that I use:
import cv2

imageFiles = ['imageCapture1_0.png', 'imageCapture2_0.png']
images = []
for filename in imageFiles:
    img = cv2.imread(filename)
    images.append(img)

cv2.ocl.setUseOpenCL(False)
stitcher = cv2.createStitcher()
status, result = stitcher.stitch(images)
cv2.imwrite('result.png', result)
The image input is:
left image:
right image:
However, the result output becomes NoneType with size 1 and value: NoneType object of builtins module. From what I have googled, the cause of this is that there are not enough matching keypoints to stitch the images together. If so, is there a way to stitch images even with fewer keypoints? Is there a way to set the parameters? I read through the documentation with no luck trying to find the solution. Thank you in advance.
The image stitching operation status, result = stitcher.stitch(images) returns two values, a status indicator and the resulting stitched image. You can check the value of status to determine whether or not the image stitching operation was a success. From the docs it can be one of four variables:
OK = 0: Image stitching was successful.
ERR_NEED_MORE_IMGS = 1: There were not enough keypoints detected in your input images to construct the panorama. You will need more input images.
ERR_HOMOGRAPHY_EST_FAIL = 2: This error occurs when the RANSAC homography estimation fails. Similarly, you may need more input images or the images provided do not have enough distinguishing features for keypoints to be accurately matched.
ERR_CAMERA_PARAMS_ADJUST_FAIL = 3: Usually related to failing to properly estimate camera features from the input images.
For your situation, you can either add more input images so there will be enough keypoints detected or you can look into your own implementation.
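For illustration, here is the question's snippet with an explicit status check added; this is only a sketch of how to inspect the status, not a fix for the missing keypoints themselves:

import cv2

imageFiles = ['imageCapture1_0.png', 'imageCapture2_0.png']
images = [cv2.imread(f) for f in imageFiles]

# older OpenCV builds expose createStitcher(), newer ones Stitcher_create()
stitcher = cv2.createStitcher() if hasattr(cv2, 'createStitcher') else cv2.Stitcher_create()
status, result = stitcher.stitch(images)

if status == 0:  # OK
    cv2.imwrite('result.png', result)
else:
    print('Stitching failed with status', status)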
I copied and ran your code, and it works fine.
Left Image
Right Image
Result
What I think is that the function could not find enough matching points in your pictures. Trying this code on another set of pictures may help.
I would like to ask you for help. I am a student, and for academic research I'm designing a system where one of the modules is responsible for comparing low-resolution simple images (img, jpg, jpeg, png, gif). However, I need guidance on whether I can write such an implementation in Python and how to get started. Maybe some of you have met something like this before and could share your knowledge.
Issue 1 - simple version
The input data must be compared with the patterns (which include images), and the output will contain information about the degree of similarity (a percentage) and the pattern image to which the given input is most similar. In this version, the assumption is that the input image is not modified in any way (i.e. not rotated, tilted, etc.).
Issue 2 - difficult version
The input data must be compared with the patterns (which include images), and the output will contain information about the degree of similarity (a percentage) and the pattern image to which the given input is most similar. In this version, the assumption is that the input image can be rotated.
Can some of you tell me what I need to do for that and how to start? I will appreciate any help.
As a starter, you could read in the images using matplotlib or the Python Imaging Library (PIL).
Comparing to a pattern could be done with a cross-correlation, which you could do using scipy or numpy. As you only have few pixels, I would go for numpy, which does not use Fourier transforms.
import pylab as P
import numpy as N

# read the images
im1 = P.imread('4Fsjx.jpg')
im2 = P.imread('xUHhB.jpg')

# reduce to greyscale, since the correlation below works on plain arrays
im1 = im1.mean(axis=2)
im2 = im2.mean(axis=2)

# do the cross-correlation (numpy.convolve is 1-D, so flatten first)
conv = N.convolve(im1.ravel(), im2.ravel(), mode='same')

# a measure for similarity then is:
sim = N.sum(conv)
Please note, this is a very quick and dirty approach and you should spend quite some thought on how to improve it, not even including the rotation that you mentioned. Anyhow, this code can read in your images and give you a measure of similarity, although the convolution will not work directly on colour-coded data, hence the greyscale reduction above. I hope it gives you something to start from.
Here is a start as some pseudo-code. I would strongly recommend getting numpy/scipy to help with this.
import glob
import scipy.misc

# read the template images
files = glob.glob('*.templates')
listOfImages = []
for elem in files:
    imagea = scipy.misc.imread(elem)
    listOfImages.append(imagea)

# read the input/test image
targetImage = scipy.misc.imread(targetImageName)
Now loop through each entry of listOfImages and compute the "distance". Note that this is probably the hardest part: how will you decide whether two images are similar? Using direct pixel comparisons? Using image histograms? Using some image alignment metric (this would be useful for your difficult version)? Some simple gotchas: I noticed that your uploaded images were of different sizes. If the images are of different sizes, then you will have to sweep over the images. Also, can the images be scaled? Then you will need either a scale-invariant metric or to try the sweep over different scales.
# keep track of the minimum distance
minDistance = Distance(targetImage, listOfImages[0])
minIndex = 0
for index, elem in enumerate(listOfImages):
    currentDistance = Distance(targetImage, elem)
    if currentDistance < minDistance:
        minDistance = currentDistance
        minIndex = index
The distance function is where the challenges are, but I'll leave that for you.
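For completeness, one very naive possible choice of Distance is the mean absolute pixel difference of two equally sized greyscale images; this is only an illustration, since the answer deliberately leaves the real metric open:

import numpy as np

def Distance(img_a, img_b):
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    # smaller means more similar; assumes equal shapes
    return np.mean(np.abs(a - b))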