I'm having a bit of trouble with opencv and template matching, so I was hoping someone here could help a lost soul out.
So as part of the code I'm using, I've got the following 2 lines which I don't quite understand as well as I should.
result = cv2.matchTemplate(edged, template, cv2.TM_CCOEFF)
(_, maxVal, _, maxLoc) = cv2.minMaxLoc(result)
From my understanding, the first line stores a correlation coefficient in the variable "result". This is then passed into cv2.minMaxLoc(...), which returns a 4-element tuple (minVal, maxVal, minLoc, maxLoc), of which we are only interested in maxVal and maxLoc.
Upon printing the value of maxVal, I seem to be getting values between 2,000,000 and 7,000,000 depending on the template, lighting conditions, etc.
My questions are as follows:
What does maxVal mean?
What is the range of maxVal?
What physical characteristics affect the values of maxVal?
Thank you in advance for all your help and guidance!
cv2.matchTemplate returns a correlation map, essentially a grayscale image, where each pixel denotes how well the neighborhood of that pixel matches the template.
You suggest that we are interested only in maxLoc and maxVal, but that isn't always true; it depends on the comparison method you use when matching the template.
Now, to your questions: the minMaxLoc function returns the minimum and maximum intensity values in a Mat or an array, along with their locations.
maxLoc is the location of the highest intensity in the image returned by matchTemplate, which corresponds to the best match of your template in your image (for specific comparison methods only; for TM_SQDIFF or TM_SQDIFF_NORMED the best match is at minLoc).
Since the image returned by matchTemplate is gray-scale, the range depends on the original image, so 2,000,000 to 7,000,000 seems a bit out of order to me.
The only "physical characteristics" that affect the maxVal should be the degree of correlation the template has with the image and nothing else.
Hope it helps!
As the other answers already explain, you are matching based on cross-correlation. So maxVal is the maximum of your cross-correlation. It is difficult to make a generic guess about the range. But you can always limit the range to [0, 1] by
normalize(result, result, 0, 1, NORM_MINMAX, -1, Mat());
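That call is the C++ API; in Python, a rough equivalent (a sketch, assuming result is the array returned by cv2.matchTemplate) would be:
result = cv2.normalize(result, None, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)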
If you crop the region of the image that best matches the template, then the value of the peak of the cross-correlation function is
np.sum(cropped * template)
This value will get bigger when the image is brighter, when the template is brighter, and when the template is bigger.
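A small sketch to illustrate that relationship, using the plain cross-correlation method TM_CCORR (the file names here are placeholders):
import cv2
import numpy as np

image = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)
template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE).astype(np.float32)

result = cv2.matchTemplate(image, template, cv2.TM_CCORR)
_, maxVal, _, maxLoc = cv2.minMaxLoc(result)

h, w = template.shape
x, y = maxLoc
cropped = image[y:y+h, x:x+w]

# the peak of the cross-correlation equals the sum of the element-wise product
print(maxVal, np.sum(cropped * template))   # should print (approximately) equal values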
Related
I am trying to apply OpenCV's Canny edge detection algorithm to an image array whose values range from 0 to 255.
I am struggling to understand the role of the thresholds in the cv2.Canny() function because, for example, when I use threshold values of (MinThr=300, MaxThr=400) or (MinThr=350, MaxThr=450) I get different results. I don't understand why this happens, since I thought the threshold values I define couldn't be higher than the maximum value of the pixels in the array (in my case 255).
The other answers that I saw on Stack Overflow didn't help, so if someone could enlighten me I would be very grateful. Thanks.
The thresholds are applied not to the original image intensity, but to its gradient magnitude, whose maximum value is roughly 4 times larger than the maximum image intensity, because it is estimated using the Sobel operator. So you will stop seeing any difference (actually you will get an empty result) once the upper threshold exceeds roughly 1000. For details see the Canny tutorial.
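A quick way to see this effect (a sketch; 'image.png' is a placeholder for any 8-bit grayscale image):
import cv2
import numpy as np

img = cv2.imread('image.png', cv2.IMREAD_GRAYSCALE)
for lower, upper in [(300, 400), (350, 450), (1000, 1100)]:
    edges = cv2.Canny(img, lower, upper)
    print(lower, upper, 'edge pixels:', np.count_nonzero(edges))
# the counts still differ between (300, 400) and (350, 450) because the thresholds
# are compared against the gradient magnitude, not the pixel values; once the
# upper threshold exceeds the maximum gradient, the edge map becomes empty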
I have 10 greyscale brain MRI scans from BrainWeb. They are stored as a 4d numpy array, brains, with shape (10, 181, 217, 181). Each of the 10 brains is made up of 181 slices along the z-plane (going through the top of the head to the neck) where each slice is 181 pixels by 217 pixels in the x (ear to ear) and y (eyes to back of head) planes respectively.
All of the brains are of dtype('float64'). The maximum pixel intensity across all brains is ~1328 and the minimum is ~0. For example, for the first brain, I calculate this with brains[0].max(), giving 1328.338086605072, and brains[0].min(), giving 0.0003886114541273855. Below is a plot of a slice of brains[0]:
I want to binarize all these brain images by rescaling the pixel intensities from [0, 1328] to {0, 1}. Is my method correct?
I do this by first normalising the pixel intensities to [0, 1]:
normalized_brains = brains/1328
And then by using the binomial distribution to binarize each pixel:
binarized_brains = np.random.binomial(1, (normalized_brains))
The plotted result looks correct:
A 0 pixel intensity represents black (background) and 1 pixel intensity represents white (brain).
I experimented with implementing another method to normalise an image from this post, but it gave me just a black image. This is because np.finfo(np.float64).max is 1.7976931348623157e+308, so the normalization step
normalized_brains = brains/1.7976931348623157e+308
just returned an array of zeros, which in the binarization step also led to an array of zeros.
Am I binarising my images using a correct method?
Your method of converting the image to a binary image basically amounts to random dithering, which is a poor method of creating the illusion of grey values on a binary medium. Old-fashioned print is a binary medium, and the methods to represent grey-value photographs in print have been fine-tuned over centuries. This process is called halftoning, and it is shaped in part by properties of ink on paper, which we do not have to deal with in binary images.
So what methods have people come up with outside of print? Ordered dithering (mostly Bayer matrix), and error diffusion dithering. Read more about dithering on Wikipedia. I wrote a blog post showing how to implement all of these methods in MATLAB some years ago.
I would recommend you use error diffusion dithering for your particular application. Here is some code in MATLAB (taken from my blog post linked above) for the Floyd-Steinberg algorithm; I hope that you can translate this to Python:
img = imread('https://i.stack.imgur.com/d5E9i.png');
img = img(:,:,1);
out = double(img);
sz = size(out);
for ii=1:sz(1)
    for jj=1:sz(2)
        old = out(ii,jj);
        %new = 255*(old >= 128); % Original Floyd-Steinberg
        new = 255*(old >= 128+(rand-0.5)*100); % Simple improvement
        out(ii,jj) = new;
        err = new-old;
        if jj<sz(2)
            % right
            out(ii ,jj+1) = out(ii ,jj+1)-err*(7/16);
        end
        if ii<sz(1)
            if jj<sz(2)
                % right-down
                out(ii+1,jj+1) = out(ii+1,jj+1)-err*(1/16);
            end
            % down
            out(ii+1,jj ) = out(ii+1,jj )-err*(5/16);
            if jj>1
                % left-down
                out(ii+1,jj-1) = out(ii+1,jj-1)-err*(3/16);
            end
        end
    end
end
imshow(out)
Resampling the image before applying the dithering greatly improves the results:
img = imresize(img,4);
% (repeat code above)
imshow(out)
NOTE that the above process expects the input to be in the range [0,255]. It is easy to adapt to a different range, say [0,1328] or [0,1], but it is also easy to scale your images to the [0,255] range.
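If it helps, here is one possible Python translation of the MATLAB loop above (a rough sketch using the plain 128 threshold rather than the randomized variant; it assumes a 2D float array already scaled to [0, 255]):
import numpy as np

def floyd_steinberg(img):
    # img: 2D float array scaled to [0, 255]; returns a 0/255 binary array
    out = img.astype(float).copy()
    rows, cols = out.shape
    for i in range(rows):
        for j in range(cols):
            old = out[i, j]
            new = 255.0 if old >= 128 else 0.0
            out[i, j] = new
            err = new - old
            if j + 1 < cols:
                out[i, j + 1] -= err * 7 / 16            # right
            if i + 1 < rows:
                if j + 1 < cols:
                    out[i + 1, j + 1] -= err * 1 / 16    # right-down
                out[i + 1, j] -= err * 5 / 16            # down
                if j > 0:
                    out[i + 1, j - 1] -= err * 3 / 16    # left-down
    return out

# e.g. for one MRI slice: binary = floyd_steinberg(brains[0, 90] / 1328 * 255)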
Have you tried a threshold on the image?
This is a common way to binarize images, rather than trying to apply a random binomial distribution. You could try something like:
binarized_brains = (brains > threshold_value).astype(int)
which returns an array of 0s and 1s according to whether the image value was less than or greater than your chosen threshold value.
You will have to experiment with the threshold value to find the best one for your images, but it does not need to be normalized first.
If this doesn't work well, you can also experiment with the thresholding options available in the skimage filters package.
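For example, a minimal sketch using Otsu's method from the skimage filters package to choose the threshold automatically (assuming brains is the array from the question):
from skimage.filters import threshold_otsu

threshold_value = threshold_otsu(brains[0])                  # threshold picked from the intensity histogram
binarized_brain = (brains[0] > threshold_value).astype(int)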
It is easy in OpenCV. As mentioned, a very common way is defining a threshold, but your result looks like you are assigning random values to your intensities instead of thresholding the image.
import cv2
im = cv2.imread('brain.png', cv2.IMREAD_GRAYSCALE)
# Otsu's method chooses the threshold automatically from the histogram
(th, brain_bw) = cv2.threshold(im, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# or, with a manually chosen threshold:
# th = (DEFINE HERE)
# (th, im_bin) = cv2.threshold(im, th, 255, cv2.THRESH_BINARY)
cv2.imwrite('binBrain.png', brain_bw)
Images: brain (input) and binBrain (result).
I am new to image processing. I am processing the following image and applying a threshold to identify edges with the following code:
import cv2
import numpy as np
img = cv2.imread("box.jpg")
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
noise_removal = cv2.bilateralFilter(img_gray,9,75,75)
ret,thresh_image = cv2.threshold(noise_removal,0,255,cv2.THRESH_OTSU)
On the left is the original image. In the middle is the gray image computed as img_gray in the code. On the right is the thresholded image computed as thresh_image.
My question: from images 1 and 2 we can see that there is a significant change in gradient at the corners, but the thresholded image also includes the shadow as part of the box object.
I have run the code several times with different threshold values but did not succeed in getting only the box. What am I doing wrong? Can someone help with this? Thanks.
You should consider trying adaptive thresholding:
adp_th = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 5, 1.8)
This is what I got:
Now playing with the morphological operations mentioned on THIS PAGE you can obtain your desired object.
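For instance, a minimal sketch of such a morphological clean-up step (the 5x5 kernel size here is just a guess to tune):
import cv2
import numpy as np

# adp_th is the adaptive-threshold result from the snippet above
kernel = np.ones((5, 5), np.uint8)
opened = cv2.morphologyEx(adp_th, cv2.MORPH_OPEN, kernel)    # remove small specks
cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)  # close small holes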
I just came across another solution regarding the selection of optimal thresholds for edge detection. My previous answer was about adaptive thresholding, which you already know about.
By optimal I mean choosing two values (lower and upper thresholds) based on the median value of the grayscale image. The following code shows how it's done:
v = np.median(gray_img)
sigma = 0.33
#---- apply optimal Canny edge detection using the computed median----
lower_thresh = int(max(0, (1.0 - sigma) * v))
upper_thresh = int(min(255, (1.0 + sigma) * v))
edge_img = cv2.Canny(gray_img, lower_thresh, upper_thresh)
cv2.imshow('Edge_of_box',edge_img)
A sigma value of 0.33 is a commonly used default for this technique.
Illustration: if you look at a Gaussian curve in statistics, values within 0.33 on either side of the center are considered part of the distribution, and any value outside these points is assumed to be an outlier. Since images are considered to be data, the same concept is applied here.
Have a look at this:
Now the second box which you so frequently post:
How can you improve this?
I always wanted to try out the following. Give it a try and do let me know:
First, try replacing the median value with the mean and observe the results.
Change the sigma value and observe how the edge detection changes.
Try performing the above technique on small patches of the image: divide the image into small patches and work your way through them (my way of saying 'localized edge detection'); see the sketch below.
There might be better ways out there that I have not come across yet, but this is a great and fun way to start.
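A rough sketch of the 'localized edge detection' idea (the patch size of 128 is arbitrary, and auto_canny is just the median/mean-based code above wrapped into a function; gray_img is assumed to be an 8-bit grayscale image):
import cv2
import numpy as np

def auto_canny(patch, sigma=0.33, use_mean=False):
    # pick thresholds around the median (or mean) of this patch
    v = np.mean(patch) if use_mean else np.median(patch)
    lower = int(max(0, (1.0 - sigma) * v))
    upper = int(min(255, (1.0 + sigma) * v))
    return cv2.Canny(patch, lower, upper)

def localized_canny(gray_img, patch=128):
    edges = np.zeros_like(gray_img)
    for y in range(0, gray_img.shape[0], patch):
        for x in range(0, gray_img.shape[1], patch):
            edges[y:y+patch, x:x+patch] = auto_canny(gray_img[y:y+patch, x:x+patch])
    return edges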
After reading the docs and searching all over the internet I still do not understand how to interpret the output of the matchTemplate function from openCV.
What I understand:
result = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
I understand that I get kind of a matrix with a matching value for every part in the picture. Each element in this matrix determines how much similarity it shows to the template.
e.g. I can keep only the locations that have a matching value of at least 0.7 with
numpy.where(result >= 0.7)
What I do not understand is how this information is stored in the output I get from the matchTemplate function and how the position of the match can be extracted from the output.
Basically what I want to do is match several templates to one image and then determine which template matches best at which location (i.e. has the maximum matching value of all applied templates for that location).
My idea is to extract the matching value into a matrix for every template and then compare the matrices (their elements) to one another to find the best match.
Thanks for helping and please correct me where I'm wrong,
Greetings Don
What I do not understand is how this information is stored in the output I get from the matchTemplate function and how the position of the match can be extracted from the output.
The result matrix gives, for every pixel in the image, the likelihood that the patch whose top-left corner is at that pixel matches the template.
When you do this
loc = numpy.where(result >= 0.7)
We filter out the low values, and we get the x and y coordinates of the pixels in the image whose likelihood of matching the template's top-left corner is greater than or equal to 0.7.
#(array([202, 203, 203, 203, 204]), array([259, 258, 259, 260, 259]))
Now we have the image locations whose match with the template's top-left corner has a likelihood greater than or equal to 0.7.
In our example output, you can see that we get plenty of matching points for our threshold; we need to loop over them to find each location.
Since we know that the loc variable is a tuple of two NumPy arrays (the Y coordinates and the X coordinates), we need to unpack the tuple and reverse the order of the arrays to get the actual (x, y) positions of the template matches, as follows:
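For example (a sketch; template is the grayscale template and img the image being drawn on):
w, h = template.shape[::-1]        # template width and height
for pt in zip(*loc[::-1]):         # reverse the (y, x) arrays and pair them into (x, y) points
    cv2.rectangle(img, pt, (pt[0] + w, pt[1] + h), (0, 255, 0), 2)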
Basically what I want to do is match several templates to one image and then determine which template matches best at which location (i.e. has the maximum matching value of all applied templates for that location).
The problem is that the results do not all have the same shape, since the shape depends on the template's height and width.
What we can easily do is resize all templates to the same shape (one that approximates all of them) using the cv2.resize method:
# my original shapes:
#   img      (1256, 1300)
#   template (215, 223)
#   template (217, 204)
#   template (207, 203)
width = 220
height = 225
temp = cv2.resize(temp, (width, height), interpolation=cv2.INTER_CUBIC)
This makes all our results the same shape, as below:
res = cv2.matchTemplate(gray, temp,cv2.TM_CCOEFF_NORMED)
print(res.shape) #output (1032, 1081)
**My Approach**
1. I set the threshold to 0.7 (your choice); values less than this threshold are set to zero:
res[res < 0.7] = 0
2. I build the element-wise maximum array over all results (each res is appended to the results list):
maximum_values_array = np.maximum(*results)
3. I find which res holds the maximum value at each particular location:
maximum_value_contains_array =np.array(results).argmax(axis=0)
4. I iterate over every value greater than zero (i.e. above the 0.7 threshold) and select the colour based on which array holds the maximum value at that location:
for i in range(len(maximum_values_array)):
    for j in range(len(maximum_values_array[i])):
        if maximum_values_array[i][j] > 0:
            colour = colours[maximum_value_contains_array[i][j]]
            top_left = (j, i)
            bottom_right = (j + width, i + height)
            cv2.rectangle(img, top_left, bottom_right, colour, 2)
Full Python code:
import cv2
import numpy as np
img = cv2.imread('test.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
templates = [cv2.imread('blue_temp.jpg', 0), cv2.imread('yellow_temp.jpg', 0), cv2.imread('red_temp.jpg', 0)]
colours =[(255,0,0), (0,255,0),(0,0,255)]
results =[]
for temp in templates:
    print(temp.shape)
    width = 220
    height = 225
    # resize every template to the same shape so all result maps match
    temp = cv2.resize(temp, (width, height), interpolation=cv2.INTER_CUBIC)
    res = cv2.matchTemplate(gray, temp, cv2.TM_CCOEFF_NORMED)
    res[res < 0.7] = 0   # zero out everything below the threshold
    results.append(res)
maximum_values_array = np.maximum(*results)
maximum_value_contains_array =np.array(results).argmax(axis=0)
print(maximum_values_array.shape)
print(maximum_value_contains_array.shape)
for i in range(len(maximum_values_array)):
    for j in range(len(maximum_values_array[i])):
        if maximum_values_array[i][j] > 0:
            print(maximum_values_array[i][j])
            colour = colours[maximum_value_contains_array[i][j]]
            top_left = (j, i)
            bottom_right = (j + width, i + height)
            cv2.rectangle(img, top_left, bottom_right, colour, 2)
cv2_imshow(img)  # cv2_imshow comes from google.colab.patches; use cv2.imshow outside Colab
cv2.imwrite('output.png', img)
You can use the following code:
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
When using cv2.TM_CCOEFF_NORMED, max_loc will be the location of the best match of the template in your img, and max_val will be the correlation score of that match.
It simply slides the template image over the input image (as in 2D convolution) and compares the template and patch of input image under the template image.
If input image is of size (WxH) and template image is of size (wxh), output image will have a size of (W-w+1, H-h+1).
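A quick sketch to check that shape (the random arrays are just placeholders):
import cv2
import numpy as np

image = np.random.randint(0, 256, (480, 640), dtype=np.uint8)     # H=480, W=640
template = np.random.randint(0, 256, (48, 64), dtype=np.uint8)    # h=48,  w=64
result = cv2.matchTemplate(image, template, cv2.TM_CCOEFF_NORMED)
print(result.shape)   # (480 - 48 + 1, 640 - 64 + 1) == (433, 577)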
I need to search for outliers in more or less homogeneous images representing some physical array. The images have a resolution which is much higher than the screen resolution, so every pixel on screen originates from a block of image pixels. Is there a possibility to customize the algorithm which calculates the displayed value for such a block? In particular, the possibility to use either the lowest or the highest value would be helpful.
Thanks in advance
SciPy provides several such filters. To get a new image (new) whose pixels are the maximum/minimum over a w*w block of an original image (img), you can use:
new = scipy.ndimage.maximum_filter(img, w)
new = scipy.ndimage.minimum_filter(img, w)
scipy.ndimage has several other filters available.
If the standard filters don't fit your requirements, you can roll your own. To get you started here is an example that shows how to get the minimum in each block in the image. This function reduces the size of the full image (img) by a factor of w in each direction. It returns a smaller image (new) in which each pixel is the minimum pixel in a w*w block of pixels from the original image. The function assumes the image is in a numpy array:
import numpy as np

def condense(img, w):
    # each output pixel is the minimum over a w*w block of the input
    new = np.zeros((img.shape[0] // w, img.shape[1] // w))
    for i in range(0, img.shape[1] // w):
        col1 = i * w
        new[:, i] = img[:, col1:col1 + w].reshape(-1, w * w).min(1)
    return new
If you wanted the maximum, replace min with max.
For the condense function to work well, the size of the full image must be a multiple of w in each direction. The handling of non-square blocks or images that don't divide exactly is left as an exercise for the reader.
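For what it's worth, a vectorized sketch that condenses over both directions at once (again assuming the image size is an exact multiple of w):
import numpy as np

def condense2d(img, w):
    rows, cols = img.shape
    blocks = img.reshape(rows // w, w, cols // w, w)   # split into w*w blocks
    return blocks.min(axis=(1, 3))                     # use .max(...) for the maximum instead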