Cross Correlation in numpy, with FFT - strange result? - python

I'm attempting to perform a cross-correlation of two images using numpy's FFT.
As far as I'm aware, the cross-correlation of two images is equal to the inverse FFT of the product of the Fourier transform of image A and the complex conjugate of the Fourier transform of image B.
Thus, I have the following code:
import cv2
import numpy

img1 = cv2.imread("...jpg")
img1 = cv2.cvtColor(img1, cv2.COLOR_RGB2GRAY)
fft1 = numpy.fft.fft2(img1)
# I'm cross-correlating the same image with itself
fft2 = fft1.copy()
fft2 = numpy.conj(fft2)
# Element-wise multiplication
result = fft1 * fft2
result_img = numpy.fft.ifft2(result)
result_img = numpy.abs(result_img)  # Remove complex values
# Following images are attached (normalize() is my own display rescaling, not shown here)
image_shifted = normalize(numpy.fft.fftshift(result_img))
image_nonshifted = normalize(result_img)
However, my results are rather strange. In order to obtain what I believe to be the actual correlation result, I have to fftshift the output. Here are some example images:
Image, not shifted: you can see bright parts at each corner.
Image, shifted: looks much more like what an auto-correlation result should look like (the centre point is maximal).
I'm not sure if my code, or expected mathematics is wrong, but I can't quite figure out what's going on!
Any help would be greatly appreciated, thanks.

fftshift shifts the zero-frequency component to the center of the signal; in this case the signal is an image.
So all fftshift is doing is centering the output around the zero-frequency component, and it is mostly used for visualization purposes. Your original results are mathematically correct, the axes just aren't centered where you expected them: the correlation peak sits at the (0, 0) corner and wraps around to the other corners, which is exactly the bright-corner pattern you see in the unshifted image.
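As a minimal sketch of the difference (reusing the result_img computed in your code above), you can compare where the auto-correlation peak lands with and without the shift:
import numpy as np

peak = np.unravel_index(result_img.argmax(), result_img.shape)
print(peak)  # without fftshift the peak sits at the (0, 0) corner
shifted = np.fft.fftshift(result_img)
print(np.unravel_index(shifted.argmax(), shifted.shape))  # after fftshift it moves to the centre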

What is the difference between opencv ximgproc.slic and skimage segmentation.slic?

I ran the SLIC (Simple Linear Iterative Clustering) superpixel algorithm from OpenCV and skimage on the same picture but got different results; the skimage SLIC result is better, as shown in the pictures below. The first one is OpenCV SLIC, the second one is skimage SLIC. I have several questions I hope someone can help with:
Why does OpenCV have the parameter 'region_size' while skimage uses 'n_segments'?
Is converting to LAB and applying a Gaussian blur necessary?
Is there any trick to optimize the OpenCV SLIC result?
OpenCV SLIC
Skimage SLIC
# OpenCV
import cv2
ximg = cv2.ximgproc  # requires opencv-contrib-python

src = cv2.imread('pic.jpg')  # read image
# Gaussian blur
src = cv2.GaussianBlur(src, (5, 5), 0)
# Convert to LAB
src_lab = cv2.cvtColor(src, cv2.COLOR_BGR2LAB)
# SLIC
cv_slic = ximg.createSuperpixelSLIC(src_lab, algorithm=ximg.SLICO,
                                    region_size=32)
cv_slic.iterate()
# Skimage
import skimage.segmentation
from skimage import io

src = io.imread('pic.jpg')
sk_slic = skimage.segmentation.slic(src, n_segments=256, sigma=5)
Image with superpixel centroids generated with the code below:
# Measure properties of labeled image regions (labels is the superpixel label image)
from skimage.measure import regionprops
regions = regionprops(labels)
# Scatter the centroid of each superpixel
plt.scatter([r.centroid[1] for r in regions], [r.centroid[0] for r in regions], c='red')
but there is one superpixel fewer (top-left corner), and I found that len(regions) is 64 while len(np.unique(labels)) is 65. Why?
I'm not sure why you think skimage SLIC is better (and I maintain skimage! 😂), but:
Different parameterizations are common in mathematics and computer science. Whether you use region size or number of segments, you should get the same result. I expect the formula to convert between the two to be something like n_segments = image.size / region_size (see the sketch below).
The original paper suggests that for natural images (meaning images of the real world like you showed, rather than e.g. images from a microscope or from astronomy), converting to Lab gives better results.
To me, based on your results, it looks like the Gaussian blur used for scikit-image was stronger than the one used for OpenCV, so you could make the results more similar by playing with the sigma. I also think the compactness parameter is probably not identical between the two.
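A minimal sketch of that conversion (hedged: if region_size is the side length of a superpixel rather than its area, you would divide by region_size squared instead):
import numpy as np

image = np.zeros((480, 640))  # placeholder for the real image
region_size = 32
n_segments = int(image.size / region_size)              # formula as stated above
n_segments_by_area = int(image.size / region_size**2)   # if region_size is a side length
print(n_segments, n_segments_by_area)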

Binarize image data

I have 10 greyscale brain MRI scans from BrainWeb. They are stored as a 4d numpy array, brains, with shape (10, 181, 217, 181). Each of the 10 brains is made up of 181 slices along the z-plane (going through the top of the head to the neck) where each slice is 181 pixels by 217 pixels in the x (ear to ear) and y (eyes to back of head) planes respectively.
All of the brains have dtype('float64'). The maximum pixel intensity across all brains is ~1328 and the minimum is ~0. For example, for the first brain, I calculate this with brains[0].max(), giving 1328.338086605072, and brains[0].min(), giving 0.0003886114541273855. Below is a plot of a slice of brains[0]:
I want to binarize all these brain images by rescaling the pixel intensities from [0, 1328] to {0, 1}. Is my method correct?
I do this by first normalising the pixel intensities to [0, 1]:
normalized_brains = brains/1328
And then by using the binomial distribution to binarize each pixel:
binarized_brains = np.random.binomial(1, (normalized_brains))
The plotted result looks correct:
A 0 pixel intensity represents black (background) and 1 pixel intensity represents white (brain).
I experimented with another normalization method from this post, but it gave me just a black image. This is because np.finfo(np.float64).max is 1.7976931348623157e+308, so the normalization step
normalized_brains = brains/1.7976931348623157e+308
just returned an array of values close to zero, which in the binarization step also led to an array of zeros.
Am I binarising my images using a correct method?
Your method of converting the image to a binary image basically amounts to random dithering, which is a poor method of creating the illusion of grey values on a binary medium. Old-fashioned print is a binary medium, and methods to represent grey-value photographs in print have been fine-tuned over centuries. This process is called halftoning, and it is shaped in part by properties of ink on paper that we do not have to deal with in binary images.
So what methods have people come up with outside of print? Ordered dithering (mostly with a Bayer matrix) and error-diffusion dithering. Read more about dithering on Wikipedia. I wrote a blog post some years ago showing how to implement all of these methods in MATLAB.
I would recommend you use error-diffusion dithering for your particular application. Here is some MATLAB code (taken from my blog post linked above) for the Floyd-Steinberg algorithm; I hope that you can translate it to Python (a rough translation is sketched further below):
img = imread('https://i.stack.imgur.com/d5E9i.png');
img = img(:,:,1);
out = double(img);
sz = size(out);
for ii = 1:sz(1)
   for jj = 1:sz(2)
      old = out(ii,jj);
      %new = 255*(old >= 128); % Original Floyd-Steinberg
      new = 255*(old >= 128+(rand-0.5)*100); % Simple improvement
      out(ii,jj) = new;
      err = new-old;
      if jj < sz(2)
         % right
         out(ii  ,jj+1) = out(ii  ,jj+1) - err*(7/16);
      end
      if ii < sz(1)
         if jj < sz(2)
            % right-down
            out(ii+1,jj+1) = out(ii+1,jj+1) - err*(1/16);
         end
         % down
         out(ii+1,jj  ) = out(ii+1,jj  ) - err*(5/16);
         if jj > 1
            % left-down
            out(ii+1,jj-1) = out(ii+1,jj-1) - err*(3/16);
         end
      end
   end
end
imshow(out)
Resampling the image before applying the dithering greatly improves the results:
img = imresize(img,4);
% (repeat code above)
imshow(out)
NOTE that the above process expects the input to be in the range [0,255]. It is easy to adapt to a different range, say [0,1328] or [0,1], but it is also easy to scale your images to the [0,255] range.
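For reference, here is one possible Python translation of the original (non-randomized) Floyd-Steinberg loop above; a plain sketch, assuming a 2-D grayscale numpy array scaled to [0, 255]:
import numpy as np

def floyd_steinberg(img):
    # Binarize a 2-D array in [0, 255] to {0, 255} with error-diffusion dithering
    out = img.astype(float).copy()
    rows, cols = out.shape
    for i in range(rows):
        for j in range(cols):
            old = out[i, j]
            new = 255.0 * (old >= 128)  # threshold the current pixel
            out[i, j] = new
            err = new - old             # quantization error to diffuse
            if j + 1 < cols:
                out[i, j + 1] -= err * 7 / 16            # right
            if i + 1 < rows:
                if j + 1 < cols:
                    out[i + 1, j + 1] -= err * 1 / 16    # right-down
                out[i + 1, j] -= err * 5 / 16            # down
                if j > 0:
                    out[i + 1, j - 1] -= err * 3 / 16    # left-down
    return out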
Have you tried a threshold on the image?
This is a common way to binarize images, rather than trying to apply a random binomial distribution. You could try something like:
binarized_brains = (brains > threshold_value).astype(int)
which returns an array of 0s and 1s according to whether the image value was less than or greater than your chosen threshold value.
You will have to experiment with the threshold value to find the best one for your images, but it does not need to be normalized first.
If this doesn't work well, you can also experiment with the thresholding options available in the skimage filters package.
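For example, a minimal sketch using Otsu's method from scikit-image (assuming brains is the 4-D array from your question; Otsu is just one of the available options):
from skimage.filters import threshold_otsu

th = threshold_otsu(brains[0])                 # pick a threshold from the first volume
binarized_brains = (brains > th).astype(int)   # array of 0s and 1s, same shape as brains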
It is easy in OpenCV. As mentioned, a very common way is to define a threshold, but your result looks like you are assigning random values to your intensities instead of thresholding them.
import cv2

im = cv2.imread('brain.png', cv2.IMREAD_GRAYSCALE)
# Let Otsu's method pick the threshold automatically
(th, brain_bw) = cv2.threshold(im, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# ...or define the threshold yourself
# th = (DEFINE HERE)
# (_, brain_bw) = cv2.threshold(im, th, 255, cv2.THRESH_BINARY)
cv2.imwrite('binBrain.png', brain_bw)
brain (input)
binBrain (result)

OpenCV - Managing thresholds in image processing with python

I am new to image processing. I am processing the following image and applying a threshold to identify edges with the following code:
import cv2
import numpy as np
img = cv2.imread("box.jpg")
img_gray = cv2.cvtColor(img,cv2.COLOR_RGB2GRAY)
noise_removal = cv2.bilateralFilter(img_gray,9,75,75)
ret,thresh_image = cv2.threshold(noise_removal,0,255,cv2.THRESH_OTSU)
On the left is the original image. In the middle is the gray image calculated as img_gray in the code. On the right is the thresholded image calculated as thresh_image.
My question: from images 1 and 2 we can see that there is a significant change in the gradient at the corners, but the thresholded image also includes the shadow as part of the box object.
I have run the code several times with different threshold values but did not succeed in getting only the box. What am I doing wrong? Can someone help with this? Thanks.
You should consider trying adaptive thresholding:
adp_th = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 5, 1.8)
This is what I got:
Now, by playing with morphological operations (opening, closing, and so on), you can obtain your desired object.
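A minimal sketch of that clean-up step (the kernel size is a guess you would tune, and adp_th is the adaptive-threshold result from above):
import cv2
import numpy as np

kernel = np.ones((5, 5), np.uint8)
# Opening removes small bright specks, closing fills small dark holes
opened = cv2.morphologyEx(adp_th, cv2.MORPH_OPEN, kernel)
cleaned = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)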
I just came across another solution regarding the selection of optimal thresholds for edge detection. My previous answer was about adaptive thresholding, which you already know well.
By optimal I mean choosing two values (lower and upper thresholds) based on the median value of the grayscale image. The following code shows how it's done:
import cv2
import numpy as np

v = np.median(gray_img)  # gray_img: the grayscale image from the question
sigma = 0.33
# ---- apply optimal Canny edge detection using the computed median ----
lower_thresh = int(max(0, (1.0 - sigma) * v))
upper_thresh = int(min(255, (1.0 + sigma) * v))
edge_img = cv2.Canny(gray_img, lower_thresh, upper_thresh)
cv2.imshow('Edge_of_box', edge_img)
cv2.waitKey(0)
A sigma value of 0.33 is a widely used rule of thumb rather than a provably optimal constant.
Illustration: if you look at a Gaussian curve in statistics, values within 0.33 on either side of the centre are considered part of the distribution, and values outside those points are treated as outliers. Since images can be treated as data, the same idea is applied here.
Have a look at this:
Now the second box which you so frequently post:
How can you improve this?
I always wanted to try out the following. Give it a try and let me know (a small sketch follows this list):
First, try replacing the median value with the mean and observe the results.
Change the sigma value and observe how the edge detection changes.
Try performing the above technique on small patches of the image: divide the image into patches and work your way through them (my way of saying 'localized edge detection').
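A hedged sketch wrapping the idea into a small helper so the statistic and sigma are easy to swap (the name auto_canny is my own, not an OpenCV function):
import cv2
import numpy as np

def auto_canny(gray_img, sigma=0.33, use_mean=False):
    # Canny with thresholds derived from the median (or mean) intensity
    v = np.mean(gray_img) if use_mean else np.median(gray_img)
    lower = int(max(0, (1.0 - sigma) * v))
    upper = int(min(255, (1.0 + sigma) * v))
    return cv2.Canny(gray_img, lower, upper)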
There might be better detection methods out there that I have not come across yet, but this is a great and fun way to start.

Python - Get coordinates of important value of 2D array

I would like to determine an angle from an image (2D array).
I can get the coordinates of the point whose intensity is maximum with unravel_index and argmax, but I would like to know how to get another point of high intensity in order to calculate my angle.
I have to automate this because I have a large number of images to post-process.
So for the first coordinates, I can do this:
import numpy as np
from numpy import unravel_index
t = unravel_index(eyy.argmax(), eyy.shape)
And I need another coordinate in order to calculate my angle...
t2 = ....
theta = np.arctan2(t[0]-t2[0],t[1]-t2[1])
What you could try is to look into the Hough Transform (Wikipedia - Hough Transform). The Hough Transform is a tool developed for finding lines and their orientation in images.
There is a Python implementation of the Hough Transform over at Rosetta Code.
I'm not sure if the lines in your data are distinct enough for the Hough Transform to yield good results but I hope it helps.
You can put your array in a masked array, find the pixel with the maximum intensity, mask it, and then find the next pixel with the maximum intensity.
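A minimal sketch of that approach (the random array here is just a placeholder for your data, which I've called eyy as in your code):
import numpy as np

eyy = np.random.rand(100, 100)                         # placeholder for the real image
masked = np.ma.masked_array(eyy)

t = np.unravel_index(masked.argmax(), masked.shape)    # brightest pixel
masked[t] = np.ma.masked                               # hide it
t2 = np.unravel_index(masked.argmax(), masked.shape)   # next brightest pixel

theta = np.arctan2(t[0] - t2[0], t[1] - t2[1])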

Python - matplotlib - imshow - How to influence displayed value of unzoomed image

I need to search for outliers in more or less homogeneous images representing some physical array. The images have a resolution much higher than the screen resolution, so every pixel on screen originates from a block of image pixels. Is there a way to customize the algorithm that calculates the displayed value for such a block? In particular, being able to use either the lowest or the highest value in the block would be helpful.
Thanks in advance
SciPy provides several such filters. To get a new image (new) whose pixels are the maximum/minimum over a w*w block of an original image (img), you can use:
import scipy.ndimage

new = scipy.ndimage.maximum_filter(img, size=w)
new = scipy.ndimage.minimum_filter(img, size=w)
scipy.ndimage has several other filters available.
If the standard filters don't fit your requirements, you can roll your own. To get you started here is an example that shows how to get the minimum in each block in the image. This function reduces the size of the full image (img) by a factor of w in each direction. It returns a smaller image (new) in which each pixel is the minimum pixel in a w*w block of pixels from the original image. The function assumes the image is in a numpy array:
import numpy as np

def condense(img, w):
    # Each output pixel is the minimum over a w*w block of the input
    new = np.zeros((img.shape[0] // w, img.shape[1] // w))
    for i in range(0, img.shape[1] // w):
        col1 = i * w
        new[:, i] = img[:, col1:col1 + w].reshape(-1, w * w).min(1)
    return new
If you want the maximum instead, replace min with max.
For the condense function to work well, the size of the full image must be a multiple of w in each direction. The handling of non-square blocks or images that don't divide exactly is left as an exercise for the reader.
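A minimal usage sketch (the array here is placeholder data):
import numpy as np
import matplotlib.pyplot as plt

img = np.random.rand(1024, 2048)   # placeholder for the high-resolution image
small = condense(img, 8)           # each displayed pixel is the minimum of an 8x8 block
plt.imshow(small, cmap='gray')
plt.show()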
