Disparity Map implementation in Python not matching in-built OpenCV function

I'm facing an issue and would like some input from the community on how to improve the disparity map. I'm following this tutorial for calculating the disparity map between two images. The code I have is as follows:
import cv2
import numpy as np
import sys
from matplotlib import pyplot as plt

num_disparities = 64  # number of disparities to check
block = 9  # block size to match

def preprocess_frame(path):
    image = cv2.imread(path, 0)
    image = cv2.equalizeHist(image)
    image = cv2.GaussianBlur(image, (5, 5), 0)
    return image

def calculate_disparity_matrix(args):
    left_image = preprocess_frame(args[1])
    right_image = preprocess_frame(args[2])
    rows, cols = left_image.shape
    kernel = np.ones([block, block]) / block
    disparity_maps = np.zeros(
        [left_image.shape[0], left_image.shape[1], num_disparities])
    for d in range(0, num_disparities):
        # shift the right image d pixels to the right
        translation_matrix = np.float32([[1, 0, d], [0, 1, 0]])
        shifted_image = cv2.warpAffine(
            right_image, translation_matrix,
            (right_image.shape[1], right_image.shape[0]))
        # calculate absolute differences
        SAD = abs(np.float32(left_image) - np.float32(shifted_image))
        # convolve with the averaging kernel to get the SAD over each block
        filtered_image = cv2.filter2D(SAD, -1, kernel)
        disparity_maps[:, :, d] = filtered_image
    # for each pixel, pick the disparity with the lowest SAD
    disparity = np.argmin(disparity_maps, axis=2)
    disparity = np.uint8(disparity * 255 / num_disparities)
    disparity = cv2.equalizeHist(disparity)
    plt.imshow(disparity, cmap='gray', vmin=0, vmax=255)
    plt.show()

def calculate_disparity_inbuilt(args):
    left_image = preprocess_frame(args[1])
    right_image = preprocess_frame(args[2])
    rows, cols = left_image.shape
    stereo = cv2.StereoBM_create(numDisparities=num_disparities,
                                 blockSize=block)
    disparity = stereo.compute(left_image, right_image)
    plt.imshow(disparity, cmap='gray', vmin=0, vmax=255)
    plt.show()
The problem is that the output I get from the inbuilt OpenCV function is hardly similar to the one I've implemented. I was expecting at least a slight similarity between the two. Is this expected, or am I doing something wrong here?
Implemented Algorithm
OpenCV Algorithm
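One detail worth checking when comparing the two: StereoBM's compute returns a 16-bit signed map in which valid disparities are scaled by 16, so plotting it directly with vmin=0, vmax=255 does not show values comparable to an 8-bit map. A minimal sketch of the usual rescaling for display, inside calculate_disparity_inbuilt above:
    disparity = stereo.compute(left_image, right_image).astype(np.float32) / 16.0  # undo the x16 fixed-point scaling
    plt.imshow(disparity, cmap='gray')
    plt.show()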

Related

For PIL.ImageFilter.GaussianBlur, what kernel is used, and does the radius parameter relate to the standard deviation?

After reading an image with PIL I usually perform a Gaussian filter using scipy.ndimage as follows:
import PIL
import numpy as np
from scipy import ndimage

PIL_image = PIL.Image.open(filename)
data = PIL_image.getdata()
array = np.array(list(data)).reshape(data.size[::-1]+(-1,))
img = array.astype(float)
fimg = ndimage.gaussian_filter(img, sigma=sigma, mode='mirror', order=0)
There is a Gaussian blur function within PIL, as follows (from this answer), but I don't know exactly how it works or what kernel it uses:
from PIL import ImageFilter
fimgPIL = PIL_image.filter(ImageFilter.GaussianBlur(radius=r))
This documentation does not provide details.
Questions about PIL.ImageFilter.GaussianBlur:
What exactly is the radius parameter; is it equivalent to the standard deviation σ?
For a given radius, how far out does it calculate the kernel? 2σ? 3σ? 6σ?
This comment on an answer to Gaussian Blur - standard deviation, radius and kernel size says the following, but I have not found information for PIL yet.
OpenCV uses kernel radius of (sigma * 3) while scipy.ndimage.gaussian_filter uses kernel radius of int(4 * sigma + 0.5)
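The SciPy figure above can be checked empirically by filtering an impulse and counting the nonzero support. A minimal sketch (gaussian_filter1d has a default truncate=4.0, giving a kernel radius of int(4 * sigma + 0.5)):
import numpy as np
from scipy.ndimage import gaussian_filter1d

sigma = 5
impulse = np.zeros(101)
impulse[50] = 1.0
response = gaussian_filter1d(impulse, sigma=sigma)
# support is 2 * int(4 * sigma + 0.5) + 1 = 41 samples for sigma = 5
print(np.count_nonzero(response))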
From the source code, it looks like PIL.ImageFilter.GaussianBlur uses PIL.ImageFilter.BoxBlur. But I wasn't able to figure out how the radius and sigma are related.
I wrote a script to check the difference between scipy.ndimage.gaussian_filter and PIL.ImageFilter.GaussianBlur.
import numpy as np
from scipy import misc
from scipy.ndimage import gaussian_filter
import PIL
from PIL import ImageFilter
import matplotlib.pyplot as plt
# Load test color image
img = misc.face()
# Scipy gaussian filter
sigma = 5
img_scipy = gaussian_filter(img, sigma=(sigma,sigma,0), mode='nearest')
# PIL gaussian filter
radius = 5
PIL_image = PIL.Image.fromarray(img)
img_PIL = PIL_image.filter(ImageFilter.GaussianBlur(radius=radius))
data = img_PIL.getdata()
img_PIL = np.array(data).reshape(data.size[::-1]+(-1,))
img_PIL = img_PIL.astype(np.uint8)
# Image difference
img_diff = np.abs(np.float_(img_scipy) - np.float_(img_PIL))
img_diff = np.uint8(img_diff)
# Stats
mean_diff = np.mean(img_diff)
median_diff = np.median(img_diff)
max_diff = np.max(img_diff)
# Plot results
plt.subplot(221)
plt.imshow(img_scipy)
plt.title('SciPy (sigma = {})'.format(sigma))
plt.axis('off')
plt.subplot(222)
plt.imshow(img_PIL)
plt.title('PIL (radius = {})'.format(radius))
plt.axis('off')
plt.subplot(223)
plt.imshow(img_diff)
plt.title('Image difference \n (Mean = {:.2f}, Median = {:.2f}, Max = {:.2f})'
          .format(mean_diff, median_diff, max_diff))
plt.colorbar()
plt.axis('off')
# Plot histogram
d = img_diff.flatten()
bins = list(range(int(max_diff)))
plt.subplot(224)
plt.title('Histogram of Image difference')
h = plt.hist(d, bins=bins)
for i in range(len(h[0])):
    plt.text(h[1][i], h[0][i], str(int(h[0][i])))
Output for sigma=5, radius=5:
Output for sigma=30, radius=30:
The outputs of scipy.ndimage.gaussian_filter and PIL.ImageFilter.GaussianBlur are very similar and the difference is negligible. More than 95% of difference values are <= 2.
PIL version: 7.2.0, SciPy version: 1.5.0
This is a supplementary answer to @Nimal's accepted answer.
Basically the radius parameter is like sigma. I won't dig too deep, but I think the Gaussian kernel is slightly different internally in order to preserve normalization after rounding back to integers, since the PIL method returns 0 to 255 integer levels.
The script below generates an image that is 1 on the left and 0 on the right, then does a sigma = 10 pixel blur with both methods, and then plots the center horizontal line through each result, plus their differences. I take the difference in both orders since a log scale can only display positive values.
The first panel shows the difference between the PIL result and the SciPy float result, the second the difference against the SciPy result truncated to integers, and the third against the SciPy result rounded to integers.
import numpy as np
import matplotlib.pyplot as plt
import PIL
from PIL import ImageFilter
from scipy.ndimage import gaussian_filter

sigma = 10.0
filename = 'piximg.png'

# Save a PNG that is 1 on the left half and 0 on the right
piximg = np.zeros((101, 101), dtype=float)
piximg[:, :50] = 1.0
plt.imsave(filename, piximg, cmap='gray')

# Read with PIL
PIL_image = PIL.Image.open(filename)

# Blur with PIL
img_PIL = PIL_image.filter(ImageFilter.GaussianBlur(radius=sigma))
data = img_PIL.getdata()
img_PIL = np.array(list(data)).reshape(data.size[::-1]+(-1,))
g1 = img_PIL[..., 1]

# Blur with SciPy
data = PIL_image.getdata()
array = np.array(list(data)).reshape(data.size[::-1]+(-1,))
img = array.astype(float)
fimg = gaussian_filter(img[..., :3], sigma=sigma, mode='mirror', order=0)
g2 = fimg[..., 1]
g2u = np.uint8(g2)          # truncated to integers
g2ur = np.uint8(g2 + 0.5)   # rounded to integers

plt.figure()

plt.subplot(3, 1, 1)
plt.plot(g1[50])
plt.plot(g2[50])
plt.plot(g2[50] - g1[50])
plt.plot(g1[50] - g2[50])
plt.yscale('log')
plt.ylim(0.1, None)

plt.subplot(3, 1, 2)
plt.plot(g1[50])
plt.plot(g2u[50])
plt.plot(g2u[50] - g1[50])
plt.plot(g1[50] - g2u[50])
plt.yscale('log')
plt.ylim(0.1, None)

plt.subplot(3, 1, 3)
plt.plot(g1[50])
plt.plot(g2ur[50])
plt.plot(g2ur[50] - g1[50])
plt.plot(g1[50] - g2ur[50])
plt.yscale('log')
plt.ylim(0.1, None)

plt.show()

OpenCV: undistort (for images) and undistortPoints are inconsistent

For testing, I generate a grid image as a matrix and, separately, the grid points as a point array:
This represents a "distorted" camera image along with some feature points.
When I now undistort both the image and the grid points, I get the following result:
(Note that the fact that the "distorted" image is straight and the "undistorted" image is morphed is not the point, I'm just testing the undistortion functions with a straight test image.)
The grid image and the red grid points are totally misaligned now. I googled and found that some people forget to specify the "new camera matrix" parameter in undistortPoints but I didn't. The documentation also mentions a normalization but I still have the problem when I use the identity matrix as camera matrix. Also, in the central region it fits perfectly.
Why are these not identical? Am I using something in the wrong way?
I use cv2 (4.1.0) in Python. Here is the code for testing:
import numpy as np
import matplotlib.pyplot as plt
import cv2

w = 401
h = 301

# helpers
#--------
def plotImageAndPoints(im, pu, pv):
    plt.imshow(im, cmap="gray")
    plt.scatter(pu, pv, c="red", s=16)
    plt.xlim(0, w)
    plt.ylim(0, h)
    plt.show()

def cv2_undistortPoints(uSrc, vSrc, cameraMatrix, distCoeffs):
    uvSrc = np.array([np.matrix([uSrc, vSrc]).transpose()], dtype="float32")
    uvDst = cv2.undistortPoints(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix)
    uDst = [uv[0] for uv in uvDst[0]]
    vDst = [uv[1] for uv in uvDst[0]]
    return uDst, vDst

# test data
#----------
# generate grid image
img = np.ones((h, w), dtype = "float32")
img[0::20, :] = 0
img[:, 0::20] = 0
# generate grid points
uPoints, vPoints = np.meshgrid(range(0, w, 20), range(0, h, 20), indexing='xy')
uPoints = uPoints.flatten()
vPoints = vPoints.flatten()
# see if points align with the image
plotImageAndPoints(img, uPoints, vPoints) # perfect!
# undistort both image and points individually
#---------------------------------------------
# camera matrix parameters
fx = 1
fy = 1
cx = w/2
cy = h/2
# distortion parameters
k1 = 0.00003
k2 = 0
p1 = 0
p2 = 0
# convert for opencv
mtx = np.matrix([
    [fx,  0, cx],
    [ 0, fy, cy],
    [ 0,  0,  1]
], dtype="float32")
dist = np.array([k1, k2, p1, p2], dtype = "float32")
# undistort image
imgUndist = cv2.undistort(img, mtx, dist)
# undistort points
uPointsUndist, vPointsUndist = cv2_undistortPoints(uPoints, vPoints, mtx, dist)
# test if they still match
plotImageAndPoints(imgUndist, uPointsUndist, vPointsUndist) # awful!
Any help appreciated!
A bit late to the party, but to help others running into this issue:
The problem is that cv2.undistortPoints is an iterative calculation which in some cases exits before a stable solution has been reached. This can be fixed by modifying the termination criteria for the calculation, using cv2.undistortPointsIter. You should replace:
uvDst = cv2.undistortPoints(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix)
with:
uvDst = cv2.undistortPointsIter(uvSrc, cameraMatrix, distCoeffs, None, cameraMatrix,(cv2.TERM_CRITERIA_COUNT | cv2.TERM_CRITERIA_EPS, 40, 0.03))
Now it tries up to 40 iterations to find a solution, rather than the default 5.
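For reference, here is a sketch of the question's helper with the fix applied. The name cv2_undistortPointsIter_helper is hypothetical, and the criteria tuple means: stop after 40 iterations (TERM_CRITERIA_COUNT) or once the update is smaller than 0.03 (TERM_CRITERIA_EPS), whichever comes first.
def cv2_undistortPointsIter_helper(uSrc, vSrc, cameraMatrix, distCoeffs):
    uvSrc = np.array([np.matrix([uSrc, vSrc]).transpose()], dtype="float32")
    criteria = (cv2.TERM_CRITERIA_COUNT | cv2.TERM_CRITERIA_EPS, 40, 0.03)
    uvDst = cv2.undistortPointsIter(uvSrc, cameraMatrix, distCoeffs, None,
                                    cameraMatrix, criteria)
    uDst = [uv[0] for uv in uvDst[0]]
    vDst = [uv[1] for uv in uvDst[0]]
    return uDst, vDst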

Linear-Blurring an Image

I'm trying to blur an image by mapping each pixel to the average of the N pixels to the right of it (in the same row). My iterative solution produces good output, but my linear-algebra solution is producing bad output.
From testing, I believe my kernel-matrix is correct; and, I know the last N rows don't get blurred, but that's fine for now. I'd appreciate any hints or solutions.
Here are the iterative-solution output (good), the linear-algebra output (bad), and the original image; and here is the failing linear-algebra code:
def blur(orig_img):
    # turn image matrix into a vector
    flattened_img = orig_img.flatten()
    L = flattened_img.shape[0]
    N = 3
    # kernel
    kernel = np.zeros((L, L))
    for r, row in enumerate(kernel[0:-N]):
        row[r:r+N] = [round(1/N, 3)]*N
    print(kernel)
    # blur the img
    print('starting blurring')
    blurred_img = np.matmul(kernel, flattened_img)
    blurred_img = blurred_img.reshape(orig_img.shape)
    return blurred_img
The equation I'm modelling is this: blurred[i] = (1/N) * (flattened_img[i] + flattened_img[i+1] + ... + flattened_img[i+N-1]), i.e. each output pixel is the average of the N input pixels starting at its own position.
One option might be to just use a kernel and a convolution?
For example, if we load a grayscale image like so:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from scipy import ndimage
# load an image and convert it to grayscale in a hackish way
image = np.asarray(Image.open('cup.jpg')).mean(axis=2)
plt.imshow(image)
plt.title('Gray scale image')
plt.show()
Now one can use a kernel and a convolution. For example, to create a filter that operates on just one row and computes the value of the center pixel as the difference between the pixels to its right and left, one can do the following:
# Create a kernel that takes the difference between neighboring horizontal pixels
k = np.array([[-1,0,1]])
plt.subplot(121)
plt.title('Kernel')
plt.imshow(k)
plt.subplot(122)
plt.title('Output')
plt.imshow(ndimage.convolve(image, k, mode='constant', cval=0.0))
plt.show()
Therefore, one can blur an image by mapping each pixel to the average of the N pixels to the right of it by creating the appropriate kernel.
# Create a kernel that takes the average of N pixels to the right
n = 10
k = np.zeros(n * 2)
k[n:] = 1 / n
k = k[np.newaxis, ...]
plt.subplot(121)
plt.title('Kernel')
plt.imshow(k)
plt.subplot(122)
plt.title('Output')
plt.imshow(ndimage.convolve(image, k, mode='constant', cval=0.0))
plt.show()
The issue was incorrect usage of cv2.imshow() when displaying the output image. It expects floating-point pixel values to be in [0, 1]; that normalization is done in the code below (near the bottom):
import cv2
import numpy as np

def blur(orig_img):
    flattened_img = orig_img.flatten()
    L = flattened_img.shape[0]
    N = int(round(0.1 * orig_img.shape[0], 0))
    # mask (A)
    mask = np.zeros((L, L))
    for r, row in enumerate(mask[0:-N]):
        row[r:r+N] = [round(1/N, 2)]*N
    # blurred img = A * flattened_img
    print('starting blurring')
    blurred_img = np.matmul(mask, flattened_img)
    blurred_img = blurred_img.reshape(orig_img.shape)
    cv2.imwrite('blurred_img.png', blurred_img)
    # normalize img to [0, 1]
    blurred_img = (
        blurred_img - blurred_img.min()) / (blurred_img.max() - blurred_img.min())
    return blurred_img
Amended output:
Thank you to @CrisLuengo for identifying the issue.
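A further subtlety of the flattened-image formulation, shown here as a small sketch (with a hypothetical 3x4 image and N = 2): because flattening concatenates rows, the last N - 1 pixels of each row average in pixels from the start of the next row.
import numpy as np

N = 2
img = np.arange(12, dtype=float).reshape(3, 4)
flat = img.flatten()
L = flat.shape[0]
mask = np.zeros((L, L))
for r in range(L - N):
    mask[r, r:r+N] = 1.0 / N
blurred = (mask @ flat).reshape(img.shape)
# blurred[0, 3] == 3.5: it averages img[0, 3] with img[1, 0] from the next row
print(blurred)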

Simple Python Blur Convolution Kernel Function Generates Weird Image

I was writing a custom function to apply a blur to an image using a convolution kernel. When I show the image, however, there is a weird result. In some ways it seems that the image was inverted, but I am not sure why. Here is the original image:
Here is the result:
I have already tried rewriting the code, changing the image, changing the blur kernel, printing and personally checking many convolutions by eye, etc.
import cv2
import numpy as np
import matplotlib.pyplot as plt

def showImage(image):
    plt.imshow(image, cmap='gray')
    plt.show()

def gaussianBlur(image):
    tempImage = image.copy()
    tempImage = np.pad(tempImage, 1, "constant")
    showImage(tempImage)
    max = 0
    i = 0
    for x in range(1, len(image)-1):
        for y in range(1, len(image[0])-1):
            roi = image[x-1:x+2, y-1:y+2]
            kernel = np.array([
                [0.0625, 0.125, 0.0625],
                [0.125, 0.25, 0.125],
                [0.0625, 0.125, 0.0625]
            ])
            if np.matmul(roi, kernel).sum() > max:
                max = np.matmul(roi, kernel).sum()
            tempImage[x][y] = np.matmul(roi, kernel).sum()
            i += 1
            print(np.matmul(roi, kernel).sum())
            # if(i % 1000 == 0):
            #     showImage(tempImage)
    divAmount = max / 255
    for x in range(1, len(image)-1):
        for y in range(1, len(image[0])-1):
            tempImage[x][y] = tempImage[x][y] / divAmount
    return tempImage.tolist()

# Load and view the image
image = cv2.imread("image_1_small.jpg", 0)
showImage(image)

# Apply Blur
image = gaussianBlur(image)
print(image)
# image = cv2.GaussianBlur(image, (5, 5), 0)
showImage(image)
The expected outcome should look like the original image, only blurred.
This is caused by overflow: you are computing the convolution incorrectly. np.matmul performs a matrix multiplication of the 3x3 patch with the kernel, whereas a convolution needs an element-wise product before summing. Use np.multiply in place of np.matmul.
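A sketch of the corrected inner step, using the same roi, kernel, and tempImage as in the question's loop:
            value = np.multiply(roi, kernel).sum()  # element-wise product of patch and kernel, then sum
            tempImage[x][y] = value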

Error with image sharpening in Python

from PIL import Image

fp = "C:\\lena.jpg"
img = Image.open(fp)
w, h = img.size
pixels = img.load()
imgsharp = Image.new(img.mode, img.size, color=0)
sharp = [0, -1, 0, -1, 8, -1, 0, -1, 0]
for i in range(w):
    for j in range(h):
        for k in range(3):
            for m in range(3):
                l=pixels[i-k+1,j-m+1]*sharp[i]
                if l > 255:
                    l = 255
                elif l < 0:
                    l = 0
        imgsharp.putpixel((i, j), l)
imgsharp.show()
I want to apply a high pass (sharpening) filter with 3x3 mask size to a grayscale image. But I am getting an error:
Traceback (most recent call last):
  File "C:\sharp.py", line 16, in <module>
    l=pixels[i-k+1,j-m+1]*sharp[i]
IndexError: image index out of range
How can I fix my mistake and how can I get the image sharpening to work in this code?
The specific error you mention occurs because you are not dealing with the borders of the image. A solution is to pad the image or to clamp the indices to the width and height limits. For example: replace i-k+1 and j-m+1 with max(0, min(w - 1, i-k+1)) and max(0, min(h - 1, j-m+1)), respectively.
There are other issues with your code:
The element of the filter you are accessing is not right... you probably meant sharp[3*m+k] where you wrote sharp[i].
Are you using a color or a grayscale image? For color images, l has 3 components and can't be directly compared to a single number (0 or 255).
Also, l should accumulate the products of all nine kernel elements; the clipping of the l value and the putpixel call belong after the two inner loops, not where they are now.
Your kernel looks a bit odd. Is that 8 supposed to be a 5? Or maybe a 9, with the 0s becoming -1s? Take a look at kernels and at this example.
This implementation with several nested loops is not very efficient; a corrected version of the loop with these fixes applied is sketched below.
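A minimal corrected sketch of the original loop, assuming a grayscale input image and the 5-centered kernel suggested above (the file name is a placeholder):
from PIL import Image

img = Image.open('lena.jpg').convert('L')  # placeholder file name; force grayscale
w, h = img.size
pixels = img.load()
imgsharp = Image.new('L', img.size, color=0)
sharp = [0, -1, 0, -1, 5, -1, 0, -1, 0]
for i in range(w):
    for j in range(h):
        l = 0
        for k in range(3):
            for m in range(3):
                # clamp the indices at the image borders
                x = max(0, min(w - 1, i - k + 1))
                y = max(0, min(h - 1, j - m + 1))
                l += pixels[x, y] * sharp[3*m + k]
        # clip only after the whole neighborhood has been accumulated
        imgsharp.putpixel((i, j), max(0, min(255, l)))
imgsharp.show()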
I recommend the following solutions to your problem.
If you want to sharpen the image and that's all, you can use PIL.Image.filter:
from PIL import Image, ImageFilter
img = Image.open('lena.png')
img_sharp = img.filter(ImageFilter.SHARPEN)
img_sharp.show()
If you do want to specify the kernel, try the following with scipy. Be sure to take a look at the convolve documentation.
from PIL import Image
from scipy import ndimage
import numpy as np

img = np.asarray(Image.open('lena.png'), dtype=float)  # read as float
kernel = np.array([0, -1, 0, -1, 5, -1, 0, -1, 0]).reshape((3, 3, 1))
# here we do the convolution with the kernel
imgsharp = ndimage.convolve(img, kernel, mode='nearest')
# then we clip (0 to 255) and convert to unsigned int
imgsharp = np.clip(imgsharp, 0, 255).astype(np.uint8)
Image.fromarray(imgsharp).show()  # display
Another approach is to use OpenCV. Take a look at this article; it will clarify many implementation details.
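For instance, a minimal sketch with cv2.filter2D and the same sharpening kernel (the file names are placeholders):
import cv2
import numpy as np

img = cv2.imread('lena.png')  # placeholder file name
kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]], dtype=np.float32)
# ddepth=-1 keeps the source depth; uint8 output saturates instead of wrapping
sharpened = cv2.filter2D(img, -1, kernel)
cv2.imwrite('lena_sharp.png', sharpened)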
We can sharpen an RGB image with scipy.signal.convolve2d as well. We have to apply the convolution separately to each image channel. The code below shows this for the lena image:
from scipy import signal
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

im = np.asarray(Image.open('../images/lena.jpg')) / 255.  # scale pixel values to [0,1] for each channel
print(np.max(im))
# 1.0
print(im.shape)
# (220, 220, 3)
sharpen_kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])
im_sharpened = np.ones(im.shape)
for i in range(3):
    im_sharpened[..., i] = np.clip(signal.convolve2d(im[..., i], sharpen_kernel, mode='same', boundary='symm'), 0, 1)
fig, ax = plt.subplots(nrows=2, figsize=(10, 20))
ax[0].imshow(im)
ax[0].set_title('Original Image', size=20)
ax[1].imshow(im_sharpened)
ax[1].set_title('Sharpened Image', size=20)
plt.show()
We can also use a Gaussian kernel to first blur the image and then subtract the blurred image from a scaled original to get a sharpened image, as shown in the following code. The line im_sharp = np.clip(2*im - im_blur, 0, 1) is the unsharp-mask identity sharpened = original + (original - blurred), clipped to [0, 1].
from scipy import ndimage
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

im = np.asarray(Image.open('../images/lena.jpg')) / 255  # scale pixel values to [0,1] for each channel
# First a 1-D Gaussian
t = np.linspace(-10, 10, 30)
bump = np.exp(-0.1*t**2)
bump /= np.trapz(bump)  # normalize the integral to 1
# make a 2-D kernel out of it
kernel = bump[:, np.newaxis] * bump[np.newaxis, :]
im_blur = ndimage.convolve(im, kernel.reshape(30, 30, 1))
im_sharp = np.clip(2*im - im_blur, 0, 1)
fig, ax = plt.subplots(nrows=2, figsize=(10, 20))
ax[0].imshow(im)
ax[0].set_title('Original Image', size=20)
ax[1].imshow(im_sharp)
ax[1].set_title('Sharpened Image', size=20)
plt.show()
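The same blur-and-subtract idea can be written more compactly with ndimage.gaussian_filter. A sketch, using a random stand-in image and an assumed sigma of 3 (blurring only the two spatial axes):
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
im = rng.random((64, 64, 3))  # stand-in for the scaled lena image
im_blur = ndimage.gaussian_filter(im, sigma=(3, 3, 0))  # sigma 0 on the channel axis
im_sharp = np.clip(2 * im - im_blur, 0, 1)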
