I'm using scipy's convolve2d:
for i in range(0, 12):
R.append(scipy.signal.convolve2d(self.img, h[i], mode = 'same'))
After convolution all values are in magnitudes of 10000s, but considering I'm working with images, I need them to be in the range of 0-255. How do I normalize it?
Assuming that you want to normalize within one single image, you can simply use im_out = im_out / im_out.max() * 255 .
You could also normalize kernel or original image.
Example below.
import scipy.signal
import numpy as np
import matplotlib.pyplot as plt
from skimage import color
from skimage import io
im = plt.imread('dice.jpg')
gray_img = color.rgb2gray(im)
print im.max()
# make some kind of kernel, there are many ways to do this...
t = 1 - np.abs(np.linspace(-1, 1, 16))
kernel = t.reshape(16, 1) * t.reshape(1, 16)
kernel /= kernel.sum() # kernel should sum to 1! :)
im_out =scipy.signal.convolve2d(gray_img, kernel, mode = 'same')
im_out = im_out / im_out.max() * 255
print im_out.max()
plt.subplot(2,1,1)
plt.imshow(im)
plt.subplot(2,1,2)
plt.imshow(im_out)
plt.show()
Related
After reading an image with PIL I usually perform a Gaussian filter using scipy.ndimage as follow
import PIL
from scipy import ndimage
PIL_image = PIL.Image.open(filename)
data = PIL_image.getdata()
array = np.array(list(data)).reshape(data.size[::-1]+(-1,))
img = array.astype(float)
fimg = ndimage.gaussian_filter(img, sigma=sigma, mode='mirror', order=0)
There is Gaussian blur function within PIL as follows (from this answer), but I don't know how it works exactly or what kernel it uses:
from PIL import ImageFilter
fimgPIL = PIL_image.filter(ImageFilter.GaussianBlur(radius=r)
This documentation does not provide details.
Questions about PIL.ImageFilter.GaussianBlur:
What exactly is the radius parameter; is it equivalent to the standard deviation σ?
For a given radius, how far out does it calculate the kernel? 2σ? 3σ? 6σ?
This comment on an answer to Gaussian Blur - standard deviation, radius and kernel size says the following, but I have not found information for PIL yet.
OpenCV uses kernel radius of (sigma * 3) while scipy.ndimage.gaussian_filter uses kernel radius of int(4 * sigma + 0.5)
From the source code, it looks like PIL.ImageFilter.GaussianBlur uses PIL.ImageFilter.BoxBlur. But I wasn't able to figure out how the radius and sigma are related.
I wrote a script to check the difference between scipy.ndimage.gaussian_filter and PIL.ImageFilter.GaussianBlur.
import numpy as np
from scipy import misc
from scipy.ndimage import gaussian_filter
import PIL
from PIL import ImageFilter
import matplotlib.pyplot as plt
# Load test color image
img = misc.face()
# Scipy gaussian filter
sigma = 5
img_scipy = gaussian_filter(img, sigma=(sigma,sigma,0), mode='nearest')
# PIL gaussian filter
radius = 5
PIL_image = PIL.Image.fromarray(img)
img_PIL = PIL_image.filter(ImageFilter.GaussianBlur(radius=radius))
data = img_PIL.getdata()
img_PIL = np.array(data).reshape(data.size[::-1]+(-1,))
img_PIL = img_PIL.astype(np.uint8)
# Image difference
img_diff = np.abs(np.float_(img_scipy) - np.float_(img_PIL))
img_diff = np.uint8(img_diff)
# Stats
mean_diff = np.mean(img_diff)
median_diff = np.median(img_diff)
max_diff = np.max(img_diff)
# Plot results
plt.subplot(221)
plt.imshow(img_scipy)
plt.title('SciPy (sigma = {})'.format(sigma))
plt.axis('off')
plt.subplot(222)
plt.imshow(img_PIL)
plt.title('PIL (radius = {})'.format(radius))
plt.axis('off')
plt.subplot(223)
plt.imshow(img_diff)
plt.title('Image difference \n (Mean = {:.2f}, Median = {:.2f}, Max = {:.2f})'
.format(mean_diff, median_diff, max_diff))
plt.colorbar()
plt.axis('off')
# Plot histogram
d = img_diff.flatten()
bins = list(range(int(max_diff)))
plt.subplot(224)
plt.title('Histogram of Image difference')
h = plt.hist(d, bins=bins)
for i in range(len(h[0])):
plt.text(h[1][i], h[0][i], str(int(h[0][i])))
Output for sigma=5, radius=5:
Output for sigma=30, radius=30:
The outputs of scipy.ndimage.gaussian_filter and PIL.ImageFilter.GaussianBlur are very similar and the difference is negligible. More than 95% of difference values are <= 2.
PIL version: 7.2.0, SciPy version: 1.5.0
This is a supplementary answer to #Nimal's accepted answer.
Basically the radius parameter is like sigma. I won't dig too deep, but I think the Gaussian kernel is slightly different internally in order to preserve normalization after rounding back to integers, since the PIL method returns 0 to 255 integer levels.
The script below generates an image that is 1 on the left and 0 on the right, then does a sigma = 10 pixel blur with both methods, then plots the center horizontal lines through each, plus their differences. I do difference twice since log can only display positive differences.
The first panel is the difference between PIL and the SciPy float results, the second is the truncated integer SciPy results, and the third is rounded SciPy.
import numpy as np
import matplotlib.pyplot as plt
import PIL
from scipy.ndimage import gaussian_filter
from PIL import ImageFilter
import PIL
sigma = 10.0
filename = 'piximg.png'
# Save a PNG with a central pixel = 1
piximg = np.zeros((101, 101), dtype=float)
piximg[:, :50] = 1.0
plt.imsave(filename, piximg, cmap='gray')
# Read with PIL
PIL_image = PIL.Image.open(filename)
# Blur with PIL
img_PIL = PIL_image.filter(ImageFilter.GaussianBlur(radius=sigma))
data = img_PIL.getdata()
img_PIL = np.array(list(data)).reshape(data.size[::-1]+(-1,))
g1 = img_PIL[..., 1]
# Blur with SciPy
data = PIL_image.getdata()
array = np.array(list(data)).reshape(data.size[::-1]+(-1,))
img = array.astype(float)
fimg = gaussian_filter(img[...,:3], sigma=sigma, mode='mirror', order=0)
g2 = fimg[..., 1]
g2u = np.uint8(g2)
g2ur = np.uint8(g2+0.5)
if True:
plt.figure()
plt.subplot(3, 1, 1)
plt.plot(g1[50])
plt.plot(g2[50])
plt.plot(g2[50] - g1[50])
plt.plot(g1[50] - g2[50])
plt.yscale('log')
plt.ylim(0.1, None)
plt.subplot(3, 1, 2)
plt.plot(g1[50])
plt.plot(g2u[50])
plt.plot(g2u[50] - g1[50])
plt.plot(g1[50] - g2u[50])
plt.yscale('log')
plt.ylim(0.1, None)
plt.subplot(3, 1, 3)
plt.plot(g1[50])
plt.plot(g2ur[50])
plt.plot(g2ur[50] - g1[50])
plt.plot(g1[50] - g2ur[50])
plt.yscale('log')
plt.ylim(0.1, None)
plt.show()
I am using Unet for segmentation in python and my unet's output is a mask with this shape [512,512,1].
After predicted a mask I want to do f1 score between the predicted mask and the real mask of the test image. I need to convert the real mask from [512,512,3] to [512,512,1] and I just can convert to [512,512].
Can anyone help me?
Image with my outputs
You can use Pillow
from PIL import Image
img = Image.open('image.png').convert('LA')
img.save('greyscale.png')
OR
Using matplotlib and the formula
Y' = 0.2989 R + 0.5870 G + 0.1140 B
you could do:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
def rgb2gray(rgb):
return np.dot(rgb[...,:3], [0.2989, 0.5870, 0.1140])
img = mpimg.imread('image.png')
gray = rgb2gray(img)
plt.imshow(gray, cmap=plt.get_cmap('gray'), vmin=0, vmax=1)
plt.show()
You can refer this answer for more info
I discover the answer. I had to remove one axis
predicted.shape
[512,512,1]
predicted = predicted[:, :,:, 0]
predicted.shape
[512,512]
I'm trying to blurr an image by mapping each pixel to the average of the N pixels to the right of it (in the same row). My iterative solution produces good output, but my linear-algebra solution is producing bad output.
From testing, I believe my kernel-matrix is correct; and, I know the last N rows don't get blurred, but that's fine for now. I'd appreciate any hints or solutions.
iterative-solution output (good), linear-algebra output (bad)
original image; and here is the failing linear-algebra code:
def blur(orig_img):
# turn image-mat into a vector
flattened_img = orig_img.flatten()
L = flattened_img.shape[0]
N = 3
# kernel
kernel = np.zeros((L, L))
for r, row in enumerate(kernel[0:-N]):
row[r:r+N] = [round(1/N, 3)]*N
print(kernel)
# blurr the img
print('starting blurring')
blurred_img = np.matmul(kernel, flattened_img)
blurred_img = blurred_img.reshape(orig_img.shape)
return blurred_img
The equation I'm modelling is this:
One option might be to just use a kernel and a convolution?
For example if we load a gray scale image like so:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from scipy import ndimage
# load a hackinsh grayscale image
image = np.asarray(Image.open('cup.jpg')).mean(axis=2)
plt.imshow(image)
plt.title('Gray scale image')
plt.show()
Now one can use a kernel and convolution. For example to create a filter that filters just one rows and compute the value of the center pixel as the difference between the pixels to the right and left one can do the following:
# Create a kernel that takes the difference between neighbors horizontal pixes
k = np.array([[-1,0,1]])
plt.subplot(121)
plt.title('Kernel')
plt.imshow(k)
plt.subplot(122)
plt.title('Output')
plt.imshow(ndimage.convolve(image, k, mode='constant', cval=0.0))
plt.show()
Therefore, one can blurr an image by mapping each pixel to the average of the N pixels to the right of it by creating the appropiate kernel.
# Create a kernel that takes the average of N pixels to the right
n=10
k = np.zeros(n*2);k[n:]=1/n
k = k[np.newaxis,...]
plt.subplot(121)
plt.title('Kernel')
plt.imshow(k)
plt.subplot(122)
plt.title('Output')
plt.imshow(ndimage.convolve(image, k, mode='constant', cval=0.0))
plt.show()
The issue was incorrect usage of cv2.imshow() in displaying the output image. It expects floating-point pixel values to be in [0, 1]; which, is done in the below code (near bottom):
def blur(orig_img):
flattened_img = orig_img.flatten()
L = flattened_img.shape[0]
N = int(round(0.1 * orig_img.shape[0], 0))
# mask (A)
mask = np.zeros((L, L))
for r, row in enumerate(mask[0:-N]):
row[r:r+N] = [round(1/N, 2)]*N
# blurred img = A * flattened_img
print('starting blurring')
blurred_img = np.matmul(mask, flattened_img)
blurred_img = blurred_img.reshape(orig_img.shape)
cv2.imwrite('blurred_img.png', blurred_img)
# normalize img to [0,1]
blurred_img = (
blurred_img - blurred_img.min()) / (blurred_img.max()-blurred_img.min())
return blurred_img
Ammended output
Thank you to #CrisLuengo for identifying the issue.
I'm facing an issue, and would like some inputs from the community on how to improve the disparity map. I'm following this tutorial for calculating the disparity map between 2 images. The code I have is as follows:
import cv2
import numpy as np
import sys
from matplotlib import pyplot as plt
num_disparities = 64 # number of disparities to check
block = 9 # block size to match
def preprocess_frame(path):
image = cv2.imread(path, 0)
image = cv2.equalizeHist(image)
image = cv2.GaussianBlur(image, (5, 5), 0)
return image
def calculate_disparity_matrix(args):
left_image = preprocess_frame(args[1])
right_image = preprocess_frame(args[2])
rows, cols = left_image.shape
kernel = np.ones([block, block]) / block
disparity_maps = np.zeros(
[left_image.shape[0], left_image.shape[1], num_disparities])
for d in range(0, num_disparities):
# shift image
translation_matrix = np.float32([[1, 0, d], [0, 1, 0]])
shifted_image = cv2.warpAffine(
right_image, translation_matrix,
(right_image.shape[1], right_image.shape[0]))
# calculate squared differences
SAD = abs(np.float32(left_image) - np.float32(shifted_image))
# convolve with kernel and find SAD at each point
filtered_image = cv2.filter2D(SAD, -1, kernel)
disparity_maps[:, :, d] = filtered_image
disparity = np.argmin(disparity_maps, axis=2)
disparity = np.uint8(disparity * 255 / num_disparities)
disparity = cv2.equalizeHist(disparity)
plt.imshow(disparity, cmap='gray', vmin=0, vmax=255)
plt.show()
def calculate_disparity_inbuilt(args):
left_image = preprocess_frame(args[1])
right_image = preprocess_frame(args[2])
rows, cols = left_image.shape
stereo = cv2.StereoBM_create(numDisparities=num_disparities,
blockSize=block)
disparity = stereo.compute(left_image, right_image)
plt.imshow(disparity, cmap='gray', vmin=0, vmax=255)
plt.show()
The problem is that the output that I get from the inbuilt function in OpenCV is hardly similar to the one I've implemented. I was expecting at least a slight similarity between the 2. Is this expected? or am I doing something wrong here?
Implemented Algorithm
OpenCV Algorithm
I'm looking for a robust way to extract the foreground from an image where the background has some noise in it.
So, the image I want to use it on is:
My attempt was to use the Otsu thresholding. I did that in Python as follows:
from skimage.filter import threshold_otsu
import os.path as path
import matplotlib.pyplot as plt
img = io.imread(path.expanduser('~/Desktop/62.jpg'))
r_t = threshold_otsu(img[:, :, 0])
g_t = threshold_otsu(img[:, :, 1])
b_t = threshold_otsu(img[:, :, 2])
m = np.zeros((img.shape[0], img.shape[1]), dtype=np.uint8)
mask = (img[:, :, 0] < r_t) & (img[:, :, 1] < g_t) & (img[:, :, 2] < b_t)
m[~mask] = 255
plt.imshow(m)
plt.show()
This gives the R, G, B threshold as (62 67 64), which is a bit high. The result is:
This image is also one of the images where Otsu thresholding worked best. If I use a manual threshold like a value of 30, it works quite well. The result is:
I was wondering if there are some other approaches that I should try. Segmentation really is not my area of expertise and what I can do out of the box seem limited.
Your image looks not very colorful. So you can perform the segmentation on the gray values and not on each color separately and then combining three masks.
Looking at the package scikit-image.filter there are several other threshold methods. I tried them all and found threshold_isodata to perform extremely well giving almost the same image as your desired image. Therefore I recommend the isodata algorithm.
Example:
import numpy as np
import skimage.io as io
import skimage.filter as filter
import matplotlib.pyplot as plt
img = io.imread('62.jpg')
gray = np.sum(img, axis=2) # summed up over red, green, blue
#threshold = filter.threshold_otsu(gray) # delivers very high threshold
threshold = filter.threshold_isodata(gray) # works extremely well
#threshold = filter.threshold_yen(gray) # delivers even higher threshold
print(threshold)
plt.imshow(gray > threshold)
plt.show()
gives: