I would like to perform pixel classification on RGB images based on input training samples for a given number of classes. So I have e.g. 4 classes containing pixels (r,g,b), and the goal is to segment the image into four phases.
I found that Python OpenCV has the Expectation Maximization algorithm, which could do the job. But unfortunately I did not find any tutorial or material that explains (since I am a beginner) how to work with the algorithm.
Could you please suggest any kind of tutorial that can be used as a starting point?
Update: here is another approach, adapted from the code in the answer below:
import cv2
import numpy as np

def getsamples(img):
    x, y, z = img.shape
    samples = np.empty([x * y, z])
    index = 0
    for i in range(x):
        for j in range(y):
            samples[index] = img[i, j]
            index += 1
    return samples

def EMSegmentation(img, no_of_clusters=2):
    output = img.copy()
    colors = np.array([[0, 11, 111], [22, 22, 22]])
    samples = getsamples(img)
    # em = cv2.ml.EM_create()             # OpenCV 3+ API
    em = cv2.EM(no_of_clusters)           # OpenCV 2.x API
    # em.setClustersNumber(no_of_clusters)
    # em.trainEM(samples)
    em.train(samples)
    x, y, z = img.shape
    index = 0
    for i in range(x):
        for j in range(y):
            result = em.predict(samples[index])[0][1]
            output[i][j] = colors[int(result)]
            index = index + 1
    return output

img = cv2.imread('00.jpg')
smallImg = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)
output = EMSegmentation(img)
smallOutput = cv2.resize(output, (0, 0), fx=0.5, fy=0.5)
cv2.imshow('image', smallImg)
cv2.imshow('EM', smallOutput)
cv2.waitKey(0)
cv2.destroyAllWindows()
Converted from the C++ source to Python:
import cv2
import numpy as np

def getsamples(img):
    x, y, z = img.shape
    samples = np.empty([x * y, z])
    index = 0
    for i in range(x):
        for j in range(y):
            samples[index] = img[i, j]
            index += 1
    return samples

def EMSegmentation(img, no_of_clusters=2):
    output = img.copy()
    colors = np.array([[0, 11, 111], [22, 22, 22]])
    samples = getsamples(img)
    em = cv2.ml.EM_create()
    em.setClustersNumber(no_of_clusters)
    em.trainEM(samples)
    means = em.getMeans()
    covs = em.getCovs()  # Known bug: https://github.com/opencv/opencv/pull/4232
    x, y, z = img.shape
    distance = [0] * no_of_clusters
    for i in range(x):
        for j in range(y):
            for k in range(no_of_clusters):
                diff = img[i, j] - means[k]
                distance[k] = abs(np.dot(np.dot(diff, covs[k]), diff.T))
            output[i][j] = colors[distance.index(max(distance))]
    return output

img = cv2.imread('dinosaur.jpg')
output = EMSegmentation(img)
cv2.imshow('image', img)
cv2.imshow('EM', output)
cv2.waitKey(0)
cv2.destroyAllWindows()
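For larger images, the per-pixel predict loop is the main bottleneck. As a minimal sketch (assuming the OpenCV 3+ cv2.ml API as above), trainEM already returns one label per sample, so the segmentation needs no explicit pixel loop; here the cluster means themselves are used as the output colors:

import cv2
import numpy as np

def em_segment(img, no_of_clusters=4):
    # Flatten the image into an (N, 3) float32 sample matrix in one reshape
    samples = img.reshape(-1, 3).astype(np.float32)
    em = cv2.ml.EM_create()
    em.setClustersNumber(no_of_clusters)
    # trainEM returns (retval, logLikelihoods, labels, probs);
    # labels holds one cluster index per sample
    _, _, labels, _ = em.trainEM(samples)
    means = em.getMeans()  # one BGR mean per cluster
    return means[labels.ravel()].reshape(img.shape).astype(np.uint8)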
I created code to equalize the luminosity values of pixels in an image so that when the image is further edited I do not have dark or light spots in my final image. However, the code seems to stop short and only equalize part of my image. Any ideas as to why the code is stopping early?
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

img = mpimg.imread('EXP_0159-2_8b.tif')
imgOut = img.copy()

for i in range(0, len(img[0, :])):
    imgLine1 = (img[:, i] < 165) * img[:, i]
    p = imgLine1.nonzero()
    if len(p[0]) < 1:
        imgOut[:, i] == 0
    else:
        imgLine2 = imgLine1[p[0]]

        def curvefitting(lineFunction):
            x = np.arange(0, len(lineFunction))
            y = lineFunction
            curve = np.polyfit(x, y, deg=2)
            a = curve[0]
            b = curve[1]
            c = curve[2]
            curveEquation = (a*(x**2)) + (b*(x**1)) + (c)
            curveCorrected = lineFunction - curveEquation + 200
            return curveCorrected

        imgLine1[p[0]] = curvefitting(imgLine2)
        imgOut[:, i] = imgLine1

plt.imshow(imgOut, cmap='gray')
The for loop takes the individual columns of pixels in my image and restricts the endpoints of that column to (0, 165), so that pixels outside of that range are turned into zero and ignored by the nonzero() function. The if condition just finalizes the conversion of values outside (0, 165) to zero. Additionally, I converted the image to gray so I would not have to deal with colors and could focus only on luminosity.
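In NumPy terms, that per-column masking step is equivalent to this one-liner (a restatement of what the loop does, not a change in behavior):

imgLine1 = np.where(img[:, i] < 165, img[:, i], 0)  # keep values below 165, zero out the rest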
This is my corrected image. The program works to average the luminosity values across the entire surface. However, you can see that it stops before reaching the end. The initial image was darker on the sides and lighter in the middle, but the file is too large to upload.
Any help is greatly appreciated.
If you are not interested in color, you can convert the input image to grayscale. That would simplify the matrix multiplications. The simplified version would be:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

def rgb2gray(rgb):
    return np.dot(rgb[..., :3], [0.2989, 0.5870, 0.1140])

def curvefitting(lineFunction):
    x = np.arange(0, len(lineFunction))
    y = lineFunction
    curve = np.polyfit(x, y, deg=2)
    a = curve[0]
    b = curve[1]
    c = curve[2]
    curveEquation = [(a*(x_**2)) + (b*(x_**1)) + (c) for x_ in x]
    curveCorrected = lineFunction - curveEquation + 200
    return curveCorrected

img = mpimg.imread('EXP_0159-2_8b.tif')
img = rgb2gray(img)
imgOut = img.copy()

for i in range(0, len(img[0, :])):
    imgLine1 = (img[:, i] < 165) * img[:, i]
    p = imgLine1.nonzero()
    if len(p[0]) < 1:
        imgOut[:, i] = 0  # assignment, not comparison
    else:
        imgLine2 = imgLine1[p]
        imgLine1[p] = curvefitting(imgLine2)
        imgOut[:, i] = imgLine1

plt.imshow(imgOut, cmap='gray')
plt.show()
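As a side note, NumPy can evaluate the fitted polynomial directly, which replaces the per-element list comprehension inside curvefitting (same curve and x as above):

curve = np.polyfit(x, y, deg=2)
curveEquation = np.polyval(curve, x)  # evaluates a*x**2 + b*x + c over all x at once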
I wrote the code below and got the output shown beneath it. What I am supposed to do is write a histogram equalization function (without built-in methods). I get no error, but the output is not what it should be. I could not spot any logic mistakes in my code, although while writing the loop that calculates the cdf and/or the mapping I could not follow exactly what happens behind it; maybe the problem is there, but I am not sure.
def my_float2int(img):
    img = np.round(img * 255, 0)
    img = np.minimum(img, 255)
    img = np.maximum(img, 0)
    img = img.astype('uint8')
    return img

def equalizeHistogram(img):
    img_height = img.shape[0]
    img_width = img.shape[1]
    histogram = np.zeros([256], np.int32)

    # calculate histogram
    for i in range(0, img_height):
        for j in range(0, img_width):
            histogram[img[i, j]] += 1

    # calculate pdf of the image
    pdf_img = histogram / histogram.sum()

    ### calculate cdf
    # cdf initialize
    cdf = np.zeros([256], np.int32)

    # For loop for cdf
    for i in range(0, 256):
        for j in range(0, i+1):
            cdf[i] += pdf_img[j]

    cdf_eq = np.round(cdf * 255, 0)  # mapping, transformation function T(x)

    imgEqualized = np.zeros((img_height, img_width))

    # for mapping input image to s
    for i in range(0, img_height):
        for j in range(0, img_width):
            r = img[i, j]           # feeding intensity levels of pixels into r
            s = cdf_eq[r]           # finding value of s by finding r'th position in the cdf_eq list
            imgEqualized[i, j] = s  # mapping s, thus creating the new output image

    # calculate histogram equalized image here
    # imgEqualized = s # change this
    return imgEqualized
# end of function

# 2.2 obtain the histogram equalized images using the above function
img_eq_low = equalizeHistogram(img_low)
img_eq_high = equalizeHistogram(img_high)

img_eq_low = my_float2int(img_eq_low)
img_eq_high = my_float2int(img_eq_high)

# 2.3 calculate the pdf's of the histogram equalized images
hist_img_eq_low = calcHistogram(img_eq_low)
hist_img_eq_high = calcHistogram(img_eq_high)

pdf_eq_low = hist_img_eq_low / hist_img_eq_low.sum()
pdf_eq_high = hist_img_eq_high / hist_img_eq_high.sum()

# 2.4 display the histogram equalized images and their pdf's
plt.figure(figsize=(14, 8))
plt.subplot(121), plt.imshow(img_eq_low, cmap='gray', vmin=0, vmax=255)
plt.title('Hist. Equalized Low Exposure Image'), plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(img_eq_high, cmap='gray', vmin=0, vmax=255)
plt.title('Hist. Equalized High Exposure Image'), plt.xticks([]), plt.yticks([])
plt.show()
plt.close()
My output:
Expected output: with the built-in methods.
I found two minor bugs, and one efficiency issue:
Replace cdf = np.zeros([256], np.int32) with cdf = np.zeros([256], float)
In the loop, you are putting float elements in cdf, so the type should be float instead of int32.
Replace img = np.round(img * 255, 0) with img = np.round(img, 0) (in my_float2int).
You are scaling img by 255 twice (the first time is in cdf_eq = np.round(cdf * 255, 0)).
You may compute cdf more efficiently.
Your implementation:
for i in range(0, 256):
    for j in range(0, i+1):
        cdf[i] += pdf_img[j]
Suggested implementation (more efficient way for computing "accumulated sum"):
cdf[0] = pdf_img[0]
for i in range(1, 256):
    cdf[i] = cdf[i-1] + pdf_img[i]
It's not a bug, but a kind of academic issue (regarding complexity).
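If NumPy built-ins are allowed here, the same accumulated sum is a single vectorized call:

cdf = np.cumsum(pdf_img)  # identical accumulated sum, computed in one pass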
Here is an example of the corrected code (it uses only img_low):
import numpy as np
import cv2

def my_float2int(img):
    # Don't use *255 twice
    # img = np.round(img * 255, 0)
    img = np.round(img, 0)
    img = np.minimum(img, 255)
    img = np.maximum(img, 0)
    img = img.astype('uint8')
    return img

def equalizeHistogram(img):
    img_height = img.shape[0]
    img_width = img.shape[1]
    histogram = np.zeros([256], np.int32)

    # calculate histogram
    for i in range(0, img_height):
        for j in range(0, img_width):
            histogram[img[i, j]] += 1

    # calculate pdf of the image
    pdf_img = histogram / histogram.sum()

    ### calculate cdf
    # Why the type np.int32?
    # cdf = np.zeros([256], np.int32)
    cdf = np.zeros([256], float)

    # For loop for cdf
    for i in range(0, 256):
        for j in range(0, i+1):
            cdf[i] += pdf_img[j]

    # You may implement the "accumulated sum" in a more efficient way:
    cdf = np.zeros(256, float)
    cdf[0] = pdf_img[0]
    for i in range(1, 256):
        cdf[i] = cdf[i-1] + pdf_img[i]

    cdf_eq = np.round(cdf * 255, 0)  # mapping, transformation function T(x)

    imgEqualized = np.zeros((img_height, img_width))

    # for mapping input image to s
    for i in range(0, img_height):
        for j in range(0, img_width):
            r = img[i, j]           # feeding intensity levels of pixels into r
            s = cdf_eq[r]           # finding value of s by finding r'th position in the cdf_eq list
            imgEqualized[i, j] = s  # mapping s, thus creating the new output image

    return imgEqualized
# end of function

# Read input image as Grayscale
img_low = cv2.imread('img_low.png', cv2.IMREAD_GRAYSCALE)

# 2.2 obtain the histogram equalized images using the above function
img_eq_low = equalizeHistogram(img_low)
img_eq_low = my_float2int(img_eq_low)

# Use cv2.imshow (instead of plt.imshow) just for testing
cv2.imshow('img_eq_low', img_eq_low)
cv2.waitKey()
Result:
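As a final sanity check, you can compare against OpenCV's built-in equalization; small differences are possible because cv2.equalizeHist normalizes the cdf slightly differently:

ref = cv2.equalizeHist(img_low)  # built-in histogram equalization
cv2.imshow('reference', ref)
cv2.imshow('ours', img_eq_low)
cv2.waitKey()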
I have a (numpy) array of pixels acquired as:
# import numpy and matplotlib
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
image = Image.open('trollface.png', 'r')
width, height = image.size
pixel_values = list(image.getdata())
pixel_values = np.array(pixel_values).reshape((width, height, 3)) # 3 channels RGB
#height, width = len(pixel_values), len(pixel_values[0])
I need to compute digital negative of this image -
for y in range(0, height):
    for x in range(0, width):
        R, G, B = pixel_values[x, y]
        pixel_values[x, y] = (255 - R, 255 - G, 255 - B)
I tried displaying an image from the above pixels with the help of this thread:
plt.imshow(np.array(pixel_values).reshape(width,height,3))
plt.show()
But it just displays a blank (white) window, with this error in CLI:
The aim here is to achieve a negative transformation of an image.
Pixel translations can be applied directly to the R, G, B bands using the Image.point method.
image = Image.open('trollface.png')
source = image.split()

r, g, b, a = 0, 1, 2, 3
negate = lambda i: 255 - i

transform = [source[band].point(negate) for band in (r, g, b)]
if len(source) == 4:  # should have 4 bands for images with alpha channel
    transform.append(source[a])  # add alpha channel

out = Image.merge(image.mode, transform)
out.save('negativetrollface.png')
EDIT: using the OP's procedure, you have:
im = Image.open('trollface.png')
w, h = im.size
arr = np.array(im)
original_shape = arr.shape
arr_to_dim = arr.reshape((w, h, 4))

# Note that this is expensive.
# Always take advantage of array manipulation implemented in the C bindings
for x in range(0, w):
    for y in range(0, h):
        r, g, b, a = arr_to_dim[x, y]
        arr_to_dim[x, y] = np.array([255 - r, 255 - g, 255 - b, a])

dim_to_arr = arr_to_dim.reshape(original_shape)
im = Image.fromarray(dim_to_arr)
im.save('negativetrollface.png')
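Following the comment above about C-level array manipulation, the whole per-pixel loop collapses into one vectorized expression (a sketch, assuming an RGBA image as in the OP's case):

arr = np.array(Image.open('trollface.png'))
arr[..., :3] = 255 - arr[..., :3]  # invert R, G, B; leave alpha untouched
Image.fromarray(arr).save('negativetrollface.png')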
I built this Harris Corner Detector, which works exactly as expected in terms of functionality, but it is extremely slow. I am almost sure it is because I access every pixel of the image, though it might also be that I am implementing something wrong. I have been thinking about how to optimize the np array access when applying the filters, but because of the nature of these filters I still cannot come up with a good idea.
The method is not slow in itself: OpenCV's implementation is basically instant on the same image.
import numpy as np
import cv2 as cv
import matplotlib.pyplot as pl
import matplotlib.cm as cm
import math

def hor_edge_strength(x, y, img_in=[], filter=[]):
    strength = 0
    for i in range(0, 3):
        for j in range(0, 3):
            strength += img_in[x+i-1][y+j-1] * filter[i][j]
    return strength

def ver_edge_strength(x, y, img_in=[], filter=[]):
    strength = 0
    for i in range(0, 3):
        for j in range(0, 3):
            strength += img_in[x+i-1][y+j-1] * filter[i][j]
    return strength

def gauss_kernels(size, sigma=1):
    ## returns a 2d gaussian kernel
    if size < 3:
        size = 3
    m = size // 2
    x, y = np.mgrid[-m:m+1, -m:m+1]
    kernel = np.exp(-(x*x + y*y) / (2*sigma*sigma))
    kernel_sum = kernel.sum()
    if not kernel_sum == 0:
        kernel = kernel / kernel_sum
    return kernel

sobel_h = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
sobel_v = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]
img_arr = ['checker.jpg']  # Test image
for img_name in img_arr:
    img = cv.imread(img_name, 0)
    sep = '.'
    img_name = img_name.split(sep, 1)[0]
    print(img_name)
    gray_img = img.astype(float)
    gx = np.zeros_like(gray_img)
    gy = np.zeros_like(gray_img)
    print('Getting strengths')
    for i in range(1, len(gray_img) - 1):
        for j in range(1, len(gray_img[0]) - 1):
            gx[i][j] = hor_edge_strength(i, j, gray_img, sobel_h)
            gy[i][j] = ver_edge_strength(i, j, gray_img, sobel_v)
    I_xx = gx * gx
    I_xy = gx * gy
    I_yy = gy * gy
    gaussKernel = gauss_kernels(3, 1)
    W_xx = np.zeros_like(gray_img)
    W_xy = np.zeros_like(gray_img)
    W_yy = np.zeros_like(gray_img)
    print('Convoluting')
    for i in range(1, len(gray_img) - 1):
        for j in range(1, len(gray_img[0]) - 1):
            W_xx[i][j] = hor_edge_strength(i, j, I_xx, gaussKernel)
            W_xy[i][j] = hor_edge_strength(i, j, I_xy, gaussKernel)
            W_yy[i][j] = hor_edge_strength(i, j, I_yy, gaussKernel)
    print('Calculating Harris Corner')
    k = 0.06
    HCResponse = np.zeros_like(gray_img)
    for i in range(1, len(gray_img) - 1):
        for j in range(1, len(gray_img[0]) - 1):
            W = np.matrix([[W_xx[i][j], W_xy[i][j]], [W_xy[i][j], W_yy[i][j]]])  # For lab purposes, but not needed
            detW = W_xx[i][j]*W_yy[i][j] - (W_xy[i][j] * W_xy[i][j])
            traceW = W_xx[i][j] + W_yy[i][j]
            HCResponse[i][j] = detW - k*traceW*traceW
    threshold = 0.1
    imageTreshold = max(HCResponse.ravel()) * threshold
    HCResponseTreshold = (HCResponse >= imageTreshold) * 1
    candidates = np.transpose(HCResponseTreshold.nonzero())
    print('Drawing')
    x, y = gray_img.shape
    image = np.empty((x, y, 3), dtype=np.uint8)
    image[:, :, 0] = gray_img
    image[:, :, 1] = gray_img
    image[:, :, 2] = gray_img
    for i in candidates:
        x, y = i.ravel()
        image[x][y] = [255, 0, 0]
    pl.imshow(image)
    pl.show()
    pl.savefig(img_name + '_edge.jpg')
Is there any possible solution to substantially improve the performance of this edge detector?
import cv2
import numpy as np

filename = 'chess1.jpg'
img = cv2.imread(filename)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
dst = cv2.cornerHarris(gray, 2, 3, 0.04)

# result is dilated for marking the corners, not important
dst = cv2.dilate(dst, None)

# Threshold for an optimal value, it may vary depending on the image.
img[dst > 0.01*dst.max()] = [0, 0, 255]

cv2.imshow('dst', img)
if cv2.waitKey(0) & 0xff == 27:
    cv2.destroyAllWindows()
For more info, see this link.
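If you would rather keep the hand-rolled pipeline than call cv2.cornerHarris, the per-pixel Python loops are what kill the performance; the same steps (Sobel gradients, Gaussian windowing of their products, then the response) can be done with whole-array operations. A sketch, using the same k and threshold as the original code:

import cv2
import numpy as np

gray = cv2.imread('checker.jpg', 0).astype(np.float32)
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)  # horizontal gradient
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)  # vertical gradient
W_xx = cv2.GaussianBlur(gx * gx, (3, 3), 1)      # Gaussian-windowed products
W_xy = cv2.GaussianBlur(gx * gy, (3, 3), 1)
W_yy = cv2.GaussianBlur(gy * gy, (3, 3), 1)
k = 0.06
HCResponse = (W_xx * W_yy - W_xy**2) - k * (W_xx + W_yy)**2
candidates = np.transpose((HCResponse >= 0.1 * HCResponse.max()).nonzero())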
I need to flip a picture horizontally without using the reverse function. I thought I had it right, but the returned image is just the bottom right corner of the picture and it is not flipped.
The code I have is:
def Flip(image1, image2):
    img = graphics.Image(graphics.Point(0, 0), image1)
    X = img.getWidth()
    Y = img.getHeight()
    for y in range(Y):
        for x in range(X):
            A = img.getPixel(x, y)
            r = A[0]
            g = A[1]
            b = A[2]
            color = graphics.color_rgb(r, g, b)
            img.setPixel(X-x, y, color)
    img = graphics.Image(graphics.Point(0, 0), image2)
    win = graphics.GraphWin(image2, img.getWidth(), img.getHeight())
    img.draw(win)
Where did I go wrong?
Here are some things that I think could be improved:
def Flip(image1, image2):
    img = graphics.Image(graphics.Point(0, 0), image1)
    X = img.getWidth()
    Y = img.getHeight()
    for y in range(Y):
        for x in range(X):
            A = img.getPixel(x, y)
            r = A[0]
            g = A[1]
            b = A[2]
            color = graphics.color_rgb(r, g, b)
This assignment could be more pythonic:
            r, g, b = img.getPixel(x, y)
            color = graphics.color_rgb(r, g, b)
            img.setPixel(X-x, y, color)
img now has the image half-flipped. This happens because you are writing onto the same image you are reading from, losing the old content as you go, until you reach the middle; from there on you copy the already-flipped pixels back. (Notice also that X-x runs one pixel past the edge: if the image width is 100, the first iteration gives X-x = 100 - 0 = 100, but valid indices start at 0, so the image is made one pixel wider.) And then you never use that content, because:
img = graphics.Image(graphics.Point(0,0), image2)
Here is the problem: you just overwrote the content of img without having made any use of it. Later:
win = graphics.GraphWin(image2, img.getWidth(), img.getHeight())
img.draw(win)
This seems unrelated to the purpose of the function (flipping an image). What I would do instead is:
import graphics
import sys

def Flip(image_filename):
    img_src = graphics.Image(graphics.Point(0, 0), image_filename)
    img_dst = img_src.clone()
    X, Y = img_src.getWidth(), img_src.getHeight()
    for x in range(X):
        for y in range(Y):
            r, g, b = img_src.getPixel(x, y)
            color = graphics.color_rgb(r, g, b)
            img_dst.setPixel(X-x-1, y, color)
    return img_dst

if __name__ == '__main__':
    input = sys.argv[1] if len(sys.argv) > 1 else 'my_image.ppm'
    output = 'mirror-%s' % input
    img = Flip(input)
    img.save(output)
Notice that the function Flip only takes care of flipping the image; outside the function you can do whatever you need with the image, as you can see in the 'main' program.
If you want to use only one image, that is possible and more efficient. For that, you can use the same principle as for swapping values between variables:
def Flip(image_filename):
    img = graphics.Image(graphics.Point(0, 0), image_filename)
    X, Y = img.getWidth(), img.getHeight()
    for x in range(X // 2):
        for y in range(Y):
            r_1, g_1, b_1 = img.getPixel(x, y)
            color_1 = graphics.color_rgb(r_1, g_1, b_1)
            r_2, g_2, b_2 = img.getPixel(X-x-1, y)
            color_2 = graphics.color_rgb(r_2, g_2, b_2)
            img.setPixel(X-x-1, y, color_1)
            img.setPixel(x, y, color_2)
    return img
I know it has been a long time, but you can try this:

from PIL import Image

img = Image.open('img.png')
mirror_image = img.transpose(Image.FLIP_LEFT_RIGHT)
mirror_image.save('imgoutput.png')
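If NumPy is available and slicing does not count as "the reverse function", the same horizontal flip is a single indexing expression (a sketch on top of the PIL image above):

import numpy as np
from PIL import Image

img = Image.open('img.png')
arr = np.asarray(img)[:, ::-1].copy()  # reverse the column order of every row
Image.fromarray(arr).save('imgoutput.png')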