RGB Image to YUYV in Python

I want to convert a Bitmap to the YUV422 (YUYV) format.
I've googled the YUYV format and tried to write this code.
path = "C:/Users/hogan/Desktop/red.bmp"
image = Image.open(path).convert("YCbCr") # YUV
image = np.asarray(image)
width, height, YUYV = image.shape[1], image.shape[0], 4
array = np.empty((height * width * 3), dtype="uint8")
Y, U, V = 0, 1, 2
count = 0
for i in range(len(image)):
for j in range(len(image[i])):
for k in range(len(image[i][j])):
if (count % 4 == 0):
array[count] = image[i][j][Y]
if (count % 4 == 1):
array[count] = image[i][j][U]
if (count % 4 == 2):
array[count] = image[i][j][Y]
if (count % 4 == 3):
array[count] = image[i][j][V]
count = count + 1
array.astype('uint8').tofile("C:/Users/hogan/Desktop/tmpdir/1.raw")
I read the resulting image and know my code is wrong, but I have no idea how to make it right.
For example:
Red (255,0,0) in YUV is (76,84,255). When I have a lot of pixels, I don't know which 'U' and 'V' values should be dropped.
If I use my code to convert a 480*640 (W*H) image, the output comes out as 960*480.

You can use numpy advanced indexing and broadcasting here.
Suppose you have an image:
ycbcr = np.array([
    [['Y00', 'U00', 'V00'], ['Y01', 'U01', 'V01'], ['Y02', 'U02', 'V02']],
    [['Y10', 'U10', 'V10'], ['Y11', 'U11', 'V11'], ['Y12', 'U12', 'V12']],
    [['Y20', 'U20', 'V20'], ['Y21', 'U21', 'V21'], ['Y22', 'U22', 'V22']],
], dtype=str)
Working with a 3D array is going to be unwieldy, so let's convert it to 1D:
ycbcr = ycbcr.reshape(27)
Then, I'll allocate an array for YUYV stream:
yuyv = np.array([' ' for _ in range(18)]) # ' ' is basically because
# I use strings in this example. You'd probably want to use arrays of uint8
The first step is the easiest - we take every third value of ycbcr (the Y components) and place them at the even positions of yuyv:
yuyv[::2] = ycbcr[::3]
Then we proceed with the other bytes. U00 should go from position 1 to position 1, V00 from 2 to 3, U01 and V01 are omitted, U02 goes from 7 to 5, V02 goes from 8 to 7, the U of the next pixel is omitted, and so on:
yuyv[1::4] = ycbcr[1::6] # Moving U, every sixth byte starting from position 1
yuyv[3::4] = ycbcr[2::6][:-1] # Moving V, dropping last element of V
So you get the following yuyv, just like the image suggested:
array(['Y00', 'U00', 'Y01', 'V00', 'Y02', 'U02', 'Y10', 'V02', 'Y11',
       'U11', 'Y12', 'V11', 'Y20', 'U20', 'Y21', 'V20', 'Y22', 'U22'],
      dtype='<U3')
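The same slicing works directly on uint8 data. A minimal, untested sketch (assuming the input file from the question and an even image width, so no trailing U/V needs dropping):
from PIL import Image
import numpy as np

img = Image.open("C:/Users/hogan/Desktop/red.bmp").convert("YCbCr")
ycbcr = np.asarray(img, dtype=np.uint8).reshape(-1)   # flat [Y, U, V, Y, U, V, ...]
yuyv = np.empty(ycbcr.size // 3 * 2, dtype=np.uint8)  # 2 bytes per pixel
yuyv[0::2] = ycbcr[0::3]  # all Y samples
yuyv[1::4] = ycbcr[1::6]  # U of every second pixel
yuyv[3::4] = ycbcr[2::6]  # V of every second pixel
yuyv.tofile("C:/Users/hogan/Desktop/tmpdir/1.raw")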
If you want to follow @alkanen's advice, that approach is feasible as well; you'll just need to sample the U and V bytes into two arrays and take an average. Perhaps it is going to look like this (untested):
u_even = ycbcr[1::6]
u_odd = ycbcr[4::6]
u = (u_even + u_odd) / 2
yuyv[1::4] = u
You will run into more complex border cases as well.
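For real uint8 data, that averaging could look roughly like this (untested sketch, reusing the flat ycbcr and yuyv arrays from the sketch above; assuming an even image width so every pixel pair sits in the same row and the lengths line up):
u = ycbcr[1::3].astype(np.uint16).reshape(-1, 2).mean(axis=1)  # average U of each pixel pair
v = ycbcr[2::3].astype(np.uint16).reshape(-1, 2).mean(axis=1)  # average V of each pixel pair
yuyv[1::4] = u.astype(np.uint8)
yuyv[3::4] = v.astype(np.uint8)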

Related

What is the best script in Python that replicates MATLAB's imresize3?

I am translating code from MATLAB to Python but cannot perfectly replicate the results of MATLAB's imresize3. My input is a 101x101x101 array. The first four inputs ([0,0:3,0] in Python, or (1,1:4,1) in MATLAB) are: 0.3819 0.4033 0.4336 0.2767. The data input is identical for both languages.
sampleQDNormSmall = imresize3(sampleQDNorm,0.5);
This results in a 51x51x51 array where the first four values (1,1:4,1) for example are: 0.3443 0.2646 0.2700 0.2835
Now I've tried two different pieces of code in Python to replicate these results:
from skimage.transform import resize
from skimage.transform import rescale
sampleQDNormSmall = resize(sampleQDNorm,(0.5*sampleQDNorm.shape[0],0.5*sampleQDNorm.shape[1],0.5*sampleQDNorm.shape[2]),order=3,anti_aliasing=True);
sampleQDNormSmall1=rescale(sampleQDNorm,0.5,order=3,anti_aliasing=True)
The first one gives a 51x51x51 array whose first four values [0,0:3,0] are: 0.3452 0.2669 0.2774 0.3099. This is very close, but not exactly the same numerical output.
The second one gives a 50x50x50 array whose first four values [0,0:3,0] are: 0.3422 0.2623 0.2810 0.3006. This is a different output array size and also doesn't reproduce the same numerical output as the MATLAB code or the other Python function.
I don't know enough about the optional arguments to know what might get me a better result. I know that for this type of array MATLAB's default is cubic interpolation, which is why I am using order 3 in Python. The default for anti-aliasing is true in both. I have two bigger arrays that I am having the same issues with: an (873x873x873) array and a bool (873x873x873) array.
The MATLAB code I'm using is considered the "correct answer" for the work I am doing, so I am trying to replicate its results as accurately as possible in Python. Please let me know what I can try in Python to reproduce the correct data.
sampleQDNorm is roughly random decimals between 0 and 1 for [0:100,0:100,0:100] and is padded with zeros on sides [:,:,101],[:,101,:],[101,:,:]
Getting the exact same result as MATLAB imresize3 is challenging.
One reason is that MATLAB enables an antialiasing filter by default, and I can't seem to find an equivalent Python implementation.
The closest existing Python alternatives are described in this post.
scipy.ndimage.zoom supports 3D resizing.
It could be that skimage.transform.resize gives a closer result, but none are identical to the MATLAB result.
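For reference, a minimal zoom-based sketch (order=3 is cubic; note that it does not apply MATLAB's antialiasing filter, so the values will still differ):
from scipy.ndimage import zoom

sampleQDNormSmall = zoom(sampleQDNorm, 0.5, order=3)  # 3D cubic resize by a factor of 0.5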
Reimplementing imresize3:
Looking at the MATLAB implementation of imresize3 (MATLAB source code), it is apparent that the MATLAB implementation "simply" resizes along each axis:
Resize (by half) along the vertical axis.
Resize the above result (by half) along the horizontal axis.
Resize the above result (by half) along the depth axis.
Here is a MATLAB code sample that demonstrates the implementation (using cubic interpolation):
I1 = imread('peppers.png');
I2 = imresize(imread('autumn.tif'), [size(I1, 1), size(I1, 2)]);
I3 = imresize(imread('football.jpg'), [size(I1, 1), size(I1, 2)]);
I4 = imresize(imread('cameraman.tif'), [size(I1, 1), size(I1, 2)]);
I = cat(3, I1, I2, I3, I4);
J = imresize3(I, 0.5, 'cubic', 'Antialiasing', false);
imwrite(I1, '/Tmp/I1.png');
imwrite(I2, '/Tmp/I2.png');
imwrite(I3, '/Tmp/I3.png');
imwrite(I4, '/Tmp/I4.png');
imwrite(J(:,:,1), '/Tmp/J1.png');
imwrite(J(:,:,2), '/Tmp/J2.png');
imwrite(J(:,:,3), '/Tmp/J3.png');
imwrite(J(:,:,4), '/Tmp/J4.png');
imwrite(J(:,:,5), '/Tmp/J5.png');
K = cubicResize3(I, 0.5);
max_abs_diff = max(abs(double(J(:)) - double(K(:))));
disp(['max_abs_diff = ', num2str(max_abs_diff)])
function B = cubicResize3(A, scale)
    order = [1 2 3];
    B = A;
    for k = 1:numel(order)
        dim = order(k);
        B = cubicResizeAlongDim(B, dim, scale);
    end
end

function out = cubicResizeAlongDim(in, dim, scale)
    % If dim is 3, permute the input matrix so that the third dimension
    % becomes the first dimension. This way, we can resize along the
    % third dimension as though we were resizing along the first dimension.
    isThirdDimResize = (dim == 3);
    if isThirdDimResize
        in = permute(in, [3 2 1]);
        dim = 1;
    end
    if dim == 1
        out_rows = round(size(in, 1)*scale);
        out_cols = size(in, 2);
    else % dim == 2
        out_rows = size(in, 1);
        out_cols = round(size(in, 2)*scale);
    end
    out = zeros(out_rows, out_cols, size(in, 3), class(in)); % Allocate array for storing the output.
    for i = 1:size(in, 3)
        % Resize each color plane separately
        out(:, :, i) = imresize(in(:, :, i), [out_rows, out_cols], 'bicubic', 'Antialiasing', false);
    end
    % Permute back so that the original dimensions are restored if we were
    % resizing along the third dimension.
    if isThirdDimResize
        out = permute(out, [3 2 1]);
    end
end
The result is max_abs_diff = 0, meaning that cubicResize3 and imresize3 gave the same output.
Note:
The above implementation stores images in the Tmp folder to be used as input for testing the Python implementation.
Here is a Python implementation using OpenCV:
import numpy as np
import cv2
#from scipy.ndimage import zoom
def cubic_resize_along_dim(inp, dim, scale):
    """ Implementation is based on MATLAB source code of resizeAlongDim function """
    # If dim is 3, permute the input matrix so that the third dimension
    # becomes the first dimension. This way, we can resize along the
    # third dimensions as though we were resizing along the first dimension.
    is_third_dim_resize = (dim == 2)
    if is_third_dim_resize:
        inp = np.swapaxes(inp, 2, 0).copy()  # in = permute(in, [3 2 1])
        dim = 0
    if dim == 0:
        out_rows = int(np.round(inp.shape[0]*scale))  # out_rows = round(size(in, 1)*scale);
        out_cols = inp.shape[1]  # out_cols = size(in, 2);
    else:  # dim == 1
        out_rows = inp.shape[0]  # out_rows = size(in, 1);
        out_cols = int(np.round(inp.shape[1]*scale))  # out_cols = round(size(in,2)*scale);
    out = np.zeros((out_rows, out_cols, inp.shape[2]), inp.dtype)  # out = zeros(out_rows, out_cols, size(in, 3), class(in)); % Allocate array for storing the output.
    for i in range(inp.shape[2]):
        # Resize each color plane separately
        out[:, :, i] = cv2.resize(inp[:, :, i], (out_cols, out_rows), interpolation=cv2.INTER_CUBIC)  # out(:, :, i) = imresize(inp(:, :, i), [out_rows, out_cols], 'bicubic', 'Antialiasing', false);
    # Permute back so that the original dimensions are restored if we were
    # resizing along the third dimension.
    if is_third_dim_resize:
        out = np.swapaxes(out, 2, 0)  # out = permute(out, [3 2 1]);
    return out


def cubic_resize3(a, scale):
    b = a.copy()
    for k in range(3):
        b = cubic_resize_along_dim(b, k, scale)
    return b
# Build 3D input image (10 channels with resolution 512x384).
i1 = cv2.cvtColor(cv2.imread('/Tmp/I1.png', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB)
i2 = cv2.cvtColor(cv2.imread('/Tmp/I2.png', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB)
i3 = cv2.cvtColor(cv2.imread('/Tmp/I3.png', cv2.IMREAD_UNCHANGED), cv2.COLOR_BGR2RGB)
i4 = cv2.imread('/Tmp/I4.png', cv2.IMREAD_UNCHANGED)
im = np.dstack((i1, i2, i3, i4)) # Stack arrays along the third axis
# Read and adjust MATLAB output (out_mat is used as reference for testing).
# out_mat is the result of J = imresize3(I, 0.5, 'cubic', 'Antialiasing', false);
j1 = cv2.imread('/Tmp/J1.png', cv2.IMREAD_UNCHANGED)
j2 = cv2.imread('/Tmp/J2.png', cv2.IMREAD_UNCHANGED)
j3 = cv2.imread('/Tmp/J3.png', cv2.IMREAD_UNCHANGED)
j4 = cv2.imread('/Tmp/J4.png', cv2.IMREAD_UNCHANGED)
j5 = cv2.imread('/Tmp/J5.png', cv2.IMREAD_UNCHANGED)
out_mat = np.dstack((j1, j2, j3, j4, j5)) # Stack arrays along the third axis
#out_py = zoom(im, 0.5, order=3, mode='reflect')
# Execute 3D resize in Python
out_py = cubic_resize3(im, 0.5)
abs_diff = np.absolute(out_mat.astype(np.int16) - out_py.astype(np.int16))
print(f'max_abs_diff = {abs_diff.max()}')
The Python implementation reads the input files stored by MATLAB (and converts from BGR to RGB when required).
The implementation compares the result of cubic_resize3 with the MATLAB output of imresize3.
The maximum difference is 12 (not zero).
Apparently cv2.resize and MATLAB's imresize give slightly different results.
Update:
Replacing:
out[:, :, i] = cv2.resize(inp[:, :, i], (out_cols, out_rows), interpolation=cv2.INTER_CUBIC)
with:
out[:, :, i] = transform.resize(inp[:, :, i], (out_rows, out_cols), order=3, mode='edge', anti_aliasing=False, preserve_range=True)
Reduces the maximum difference to 4.
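Here transform refers to skimage.transform, so the sketch above assumes the following import at the top of the script:
from skimage import transform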

How to correct color of an image based on a standard image

Please see the figure:
image1 is the image to be corrected, and image2 is the standard image taken in a black box.
There is a triangle in both images with slightly different colors. I want to correct image1 via the triangle, based on image2, so that the circle and the square in image1 can also be corrected.
How can I do that?
What I have tried:
Get the B, G, R mean values of the triangle in image1 and image2, divide them respectively to get KB, KG, KR, multiply the B, G, R channels of image1 by KB, KG, KR, and finally merge the 3 channels to get the corrected image.
Demo code in Python with OpenCV:
triangle_image1 = cv2.mean(image1, mask1)[:3]
triangle_image2 = cv2.mean(image2, mask2)[:3]
k_b, k_g, k_r = triangle_image2 / triangle_image1
b, g, r = cv2.split(image1)
corrected = b * k_b, g * k_g, r * k_r
corrected = np.clip(corrected, 0, 255)
corrected = cv2.merge(np.array(corrected, np.uint8))
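Note that cv2.mean returns a 4-tuple, so a directly runnable version of this per-channel scaling could look like the following sketch:
import numpy as np
import cv2

mean1 = np.array(cv2.mean(image1, mask1)[:3])
mean2 = np.array(cv2.mean(image2, mask2)[:3])
gains = mean2 / mean1  # per-channel B, G, R gains
corrected = np.clip(image1.astype(np.float64) * gains, 0, 255).astype(np.uint8)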
The resulting image looks OK, but it is actually not right, because the color difference (delta E) between the triangle in the corrected image and in image2 is about 6.
I tried executing a chromatic adaptation transform, but I have no way of telling whether the result is correct.
Note that chromatic adaptation corrects the chrominance, but not the luminance (only the color, not the brightness).
The chromatic adaptation transform is used for color balancing (white balance), and I don't know if it fits your case.
I reused a MATLAB implementation (I didn't look for Python examples).
Even if you don't know MATLAB, and the solution is not what you are looking for, you may learn from it (linearizing the RGB values, for example).
Here is the code:
T = imread('image.png'); % Load input image (two images side by side).
image1 = T(:, 1:end/2, :); % Left side
image2 = T(:, end/2+1:end, :); % Right side
I = image1; % Source image is named I
% Use color components in range [0, 1] (colors were found by manual picking).
src_sRGB = [205, 232, 32]/255; %Triangle sRGB color from image 1 "source image"
dst_sRGB = [13, 133, 38]/255; %Triangle sRGB color from image 2 "destination image"
%Linearize gamma-corrected RGB values (image values are in sRGB color space, we need to Linearize them).
srcRGB = rgb2lin(src_sRGB)';
dstRGB = rgb2lin(dst_sRGB)';
linI = rgb2lin(double(I)/255); % I in linear RGB color space.
% Color correction by Chromatic Adaptation:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% http://www.brucelindbloom.com/index.html?Eqn_RGB_XYZ_Matrix.html
% Convert from XYZ D65 color space to Linear sRGB color space.
XYZD65_to_sRGB = [ 3.2404542 -1.5371385 -0.4985314; ...
                  -0.9692660  1.8760108  0.0415560; ...
                   0.0556434 -0.2040259  1.0572252];
% Convert from Linear sRGB color space to XYZ D65 color space.
sRGBtoXYZD65 = [0.4124564 0.3575761 0.1804375; ...
0.2126729 0.7151522 0.0721750; ...
0.0193339 0.1191920 0.9503041];
% Convert srcRGB and dstRGB to XYZ color space
srcXYZ = sRGBtoXYZD65 * srcRGB;
dstXYZ = sRGBtoXYZD65 * dstRGB;
% Convert srcXYZ and dstXYZ to xyY color space (get only xy out of xyY - xy applies chromaticity).
xySrc = XYZ2xy(srcXYZ);
xyDst = XYZ2xy(dstXYZ);
xyzSrc = xy2XYZ(xySrc, 1); %normalize Y to 1 so D65 luminance comparable
xyzDst = xy2XYZ(xyDst, 1); %normalize Y to 1 so D65 luminance comparable
% Chromatic adaptation transform
catType = 'bradford'; %Bradford transformation is recommended by Bruce Lindbloom http://www.brucelindbloom.com/index.html?Eqn_ChromAdapt.html
estMAT = cbCAT(xyzSrc, xyzDst, catType);
% Scale estMAT by XYZD65_to_sRGB before applying the color correction
M = estMAT * XYZD65_to_sRGB;
linI = cbreshape(linI);
% Destination image - apply color correction by multiplying by correction matrix M
linJ = M*linI;
linJ = cbunshape(linJ, size(I));
% Convert J from Linear to sRGB
J = lin2rgb(linJ);
% Convert from double to uint8 (multiply by 255).
J = im2uint8(J);
% Display result
figure;imshow(J);title('Corrected image1');impixelinfo
figure;imshow(image2);title('image2');impixelinfo
% Save result
imwrite(image2, 'image2.png');
imwrite(J, 'J.png');
function xy = XYZ2xy(xyz)
    %xy = XYZ2xy(xyz)
    % Converts CIE XYZ to xy chromaticity.
    X = xyz(1, :);
    Y = xyz(2, :);
    s = sum(xyz);
    xy = [X./s; Y./s];
end

function XYZ = xy2XYZ(xy, Y)
    %XYZ = xy2XYZ(xy,Y)
    % Converts xyY chromaticity to CIE XYZ.
    x = xy(1); y = xy(2);
    XYZ = [Y/y*x; Y; Y/y*(1-x-y)];
end

function outMat = cbCAT(xyz_src, xyz_dst, type)
    %https://web.stanford.edu/~sujason/ColorBalancing/adaptation.html
    %M = cbCAT(xyz_src, xyz_dst, type)
    % Chromatic adaptation transform via von Kries's method.
    %  type chooses the LMS-like space to apply scaling in, valid options:
    %  'vonKries', 'bradford', 'sharp', 'cmccat2000', 'cat02', 'xyz'
    % See http://www.brucelindbloom.com/index.html?Eqn_ChromAdapt.html
    xyz_src = makecol(xyz_src);
    xyz_dst = makecol(xyz_dst);
    % The following are mostly taken from S. Bianco. "Two New von Kries Based
    % Chromatic Adaptation Transforms Found by Numerical Optimization."
    if strcmpi(type, 'vonKries') % Hunt-Pointer-Estevez normalized to D65
        Ma = [0.40024 0.7076 -0.08081; -0.2263 1.16532 0.0457; 0 0 0.91822];
    elseif strcmpi(type, 'bradford')
        Ma = [0.8951 0.2664 -0.1614; -0.7502 1.7135 0.0367; 0.0389 -0.0685 1.0296];
    elseif strcmpi(type, 'sharp')
        Ma = [1.2694 -0.0988 -0.1706; -0.8364 1.8006 0.0357; 0.0297 -0.0315 1.0018];
    elseif strcmpi(type, 'cmccat2000')
        Ma = [0.7982 0.3389 -0.1371; -0.5918 1.5512 0.0406; 0.0008 0.239 0.9753];
    elseif strcmpi(type, 'cat02')
        Ma = [0.7328 0.4296 -0.1624; -0.7036 1.6975 0.0061; 0.0030 0.0136 0.9834];
    else
        Ma = eye(3);
    end
    % Chromatic Adaptation Transforms:
    % 1. Transform from XYZ into a cone response domain (ro, gamma, beta).
    % 2. Scale the vector components by factors dependent upon both the source and destination reference whites.
    % 3. Transform from (ro, gamma, beta) back to XYZ using the inverse transform of step 1.
    % D is a diagonal matrix marked as inv(Ma)*diag(roD/roS, gammaD/gammaS, betaD/betaS)*Ma.
    % Matrix D applies ratios in the "cone response domain".
    D = diag((Ma*xyz_dst)./(Ma*xyz_src));
    % Transform back to XYZ domain:
    M = Ma\D*Ma;
    sRGBtoXYZ = [0.4124564 0.3575761 0.1804375; ...
                 0.2126729 0.7151522 0.0721750; ...
                 0.0193339 0.1191920 0.9503041];
    outMat = sRGBtoXYZ\M*sRGBtoXYZ;
end

function x = makecol(x)
    %x = makecol(x)
    % Returns x as a column vector.
    s = size(x);
    if (length(s) == 2) && (s(1) < s(2))
        x = x.';
    end
end

function out = cbreshape(im)
    %out = cbreshape(im)
    % Takes a width x height x 3 RGB image and returns a matrix where each
    % column is an RGB pixel.
    if (size(im, 3) == 3)
        out = reshape(permute(im, [3, 1, 2]), [3, numel(im)/3, 1]);
    else
        out = (im(:))';
    end
end

function out = cbunshape(mat, s)
    %out = cbunshape(im,[height, width])
    % Takes a 3xn matrix of RGB pixels and returns a height x width x 3 RGB
    % image.
    height = s(1); width = s(2);
    if (size(mat, 1) == 3)
        % In case mat is 3 rows, convert to 3D matrix
        out = reshape(mat, [3, height, width]);
        out = permute(out, [2 3 1]);
    else
        % In case mat is 1 row, convert to 2D matrix
        out = reshape(mat, [height, width]);
    end
end
Result:
Update:
Same solution with luminance adjustment:
In case you need to correct the luminance, add the following code before "Color correction by Chromatic Adaptation":
% Scale the input so the mean of the triangle in image1 and image2 will be the same.
% The scaling is equivalent to adjusting the exposure level of the camera.
rgb_scale = mean(dstRGB) / mean(srcRGB);
srcRGB = srcRGB*rgb_scale;
linI = linI*rgb_scale;
Result:
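For readers who prefer to stay in Python, here is a rough, untested sketch of the same Bradford chromatic adaptation (it follows the cbCAT function above, leaves out the extra XYZD65_to_sRGB scaling done in the script, and all helper names are made up):
import numpy as np

SRGB_TO_XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                        [0.2126729, 0.7151522, 0.0721750],
                        [0.0193339, 0.1191920, 0.9503041]])
BRADFORD = np.array([[ 0.8951,  0.2664, -0.1614],
                     [-0.7502,  1.7135,  0.0367],
                     [ 0.0389, -0.0685,  1.0296]])

def srgb_to_linear(c):
    # Standard sRGB linearization (the equivalent of MATLAB's rgb2lin).
    c = np.asarray(c, dtype=float)
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def bradford_correction_matrix(src_rgb_lin, dst_rgb_lin):
    # Map both colors to XYZ, normalize Y to 1 (chromaticity only),
    # scale in the cone response domain, and return a 3x3 matrix that
    # acts on linear sRGB pixels (like cbCAT above).
    src_xyz = SRGB_TO_XYZ @ src_rgb_lin
    dst_xyz = SRGB_TO_XYZ @ dst_rgb_lin
    src_xyz, dst_xyz = src_xyz / src_xyz[1], dst_xyz / dst_xyz[1]
    d = np.diag((BRADFORD @ dst_xyz) / (BRADFORD @ src_xyz))
    m = np.linalg.inv(BRADFORD) @ d @ BRADFORD
    return np.linalg.inv(SRGB_TO_XYZ) @ m @ SRGB_TO_XYZ

# Usage sketch with the manually picked triangle colors from above:
src = srgb_to_linear(np.array([205, 232, 32]) / 255.0)
dst = srgb_to_linear(np.array([13, 133, 38]) / 255.0)
M = bradford_correction_matrix(src, dst)
# corrected_lin = linear_image1.reshape(-1, 3) @ M.T   # then convert back to sRGB / uint8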

Trouble displaying an HSI image converted to RGB in Python

I've been working on an algorithm to convert RGB to HSI and vice versa in Python 3, which displays the resulting images and each channel using matplotlib.
The trouble is with displaying the HSI-to-RGB result: each channel alone is displayed correctly, but when the three channels are shown together I get a weird image.
By the way, when I save the resulting image with OpenCV it is shown correctly.
(Resulting display)
What I have tried, but nothing changed:
Round the values, and if a value passes 1, clamp the pixel to 1
In the HSI-to-RGB conversion, define the R, G and B arrays with ones instead of zeros
In the RGB-to-HSI conversion, change the values from [0,360],[0,1],[0,1] to [0,360],[0,255],[0,255], rounded or not
Use Google Colab or Spyder instead of a Jupyter notebook
Execute the code in a terminal, but it gives me blank windows
Function to display images:
def show_images(T, cols=1):
    N = len(T)
    fig = plt.figure()
    for i in range(N):
        a = fig.add_subplot(np.ceil(N/float(cols)), cols, i+1)
        try:
            img, title = T[i]
        except ValueError:
            img, title = T[i], "Image %d" % (i+1)
        if(img.ndim == 2):
            plt.gray()
        plt.imshow(img)
        a.set_title(title)
        plt.xticks([0, img.shape[1]]), plt.yticks([0, img.shape[0]])
    fig.set_size_inches(np.array(fig.get_size_inches()) * N)
    plt.show()
Then the main function does this:
image = bgr_to_rgb(cv2.imread("rgb.png"))
img1 = rgb_to_hsi(image)
img2 = hsi_to_rgb(img1)
show_images([(image, "RGB"),
             (image[:,:,0], "Red"),
             (image[:,:,1], "Green"),
             (image[:,:,2], "Blue")], 4)
show_images([(img1, "RGB->HSI"),
             (img1[:,:,0], "Hue"),
             (img1[:,:,1], "Saturation"),
             (img1[:,:,2], "Intensity")], 4)
show_images([(img2, "HSI->RGB"),
             (img2[:,:,0], "Red"),
             (img2[:,:,1], "Green"),
             (img2[:,:,2], "Blue")], 4)
Conversion RGB to HSI:
def rgb_to_hsi(img):
    zmax = 255  # max value
    # values in [0,1]
    R = np.divide(img[:,:,0], zmax, dtype=np.float)
    G = np.divide(img[:,:,1], zmax, dtype=np.float)
    B = np.divide(img[:,:,2], zmax, dtype=np.float)
    # Hue, when R=G=B -> H=90
    a = (0.5)*np.add(np.subtract(R, G), np.subtract(R, B))  # (1/2)*[(R-G)+(R-B)]
    b = np.sqrt(np.add(np.power(np.subtract(R, G), 2), np.multiply(np.subtract(R, B), np.subtract(G, B))))
    tetha = np.arccos(np.divide(a, b, out=np.zeros_like(a), where=b!=0))  # when b = 0, division returns 0, so then tetha = 90
    H = (180/math.pi)*tetha  # convert rad to degree
    H[B>G] = 360 - H[B>G]
    # saturation = 1 - 3*[min(R,G,B)]/(R+G+B), when R=G=B -> S=0
    a = 3*np.minimum(np.minimum(R, G), B)  # 3*min(R,G,B)
    b = np.add(np.add(R, G), B)  # (R+G+B)
    S = np.subtract(1, np.divide(a, b, out=np.ones_like(a), where=b!=0))
    # intensity = (1/3)*[R+G+B]
    I = (1/3)*np.add(np.add(R, G), B)
    return np.dstack((H, zmax*S, np.round(zmax*I)))  # values between [0,360], [0,255] and [0,255]
Conversion HSI to RGB:
def f1(I, S):  # I(1-S)
    return np.multiply(I, np.subtract(1, S))

def f2(I, S, H):  # I[1+(ScosH/cos(60-H))]
    r = math.pi/180
    a = np.multiply(S, np.cos(r*H))  # ScosH
    b = np.cos(r*np.subtract(60, H))  # cos(60-H)
    return np.multiply(I, np.add(1, np.divide(a, b)))

def f3(I, C1, C2):  # 3I-(C1+C2)
    return np.subtract(3*I, np.add(C1, C2))

def hsi_to_rgb(img):
    zmax = 255  # max value
    # values between [0,360], [0,1] and [0,1]
    H = img[:,:,0]
    S = np.divide(img[:,:,1], zmax, dtype=np.float)
    I = np.divide(img[:,:,2], zmax, dtype=np.float)
    R, G, B = np.ones(H.shape), np.ones(H.shape), np.ones(H.shape)  # values will be between [0,1]
    # for 0 <= H < 120
    B[(0<=H)&(H<120)] = f1(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)])
    R[(0<=H)&(H<120)] = f2(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)], H[(0<=H)&(H<120)])
    G[(0<=H)&(H<120)] = f3(I[(0<=H)&(H<120)], R[(0<=H)&(H<120)], B[(0<=H)&(H<120)])
    # for 120 <= H < 240
    H = np.subtract(H, 120)
    R[(0<=H)&(H<120)] = f1(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)])
    G[(0<=H)&(H<120)] = f2(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)], H[(0<=H)&(H<120)])
    B[(0<=H)&(H<120)] = f3(I[(0<=H)&(H<120)], R[(0<=H)&(H<120)], G[(0<=H)&(H<120)])
    # for 240 <= H < 360
    H = np.subtract(H, 120)
    G[(0<=H)&(H<120)] = f1(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)])
    B[(0<=H)&(H<120)] = f2(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)], H[(0<=H)&(H<120)])
    R[(0<=H)&(H<120)] = f3(I[(0<=H)&(H<120)], G[(0<=H)&(H<120)], B[(0<=H)&(H<120)])
    return np.dstack(((zmax*R), (zmax*G), (zmax*B)))  # values between [0,255]
If you take a look at the imshow documentation of matplotlib, you will see the following lines:
X : array-like or PIL image
The image data. Supported array shapes are:
(M, N): an image with scalar data. The data is visualized using a colormap.
(M, N, 3): an image with RGB values (float or uint8).
(M, N, 4): an image with RGBA values (float or uint8), i.e. including transparency.
The first two dimensions (M, N) define the rows and columns of the image.
The RGB(A) values should be in the range [0 .. 1] for floats or [0 .. 255] for integers. Out-of-range values will be clipped to these bounds.
This tells you the ranges the data should be in... In your case, the Hue values go from 0-360, and anything above 255 will be clipped. That is one of the reasons why OpenCV uses a Hue range of 0-180, to be able to fit it inside the uint8 range.
Then the HSI->RGB conversion seems to return the image as float, so it will be clipped at 1.0.
This happens not only for the display; if you save the image it will most probably be clipped as well, or maybe it gets saved as a 16-bit image.
Possible solutions:
Normalize the values to 0-1 or to 0-255 (this may change the min and max values) and then display it (don't forget to cast it to np.uint8); see the sketch after this list.
Create a range that is always inside the possible values.
This is for display or saving purposes... If you use 0-360, save it at least in 16 bits.
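A minimal sketch of the first option (assuming img1 holds H in [0,360] and S, I in [0,255], as returned by rgb_to_hsi above, and img2 is the float HSI->RGB result):
import numpy as np
import matplotlib.pyplot as plt

disp = np.empty(img1.shape, dtype=np.uint8)
disp[..., 0] = np.round(img1[..., 0] / 360.0 * 255).astype(np.uint8)  # H scaled to [0,255]
disp[..., 1] = np.clip(img1[..., 1], 0, 255).astype(np.uint8)         # S already [0,255]
disp[..., 2] = np.clip(img1[..., 2], 0, 255).astype(np.uint8)         # I already [0,255]
plt.imshow(disp)
plt.show()

# For the HSI->RGB result, clipping and casting to uint8 is enough:
plt.imshow(np.clip(img2, 0, 255).astype(np.uint8))
plt.show()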

OpenCV Python cv2.perspectiveTransform

I'm currently trying to do video stabilization using OpenCV and Python.
I use the following function to calculate rotation:
def accumulate_rotation(src, theta_x, theta_y, theta_z, timestamps, prev, current, f, gyro_delay=None, gyro_drift=None, shutter_duration=None):
    if prev == current:
        return src
    pts = []
    pts_transformed = []
    for x in range(10):
        current_row = []
        current_row_transformed = []
        pixel_x = x * (src.shape[1] / 10)
        for y in range(10):
            pixel_y = y * (src.shape[0] / 10)
            current_row.append([pixel_x, pixel_y])
            if shutter_duration:
                y_timestamp = current + shutter_duration * (pixel_y - src.shape[0] / 2)
            else:
                y_timestamp = current
            transform = getAccumulatedRotation(src.shape[1], src.shape[0], theta_x, theta_y, theta_z, timestamps, prev,
                                               current, f, gyro_delay, gyro_drift)
            output = cv2.perspectiveTransform(np.array([[pixel_x, pixel_y]], dtype="float32"), transform)
            current_row_transformed.append(output)
        pts.append(current_row)
        pts_transformed.append(current_row_transformed)
    o = utilities.meshwarp(src, pts_transformed)
    return o
I get the following error when it gets to output = cv2.perspectiveTransform(np.array([[pixel_x, pixel_y]], dtype="float32"), transform):
cv2.error: /Users/travis/build/skvark/opencv-python/opencv/modules/core/src/matmul.cpp:2271: error: (-215) scn + 1 == m.cols in function perspectiveTransform
Any help or suggestions would really be appreciated.
This implementation really needs to be changed in a future version, or the docs should be more clear.
From the OpenCV docs for perspectiveTransform():
src – input two-channel (...) floating-point array
Slant emphasis added by me.
>>> A = np.array([[0, 0]], dtype=np.float32)
>>> A.shape
(1, 2)
So we see from here that A is just a single-channel matrix, that is, two-dimensional. One row, two cols. You instead need a two-channel image, i.e., a three-dimensional matrix where the length of the third dimension is 2 or 3 depending on if you're sending in 2D or 3D points.
Long story short, you need to add one more set of brackets to make the set of points you're sending in three-dimensional, where the x values are in the first channel, and the y values are in the second channel.
>>> A = np.array([[[0, 0]]], dtype=np.float32)
>>> A.shape
(1, 1, 2)
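Applied to the line from the question, that means adding one more pair of brackets (a sketch; the rest of the function is unchanged):
output = cv2.perspectiveTransform(
    np.array([[[pixel_x, pixel_y]]], dtype="float32"),  # shape (1, 1, 2)
    transform)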
Also, as suggested in the comments:
If you have an array points of shape (n_points, dimension) (i.e. dimension is 2 or 3), a nice way to re-format it for this use-case is points[np.newaxis]
It's not intuitive, and though it's documented, it's not very explicit on that point. That's all you need. I've answered an identical question before, but for the cv2.transform() function.

How to implement Matlab bwmorph(bw,'remove') in Python

I am trying to implement the MATLAB function bwmorph(bw,'remove') in Python. This function removes interior pixels by setting a pixel to 0 if all of its 4-connected neighbor pixels are 1. The resulting image should contain only the boundary pixels. I've written some code, but I'm not sure if this is the right way to do it.
# neighbors() function returns the values of the 4-connected neighbors
# bwmorph() function returns the input image with only the boundary pixels
def neighbors(input_matrix, input_array):
    indexRow = input_array[0]
    indexCol = input_array[1]
    output_array = []
    output_array[0] = input_matrix[indexRow - 1, indexCol]
    output_array[1] = input_matrix[indexRow, indexCol + 1]
    output_array[2] = input_matrix[indexRow + 1, indexCol]
    output_array[3] = input_matrix[indexRow, indexCol - 1]
    return output_array
def bwmorph(input_matrix):
    output_matrix = input_matrix.copy()
    nRows, nCols = input_matrix.shape
    for indexRow in range(0, nRows):
        for indexCol in range(0, nCols):
            center_pixel = [indexRow, indexCol]
            neighbor_array = neighbors(output_matrix, center_pixel)
            if neighbor_array == [1, 1, 1, 1]:
                output_matrix[indexRow, indexCol] = 0
    return output_matrix
Since you are using NumPy arrays, one suggestion I have is to change the if statement to use numpy.all to check whether all of the neighbour values are nonzero. In addition, you should make sure that your input is a single-channel image. Because a grayscale image stored in colour has the same values in all channels, just extract the first channel; your comments indicate a colour image, so make sure you do this. You are also checking against the output matrix, which is being modified inside the loop; you need to use an unmodified version. This is also why you're getting a blank output.
def bwmorph(input_matrix):
    output_matrix = input_matrix.copy()
    # Change. Ensure single channel
    if len(output_matrix.shape) == 3:
        output_matrix = output_matrix[:, :, 0]
    nRows, nCols = output_matrix.shape  # Change
    orig = output_matrix.copy()  # Need another one for checking
    for indexRow in range(0, nRows):
        for indexCol in range(0, nCols):
            center_pixel = [indexRow, indexCol]
            neighbor_array = neighbors(orig, center_pixel)  # Change to use unmodified image
            if np.all(neighbor_array):  # Change
                output_matrix[indexRow, indexCol] = 0
    return output_matrix
In addition, a small grievance I have with your code is that you don't check for out-of-bounds conditions when determining the four neighbours. The test image you provided does not throw an error because none of its border pixels are white. If a pixel lies along any of the borders, it isn't possible to check all four of its neighbours. However, one way to mitigate this would be to wrap around using the modulo operator:
def neighbors(input_matrix, input_array):
    (rows, cols) = input_matrix.shape[:2]  # New
    indexRow = input_array[0]
    indexCol = input_array[1]
    output_array = [0] * 4  # New - I like pre-allocating
    # Edit
    output_array[0] = input_matrix[(indexRow - 1) % rows, indexCol]
    output_array[1] = input_matrix[indexRow, (indexCol + 1) % cols]
    output_array[2] = input_matrix[(indexRow + 1) % rows, indexCol]
    output_array[3] = input_matrix[indexRow, (indexCol - 1) % cols]
    return output_array
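For what it's worth, a vectorized sketch of the same 'remove' operation using scipy (assuming scipy is available; a pixel is cleared when it and its four 4-connected neighbours are all set, with out-of-image neighbours treated as 0):
import numpy as np
from scipy import ndimage

def bwmorph_remove(bw):
    bw = np.asarray(bw) > 0
    cross = np.array([[0, 1, 0],
                      [1, 1, 1],
                      [0, 1, 0]])
    # Erosion with a cross marks interior pixels (all 4-connected neighbours set).
    interior = ndimage.binary_erosion(bw, structure=cross)
    return (bw & ~interior).astype(np.uint8)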
