I encountered this puzzling situation when trying to get rid of the third dimension (the RGB dimension) of my images in order to feed them to a kNN classifier for face recognition.
I took one color face image from the Labeled Faces in the Wild dataset as an example. It is saved locally.
I first imported the image, converted it to grayscale, and checked its dimensions (time1); then I exported it with imwrite, imported the grayscale image again, and checked its dimensions again (time2).
At (time1), the number of dimensions was 2: (250, 250). However, at (time2), it became 3: (250, 250, 3). Why would exporting and importing change the dimensions of the grayscale picture? What should I specify when importing the grayscale picture to keep it 2-dimensional?
Here is my Python code:
import cv2
import matplotlib.pyplot as plt
imgBGR = cv2.imread("path/filename")
gray = cv2.cvtColor(imgBGR, cv2.COLOR_BGR2GRAY)
gray.shape # this gives me (250, 250)
cv2.imwrite("path/newname", gray)
gray2 = cv2.imread("path/newname")
gray2.shape # this gives me (250, 250, 3)
Try gray2 = cv2.imread("path/newname" , cv2.IMREAD_GRAYSCALE)
Per the OpenCV imread documentation, the default flag is cv2.IMREAD_COLOR, so without setting a flag cv2.imread reads the image as color and will split a grayscale image into 3 identical channels.
By specifying cv2.imread("path/newname", cv2.IMREAD_GRAYSCALE), the function will read the image in grayscale.
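As a quick check (a minimal sketch reusing the OP's placeholder paths), reading the file back with the flag keeps the array 2-dimensional, and you can verify that the default color read simply duplicated the intensities into three identical channels:
import cv2
import numpy as np
gray2 = cv2.imread("path/newname", cv2.IMREAD_GRAYSCALE)
print(gray2.shape)  # (250, 250) - stays 2-dimensional
bgr = cv2.imread("path/newname")  # default flag is cv2.IMREAD_COLOR, so shape is (250, 250, 3)
print(np.array_equal(bgr[:, :, 0], bgr[:, :, 1]))  # True - the channels are identical copies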
I've converted some images from RGB to grayscale for ML purposes.
However, the converted grayscale image still has 3 dimensions, the same as the color image.
The code for the conversion:
from PIL import Image
img = Image.open('path/to/color/image')
imgGray = img.convert('L')
imgGray.save('path/to/grayscale/image')
The code to check the shape of the images:
import cv2
im_color = cv2.imread('path/to/color/image')
print(im_color.shape)
im_gray2 = cv2.imread('path/to/grayscale/image')
print(im_gray2.shape)
You did
im_gray2 = cv2.imread('path/to/grayscale/image')
OpenCV does not inspect the colorness of the image - it assumes the image is color and that the desired output is 8-bit BGR format. You need to tell OpenCV you want the output to be grayscale (a 2D intensity array), as follows:
im_gray2 = cv2.imread('path/to/grayscale/image', cv2.IMREAD_GRAYSCALE)
If you want to know more about reading images, read OpenCV: Getting Started with Images.
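Alternatively, if the image has already been loaded as 3-channel BGR, you can collapse it back to a single channel after the fact (a hedged sketch, reusing the OP's placeholder path):
import cv2
im_bgr = cv2.imread('path/to/grayscale/image')       # (H, W, 3) by default
im_gray2 = cv2.cvtColor(im_bgr, cv2.COLOR_BGR2GRAY)  # collapse back to a 2D (H, W) array
print(im_gray2.shape)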
cv.imread, without any flags, will always convert any image content to BGR, 8 bits per channel.
If you want any image file, grayscale or color, to be read as grayscale, you can pass the cv.IMREAD_GRAYSCALE flag.
If you want to read the file as it really is, then you need to use cv.IMREAD_UNCHANGED.
im_color = cv2.imread('path/to/color/image', cv2.IMREAD_UNCHANGED)
print(im_color.shape)
im_gray2 = cv2.imread('path/to/grayscale/image', cv2.IMREAD_UNCHANGED)
print(im_gray2.shape)
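For example (a sketch assuming hypothetical filenames: a PNG saved with an alpha channel and a true single-channel PNG), IMREAD_UNCHANGED preserves whatever is actually stored in the file:
import cv2
im = cv2.imread('image_with_alpha.png', cv2.IMREAD_UNCHANGED)  # hypothetical PNG with alpha
print(im.shape)  # (H, W, 4) - the alpha channel is preserved
im = cv2.imread('gray.png', cv2.IMREAD_UNCHANGED)  # hypothetical single-channel PNG
print(im.shape)  # (H, W) - a true grayscale file stays 2D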
I have encountered a very strange problem.
imshow() is showing the same image for the first three imshows - why? (Only the red channel; it seems to have zeroed blue and green.)
I'm creating a copy of the original image, but it seems the operations affect all images.
The fourth imshow shows the red channel as grayscale, as expected.
What am I doing wrong?
##### Image processing ####
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('/home/pi/Documents/testcode_python/Tractor_actual/Pictures/result2.jpg') #reads as BGR
print(img.shape)
no_blue=img
no_green=img
only_red=img[:,:,2] #Takes only red channel from BGR image and saves to "only_red"
no_blue[:,:,0]=np.zeros([img.shape[0], img.shape[1]]) #Puts Zeros on Blue channels for "no_blue"
no_green[:,:,1]=np.zeros([img.shape[0], img.shape[1]])
print(no_blue.shape)
cv2.imshow('Original',img)
cv2.imshow('No Blue',no_blue)
cv2.imshow('No Green',no_green)
cv2.imshow('Only Red', only_red)
cv2.waitKey(0)
cv2.destroyAllWindows()
You need to create a copy of the image to avoid using the same memory location as img. Not sure if this is what you are looking for with only_red, but keeping all three channels and setting blue and green to 0 will avoid it being interpreted as a single-channel grayscale image.
##### Image processing ####
import cv2
import numpy as np
import matplotlib.pyplot as plt
img = cv2.imread('/home/pi/Documents/testcode_python/Tractor_actual/Pictures/result2.jpg') #reads as BGR
print(img.shape)
no_blue=img.copy() # copy of img to avoid using the same memory location as img
no_green=img.copy() # copy of img to avoid using the same memory location as img
only_red=img.copy() # similarly to above.
# You also need all three channels of the RGB image to avoid it being interpreted as single channel image.
only_red[:,:,0] = np.zeros([img.shape[0], img.shape[1]])
only_red[:,:,1] = np.zeros([img.shape[0], img.shape[1]])
# Puts zeros for green and blue channels
no_blue[:,:,0]=np.zeros([img.shape[0], img.shape[1]]) #Puts Zeros on Blue channels for "no_blue"
no_green[:,:,1]=np.zeros([img.shape[0], img.shape[1]])
print(no_blue.shape)
cv2.imshow('Original',img)
cv2.imshow('No Blue',no_blue)
cv2.imshow('No Green',no_green)
cv2.imshow('Only Red', only_red)
cv2.waitKey(0)
cv2.destroyAllWindows()
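As a side note, a more concise way to do the zeroing (a sketch, equivalent in effect to the assignments above) is to rely on NumPy broadcasting, which removes the need to build explicit zeros arrays:
no_blue = img.copy()
no_blue[:, :, 0] = 0      # broadcasting a scalar: no zeros array needed
only_red = img.copy()
only_red[:, :, :2] = 0    # zero blue and green in one assignment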
I have recorded some data as an npy file, and I tried to display the image (data[0]) to check whether it makes sense, with the following code:
import numpy as np
import cv2
train_data = np.load('c:/data/train_data.npy')
for data in train_data:
    output = data[1]
    # only take the height, width and channels of the 4 dimensional array
    image = data[0][0, :, :, :]
    # image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    cv2.imshow('test', image)
    print('output {}'.format(output))
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
But if I display the images without the line image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB), the images seem to be BGR-based. If I comment this line back into the code, the images are displayed correctly.
My question: does this observation imply that the image array is already in BGR format? Or does it imply that cv2.imshow() by default interprets the array as a BGR array?
Matplotlib and Numpy read images into RGB and process them as RGB. OpenCV reads images into BGR and processes them as BGR. Each system recognizes a range of input types, has ways to convert between color spaces of almost any type, and offers support for a variety of image processing tasks.
This gives three different ways to load an image (plt.imread(), ndimage.imread(), and cv2.imread()), two systems for processing the data (Numpy and CV2), and two ways to display the image (plt.imshow() and cv2.imshow()) - and really, there is a third way to display the image using pyplot, if you want to treat the image as numerical data in 2-D plus another dimension for each color.
Here is some simple code to demonstrate some of this.
#!/usr/bin/python
import matplotlib.pyplot as plt
from scipy.ndimage import imread  # note: removed in newer SciPy; imageio.imread is the modern replacement
import numpy as np
import cv2
img = imread('index.jpg')  # ndimage reads the file as an RGB array
print( "img data type: %s shape %s"%( type(img), str( img.shape) ) )
plt.imshow( img )
plt.title( 'pyplot as read' )
plt.savefig( 'index.plt.raw.jpg' )
cv2.imshow('cv2, read by numpy', img)
cv2.imwrite('index.cv2.raw.jpg',img)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # swap channel order so cv2.imshow displays it correctly
cv2.imshow('after conversion', img)
cv2.imwrite('index.cv2.bgr2rgb.jpg',img)
cv2.waitKey(0)  # give the cv2 windows a chance to render
This generates the following line of text, and the following three example image files.
img data type: <type 'numpy.ndarray'> shape (225, 225, 3)
The correct image has red as the upper circle. We read the image into a numpy array, using ndimage.imread(), and show it with Pyplot's imshow() and get the correct image. We then show it with cv2.imshow() and we see that the red channel is interpreted as the blue channel and vice versa. Then we convert the colorspace and we see that cv2.imshow() now interprets the result correctly.
plt.imshow(), as read by ndimage():
cv2.imshow(), the image as read by ndimage:
cv2.imshow(), after converting from RGB to BGR:
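Incidentally, because the conversion is just a reordering of the last axis, a plain NumPy slice achieves the same RGB-to-BGR swap without calling cvtColor (a minimal sketch for any (H, W, 3) array):
swapped = img[:, :, ::-1]  # reverse the channel axis: RGB <-> BGR (returns a view, no copy)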
I'm trying to resize images retrieved from cifar10 from the original 32x32 to 96x96 for use with MobileNetV2; however, I'm running into this error. I've tried a variety of solutions but nothing seems to work.
My code:
for a in range(len(train_images)):
    train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
Error I'm getting:
----> 8 train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
ValueError: could not broadcast input array from shape (96,96,3) into shape (32,32,3)
Sometimes you have to convert the image from RGB to grayscale. If that is the problem, the only thing you should do is gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), resize the image, and then convert back with resized_image = cv2.cvtColor(gray_image, cv2.COLOR_GRAY2RGB), as in the sketch below.
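Spelled out (a sketch of the sequence just described; image and minSize come from the OP's code):
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # 3 channels -> 1
gray_resized = cv2.resize(gray_image, dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
resized_image = cv2.cvtColor(gray_resized, cv2.COLOR_GRAY2RGB)  # back to 3 channels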
I have never run into this error, but if the first option doesn't work, you can try resizing the image with Pillow, like this:
import numpy
import cv2
from PIL import Image
im = Image.fromarray(cv2_image)  # cv2_image is the array from cv2.imread
nx, ny = im.size
im2 = im.resize((nx*2, ny*2), Image.LANCZOS)
cv2_image = cv2.cvtColor(numpy.array(im2), cv2.COLOR_RGB2BGR)
You can make this into a function and call it in a list comprehension. I hope this solves your problem :)
This is simply because you are reading the 32x32 image from train_images and trying to save the reshaped image (96x96) into the same array, which is impossible!
Try something like:
import numpy as np
train_images_reshaped = np.zeros((num_images, 96, 96, 3), dtype=train_images.dtype)  # preallocate the target shape (np.array((num_images, 96, 96, 3)) would just make a 4-element vector)
for a in range(len(train_images)):
    train_images_reshaped[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
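Another option (a sketch under the same assumptions: train_images holds 32x32x3 frames and minSize is 96) is to collect the resized frames in a list and stack them, which avoids the preallocation entirely:
import numpy as np
resized = [cv2.resize(im, dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
           for im in train_images]
train_images_reshaped = np.stack(resized)  # shape (num_images, 96, 96, 3)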
There are several interpolation algorithms in OpenCV, such as:
INTER_NEAREST – a nearest-neighbor interpolation
INTER_LINEAR – a bilinear interpolation (used by default)
INTER_AREA – resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC – a bicubic interpolation over a 4×4 pixel neighborhood
INTER_LANCZOS4 – a Lanczos interpolation over an 8×8 pixel neighborhood
Code:
image_scaled=cv2.resize(image,None,fx=.75,fy=.75,interpolation = cv2.INTER_LINEAR)
img_double=cv2.resize(image,None,fx=2,fy=2,interpolation=cv2.INTER_CUBIC)
image_resize=cv2.resize(image,(200,300),interpolation=cv2.INTER_AREA)
image_resize=cv2.resize(image,(500,400),interpolation=cv2.INTER_LANCZOS4)
You can find the details about python implementation here as well: How to resize images in OpenCV python
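As a rule of thumb from the OpenCV docs, INTER_AREA is preferred when shrinking and INTER_LINEAR or INTER_CUBIC when enlarging; a small hypothetical helper could encode that choice:
import cv2
def smart_resize(image, width, height):
    # hypothetical helper: shrink with INTER_AREA, enlarge with INTER_CUBIC
    shrinking = width < image.shape[1] or height < image.shape[0]
    interp = cv2.INTER_AREA if shrinking else cv2.INTER_CUBIC
    return cv2.resize(image, (width, height), interpolation=interp)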
I am trying to resize a .jpg image with the skimage.transform.resize function. The function returns a weird result (see image below). I am not sure if it is a bug or just wrong use of the function.
import numpy as np
from PIL import Image
from skimage import io, color
from skimage.transform import resize
rgb = io.imread("../../small_dataset/" + file)
# show original image
img = Image.fromarray(rgb, 'RGB')
img.show()
rgb = resize(rgb, (256, 256))
# show resized image
img = Image.fromarray(rgb, 'RGB')
img.show()
Original image:
Resized image:
I already checked skimage resize giving weird output, but I think that my bug has different properties.
Update: The rgb2lab function has a similar bug.
The problem is that skimage converts the pixel data type of your array after resizing the image. The original image has 8 bits per pixel (type numpy.uint8), while the resized pixels are numpy.float64 variables.
The resize operation is correct, but the result is not being displayed correctly. To solve this issue, I propose 2 different approaches:
The first approach is to change the data structure of the resulting image. Prior to casting to uint8 values, the pixels have to be converted to a 0-255 scale, as they are on a 0-1 normalized scale:
# ...
# Do the OP operations ...
resized_image = resize(rgb, (256, 256))
# Convert the image to a 0-255 scale.
rescaled_image = 255 * resized_image
# Convert to integer data type pixels.
final_image = rescaled_image.astype(np.uint8)
# show resized image
img = Image.fromarray(final_image, 'RGB')
img.show()
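If you prefer not to do the scaling by hand, skimage ships a helper that performs the same 0-1 float to 0-255 uint8 conversion (a sketch using skimage's img_as_ubyte on the resized array from above):
from skimage import img_as_ubyte
final_image = img_as_ubyte(resized_image)  # scales [0, 1] floats to [0, 255] uint8
img = Image.fromarray(final_image, 'RGB')
img.show()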
Update: The method below is deprecated, as per the scipy.misc.imshow documentation.
The second approach is to use another library for displaying the image. Taking a look at the Image library documentation, there isn't any mode supporting 3xfloat64 pixel images. However, the scipy.misc library has the appropriate tools for converting the array format in order to display it correctly:
from scipy import misc
# ...
# Do OP operations
misc.imshow(resized_image)
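Since scipy.misc.imshow is gone from recent SciPy releases, matplotlib is a safe substitute here; plt.imshow handles float images on a 0-1 scale directly (a minimal sketch, using resized_image from above):
import matplotlib.pyplot as plt
plt.imshow(resized_image)  # floats in [0, 1] are displayed directly
plt.show()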