cv2 image size is transposed from PIL Image - python

I have an image of size 72x96. Windows says its size is 72x96. PIL Image also says it is 72x96:
from PIL import Image, ImageOps
with Image.open(<path>) as img:
    print(img.size)  # (72, 96)
    print(ImageOps.exif_transpose(img).size)  # (72, 96)
But when I read the image with cv2.imread or skimage.io.imread, the reported shape is (96, 72, 3):
from skimage.io import imread
im0 = imread(<path>)
print(im0.shape) # (96, 72, 3)
What is wrong here? Even if I do something like this:
import matplotlib.pyplot as plt
plt.imshow(im0)
It shows the image with the correct proportions, but the printed shape still looks transposed.

This is expected behavior.
PIL reports the size of an image as (width, height) (PIL documentation), whereas NumPy reports the shape of an array as the length of each axis in order, which for a 2D image array means (height, width) (NumPy documentation).
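To see the two conventions side by side, here is a minimal sketch using a synthetic 72x96 image (72 wide, 96 tall) instead of the file from the question:
import numpy as np
from PIL import Image

img = Image.new("RGB", (72, 96))  # PIL takes (width, height)
print(img.size)                   # (72, 96)

arr = np.asarray(img)
print(arr.shape)                  # (96, 72, 3): numpy reports (rows, cols, channels)

# The two always relate as:
assert img.size == (arr.shape[1], arr.shape[0])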

Related

Numpy array not displaying color dimension of greyscale image after converting from PIL image

I'm trying to convert an RGB image to a greyscale image, then to a numpy array using the following code snippet:
from PIL import Image
import numpy as np

img = Image.open("image1.png")
img = img.convert('L')  # 'L' = 8-bit greyscale
img = np.array(img, dtype='f')
print(img.shape)
The result is a numpy array of shape (128, 128). Is there any way I could convert a greyscale image to a numpy array so that it has the color channel as well, i.e. so the shape would be (128, 128, 1)?
As @Mark mentioned in the comments, add a dimension to the end of your array using newaxis:
img = img[..., None]
None behaves the same as np.newaxis. It does not add any color information; it just adds a trailing dimension, giving the array the shape of a single-channel image.
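Put together, a minimal sketch (reusing image1.png from the question; the shape comments assume a 128x128 image):
import numpy as np
from PIL import Image

img = Image.open("image1.png").convert('L')
arr = np.array(img, dtype='f')
print(arr.shape)        # (128, 128)

arr = arr[..., None]    # same as arr[..., np.newaxis]
print(arr.shape)        # (128, 128, 1)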

Lanczos Interpolation in Python with 2D images

I am trying to rescale 2D (greyscale) images.
The image size is 256x256 and the desired output is 224x224.
The pixel values range from 0 to 1300.
I tried two approaches to rescale them with Lanczos interpolation:
First using PIL Image:
import numpy as np
from PIL import Image
import cv2
array = np.random.randint(0, 1300, size=(10, 256, 256))
array[0] = Image.fromarray(array[0]).resize(size=(224, 224), resample=Image.LANCZOS)
resulting in the error message: ValueError: image has wrong mode
And then CV2:
array[0] = cv2.resize(array[0], dsize=(224, 224), interpolation=cv2.INTER_LANCZOS4)
resulting in the error message: ValueError: could not broadcast input array from shape (224,224) into shape (256,256)
How to do it properly?
In the second case, you are resizing a 256x256 image to 224x224 and then assigning it back into a slice of the original array. That slice is still 256x256, so NumPy cannot broadcast the smaller result into it.
Instead, create a new output array of the right size:
array = np.random.randint(0, 1300, size=(10, 256, 256))
newarray = np.zeros((10, 224, 224))
newarray[0] = cv2.resize(array[0], dsize=(224, 224), interpolation=cv2.INTER_LANCZOS4)
In the PIL part, you have a few issues.
Firstly, you need to check the dtype of the things you create! You get an array of np.int64 when you use np.random.randint() like that. As you know your data only goes up to 1300, an unsigned 16-bit dtype is preferable:
array = np.random.randint(0, 1300, size=(10, 256, 256), dtype=np.uint16)
Secondly, when you create a PIL Image from the Numpy array, you need to tell PIL the mode - greyscale, or 'L' for Lightness, here:
img = Image.fromarray(array[0], 'L')
Thirdly, you are trying to stuff the newly created PIL Image back into a Numpy array - don't do that:
newVariable = Image.fromarray(...).resize()
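Putting all of that together, here is a minimal end-to-end sketch of the PIL route. It is one option, not the only one: it goes through PIL's 32-bit float mode 'F' so that pixel values above 255 survive the resampling:
import numpy as np
from PIL import Image

array = np.random.randint(0, 1300, size=(10, 256, 256), dtype=np.uint16)
newarray = np.zeros((10, 224, 224), dtype=np.uint16)

for i in range(array.shape[0]):
    # Mode 'F' is 32-bit float greyscale, so values up to 1300 are preserved
    img = Image.fromarray(array[i].astype(np.float32), mode='F')
    resized = img.resize((224, 224), resample=Image.LANCZOS)
    # Lanczos can overshoot the input range, so clip before casting back
    newarray[i] = np.clip(np.asarray(resized), 0, 65535).astype(np.uint16)

print(newarray.shape)  # (10, 224, 224)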

Dimension decreases after performing "pyramid_reduce" function. How to fix?

I am trying to downscale an image using scikit-image. However, I cannot display the downscaled picture with matplotlib's imshow function because of its dimensions. Is there a way to prevent this dimension reduction? My script is below.
import os, cv2, glob
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from skimage.transform import pyramid_reduce
plt.style.use('dark_background')
img_path = os.path.join(img_base_path, value[0])
img = io.imread(img_path)
resized = pyramid_reduce(img, downscale=4)
print(resized.shape)
img.shape is (240, 240, 3), so what I expect as output is (60, 60, 3). However, what I get is (60, 60, 1).
When I read the documentation of the pyramid_reduce function, I notice the parameter multichannel:
multichannel : bool, optional
    Whether the last axis of the image is to be interpreted as multiple channels or another spatial dimension.
So I would suggest setting that to True; otherwise it treats your 2D color image as a 3D black-and-white volume:
resized = pyramid_reduce(img, downscale=4, multichannel=True)
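A self-contained sketch with a synthetic 240x240 RGB image (note that scikit-image 0.19+ replaces the multichannel flag with channel_axis=-1):
import numpy as np
from skimage.transform import pyramid_reduce

img = np.random.rand(240, 240, 3)
resized = pyramid_reduce(img, downscale=4, multichannel=True)
print(resized.shape)  # (60, 60, 3)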

Skimage rgb2gray reduces one dimension

I am trying to convert multiple RGB images to grayscale. However, I am losing one dimension:
# img is an array of 10 images of 32x32 dimensions in RGB
from skimage.color import rgb2gray
print(img.shape) # (10, 32, 32, 3)
img1 = rgb2gray(img)
print(img1.shape) # (10, 32, 3)
As you can see, though the shape of img1 is expected to be (10, 32, 32, 1), it comes out as (10, 32, 3).
What point am I missing?
This function assumes the input is one single image with 3 channels (RGB) or 4 (RGBA, with alpha).
(As your input has 4 dimensions, it is interpreted as a single RGB + alpha image, not as N images with 3 dimensions each.)
If you have multiple images, you will need to loop somehow, like this (untested):
import numpy as np
from skimage.color import rgb2gray
print(img.shape) # (10, 32, 32, 3)
img_gray = np.stack([rgb2gray(img[i]) for i in range(img.shape[0])])  # shape (10, 32, 32)
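A self-contained version with synthetic data to check the resulting shape:
import numpy as np
from skimage.color import rgb2gray

img = np.random.rand(10, 32, 32, 3)
img_gray = np.stack([rgb2gray(im) for im in img])
print(img_gray.shape)              # (10, 32, 32)

# If the trailing channel axis is needed, add it back with newaxis:
print(img_gray[..., None].shape)   # (10, 32, 32, 1)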

Discarding alpha channel from images stored as Numpy arrays

I load images with numpy/scikit. I know that all images are 200x200 pixels.
When the images are loaded, I notice some have an alpha channel, and therefore have shape (200, 200, 4) instead of (200, 200, 3) which I expect.
Is there a way to drop that last value, discarding the alpha channel, and get all images into a nice (200, 200, 3) shape?
Just slice the array to get the first three entries of the last dimension:
image_without_alpha = image[:, :, :3]
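For example, with a synthetic RGBA array:
import numpy as np

image = np.zeros((200, 200, 4))        # RGBA
image_without_alpha = image[:, :, :3]  # keep only R, G, B
print(image_without_alpha.shape)       # (200, 200, 3)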
scikit-image builtin:
from skimage.color import rgba2rgb
from skimage import data
img_rgba = data.logo()
img_rgb = rgba2rgb(img_rgba)
https://scikit-image.org/docs/dev/user_guide/transforming_image_data.html#conversion-from-rgba-to-rgb-removing-alpha-channel-through-alpha-blending
https://scikit-image.org/docs/dev/api/skimage.color.html#rgba2rgb
Use PIL.Image to remove the alpha channel
from PIL import Image
import numpy as np
img = Image.open("c:\>path_to_image")
img = img.convert("RGB") # remove alpha
image_array = np.asarray(img) # converting image to numpy array
print(image_array.shape)
img.show()
If your images are already numpy arrays, use Image.fromarray to convert an array back to a PIL Image:
pilImage = Image.fromarray(numpy_array)
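Combining both directions into a round trip (the array here is synthetic; convert("RGB") simply drops the alpha channel without blending):
import numpy as np
from PIL import Image

rgba = np.zeros((200, 200, 4), dtype=np.uint8)  # synthetic RGBA image
rgb = np.asarray(Image.fromarray(rgba).convert("RGB"))
print(rgb.shape)  # (200, 200, 3)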
