I'm trying to resize images retrieved from CIFAR-10 from the original 32x32 to 96x96 for use with MobileNetV2, however I'm running into this error. I've tried a variety of solutions but nothing seems to work.
My code:
for a in range(len(train_images)):
train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
Error I'm getting:
----> 8 train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
ValueError: could not broadcast input array from shape (96,96,3) into shape (32,32,3)
Sometimes you have to convert the image from RGB to grayscale first. If that is the problem, all you need to do is gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), resize the image, and then convert back with resized_image = cv2.cvtColor(gray_image, cv2.COLOR_GRAY2RGB).
I have never run into this error, but if the first option doesn't work, you can try resizing the image with Pillow like this:
import cv2
import numpy
from PIL import Image

im = Image.fromarray(cv2_image)                                # wrap the OpenCV array in a PIL image
nx, ny = im.size
im2 = im.resize((nx*2, ny*2), Image.LANCZOS)                   # upscale with Lanczos resampling
cv2_image = cv2.cvtColor(numpy.array(im2), cv2.COLOR_RGB2BGR)  # convert back to a BGR numpy array
You can make this into a function and call it in a list comprehension. I hope this solves your problem :)
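For instance, a minimal sketch of that approach, assuming train_images is a sequence of 32x32x3 uint8 arrays (the helper name upscale_pil is just for illustration):

from PIL import Image
import numpy as np

def upscale_pil(image, size):
    """Resize one HxWx3 uint8 array to (size, size) using Pillow's Lanczos filter."""
    return np.array(Image.fromarray(image).resize((size, size), Image.LANCZOS))

# Build a new list instead of writing the 96x96 results back into the 32x32 slots.
train_images = [upscale_pil(img, 96) for img in train_images]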
This is simply because you are reading the 32x32 image from train_images and trying to store the resized 96x96 image back into the same array, which is impossible!
Try something like:
# np.array((num_images, 96, 96, 3)) would only create a 1-D array of four numbers;
# pre-allocate an array of the target shape instead (add dtype=np.uint8 to match the input if desired).
train_images_reshaped = np.zeros((num_images, 96, 96, 3))
for a in range(len(train_images)):
    train_images_reshaped[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
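Equivalently, you can build the new array in one step with a list comprehension and np.stack (a sketch assuming, as in the question, that minSize is 96):

import cv2
import numpy as np

# Resize every image and stack the results into a (num_images, 96, 96, 3) array.
train_images_resized = np.stack(
    [cv2.resize(img, (minSize, minSize), interpolation=cv2.INTER_CUBIC) for img in train_images]
)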
OpenCV offers several interpolation algorithms, such as:
INTER_NEAREST – a nearest-neighbor interpolation
INTER_LINEAR – a bilinear interpolation (used by default)
INTER_AREA – resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC – a bicubic interpolation over a 4×4 pixel neighborhood
INTER_LANCZOS4 – a Lanczos interpolation over an 8×8 pixel neighborhood
Code:
import cv2

# 'image' is assumed to be a BGR array already loaded with cv2.imread
image_scaled = cv2.resize(image, None, fx=0.75, fy=0.75, interpolation=cv2.INTER_LINEAR)  # scale down to 75%
img_double = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)           # double the size
image_resize = cv2.resize(image, (200, 300), interpolation=cv2.INTER_AREA)                # fixed 200x300 (INTER_AREA suits shrinking)
image_resize = cv2.resize(image, (500, 400), interpolation=cv2.INTER_LANCZOS4)            # fixed 500x400 (INTER_LANCZOS4 suits enlarging)
You can find more details about the Python implementation here as well: How to resize images in OpenCV python
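As a quick illustration for the original 32x32 → 96x96 case (a sketch; the random array is only a stand-in for a CIFAR-10 image):

import cv2
import numpy as np

small = np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)  # stand-in for a 32x32 CIFAR-10 image

# For upscaling, INTER_CUBIC or INTER_LANCZOS4 usually give the smoothest result;
# INTER_AREA is better suited to shrinking.
big_cubic = cv2.resize(small, (96, 96), interpolation=cv2.INTER_CUBIC)
big_lanczos = cv2.resize(small, (96, 96), interpolation=cv2.INTER_LANCZOS4)
print(big_cubic.shape, big_lanczos.shape)  # (96, 96, 3) (96, 96, 3)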
I have the following image:
Original Image
I am using the following code to resize this image to 1600x1200.
img = cv2.imread('R.png')
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray_image.resize(1600,1200)
I am then returned the following image:
Final Image
I have tried to fix this by using different image formats (jpg, tif), but this does not seem to help. I also tried using different interpolation algorithms like INTER_NEAREST and INTER_LINEAR, and these produce the same results.
Does anyone have an idea?
You are calling numpy's resize() method on the array that represents the grayscale image, which only changes the shape of the array without interpolating the pixel data. You should use the resize() function from OpenCV instead:
img = cv2.imread('R.png')
resized_image = cv2.resize(img, (1600, 1200), interpolation = cv2.INTER_LINEAR)
Besides that, I think you may have mistakenly swapped the width and height of the image; it should be 1200 x 1200 to keep the scale.
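If the intent is to keep the aspect ratio, one common pattern (a sketch, not from the original answer) is to derive the target height from the chosen width:

import cv2

img = cv2.imread('R.png')
target_width = 1600
scale = target_width / img.shape[1]                  # img.shape is (height, width, channels)
target_height = int(round(img.shape[0] * scale))
resized = cv2.resize(img, (target_width, target_height), interpolation=cv2.INTER_LINEAR)  # dsize is (width, height)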
In a computer vision course, the teacher says that, first of all, the image should be normalized to remove brightness variations.
Here is the link to the video: https://youtu.be/0WNiYrRjJbM
The formula looks like this:
I = I/||I||, where I is the image and ||I|| is the magnitude (norm) of this image.
Could somebody explain how to implement this normalization using Python and any library, OpenCV for instance? Maybe such a function already exists in some library, ready to use?
What I think is that the magnitude of an image is calculated as m = sqrt(sum(v*v)), where v is the array of values for each point after converting the image to HSV. Then I = v/m, i.e. each point value is divided by the magnitude. But this doesn't work; the result looks strange.
Thanks.
Below is a small piece of code I wrote which does image normalization.
import numpy as np
import cv2
img = cv2.imread("../images/segmentation/peppers_BlueHills.png")
print("img shape = ", img.shape)
print("img type = ", img.dtype)
print("img[0][0]", img[0][0])
#2-norm
norm = np.linalg.norm(img)
print("img norm = ", norm)
img2 = img / norm
#here img2 becomes float64, reducing it to float32
img2 = np.float32(img2)
print("img2 type = ", img2.dtype)
print("img2[0][0]", img2[0][0])
cv2.imwrite('../images/segmentation/NormalizedPeppers_BlueHills.tif', img2)
cv2.imshow('normalizedImg', img2.astype(np.uint8))
cv2.waitKey(0)
cv2.destroyAllWindows()
exit(0)
The output looks like below:
img shape = (384, 512, 3)
img type = uint8
img[0][0] [64 29 62]
img norm = 78180.45637497904
img2 type = float32
img2[0][0] [0.00081862 0.00037094 0.00079304]
The output image looks like a black square.
However, it's possible to equalize the brightness in Photoshop, for instance, to see something.
Each channel (R, G, B) becomes float, and only the TIFF format supports that.
To me it's still not clear what we gain by dividing each pixel's brightness by some value, in this case the 2-norm of the image. It just makes the image too dark and unreadable, and it doesn't equalize the brightness to make it even across the entire image.
What do you think?
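For what it's worth, if the goal is just to view the normalized float image, a min-max rescale back to the 0-255 range makes it visible again; a sketch building on the snippet above (it only changes the display, not the normalization itself):

import cv2
import numpy as np

# img2 is the normalized float32 image from the snippet above (values around 1e-3).
display = cv2.normalize(img2, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
cv2.imshow('normalized (rescaled for display)', display)
cv2.waitKey(0)
cv2.destroyAllWindows()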
I have a greyscale image that, as a numpy array, has a maximal value of 91, but if it is first converted from grayscale to RGB, its maximal value (across all channels) is 255. What formula is being used here? When viewing the images using im.show() they look identical. I checked the PIL source code for 'convert' (link) but it doesn't explicitly state how a greyscale image is converted to RGB.
I run the following:
import numpy as np
import PIL.Image

im = PIL.Image.open(path_to_greyscale_image)
im_max_grey = max(np.asarray(im).flatten())
im = im.convert('RGB')
im_max_rgb = max(np.asarray(im).flatten())
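One way to investigate this is to check which mode the file actually opens in and whether the converted channels are identical; a small diagnostic sketch (not from the original post):

import numpy as np
import PIL.Image

im = PIL.Image.open(path_to_greyscale_image)
print(im.mode)  # 'L' is plain 8-bit greyscale; 'P' (palette) and 'I;16' convert differently

rgb = np.asarray(im.convert('RGB'))
# For a plain 'L' image, convert('RGB') simply replicates the single channel,
# so all three channels should match and share the same maximum.
print((rgb[..., 0] == rgb[..., 1]).all(), (rgb[..., 1] == rgb[..., 2]).all())
print(rgb.max())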
I encountered this puzzling situation when trying to get rid of the third dimension (the RGB dimension) of my images in order to feed them to a Knn classifier for face recognition.
I took one colored face image from the Labeled Faces in the Wild database as an example. It is saved locally.
I first imported the image, then converted it to grayscale, then checked its dimensions (time1), then exported it with "imwrite", then imported the grayscale image again, then checked its dimensions again (time2).
At (time1) the shape was 2-dimensional: (250, 250). However, at (time2) the shape became 3-dimensional: (250, 250, 3). Why would exporting and importing change the dimensions of the grayscale picture? What should I specify when importing the grayscale picture to keep it 2-dimensional?
Here is my python code:
import cv2
import matplotlib.pyplot as plt
imgBGR = cv2.imread("path/filename")
gray = cv2.cvtColor(imgBGR, cv2.COLOR_BGR2GRAY)
gray.shape # this gives me (250, 250)
cv2.imwrite("path/newname", gray)
gray2 = cv2.imread("path/newname")
gray2.shape # this gives me (250, 250, 3)
Try gray2 = cv2.imread("path/newname", cv2.IMREAD_GRAYSCALE)
As per the OpenCV imread documentation, the default flag is cv2.IMREAD_COLOR, so without setting a flag cv2.imread reads the image in colour and will expand a greyscale file into 3 identical channels.
By specifying cv2.imread("path/newname", cv2.IMREAD_GRAYSCALE), the function will read the image in grayscale.
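A quick check of both behaviours, building on the code above (the paths are the same placeholders used in the question):

import cv2

gray2_color = cv2.imread("path/newname")                  # default flag: cv2.IMREAD_COLOR
gray2 = cv2.imread("path/newname", cv2.IMREAD_GRAYSCALE)  # force a single-channel read
print(gray2_color.shape)  # (250, 250, 3)
print(gray2.shape)        # (250, 250)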
I am trying to resize a .jpg image with the skimage.transform.resize function. The function returns a weird result (see image below). I am not sure if it is a bug or just the wrong use of the function.
import numpy as np
from PIL import Image
from skimage import io, color
from skimage.transform import resize
rgb = io.imread("../../small_dataset/" + file)
# show original image
img = Image.fromarray(rgb, 'RGB')
img.show()
rgb = resize(rgb, (256, 256))
# show resized image
img = Image.fromarray(rgb, 'RGB')
img.show()
Original image:
Resized image:
I already checked skimage resize giving weird output, but I think that my bug has different properties.
Update: The rgb2lab function also shows a similar bug.
The problem is that skimage converts the pixel data type of your array when resizing the image. The original image has 8 bits per pixel, of type numpy.uint8, and the resized pixels are numpy.float64 variables.
The resize operation is correct, but the result is not being displayed correctly. To solve this issue, I propose 2 different approaches:
To change the data type of the resulting image. Before casting to uint8 values, the pixels have to be converted to a 0-255 scale, as they are on a normalized 0-1 scale:
# ...
# Do the OP operations ...
resized_image = resize(rgb, (256, 256))
# Convert the image to a 0-255 scale.
rescaled_image = 255 * resized_image
# Convert to integer data type pixels.
final_image = rescaled_image.astype(np.uint8)
# show resized image
img = Image.fromarray(final_image, 'RGB')
img.show()
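As a side note, skimage also ships a helper that does the scaling and casting in one step; a sketch of that variant, assuming the same resized_image as above:

from PIL import Image
from skimage.util import img_as_ubyte

# img_as_ubyte maps the float 0-1 range to uint8 0-255.
final_image = img_as_ubyte(resized_image)
Image.fromarray(final_image, 'RGB').show()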
Update: This method is deprecated, as per scipy.misc.imshow
To use another library for displaying the image. Taking a look at the PIL Image documentation, there isn't any mode supporting 3×float64 pixel images. However, the scipy.misc library has the appropriate tools for converting the array format in order to display it correctly:
from scipy import misc
# ...
# Do OP operations
misc.imshow(resized_image)
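Since scipy.misc.imshow has been deprecated and removed in newer SciPy versions, a present-day alternative (not part of the original answer) is matplotlib, which can display RGB float images in the 0-1 range directly:

import matplotlib.pyplot as plt

# matplotlib accepts RGB float images with values in [0, 1] as-is.
plt.imshow(resized_image)
plt.axis('off')
plt.show()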