I am trying to resize a .jpg image with the skimage.transform.resize function. The function returns a weird result (see image below). I am not sure whether it is a bug or just incorrect use of the function.
import numpy as np
from PIL import Image
from skimage import io, color
from skimage.transform import resize
rgb = io.imread("../../small_dataset/" + file)
# show original image
img = Image.fromarray(rgb, 'RGB')
img.show()
rgb = resize(rgb, (256, 256))
# show resized image
img = Image.fromarray(rgb, 'RGB')
img.show()
Original image:
Resized image:
I already checked "skimage resize giving weird output", but I think my bug has different properties.
Update: The rgb2lab function also has a similar bug.
The problem is that skimage converts the pixel data type of your array when it resizes the image. The original image has 8 bits per channel, of type numpy.uint8, while the resized pixels are numpy.float64 values.
The resize operation is correct, but the result is not displayed correctly. To solve this issue, I propose two different approaches:
To change the data type of the resulting image. Before casting to uint8, the pixels have to be converted to a 0-255 scale, as they are on a 0-1 normalized scale:
# ...
# Do the OP operations ...
resized_image = resize(rgb, (256, 256))
# Convert the image to a 0-255 scale.
rescaled_image = 255 * resized_image
# Convert to integer data type pixels.
final_image = rescaled_image.astype(np.uint8)
# show resized image
img = Image.fromarray(final_image, 'RGB')
img.show()
To use another library for displaying the image. Taking a look at the Image library documentation, there isn't any mode supporting 3xfloat64 pixel images. However, the scipy.misc library has the appropriate tools for converting the array format in order to display it correctly:
from scipy import misc
# ...
# Do OP operations
misc.imshow(resized_image)
Update: This second method is deprecated, as per the scipy.misc.imshow documentation, and the function has been removed in later SciPy releases.
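The first approach above (rescale to 0-255, then cast) can be sketched with NumPy alone. This is only a minimal illustration: the helper name float_to_uint8 and the tiny sample array are made up here, and the clipping and rounding steps are an addition that the original answer does not perform.

```python
import numpy as np

def float_to_uint8(image):
    """Map a float image on a 0-1 scale to uint8 on a 0-255 scale (hypothetical helper)."""
    # Guard against values slightly outside [0, 1], which some interpolation modes can produce.
    clipped = np.clip(image, 0.0, 1.0)
    # Round to the nearest integer instead of truncating, then cast.
    return np.rint(clipped * 255).astype(np.uint8)

# Stand-in for the float64 array returned by resize().
resized_image = np.array([[0.0, 0.5, 1.0]])
final_image = float_to_uint8(resized_image)
print(final_image.dtype, final_image)
```

The resulting uint8 array is the kind of input that Image.fromarray(final_image, 'RGB') expects once it has three channels.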
I have the following image:
Original Image
I am using the following code to resize this image to 1600x1200.
img = cv2.imread('R.png')
gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray_image.resize(1600,1200)
I am then returned the following image:
Final Image
I have tried to fix this by using different image formats (jpg, tif), but this does not seem to help. I also tried using different interpolation algorithms like INTER_NEAREST and INTER_LINEAR, and these produce the same results.
Does anyone have an idea?
You are calling the resize() method of the NumPy array that represents the grayscale image. That method only changes the shape of the array; it does not interpolate any pixel values. You should use the resize() function from OpenCV instead:
img = cv2.imread('R.png')
resized_image = cv2.resize(img, (1600, 1200), interpolation = cv2.INTER_LINEAR)
Besides that, I think you have mistakenly swapped the width and height of the image; it should be 1200 x 1200 to keep the scale.
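To see why the array method produces a scrambled picture, here is a small NumPy-only sketch. The tiny array and the variable names are made up for illustration, and np.repeat is used only as a crude nearest-neighbour stand-in for cv2.resize, not as what OpenCV does internally.

```python
import numpy as np

# A tiny 2x3 "grayscale image" with easily recognisable values.
gray = np.arange(6, dtype=np.uint8).reshape(2, 3)

# ndarray.resize reinterprets the flat buffer under a new shape and pads with zeros.
scrambled = gray.copy()
scrambled.resize((4, 6), refcheck=False)

# A real resize computes new pixel values; here each pixel is repeated 2x in both directions.
upscaled = np.repeat(np.repeat(gray, 2, axis=0), 2, axis=1)

print(scrambled)
print(upscaled)
```

In the first result the original six values sit at the start of the first row and everything else is zero, while the second result still looks like the original picture, only larger.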
In a computer vision course, the teacher says that, first of all, an image should be normalized to remove brightness variations.
The link to the video: https://youtu.be/0WNiYrRjJbM
The formula looks like below:
I = I/||I||, where I is an image, ||I|| is the magnitude of this image.
Could somebody explain how to implement this normalization using Python and any library, OpenCV for instance? Maybe such a function already exists in some library and is ready to use?
What I think is that the magnitude of an image is calculated as m=sqrt(sum(v*v)), where v is the array of values for each point after converting the image to HSV. Then I=v/m, that is, each point value is divided by the magnitude. But this doesn't work. The result looks strange.
Thanks.
Below is a small piece of code I wrote that performs the image normalization.
import numpy as np
import cv2
img = cv2.imread("../images/segmentation/peppers_BlueHills.png")
print("img shape = ", img.shape)
print("img type = ", img.dtype)
print("img[0][0]", img[0][0])
#2-norm
norm = np.linalg.norm(img)
print("img norm = ", norm)
img2 = img / norm
#here img2 becomes float64, reducing it to float32
img2 = np.float32(img2)
print("img2 type = ", img2.dtype)
print("img2[0][0]", img2[0][0])
cv2.imwrite('../images/segmentation/NormalizedPeppers_BlueHills.tif', img2)
cv2.imshow('normalizedImg', img2.astype(np.uint8))
cv2.waitKey(0)
cv2.destroyAllWindows()
exit(0)
The output looks like below:
img shape = (384, 512, 3)
img type = uint8
img[0][0] [64 29 62]
img norm = 78180.45637497904
img2 type = float32
img2[0][0] [0.00081862 0.00037094 0.00079304]
The output image looks like a black square.
However, it is possible to equalize the brightness in Photoshop, for instance, to see something.
Each channel (R,G,B) becomes float, and only the TIFF format supports that.
To me it is still not clear what we gain by dividing each pixel's brightness by some value, in this case the 2-norm of the image. It just makes the image too dark and unreadable. It doesn't equalize the brightness to make it even across the entire image.
What do you think?
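The black square can be reproduced without OpenCV. The sketch below uses a small random array as a stand-in for the real image (an assumption, as are all variable names): after dividing by the 2-norm every value is far below 1, so casting to uint8 truncates everything to 0. The min-max stretch at the end is only one possible way to make the result viewable; it is not part of the formula from the lecture.

```python
import numpy as np

# Random stand-in for an 8-bit colour image (assumption: real data would come from cv2.imread).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# I / ||I||: divide by the 2-norm taken over all pixel values.
norm = np.linalg.norm(img.astype(np.float64))
normalized = img / norm

# Casting the tiny float values straight to uint8 truncates them all to 0 (a black image).
as_uint8 = normalized.astype(np.uint8)

# Stretch to the full 0-255 range purely for display purposes.
lo, hi = normalized.min(), normalized.max()
display = np.rint((normalized - lo) / (hi - lo) * 255).astype(np.uint8)

print(normalized.max(), as_uint8.max(), display.max())
```

After the division the image has unit 2-norm, which is what the formula asks for; the darkness is a side effect of how the values are converted for display.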
I found this method really helpful, and it actually works quite accurately. However, it uses OpenCV, and I want to use the same method with PIL.
Code using PIL instead of OpenCV:
from PIL import Image
import numpy as np
###test image
img=Image.open('')
img=img.load()
### splitting b,g,r channels
r,g,b=img.split()
### getting differences between (b,g), (r,g), (b,r) channel pixels
r_g=np.count_nonzero(abs(r-g))
r_b=np.count_nonzero(abs(r-b))
g_b=np.count_nonzero(abs(g-b))
### sum of differences
diff_sum=float(r_g+r_b+g_b)
### finding ratio of diff_sum with respect to size of image
ratio=diff_sum/img.size
if ratio>0.005:
    print("image is color")
else:
    print("image is greyscale")
I changed cv2.imread('') to Image.open('') and added img=img.load().
I also changed b,g,r=cv2.split(img) to r,g,b=img.split().
I know that the split() method exists in PIL, but I'm getting this error:
AttributeError: 'PixelAccess' object has no attribute 'split'
How can I solve this?
Thank you in advance!!
You are mixing data types like you are mixing Red Bull and Vodka.
The load method is producing the error because it converts the PIL image into a PixelAccess object, and you need a PIL image for split(). Also, count_nonzero() does not work because it operates on NumPy arrays, and you are attempting to call it on a PIL image. Lastly, size returns a tuple (width and height) of the image, so you need to modify your code accordingly:
from PIL import Image
import numpy as np
###test image
img=Image.open("D://opencvImages//lena512.png")
### splitting r,g,b channels
r,g,b=img.split()
### PIL to numpy conversion:
r = np.array(r)
g = np.array(g)
b = np.array(b)
### getting differences between (b,g), (r,g), (b,r) channel pixels
r_g=np.count_nonzero(abs(r-g))
r_b=np.count_nonzero(abs(r-b))
g_b=np.count_nonzero(abs(g-b))
### sum of differences
diff_sum=float(r_g+r_b+g_b)
### get image size:
width, height = img.size
### get total pixels on image:
totalPixels = width * height
### finding ratio of diff_sum with respect to size of image
ratio = diff_sum/totalPixels
### ratio is a float, so convert it before concatenating
print("Ratio is: " + str(ratio))
if ratio>0.005:
    print("image is color")
else:
    print("image is greyscale")
Let's check out the Lena image in color and grayscale:
Color Lena returns this:
Ratio is: 2.981109619140625
image is color
And Grayscale Lena returns this:
Ratio is: 0.0
image is greyscale
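The same ratio can also be computed on a single NumPy array, which avoids splitting the PIL image first. This is a minimal sketch under assumptions: the function name channel_difference_ratio is made up, the input is assumed to be an (height, width, 3) array such as np.array(img) would give for an RGB image, and the channels are compared with != rather than by subtracting them.

```python
import numpy as np

def channel_difference_ratio(rgb):
    """Count differing channel pairs per pixel, relative to the pixel count (hypothetical helper)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Comparing directly sidesteps any unsigned wrap-around from subtracting uint8 values.
    diff_sum = (np.count_nonzero(r != g)
                + np.count_nonzero(r != b)
                + np.count_nonzero(g != b))
    total_pixels = rgb.shape[0] * rgb.shape[1]
    return diff_sum / total_pixels

# Synthetic test images: one with identical channels, one with a different red channel.
grey = np.full((8, 8, 3), 120, dtype=np.uint8)
color = grey.copy()
color[..., 0] = 200

for name, image in (("grey", grey), ("color", color)):
    ratio = channel_difference_ratio(image)
    print(name, ratio, "color" if ratio > 0.005 else "greyscale")
```

The 0.005 threshold is taken from the code above; whether it suits a particular dataset is something to check against real images.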
I'm trying to resize images retrieved from CIFAR-10 from the original 32x32 to 96x96 for use with MobileNetV2; however, I'm running into this error. I tried a variety of solutions, but nothing seems to work.
My code:
for a in range(len(train_images)):
    train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
Error I'm getting:
----> 8 train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
ValueError: could not broadcast input array from shape (96,96,3) into shape (32,32,3)
Sometimes you have to convert the image from RGB to grayscale. If that is the problem, all you need to do is call gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), resize the image, and then convert back with resized_image = cv2.cvtColor(gray_image, cv2.COLOR_GRAY2RGB).
I have never run into this error, but if the first option doesn't work, you can try resizing the image with Pillow like this:
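import cv2
import numpy
from PIL import Image
# cv2_image is the image array you want to resize
im = Image.fromarray(cv2_image)
nx, ny = im.size
# double the width and height using Lanczos resampling
im2 = im.resize((nx*2, ny*2), Image.LANCZOS)
cv2_image = cv2.cvtColor(numpy.array(im2), cv2.COLOR_RGB2BGR)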
You can make this into a function and call it in the list comprehension. I hope this solves your problem :)
This is simply because you are reading the 32x32 image from train_images and trying to save the resized image (96x96) in the same array, which is impossible because every slot in that array has the fixed shape (32, 32, 3).
Try something like:
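num_images = len(train_images)
# allocate a new array that is large enough for the resized images
train_images_reshaped = np.zeros((num_images, 96, 96, 3), dtype=train_images.dtype)
for a in range(num_images):
    train_images_reshaped[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)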
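The failure and the fix can both be reproduced with NumPy alone. In this sketch the zero-filled train_images array and the helper upscale_nearest are stand-ins invented for illustration; upscale_nearest replaces cv2.resize with simple pixel repetition so the example has no OpenCV dependency.

```python
import numpy as np

# Stand-in for a batch of CIFAR-10 images (assumption: 5 blank 32x32 RGB images).
train_images = np.zeros((5, 32, 32, 3), dtype=np.uint8)
min_size = 96

def upscale_nearest(image, factor):
    """Enlarge an image by repeating each pixel `factor` times (stand-in for cv2.resize)."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

# Writing a 96x96 result back into a 32x32 slot fails, as in the question.
try:
    train_images[0] = upscale_nearest(train_images[0], 3)
    in_place_error = None
except ValueError as exc:
    in_place_error = str(exc)

# Writing into a separately allocated array of the target shape works.
resized = np.empty((len(train_images), min_size, min_size, 3), dtype=train_images.dtype)
for i, image in enumerate(train_images):
    resized[i] = upscale_nearest(image, min_size // image.shape[0])

print(in_place_error)
print(resized.shape)
```

Keeping the original array untouched also means the 32x32 data is still available if a different target size is needed later.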
OpenCV provides several interpolation algorithms, such as:
INTER_NEAREST – a nearest-neighbor interpolation
INTER_LINEAR – a bilinear interpolation (used by default)
INTER_AREA – resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC – a bicubic interpolation over 4×4 pixel neighborhood
INTER_LANCZOS4 – a Lanczos interpolation over 8×8 pixel neighborhood
Code:
image_scaled=cv2.resize(image,None,fx=.75,fy=.75,interpolation = cv2.INTER_LINEAR)
img_double=cv2.resize(image,None,fx=2,fy=2,interpolation=cv2.INTER_CUBIC)
image_resize=cv2.resize(image,(200,300),interpolation=cv2.INTER_AREA)
image_resize=cv2.resize(image,(500,400),interpolation=cv2.INTER_LANCZOS4)
You can find the details about python implementation here as well: How to resize images in OpenCV python
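The calls above specify the output size in two different ways: with scale factors (fx, fy) or with an explicit dsize, which is ordered (width, height) while NumPy shapes are ordered (rows, columns). The helper below is a hypothetical, pure-Python sketch for predicting the resulting array shape; it does not call OpenCV, and Python's round() may differ from OpenCV's rounding when a scaled size lands exactly on .5.

```python
def expected_shape(src_shape, dsize=None, fx=0.0, fy=0.0):
    """Predict the NumPy shape of a resized image (hypothetical helper, not an OpenCV API)."""
    rows, cols = src_shape[:2]
    if dsize is not None:
        # dsize is given as (width, height), the reverse of the (rows, cols) shape order.
        width, height = dsize
    else:
        # With dsize=None, the size is derived from the scale factors.
        width, height = round(cols * fx), round(rows * fy)
    return (height, width) + tuple(src_shape[2:])

# A 300-row by 400-column colour image.
source_shape = (300, 400, 3)
print(expected_shape(source_shape, dsize=(200, 300)))
print(expected_shape(source_shape, fx=0.75, fy=0.75))
```

Checking the shape this way before resizing a whole dataset is a cheap guard against accidentally swapping width and height.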
I'm trying to open an RGB picture, convert it to grayscale, and then represent it as a list of floats scaled from 0 to 1. Finally, I want to convert it back to an Image. However, in the code below, something in my conversion procedure fails: img.show() (the original image) displays correctly, while img2.show() displays an all-black picture. What am I missing?
import numpy as np
from PIL import Image
ocr_img_path = "./ocr-test.jpg"
# Open image, convert to grayscale
img = Image.open(ocr_img_path).convert("L")
# Convert to list
img_data = img.getdata()
img_as_list = np.asarray(img_data, dtype=float) / 255
img_as_list = img_as_list.reshape(img.size)
# Convert back to image
img_mul = img_as_list * 255
img_ints = np.rint(img_mul)
img2 = Image.new("L", img_as_list.shape)
img2.putdata(img_ints.astype(int))
img.show()
img2.show()
The image used
The solution is to flatten the array before putting it into the image. I think PIL interprets multidimensional arrays as different color bands.
img2.putdata(img_ints.astype(int).flatten())
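The numeric part of the round trip (8-bit values to floats between 0 and 1 and back to a flat sequence of integers) can be checked with NumPy alone. The sample values and the 2x4 shape below are made up for illustration; the flat result is the kind of one-dimensional sequence that putdata() is given in the fix above.

```python
import numpy as np

# Made-up 8-bit grayscale values standing in for img.getdata() on a 4x2 image.
width, height = 4, 2
original = np.array([0, 17, 64, 128, 129, 200, 254, 255], dtype=np.uint8)

# Scale to floats in [0, 1]; pixel data is row-major, so the 2-D shape is (height, width).
img_as_floats = (original / 255).reshape(height, width)

# Scale back, round to the nearest integer, and flatten to one value per pixel.
img_ints = np.rint(img_as_floats * 255).astype(np.uint8)
flat = img_ints.flatten()

print(flat.tolist())
```

Note that the 2-D shape here is (height, width), whereas img.size in Pillow is (width, height); the two only coincide for square images.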
For a more efficient way of loading images, check out
https://blog.eduardovalle.com/2015/08/25/input-images-theano/
but use image.tobytes() (Pillow) instead of image.tostring() (PIL).