I'm trying to mimic what these lines of code do, using imageio:
img_array = scipy.misc.imread('/Users/user/Desktop/IMG_5.png', flatten=True)
img_data = 255.0 - img_array.reshape(784)
However when using imageio I get:
img = imageio.imread('/Users/user/Desktop/IMG_5.png')
img.flatten()
Output: Image([212, 211, 209, ..., 192, 190, 191], dtype=uint8)
img.reshape(1, 784)
ValueError: cannot reshape array of size 2352 into shape (1,784)
Can someone explain what is going on here? Why is my image size 2352? I resized the image to 28x28 pixels before importing it.
I know this question already has an accepted answer; however, it suggests using the skimage library instead of imageio, which is what the question (and scipy) refer to. So here it goes.
According to imageio's documentation on translating from scipy, you should replace the flatten argument with the as_gray argument.
So this line:
img_array = scipy.misc.imread('/Users/user/Desktop/IMG_5.png', flatten=True)
should give you the same result as this:
img_array = imageio.imread('/Users/user/Desktop/IMG_5.png', as_gray=True)
It worked for me. If it didn't work for you, perhaps there is another problem. Providing an image as an example might help.
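Putting it together, a minimal sketch of the imageio equivalent of the original scipy snippet (assuming the same 28x28 PNG path as in the question):
import imageio
img_array = imageio.imread('/Users/user/Desktop/IMG_5.png', as_gray=True)
img_data = 255.0 - img_array.reshape(784)  # as_gray returns a single-channel image, so 28*28 = 784 values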
An RGB image has three channels, so 784 pixels times three is 2352. Also, img.flatten() returns a new array rather than flattening in place, so you should save the result in a variable: img_flat = img.flatten(). Note that flattening alone does not merge the three color channels into a single gray-scale channel, so you still need a gray-scale conversion (as above) before you can reshape to 784 values.
Edit: It's probably going to be easier to just use skimage in the same fashion you used the deprecated scipy:
from skimage import transform,io
# read in grey-scale
grey = io.imread('your_image.png', as_gray=True)  # as_grey was renamed to as_gray in newer scikit-image
# resize to 28x28
small_grey = transform.resize(grey, (28,28), mode='symmetric', preserve_range=True)
# reshape to (1,784)
reshape_img = small_grey.reshape(1, 784)
I have some code that I am using with tensorflow datasets.
It worked fine previously and it may still work, but I don't think it does:
img = parse_image(img_paths[0])
img = tf.image.resize(img, [224, 224])
plt.imshow(img)
Just outputs a blank 224x224 canvas.
img = parse_image(img_paths[0])
plt.imshow(img)
outputs the image correctly.
img_paths is a list of strings with pathnames
I have tried:
img = parse_image(img_paths[0])
img = tf.image.resize([img], [224, 224])
plt.imshow(img[0])
and
img = parse_image(img_paths[0])
img = tf.image.resize(img, [224, 224])
plt.imshow(img.numpy())
and
img = parse_image(img_paths[0])
img = tf.image.resize([img], [224, 224])
plt.imshow(img.numpy()[0])
The shape is correct and this code has worked before.
It may still work; I'm thinking I may not be using it correctly anymore (it's been a while since I wrote it).
Thanks for any hints or thoughts you can provide, and of course solutions ;-)
The "problem" is with Matplotlib. When you resize with Tensorflow, it turns your input to float. Matplotlib accepts two image formats, integers between 0-255 and floats between 0 and 1. If you call plt.imshow() on floats of more than 1, it will clip all values and you'll see a white image. It's like you would give Matplotlib only pixels at 1.0 (or 255).
tf.image.convert_image_dtype has a saturate argument, and its default value makes it that the 0-255 integer range becomes 0-1 float. This is why it "works", because Matplotlib understands that format. After this, the Tensorflow resizing operation keeps it between 0-1 too, so it works.
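To illustrate the two accepted formats, a minimal sketch (assuming your parse_image returns uint8 pixel values):
resized = tf.image.resize(img, [224, 224])      # result is float32 but still in the 0-255 range
plt.imshow(resized.numpy().astype("uint8"))     # cast back so Matplotlib treats it as 0-255 integers
# or, equivalently, normalize to 0-1 floats:
plt.imshow(resized.numpy() / 255.0)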
Huh,
I saw something elsewhere and added this line:
img = tf.image.convert_image_dtype(img, tf.float32)
before resizing and it worked.
This is extremely weird because I didn't need this line before. Maybe due to a version update?
Either way this works:
img = parse_image(train_img_paths[0])
img = tf.image.convert_image_dtype(img, tf.float32)
img = tf.image.resize(img, [224, 224])
plt.imshow(img)
I have an array of image pixel values that I would like to upscale for input into my neural network. It is an array of shape (28000, 48, 48, 1). These are normalized image pixel values, and I would like to upscale them to a higher resolution for input into my CNN. The arrays look like this...
array([[[[-0.6098866 ],
[-0.4592209 ],
[-0.40325198],
...,
[-0.7694696 ],
[-0.90518403],
[-0.95160526]],
[[-0.66049284],
[-0.68162924],
[-0.694159 ],
Both my X_train and y_train image arrays have a shape of (28000, 48, 48, 1). I would like to upscale or resize these 28000 image arrays to size 75x75. Please help. Should I convert the arrays back to non-normalized arrays or images and then maybe use cv2 to upscale? How would I do this?
One easy way to resize images is using the Python module PIL (Python Image Library), which you can install with pip install pillow. Example below to demonstrate resizing a single image:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
# Open image
panda_pil = Image.open("panda.jpg")
print(np.array(panda_pil).shape)
# (613, 696, 3)
panda_pil_resized = panda_pil.resize((75, 75))
print(np.array(panda_pil_resized).shape)
# (75, 75, 3)
plt.imshow(np.array(panda_pil_resized))
plt.show()
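If interpolation quality matters for the upscaling, resize also accepts a resampling filter; for example (a small addition, not in the original answer):
panda_pil_resized = panda_pil.resize((75, 75), Image.LANCZOS)  # explicitly choose Lanczos resampling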
You can download the panda image as follows:
import urllib.request
panda_fname = "panda.jpg"
panda_url = "https://upload.wikimedia.org/wikipedia/commons/f/fe/Giant_Panda_in_Beijing_Zoo_1.JPG"
urllib.request.urlretrieve(panda_url, panda_fname)
To resize all 28000 images, one approach would be to do this as a preprocessing step in a for-loop, and save the images to a numpy array.
Edit: You can loop through your original 28000x2304 image array and upscale each image individually in a for-loop. To get a PIL.Image object from a np.ndarray object, you can use PIL.Image.fromarray, as shown below (I have just generated a random array of Gaussian noise, but it should work the same with your images):
import numpy as np
from PIL import Image
from time import perf_counter
old_width, old_height = 48, 48
new_width, new_height = 75, 75
num_images = 28000
old_image_array = np.random.normal(size=[num_images, old_width*old_height])
new_image_array = np.empty(shape=[num_images, new_width*new_height])
print("Starting conversion...")
t0 = perf_counter()
# Loop over each image individually
for i in range(num_images):
    # Get the ith image and reshape
    old_image = old_image_array[i].reshape(old_width, old_height)
    # Convert to PIL.Image
    old_image_pil = Image.fromarray(old_image)
    # Upscale resolution
    new_image_pil = old_image_pil.resize((new_width, new_height))
    # Convert to numpy array
    new_image = np.array(new_image_pil)
    # Reshape and store in new image array
    new_image_array[i] = new_image.reshape(new_width*new_height)
t1 = perf_counter()
print("Time taken = {:.3f} s".format(t1 - t0))
print(old_image_array.shape, new_image_array.shape)
Console output:
Starting conversion...
Time taken = 2.771 s
(28000, 2304) (28000, 5625)
There may well be a more efficient way of doing this, but this method is simple and uses tools which are useful to know about if you don't already (PIL is a good module for manipulating images).
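Since the arrays in the question are already shaped (28000, 48, 48, 1) and normalized, here is another sketch, assuming OpenCV is available; cv2.resize accepts float arrays directly, so no de-normalization is needed:
import numpy as np
import cv2

X_train = np.random.normal(size=(1000, 48, 48, 1)).astype(np.float32)  # stand-in; your real array is (28000, 48, 48, 1)
upscaled = np.empty((X_train.shape[0], 75, 75, 1), dtype=np.float32)
for i in range(X_train.shape[0]):
    resized = cv2.resize(X_train[i], (75, 75), interpolation=cv2.INTER_CUBIC)  # returns shape (75, 75)
    upscaled[i] = resized[..., np.newaxis]  # restore the trailing channel axis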
I have an image stored in a numpy array that I want to convert to PIL.Image in order to perform an interpolation only available with PIL.
When trying to convert it through Image.fromarray() it raises the following error:
TypeError: Cannot handle this data type
I have read the answers here and here but they do not seem to help in my situation.
What I'm trying to run:
from PIL import Image
x # a numpy array representing an image, shape: (256, 256, 3)
Image.fromarray(x)
tl;dr
Does x contain uint values in [0, 255]? If not, and especially if x ranges from 0 to 1, that is the reason for the error.
Explanation
Most image libraries (e.g. matplotlib, opencv, scikit-image) have two ways of representing images:
as uint with values ranging from 0 to 255.
as float with values ranging from 0 to 1.
The latter is more convenient when performing operations between images and thus is more popular in the field of Computer Vision.
However, PIL does not seem to support it for RGB images.
If you take a look here, it seems that when you try to create an image from an array with a shape of (height, width, 3), PIL automatically assumes it's an RGB image and expects it to have a dtype of uint8!
In your case, however, you have an RGB image with float values from 0 to 1.
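A quick way to check which representation you have (a small diagnostic, assuming x is the array from the question):
print(x.dtype, x.min(), x.max())
# e.g. "float64 0.0 1.0" would be the float-in-[0, 1] case described above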
Solution
You can fix it by converting your image to the format expected by PIL:
im = Image.fromarray((x * 255).astype(np.uint8))
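If you need floats in [0, 1] back after the PIL-only operation, a minimal round-trip sketch (the resize call and target size are just placeholders for whatever interpolation you actually need):
import numpy as np
from PIL import Image

im = Image.fromarray((x * 255).astype(np.uint8))    # float [0, 1] -> uint8 [0, 255]
im = im.resize((128, 128), Image.LANCZOS)           # any PIL-only operation
x_back = np.asarray(im).astype(np.float32) / 255.0  # back to float [0, 1]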
I solved it a different way.
Problem Situation:
When working with a gray or binary image, this error is also raised if the numpy array shape is (height, width, 1).
For example, for a 32 by 32 pixel gray image (values 0 to 255):
np_img = np.random.randint(low=0, high=255, size=(32, 32, 1), dtype=np.uint8)
# np_img.shape == (32, 32, 1)
pil_img = Image.fromarray(np_img)
will raise TypeError: Cannot handle this data type: (1, 1, 1), |u1
Solution:
If the image shape is like (32, 32, 1), reduce the dimensions to (32, 32):
np_img = np.squeeze(np_img, axis=2) # axis=2 is channel dimension
pil_img = Image.fromarray(np_img)
This time it works!!
Additionally, please make sure the dtype is uint8 (for gray) or bool (for binary).
In my case it was only because I forgot to add the "RGB" arg in the "fromarray" func.
pil_img = Image.fromarray(np_img, 'RGB')
I found a different cause for the same error in my case. The image I used was in RGBA format, so before using the fromarray() function, just convert it to RGB using the convert() function and it will work perfectly fine.
image_file = Image.open(image_file)
image_file = image_file.convert('RGB')
P.S.: Posting this solution as an initial step, before converting the image to a numpy array.
In my case the file format of the images had been changed from png to jpg. It worked well once I corrected the image format of the failing images.
According to scikit-image's documentation, you can convert an image to unsigned byte format, with values in [0, 255], using img_as_ubyte:
from skimage import img_as_ubyte
from PIL import Image
new_image = Image.fromarray(img_as_ubyte(image))
I'm trying to resize images retrieved from cifar10 from the original 32x32 to 96x96 for use with MobileNetV2; however, I'm running into this error. I've tried a variety of solutions but nothing seems to work.
My code:
for a in range(len(train_images)):
    train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
Error I'm getting:
----> 8 train_images[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
ValueError: could not broadcast input array from shape (96,96,3) into shape (32,32,3)
Sometimes you have to convert the image from RGB to grayscale. If that is the problem, the only thing you should do is gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY), then resize the image, and then convert back again with resized_image = cv2.cvtColor(gray_image, cv2.COLOR_GRAY2RGB), as spelled out below.
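Spelled out as a minimal sketch (the 96x96 target size is taken from the question):
import cv2

gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray_resized = cv2.resize(gray_image, (96, 96), interpolation=cv2.INTER_CUBIC)
resized_image = cv2.cvtColor(gray_resized, cv2.COLOR_GRAY2RGB)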
I have never run into this error, but if the first option doesn't work, you can try to resize the image with Pillow like this:
from PIL import Image
im = Image.fromarray(cv2_image)
nx, ny = im.size
im2 = im.resize((nx*2, ny*2), Image.LANCZOS)
cv2_image = cv2.cvtColor(numpy.array(im2), cv2.COLOR_RGB2BGR)
You can make this into a function and call it in the list comprehension. I hope this solves your problem :)
This is simply because you are reading the 32x32 image from train_images and trying to save the resized image (96x96) back into the same array, which is impossible!
Try something like:
train_images_reshaped = np.zeros((num_images, 96, 96, 3))  # np.zeros allocates the array; np.array((num_images, 96, 96, 3)) would just create a 4-element array
for a in range(len(train_images)):
    train_images_reshaped[a] = cv2.resize(train_images[a], dsize=(minSize, minSize), interpolation=cv2.INTER_CUBIC)
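A hedged alternative, assuming minSize is 96 and train_images is the cifar10 array from the question: build the new array in one go with a list comprehension (as also suggested in the Pillow answer above), which avoids preallocating:
import numpy as np
import cv2

train_images_96 = np.array([
    cv2.resize(img, dsize=(96, 96), interpolation=cv2.INTER_CUBIC)
    for img in train_images
])  # shape becomes (num_images, 96, 96, 3)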
There are some interpolation algorithms in OpenCV. Such as-
INTER_NEAREST – a nearest-neighbor interpolation
INTER_LINEAR – a bilinear interpolation (used by default)
INTER_AREA – resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC – a bicubic interpolation over a 4×4 pixel neighborhood
INTER_LANCZOS4 – a Lanczos interpolation over an 8×8 pixel neighborhood
Code:
image_scaled = cv2.resize(image, None, fx=0.75, fy=0.75, interpolation=cv2.INTER_LINEAR)
img_double = cv2.resize(image, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
image_resize = cv2.resize(image, (200, 300), interpolation=cv2.INTER_AREA)
image_resize = cv2.resize(image, (500, 400), interpolation=cv2.INTER_LANCZOS4)
You can find the details about python implementation here as well: How to resize images in OpenCV python
I've trained a handwritten-image classifier using the Keras library in Python. Initially I used the standard MNIST dataset for training and testing purposes. But now I want to use my own data set for testing, in which all the images are of size 900*1200*3 instead of 28*28*1.
So I need to reshape all the images before testing. I'm using the following code to reshape, but it gives errors.
Code:
bb = lol.reshape(lol.shape[0], 28, 28, 1).astype('float32')
where lol is my numpy array containing 55 images of shape (900,1200,3)
and the Error log is as following:
ValueError Traceback (most recent call last)
<ipython-input-46-87da95da73e9> in <module>()
24 # # you can show every image
25 # img.show()
---> 26 bb = lol.reshape(lol.shape[0], 28, 28, 1).astype('float32')
27 # model = loaded_model
28 # classes = model.predict(bb)
ValueError: cannot reshape array of size 178200000 into shape (55,28,28,1)
So what am I doing wrong? Can I get accurate predictions even after resizing the large images down to very small 28*28 images? Thanks for the help.
What you are doing is wrong. You can't reshape an array of (55, 900, 1200, 3) into an array of (55, 28, 28, 1), because you are trying to store 55*900*1200*3=178200000 elements in an array that can store only 55*28*28=43120 elements.
You want to do two things:
1) Convert your RGB image (indicated by the last dimension, which is the 3 channels) into grayscale (1 channel). The simplest way to do this is (R+B+G)/3. All Python libraries that deal with images (PIL, OpenCV, skimage, TensorFlow, Keras, etc.) already have this implemented. Example:
from skimage.color import rgb2gray
gray = rgb2gray(original)
2) Resize the image from 900x1200 to 28x28. Again you can do this in all major image-related python libraries. Example:
from skimage.transform import resize
resized = resize(gray, (28,28))
Now if you want to do this in all 55 images you can either write a function that transforms one image and map it across your array, or use a simple for loop and populate your new array one image at a time.
In your case the code should look something like this:
import numpy as np  # plus the rgb2gray and resize imports shown above

num_images = lol.shape[0]  # 55 in your case
resized_images = np.zeros(shape=(num_images, 28, 28, 1))  # your final array
for i in range(num_images):
    gray = rgb2gray(lol[i, :, :, :])      # gray.shape is (900, 1200)
    resized = resize(gray, (28, 28))      # resized.shape is (28, 28)
    resized_images[i, :, :, 0] = resized  # resized_images.shape stays (55, 28, 28, 1)
It would be more intuitive to process each image individually, which would also give you the best chance of preserving some information.
Try using the PIL library:
import numpy
from PIL import Image
lol = numpy.zeros((55,900,1200,3),dtype=numpy.uint8)
new_array = numpy.zeros((lol.shape[0],28,28),dtype=numpy.float32)
for i in range(lol.shape[0]):
    img = Image.fromarray(lol[i])
    img_resize = img.resize((28, 28))
    img_mono = img_resize.convert('L')
    arr = numpy.array(img_mono, dtype=numpy.uint8)
    new_array[i] = arr
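To feed the result into the Keras model from the question, you would then add back the channel dimension, mirroring the reshape line from the question:
bb = new_array.reshape(new_array.shape[0], 28, 28, 1).astype('float32')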