I have an array of normalized image pixel values that I would like to upscale for input into my neural network. The array has shape (28000, 48, 48, 1), and I would like to upscale these images to a higher resolution for input into my CNN. The arrays look like this...
array([[[[-0.6098866 ],
         [-0.4592209 ],
         [-0.40325198],
         ...,
         [-0.7694696 ],
         [-0.90518403],
         [-0.95160526]],

        [[-0.66049284],
         [-0.68162924],
         [-0.694159  ],
         ...
Both my X_train and y_train image arrays have shape (28000, 48, 48, 1). I would like to upscale or resize these 28000 images to 75x75. Please help. Should I convert the arrays back to non-normalized arrays or images and then maybe use cv2 to upscale? How would I do this?
One easy way to resize images is using the Python module PIL (Python Imaging Library), which you can install with pip install pillow. Example below to demonstrate resizing a single image:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
# Open image
panda_pil = Image.open("panda.jpg")
print(np.array(panda_pil).shape)
# (613, 696, 3)
panda_pil_resized = panda_pil.resize((75, 75))  # resize takes (width, height)
print(np.array(panda_pil_resized).shape)
# (75, 75, 3)
plt.imshow(np.array(panda_pil_resized))
plt.show()
You can download the panda image as follows:
import urllib.request
panda_fname = "panda.jpg"
panda_url = "https://upload.wikimedia.org/wikipedia/commons/f/fe/Giant_Panda_in_Beijing_Zoo_1.JPG"
urllib.request.urlretrieve(panda_url, panda_fname)
To resize all 28000 images, one approach would be to do this as a preprocessing step in a for-loop, and save the images to a numpy array.
Edit: You can loop through your original 28000x2304 image array and upscale each image individually in a for-loop. To get a PIL.Image object from a np.ndarray object, you can use PIL.Image.fromarray, as shown below (I have just generated a random array of Gaussian noise, but it should work the same with your images):
import numpy as np
from PIL import Image
from time import perf_counter
old_width, old_height = 48, 48
new_width, new_height = 75, 75
num_images = 28000
old_image_array = np.random.normal(size=[num_images, old_width*old_height])
new_image_array = np.empty(shape=[num_images, new_width*new_height])
print("Starting conversion...")
t0 = perf_counter()
# Loop over each image individually
for i in range(num_images):
    # Get the ith image and reshape
    old_image = old_image_array[i].reshape(old_width, old_height)
    # Convert to PIL.Image
    old_image_pil = Image.fromarray(old_image)
    # Upscale resolution
    new_image_pil = old_image_pil.resize((new_width, new_height))
    # Convert to numpy array
    new_image = np.array(new_image_pil)
    # Reshape and store in new image array
    new_image_array[i] = new_image.reshape(new_width*new_height)
t1 = perf_counter()
print("Time taken = {:.3f} s".format(t1 - t0))
print(old_image_array.shape, new_image_array.shape)
Console output:
Starting conversion...
Time taken = 2.771 s
(28000, 2304) (28000, 5625)
There may well be a more efficient way of doing this, but this method is simple and uses tools which are useful to know about if you don't already (PIL is a good module for manipulating images).
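For example, here is a vectorized sketch using scipy.ndimage.zoom (an assumption: scipy is available; zoom interpolates over all four axes at once, so the batch and channel axes get a factor of 1):

import numpy as np
from scipy.ndimage import zoom

# placeholder batch with the same layout as the question: (N, 48, 48, 1)
images = np.random.normal(size=(1000, 48, 48, 1)).astype(np.float32)

# zoom factor 75/48 on the two spatial axes only; order=1 is bilinear
upscaled = zoom(images, (1, 75 / 48, 75 / 48, 1), order=1)
print(upscaled.shape)  # (1000, 75, 75, 1)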
Related
I am trying to create a random image using NumPy. First I create a random 3D array, shaped as an image should be, e.g. (177, 284, 3).
random_im = np.random.rand(177,284,3)
data = np.array(random_im)
print(data.shape)
Image.fromarray(data)
But when I use Image.fromarray(data), it throws an error.
Just to check whether there was any issue with the shape of the array, I converted an image to an array, copied it to another variable, and converted it back. That gave me the output I was looking for.
img = np.array(Image.open('Sample_imgs/dog4.jpg'))
git = img.copy()
git.shape
Image.fromarray(git)
Both have the same shape, so I don't understand where I am making the mistake.
When I create a 2D array and convert it back, it gives me a black canvas of that size (even though the pixels should not be black).
random_im = np.random.randint(0,256,size=(231,177))
print(random_im)
print(random_im.shape)
Image.fromarray(random_im)
I was able to get this working with the following solution:
import numpy as np
from PIL import Image
random_array = np.random.random_sample((177, 284, 3)) * 255
random_array = random_array.astype(np.uint8)
random_im = Image.fromarray(random_array)
random_im.show()
----EDIT
A more elegant way to get a random array of the correct type without conversions is like so:
import numpy as np
from PIL import Image
random_array = np.random.randint(low=0, high=256, size=(250, 250), dtype=np.uint8)
random_im = Image.fromarray(random_array)
random_im.show()
Which is almost what you were doing in your solution, but you have to specify the dtype to be np.uint8:
random_im = np.random.randint(0,256,size=(231,177),dtype=np.uint8)
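If you instead start from floats in [0, 1), which is what np.random.rand gives you, the usual pattern is to scale and cast before calling fromarray; a minimal sketch:

import numpy as np
from PIL import Image

float_array = np.random.rand(177, 284, 3)           # floats in [0, 1)
uint8_array = (float_array * 255).astype(np.uint8)  # scale to [0, 255] and cast
Image.fromarray(uint8_array).show()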
I am trying to downscale an image using scikit-image. However, I cannot display the downscaled picture with matplotlib's imshow because of its dimensions. Is there a way to prevent this dimension reduction? I have included the script below.
import os, cv2, glob
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from skimage.transform import pyramid_reduce
plt.style.use('dark_background')
img_path = os.path.join(img_base_path, value[0])
img = io.imread(img_path)
resized = pyramid_reduce(img, downscale=4)
print(resized.shape)
img.shape is (240, 240, 3), so what I expect as output is (60, 60, 3). However, what I get is (60, 60, 1).
When I read the documentation of the pyramid_reduce function, I notice the parameter multichannel:
multichannel : bool, optional
    Whether the last axis of the image is to be interpreted as multiple channels or another spatial dimension.
So I would suggest setting that to True; otherwise the function treats your color image as a 3D grayscale volume and downscales the channel axis as well:
resized = pyramid_reduce(img, downscale=4, multichannel=True)
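Note that in recent scikit-image releases (0.19 and later), multichannel is deprecated in favor of channel_axis, so on a newer version the equivalent call would be:

resized = pyramid_reduce(img, downscale=4, channel_axis=-1)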
I saved a numpy array to an image as follows:
plt.imshow(xNext[0,:,:,0]) #xNext has shape (1,64,25,1)
print(xNext[0,:,:,0].shape) #outputs (64,25)
plt.savefig(os.path.join(root,filename)+'.png')
np.save(os.path.join(root,filename)+'.npy',xNext[0,:,:,0])
How can I obtain the same numpy array back from the saved .png image? Could you also show me how this would work if I had saved it as a .jpg image?
I've tried the following, which works with a 3D array (v1), where the resulting image is close to the image produced from the original numpy array (original).
image = Image.open(imageFilename) #brings in as 3D array
box = (315,60,500,540)
image = image.crop(box)
image = image.resize((25,64)) #to correct to desired shape
arr = np.asarray(image)
plt.imshow(arr)
plt.savefig('v1.png')
plt.close()
However, when I convert the 3D array to a 2D array, the resulting image is different (v1b and v1c).
arr2 = arr[:,:,0]
plt.imshow(arr2)
plt.savefig('v1b.png')
plt.close()
arr3 = np.dot(arr[..., :3], [0.299, 0.587, 0.114])  # standard luma weights
plt.imshow(arr3)
plt.savefig('v1c.png')
plt.close()
How can I convert the 3D to 2D correctly? Thanks for your help.
[Images attached: original and v1 (saved from the 3D array); v1b and v1c (saved from the 2D arrays); the original at its original size.]
If your objective is to save a numpy array as an image, your approach has a problem: the function plt.savefig saves an image of the plot, not the array itself. Also, transforming an array into an image may carry some precision loss (when converting from float64 or float32 to uint16). That being said, I suggest you use skimage and imageio:
import imageio
import numpy as np
from skimage import img_as_uint
data = np.load('0058_00086_brown_2_recording1.wav.npy')
print("original", data.shape)
img = img_as_uint(data)
imageio.imwrite('image.png', img)
load = imageio.imread('image.png')
print("image", load.shape)
This script loads the data you provided and prints the shape for verification:
data = np.load('0058_00086_brown_2_recording1.wav.npy')
print("original", data.shape)
then it transforms the data to uint16, saves the image as a png, and loads it back:
img = img_as_uint(data)
imageio.imwrite('image.png', img)
load = imageio.imread('image.png')
the output of the script is:
original (64, 25)
image (64, 25)
i.e., the image is loaded back with the same shape as data. Some notes:
image.png is saved as a grayscale image
To save to .jpg just change to imageio.imwrite('image.jpg', img) (keep in mind that JPEG compression is lossy, so the loaded array will differ from the original more than in the png case)
In the case of .png the absolute average distance from the original image was 3.890e-06 (this can be verified using np.abs(img_as_float(load) - data).sum() / data.size)
Information about skimage and imageio can be found on their respective websites. More on saving numpy arrays as images can be found in the following answers: [1], [2], [3] and [4].
Alternatively:

from scipy.misc import imread
image_data = imread('test.jpg').astype(np.float32)

This should give you the numpy array (note that scipy.misc.imread was removed in SciPy 1.2, so this only works on older SciPy versions).
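A sketch of the same load with imageio instead (an assumption: imageio is installed; its imageio.v2 namespace keeps the classic imread API):

import numpy as np
import imageio.v2 as imageio

image_data = imageio.imread('test.jpg').astype(np.float32)
print(image_data.shape)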
I've trained a handwritten-digit image classifier using the Keras library in Python. Initially I used the standard MNIST dataset for training and testing. But now I want to use my own data set for testing, in which all the images have size 900x1200x3 instead of 28x28x1.
So I need to reshape all the images before testing. I'm using the following code to reshape, but it gives an error.
Code:
bb = lol.reshape(lol.shape[0], 28, 28, 1).astype('float32')
where lol is my numpy array containing 55 images of shape (900,1200,3)
and the Error log is as following:
ValueError Traceback (most recent call last)
<ipython-input-46-87da95da73e9> in <module>()
24 # # you can show every image
25 # img.show()
---> 26 bb = lol.reshape(lol.shape[0], 28, 28, 1).astype('float32')
27 # model = loaded_model
28 # classes = model.predict(bb)
ValueError: cannot reshape array of size 178200000 into shape (55,28,28,1)
So what am I doing wrong? Can I get accurate predictions even after resizing the large images down to very small 28x28 images? Thanks for the help.
What you are doing is wrong. You can't reshape an array of shape (55, 900, 1200, 3) into an array of shape (55, 28, 28, 1), because you are trying to store 55*900*1200*3 = 178,200,000 elements in an array that can store only 55*28*28*1 = 43,120 elements.
You want to do two things:
1) Convert your rgb image (indicated by the last dimension which is the 3 channels) into grayscale (1 channel). The simplest way to do this is (R+B+G)/3. All python libraries that have to do with images (PIL, OpenCV, skimage, tensorflow, keras, etc) have this already implemented. Example:
from skimage.color import rgb2gray
gray = rgb2gray(original)
2) Resize the image from 900x1200 to 28x28. Again you can do this in all major image-related python libraries. Example:
from skimage.transform import resize
resized = resize(gray, (28,28))
Now if you want to do this in all 55 images you can either write a function that transforms one image and map it across your array, or use a simple for loop and populate your new array one image at a time.
In your case the code should look something like this:
num_images = lol.shape[0]  # 55 in your case
resized_images = np.zeros(shape=(num_images, 28, 28, 1))  # your final array
for i in range(num_images):
    gray = rgb2gray(lol[i])               # gray.shape is (900, 1200)
    resized = resize(gray, (28, 28))      # resized.shape is (28, 28)
    resized_images[i, :, :, 0] = resized  # fill the (55, 28, 28, 1) array
It would be more intuitive to process each image individually, which would also give you the best chance of preserving some information.
Try using the PIL library:
import numpy
from PIL import Image
lol = numpy.zeros((55, 900, 1200, 3), dtype=numpy.uint8)  # stand-in for your data
new_array = numpy.zeros((lol.shape[0], 28, 28), dtype=numpy.float32)
for i in range(lol.shape[0]):
    img = Image.fromarray(lol[i])
    img_resize = img.resize((28, 28))   # downscale to 28x28
    img_mono = img_resize.convert('L')  # convert RGB to grayscale
    arr = numpy.array(img_mono, dtype=numpy.uint8)
    new_array[i] = arr
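One caveat, assuming your Keras model was trained on MNIST inputs scaled to [0, 1] (the common setup): you will likely want to add the trailing channel axis and normalize the same way before predicting, e.g.:

# add the channel axis and scale to [0, 1]; assumes [0, 1]-normalized training data
bb = new_array.reshape(new_array.shape[0], 28, 28, 1) / 255.0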
I have 5 pictures and I want to convert each image to a 1D array and put it in a matrix as a vector. I also want to be able to convert each vector back to an image.
img = Image.open('orig.png').convert('RGBA')
a = np.array(img)
I'm not familiar with all the features of numpy and wondered if there are other tools I can use. Thanks.
import numpy as np
from PIL import Image
img = Image.open('orig.png').convert('RGBA')
arr = np.array(img)
# record the original shape
shape = arr.shape
# make a 1-dimensional view of arr
flat_arr = arr.ravel()
# convert it to a matrix
vector = np.matrix(flat_arr)
# do something to the vector
vector[:,::10] = 128
# reform a numpy array of the original shape
arr2 = np.asarray(vector).reshape(shape)
# make a PIL image
img2 = Image.fromarray(arr2, 'RGBA')
img2.show()
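As a side note, np.matrix is discouraged in modern NumPy; plain 2D arrays do the same job. Here is a sketch of stacking all 5 images into one matrix, one flattened image per row, and recovering one of them (the filenames are hypothetical placeholders, and all images are assumed to share the same dimensions):

import numpy as np
from PIL import Image

files = ['img0.png', 'img1.png', 'img2.png', 'img3.png', 'img4.png']  # hypothetical
arrays = [np.array(Image.open(f).convert('RGBA')) for f in files]
shape = arrays[0].shape

matrix = np.vstack([a.ravel() for a in arrays])  # shape: (5, rows*cols*4)

# recover the third image from its row vector
restored = Image.fromarray(matrix[2].reshape(shape), 'RGBA')
restored.show()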
import matplotlib.pyplot as plt
img = plt.imread('orig.png')
rows,cols,colors = img.shape # gives dimensions for RGB array
img_size = rows*cols*colors
img_1D_vector = img.reshape(img_size)
# you can recover the original image with:
img2 = img_1D_vector.reshape(rows,cols,colors)
Note that img.shape returns a tuple, and the multiple assignment to rows, cols, colors above lets us compute the number of elements needed to convert to and from a 1D vector.
You can show img and img2 to see they are the same with:
plt.imshow(img) # followed by
plt.show() # to show the first image, then
plt.imshow(img2) # followed by
plt.show() # to show you the second image.
Keep in mind that, in a Python terminal, you have to close the plt.show() window to come back to the terminal and show the next image.
This approach makes sense to me and relies only on matplotlib.pyplot. It also works for jpg and tif images, etc. The png I tried it on had float32 dtype and the jpg and tif I tried had uint8 dtype; each seems to work.
I hope this is helpful.
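If you prefer a programmatic check over comparing the plots by eye, the flatten/reshape round trip can also be verified exactly (continuing from the snippet above):

import numpy as np

print(np.array_equal(img, img2))  # True: reshaping back recovers the array exactly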
I used to convert a 2D image array to 1D using this code:

from scipy import misc

f = misc.face(gray=True)  # sample grayscale image bundled with scipy
width1, height1 = f.shape
f2 = f.reshape(width1 * height1)

but I don't know yet how to change it back to 2D later in the code. I hope this helps.
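To fill in that missing step: converting back to 2D is just another reshape with the saved dimensions, for example:

f_restored = f2.reshape(width1, height1)  # back to the original 2D shape
print(f_restored.shape == f.shape)        # True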