I am evaluating a TensorFlow model on OpenCV video frames. I need to reshape the incoming PIL image into a numpy array so that I can run inference on it.
But I see that the conversion of the PIL image to a numpy array takes around 900+ milliseconds on my laptop with 16 GiB of memory and a 2.6 GHz Intel Core i7 processor. I need to get this down to a few milliseconds so that I can process multiple frames per second from my camera.
Can anyone suggest how to make the method below run faster?
import numpy as np

def load_image_into_numpy_array(pil_image):
    (im_width, im_height) = pil_image.size
    data = pil_image.getdata()
    data_array = np.array(data)
    return data_array.reshape((im_height, im_width, 3)).astype(np.uint8)
On further instrumentation I realized that np.array(data) is taking the bulk of the time, close to 900+ milliseconds. So the conversion of the image data to a numpy array is the real culprit.
You can just let numpy handle the conversion instead of reshaping yourself.
import numpy as np

def pil_image_to_numpy_array(pil_image):
    return np.asarray(pil_image)
You are converting the image into (height, width, channel) format. That is the default conversion numpy.asarray performs on a PIL image, so explicit reshaping should not be necessary.
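For example, a quick check with a synthetic PIL image (just a stand-in for a real video frame) confirms that np.asarray already returns the (height, width, 3) uint8 layout:

import numpy as np
from PIL import Image

# Synthetic 640x480 RGB image standing in for a video frame.
pil_image = Image.new('RGB', (640, 480))

print(pil_image.size)               # (640, 480) -> (width, height)
print(np.asarray(pil_image).shape)  # (480, 640, 3) -> (height, width, channels)
print(np.asarray(pil_image).dtype)  # uint8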
Thank you very much!! It works very fast!
import numpy as np
import tensorflow as tf
from io import BytesIO
from PIL import Image

def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
      path: a file path (this can be local or on colossus)

    Returns:
      uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(BytesIO(img_data))
    return np.array(image)
An image with shape (3684, 4912, 3) takes 0.3~0.4 sec.
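For reference, a minimal usage sketch of the function above; the file name is only a placeholder, not from the original post:

# 'frame_0001.jpg' is a hypothetical path.
frame = load_image_into_numpy_array('frame_0001.jpg')
print(frame.shape, frame.dtype)  # e.g. (1080, 1920, 3) uint8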
Python Wand supports converting images directly to Numpy arrays, as can be seen in related questions.
However, when doing this for .hdr (high dynamic range) images, the conversion appears to compress the image to the 0-255 range. As a result, converting from a Python Wand image to a np array and back drastically reduces file size/quality.
# Without converting to a numpy array
img = Image(filename='image.hdr') # Open with Python Wand Image
img.save(filename='test.hdr') # Save with Python wand
Running this opens the image and saves it again, which creates a file with a size of 41.512kb. However, if we convert it to a numpy array before saving it again...
# With converting to a numpy array
img = Image(filename=os.path.join(path, 'N_SYNS_89.hdr')) # Open with Python Wand Image
arr = np.asarray(img, dtype='float32') # convert to np array
img = Image.from_array(arr) # convert back to Python Wand Image
img.save(filename='test.hdr') # Save with Python wand
This results in a file with a size of 5.186kb.
Indeed, if I look at arr.min() and arr.max(), I see that the min and max values of the numpy array are 0 and 255. If I open the .hdr image with cv2 as a numpy array, however, the range is much higher.
img = cv2.imread('image.hdr', -1)
img.min() # returns 0
img.max() # returns 868352.0
Is there a way to convert back and forth between numpy arrays and Wand images without this loss?
As per the comment of @LudvigH, the following worked, as in this answer.
img = Image(filename='image.hdr')
img.format = 'rgb'
img.alpha_channel = False # was not required for me, including it for completion
img_array = np.asarray(bytearray(img.make_blob()), dtype='float32')
Now we must reshape the returned img_array. In my case I could not run the following:
img_array.reshape(img.shape)
Instead, since my img.size was an (x, y) tuple where an (x, y, z) tuple was needed, I computed the number of channels manually:
n_channels = img_array.size / img.size[0] / img.size[1]
img_array = img_array.reshape(img.size[0],img.size[1],int(n_channels))
After manually calculating z as above, it worked fine. Perhaps this is also what caused the original fault when converting with arr = np.asarray(img, dtype='float32').
I have an array of image pixel values that I would like to upscale for input into my neural network. It is an array of shape (28000, 48, 48, 1). These are normalized image pixel values that I would like to upscale to a higher resolution for input into my CNN. The arrays look like this...
array([[[[-0.6098866 ],
[-0.4592209 ],
[-0.40325198],
...,
[-0.7694696 ],
[-0.90518403],
[-0.95160526]],
[[-0.66049284],
[-0.68162924],
[-0.694159 ],
Both my X_train and y_train image arrays have a shape of (28000, 48, 48, 1). I would like to upscale or resize these 28000 image arrays to size 75x75. Please help. Should I convert the arrays back to non-normalized arrays or images and then maybe use cv2 to upscale? How would I do this?
One easy way to resize images is using the Python module PIL (Python Imaging Library), which you can install with pip install pillow. Example below to demonstrate resizing a single image:
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
# Open image
panda_pil = Image.open("panda.jpg")
print(np.array(panda_pil).shape)
# (613, 696, 3)
panda_pil_resized = panda_pil.resize((75, 75))
print(np.array(panda_pil_resized).shape)
# (75, 75, 3)
plt.imshow(np.array(panda_pil_resized))
plt.show()
You can download the panda image as follows:
import urllib.request
panda_fname = "panda.jpg"
panda_url = "https://upload.wikimedia.org/wikipedia/commons/f/fe/Giant_Panda_in_Beijing_Zoo_1.JPG"
urllib.request.urlretrieve(panda_url, panda_fname)
To resize all 28000 images, one approach would be to do this as a preprocessing step in a for-loop, and save the images to a numpy array.
Edit: You can loop through your original 28000x2304 image array and upscale each image individually in a for-loop. To get a PIL.Image object from a np.ndarray object, you can use PIL.Image.fromarray, as shown below (I have just generated a random array of Gaussian noise, but it should work the same with your images):
import numpy as np
from PIL import Image
from time import perf_counter
old_width, old_height = 48, 48
new_width, new_height = 75, 75
num_images = 28000
old_image_array = np.random.normal(size=[num_images, old_width*old_height])
new_image_array = np.empty(shape=[num_images, new_width*new_height])
print("Starting conversion...")
t0 = perf_counter()
# Loop over each image individually
for i in range(num_images):
    # Get the ith image and reshape
    old_image = old_image_array[i].reshape(old_width, old_height)
    # Convert to PIL.Image
    old_image_pil = Image.fromarray(old_image)
    # Upscale resolution
    new_image_pil = old_image_pil.resize((new_width, new_height))
    # Convert to numpy array
    new_image = np.array(new_image_pil)
    # Reshape and store in new image array
    new_image_array[i] = new_image.reshape(new_width*new_height)
t1 = perf_counter()
print("Time taken = {:.3f} s".format(t1 - t0))
print(old_image_array.shape, new_image_array.shape)
Console output:
Starting conversion...
Time taken = 2.771 s
(28000, 2304) (28000, 5625)
There may well be a more efficient way of doing this, but this method is simple and uses tools that are useful to know about if you don't already (PIL is a good module for manipulating images; see this blog post if you want to learn more about PIL). One possibly faster alternative is sketched below.
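If the loop ever becomes a bottleneck, one alternative worth trying (a different tool, not part of the method above) is scipy.ndimage.zoom, which can resize the whole batch in a single call by using a zoom factor of 1 on the batch axis. A minimal sketch, assuming scipy is installed and linear interpolation (order=1) is acceptable; a smaller batch is used here to keep the demo cheap:

import numpy as np
from scipy.ndimage import zoom

num_images, old_size, new_size = 1000, 48, 75
old_image_array = np.random.normal(size=[num_images, old_size * old_size])

# Reshape to (N, 48, 48) and zoom only the two spatial axes (factor 75/48).
old_images = old_image_array.reshape(num_images, old_size, old_size)
new_images = zoom(old_images, (1, new_size / old_size, new_size / old_size), order=1)

print(new_images.shape)  # (1000, 75, 75)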
I was watching a tutorial on a facial recognition project using OpenCV, numpy, and PIL.
During training, the image was converted into a numpy array. What is the need for converting it into a numpy array?
THE CODE:
PIL_IMAGE = Image.open(path).convert("L")
image_array = np.array(PIL_IMAGE, "uint8")
TLDR; OpenCV images are stored as three-dimensional Numpy arrays.
When you read in digital images using the library, they are represented as Numpy arrays. The rectangular shape of the array corresponds to the shape of the image. Consider this image of a chair
Here's a visualization of how this image is stored as a Numpy array in OpenCV
If we read in the image of the chair, we can inspect its structure with image.shape, which returns a tuple (height, width, channels). For a colored image this is the number of rows, columns, and channels; for a grayscale image, image.shape returns only the number of rows and columns.
import cv2
image = cv2.imread("chair.jpg")
print(image.shape)
(222, 300, 3)
When working with OpenCV images, we specify the y coordinate first, then the x coordinate. Colors are stored as BGR values with blue in layer 0, green in layer 1, and red in layer 2. So for this chair image, it has a height of 222, a width of 300, and has 3 channels (meaning it is a color image). Essentially, when the library reads in any image, it stores it as a Numpy array in this format.
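As a small illustration of the y-first indexing and the BGR layer order (a sketch reusing the chair image from above; any color image works):

import cv2

image = cv2.imread("chair.jpg")

# Pixel at row y=10, column x=20: a length-3 array ordered [blue, green, red].
pixel = image[10, 20]
print(pixel)

# Each channel as its own 2D array: layer 0 is blue, 1 is green, 2 is red.
blue, green, red = image[:, :, 0], image[:, :, 1], image[:, :, 2]
print(blue.shape)  # (222, 300) for the chair image above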
The answer is rather simple:
With Numpy you can make blazing fast operations on numerical arrays, no matter which dimension, shape, etc. they are.
Image processing libraries (OpenCV, PIL, scikit-image) sometimes wrap images in special formats that already use Numpy behind the scenes. If they are not already using Numpy in the background, the images can be converted to Numpy arrays explicitly. Then you can do speedy numerical calculations on them (convolution, FFT, blurring, filters, ...).
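For instance, once the image is a plain Numpy array, whole-image operations are single vectorized expressions instead of per-pixel loops. A minimal sketch, using a synthetic array standing in for a decoded photo:

import numpy as np

# Synthetic 222x300 BGR image standing in for a decoded photo.
image = np.random.randint(0, 256, size=(222, 300, 3), dtype=np.uint8)

# Brighten every pixel at once, clipping to the valid uint8 range.
brighter = np.clip(image.astype(np.int16) + 40, 0, 255).astype(np.uint8)

# Reorder BGR to RGB by reversing the channel axis, again without any loop.
rgb = image[:, :, ::-1]

print(brighter.shape, rgb.shape)  # (222, 300, 3) (222, 300, 3)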
I saved a numpy array to an image as follows:
plt.imshow(xNext[0,:,:,0]) #xNext has shape (1,64,25,1)
print(xNext[0,:,:,0].shape) #outputs (64,25)
plt.savefig(os.path.join(root,filename)+'.png')
np.save(os.path.join(root,filename)+'.npy',xNext[0,:,:,0])
How can I obtain the same numpy array back from the saved .png image? Can you also please show me how I would do this if I had saved it as a .jpg image?
I've tried the following, and it works with a 3D array (v1), where the resulting image is close to the image produced from the original numpy array (original).
image = Image.open(imageFilename) #brings in as 3D array
box = (315,60,500,540)
image = image.crop(box)
image = image.resize((25,64)) #to correct to desired shape
arr = np.asarray(image)
plt.imshow(arr)
plt.savefig('v1.png')
plt.close()
However, when I convert the 3D array to a 2D array, the resulting image is different (v1b and v1c).
arr2 = arr[:,:,0]
plt.imshow(arr2)
plt.savefig('v1b.png')
plt.close()
arr3 = np.dot(arr[...,:3],[0.299,0.587,0.11])
plt.imshow(arr3)
plt.savefig('v1c.png')
plt.close()
How can I convert the 3D to 2D correctly? Thanks for your help.
original, v1 (saved from 3D array)
v1b, v1c (saved from 2D arrays)
original (with original size)
If your objective is to save a numpy array as an image, your approach has a problem. The function plt.savefig saves an image of the plot, not the array itself. Also, transforming an array into an image may carry some precision loss (when converting from float64 or float32 to uint16). That being said, I suggest you use skimage and imageio:
import imageio
import numpy as np
from skimage import img_as_uint
data = np.load('0058_00086_brown_2_recording1.wav.npy')
print("original", data.shape)
img = img_as_uint(data)
imageio.imwrite('image.png', img)
load = imageio.imread('image.png')
print("image", load.shape)
This script loads the data you provided and prints the shape for verification:
data = np.load('0058_00086_brown_2_recording1.wav.npy')
print("original", data.shape)
then it transforms the data to uint, saves the image as a png, and loads it back:
img = img_as_uint(data)
imageio.imwrite('image.png', img)
load = imageio.imread('image.png')
the output of the script is:
original (64, 25)
image (64, 25)
i.e. the image is loaded with the same shape as data. Some notes:
image.png is saved as a grayscale image
To save to .jpg just change to imageio.imwrite('image.jpg', img)
In the case of .png the absolute average distance from the original image was 3.890e-06 (this can be verified using np.abs(img_as_float(load) - data).sum() / data.size)
Information about skimage and imageio can be found on their respective websites. More on saving numpy arrays as images can be found in the following answers: [1], [2], [3] and [4].
from scipy.misc import imread
image_data = imread('test.jpg').astype(np.float32)
This should give you the numpy array (I would suggest using imread from scipy)
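Note that scipy.misc.imread was deprecated and later removed from SciPy (it is gone as of SciPy 1.2). If that import fails, imageio provides an equivalent call; a minimal sketch using the same placeholder filename as above:

import numpy as np
import imageio

# Equivalent to the scipy.misc.imread call above; 'test.jpg' is a placeholder path.
image_data = imageio.imread('test.jpg').astype(np.float32)
print(image_data.shape, image_data.dtype)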
I am learning Tensorflow and Python. I tried reading an image from a file and then displaying that image using matplotlib. Here is my code.
import matplotlib.pyplot as plt
import tensorflow as tf
# read and decode the image
image_contents = tf.read_file('elephant.jpeg')
image = tf.image.decode_jpeg(image_contents, channels=3)
with tf.Session() as sess:
    img = sess.run(image)
    print(img)
    plt.axis('off')
    plt.imshow(img)
    plt.show()
This also prints a huge array, which I understand contains the RGB values for each pixel. Now I am trying to modify pixel values individually. I can modify all the pixel values at once using tf operations, but I am not able to operate on individual pixel values.
For example, I have been trying to make the image grayscale. So I want to replace the R, G and B values with the average of the R, G and B values of the pixel. How do I do that?
I also want to know whether I should be focusing on Python or TensorFlow?
You can directly convert the image to grayscale with Pillow:
from PIL import Image
img = Image.open('/some path/image.png').convert('L')
I prefer preprocessing images with numpy before feeding them into tensorflow.
I am not sure which shape your array has, so I would suggest converting the image to a 2-dim np array. In the case below I am converting a list of pixels (shape=[784]) to an array with shape 28x28. Afterwards you can directly perform operations on each pixel.
image = np.reshape(img, (28,28)).astype(np.uint8)
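To answer the grayscale part of the question with numpy directly: replacing each pixel's R, G and B values with their average is a single mean over the channel axis. A sketch with a synthetic (height, width, 3) array standing in for the decoded elephant image:

import numpy as np

# Synthetic RGB image standing in for the array returned by tf.image.decode_jpeg.
img = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)

# Average R, G and B for every pixel, then broadcast the result back to
# three channels so the shape stays (height, width, 3).
avg = img.mean(axis=2, keepdims=True).astype(np.uint8)
gray_rgb = np.repeat(avg, 3, axis=2)

print(gray_rgb.shape)  # (480, 640, 3)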