fastest way to load images in python for processing - python

I want to load more than 10000 images into my 8 GB of RAM in the form of NumPy arrays. So far I have tried cv2.imread, keras.preprocessing.image.load_img, PIL, imageio, and scipy. I want to do it the fastest way possible, but I can't figure out which one that is.

One of the fastest ways is to use multiprocessing and have several processes do the job in parallel: multiple processors work on your tasks at the same time when concurrent execution isn't a problem. The example below is just a simple sketch of how it might look; you can practice with small functions and then integrate them with your own code:
from multiprocessing import Process

# This is the function to be parallelized.
def image_load_here(image_path):
    pass

if __name__ == '__main__':
    # Start one process per image path in your dataset.
    processes = [Process(target=image_load_here, args=(path,))
                 for path in ['img1', 'img2', 'img3', 'img4']]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
Feel free to write; I'll try to help.
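If the goal is specifically to get all images into NumPy arrays as fast as possible, here is a minimal sketch (my own, not from the answer above) using a process pool and OpenCV; the file paths are hypothetical placeholders:
from multiprocessing import Pool

import cv2

# Read one image file into a NumPy array (BGR order, uint8); returns None if the path is invalid.
def load_image(image_path):
    return cv2.imread(image_path)

if __name__ == '__main__':
    image_paths = ['img1.jpg', 'img2.jpg', 'img3.jpg', 'img4.jpg']  # hypothetical paths
    # Spread the reads across 4 worker processes and collect the results.
    with Pool(processes=4) as pool:
        images = pool.map(load_image, image_paths)
    # images is now a list of NumPy arrays, one per file.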

If you're using the Keras library to build a deep learning model, I suggest you use the image module from the keras.preprocessing package.
The image module provides a method img_to_array which already returns a NumPy array.
It also uses NumPy internally for all its array manipulations/computations.
train_image = image.load_img(path, target_size = (height, width))
train_image = image.img_to_array(train_image)

import numpy as np
import os
from keras.preprocessing import image

def batch_data_generator(data, indexes):
    # indexes is a sub-array of indices into the data
    X = np.zeros((len(indexes), config.IMG_INPUT_SHAPE[0], config.IMG_INPUT_SHAPE[1], config.IMG_INPUT_SHAPE[2]))
    Y = np.zeros((len(indexes), len(label_mapping)))
    i = 0
    for idx in indexes:
        image_id = data['X'][idx]
        filename = os.path.join('images', str(image_id) + '.jpg')
        img = image.load_img(filename, target_size=(300, 300))
        X[i] = np.array(img, dtype='float32')
        label_id = label_mapping[data['Y'][idx]]
        Y[i][label_id] = 1
        i += 1
    # subtract mean and normalize
    for depth in range(3):
        X[:, :, :, depth] = (X[:, :, :, depth] - np.mean(X[:, :, :, depth])) / 255
    return X, Y
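For context, a minimal sketch of how this generator might be driven; the config, label_mapping, data and file names here are hypothetical stand-ins for objects the snippet above assumes but does not define:
# Hypothetical globals that batch_data_generator relies on.
class Config:
    IMG_INPUT_SHAPE = (300, 300, 3)
config = Config()
label_mapping = {'cat': 0, 'dog': 1}                # label name -> class index
data = {'X': [101, 102, 103, 104],                  # image ids, stored as images/<id>.jpg
        'Y': ['cat', 'dog', 'cat', 'dog']}          # label per image

batch_size = 2
for start in range(0, len(data['X']), batch_size):
    indexes = list(range(start, min(start + batch_size, len(data['X']))))
    X_batch, Y_batch = batch_data_generator(data, indexes)
    # ...feed X_batch, Y_batch to your model here, e.g. model.train_on_batch(X_batch, Y_batch)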

Related

Wrong output when following the filter formula?

I am trying to make my image sepia, but I get the wrong result and I can't see why. Am I using the wrong formula for the sepia filter?
from PIL import Image
import numpy as np

im = Image.open("some.jpg")
image = np.asarray(im)
sepia_image = np.empty_like(image)
for i in range(image.shape[0]):
    for j in range(image.shape[1]):
        sepia_image[i][j][0] = 0.393*image[i][j][0] + 0.769*image[i][j][1] + 0.189*image[i][j][2]
        sepia_image[i][j][1] = 0.349*image[i][j][0] + 0.686*image[i][j][1] + 0.168*image[i][j][2]
        sepia_image[i][j][2] = 0.272*image[i][j][0] + 0.534*image[i][j][1] + 0.131*image[i][j][2]
        for k in range(image.shape[2]):
            if sepia_image[i][j][k] > 255:
                sepia_image[i][j][k] = 255
sepia_image = sepia_image.astype("uint8")
Image.fromarray(sepia_image).show()
The image I get is this:
The problem is that your values are going out of bounds.
For example, using your formula on my example image below, the red channel in the first pixel ends up being 205*0.393 + 206*0.769 + 211*0.189, which is 278. If you are using unsigned 8-bit integers, this will overflow to 22.
To fix it, you need to use floats and clip the range back to 0 to 255, for example by using this instead of your np.empty_like() instantiation:
sepia_image = np.zeros_like(image, dtype=float)
Then, after running your loops:
sepia_image = sepia_image.astype(np.uint8)
Then your code works on my image at least.
Unsolicited advice: don't use loops
Another issue is the difficulty of debugging code like this. In general, you want to avoid loops over arrays in Python: it's slow, and it tends to require more code. Instead, take advantage of NumPy's elementwise maths. For example, you can use np.matmul (or the @ operator, which does the same thing) like so:
from io import BytesIO
import requests
import numpy as np
from PIL import Image
# Image CC BY-SA Leiju / Wikimedia Commons
uri = 'https://upload.wikimedia.org/wikipedia/commons/thumb/3/35/Neckertal_20150527-6384.jpg/640px-Neckertal_20150527-6384.jpg'
r = requests.get(uri)
img = Image.open(BytesIO(r.content))
# Turn this PIL Image into a NumPy array.
imarray = np.asarray(img)[..., :3] / 255
# Make a `sepia` multiplier.
sepia = np.array([[0.393, 0.349, 0.272],
                  [0.769, 0.686, 0.534],
                  [0.189, 0.168, 0.131]])
# Compute the result and clip back to 0 to 1.
imarray_sepia = np.clip(imarray @ sepia, 0, 1)
This produces:

How to use random zoom in keras tensorflow 2.3

I am trying to add random zoom to my images, which are TIFF files with 128x160 resolution and 1 channel, but the new version of random zoom for Keras/TensorFlow has me confused; I don't understand the tuple format it expects as the zoom_range argument.
From the documentation.
tf.keras.preprocessing.image.random_zoom(
    x, zoom_range, row_axis=1, col_axis=2, channel_axis=0, fill_mode='nearest',
    cval=0.0, interpolation_order=1
)
I need to add some random zoom to my images, and I am trying it like this:
zoom_range = ((0.4, 0.4))
img = tf.keras.preprocessing.image.random_zoom(
    img, zoom_range, row_axis=1, col_axis=2, channel_axis=0, fill_mode='nearest',
    cval=0.0, interpolation_order=1
)
The output is:
TypeError: float() argument must be a string or a number, not 'NoneType'
How exactly should I pass a random zoom amount as a parameter for my images?
Public kaggle notebook here:
https://www.kaggle.com/puelon/notebook75c416766a
TypeError: in user code:
<ipython-input-4-9ba0455797a4>:17 load *
img = tf.keras.preprocessing.image.random_zoom(img, zoom_range, row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest')
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:153 random_zoom *
x = apply_affine_transform(x, zx=zx, zy=zy, channel_axis=channel_axis,
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:321 apply_affine_transform *
transform_matrix = transform_matrix_offset_center(
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:246 transform_matrix_offset_center *
o_x = float(x) / 2 + 0.5
/opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/operators/py_builtins.py:195 float_ **
return _py_float(x)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/operators/py_builtins.py:206 _py_float
return float(x)
TypeError: float() argument must be a string or a number, not 'NoneType'
TypeError Traceback (most recent call last)
<ipython-input-4-9ba0455797a4> in <module>
27 train1, train2, test1 = d
28 train_ds = tf.data.Dataset.from_tensor_slices(train1 + train2).\
---> 29 shuffle(len(train1) + len(train2)).map(load).batch(4)
30 test_ds = tf.data.Dataset.from_tensor_slices(test1).\
31 shuffle(len(test1)).map(load).batch(4)
for i in range(len(groups)):
    d = deque(groups)
    d.rotate(i)
    train1, train2, test1 = d
    train_ds = tf.data.Dataset.from_tensor_slices(train1 + train2).\
        shuffle(len(train1) + len(train2)).map(load).batch(4)
    test_ds = tf.data.Dataset.from_tensor_slices(test1).\
        shuffle(len(test1)).map(load).batch(4)
Probably your img is of the wrong object type. The random_zoom(...) function needs its input as a tensor or a 3D NumPy array of shape (height, width, channels), i.e. for an RGB image of size 300x200 the array should have shape (200, 300, 3). Such a NumPy array can be obtained, for example, with the PIL library, as in the code below.
Also, if you're writing TF code then you're dealing with tensors, but random_zoom needs to know all dimensions and their integer sizes. Tensors may have a None size for some dimensions if they are unknown at graph construction time, and this is probably what causes the NoneType error in your case. To overcome this, wrap the random_zoom call in tf.numpy_function; this forces the function's input to be a NumPy array instead of a tensor, and NumPy arrays always have all dimensions with known sizes. I've implemented this wrapping in the code below too.
You probably also need to change row_axis=1, col_axis=2, channel_axis=0 to row_axis=0, col_axis=1, channel_axis=2, because channels (colors) usually go in the last dimension.
Documentation for tf.keras.preprocessing.image.random_zoom.
I've implemented simple code below that works.
The input in the code looks like this:
The output looks like this:
The code below can also be run here online.
# Needs: python -m pip install tensorflow numpy pillow requests
import tensorflow as tf, numpy as np, PIL.Image, requests, io
tf.compat.v1.enable_eager_execution()
zoom_range = (0.4, 0.5)
img = PIL.Image.open(io.BytesIO(requests.get('https://i.stack.imgur.com/Fc3Jb.png').content))
#img = PIL.Image.open('Ruler-Big-Icon-PNG.png')
img = np.array(img)
img = tf.convert_to_tensor(img) # This line is not needed if you already have a tensor.
# You need only this single line of code to fix your issue!
img = tf.numpy_function(lambda img: tf.keras.preprocessing.image.random_zoom(
    img, zoom_range, row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest',
), [img], tf.float32)
img = np.array(img) # This line is not needed if you plan for img to be a tensor further on.
# Example output is https://i.stack.imgur.com/MWk9T.png
PIL.Image.fromarray(img).save('result.png')
I'm generally not convinced that using Keras preprocessing functions outside of where they belong is the right approach. A simple way would be to use tf.image.random_crop. Assuming your image is larger than (200, 200, 3), you can just use this line:
img = tf.image.random_crop(img, (200, 200, 3))
Let's try an example. Original image:
import tensorflow as tf
import skimage
import matplotlib.pyplot as plt
import numpy as np
X = np.stack([skimage.data.chelsea() for _ in range(10)])
ds = tf.data.Dataset.from_tensor_slices(X).\
    map(lambda x: tf.image.random_crop(x, (200, 200, 3)))
plt.imshow(next(iter(ds)))
plt.show()
Randomly cropped image of size (200, 200, 3):

How can I stop my Colab notebook from crashing while normalising my images?

I am trying to make a model which recognises the emotions of a human. My code and RAM are just fine at the start:
But when I try to normalise my images, the RAM usage drastically jumps up and then Colab just crashes:
This is the code block which is causing Colab to crash:
import os
import matplotlib.pyplot as plt
import cv2

data = []
for emot in os.listdir('./data/'):
    for file_ in os.listdir(f'./data/{emot}'):
        img = cv2.imread(f'./data/{emot}/{file_}', 0)
        img = cv2.bitwise_not(img)
        img /= 255.0 # <--- This is the line that causes Colab to crash
        data.append([img, emotions.index(emot)])
If I remove the img /= 255.0 line, it doesn't crash, but then I have images which are not normalised:
I even tried normalising it in another block:
for i in range(len(data)):
    data[i][0] = np.array(data[i][0]) / 255.0
but it doesn't work and still crashes
I would like to go through an example. Firstly let's have a look at the following code.
import numpy as np
x = np.random.randint(0, 255, size=(100, 32, 32), dtype=np.int16)
print('Present data type', x.dtype)
# What you did
y = x/255
print('Present data type', y.dtype)
# What you should do
z = (x/255).astype(np.float16)
print('Present data type', z.dtype)
Output:
Present data type int16
Present data type float64
Present data type float16
If you look closely, when I divide the x variable and declare y = x/255, the data type changes to float64. Dividing an int-typed NumPy array casts it to float64 by default, and float64 takes considerably more memory. Therefore, when dividing an int-typed NumPy matrix on a large dataset, you should always cast the result to a shorter datatype.
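To see how much this matters in practice, here is a quick check of the array sizes from the example above (a sketch using nbytes):
import numpy as np

x = np.random.randint(0, 255, size=(100, 32, 32), dtype=np.int16)
print(x.nbytes)                              # 204800 bytes (2 bytes per element)
print((x / 255).nbytes)                      # 819200 bytes (float64, 8 bytes per element)
print((x / 255).astype(np.float16).nbytes)   # 204800 bytes (float16, 2 bytes per element)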
If your code runs fine without the img /= 255.0 line, then this is what is happening. After dividing, you should cast the img variable to the smallest workable float type, such as np.float16 or np.float32. However, np.float16 has some limitations and is not fully supported by TensorFlow (TF converts it to a 32-bit float), so you may prefer np.float32.
Therefore, try replacing the line img /= 255.0 with img = (img/255.0).astype(np.float16) or img = (img/255.0).astype(np.float32).
The modified version of the code is given,
import os
import numpy as np
import matplotlib.pyplot as plt
import cv2

data = []
for emot in os.listdir('./data/'):
    for file_ in os.listdir(f'./data/{emot}'):
        img = cv2.imread(f'./data/{emot}/{file_}', 0)
        img = cv2.bitwise_not(img)
        img = (img/255.0).astype(np.float16) # <--- This is the suggestion
        data.append([img, emotions.index(emot)])
Assuming the next step in your pipeline is to create a tf.data.Dataset object out of your image corpus, you can use Dataset.map() to move your preprocessing into the data loading pipeline to save on memory space. Tensorflow has a very well-documented guide on how to do this here -> https://www.tensorflow.org/guide/data#preprocessing_data
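A minimal sketch of that idea, with hypothetical file paths and integer labels, so the decoding and normalisation happen lazily inside the input pipeline instead of all at once in RAM (this assumes all images share the same size so they can be batched):
import tensorflow as tf

# Hypothetical parallel lists of image paths and integer labels.
file_paths = ['./data/happy/0.png', './data/sad/1.png']
labels = [0, 1]

def load_and_normalise(path, label):
    img = tf.io.read_file(path)
    img = tf.io.decode_image(img, channels=1, expand_animations=False)
    img = tf.cast(img, tf.float32) / 255.0   # normalise lazily, inside the pipeline
    return img, label

ds = (tf.data.Dataset.from_tensor_slices((file_paths, labels))
      .map(load_and_normalise, num_parallel_calls=tf.data.experimental.AUTOTUNE)
      .batch(32)
      .prefetch(tf.data.experimental.AUTOTUNE))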

IncrementalPCA & partial_fit - number of components

I work with Python and about 4000 images of watches (examples: watch_1, watch_2). The images are RGB and their resolution is 450x450. My aim is to find the most similar watches among them. For this reason I am using IncrementalPCA and partial_fit from scikit-learn to handle this large dataset with my 26GB of RAM (see also: SO_Link_1, SO_Link_2). My source code is the following:
import cv2
import numpy as np
import os
from glob import glob
from sklearn.decomposition import IncrementalPCA
from sklearn import neighbors
from sklearn import preprocessing

data = []

# Read images from file #
for filename in glob('Watches/*.jpg'):
    img = cv2.imread(filename)
    height, width = img.shape[:2]
    img = np.array(img)

    # Check that all my images are of the same resolution
    if height == 450 and width == 450:
        # Reshape each image so that it is stored in one line
        img = np.concatenate(img, axis=0)
        img = np.concatenate(img, axis=0)
        data.append(img)

# Normalise data #
data = np.array(data)
Norm = preprocessing.Normalizer()
Norm.fit(data)
data = Norm.transform(data)

# IncrementalPCA model #
ipca = IncrementalPCA(n_components=6)

length = len(data)
chunk_size = 4
pca_data = np.zeros(shape=(length, ipca.n_components))

for i in range(0, length // chunk_size):
    ipca.partial_fit(data[i*chunk_size : (i+1)*chunk_size])
    pca_data[i*chunk_size : (i+1)*chunk_size] = ipca.transform(data[i*chunk_size : (i+1)*chunk_size])

# K-Nearest neighbours #
knn = neighbors.NearestNeighbors(n_neighbors=4, algorithm='ball_tree', metric='minkowski').fit(data)
distances, indices = knn.kneighbors(data)
print(indices)
However, when I run this program, starting with 40 images of watches, I get the following error when i = 1:
ValueError: Number of input features has changed from 4 to 6 between calls to partial_fit! Try setting n_components to a fixed value.
However, I clearly set n_components to 6 with ipca = IncrementalPCA(n_components=6), yet for some reason ipca takes chunk_size = 4 as the number of components when i = 0 and then changes it to 6 when i = 1.
Why is this happening?
How can I fix it?
This seems to follow the math behind PCA, as it will be ill-conditioned for n_components > n_samples.
You might be interested in reading this (the introduction of that error message) and some of the discussion behind it.
Try increasing the batch size / chunk size (or lowering n_components), as sketched below.
(In general I'm also somewhat sceptical about this approach. I hope you tested it on a small example dataset using batch PCA. It does not seem your watches are preprocessed with regard to geometry: cropping; maybe histogram/colour normalisation.)
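Concretely, a sketch of how the fitting loop from the question could be adjusted so every chunk passed to partial_fit has at least n_components samples (chunk_size = 10 is just an illustrative value; data is the array built in the question's code):
import numpy as np
from sklearn.decomposition import IncrementalPCA

n_components = 6
chunk_size = 10                       # must be >= n_components for partial_fit
ipca = IncrementalPCA(n_components=n_components)

# Fit in chunks that are at least as large as n_components.
for i in range(0, len(data), chunk_size):
    chunk = data[i:i + chunk_size]
    if len(chunk) >= n_components:    # skip a trailing chunk that is too small
        ipca.partial_fit(chunk)

# Transform in chunks as well to keep peak memory low.
pca_data = np.vstack([ipca.transform(data[i:i + chunk_size])
                      for i in range(0, len(data), chunk_size)])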

Best way in Pytorch to upsample a Tensor and transform it to rgb?

For a nice output in TensorBoard I want to show a batch of input images, the corresponding target masks, and the output masks in a grid.
The input images have a different size than the masks. Furthermore, the images are obviously RGB.
From a batch of e.g. 32 or 64 I only want to show the first 4 images.
After some fiddling around I came up with the following example code. Good thing: it works.
But I am really not sure if I missed something in PyTorch. It just looks much longer than I expected. Especially the upsampling and transformation to RGB seem wild, but the other transformations I found would not work for a whole batch.
import torch
from torch.autograd import Variable
import torch.nn.functional as FN
import torchvision.utils as vutils
from tensorboardX import SummaryWriter
import time
batch = 32
i_size = 192
o_size = 112
nr_imgs = 4
# Tensorboard init
writer = SummaryWriter('runs/' + time.strftime('%Y%m%d_%H%M%S'))
input_image=Variable(torch.rand(batch,3,i_size,i_size))
target_mask=Variable(torch.rand(batch,o_size,o_size))
output_mask=Variable(torch.rand(batch,o_size,o_size))
# upsample target_mask, add dim to have gray2rgb
tm = FN.upsample(target_mask[:nr_imgs,None], size=[i_size, i_size], mode='bilinear')
tm = torch.cat( (tm,tm,tm), dim=1) # grayscale plane to rgb
# upsample target_mask, add dim to have gray2rgb
om = FN.upsample(output_mask[:nr_imgs,None], size=[i_size, i_size], mode='bilinear')
om = torch.cat( (om,om,om), dim=1) # grayscale plane to rgb
# add up all images and make grid
imgs = torch.cat( ( input_image[:nr_imgs].data, tm.data, om.data ) )
x = vutils.make_grid(imgs, nrow=nr_imgs, normalize=True, scale_each=True)
# Tensorboard img output
writer.add_image('Image', x, 0)
EDIT: Found this on PyTorch's issues list. It's about batch support for transforms. It seems there are no plans to add batch transforms in the future, so my current code might be the best solution for the time being, anyway?
Maybe you can just convert your tensors to NumPy arrays (.data.cpu().numpy()) and use OpenCV to do the upsampling? The OpenCV implementation should be quite fast.
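A rough sketch of that suggestion for one mask (my own illustration, not tested against the original pipeline); whether it is actually faster than the F.upsample version would need measuring:
import cv2
import numpy as np
import torch

i_size = 192
mask = torch.rand(112, 112)                              # one grayscale mask
mask_np = mask.data.cpu().numpy()                        # tensor -> NumPy array
mask_up = cv2.resize(mask_np, (i_size, i_size), interpolation=cv2.INTER_LINEAR)
mask_rgb = np.repeat(mask_up[:, :, None], 3, axis=2)     # gray -> 3 channels
mask_rgb = torch.from_numpy(mask_rgb).permute(2, 0, 1)   # back to a CxHxW tensor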
