I am trying to add random zoom to my images, which are TIFF files with 128x160 resolution and 1 channel, but the new version of random zoom for Keras/TensorFlow has me confused: I don't understand the tuple format it expects as the zoom_range argument.
From the documentation.
tf.keras.preprocessing.image.random_zoom(
x, zoom_range, row_axis=1, col_axis=2, channel_axis=0, fill_mode='nearest',
cval=0.0, interpolation_order=1
)
I need to add some random zoom to my image, and I am trying it like this:
zoom_range = ((0.4, 0.4))
img = tf.keras.preprocessing.image.random_zoom(
img, zoom_range, row_axis=1, col_axis=2, channel_axis=0, fill_mode='nearest',
cval=0.0, interpolation_order=1
)
The output is:
TypeError: float() argument must be a string or a number, not 'NoneType'
How exactly should I pass a random zoom amount as a parameter for my images?
Public kaggle notebook here:
https://www.kaggle.com/puelon/notebook75c416766a
TypeError: in user code:
<ipython-input-4-9ba0455797a4>:17 load *
img = tf.keras.preprocessing.image.random_zoom(img, zoom_range, row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest')
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:153 random_zoom *
x = apply_affine_transform(x, zx=zx, zy=zy, channel_axis=channel_axis,
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:321 apply_affine_transform *
transform_matrix = transform_matrix_offset_center(
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:246 transform_matrix_offset_center *
o_x = float(x) / 2 + 0.5
/opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/operators/py_builtins.py:195 float_ **
return _py_float(x)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/operators/py_builtins.py:206 _py_float
return float(x)
TypeError: float() argument must be a string or a number, not 'NoneType'
TypeError Traceback (most recent call last)
<ipython-input-4-9ba0455797a4> in <module>
27 train1, train2, test1 = d
28 train_ds = tf.data.Dataset.from_tensor_slices(train1 + train2).\
---> 29 shuffle(len(train1) + len(train2)).map(load).batch(4)
30 test_ds = tf.data.Dataset.from_tensor_slices(test1).\
31 shuffle(len(test1)).map(load).batch(4)
for i in range(len(groups)):
    d = deque(groups)
    d.rotate(i)
    train1, train2, test1 = d
    train_ds = tf.data.Dataset.from_tensor_slices(train1 + train2).\
        shuffle(len(train1) + len(train2)).map(load).batch(4)
    test_ds = tf.data.Dataset.from_tensor_slices(test1).\
        shuffle(len(test1)).map(load).batch(4)
Probably your img is of the wrong object type. For the random_zoom(...) function you need to provide the input as a tensor or a 3D NumPy array of shape (height, width, channels), i.e. for an RGB image of size 300x200 the array should have shape (200, 300, 3). Such a NumPy array can be obtained, for example, with the PIL library as in the code below.
Also, if you're working with TF code then you're dealing with tensors, but random_zoom needs to know all dimensions as concrete integer sizes. Tensors may have a None size for some dimensions if they are unknown at graph construction time, and this is probably what causes the NoneType error in your case. To overcome this you need to wrap the random_zoom usage in a NumPy function interface; this forces the function input to be a NumPy array instead of a tensor, and NumPy arrays always have all dimensions with known sizes. I've implemented this wrapping in the code below too.
You also probably need to change row_axis=1, col_axis=2, channel_axis=0 to row_axis=0, col_axis=1, channel_axis=2, because the channels (colors) usually go in the last (least significant) dimension.
Documentation for tf.keras.preprocessing.image.random_zoom.
I've implemented simple working code below.
# Needs: python -m pip install tensorflow numpy pillow requests
import tensorflow as tf, numpy as np, PIL.Image, requests, io
tf.compat.v1.enable_eager_execution()
zoom_range = (0.4, 0.5)
img = PIL.Image.open(io.BytesIO(requests.get('https://i.stack.imgur.com/Fc3Jb.png').content))
#img = PIL.Image.open('Ruler-Big-Icon-PNG.png')
img = np.array(img)
img = tf.convert_to_tensor(img) # This line is not needed if you already have a tensor.
# You need only this single line of code to fix your issue!
img = tf.numpy_function(lambda img: tf.keras.preprocessing.image.random_zoom(
    img, zoom_range, row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest',
).astype(np.float32), [img], tf.float32)  # cast so the output matches the declared tf.float32
img = np.array(img)  # This line is not needed if you plan for img to stay a tensor further on.
# Example output is https://i.stack.imgur.com/MWk9T.png
PIL.Image.fromarray(img.astype(np.uint8)).save('result.png')  # cast back to uint8 so PIL can save it
I'm generally not convinced that using Keras preprocessing functions outside of where they belong is the right approach. A simple way would be to use tf.image.random_crop. Assuming your image is larger than (200, 200, 3), you can just use this line:
img = tf.image.random_crop(img, (200, 200, 3))
Let's try an example. Original image:
import tensorflow as tf
import skimage
import matplotlib.pyplot as plt
import numpy as np
X = np.stack([skimage.data.chelsea() for _ in range(10)])
ds = tf.data.Dataset.from_tensor_slices(X).\
map(lambda x: tf.image.random_crop(x, (200, 200, 3)))
plt.imshow(next(iter(ds)))
plt.show()
Randomly cropped image of size (200, 200, 3):
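If what you actually want is a zoom rather than a fixed-size crop, a similar effect can be had by cropping a random fraction of the image and resizing it back to the original size. Below is a minimal sketch of that idea using only tf.image ops; the function name random_zoom_tf and the 128x160 target size are my own assumptions, and this is not equivalent to the Keras random_zoom implementation.

import tensorflow as tf

def random_zoom_tf(img, zoom_range=(0.6, 1.0), out_size=(128, 160)):
    # Pick a random crop fraction, crop, then resize back to the target size.
    frac = tf.random.uniform([], zoom_range[0], zoom_range[1])
    h = tf.cast(tf.cast(out_size[0], tf.float32) * frac, tf.int32)
    w = tf.cast(tf.cast(out_size[1], tf.float32) * frac, tf.int32)
    img = tf.image.random_crop(img, tf.stack([h, w, tf.shape(img)[-1]]))
    return tf.image.resize(img, out_size)

Such a function can be dropped into a Dataset.map call in the same way as random_crop above.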
Related
I am trying to make a model which recognises the emotions of a human. My code and RAM usage are just fine at the start, but when I try to normalise my images, RAM usage jumps up drastically and then Colab just crashes.
This is the code block which is causing colab to crash:
import os
import matplotlib.pyplot as plt
import cv2
data = []
for emot in os.listdir('./data/'):
    for file_ in os.listdir(f'./data/{emot}'):
        img = cv2.imread(f'./data/{emot}/{file_}', 0)
        img = cv2.bitwise_not(img)
        img /= 255.0  # <--- This is the line that causes colab to crash
        data.append([img, emotions.index(emot)])
If I remove the img /= 255.0 line, it doesn't crash, but then my images are not normalised:
I even tried normalising them in another block:
for i in range(len(data)):
    data[i][0] = np.array(data[i][0]) / 255.0
but it doesn't work and still crashes.
I would like to go through an example. Firstly let's have a look at the following code.
import numpy as np
x = np.random.randint(0, 255, size=(100, 32, 32), dtype=np.int16)
print('Present data type', x.dtype)
# What you did
y = x/255
print('Present data type', y.dtype)
# What you should do
z = (x/255).astype(np.float16)
print('Present data type', z.dtype)
Output:
Present data type int16
Present data type float64
Present data type float16
If you look closely, when I divide the x variable and declare y = x/255, the data type changes to float64. Dividing an int NumPy array typecasts the result to float64 by default, and float64 takes up considerably more memory. Therefore, when dividing an int NumPy matrix on a large dataset, you should always typecast to a shorter data type.
If your code runs fine without the img /= 255.0 line, then this is your case. After dividing, you should typecast the img variable to the smallest float type that works, such as np.float16 or np.float32. However, np.float16 has some limitations and is not fully supported by TensorFlow (TF converts it to a 32-bit float), so you may prefer the np.float32 data type.
Therefore, instead of the in-place img /= 255.0, divide and cast in one step, e.g. img = (img / 255.0).astype(np.float16) or .astype(np.float32).
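To see why the dtype matters for memory, here is a quick sketch; the array size is made up purely for illustration:

import numpy as np

x = np.random.randint(0, 255, size=(10000, 48, 48), dtype=np.uint8)
print(x.nbytes / 1e6)                                 # ~23 MB as uint8
print((x / 255.0).nbytes / 1e6)                       # ~184 MB as float64
print((x / 255.0).astype(np.float16).nbytes / 1e6)    # ~46 MB as float16

Note that the float64 intermediate still exists briefly even when you cast afterwards, which is why casting each image individually inside the loop (as in the modified code below) keeps the peak usage small instead of holding a float64 copy of the whole dataset at once.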
The modified version of the code is given,
import os
import matplotlib.pyplot as plt
import cv2
import numpy as np

data = []
for emot in os.listdir('./data/'):
    for file_ in os.listdir(f'./data/{emot}'):
        img = cv2.imread(f'./data/{emot}/{file_}', 0)
        img = cv2.bitwise_not(img)
        img = (img/255.0).astype(np.float16)  # <--- This is the suggestion
        data.append([img, emotions.index(emot)])
Assuming the next step in your pipeline is to create a tf.data.Dataset object from your image corpus, you can use Dataset.map() to move your preprocessing into the data-loading pipeline and save memory. TensorFlow has a well-documented guide on how to do this here -> https://www.tensorflow.org/guide/data#preprocessing_data
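As a rough illustration of that idea, here is a sketch that loads and normalises images lazily inside the pipeline instead of holding float copies of everything in RAM; the glob pattern and the PNG decode call are assumptions about how your files are stored, so adapt them to your layout.

import tensorflow as tf

paths = tf.data.Dataset.list_files('./data/*/*.png')   # hypothetical layout

def load_and_normalize(path):
    img = tf.io.read_file(path)
    img = tf.io.decode_png(img, channels=1)
    img = tf.image.convert_image_dtype(img, tf.float32)  # scales to [0, 1]
    return img

ds = paths.map(load_and_normalize,
               num_parallel_calls=tf.data.experimental.AUTOTUNE).batch(32)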
dataset = tf.data.Dataset.from_tensor_slices((images,boxes))
function_to_map = lambda x,y: func3(x,y)
fast_benchmark(dataset.map(function_to_map).batch(1).prefetch(tf.data.experimental.AUTOTUNE))
And here is the fast_benchmark function:
def fast_benchmark(dataset, num_epochs=2):
    start_time = time.perf_counter()
    print('dataset->', dataset)
    for _ in tf.data.Dataset.range(num_epochs):
        for _, __ in dataset:
            print(_, __)
            break
        pass
The output of the print is:
tf.Tensor([b'/media/jake/mark-4tb3/input/datasets/pascal/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/JPEGImages/2008_000008.jpg'], shape=(1,), dtype=string) <tf.RaggedTensor [[[52, 86, 470, 419], [157, 43, 288, 166]]]>
What I want to do in func3() is replace the image path with the actual image and run the batch.
You need to extract the string from the tensor and use an appropriate image-reading function. Below are the steps implemented in the code to achieve this.
1. Decorate the map function with tf.py_function(get_path, [x], [tf.float32]). You can find more about tf.py_function here. In tf.py_function, the first argument is the map function, the second argument is the element to be passed to the map function, and the final argument is the return type.
2. Get the string part using bytes.decode(file_path.numpy()) inside the map function.
3. Use an appropriate function to load your image; we are using load_img.
In the simple program below, we use tf.data.Dataset.list_files to read the path of the image. Then, in the map function, we read the image using load_img and apply tf.image.central_crop to crop the central part of the image.
Code -
%tensorflow_version 2.x
import tensorflow as tf
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array, array_to_img
from matplotlib import pyplot as plt
import numpy as np
def load_file_and_process(path):
    image = load_img(bytes.decode(path.numpy()), target_size=(224, 224))
    image = img_to_array(image)
    image = tf.image.central_crop(image, np.random.uniform(0.50, 1.00))
    return image
train_dataset = tf.data.Dataset.list_files('/content/bird.jpg')
train_dataset = train_dataset.map(lambda x: tf.py_function(load_file_and_process, [x], [tf.float32]))
for f in train_dataset:
    for l in f:
        image = np.array(array_to_img(l))
        plt.imshow(image)
Output -
Hope this answers your question. Happy Learning.
When I tried to use tf.reduce_mean to compute the mean value of an image over all axes, it showed "An error occurred while starting the kernel".
Here is my code:
import tensorflow as tf
import numpy as np
imgdata = tf.read_file("./test_img/00000001.jpg")
my_img = tf.image.decode_jpeg(imgdata)
image = tf.reduce_mean(my_img)
tf.Session().run(image)
I tried running a session to get my_img before computing the mean; that didn't work either. If I compute the mean value of an array I create myself with the same shape, it works fine:
my_array = np.random.randn(720, 1280, 3)
tf.Session().run(tf.reduce_mean(my_array))
It has been solved by converting the data type from uint8 to float32:
my_img = 255.0 * tf.image.convert_image_dtype(my_img, tf.float32)
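For completeness, a minimal sketch of the whole snippet with that fix applied, assuming the same TF 1.x graph-mode setup and file path as in the question:

import tensorflow as tf

imgdata = tf.read_file("./test_img/00000001.jpg")
my_img = tf.image.decode_jpeg(imgdata)
# Convert to float before reducing; reducing a uint8 tensor keeps the integer dtype,
# which can overflow during accumulation.
my_img = 255.0 * tf.image.convert_image_dtype(my_img, tf.float32)
mean_val = tf.reduce_mean(my_img)

with tf.Session() as sess:
    print(sess.run(mean_val))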
For a nice output in Tensorboard I want to show a batch of input images, corresponding target masks and output masks in a grid.
The input images have a different size than the masks. Furthermore, the images are obviously RGB.
From a batch of e.g. 32 or 64 I only want to show the first 4 images.
After some fiddling around I came up with the following example code. Good thing: It works.
But I am really not sure if I missed something in PyTorch. It just looks much longer than I expected. Especially the upsampling and the transformation to RGB seem unwieldy. But the other transformations I found would not work for a whole batch.
import torch
from torch.autograd import Variable
import torch.nn.functional as FN
import torchvision.utils as vutils
from tensorboardX import SummaryWriter
import time
batch = 32
i_size = 192
o_size = 112
nr_imgs = 4
# Tensorboard init
writer = SummaryWriter('runs/' + time.strftime('%Y%m%d_%H%M%S'))
input_image=Variable(torch.rand(batch,3,i_size,i_size))
target_mask=Variable(torch.rand(batch,o_size,o_size))
output_mask=Variable(torch.rand(batch,o_size,o_size))
# upsample target_mask, add dim to have gray2rgb
tm = FN.upsample(target_mask[:nr_imgs,None], size=[i_size, i_size], mode='bilinear')
tm = torch.cat( (tm,tm,tm), dim=1) # grayscale plane to rgb
# upsample target_mask, add dim to have gray2rgb
om = FN.upsample(output_mask[:nr_imgs,None], size=[i_size, i_size], mode='bilinear')
om = torch.cat( (om,om,om), dim=1) # grayscale plane to rgb
# add up all images and make grid
imgs = torch.cat( ( input_image[:nr_imgs].data, tm.data, om.data ) )
x = vutils.make_grid(imgs, nrow=nr_imgs, normalize=True, scale_each=True)
# Tensorboard img output
writer.add_image('Image', x, 0)
EDIT: Found this on PyTorch's issues list. It's about batch support for Transform. It seems there are no plans to add batch transforms in the future, so my current code might be the best solution for the time being anyway?
Maybe you can just convert your tensors to NumPy arrays (.data.cpu().numpy()) and use OpenCV to do the upsampling? The OpenCV implementation should be quite fast.
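A minimal sketch of what that could look like for the masks from the question (reusing target_mask, nr_imgs and i_size from the code above; cv2.INTER_LINEAR is just one reasonable choice of interpolation):

import cv2
import numpy as np
import torch

masks_np = target_mask[:nr_imgs].data.cpu().numpy()            # (nr_imgs, o_size, o_size)
up = np.stack([cv2.resize(m, (i_size, i_size), interpolation=cv2.INTER_LINEAR)
               for m in masks_np])
tm = torch.from_numpy(up).unsqueeze(1).repeat(1, 3, 1, 1)      # grayscale -> 3 channels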
I am using the MICCAI BRATS 2015 database containing 3D MRI images of the dimensions 155x240x240.
I wanted to perform intensity standardization on these images, and am trying to use the IntensityRangeStandardization class from medpy.filter.
The code is simple:
Load 20 flair images from the database into an array:
from glob import glob
import SimpleITK as sitk
from medpy.filter import IntensityRangeStandardization

pth = 'C:/BRats2015/HGG'  # path to the directory
flair = glob(pth + '/*/*Flair*/*.mha')  # paths to all images
flair = flair[:20]  # choose 20 images

# load the 20 images in sitk format
im = []
for i in flair:
    im.append(sitk.ReadImage(i))

# convert them into numpy arrays
for i in xrange(len(im)):
    im[i] = sitk.GetArrayFromImage(im[i])

# initialize the filter
normalizer = IntensityRangeStandardization()

# train on and transform the images
im_n = normalizer.train_transform(im)[1]  # the second returned value contains the new images, hence [1]
I get the following error message:
File "intensity_range_standardization.py", line 268, in train
self.__stdrange = self.__compute_stdrange(images)
File "intensity_range_standardization.py", line 451, in __compute_stdrange
raise SingleIntensityAccumulationError('Image no.{} shows an unusual single-intensity accumulation that leads to a situation where two percentile values are equal. This situation is usually caused, when the background has not been removed from the image. Another possibility would be to reduce the number of landmark percentiles landmarkp or to change their distribution.'.format(idx))
SingleIntensityAccumulationError: Image no.0 shows an unusual single-intensity accumulation that leads to a situation where two percentile values are equal. This situation is usually caused, when the background has not been removed from the image. Another possibility would be to reduce the number of landmark percentiles landmarkp or to change their distribution.
Okay, I figured out how to call the train_transform function when we are given images and their respective masks. Here's the code, based on the medpy GitHub repo.
Reshaping the images should be easy, but I'll still just post the link to the code in case of any confusion : Reshape the new images
The full code that worked for me:
images = [img1, img2, img3]
# each image is a numpy array of shape (150, 150)
masks = [i > 0 for i in images]

norm0 = IntensityRangeStandardization()
trained_model, transformed_images = norm0.train_transform([i[m] for i, m in zip(images, masks)])

norm_images = []
for ti, i, m in zip(transformed_images, images, masks):
    i[m] = ti
    norm_images.append(i)
To train and then transform one after the other:
norm_images = []
trained_model = norm0.train([i[m] for i, m in zip(images, masks)])
transformed_images = [trained_model.transform(i[m], surpress_mapping_check=False) for i, m in zip(images, masks)]
for ti, i, m in zip(transformed_images, images, masks):
    i[m] = ti
    norm_images.append(i)
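Since the trained IntensityRangeStandardization object is a plain Python object, one option for reusing it on new images later is to pickle it; the file name and the new_image variable below are just placeholders:

import pickle

with open('intensity_model.pkl', 'wb') as f:
    pickle.dump(trained_model, f)          # persist the trained model

with open('intensity_model.pkl', 'rb') as f:
    model = pickle.load(f)

new_mask = new_image > 0                   # same masking convention as above
new_image[new_mask] = model.transform(new_image[new_mask], surpress_mapping_check=False)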