dataset = tf.data.Dataset.from_tensor_slices((images, boxes))
function_to_map = lambda x, y: func3(x, y)
fast_benchmark(dataset.map(function_to_map).batch(1).prefetch(tf.data.experimental.AUTOTUNE))
And here is fast_benchmark:
def fast_benchmark(dataset, num_epochs=2):
    start_time = time.perf_counter()
    print('dataset ->', dataset)
    for _ in tf.data.Dataset.range(num_epochs):
        for path, boxes in dataset:
            print(path, boxes)
            break
The output of the print is:
tf.Tensor([b'/media/jake/mark-4tb3/input/datasets/pascal/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/JPEGImages/2008_000008.jpg'], shape=(1,), dtype=string) <tf.RaggedTensor [[[52, 86, 470, 419], [157, 43, 288, 166]]]>
What I want to do in func3():
I want to convert the image path into the actual image and then run the batch.
You need to extract the string from the tensor and use an appropriate image-reading function. Below are the steps implemented in the code to achieve this.
You have to wrap the map function with tf.py_function(load_file_and_process, [x], [tf.float32]). You can find more about tf.py_function here. In tf.py_function, the first argument is the map function, the second argument is the element to be passed to it, and the final argument is the return type.
You can get the string part by using bytes.decode(file_path.numpy()) inside the map function.
Use an appropriate function to load your image; we are using load_img.
In the simple program below, we use tf.data.Dataset.list_files to read the path of the image. In the map function we read the image using load_img and then apply tf.image.central_crop to crop the central part of the image.
Code -
%tensorflow_version 2.x
import tensorflow as tf
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array, array_to_img
from matplotlib import pyplot as plt
import numpy as np

def load_file_and_process(path):
    # Decode the file-path tensor to a Python string, then load and resize the image.
    image = load_img(bytes.decode(path.numpy()), target_size=(224, 224))
    image = img_to_array(image)
    # Crop a random central fraction of the image.
    image = tf.image.central_crop(image, np.random.uniform(0.50, 1.00))
    return image

train_dataset = tf.data.Dataset.list_files('/content/bird.jpg')
train_dataset = train_dataset.map(lambda x: tf.py_function(load_file_and_process, [x], [tf.float32]))

for f in train_dataset:
    for l in f:
        image = np.array(array_to_img(l))
        plt.imshow(image)
Output -
Hope this answers your question. Happy Learning.
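Applied to the original question, func3 could look roughly like the following sketch. The target_size=(224, 224) is an arbitrary assumption, and images, boxes and fast_benchmark are assumed to be exactly as defined in the question; the boxes are simply passed through unchanged.

import tensorflow as tf
from keras.preprocessing.image import load_img, img_to_array

def load_image(path):
    # path arrives as a scalar string tensor inside tf.py_function,
    # so bytes.decode(path.numpy()) is allowed here.
    image = load_img(bytes.decode(path.numpy()), target_size=(224, 224))
    return img_to_array(image)

def func3(path, boxes):
    image = tf.py_function(load_image, [path], tf.float32)
    return image, boxes

dataset = tf.data.Dataset.from_tensor_slices((images, boxes))
fast_benchmark(dataset.map(func3).batch(1).prefetch(tf.data.experimental.AUTOTUNE))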
Below is the data augmentation function that I created.
import tensorflow as tf
import tensorflow_addons as tfa

def augment_data(ds):
    seed = tf.random.Generator.from_seed(1).normal([])
    seed_2d = (1, 2)

    # flipped images
    ds_flipped = ds.map(lambda img, lbl: (tf.image.flip_left_right(img), lbl))

    # induce random brightness
    ds_rnb = ds.map(lambda img, lbl:
                    (tf.image.stateless_random_brightness(img,
                                                          max_delta=0.65,
                                                          seed=seed_2d),
                     lbl))
    print('ds_flipped, ds_rnb ran successfully')

    # centre crop
    ds_cc = ds.map(lambda img, lbl:
                   (tf.image.central_crop(img,
                                          central_fraction=0.8),
                    lbl))

    # random zoom
    ds_ran_zoom = ds.map(lambda img, lbl:
                         (tf.keras.preprocessing.image.random_zoom(img,
                                                                   zoom_range=(.30, .70)),
                          lbl))

    return ds_flipped, ds_rnb, ds_cc, ds_ran_zoom
The functions for flipped images and random brightness work fine, but tf.image.central_crop and tf.keras.preprocessing.image.random_zoom do not.
Calling augment_data(ds) gives the following errors.
Running tf.image.central_crop gives me an error:
ValueError: image should either be a Tensor with rank = 3 or
rank = 4. Had rank = None.
Running tf.keras.preprocessing.image.random_zoom gives me an error:
in transform_matrix_offset_center *
o_x = float(x) / 2 + 0.5
TypeError: float() argument must be a string or a number, not 'NoneType'
But if I run the central_crop function without using map, then the code below works fine

for image, label in train_data:
    _ = tf.image.central_crop(image, central_fraction=0.8)
print('tf.image.central_crop ran successfully')

and outputs

tf.image.central_crop ran successfully
If we run tf.keras.preprocessing.image.random_zoom in the same way, then we get the error

for image, label in train_data:
    _ = tf.keras.preprocessing.image.random_zoom(image, zoom_range=(.30, .70))

RuntimeError: affine matrix has wrong number of rows

Running tf.keras.preprocessing.image.random_zoom requires un-batching the dataset, so the code below works fine:

for image, label in train_data.unbatch().take(1):
    _ = tf.keras.preprocessing.image.random_zoom(image, zoom_range=(.30, .70))
print('tf.keras.preprocessing.image.random_zoom ran successfully')
I have created a Google Colab notebook to replicate the issue.
What is the best way to run a TensorFlow function on a tf dataset using map?
How can I tell whether a given function can be run on a tf dataset using map?
How can I create a function that runs on both batched and un-batched datasets?
As you can see above, most of the functions run fine on a single image, but when run through map, different functions throw different errors.
The problem is that the shape of the images and labels is unknown. You should call set_shape at the end of the read_tfrecord function, e.g. decoded_image.set_shape([img_x, img_y, channels]), and do the same for the label.
If you set the image and label shapes in the dataset, most TensorFlow functions will work when applied with map, both batched and unbatched.
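A minimal sketch of such a parse function (the feature keys, image size, decode format and label type are hypothetical; adjust them to match how the TFRecords were written):

import tensorflow as tf

IMG_X, IMG_Y, CHANNELS = 224, 224, 3  # hypothetical fixed image dimensions

def read_tfrecord(serialized_example):
    # Hypothetical feature keys; use the ones your records were written with.
    features = tf.io.parse_single_example(
        serialized_example,
        {'image': tf.io.FixedLenFeature([], tf.string),
         'label': tf.io.FixedLenFeature([], tf.int64)})
    decoded_image = tf.io.decode_jpeg(features['image'], channels=CHANNELS)
    decoded_image = tf.image.resize(decoded_image, [IMG_Y, IMG_X])
    # The key step: give the tensors fully known static shapes so that ops
    # such as tf.image.central_crop see rank-3 input when applied via map.
    decoded_image.set_shape([IMG_Y, IMG_X, CHANNELS])
    label = features['label']
    label.set_shape([])
    return decoded_image, label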
tf.keras.preprocessing.image.random_zoom is a problem because it only takes a 3D tensor as input and outputs a numpy array; this particular transformation is problematic.
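One common workaround (not part of the answer above, just a sketch) is to wrap the problematic call in tf.py_function inside the map, so it runs eagerly on a concrete array whose dimensions are all known; this assumes ds yields un-batched (image, label) pairs as in the question:

import numpy as np
import tensorflow as tf

def random_zoom_np(img):
    # Inside tf.py_function the image is a concrete EagerTensor, so its
    # shape is fully known and .numpy() is available.
    out = tf.keras.preprocessing.image.random_zoom(
        img.numpy(), zoom_range=(.30, .70),
        row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest')
    return out.astype(np.float32)

ds_ran_zoom = ds.map(
    lambda img, lbl: (tf.py_function(random_zoom_np, [img], tf.float32), lbl))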
I am trying to follow this tutorial here. When I select my content image and style image and try to use the imshow() function, I get this error:
ValueError: pic should be 2/3 dimensional. Got 4 dimensions.
Using Google, I have not been able to find a remedy for this problem.
Here is my code:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from PIL import Image
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
import torchvision.models as models
import copy
import numpy as np

# This detects if cuda is available for GPU training otherwise will use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

# Desired size of the output image
imsize = 512 if torch.cuda.is_available() else 256
print(imsize)

# Helper function
def image_loader(image_name, imsize):
    # Scale the imported image and transform it into a torch tensor
    loader = transforms.Compose([transforms.Resize(imsize), transforms.ToTensor()])
    image = Image.open(image_name)
    # Fake batch dimension required to fit network's input dimension
    image = loader(image).unsqueeze(0)
    return image.to(device, torch.float)

# Helper function to show the tensor as a PIL image
def imshow(tensor, title=None):
    unloader = transforms.ToPILImage()
    image = tensor.cpu().clone()
    image = unloader(image)
    plt.imshow(image)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # Pause so that the plots are updated

# Loading of images
image_directory = './images/'
style_img = image_loader(image_directory + "pb.jpg", imsize)
content_img = image_loader(image_directory + "content.jpg", imsize)

assert style_img.size() == content_img.size(), "we need to import style and content images of the same size"

plt.figure()
imshow(style_img, title='style image')
Any suggestions would be really helpful.
Here is the style and content image for reference:
matplotlib.pyplot's imshow function expects either a 2D array (grayscale, dimensions (H, W)) or a 3D array (color, dimensions (H, W, color channel)).
You probably still have the batch size as the first dimension of your tensor, because in your code you do:

# Fake batch dimension required to fit network's input dimension
image = loader(image).unsqueeze(0)

which adds this first dimension. If so, try either:
plt.imshow(np.squeeze(image))
or
plt.imshow(image[0])
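Applied to the imshow helper from the question (reusing the transforms and plt imports already defined there), the same fix is to drop the fake batch dimension before converting to a PIL image, roughly:

def imshow(tensor, title=None):
    unloader = transforms.ToPILImage()
    # Remove the fake batch dimension added by unsqueeze(0) before converting.
    image = tensor.cpu().clone().squeeze(0)
    image = unloader(image)
    plt.imshow(image)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)  # Pause so that the plots are updated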
I am trying to add random zoom to my images, which are TIFF files with 128x160 resolution and 1 channel, but the new version of random zoom for Keras/TensorFlow has me confused; I don't understand the tuple format it expects as the zoom_range argument.
From the documentation:
tf.keras.preprocessing.image.random_zoom(
    x, zoom_range, row_axis=1, col_axis=2, channel_axis=0, fill_mode='nearest',
    cval=0.0, interpolation_order=1
)
I need to add some random zoom to my image, and I am trying it like this:
zoom_range = ((0.4, 0.4))
img = tf.keras.preprocessing.image.random_zoom(
    img, zoom_range, row_axis=1, col_axis=2, channel_axis=0, fill_mode='nearest',
    cval=0.0, interpolation_order=1
)
The output is:
TypeError: float() argument must be a string or a number, not
'NoneType'
How exactly should I pass a random zoom amount to my images as a parameter?
Public Kaggle notebook here:
https://www.kaggle.com/puelon/notebook75c416766a
TypeError: in user code:
<ipython-input-4-9ba0455797a4>:17 load *
img = tf.keras.preprocessing.image.random_zoom(img, zoom_range, row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest')
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:153 random_zoom *
x = apply_affine_transform(x, zx=zx, zy=zy, channel_axis=channel_axis,
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:321 apply_affine_transform *
transform_matrix = transform_matrix_offset_center(
/opt/conda/lib/python3.7/site-packages/keras_preprocessing/image/affine_transformations.py:246 transform_matrix_offset_center *
o_x = float(x) / 2 + 0.5
/opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/operators/py_builtins.py:195 float_ **
return _py_float(x)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/autograph/operators/py_builtins.py:206 _py_float
return float(x)
TypeError: float() argument must be a string or a number, not 'NoneType'
TypeError Traceback (most recent call last)
<ipython-input-4-9ba0455797a4> in <module>
27 train1, train2, test1 = d
28 train_ds = tf.data.Dataset.from_tensor_slices(train1 + train2).\
---> 29 shuffle(len(train1) + len(train2)).map(load).batch(4)
30 test_ds = tf.data.Dataset.from_tensor_slices(test1).\
31 shuffle(len(test1)).map(load).batch(4)
for i in range(len(groups)):
    d = deque(groups)
    d.rotate(i)
    train1, train2, test1 = d
    train_ds = tf.data.Dataset.from_tensor_slices(train1 + train2).\
        shuffle(len(train1) + len(train2)).map(load).batch(4)
    test_ds = tf.data.Dataset.from_tensor_slices(test1).\
        shuffle(len(test1)).map(load).batch(4)
Probably your img is of the wrong object type. The random_zoom(...) function needs its input as a tensor or a 3D numpy array of shape (height, width, channels), i.e. for an RGB image of size 300x200 the array should have shape (200, 300, 3). Such a numpy array can be obtained, for example, with the PIL library as in the code below.
Also, if you're working with TF code then you're dealing with tensors, and random_zoom needs to know all dimensions and their integer sizes. Tensors may have a None size for some dimensions if they are unknown at graph-construction time, and this is probably what causes the NoneType error in your case. To overcome this you need to wrap the random_zoom usage in a numpy function interface; this forces the function's input to be a numpy array instead of a tensor, and numpy arrays always have all dimensions with known sizes. I've implemented this wrapping in my code below too.
You also probably need to change row_axis=1, col_axis=2, channel_axis=0 to row_axis=0, col_axis=1, channel_axis=2, because the channels (colors) usually go in the last (least significant) dimension.
Documentation for tf.keras.preprocessing.image.random_zoom.
I've implemented simple code below that works.
The input in the code looks like this:
The output looks like this:
The code below can also be run online here.
# Needs: python -m pip install tensorflow numpy pillow requests
import tensorflow as tf, numpy as np, PIL.Image, requests, io

tf.compat.v1.enable_eager_execution()

zoom_range = (0.4, 0.5)

img = PIL.Image.open(io.BytesIO(requests.get('https://i.stack.imgur.com/Fc3Jb.png').content))
#img = PIL.Image.open('Ruler-Big-Icon-PNG.png')
img = np.array(img)

img = tf.convert_to_tensor(img)  # This line is not needed if you already have a tensor.

# You need only this single line of code to fix your issue!
img = tf.numpy_function(lambda img: tf.keras.preprocessing.image.random_zoom(
    img, zoom_range, row_axis=0, col_axis=1, channel_axis=2, fill_mode='nearest',
), [img], tf.float32)

img = np.array(img)  # This line is not needed if you plan to use img as a tensor further.

# Example output is https://i.stack.imgur.com/MWk9T.png
PIL.Image.fromarray(img).save('result.png')
I'm generally not convinced that using Keras preprocessing functions outside of where they belong is the right approach. A simple way would be to use tf.image.random_crop. Assuming your image is larger than (200, 200, 3), you can just use this line:
img = tf.image.random_crop(img, (200, 200, 3))
Let's try an example. Original image:
import tensorflow as tf
import skimage
import matplotlib.pyplot as plt
import numpy as np
X = np.stack([skimage.data.chelsea() for _ in range(10)])
ds = tf.data.Dataset.from_tensor_slices(X).\
map(lambda x: tf.image.random_crop(x, (200, 200, 3)))
plt.imshow(next(iter(ds)))
plt.show()
Randomly cropped image of size (200, 200, 3):
I create a tensorflow dataset of the filenames of many images in a folder. The images are named [index].jpg, where index is an integer used to identify them. I have a dictionary mapping the string 'index' to label tuples. How, using tf.data.Dataset.map, can I map the index to a label tuple?
Here's the map_func I am trying to pass to the map function:
def grabImages(filepath):
    index = getIndexFromFilePath(filepath)
    img = tf.io.read_file(filepath)
    img = translateImage(img)
    dictionary = getLabelDictionary()
    return index, img
Here dictionary is the index-to-labels dict, index is the index part of the file path as a tf.Tensor, and img is a preprocessed image that was at the file path.
This returns a dataset with the index, as a tensor, mapped to the corresponding image. Is there a way to get the labels for the index from dictionary, using something like dictionary[index]? Basically, I want to get the string content of index.
I have tried using .numpy() and .eval() with the current session inside the grabImages function, but neither works.
Here is an example of how to get the string part of a tensor in a tf.data.Dataset.map function.
Below are the steps I have implemented in the code to achieve this.
You have to wrap the map function with tf.py_function(get_path, [x], [tf.string]). You can find more about tf.py_function here.
You can get the string part by using bytes.decode(file_path.numpy()) inside the map function.
Code -
%tensorflow_version 2.x
import tensorflow as tf
import numpy as np

def get_path(file_path):
    print("file_path: ", bytes.decode(file_path.numpy()), type(bytes.decode(file_path.numpy())))
    return file_path

train_dataset = tf.data.Dataset.list_files('/content/bird.jpg')
train_dataset = train_dataset.map(lambda x: tf.py_function(get_path, [x], [tf.string]))

for one_element in train_dataset:
    print(one_element)
Output -
file_path: /content/bird.jpg <class 'str'>
(<tf.Tensor: shape=(), dtype=string, numpy=b'/content/bird.jpg'>,)
Hope this answers your question.
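Adapted to the question above, a sketch of grabImages that uses the decoded path to look up the label could look like this. The label dictionary, file pattern and JPEG decoding here are hypothetical placeholders standing in for getLabelDictionary(), your folder of images and translateImage() from the question:

import os
import tensorflow as tf

# Hypothetical stand-in for getLabelDictionary(): maps the string index
# taken from the file name to a label tuple.
label_dict = {'42': (1, 0), '43': (0, 1)}

def grab_images(filepath):
    # Inside tf.py_function the path is a concrete tensor, so .numpy() works.
    path_str = bytes.decode(filepath.numpy())
    index = os.path.splitext(os.path.basename(path_str))[0]
    label = tf.constant(label_dict[index], dtype=tf.float32)
    img = tf.io.read_file(filepath)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    return img, label

dataset = tf.data.Dataset.list_files('/content/images/*.jpg')
dataset = dataset.map(
    lambda x: tf.py_function(grab_images, [x], [tf.float32, tf.float32]))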
For a nice output in Tensorboard I want to show a batch of input images, the corresponding target masks, and the output masks in a grid.
The input images have a different size than the masks. Furthermore, the images are obviously RGB.
From a batch of e.g. 32 or 64 I only want to show the first 4 images.
After some fiddling around I came up with the following example code. The good thing: it works.
But I am really not sure if I missed something in PyTorch. It just looks much longer than I expected. Especially the upsampling and transformation to RGB seem wild. But the other transformations I found would not work for a whole batch.
import torch
from torch.autograd import Variable
import torch.nn.functional as FN
import torchvision.utils as vutils
from tensorboardX import SummaryWriter
import time
batch = 32
i_size = 192
o_size = 112
nr_imgs = 4
# Tensorboard init
writer = SummaryWriter('runs/' + time.strftime('%Y%m%d_%H%M%S'))
input_image=Variable(torch.rand(batch,3,i_size,i_size))
target_mask=Variable(torch.rand(batch,o_size,o_size))
output_mask=Variable(torch.rand(batch,o_size,o_size))
# upsample target_mask, add dim to have gray2rgb
tm = FN.upsample(target_mask[:nr_imgs,None], size=[i_size, i_size], mode='bilinear')
tm = torch.cat( (tm,tm,tm), dim=1) # grayscale plane to rgb
# upsample target_mask, add dim to have gray2rgb
om = FN.upsample(output_mask[:nr_imgs,None], size=[i_size, i_size], mode='bilinear')
om = torch.cat( (om,om,om), dim=1) # grayscale plane to rgb
# add up all images and make grid
imgs = torch.cat( ( input_image[:nr_imgs].data, tm.data, om.data ) )
x = vutils.make_grid(imgs, nrow=nr_imgs, normalize=True, scale_each=True)
# Tensorboard img output
writer.add_image('Image', x, 0)
EDIT: Found this on PyTorch's issues list. It's about batch support for Transform. It seems there are no plans to add batch transforms in the future. So my current code might be the best solution for the time being, anyway?
Maybe you can just convert your tensors to numpy arrays (.data.cpu().numpy()) and use OpenCV to do the upsampling? The OpenCV implementation should be quite fast.
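A rough sketch of that idea, reusing the sizes from the question (cv2.INTER_LINEAR is an assumed interpolation choice):

import cv2
import numpy as np
import torch

batch, i_size, o_size, nr_imgs = 32, 192, 112, 4
output_mask = torch.rand(batch, o_size, o_size)

# Move the first nr_imgs masks to numpy and upsample each one with OpenCV.
masks_np = output_mask[:nr_imgs].data.cpu().numpy()
upsampled = np.stack(
    [cv2.resize(m, (i_size, i_size), interpolation=cv2.INTER_LINEAR) for m in masks_np])

# Back to a torch tensor, repeating the gray plane to get RGB.
om = torch.from_numpy(upsampled).unsqueeze(1).repeat(1, 3, 1, 1)
print(om.shape)  # torch.Size([4, 3, 192, 192])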