Shearing image in tensorflow - python

I am using tf.keras to build my network, and I do all augmentation at the tensor level because my data is stored in TFRecord files. I now need shearing and ZCA whitening for augmentation, but I couldn't find a proper implementation of either in TensorFlow. I can't use ImageDataGenerator, which provides both operations, because, as I said, my data doesn't fit in memory and is in TFRecord format. So my whole augmentation process has to be tensor-wise.
@fchollet here suggested a way to use ImageDataGenerator with a large dataset.
My first question:
If I use @fchollet's approach, which is basically to run ImageDataGenerator on an X-sample subset of the large dataset and then train the network with train_on_batch, how can I feed my validation data to the network?
My second question: is there any tensor-wise implementation of the shear and ZCA operations? Some people, like here, suggested using tf.contrib.image.transform, but I couldn't understand how. If someone has an idea of how to do it, I would appreciate it.
Update:
This is my attempt to construct the transformation matrix with scikit-image:
from skimage import io
from skimage import transform as trans
import tensorflow as tf
def augment(image):
    afine_tf = trans.AffineTransform(shear=0.2)
    transform = tf.contrib.image.matrices_to_flat_transforms(tf.linalg.inv(afine_tf.params))
    transform = tf.cast(transform, tf.float32)
    image = tf.contrib.image.transform(image, transform)  # image here is a tensor
    return image
dataset_train = tf.data.TFRecordDataset(training_files, num_parallel_reads=calls)
dataset_train = dataset_train.apply(tf.contrib.data.shuffle_and_repeat(buffer_size=1000 + 4 * batch_size))
dataset_train = dataset_train.map(decode_train, num_parallel_calls=calls)
dataset_train = dataset_train.map(augment, num_parallel_calls=calls)
dataset_train = dataset_train.batch(batch_size)
dataset_train = dataset_train.prefetch(tf.contrib.data.AUTOTUNE)

I will answer the second question.
Today a user commented on one of my old questions, but the comments were deleted while I was adding more details on how to use tf.contrib.image.transform. I guess that was you, right?
So I have edited my question and added an example; check it here.
TL;DR:
def transformImg(imgIn, forward_transform):
    t = tf.contrib.image.matrices_to_flat_transforms(tf.linalg.inv(forward_transform))
    # please notice that forward_transform must be a float matrix,
    # e.g. [[2.0,0,0],[0,1.0,0],[0,0,1]] will work
    # but [[2,0,0],[0,1,0],[0,0,1]] will not
    imgOut = tf.contrib.image.transform(imgIn, t, interpolation="BILINEAR", name=None)
    return imgOut
def shear_transform_example(filename, shear_lambda):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    img = transformImg(image_decoded, [[1.0, shear_lambda, 0], [0, 1.0, 0], [0, 0, 1.0]])
    # Notice that this is a shear transformation parallel to the x axis
    # If you want a y axis version, use this:
    # img = transformImg(image_decoded, [[1.0,0,0],[shear_lambda,1.0,0],[0,0,1.0]])
    return img
img = shear_transform_example("white_square.jpg", 0.1)
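The answer above covers shear only. For the ZCA part of the question, here is a minimal NumPy sketch of how a whitening matrix could be fitted on a sample of flattened training images. The function names fit_zca and apply_zca and the eps value are assumptions made for illustration; they are not part of any TensorFlow API.

```python
import numpy as np

def fit_zca(x, eps=1e-5):
    """Fit ZCA whitening on samples x of shape [n_samples, n_features].

    Returns the per-feature mean and the symmetric whitening matrix.
    """
    mean = x.mean(axis=0)
    xc = x - mean
    cov = xc.T @ xc / xc.shape[0]
    u, s, _ = np.linalg.svd(cov)          # cov is symmetric positive semi-definite
    w = (u / np.sqrt(s + eps)) @ u.T      # U diag(1 / sqrt(s + eps)) U^T
    return mean, w

def apply_zca(x, mean, w):
    """Center x with the fitted mean and apply the whitening matrix."""
    return (x - mean) @ w
```

Once mean and w are computed offline, the same centering and matrix product could presumably be applied per image inside a dataset map function with TensorFlow ops. Note that w has n_features x n_features entries, so this is only practical for small images or patches.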

Related

How to load images with different image shape to tf.data pipe?

My goal is to have preprocessing layers that can handle any image size, because the dataset I use contains two different image shapes. The simple solution is to resize each image when I load it. However, I believe this won't work once the model is deployed, since I can't resize manually there. So I must use preprocessing layers.
The docs I used
What I've tried:
Putting the preprocessing layers inside the model; it does not work.
I am thinking of using TensorSliceDataset.map(resize_and_rescale).
The problem is that I need to convert [tensor image 1, tensor image 2] to a TensorSliceDataset, but I can't convert it.
What I've tried:
tf.data.Dataset.from_tensor_slices((X_train, y_train))
It throws this error:
InvalidArgumentError: {{function_node __wrapped__Pack_N_9773_device_/job:localhost/replica:0/task:0/device:GPU:0}} Shapes of all inputs must match: values[0].shape = [258,320,3] != values[23].shape = [322,480,3]
[[{{node Pack}}]] [Op:Pack] name: component_0
The load images function:
def load_images(df):
    paths = df['path'].values
    X = []
    for path in paths:
        raw = tf.io.read_file(path)
        img = tf.image.decode_png(raw, channels=3)
        X.append(img)
    y = df['kind'].cat.codes
    return X, y
As far as I understand, you wish to train on both image sizes simultaneously. The simplest way is probably to create two datasets, one for each image size, and concatenate them after batching as follows:
dataset_1 = tf.data.Dataset.from_tensor_slices((X_train_1, y_train_1))
dataset_1 = dataset_1.batch(batch_size_1)
dataset_2 = tf.data.Dataset.from_tensor_slices((X_train_2, y_train_2))
dataset_2 = dataset_2.batch(batch_size_2)
dataset = dataset_1.concatenate(dataset_2)
dataset = dataset.shuffle(shuffle_buffer_size)
This way each batch consists of images of the same size. If you use .repeat(), do not forget to put it after the concatenation.
You need to use ragged tensors to handle different image sizes:
dataset = tf.data.Dataset.from_tensor_slices((tf.ragged.constant(img_list), label_list))
dataset = dataset.apply(tf.data.experimental.dense_to_ragged_batch(batch_size=3))
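If a fixed input size is acceptable, the TensorSliceDataset.map(resize_and_rescale) idea from the question can also be made to work by slicing the file paths instead of the decoded images: paths are plain strings, so from_tensor_slices has nothing of mismatched shape to stack. The following is only a sketch for TensorFlow 2.x in eager mode; IMG_SIZE, load_and_resize and make_dataset are names assumed here for illustration.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)  # assumed target size, pick whatever the model expects

def load_and_resize(path, label):
    """Read one PNG, resize it to IMG_SIZE and rescale to [0, 1]."""
    raw = tf.io.read_file(path)
    img = tf.image.decode_png(raw, channels=3)
    img = tf.image.resize(img, IMG_SIZE)  # returns float32
    return img / 255.0, label

def make_dataset(paths, labels, batch_size=2):
    """Build a batched dataset from file paths; images are decoded lazily."""
    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    ds = ds.map(load_and_resize)
    return ds.batch(batch_size)
```

With the question's DataFrame this might be called as make_dataset(df['path'].values, df['kind'].cat.codes.values). Resizing in the pipeline does not rule out also keeping a Resizing layer in the model for deployment.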

How can I prepare my image dataset for a federated model?

How can I transform my dataset (composed of images) into a federated dataset?
I am trying to create something similar to emnist, but for my own dataset.
tff.simulation.datasets.emnist.load_data(only_digits=True, cache_dir=None)
You will need to create the ClientData object first, for example:
client_data = tff.simulation.datasets.ClientData.from_clients_and_tf_fn(client_ids, create_dataset)
where create_dataset is a serializable function. But first you have to prepare your images; read this tutorial about preprocessing data.
import os
import tensorflow as tf
labels_tf = tf.convert_to_tensor(labels)
def parse_image(filename):
    parts = tf.strings.split(filename, os.sep)
    label_str = parts[-2]
    label_int = tf.where(labels_tf == label_str)[0][0]
    image = tf.io.read_file(filename)
    image = tf.io.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [32, 32])
    return image, label_int
Once your data is prepared, pass it to the create_dataset function:
def create_dataset(client_id):
    ....
    list_ds = tf.data.Dataset.list_files(<path of your dataset>)
    images_ds = list_ds.map(parse_image)
    return images_ds
After this step, you can write a preprocessing function:
NUM_CLIENTS = 10
NUM_EPOCHS = 5
BATCH_SIZE = 20
SHUFFLE_BUFFER = 100
PREFETCH_BUFFER = 10
def preprocess(dataset):
    return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER, seed=1).batch(
        BATCH_SIZE).prefetch(PREFETCH_BUFFER)
After this you can build a list of tf.data.Dataset objects suitable for federated training:
def make_federated_data(client_data, client_ids):
    return [
        preprocess(client_data.create_tf_dataset_for_client(x))
        for x in client_ids
    ]
After this your dataset is ready for federated learning!
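The body of create_dataset is left elided above. One way it might look, assuming a hypothetical directory layout of root/client_id/label/file.jpg, is sketched below using only TensorFlow ops so that the function stays serializable. The names LABELS and make_create_dataset and the layout itself are assumptions, not something the answer specifies.

```python
import os
import tensorflow as tf

LABELS = tf.constant(["cat", "dog"])  # hypothetical class names

def parse_image(filename):
    """Decode one JPEG and derive its label from the parent directory name."""
    parts = tf.strings.split(filename, os.sep)
    label = tf.where(tf.equal(LABELS, parts[-2]))[0][0]
    image = tf.io.decode_jpeg(tf.io.read_file(filename), channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [32, 32])
    return image, label

def make_create_dataset(root):
    """Return a one-argument create_dataset(client_id) bound to a data root."""
    def create_dataset(client_id):
        # Assumed layout: <root>/<client_id>/<label>/<file>.jpg
        pattern = tf.strings.join([root, client_id, "*", "*.jpg"], separator=os.sep)
        files = tf.data.Dataset.list_files(pattern, shuffle=False)
        return files.map(parse_image)
    return create_dataset
```

The returned create_dataset would then be the function passed to from_clients_and_tf_fn together with the list of client ids, as in the answer.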
Federated datasets in TFF are represented as ClientData objects. There are multiple subclasses that can be used depending on your dataset.
Two potentially relevant ways to create such objects:
Use ClientData.from_clients_and_tf_fn. This is useful for smaller datasets.
Use SqlClientData, which uses a SQL-file backing to improve performance. This can be done through tff.simulation.datasets.save_to_sql_client_data. Effectively, this allows you to do one-time work to create the client datasets and save the result, rather than having to reconstruct the datasets each time.
Note that both of these require TF-serializable functions for creating datasets from ids. If you just have tensors, you can use TestClientData, but this is intended only for small-scale datasets.

Use preprocessing function that changes size of input on ImageDataGenerator

I wish to take the FFT of the input dataset loaded using ImageDataGenerator. Taking the FFT will double the number of channels, as I stack the real and imaginary parts of the complex FFT output together along the channel dimension. The preprocessing_function attribute of the ImageDataGenerator class should output a NumPy tensor with the same shape as the input, so I could not use that.
I tried applying tf.math.fft2d directly on the ImageDataGenerator.flow_from_directory() output, but it consumes too much RAM, causing the program to crash on Google Colab. Another way I tried was to add a custom layer computing the FFT as the first layer of my neural network, but this adds to the training time. So I wish to do it as a pre-processing step.
Could anyone kindly suggest an efficient way to apply a function on ImageDataGenerator?
You can do a custom ImageDataGenerator, but I have no reason to think this is any faster than using it in the first layer. It seems like a costly operation, since tf.signal.fft2d takes complex64 or complex128 dtypes. So it needs casting, and then casting back because neural network weights are tf.float32 and other image processing functions don't take complex dtype.
import tensorflow as tf
labels = ['Cats', 'Dogs', 'Others']
def read_image(file_name):
    image = tf.io.read_file(file_name)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize_with_pad(image, target_height=224, target_width=224)
    image = tf.cast(image, tf.complex64)
    # fft2d transforms the two innermost dimensions, so move channels first: [C, H, W]
    image = tf.transpose(image, [2, 0, 1])
    image = tf.signal.fft2d(image)
    image = tf.transpose(image, [1, 2, 0])  # back to [H, W, C]
    label = tf.strings.split(file_name, '\\')[-2]
    label = tf.where(tf.equal(label, labels))
    return image, label
ds = tf.data.Dataset.list_files(r'path\to\my\pictures\*\*.jpg')
ds = ds.map(read_image)
next(iter(ds))
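The question asks for the real and imaginary parts stacked along the channel axis, which also brings the tensor back to a real dtype. A minimal sketch of that step is below; fft_to_channels is a name assumed for illustration. Because tf.signal.fft2d works on the two innermost dimensions, the sketch moves the channel axis to the front before transforming.

```python
import tensorflow as tf

def fft_to_channels(image):
    """Map a float image [H, W, C] to a float32 tensor [H, W, 2*C].

    The first C output channels hold the real part of the per-channel
    2-D FFT, the last C channels hold the imaginary part.
    """
    x = tf.cast(image, tf.complex64)
    x = tf.transpose(x, [2, 0, 1])   # [C, H, W] so fft2d runs over H and W
    x = tf.signal.fft2d(x)
    x = tf.transpose(x, [1, 2, 0])   # back to [H, W, C]
    return tf.concat([tf.math.real(x), tf.math.imag(x)], axis=-1)
```

A function like this could be called at the end of a dataset map function, so the model only ever sees float32 inputs with twice the channel count.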

Tensorflow: Modern way to load large data

I want to train a convolutional neural network (using tf.keras from Tensorflow version 1.13) using numpy arrays as input data. The training data (which I currently store in a single >30GB '.npz' file) does not fit in RAM all at once. What is the best way to save and load large data-sets into a neural network for training? Since I didn't manage to find a good answer to this (surely ubiquitous?) problem, I'm hoping to hear one here. Thank you very much in advance for any help!
Sources
Similar questions seem to have been asked many times (e.g. training-classifier-from-tfrecords-in-tensorflow, tensorflow-synchronize-readings-from-tfrecord, how-to-load-data-parallelly-in-tensorflow) but are several years old and usually contain no conclusive answer.
My current understanding is that using TFRecord files is a good way to approach this problem. The most promising tutorial I found so far explaining how to use TFRecord files with Keras is medium.com. Other helpful sources were machinelearninguru.com and medium.com_source2 and the sources therein.
The official tensorflow documentation and tutorials (on tf.data.Dataset, Importing Data, tf_records etc.) did not help me. In particular, several of the examples given there didn't work for me even without modifications.
My Attempt at using TFRecord files
I'm assuming TFRecords are a good way to solve my problem but I'm having a hard time using them. Here is an example I made based on the tutorial medium.com. I stripped down the code as much as I could.
# python 3.6, tensorflow 1.13.
# Adapted from https://medium.com/@moritzkrger/speeding-up-keras-with-tfrecord-datasets-5464f9836c36
import tensorflow as tf
import numpy as np
from tensorflow.python import keras as keras
# Helper functions (see also https://www.tensorflow.org/tutorials/load_data/tf_records)
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def writeTFRecords():
    number_of_samples = 100  # create some random data to play with
    images, labels = (np.random.sample((number_of_samples, 256, 256, 1)), np.random.randint(0, 30, number_of_samples))
    writer = tf.python_io.TFRecordWriter("bla.tfrecord")
    for index in range(images.shape[0]):
        image = images[index]
        label = labels[index]
        feature = {'image': _bytes_feature(tf.compat.as_bytes(image.tostring())),
                   'label': _int64_feature(int(label))}
        example = tf.train.Example(features=tf.train.Features(feature=feature))
        writer.write(example.SerializeToString())
    writer.close()
def loadTFRecord(data_path):
    with tf.Session() as sess:
        feature = {'train/image': tf.FixedLenFeature([], tf.string),
                   'train/label': tf.FixedLenFeature([], tf.int64)}
        # Create a list of filenames and pass it to a queue
        filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
        # Define a reader and read the next record
        reader = tf.TFRecordReader()
        _, serialized_example = reader.read(filename_queue)
        # Decode the record read by the reader
        features = tf.parse_single_example(serialized_example, features=feature)
        # Convert the image data from string back to the numbers
        image = tf.decode_raw(features['train/image'], tf.float32)
        # Cast label data into int32
        label = tf.cast(features['train/label'], tf.int32)
        # Reshape image data into the original shape
        image = tf.reshape(image, [256, 256, 1])
    return image, label  # I'm not 100% sure that's how this works...
# ######### generate a TFRecords file in the working directory containing random data. #################################
writeTFRecords()
# ######## Load the TFRecords file and use it to train a simple example neural network. ################################
image, label = loadTFRecord("bla.tfrecord")
model_input = keras.layers.Input(tensor=image)
model_output = keras.layers.Flatten(input_shape=(-1, 256, 256, 1))(model_input)
model_output = keras.layers.Dense(16, activation='relu')(model_output)
train_model = keras.models.Model(inputs=model_input, outputs=model_output)
train_model.compile(optimizer=keras.optimizers.RMSprop(lr=0.0001),
                    loss='mean_squared_error',
                    target_tensors=[label])
print("\n \n start training \n \n")  # Execution gets stuck on fitting
train_model.fit(epochs=1, steps_per_epoch=10)  # no output or error messages.
The code creates a TFRecord file and starts fitting, then just gets stuck with no output or error messages. I don't know what the problem is or how I could try to fix it.
While this is no real answer to the original question (i.e. "what is the optimal way to train on large datasets"), I managed to get tfrecords and datasets to work. Of particular help was this tutorial on YouTube. I include a minimal example with working code for anyone struggling with the same problem.
# Developed using python 3.6, tensorflow 1.14.0.
# This code writes data (pairs (label, image) where label is int64 and image is np.ndarray) into .tfrecord files and
# uses them for training a simple neural network. It is meant as a minimal working example of how to use tfrecords. This
# solution is likely not optimal. If you know how to improve it, please comment on
# https://stackoverflow.com/q/57717004/9988487. Refer to links therein for further information.
import tensorflow as tf
import numpy as np
from tensorflow import keras
# Helper functions (see also https://www.tensorflow.org/tutorials/load_data/tf_records)
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
def write_tfrecords_file(out_path: str, images: np.ndarray, labels: np.ndarray) -> None:
    """Write all image-label pairs into a single .tfrecord file.
    :param out_path: File path of the .tfrecord file to generate or overwrite.
    :param images: array with first dimension being the image index. Every images[i].tobytes() is
        serialized and written into the file as 'image': _bytes_feature(img_bytes)
    :param labels: 1d array of integers. labels[i] is the label of images[i]. Written as 'label': _int64_feature(label)"""
    assert len(images) == len(labels)
    with tf.io.TFRecordWriter(out_path) as writer:  # could use writer_options parameter to enable compression
        for i in range(len(labels)):
            img_bytes = images[i].tobytes()  # Convert the image to raw bytes.
            label = labels[i]
            data = {'image': _bytes_feature(img_bytes), 'label': _int64_feature(label)}
            feature = tf.train.Features(feature=data)  # Wrap the data as TensorFlow Features.
            example = tf.train.Example(features=feature)  # Wrap again as a TensorFlow Example.
            serialized = example.SerializeToString()  # Serialize the data.
            writer.write(serialized)  # Write the serialized data to the TFRecords file.
def parse_example(serialized, shape=(256, 256, 1)):
    features = {'image': tf.io.FixedLenFeature([], tf.string), 'label': tf.io.FixedLenFeature([], tf.int64)}
    # Parse the serialized data so we get a dict with our data.
    parsed_example = tf.io.parse_single_example(serialized=serialized, features=features)
    label = parsed_example['label']
    image_raw = parsed_example['image']  # Get the image as raw bytes.
    image = tf.io.decode_raw(image_raw, tf.float32)  # Decode the raw bytes into a typed tensor.
    image = tf.reshape(image, shape=shape)
    return image, label  # this function will be called once (to add it to tf graph; then parse images individually)
# create some arbitrary data to play with: 10,000 images sized 256x256 with one colour channel. Use your custom np-arrays
IMAGE_WIDTH, NUM_OF_IMAGES, NUM_OF_CLASSES, COLOUR_CHANNELS = 256, 10_000, 10, 1
# using float32 to save memory. Must match type in parse_example(), tf.io.decode_raw(image_raw, tf.float32)
features_train = np.random.sample((NUM_OF_IMAGES, IMAGE_WIDTH, IMAGE_WIDTH, COLOUR_CHANNELS)).astype(np.float32)
labels_train = np.random.randint(low=0, high=NUM_OF_CLASSES, size=NUM_OF_IMAGES)  # one random label for each image
features_eval = features_train[:200]  # use the first 200 images as evaluation data for simplicity.
labels_eval = labels_train[:200]
write_tfrecords_file("train.tfrecord", features_train, labels_train)  # normally: split the data into files of several GB each
write_tfrecords_file("eval.tfrecord", features_eval, labels_eval)  # this may take a while. Consider a progressbar
# The files are complete. Now define a model and use datasets to feed the data from the .tfrecord files into the model.
model = keras.Sequential([keras.layers.Flatten(input_shape=(256, 256, 1)),
                          keras.layers.Dense(128, activation='relu'),
                          keras.layers.Dense(10, activation='softmax')])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Check docs for parameters (compression, buffer size, thread count). Also www.tensorflow.org/guide/performance/datasets
train_dataset = tf.data.TFRecordDataset("train.tfrecord")  # specify a list (or dataset) of file names for large data
train_dataset = train_dataset.map(parse_example)  # parse tfrecords. Parameter num_parallel_calls may help performance.
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)
validation_dataset = tf.data.TFRecordDataset("eval.tfrecord")
validation_dataset = validation_dataset.map(parse_example).batch(64)
model.fit(train_dataset, epochs=3)
# evaluate the results
results = model.evaluate(validation_dataset)
print('\n\nvalidation loss, validation acc:', results)
Note that it's tricky to use some_keras_model.fit(..., validation_data=some_dataset) with dataset objects. It may result in
TypeError: 'DatasetV1Adapter' object does not support indexing.
This seems to be a bug (see github.com/tensorflow/tensorflow/issues/28995) and is supposedly fixed as of tf-nightly version '1.15.0-dev20190808'. The official tutorial uses this too, although it doesn't work in most versions. An easy but slightly dirty workaround is to use verbose=0 (which only suppresses program output) and plot the validation results using TensorBoard. Also see Keras model.fit() with tf.dataset API + validation_data.
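The comments in the code above mention splitting large data into several files and passing a list of file names to TFRecordDataset. A minimal sketch of that idea is below, assuming TensorFlow 2.x in eager mode rather than the 1.14 setup of the answer; write_sharded, read_sharded and the file naming scheme are names invented here for illustration.

```python
import numpy as np
import tensorflow as tf

def write_sharded(prefix, images, labels, num_shards):
    """Write (image, label) pairs into num_shards .tfrecord files; return their paths."""
    paths = []
    for shard, idx in enumerate(np.array_split(np.arange(len(labels)), num_shards)):
        path = "%s-%03d.tfrecord" % (prefix, shard)
        with tf.io.TFRecordWriter(path) as writer:
            for i in idx:
                feature = {
                    "image": tf.train.Feature(
                        bytes_list=tf.train.BytesList(value=[images[i].tobytes()])),
                    "label": tf.train.Feature(
                        int64_list=tf.train.Int64List(value=[int(labels[i])])),
                }
                example = tf.train.Example(features=tf.train.Features(feature=feature))
                writer.write(example.SerializeToString())
        paths.append(path)
    return paths

def read_sharded(paths, shape, batch_size):
    """Build a batched dataset over several .tfrecord files of float32 images."""
    spec = {"image": tf.io.FixedLenFeature([], tf.string),
            "label": tf.io.FixedLenFeature([], tf.int64)}
    def parse(serialized):
        ex = tf.io.parse_single_example(serialized, spec)
        image = tf.reshape(tf.io.decode_raw(ex["image"], tf.float32), shape)
        return image, ex["label"]
    ds = tf.data.TFRecordDataset(paths)  # reads the listed files one after another
    return ds.map(parse).batch(batch_size)
```

For training one would typically add a shuffle before batching. In TensorFlow 2.x, datasets built this way can usually be passed to model.fit both as the training input and as validation_data.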

How to obtain gcloud predictions by passing a base64 image to a retrained inception model?

I am trying to obtain a prediction with gcloud by passing a base64 encoded image to a retrained inception model, by using a similar approach as the one adopted by Davide Biraghi in this post.
When using 'DecodeJpeg/contents:0' as input I also get the same error when trying to get predictions, therefore I adopted a slightly different approach.
Following rhaertel80's suggestions in his answer to this post, I have created a graph that takes a JPEG image as input in 'B64Connector/input', preprocesses it, and feeds it to the inception model in 'ResizeBilinear:0'.
The prediction returns values, although the wrong ones (I am trying to find a solution in another post), but at least it doesn't fail. The placeholder I use as input is
images_placeholder = tf.placeholder(dtype=tf.string, shape=(None,), name='B64Connector/input')
And I add it to the model inputs with
inputs = {"b64_bytes": 'B64Connector/input:0'}
tf.add_to_collection("inputs", json.dumps(inputs))
Like Davide, I am following the suggestions found in these posts: here, here and here, and I am trying to get predictions with
gcloud beta ml predict --json-instances=request.json --model=MODEL
where the file request.json has been obtained with this code
import base64
import json
jpgtxt = base64.b64encode(open(imagefile, "rb").read()).decode("utf-8")  # str, so json.dumps also works on Python 3
with open(outputfile, 'w') as f:
    f.write(json.dumps({"b64_bytes": {"b64": jpgtxt}}))
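As a sketch of the same request format for several images at once: the file passed to --json-instances is newline-delimited, with one JSON instance per line. The function name write_instances and its input_key default are assumptions chosen to match the "b64_bytes" alias used in this question.

```python
import base64
import json

def write_instances(image_blobs, out_path, input_key="b64_bytes"):
    """Write one JSON instance per line for a list of raw JPEG byte strings."""
    with open(out_path, "w") as f:
        for blob in image_blobs:
            # b64encode returns bytes; decode so that json.dumps accepts it.
            encoded = base64.b64encode(blob).decode("utf-8")
            f.write(json.dumps({input_key: {"b64": encoded}}) + "\n")
```

The {"b64": ...} wrapper is what tells the prediction service to base64-decode the value before feeding it to the string placeholder.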
I would like to know why the prediction fails when I use 'DecodeJpeg/contents:0' as input and doesn't when I use this different approach, since they look almost identical to me: I use the same script to generate the instances (changing the input_key) and the same command line to request predictions.
Is there a way to pass the instance fed to 'B64Connector/input:0' to 'DecodeJpeg/contents:0' in order to get the right predictions?
Here I describe more in detail my approach and how I use the images_placeholder.
I define a function that resizes the image:
def decode_and_resize(image_str_tensor):
    """Decodes jpeg string, resizes it and returns a uint8 tensor."""
    image = tf.image.decode_jpeg(image_str_tensor, channels=MODEL_INPUT_DEPTH)
    # Note resize expects a batch_size, but tf_map suppresses that index,
    # thus we have to expand then squeeze. Resize returns float32 in the
    # range [0, uint8_max]
    image = tf.expand_dims(image, 0)
    image = tf.image.resize_bilinear(
        image, [MODEL_INPUT_HEIGHT, MODEL_INPUT_WIDTH], align_corners=False)
    image = tf.squeeze(image, squeeze_dims=[0])
    image = tf.cast(image, dtype=tf.uint8)
    return image
and one that generates the definition of the graph in which the resize takes place and where images_placeholder is defined and used
def create_b64_graph():
    with tf.Graph().as_default() as b64_graph:
        images_placeholder = tf.placeholder(dtype=tf.string, shape=(None,),
                                            name='B64Connector/input')
        decoded_images = tf.map_fn(
            decode_and_resize, images_placeholder, back_prop=False, dtype=tf.uint8)
        # convert_image_dtype, also scales [0, uint8_max] -> [0, 1).
        images = tf.image.convert_image_dtype(decoded_images, dtype=tf.float32)
        # Finally, rescale to [-1,1] instead of [0, 1)
        images = tf.subtract(images, 0.5)  # tf.sub was renamed to tf.subtract in TF 1.0
        images = tf.multiply(images, 2.0)  # tf.mul was renamed to tf.multiply in TF 1.0
        # NOTE: using identity to get a known name for the output tensor.
        output = tf.identity(images, name='B64Connector/output')
        b64_graph_def = b64_graph.as_graph_def()
        return b64_graph_def
Moreover I am using the following code to merge the resizing graph with the inception graph. Can I use a similar approach to link images_placeholder directly to 'DecodeJpeg/contents:0'?
def concatenate_to_inception_graph(b64_graph_def):
    model_dir = INPUT_MODEL_PATH
    model_filename = os.path.join(
        model_dir, 'classify_image_graph_def.pb')
    with tf.Session() as sess:
        # Import the b64_graph and get its output tensor
        resized_b64_tensor, = (tf.import_graph_def(b64_graph_def, name='',
                                                   return_elements=['B64Connector/output:0']))
        with gfile.FastGFile(model_filename, 'rb') as f:
            inception_graph_def = tf.GraphDef()
            inception_graph_def.ParseFromString(f.read())
            # Concatenate b64_graph and inception_graph
            g_1 = tf.import_graph_def(inception_graph_def, name='inception',
                                      input_map={'ResizeBilinear:0': resized_b64_tensor})
        return sess.graph
