How to get an image into an array, Tensorflow 1.9 - python

So I have to use Tensorflow 1.9 for system-specific reasons.
I want to train a CNN with a custom dataset consisting of images.
The folder structure looks very much like this:
./
+ circles
  - circle-0.jpg
  - circle-1.jpg
  - ...
+ hexagons
  - hexagon-0.jpg
  - hexagon-1.jpg
  - ...
+ ...
So the example I have to work with uses MNIST and has these two particular lines of code:
mnist_dataset = tf.keras.datasets.mnist.load_data('mnist_data')
(x_train, y_train), (x_test, y_test) = mnist_dataset
In my work, I also have to use this data format (x_train, y_train), (x_test, y_test), which seems to be quite common. As far as I have been able to find out so far, the format of those datasets is (image_data, label), with shapes like ((60000, 28, 28), (60000,)), at least for the MNIST dataset. The image_data here is supposedly of dtype uint8 (according to this post). I was also able to find out that a tf.data.Dataset() object looks like the tuples I need here: (image_data, label).
So far so good. But a few questions arise from this information which I wasn't able to figure out yet, and where I would kindly request your help:
1. (60000, 28, 28) means 60,000 arrays of 28 x 28 image values, right?
2. If 1. is right, how do I get my images (organized in the directory structure described above) into this format? Is there a function which yields such an array?
3. I know I need some kind of generator function which gets all the images with their labels, because in Tensorflow 1.9, tf.keras.utils.image_dataset_from_directory() does not seem to exist yet.
4. What do the labels actually look like? For example, with my directory structure, would I have something like this:
(A)

File             Label
circle-0.jpg     circle
circle-233.jpg   circle
hexagon-1.jpg    hexagon
triangle-12.jpg  triangle

or (B)

File             Label
circle-0.jpg     circle-0
circle-233.jpg   circle-233
hexagon-1.jpg    hexagon-1
triangle-12.jpg  triangle-12
, where the respective image is already converted to the "(60000, 28, 28)" format? It seems as if I need to create all of these functions myself, since there does not seem to be a function which takes a directory structure like mine and turns it into a dataset that Tensorflow 1.9 can use, or is there? I know of tf.keras.preprocessing.image.ImageDataGenerator and image_dataset_from_directory as well as flow_from_directory(), but none of them seem to produce my desired dataset value tuple format.
I would really appreciate any help!

You have to build a custom data generator for that. If you have two arrays, train_paths containing the paths to the images and train_labels containing the labels for the images, then the function below (datagen) will yield the images as arrays together with their respective labels as tuples (image_array, label).
I have also added a way to integer-encode your labels, with a dictionary encode_label.
For example, train_paths and train_labels should look like this:
train_paths = np.array(['path/to/image1.jpg','path/to/image2.jpg','path/to/image3.jpg'])
train_labels = np.array(['circle','square','hexagon'])
where the image at path 'path/to/image1.jpg' has the label 'circle', and the image at path 'path/to/image2.jpg' has the label 'square'.
This generator function returns data in batches, and you can add your own augmentation techniques as well (inside the augment function).
import numpy as np
import tensorflow as tf

# Hyperparameters
HEIGHT = 224    # Image height
WIDTH = 224     # Image width
CHANNELS = 3    # Image channels

# This dictionary will integer-encode your labels
encode_label = {'hexagon':0, 'circle':1, 'square':2}

def augment(image):
    # All your augmentation techniques are done here
    return image

def encode_labels(labels):
    encoded = []
    for label in labels:
        encoded.append(encode_label[label])
    return encoded

def open_images(paths):
    '''
    Given a list of paths to images, this function loads
    the images from the paths, augments them, and returns them as a batch
    '''
    images = []
    for path in paths:
        image = tf.keras.preprocessing.image.load_img(path, target_size=(HEIGHT, WIDTH))
        image = np.array(image)
        image = augment(image)
        images.append(image)
    return np.array(images)

# This is the data generator
def datagen(paths, labels, batch_size=32):
    for x in range(0, len(paths), batch_size):
        # Load batch of images
        batch_paths = paths[x:x+batch_size]
        batch_images = open_images(batch_paths)
        # Load batch of labels
        batch_labels = labels[x:x+batch_size]
        batch_labels = encode_labels(batch_labels)
        batch_labels = np.array(batch_labels, dtype='float').reshape(-1)
        yield batch_images, batch_labels
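For example, to build train_paths and train_labels from the directory layout in the question and consume the generator, a minimal sketch (the folder names circles and hexagons and the way the label is derived from the folder name are assumptions based on the question):
import os
import numpy as np

# Collect (path, label) pairs from a layout like ./circles/circle-0.jpg
data_dir = './'
paths, labels = [], []
for class_dir in ['circles', 'hexagons']:  # assumed folder names
    for fname in os.listdir(os.path.join(data_dir, class_dir)):
        paths.append(os.path.join(data_dir, class_dir, fname))
        labels.append(class_dir[:-1])  # 'circles' -> 'circle'

train_paths = np.array(paths)
train_labels = np.array(labels)

# Each iteration yields one (image_array, label) batch tuple
for batch_images, batch_labels in datagen(train_paths, train_labels, batch_size=32):
    pass  # feed the batch to your model here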
If you cannot get tf.keras.preprocessing.image.load_img working in your tensorflow version, try an alternative method to load and resize the image. One alternative is to load the image with matplotlib and resize it with skimage. The open_images function would then look like this:
import numpy as np
import matplotlib.image
from skimage.transform import resize

def open_images(paths):
    '''
    Given a list of paths to images, this function loads
    the images from the paths, augments them, and returns them as a batch
    '''
    images = []
    for path in paths:
        image = matplotlib.image.imread(path)
        image = np.array(image)
        # Note: skimage's resize also converts the image to floats in [0, 1]
        image = resize(image, (HEIGHT, WIDTH, CHANNELS))
        image = augment(image)
        images.append(image)
    return np.array(images)

Related

How to load images with different image shapes into a tf.data pipeline?

My goal is to have preprocessing layers so the model can handle any image size. This is because the dataset I use has 2 different image shapes. The simple solution is to resize the images when I load them. However, I believe this won't work when the model is deployed; I can't do a manual resize like that there. So I must use preprocessing layers.
The docs I used
What I've tried:
Making the preprocessing layers part of the model: it does not work.
I am thinking of using TensorSliceDataset.map(resize_and_rescale). The problem is that I need to convert [tensor image 1, tensor image 2] to a TensorSliceDataset. However, I can't convert it.
What I've tried:
tf.data.Dataset.from_tensor_slices((X_train, y_train))
It throws this error:
InvalidArgumentError: {{function_node __wrapped__Pack_N_9773_device_/job:localhost/replica:0/task:0/device:GPU:0}} Shapes of all inputs must match: values[0].shape = [258,320,3] != values[23].shape = [322,480,3]
[[{{node Pack}}]] [Op:Pack] name: component_0
The load_images function:
def load_images(df):
    paths = df['path'].values
    X = []
    for path in paths:
        raw = tf.io.read_file(path)
        img = tf.image.decode_png(raw, channels=3)
        X.append(img)
    y = df['kind'].cat.codes
    return X, y
As far as I understand, you wish to train on both image sizes simultaneously. The simplest way is probably to create one dataset per image size and concatenate them after batching, as follows:
dataset_1 = tf.data.Dataset.from_tensor_slices((X_train_1, y_train_1))
dataset_1 = dataset_1.batch(batch_size_1)
dataset_2 = tf.data.Dataset.from_tensor_slices((X_train_2, y_train_2))
dataset_2 = dataset_2.batch(batch_size_2)
dataset = dataset_1.concatenate(dataset_2)
dataset = dataset.shuffle(shuffle_buffer_size)
This way each batch consists of images of the same size. If you use .repeat(), do not forget to put it after the concatenation.
You need to use ragged tensors to handle different image sizes:
dataset = tf.data.Dataset.from_tensor_slices((tf.ragged.constant(img_list), label_list))
dataset = dataset.apply(tf.data.experimental.dense_to_ragged_batch(batch_size=3))
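Another way to sidestep the Pack error from the question is to avoid stacking the variably-shaped images up front: build the dataset from a generator and do the resize inside the pipeline with map, as the asker intended. A minimal sketch, assuming TF 2.4+ (for output_signature); HEIGHT, WIDTH, and the batch size are assumed values:
import tensorflow as tf

HEIGHT, WIDTH = 224, 224  # assumed target size

def make_dataset(X, y, batch_size=32):
    # X: list of variably-shaped uint8 image tensors/arrays, y: matching labels
    def gen():
        for img, label in zip(X, y):
            yield img, int(label)  # plain int keeps the label dtype predictable
    ds = tf.data.Dataset.from_generator(
        gen,
        output_signature=(
            tf.TensorSpec(shape=(None, None, 3), dtype=tf.uint8),
            tf.TensorSpec(shape=(), dtype=tf.int64)))
    # Resize inside the pipeline so every element ends up the same shape;
    # after that, ordinary batching works.
    ds = ds.map(lambda img, label:
                (tf.image.resize(tf.cast(img, tf.float32) / 255.0,
                                 (HEIGHT, WIDTH)),
                 label))
    return ds.batch(batch_size)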

Data augmentation with Keras produces almost white images

I'm trying to do data augmentation with Keras' ImageDataGenerator.
I'm pulling the images from a DataFrame containing the paths to the images in one column and the label in another one. For now, I'm only trying to flip the images horizontally. But when I plot the images, they look like the brightness was pushed to the max. I wonder what's going on here... Any thoughts?
datagen = ImageDataGenerator(horizontal_flip=True)

# Configure batch size and retrieve one batch of images
for X_batch, y_batch in datagen.flow_from_dataframe(dataframe=data,
                                                    x_col="File name",
                                                    y_col="Driving direction",
                                                    directory="Self-Driving-Car/Training Data/",
                                                    target_size=(480, 640),
                                                    class_mode="other"):
    # Show 9 images
    for i in range(0, 9):
        plt.subplot(330 + 1 + i)
        plt.imshow(X_batch[i])
    plt.show()
    break
OK, it was just a plotting problem; this solved it:
plt.imshow(X_batch[i]/255)
More generally, be aware that if your images are not 0-255 ranged integers, ImageDataGenerator can rescale the data to 0-255 for some of the augmentation options but not others, depending on their defaults; brightness_range is one example that rescales to 0-255.
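If you want the generator itself to emit 0-1 ranged floats, so that plotting and training see the same scale, the standard rescale argument does that; a one-line sketch of the constructor from the question:
# Flip horizontally and normalize pixel values to [0, 1] in one place,
# so plt.imshow works without the manual division by 255.
datagen = ImageDataGenerator(horizontal_flip=True, rescale=1./255)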

Training with tf.data API and sample weights

All my training images are in tfrecords files. Now they are used in a standard way like this:
dataset = dataset.apply(tf.data.experimental.map_and_batch(
    map_func=lambda x: preprocess(x, data_augmentation_options=data_augmentation),
    batch_size=images_per_batch))
where preprocess returns the decoded image and the label which both come from the tfrecord file.
Now the new situation. I want also a sample weight for each example. So instead of
return image,label
in preprocess, it should be
return image, label, sample_weight
However, this sample_weight is not in the tfrecord file. It is computed when training starts, based on the number of examples for each class. Basically it is a Python dictionary: weights[label] = sample_weight.
The question is how to use these sample weights in the tf.data pipeline. Because label is a Tensor, it cannot be used to index the Python dictionary.
Some things are not clear in your question, such as what x is. It would be better if you could post a complete code example with your question.
I'm assuming that x is a tensor with an image and a label. If so, you can use the map function to add a tensor of sample weights to your dataset. Something like this (note that this code was not tested):
def im_add_weight(image, label, sample_weight):
    # Convert the inputs to tensors of a consistent dtype
    image = tf.convert_to_tensor(image, dtype=tf.float32)
    label = tf.convert_to_tensor(label, dtype=tf.float32)
    sample_weight = tf.convert_to_tensor(sample_weight, dtype=tf.float32)
    return image, label, sample_weight

dataset = dataset.map(
    lambda image, label, sample_weight: tuple(tf.py_func(
        im_add_weight, [image, label, sample_weight], [tf.float32, tf.float32, tf.float32])))
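Since label is a Tensor, another way, avoiding the py_func round trip, is to turn the per-class weights into a constant tensor and look them up with tf.gather inside map. A sketch, assuming the labels are integer class indices and that weights is the Python dictionary described in the question:
import tensorflow as tf

weights = {0: 1.0, 1: 2.5, 2: 0.8}  # hypothetical per-class weights
weight_table = tf.constant([weights[i] for i in range(len(weights))],
                           dtype=tf.float32)

def add_sample_weight(image, label):
    # tf.gather performs the lookup as a graph op, so the label tensor
    # never has to leave the pipeline.
    sample_weight = tf.gather(weight_table, tf.cast(label, tf.int32))
    return image, label, sample_weight

dataset = dataset.map(add_sample_weight)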

How to get two tf.data.Datasets from tf.data.Dataset.zip((images, labels))

I am working on the Python/tensorflow/mnist tutorial.
For a few weeks now, when using the original code from the tensorflow web site, I have been getting a warning that the image dataset will soon be deprecated and that I should use the following one:
https://github.com/tensorflow/models/blob/master/official/mnist/dataset.py
I load it in my code using:
from tensorflow.models.official.mnist import dataset
trainfile = dataset.train(data_dir)
Which returns:
tf.data.Dataset.zip((images, labels))
The issue is that I cannot find a way to separate them, for example in the following way:
trainfile = dataset.train(data_dir)
train_data= trainfile.images
train_label= trainfile.label
But this clearly does not work, because the attributes images and label do not exist. trainfile is a tf.data.Dataset.
Knowing that the tf.data.Dataset is made of int32 and float32, I tried:
train_data = trainfile.map(lambda x,y : x.dtype == tf.float32)
But it returns an empty dataset.
I insist (but am open-minded) on doing it this way (two complete batches of images and labels), because this is how the tutorial works:
https://www.tensorflow.org/tutorials/estimators/cnn
I have seen a lot of solutions for getting elements from datasets, but nothing on reversing the zip operation that is done in the following code:
tf.data.Dataset.zip((images, labels))
Thank you in advance for your help.
I hope this helps:
inputs = tf.placeholder(tf.float32, shape=(None, 784), name='inputs')
outputs = tf.placeholder(tf.float32, shape=(None,), name='outputs')
#Prepare a tensorflow dataset
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
ds = ds.shuffle(buffer_size=10, reshuffle_each_iteration=True).batch(batch_size=batch_size, drop_remainder=True).repeat()
iter = ds.make_one_shot_iterator()
next = iter.get_next()
inputs = next[0]
outputs = next[1]
TensorFlow's get_single_element() is finally around, and it can be used to unzip features and labels from the dataset.
This avoids the need to create and use an iterator via .map(), iter(), or one_shot_iterator() (which could be costly for big datasets).
get_single_element() returns a tensor (or a tuple or dict of tensors) encapsulating all the members of the dataset. We need to pass all the members of the dataset batched into a single element.
This can be used to get features as a tensor-array, or features and labels as a tuple or dictionary (of tensor-arrays) depending upon how the original dataset was created.
Check this answer on SO for an example that unpacks features and labels into a tuple of tensor-arrays.
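A sketch of the idea, assuming a TF version where tf.data.experimental.get_single_element exists and a dataset small enough to batch into memory; ds stands for the zipped (images, labels) dataset and N for its size, both assumptions here:
import tensorflow as tf

N = 60000  # assumed number of examples in the dataset
# Batch the whole dataset into one element, then unpack that element
# into a single images tensor and a single labels tensor.
images, labels = tf.data.experimental.get_single_element(ds.batch(N))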
Instead of separating into two datasets, one for images and another for labels, it's best to make a single iterator which returns both the image and the label.
The reason why this is preferred is that it's a lot easier to ensure that you match each example with its label even after a complicated series of shuffles, reorderings, filterings, etc, as you might have in a nontrivial input pipeline.
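That said, if two separate datasets are really needed, selecting one component with map works; a sketch using the trainfile dataset from the question:
# Each element of trainfile is an (image, label) pair; map selects one component.
train_data = trainfile.map(lambda image, label: image)
train_label = trainfile.map(lambda image, label: label)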
You can visualize the images and find their associated labels:
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
ds = ds.shuffle(buffer_size=10).batch(batch_size=batch_size)
iter = ds.make_one_shot_iterator()
next = iter.get_next()

def display(image, label):
    # display image
    ...
    plt.imshow(image)
    ...

with tf.Session() as sess:
    try:
        while True:
            image, label = sess.run(next)
            # image = numpy array (batch, image_size)
            # label = numpy array (batch, label)
            display(image[0], label[0])  # display first image in batch
    except tf.errors.OutOfRangeError:  # raised when the dataset is exhausted
        pass

In Tensorflow, is there an op / are there ops to accept a tensor (of filenames) and output images?

I'd like to be able to read in batches of images. However, these batches must be constructed by a python script (they cannot be placed into a file ahead of time for various reasons). What's the most efficient way, in tensorflow, to do the following:
(1) Provided: A python list of B variable-length strings that point to images, all of which have the same size. B is the batch size.
(2) For each string, load the image it corresponds to, and apply a random crop of 5% (the crop is random but the size of the crop is fixed).
(3) Concatenate the images together into a tensor of size B x H x W x 3.
If this is not possible, does anyone have any benchmarks / data on the efficiency loss of loading and preprocessing the images in python and then placing them into a queue? I assume the net will run considerably faster if image loading / preprocessing is done inside tensorflow.
This is how I understand your problem:
- you have some images
- you have a function sample_batch() which returns a batch of filenames of size B
- you want to read the images corresponding to these filenames and preprocess them
- finally you output a batch of these examples
input = tf.placeholder(tf.string, name='Input')
queue = tf.FIFOQueue(capacity, tf.string, [()], name='Queue')
enqueue_op = queue.enqueue_many(input)
reader = tf.WholeFileReader()
filename, content = reader.read(queue)
image = tf.image.decode_jpeg(content, channels=3)
# Preprocessing
image = tf.random_crop(image, [H, W, 3])
image = tf.to_float(image)
batch_image = tf.train.batch([image], batch_size=B, name='Batch')
output = inference(batch_image)
Then in the session, you have to run the enqueue operation with the filenames from your sample_batch() function:
with tf.Session() as sess:
    tf.train.start_queue_runners()
    for i in range(NUM_STEPS):
        batch_filenames = sample_batch()
        sess.run(enqueue_op, feed_dict={input: batch_filenames})
        sess.run(output)
If you have the images as a byte array you can use something similar to this in your graph:
jpegs = tf.placeholder(tf.string, shape=(None))
images = tf.map_fn(lambda jpeg: your_processing_fn(jpeg), jpegs,
                   dtype=tf.float32)
logits = your_inference_model(images, labels)
where your_processing_fn is a function which receives a jpeg tensor of bytes, decodes it, resizes and crops it, and returns an image of H x W x 3.
You need a recent version of tensorflow, as map_fn is not in 0.8 and below.
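For concreteness, here is a minimal sketch of what your_processing_fn could look like, written with the TF1-era image ops used above; the intermediate resize size is an assumption, and H and W are the fixed crop size from the question:
def your_processing_fn(jpeg_bytes):
    # Decode the JPEG bytes, resize to a working size, then take
    # a fixed-size random crop, as the answer describes.
    image = tf.image.decode_jpeg(jpeg_bytes, channels=3)
    image = tf.image.resize_images(image, [256, 256])  # assumed working size
    image = tf.random_crop(image, [H, W, 3])
    return tf.to_float(image)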
