Wrong shape Dataset Tensorflow - python

I'm new to TensorFlow and I'm trying to feed some data with tf.data.Dataset. I'm using the Cityscape dataset with 8 different classes. Here is my code:
import os
import cv2
import numpy as np
import tensorflow as tf
H = 256
W = 256
id2cat = np.array([0,0,0,0,0,0,0, 1,1,1,1, 2,2,2,2,2,2, 3,3,3,3, 4,4, 5, 6,6, 7,7,7,7,7,7,7,7,7])
def readImage(x):
    x = cv2.imread(x, cv2.IMREAD_COLOR)
    x = cv2.resize(x, (W, H))
    x = x / 255.0
    x = x.astype(np.float32)
    return x

def readMask(path):
    mask = cv2.imread(path, 0)
    mask = cv2.resize(mask, (W, H))
    mask = id2cat[mask]
    return mask.astype(np.int32)

def preprocess(x, y):
    def f(x, y):
        image = readImage(x)
        mask = readMask(y)
        return image, mask

    image, mask = tf.numpy_function(f, [x, y], [tf.float32, tf.int32])
    mask = tf.one_hot(mask, 3, dtype=tf.int32)
    image.set_shape([H, W, 3])
    mask.set_shape([H, W, 3])
    return image, mask

def tf_dataset(x, y, batch=8):
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=5000)
    dataset = dataset.map(preprocess)
    dataset = dataset.batch(batch)
    dataset = dataset.repeat()
    dataset = dataset.prefetch(2)
    return dataset

def loadCityscape():
    trainPath = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'datasets\\Cityscape\\train')
    imagesPath = os.path.join(trainPath, 'images')
    maskPath = os.path.join(trainPath, 'masks')
    images = []
    masks = []
    print('Loading images and masks for Cityscape dataset...')
    for image in os.listdir(imagesPath):
        images.append(readImage(os.path.join(imagesPath, image)))
    for mask in os.listdir(maskPath):
        if 'label' in mask:
            masks.append(readMask(os.path.join(maskPath, mask)))
    print('Loaded {} images\n'.format(len(images)))
    return images, masks
images, masks = loadCityscape()
dataset = tf_dataset(images, masks, batch=8)
print(dataset)
That last print(dataset) shows:
<PrefetchDataset shapes: ((None, 256, 256, 3), (None, 256, 256, 3)), types: (tf.float32, tf.int32)>
Why am I obtaining (None, 256, 256, 3) instead of (8, 256, 256, 3)? I also have some doubts about how to iterate over this dataset.
Thanks a lot.

TensorFlow is a graph-based mathematical framework that abstracts away the complex vector and matrix operations you face, particularly in machine learning.
What the developers thought is that it would be inconvenient to specify every single time how many input samples you will pass to your model for training, so they decided to abstract it for you.
You are not interested in whether your model is fed a single sample or thousands, as long as the output matches the input dimension (and every internal operation matches in dimensions too!).
So the None size is a placeholder for a possibly changing shape, which is usually the batch size of the input.
We need a placeholder because (None, 2) is a different shape from just (2,): in the first case we know we will face 2 dimensions.
Even though the None dimension is unknown when you "compile" your model, it is only evaluated when strictly needed, in other words when you run the model. This way your model is happy to run on a batch of 64 samples just as well as on 128.
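For example (a minimal sketch, assuming TF 2.x with Keras; the layer and filter sizes are made up for illustration), a model built for 256x256 RGB inputs reports None for the batch dimension but happily accepts any batch size at run time:

import numpy as np
import tensorflow as tf

# The batch dimension is deliberately left as None: only the per-sample shape is fixed.
inputs = tf.keras.Input(shape=(256, 256, 3))
outputs = tf.keras.layers.Conv2D(8, 3, padding='same')(inputs)
model = tf.keras.Model(inputs, outputs)

print(model.input_shape)                                            # (None, 256, 256, 3)
print(model(np.zeros((8, 256, 256, 3), dtype=np.float32)).shape)    # (8, 256, 256, 8)
print(model(np.zeros((2, 256, 256, 3), dtype=np.float32)).shape)    # (2, 256, 256, 8)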
Aside from the shape placeholder, a (non-scalar) Tensor behaves like a normal numpy array:

tensor1 = tf.constant([0, 1, 2, 3])          # shape (4,)
tensor2 = tf.constant([[0], [1], [2], [3]])  # shape (4, 1)

for x in tensor1:
    print(x)  # 0, 1, 2, 3
for x in tensor2:
    print(x)  # Tensor([0]), Tensor([1]), Tensor([2]), Tensor([3])
The only difference is that it can be allocated into any supported device memory (CPU / Cuda GPU).
Iterating through the dataset is just like slicing it into chunks of (usually) constant size, where that constant is your batch size, which fills that empty None dimension.
This line of code is responsible for slicing your dataset into "sub-tensors" ("sub-arrays") composed of its samples:
dataset = dataset.batch(N)

# iterating over it:
for batch in dataset:  # I'm taking N samples here
    ...
Your "runtime" shape will be (N, 256, 256, 3), but if you will try to take an element from the dataset it could still have None in the shape... That's because we can't guarantee, for example, that the dimension of the dataset is exactly divisible by the batch size, so some trailing samples of a variable shape could still be possible. You will hardly get rid off that None dimension, but in some custom methods of your model you could achieve that.
If you are still uncomfortable with tensors, there is the tensor.numpy() method that gives you back a numpy array, at the cost of copying it (usually to your CPU). It is not available in every step of the process.
There are many ways to define a dataset in TensorFlow; I suggest reading how they think you should build an input pipeline, because it will make your life easier once you understand how much TensorFlow lifts your code to higher levels of abstraction.

Related

How to perform element convolution between two tensors?

I have two tensors, both holding a batch of N images at the same resolution. I would like to convolve the first image in tensor 1 with the first image of tensor 2, the second image of tensor 1 with the second image of tensor 2, and so on. I want the output to be a tensor with N images of the same size.
I looked into using tf.nn.conv2d, but it seems like this command takes in a batch of N images and convolves them with a single filter.
I looked into examples like What does tf.nn.conv2d do in tensorflow?, but they do not talk about multiple images and multiple filters.
You can manage to do something like that using tf.nn.separable_conv2d, using the batch dimension as the separable channels and the actual input channels as the batch dimension. I am not sure it is going to perform very well, though, as it involves several transpositions (which are not free in TensorFlow) and a convolution through a large number of channels, which is not really the optimized use case. Here is how it could work:
import tensorflow as tf
import numpy as np
import scipy.signal

# Expects imgs with shape (B, H, W, C) and filters with shape (B, H, W, 1)
def batch_conv(imgs, filters, strides, padding, rate=None):
    imgs = tf.convert_to_tensor(imgs)
    filters = tf.convert_to_tensor(filters)
    b = tf.shape(imgs)[0]
    imgs_t = tf.transpose(imgs, [3, 1, 2, 0])
    filters_t = tf.transpose(filters, [1, 2, 0, 3])
    strides = [strides[3], strides[1], strides[2], strides[0]]
    # "do-nothing" pointwise filter
    pointwise = tf.eye(b, batch_shape=[1, 1])
    conv = tf.nn.separable_conv2d(imgs_t, filters_t, pointwise, strides, padding, rate)
    return tf.transpose(conv, [3, 1, 2, 0])

# Slow, loop-based version using SciPy's correlate to check result
def batch_conv_np(imgs, filters, padding):
    return np.stack(
        [np.stack([scipy.signal.correlate2d(img[..., i], filter[..., 0], padding.lower())
                   for i in range(img.shape[-1])], axis=-1)
         for img, filter in zip(imgs, filters)], axis=0)

# Make random input
np.random.seed(0)
imgs = np.random.rand(5, 20, 30, 3).astype(np.float32)
filters = np.random.rand(5, 20, 30, 1).astype(np.float32)
padding = 'SAME'

# Test
res_np = batch_conv_np(imgs, filters, padding)
with tf.Graph().as_default(), tf.Session() as sess:
    res_tf = batch_conv(imgs, filters, [1, 1, 1, 1], padding)
    res_tf_val = sess.run(res_tf)
print(np.allclose(res_np, res_tf_val))
# True
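An alternative sketch (not part of the original answer) is to map a depthwise convolution over the batch with tf.map_fn; each sample's single-channel filter is tiled across the image channels, which matches the per-channel SciPy check above, though performance is not guaranteed to be better than the separable trick:

# Sketch only: same (B, H, W, C) images and (B, h, w, 1) filters as above.
def batch_conv_map_fn(imgs, filters, strides, padding):
    imgs = tf.convert_to_tensor(imgs)
    filters = tf.convert_to_tensor(filters)

    def conv_one(args):
        img, filt = args
        img = img[tf.newaxis]                                   # (1, H, W, C)
        filt = tf.tile(filt[..., tf.newaxis],                   # (h, w, 1, 1)
                       tf.stack([1, 1, tf.shape(img)[-1], 1]))  # -> (h, w, C, 1)
        out = tf.nn.depthwise_conv2d(img, filt, strides, padding)
        return out[0]

    return tf.map_fn(conv_one, (imgs, filters), dtype=imgs.dtype)

It can be checked against res_np in the same way as batch_conv, e.g. batch_conv_map_fn(imgs, filters, [1, 1, 1, 1], padding).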

How to convolve signal with 1D kernel in TensorFlow?

I am trying to filter a TensorFlow tensor of shape (N_batch, N_data), where N_batch is the batch size (e.g. 32), and N_data is the size of the (noisy) timeseries array. I have a Gaussian kernel (taken from here), which is one-dimensional. I then want to use tensorflow.nn.conv1d to convolve this kernel with my signal.
I have been trying for most of the morning to get the dimensions of the input signal and the kernel right, but obviously with no success. From what I gathered from the interwebs, the dimensions of both the input signal and the kernel need to be aligned in some finicky way, and I just can't figure out which way that is. The TensorFlow error messages aren't particularly meaningful either (Shape must be rank 4 but is rank 3 for 'conv1d/Conv2D' (op: 'Conv2D') with input shapes: [?,1,1000], [1,81]). Below I've included a little piece of code to reproduce the situation:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Based on: https://stackoverflow.com/a/52012658/1510542
# Credits to #zephyrus
def gaussian_kernel(size, mean, std):
    d = tf.distributions.Normal(tf.cast(mean, tf.float32), tf.cast(std, tf.float32))
    vals = d.prob(tf.range(start=-size, limit=size+1, dtype=tf.float32))
    kernel = vals  # Some reshaping is required here
    return kernel / tf.reduce_sum(kernel)

def gaussian_filter(input, sigma):
    size = int(4*sigma + 0.5)
    x = input  # Some reshaping is required here
    kernel = gaussian_kernel(size=size, mean=0.0, std=sigma)
    conv = tf.nn.conv1d(x, kernel, stride=1, padding="SAME")
    return conv
def run_filter():
    tf.reset_default_graph()

    # Define size of data, batch sizes
    N_batch = 32
    N_data = 1000

    noise = 0.2 * (np.random.rand(N_batch, N_data) - 0.5)
    x = np.linspace(0, 2*np.pi, N_data)
    y = np.tile(np.sin(x), N_batch).reshape(N_batch, N_data)
    y_noisy = y + noise

    input = tf.placeholder(tf.float32, shape=[None, N_data])
    smooth_input = gaussian_filter(input, sigma=10)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        y_smooth = smooth_input.eval(feed_dict={input: y_noisy})
        plt.plot(y_noisy[0])
        plt.plot(y_smooth[0])
        plt.show()

if __name__ == "__main__":
    run_filter()
Any ideas?
You need to add channel dimensions to your input/kernel, since TF convolutions are generally used for multi-channel inputs/outputs. As you are working with simple 1-channel input/output this amounts to just adding some size-1 "dummy" axes.
Since by default convolution expects channels to come last, your placeholder should have shape [None, N_data, 1] and your input should be modified like this:
y_noisy = y + noise
y_noisy = y_noisy[:, :, np.newaxis]
Similarly, you need to add input and output channel dimensions to your filter:
kernel = gaussian_kernel(size=size, mean=0.0, std=sigma)
kernel = kernel[:, tf.newaxis, tf.newaxis]
That is, the filter is expected to have shape [width, in_channels, out_channels].
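Putting both reshapes together, one possible sketch (not the answer's exact code; here the dummy axes are added inside the graph, which keeps the placeholder shape [None, N_data] from the question unchanged and is equivalent to reshaping the numpy arrays):

def gaussian_kernel(size, mean, std):
    d = tf.distributions.Normal(tf.cast(mean, tf.float32), tf.cast(std, tf.float32))
    vals = d.prob(tf.range(start=-size, limit=size + 1, dtype=tf.float32))
    kernel = vals[:, tf.newaxis, tf.newaxis]   # [width, in_channels=1, out_channels=1]
    return kernel / tf.reduce_sum(kernel)

def gaussian_filter(input, sigma):
    size = int(4 * sigma + 0.5)
    x = input[:, :, tf.newaxis]                # [batch, N_data, channels=1]
    kernel = gaussian_kernel(size=size, mean=0.0, std=sigma)
    conv = tf.nn.conv1d(x, kernel, stride=1, padding="SAME")
    return conv[:, :, 0]                       # drop the dummy channel again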

How do I display the feature maps (filtered layers) in a tensorflow CNN?

I need some help viewing the feature maps in a plant leaf classification program using TensorFlow.
I have a function that takes in any number of images (size 128x128x3) and convolves the images using some filter (size 3x3x32).
layer_conv1 = create_convolutional_layer(input=x,
                                         num_input_channels=num_channels,
                                         conv_filter_size=filter_size_conv1,
                                         num_filters=num_filters_conv1)
print(layer_conv1)
The code outputs a tensor as printed: Tensor("Relu_182:0", shape=(?, 64, 64, 32), dtype=float32)
I am trying to display an image on the console from the tensor, and I've tried the following code (using matplotlib.pyplot):
session.run(tf.global_variables_initializer())
img = session.run(layer_conv1)
plt.imshow(img)
plt.show()
and

img = layer_conv1[0,:,:,:].eval(session=session)

which both don't work.
One of the errors that occurs is: You must feed a value for placeholder tensor 'x_54' with dtype float and shape [?,128,128,3].
You define your layer with
layer_conv1 = create_convolutional_layer(input=x,...)
Here, x is a placeholder defined with something like
x = tf.placeholder(tf.float32, [None, 128, 128, 3])
When you call img = session.run(layer_conv1) you need to feed a value for x like with
img = session.run(layer_conv1, feed_dict={x: myImage})
where myImage is a numpy array of shape [1, 128, 128, 3] representing your image.
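To actually display one of the resulting feature maps, something along these lines should work (a sketch; plt.imshow needs a 2-D slice, so pick one image and one channel from the (1, 64, 64, 32) output):

feature_maps = session.run(layer_conv1, feed_dict={x: myImage})  # shape (1, 64, 64, 32)

# Show the first channel of the first (and only) image in the batch.
plt.imshow(feature_maps[0, :, :, 0], cmap='gray')
plt.show()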

How do I reshape the dimensions of an image to contain the number of images (i.e., 1) as well?

I am running a neural network model on some images. Initially, for training, I converted all the images into a pandas dataframe of dimension (# of images in the dataset) x r x g x b, where r, g, b are the colour values of each image. Now, when I try to test the model on a single externally downloaded image, it gives a dimension error because, obviously, the image's dimensions are only r x g x b. How do I add the number of images as a dimension to this image?
EDIT: Here's the code:
#load the data as a pandas data frame
import pandas as pd
dataset = pd.read_csv(os.path.join(data_path, 'data.csv'))
# split into input (X) and output (Y) variables
X = dataset.values[:,0]
Y = dataset.values[:,1]
# Load all the images and resize them into a single numpy array of consistent dimension
from scipy.misc import imresize
from scipy.misc import imread
import numpy as np
temp = []
for img_name in X:
    img_path = os.path.join(data_dir, 'Train', img_name)
    img = imread(img_path)
    img = imresize(img, (32, 32))
    img = img.astype('float32')
    temp.append(img)
X = np.stack(temp)
# Convert the data classes from words into a number format readable by the program
from sklearn.preprocessing import LabelEncoder
lb = LabelEncoder()
Y = lb.fit_transform(Y)
Y = keras.utils.np_utils.to_categorical(Y)
# Split the data into 67% for training and 33% for testing
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33)
### Define the neural network model
### Compile and train the model on the data
### Evaluate it
# Test it on an externally downloaded image
img = imread(os.path.join(image_folder, downloaded_image)).astype('float32')
plt.imshow(imresize(img, (128, 128)))
print('X_train shape: ', X_train.shape)
print('Downloaded image shape: ', img.shape)
This returns:
X_train shape: (13338, 32, 32, 3)
Downloaded image shape: (448, 720, 3)
I want to make the downloaded image's shape (1, 448, 720, 3) so that it matches the dimensionality of X_train, because when I try to predict the class of the downloaded image, I get a dimension error:
pred = cnn_model.predict_classes(img)
print('Predicted:', lb.inverse_transform(pred))
This returns:
ValueError: Error when checking : expected conv2d_71_input to have 4 dimensions, but got array with shape (960, 640, 3)
From your description, it seems like you don't really mean to use the number of images as a feature, but rather as a sample weight. Conceptually, you probably want to transform
k x r x g x b
to
r x g x b
... # repeat k times
r x g x b
which would naturally make the input and output dimensions identical, BTW. If this increases learning time too much, and your library has a sample weight parameter, you should consider using it.
If you'd like to just technically add a dimension, you can use np.expand_dims:
>>> np.expand_dims(np.array([[1, 2, 3], [3, 4, 5]]), axis=0).shape
(1, 2, 3)
However, I cannot say I'm sure that this is fundamentally what you want.
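Applied to the downloaded image from the question, that would look something like this (a sketch; depending on the model, you may also need to resize the image to the 32x32 training resolution before predicting):

img = imread(os.path.join(image_folder, downloaded_image)).astype('float32')
img = np.expand_dims(img, axis=0)   # (448, 720, 3) -> (1, 448, 720, 3)
print(img.shape)                    # (1, 448, 720, 3)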

tensorflow: how to set the shape of tensor with different conditional statements?

I would like to train a network with two different shapes of input tensor. Each epoch uses one of the two shapes.
Here is a small piece of code:
import tensorflow as tf
import numpy as np
with tf.Session() as sess:
    imgs1 = tf.placeholder(tf.float32, [4, 224, 224, 3], name='input_imgs1')
    imgs2 = tf.placeholder(tf.float32, [4, 180, 180, 3], name='input_imgs2')
    epoch_num_tf = tf.placeholder(tf.int32, [], name='input_epoch_num')

    imgs = tf.cond(tf.equal(tf.mod(epoch_num_tf, 2), 0),
                   lambda: tf.Print(imgs2, [imgs2.get_shape()], message='(even number) input epoch number is '),
                   lambda: tf.Print(imgs1, [imgs1.get_shape()], message='(odd number) input epoch number is'))

    print(imgs.get_shape())

    for epoch in range(10):
        epoch_num = np.array(epoch).astype(np.int32)
        imgs1_input = np.ones([4, 224, 224, 3], dtype=np.float32)
        imgs2_input = np.ones([4, 180, 180, 3], dtype=np.float32)
        output = sess.run(imgs, feed_dict={epoch_num_tf: epoch_num,
                                           imgs1: imgs1_input,
                                           imgs2: imgs2_input})
When I execute it, the output of imgs.get_shape() is (4, ?, ?, 3)
i.e. imgs.get_shape()[1]=None, imgs.get_shape()[2]=None.
But later in the code I want to use the output of imgs.get_shape() to define the kernel size (ksize) and strides (strides) of tf.nn.max_pool(), e.g. ksize=[1, imgs.get_shape()[1]/6, imgs.get_shape()[2]/6, 1].
I think ksize and strides cannot accept tf.Tensor values.
How to solve this problem? Or how to set the shape of imgs conditionally?
When you call print(imgs.get_shape()), you are getting the static shape of the tensor imgs. Dimensions 1 and 2 of imgs vary dynamically with the value of epoch_num_tf, so the static shape in those dimensions is unknown, which TensorFlow represents as None.
If you want to use the dynamic shape of imgs in subsequent code, you should use the tf.shape() operator to get the shape as a tf.Tensor. For example, instead of imgs.get_shape()[2], you can use tf.shape(imgs)[2].
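As a quick illustration of the difference (a small sketch using the imgs tensor from your code):

static_w = imgs.get_shape()[2]   # Dimension(None): unknown while the graph is being built
dynamic_w = tf.shape(imgs)[2]    # scalar int32 Tensor, resolved to 180 or 224 at run time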
Unfortunately, the ksize and strides arguments of tf.nn.max_pool() do not accept tf.Tensor values. (I think this is a historical limitation, because these were configured as "attrs" rather than "inputs" of the corresponding kernel. Please open a GitHub issue if you'd like to request this feature!) One possible workaround would be to use another tf.cond():
imgs = ...

# Could also use `tf.equal(tf.mod(epoch_num_tf, 2), 0)` as the predicate.
pool_output = tf.cond(tf.equal(tf.shape(imgs)[2], 180),
                      lambda: tf.nn.max_pool(imgs, ksize=[1, 180/6, 180/6, 1], ...),
                      lambda: tf.nn.max_pool(imgs, ksize=[1, 224/6, 224/6, 1], ...))
