tensorflow: Strange result from convolution compared to theano (not flipping, though)

I am trying to implement a deep neural network with TensorFlow, but I have already run into a problem at the first steps.
When I type the following using theano.tensor.nnet.conv2d, I get the expected result:
import theano.tensor as T
import theano
import numpy as np
# Theano expects input of shape (batch_size, channels, height, width)
# and filters of shape (out_channel, in_channel, height, width)
x = T.tensor4()
w = T.tensor4()
c = T.nnet.conv2d(x, w, filter_flip=False)
f = theano.function([x, w], [c], allow_input_downcast=True)
base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T
i = base[np.newaxis, np.newaxis, :, :]
print(f(i, i))  # -> results in 3 as expected because np.sum(i * i) = 3
However, when I do what is presumably the same thing with tf.nn.conv2d, my result is different:
import tensorflow as tf
import numpy as np
# TF expects input of (batch_size, height, width, channels)
# and filters of shape (height, width, in_channel, out_channel)
x = tf.placeholder(tf.float32, shape=(1, 4, 3, 1), name="input")
w = tf.placeholder(tf.float32, shape=(4, 3, 1, 1), name="weights")
c = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='VALID')
with tf.Session() as sess:
    base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T
    i = base[np.newaxis, :, :, np.newaxis]
    weights = base[:, :, np.newaxis, np.newaxis]
    res = sess.run(c, feed_dict={x: i, w: weights})
    print(res)  # -> results in -5.31794233e+37
The layout of the convolution operation in TensorFlow is a little different from Theano, which is why the input looks slightly different.
However, since strides in Theano default to (1, 1, 1, 1) and a valid convolution is the default too, this should be exactly the same input.
Furthermore, TensorFlow does not flip the kernel (it implements cross-correlation), which is what filter_flip=False requests from Theano as well.
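(A true convolution is just a cross-correlation with the kernel flipped along both spatial axes, which is what filter_flip controls in Theano. A quick sanity check of that identity with SciPy, assuming scipy is installed:)
import numpy as np
from scipy.signal import convolve2d, correlate2d
a = np.random.rand(4, 3)
k = np.random.rand(2, 2)
# true convolution == cross-correlation with the kernel flipped on both axes
print(np.allclose(convolve2d(a, k, mode='valid'),
                  correlate2d(a, k[::-1, ::-1], mode='valid')))  # -> True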
Do you have any idea why this is not giving the same result?
Thanks in advance,
Roman

Okay, I found a solution, even though it is not really one because I do not understand it myself.
First, it seems that for the task I was trying to solve, Theano and TensorFlow use different convolution operations.
The task at hand is a "1.5D convolution", which means sliding a kernel in only one direction over the input (here DNA sequences).
In Theano, I solved this with the conv2d operation, using kernels with the same number of rows as the input, and it worked fine.
However, TensorFlow (probably correctly) wants me to use conv1d for that, interpreting the rows as channels.
So, the following should work but did not in the beginning:
import tensorflow as tf
import numpy as np
# TF expects input of (batch_size, height, width, channels)
# and filters of shape (height, width, in_channel, out_channel)
x = tf.placeholder(tf.float32, shape=(1, 4, 3, 1), name="input")
w = tf.placeholder(tf.float32, shape=(4, 3, 1, 1), name="weights")
x_star = tf.reshape(x, [1, 4, 3])
w_star = tf.reshape(w, [4, 3, 1])
c = tf.nn.conv1d(x_star, w_star, stride=1, padding='VALID')
with tf.Session() as sess:
    base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T
    i = base[np.newaxis, :, :, np.newaxis]
    weights = base[:, :, np.newaxis, np.newaxis]
    res = sess.run(c, feed_dict={x: i, w: weights})
    print(res)  # -> produces 3 after updating tensorflow
This code produced NaN until I updated TensorFlow to version 1.0.1; since then, it produces the expected output.
To summarize, my problem was partly solved by using a 1D convolution instead of a 2D convolution, but it still required updating the framework. For the second part, I have no idea what might have caused the wrong behavior in the first place.
EDIT: The code I posted in my original question is now working fine, too. So the different behavior came only from an old (maybe corrupt) version of TF.
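For completeness, the expected value can be checked with plain NumPy: with a VALID convolution and a kernel as large as the input, the output collapses to a single number, the sum of the element-wise product.
import numpy as np
base = np.array([[1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]).T
# a VALID (cross-)correlation with a kernel the same size as the input
# reduces to the element-wise product summed over all positions
print(np.sum(base * base))  # -> 3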

Related

Issue with using pooling layers in Spektral package for graph neural networks

I am currently working on implementing graph neural networks using the Spektral package in Python. I am trying to include pooling layers (such as TopKPool) in my model, but I am running into dimension issues.
I have created a sample code to reproduce the error. Here is the code:
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from spektral.layers import GCNConv, TopKPool
# Example adjacency matrix and feature matrix
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=np.float32)
X = np.array([[0, 1],
              [2, 3],
              [4, 5],
              [6, 7]], dtype=np.float32)
# Add a batch dimension to X and A
X = np.expand_dims(X, axis=0)
A = np.expand_dims(A, axis=0)
# Build the graph convolutional network model
X_in = Input((X.shape[1], X.shape[2]))
A_in = Input((A.shape[1], A.shape[2]))
gc1 = GCNConv(16, activation='relu')([X_in, A_in])
gc2 = GCNConv(32, activation='relu')([gc1, A_in])
pool1, A_out = TopKPool(ratio=0.5)([gc2, A_in])
flatten = Dense(128, activation='relu')(pool1)
model = Model(inputs=[X_in, A_in], outputs=flatten)
When I run this code, I get the following error message:
ValueError: Dimensions [1,1) of input[shape=[?]] = [] must match
dimensions [1,2) of updates[shape=[?,1]] = 1: Shapes must be equal
rank, but are 0 and 1 for '{{node top_k_pool/TensorScatterUpdate}} =
TensorScatterUpdate[T=DT_FLOAT, Tindices=DT_INT32](top_k_pool/sub_1,
top_k_pool/strided_slice_5, top_k_pool/strided_slice_1)' with input
shapes: [?], [?,1], [?,1].
I have tried using different pooling layers (SAGPool etc.) and checked the dimensions according to the package documentation, but I keep encountering the same error.
Here are the input dimensions: node features of shape (batch, n_nodes_in, n_node_features); adjacency matrix of shape (batch, n_nodes_in, n_nodes_in).
I would greatly appreciate any assistance in resolving this issue.

Max pool a single image in tensorflow using "tf.nn.max_pool"

I want to apply tf.nn.max_pool() to a single image, but I get a result whose dimensions are completely different from the input:
import tensorflow as tf
import numpy as np
ifmaps_1 = tf.Variable(tf.random_uniform( shape=[ 7, 7, 3], minval=0, maxval=3, dtype=tf.int32))
ifmaps=tf.dtypes.cast(ifmaps_1, dtype=tf.float64)
ofmaps_tf = tf.nn.max_pool([ifmaps], ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="SAME")[0] # no padding
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print("ifmaps_tf = ")
    print(ifmaps.eval())
    print("ofmaps_tf = ")
    result = sess.run(ofmaps_tf)
    print(result)
I think this is related to applying the pooling to a single example rather than to a batch. I need to do the pooling on a single example.
Any help is appreciated.
Your input is (7, 7, 3), the kernel size is (3, 3), and the stride is (2, 2). So if you do not want any padding (as stated in your comment), you should use padding="VALID", which will return a (3, 3) spatial output. If you use padding="SAME", it will return a (4, 4) spatial output.
In general, the formula for the output size with SAME padding is:
out_size = ceil(in_size / stride)
and with VALID padding:
out_size = ceil((in_size - filter_size + 1) / stride)
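As a quick check of these formulas against the example above (an illustrative sketch, not TensorFlow code):
import math
def same_out(in_size, stride):
    return math.ceil(in_size / stride)
def valid_out(in_size, filter_size, stride):
    return math.ceil((in_size - filter_size + 1) / stride)
# 7x7 input, 3x3 kernel, stride 2, as in the question
print(same_out(7, 2))      # -> 4, so padding="SAME" gives a (4, 4) spatial output
print(valid_out(7, 3, 2))  # -> 3, so padding="VALID" gives a (3, 3) spatial output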

How to perform element convolution between two tensors?

I have two tensors, each containing a batch of N images of the same resolution. I would like to convolve the first image in tensor 1 with the first image of tensor 2, the second image of tensor 1 with the second image of tensor 2, and so on. I want the output to be a tensor with N images of the same size.
I looked into using tf.nn.conv2d, but it seems like this command will take in a batch of N images and convolve them with a single filter.
I looked into examples like "What does tf.nn.conv2d do in tensorflow?", but they do not talk about multiple images and multiple filters.
You can manage to do something like that using tf.nn.separable_conv2d, using the batch dimension as the separable channels and the actual input channels as the batch dimension. I am not sure it is going to perform very well, though, as it involves several transpositions (which are not free in TensorFlow) and a convolution through a large number of channels, which is not really the optimized use case. Here is how it could work:
import tensorflow as tf
import numpy as np
import scipy.signal
# Expects imgs with shape (B, H, W, C) and filters with shape (B, H, W, 1)
def batch_conv(imgs, filters, strides, padding, rate=None):
    imgs = tf.convert_to_tensor(imgs)
    filters = tf.convert_to_tensor(filters)
    b = tf.shape(imgs)[0]
    imgs_t = tf.transpose(imgs, [3, 1, 2, 0])
    filters_t = tf.transpose(filters, [1, 2, 0, 3])
    strides = [strides[3], strides[1], strides[2], strides[0]]
    # "do-nothing" pointwise filter
    pointwise = tf.eye(b, batch_shape=[1, 1])
    conv = tf.nn.separable_conv2d(imgs_t, filters_t, pointwise, strides, padding, rate)
    return tf.transpose(conv, [3, 1, 2, 0])
# Slow, loop-based version using SciPy's correlate to check result
def batch_conv_np(imgs, filters, padding):
    return np.stack(
        [np.stack([scipy.signal.correlate2d(img[..., i], filter[..., 0], padding.lower())
                   for i in range(img.shape[-1])], axis=-1)
         for img, filter in zip(imgs, filters)], axis=0)
# Make random input
np.random.seed(0)
imgs = np.random.rand(5, 20, 30, 3).astype(np.float32)
filters = np.random.rand(5, 20, 30, 1).astype(np.float32)
padding = 'SAME'
# Test
res_np = batch_conv_np(imgs, filters, padding)
with tf.Graph().as_default(), tf.Session() as sess:
    res_tf = batch_conv(imgs, filters, [1, 1, 1, 1], padding)
    res_tf_val = sess.run(res_tf)
print(np.allclose(res_np, res_tf_val))
# True

Tensorflow InvalidArgumentError Matrix size incompatible

When I run this simple neural net, I get an error. By the way, the code should output the first number of the test array.
There have been other errors, too (one had to do with the data's dtype).
import tensorflow as tf
import numpy as np
from tensorflow import keras
data = np.array([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
labels = np.array([0, 0, 1])
data.dtype = float
print(data.dtype)
model = keras.Sequential([
    keras.layers.Dense(3, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.softmax)])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels)
prediction = model.predict([0, 1, 0])
print(prediction)
I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [3,1], In[1]: [3,3]
[[{{node sequential/dense/Relu}}]]
You are getting the above error because of the line below:
prediction = model.predict([0, 1, 0])
You are passing a plain list, but predict expects a NumPy array of shape Nx3, where N is the batch size and can be 1, 2, etc. In this case, it will be 1.
In order to make it correct, change it to
prediction = model.predict(np.expand_dims(np.array([0, 1, 0], dtype=np.float32), 0))
or
prediction = model.predict(np.array([[0, 1, 0]], dtype=np.float32))
And change data.dtype = float to data = data.astype(np.float32): reassigning dtype reinterprets the underlying buffer instead of casting the values.
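For reference, a minimal sketch of the question's script with both fixes applied (an explicit cast of the data and a batched input to predict), keeping the layer sizes exactly as in the question:
import numpy as np
import tensorflow as tf
from tensorflow import keras
data = np.array([[0, 1, 1], [0, 0, 1], [1, 1, 1]]).astype(np.float32)
labels = np.array([0, 0, 1])
model = keras.Sequential([
    keras.layers.Dense(3, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.softmax)])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels)
# predict expects a batch, so the single sample gets an extra leading dimension
prediction = model.predict(np.array([[0, 1, 0]], dtype=np.float32))
print(prediction)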

Tensorflow Convolution Neural Network with different sized images

I am attempting to create a deep CNN that can classify each individual pixel in an image. I am replicating the architecture from this paper, which mentions that deconvolutions are used so that any input size is possible.
Github Repository
Currently, I have hard-coded my model to accept images of size 32x32x7, but I would like to accept input of any size. What changes would I need to make to my code to accept variable-sized input?
x = tf.placeholder(tf.float32, shape=[None, 32*32*7])
y_ = tf.placeholder(tf.float32, shape=[None, 32*32*7, 3])
...
DeConnv1 = tf.nn.conv3d_transpose(layer1, filter = w, output_shape = [1,32,32,7,1], strides = [1,2,2,2,1], padding = 'SAME')
...
final = tf.reshape(final, [1, 32*32*7])
W_final = weight_variable([32*32*7,32*32*7,3])
b_final = bias_variable([32*32*7,3])
final_conv = tf.tensordot(final, W_final, axes=[[1], [1]]) + b_final
Dynamic placeholders
TensorFlow allows multiple dynamic (a.k.a. None) dimensions in placeholders. The engine won't be able to ensure correctness while the graph is built, so the client is responsible for feeding the correct input, but this provides a lot of flexibility.
So I'm going from...
x = tf.placeholder(tf.float32, shape=[None, N*M*P])
y_ = tf.placeholder(tf.float32, shape=[None, N*M*P, 3])
...
x_image = tf.reshape(x, [-1, N, M, P, 1])
to...
# Nearly all dimensions are dynamic
x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])
Since you intend to reshape the input to 5D anyway, why not use 5D for x_image right from the start? At this point, the second dimension of label is arbitrary, but we promise TensorFlow that it will match x_image.
Dynamic shapes in deconvolution
Next, the nice thing about tf.nn.conv3d_transpose is that its output shape can be dynamic. So instead of this:
# Hard-coded output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=[1,32,32,7,1], ...)
... you can do this:
# Dynamic output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=tf.shape(x_image), ...)
This way the transpose convolution can be applied to any image and the result will take the shape of x_image that was actually passed in at runtime.
Note that the static shape of x_image is (?, ?, ?, ?, 1).
All-Convolutional network
The final and most important piece of the puzzle is to make the whole network convolutional, and that includes your final dense layer too. A dense layer must define its dimensions statically, which forces the whole neural network to fix the input image dimensions.
Luckily for us, Springenberg et al. describe a way to replace an FC layer with a CONV layer in the paper "Striving for Simplicity: The All Convolutional Net". I'm going to use a convolution with 3 filters of size 1x1x1 (see also this question):
final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])
If we ensure that final has the same dimensions as DeConnv1 (and the others), y will have exactly the shape we want: [-1, N * M * P, 3].
Combining it all together
Your network is pretty large, but all deconvolutions basically follow the same pattern, so I've simplified my proof-of-concept code to just one deconvolution. The goal is just to show what kind of network is able to handle images of arbitrary size. Final remark: image dimensions can vary between batches, but within one batch they have to be the same.
The full code:
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
def conv3d_dilation(tempX, tempFilter):
    return tf.layers.conv3d(tempX, filters=tempFilter, kernel_size=[3, 3, 1], strides=1, padding='SAME', dilation_rate=2)
def conv3d(tempX, tempW):
    return tf.nn.conv3d(tempX, tempW, strides=[1, 2, 2, 2, 1], padding='SAME')
def conv3d_s1(tempX, tempW):
    return tf.nn.conv3d(tempX, tempW, strides=[1, 1, 1, 1, 1], padding='SAME')
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
def max_pool_3x3(x):
    return tf.nn.max_pool3d(x, ksize=[1, 3, 3, 3, 1], strides=[1, 2, 2, 2, 1], padding='SAME')
x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])
W_conv1 = weight_variable([3, 3, 1, 1, 32])
h_conv1 = conv3d(x_image, W_conv1)
# second convolution
W_conv2 = weight_variable([3, 3, 4, 32, 64])
h_conv2 = conv3d_s1(h_conv1, W_conv2)
# third convolution path 1
W_conv3_A = weight_variable([1, 1, 1, 64, 64])
h_conv3_A = conv3d_s1(h_conv2, W_conv3_A)
# third convolution path 2
W_conv3_B = weight_variable([1, 1, 1, 64, 64])
h_conv3_B = conv3d_s1(h_conv2, W_conv3_B)
# fourth convolution path 1
W_conv4_A = weight_variable([3, 3, 1, 64, 96])
h_conv4_A = conv3d_s1(h_conv3_A, W_conv4_A)
# fourth convolution path 2
W_conv4_B = weight_variable([1, 7, 1, 64, 64])
h_conv4_B = conv3d_s1(h_conv3_B, W_conv4_B)
# fifth convolution path 2
W_conv5_B = weight_variable([1, 7, 1, 64, 64])
h_conv5_B = conv3d_s1(h_conv4_B, W_conv5_B)
# sixth convolution path 2
W_conv6_B = weight_variable([3, 3, 1, 64, 96])
h_conv6_B = conv3d_s1(h_conv5_B, W_conv6_B)
# concatenation
layer1 = tf.concat([h_conv4_A, h_conv6_B], 4)
w = tf.Variable(tf.constant(1., shape=[2, 2, 4, 1, 192]))
DeConnv1 = tf.nn.conv3d_transpose(layer1, filter=w, output_shape=tf.shape(x_image), strides=[1, 2, 2, 2, 1], padding='SAME')
final = DeConnv1
final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=y))
print('x_image:', x_image)
print('DeConnv1:', DeConnv1)
print('final_conv:', final_conv)
def try_image(N, M, P, B=1):
    batch_x = np.random.normal(size=[B, N, M, P, 1])
    batch_y = np.ones([B, N * M * P, 3]) / 3.0
    deconv_val, final_conv_val, loss = sess.run(
        [DeConnv1, final_conv, cross_entropy],
        feed_dict={x_image: batch_x, label: batch_y})
    print(deconv_val.shape)
    print(final_conv_val.shape)
    print(loss)
    print()
tf.global_variables_initializer().run()
try_image(32, 32, 7)
try_image(16, 16, 3)
try_image(16, 16, 3, 2)
Theoretically, it's possible. You need to set the image size of the input and label placeholders to None and let the graph infer the image size dynamically from the input data.
However, you have to be careful when you define the graph. You need to use tf.shape instead of tf.get_shape(): the former infers the shape dynamically only when you session.run, while the latter gives the shape known when you define the graph. But when the input size is set to None, the latter does not give you the true shape (it may just return None).
And to make things more complicated, if you use tf.layers.conv2d or upconv2d, these high-level functions sometimes do not like tf.shape, because they seem to assume the shape information is available during graph construction.
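A minimal sketch of that difference (TF 1.x graph mode), just to illustrate what each call returns for a placeholder with unknown dimensions:
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, None, None, 1])
# static shape, fixed at graph-construction time; unknown dims stay unknown
print(x.get_shape())  # (?, ?, ?, 1)
# dynamic shape: a tensor that is only resolved when the graph actually runs
print(tf.shape(x))    # Tensor("Shape:0", shape=(4,), dtype=int32)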
I hope to have a better working example to show the points above. I'll leave this answer as a placeholder and will come back and add more if I get a chance.
