I am currently working on implementing graph neural networks using the Spektral package in Python. I am trying to include pooling layers (such as TopKPool) in my model, but I am running into dimension issues.
I have written a minimal example that reproduces the error. Here is the code:
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from spektral.layers import GCNConv, TopKPool
# Example adjacency matrix and feature matrix
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=np.float32)
X = np.array([[0, 1],
              [2, 3],
              [4, 5],
              [6, 7]], dtype=np.float32)
# Add a batch dimension to X
X = np.expand_dims(X, axis=0)
A = np.expand_dims(A, axis=0)
# Build the graph convolutional network model
X_in = Input((X.shape[1], X.shape[2]))
A_in = Input((A.shape[1],A.shape[2]))
gc1 = GCNConv(16, activation='relu')([X_in, A_in])
gc2 = GCNConv(32, activation='relu')([gc1, A_in])
pool1, A_out = TopKPool(ratio=0.5)([gc2, A_in])
flatten = Dense(128, activation='relu')(pool1)
model = Model(inputs=[X_in, A_in], outputs=flatten)
When I run this code, I get the following error message:
ValueError: Dimensions [1,1) of input[shape=[?]] = [] must match
dimensions [1,2) of updates[shape=[?,1]] = 1: Shapes must be equal
rank, but are 0 and 1 for '{{node top_k_pool/TensorScatterUpdate}} =
TensorScatterUpdate[T=DT_FLOAT, Tindices=DT_INT32](top_k_pool/sub_1,
top_k_pool/strided_slice_5, top_k_pool/strided_slice_1)' with input
shapes: [?], [?,1], [?,1].
I have tried using different pooling layers (SAGPool etc.) and checked the dimensions according to the package documentation, but I keep encountering the same error.
Here are the input dimensions:
Node features of shape (batch, n_nodes_in, n_node_features);
Adjacency matrix of shape (batch, n_nodes_in, n_nodes_in).
I would greatly appreciate any assistance in resolving this issue.
I want to apply "tf.nn.max_pool()" to a single image, but the result has dimensions that are completely different from the input:
import tensorflow as tf
import numpy as np
ifmaps_1 = tf.Variable(tf.random_uniform( shape=[ 7, 7, 3], minval=0, maxval=3, dtype=tf.int32))
ifmaps=tf.dtypes.cast(ifmaps_1, dtype=tf.float64)
ofmaps_tf = tf.nn.max_pool([ifmaps], ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="SAME")[0] # no padding
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print("ifmaps_tf = ")
    print(ifmaps.eval())
    print("ofmaps_tf = ")
    result = sess.run(ofmaps_tf)
    print(result)
I think this is related to applying pooling to a single example rather than to a batch. I need to do the pooling on a single example.
Any help is appreciated.
Your input is (7, 7, 3), the kernel size is (3, 3) and the stride is (2, 2). If you do not want any padding (as stated in your comment), use padding="VALID", which returns a 3x3 output per channel, i.e. shape (3, 3, 3). With padding="SAME" the output is 4x4 per channel, i.e. shape (4, 4, 3).
The usual formula for the output size with SAME padding is:
out_size = ceil(in_size / stride)
With VALID padding it is:
out_size = ceil((in_size - filter_size + 1) / stride)
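As a quick check of the two formulas, here is a minimal sketch in plain Python. The helper name `pooled_size` is made up for illustration and is not part of TensorFlow; it only does the arithmetic for one spatial axis.

```python
import math


def pooled_size(in_size, filter_size, stride, padding):
    """Output length along one spatial axis (hypothetical helper, not a TF API)."""
    if padding == "SAME":
        # SAME pads the input so that every stride position produces an output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # VALID only keeps windows that fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")


# The case from the question: 7x7 input, 3x3 window, stride 2.
print(pooled_size(7, 3, 2, "SAME"))   # 4
print(pooled_size(7, 3, 2, "VALID"))  # 3
```

The channel dimension is untouched by pooling, so the full outputs would be (4, 4, 3) and (3, 3, 3) respectively.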
I have two tensors, each holding a batch of N images at the same resolution. I would like to convolve the first image in tensor 1 with the first image of tensor 2, the second image of tensor 1 with the second image of tensor 2, and so on. I want the output to be a tensor with N images of the same size.
I looked into using tf.nn.conv2d, but it seems like this command will take in a batch of N images and convolve them with a single filter.
I looked into examples like "What does tf.nn.conv2d do in tensorflow?", but they do not talk about multiple images and multiple filters.
You can do something like that with tf.nn.separable_conv2d, using the batch dimension as the separable channels and the actual input channels as the batch dimension. I am not sure it will perform very well, though, as it involves several transpositions (which are not free in TensorFlow) and a convolution over a large number of channels, which is not really the optimized use case. Here is how it could work:
import tensorflow as tf
import numpy as np
import scipy.signal
# Expects imgs with shape (B, H, W, C) and filters with shape (B, H, W, 1)
def batch_conv(imgs, filters, strides, padding, rate=None):
    imgs = tf.convert_to_tensor(imgs)
    filters = tf.convert_to_tensor(filters)
    b = tf.shape(imgs)[0]
    imgs_t = tf.transpose(imgs, [3, 1, 2, 0])
    filters_t = tf.transpose(filters, [1, 2, 0, 3])
    strides = [strides[3], strides[1], strides[2], strides[0]]
    # "do-nothing" pointwise filter
    pointwise = tf.eye(b, batch_shape=[1, 1])
    conv = tf.nn.separable_conv2d(imgs_t, filters_t, pointwise, strides, padding, rate)
    return tf.transpose(conv, [3, 1, 2, 0])
# Slow, loop-based version using SciPy's correlate to check result
def batch_conv_np(imgs, filters, padding):
    return np.stack(
        [np.stack([scipy.signal.correlate2d(img[..., i], filter[..., 0], padding.lower())
                   for i in range(img.shape[-1])], axis=-1)
         for img, filter in zip(imgs, filters)], axis=0)
# Make random input
np.random.seed(0)
imgs = np.random.rand(5, 20, 30, 3).astype(np.float32)
filters = np.random.rand(5, 20, 30, 1).astype(np.float32)
padding = 'SAME'
# Test
res_np = batch_conv_np(imgs, filters, padding)
with tf.Graph().as_default(), tf.Session() as sess:
    res_tf = batch_conv(imgs, filters, [1, 1, 1, 1], padding)
    res_tf_val = sess.run(res_tf)
print(np.allclose(res_np, res_tf_val))
# True
When I run this simple neural net, I get an error. By the way, the code should output the first number of the test array.
There have been other errors (one had to do with the data's dtype).
import tensorflow as tf
import numpy as np
from tensorflow import keras
data = np.array([[0, 1, 1], [0, 0, 1], [1, 1, 1]])
labels = np.array([0, 0, 1])
data.dtype = float
print(data.dtype)
model = keras.Sequential([
    keras.layers.Dense(3, activation=tf.nn.relu),
    keras.layers.Dense(2, activation=tf.nn.softmax)])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels)
prediction = model.predict([0, 1, 0])
print(prediction)
I get this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Matrix size-incompatible: In[0]: [3,1], In[1]: [3,3]
[[{{node sequential/dense/Relu}}]]
You are getting this error because of the following line:
prediction = model.predict([0, 1, 0])
You are passing a list, but predict expects a NumPy array of shape Nx3, where N is the batch size and can be 1, 2, etc. In this case it is 1.
To fix it, change the call to
prediction = model.predict(np.expand_dims(np.array([0, 1, 0], dtype=np.float32), 0))
or
prediction = model.predict(np.array([[0, 1, 0]], dtype=np.float32))
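Also, replace data.dtype = float with data = data.astype(np.float32). Assigning to data.dtype only reinterprets the existing integer bytes as floats, whereas astype actually converts the values.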
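To make the shape and dtype points concrete, here is a minimal NumPy-only sketch. It assumes the training data is stored as 64-bit integers (stated explicitly below so the result does not depend on the platform default) and shows that astype converts values while reinterpreting the buffer does not, and that a single sample needs a leading batch axis.

```python
import numpy as np

# Same values as in the question; int64 is an assumption made explicit here.
data = np.array([[0, 1, 1], [0, 0, 1], [1, 1, 1]], dtype=np.int64)

# astype creates a new array with the values converted to float32.
converted = data.astype(np.float32)

# view reinterprets the raw bytes, which is what assigning to .dtype does:
# the integer 1 becomes a tiny denormal float, not 1.0.
reinterpreted = data.view(np.float64)

# A single sample must have shape (1, 3): one row, three features.
sample = np.expand_dims(np.array([0, 1, 0], dtype=np.float32), 0)

print(converted[0, 1], reinterpreted[0, 1])
print(sample.shape)
```

The array `sample` is what would be passed to model.predict; the first axis is the batch dimension.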
I am attempting to create a deep CNN that can classify each individual pixel in an image. I am replicating the architecture shown in the image below, taken from this paper. The paper mentions that deconvolutions are used so that any input size is possible. This can be seen in the image below.
Github Repository
Currently, I have hard coded my model to accept images of size 32x32x7, but I would like to accept any size of input. What changes would I need to make to my code to accept variable sized input?
x = tf.placeholder(tf.float32, shape=[None, 32*32*7])
y_ = tf.placeholder(tf.float32, shape=[None, 32*32*7, 3])
...
DeConnv1 = tf.nn.conv3d_transpose(layer1, filter = w, output_shape = [1,32,32,7,1], strides = [1,2,2,2,1], padding = 'SAME')
...
final = tf.reshape(final, [1, 32*32*7])
W_final = weight_variable([32*32*7,32*32*7,3])
b_final = bias_variable([32*32*7,3])
final_conv = tf.tensordot(final, W_final, axes=[[1], [1]]) + b_final
Dynamic placeholders
TensorFlow allows multiple dynamic (a.k.a. None) dimensions in placeholders. The engine cannot verify correctness while the graph is being built, so the client is responsible for feeding correctly shaped input, but this provides a lot of flexibility.
So I'm going from...
x = tf.placeholder(tf.float32, shape=[None, N*M*P])
y_ = tf.placeholder(tf.float32, shape=[None, N*M*P, 3])
...
x_image = tf.reshape(x, [-1, N, M, P, 1])
to...
# Nearly all dimensions are dynamic
x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])
Since you intend to reshape the input to 5D anyway, why not use 5D in x_image right from the start? At this point the second dimension of label is arbitrary, but we promise TensorFlow that it will match x_image.
Dynamic shapes in deconvolution
Next, the nice thing about tf.nn.conv3d_transpose is that its output shape can be dynamic. So instead of this:
# Hard-coded output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=[1,32,32,7,1], ...)
... you can do this:
# Dynamic output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=tf.shape(x_image), ...)
This way the transpose convolution can be applied to any image and the result will take the shape of x_image that was actually passed in at runtime.
Note that the static shape of x_image is (?, ?, ?, ?, 1).
All-Convolutional network
The final and most important piece of the puzzle is to make the whole network convolutional, and that includes your final dense layer too. A dense layer must define its dimensions statically, which forces the whole network to fix the input image dimensions.
Luckily for us, Springenberg et al. describe a way to replace an FC layer with a CONV layer in the paper "Striving for Simplicity: The All Convolutional Net". I'm going to use a convolution with 3 1x1x1 filters (see also this question):
final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])
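If we ensure that final has the same spatial dimensions as DeConnv1 (and the others), y ends up with exactly the shape we want: one row of 3 class scores per voxel, i.e. [-1, 3], which lines up with a label of shape [-1, N * M * P, 3].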
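To illustrate why a 1x1x1 convolution does not pin down the image size, here is a small sketch in plain NumPy rather than TensorFlow. The function name `pointwise_conv3d` is hypothetical; it applies the same small weight matrix at every voxel, which is what a 1x1x1 convolution with stride 1 amounts to.

```python
import numpy as np


def pointwise_conv3d(x, w):
    """1x1x1 convolution sketch: x is (B, N, M, P, C_in), w is (C_in, C_out)."""
    # The same (C_in, C_out) matrix is applied independently at each voxel,
    # so the weights never depend on N, M or P.
    return np.einsum("bnmpc,ck->bnmpk", x, w)


rng = np.random.default_rng(0)
w = rng.normal(size=(1, 3))  # one input channel, three output classes

# The same weights handle inputs of different spatial sizes.
for shape in [(1, 32, 32, 7, 1), (2, 16, 16, 3, 1)]:
    x = rng.normal(size=shape)
    scores = pointwise_conv3d(x, w)
    y = scores.reshape(-1, 3)  # one row of class scores per voxel
    print(shape, "->", scores.shape, "->", y.shape)
```

A dense layer, by contrast, would need a weight matrix whose size is tied to N * M * P, which is why it forces a fixed input size.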
Combining it all together
Your network is pretty large, but all deconvolutions basically follow the same pattern, so I've simplified my proof-of-concept code to just one deconvolution. The goal is just to show what kind of network is able to handle images of arbitrary size. Final remark: image dimensions can vary between batches, but within one batch they have to be the same.
The full code:
import numpy as np
import tensorflow as tf
sess = tf.InteractiveSession()
def conv3d_dilation(tempX, tempFilter):
    return tf.layers.conv3d(tempX, filters=tempFilter, kernel_size=[3, 3, 1], strides=1, padding='SAME', dilation_rate=2)
def conv3d(tempX, tempW):
    return tf.nn.conv3d(tempX, tempW, strides=[1, 2, 2, 2, 1], padding='SAME')
def conv3d_s1(tempX, tempW):
    return tf.nn.conv3d(tempX, tempW, strides=[1, 1, 1, 1, 1], padding='SAME')
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)
def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
def max_pool_3x3(x):
    return tf.nn.max_pool3d(x, ksize=[1, 3, 3, 3, 1], strides=[1, 2, 2, 2, 1], padding='SAME')
x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])
W_conv1 = weight_variable([3, 3, 1, 1, 32])
h_conv1 = conv3d(x_image, W_conv1)
# second convolution
W_conv2 = weight_variable([3, 3, 4, 32, 64])
h_conv2 = conv3d_s1(h_conv1, W_conv2)
# third convolution path 1
W_conv3_A = weight_variable([1, 1, 1, 64, 64])
h_conv3_A = conv3d_s1(h_conv2, W_conv3_A)
# third convolution path 2
W_conv3_B = weight_variable([1, 1, 1, 64, 64])
h_conv3_B = conv3d_s1(h_conv2, W_conv3_B)
# fourth convolution path 1
W_conv4_A = weight_variable([3, 3, 1, 64, 96])
h_conv4_A = conv3d_s1(h_conv3_A, W_conv4_A)
# fourth convolution path 2
W_conv4_B = weight_variable([1, 7, 1, 64, 64])
h_conv4_B = conv3d_s1(h_conv3_B, W_conv4_B)
# fifth convolution path 2
W_conv5_B = weight_variable([1, 7, 1, 64, 64])
h_conv5_B = conv3d_s1(h_conv4_B, W_conv5_B)
# sixth convolution path 2
W_conv6_B = weight_variable([3, 3, 1, 64, 96])
h_conv6_B = conv3d_s1(h_conv5_B, W_conv6_B)
# concatenation
layer1 = tf.concat([h_conv4_A, h_conv6_B], 4)
w = tf.Variable(tf.constant(1., shape=[2, 2, 4, 1, 192]))
DeConnv1 = tf.nn.conv3d_transpose(layer1, filter=w, output_shape=tf.shape(x_image), strides=[1, 2, 2, 2, 1], padding='SAME')
final = DeConnv1
final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=y))
print('x_image:', x_image)
print('DeConnv1:', DeConnv1)
print('final_conv:', final_conv)
def try_image(N, M, P, B=1):
    batch_x = np.random.normal(size=[B, N, M, P, 1])
    batch_y = np.ones([B, N * M * P, 3]) / 3.0
    deconv_val, final_conv_val, loss = sess.run([DeConnv1, final_conv, cross_entropy],
                                                feed_dict={x_image: batch_x, label: batch_y})
    print(deconv_val.shape)
    # Print the evaluated array's shape, not the static shape of the graph tensor
    print(final_conv_val.shape)
    print(loss)
    print()
tf.global_variables_initializer().run()
try_image(32, 32, 7)
try_image(16, 16, 3)
try_image(16, 16, 3, 2)
Theoretically, it's possible. You need to set the image size of the input and label placeholders to None and let the graph infer the image size dynamically from the input data.
However, you have to be careful when you define the graph. Use tf.shape(x) instead of x.get_shape(): the former infers the shape dynamically, only when you call session.run, while the latter returns the shape known when the graph is defined. When the input size is set to None, the latter cannot give you the true shape (it probably just returns None for those dimensions).
To complicate things further, high-level functions such as tf.layers.conv2d or upconv2d sometimes do not accept tf.shape, apparently because they assume the shape information is available during graph construction.
I wish I had a better working example to illustrate the points above. I'll leave this answer as a placeholder and come back to add more if I get a chance.