I would like to create a simple Keras neural network that accepts an input matrix of dimension (rows, columns) = (n, m), flattens the matrix to a dimension (n*m, 1), sends the flattened matrix through a number of arbitrary layers, and in the final layer, once more unflattens the matrix to a dimension of (n, m) before releasing this final matrix as an output.
The issue I'm having is that I haven't found any documentation for an Unflatten layer on the keras.io page, and I'm wondering whether there is a reason such a seemingly standard, commonly used layer doesn't exist. Is there a more natural and easy way to do what I'm proposing?
You can use the Reshape layer for this purpose. It takes the desired output shape as its argument and reshapes the input tensor to that shape. For example:
from keras.layers import Reshape
rsh_inp = Reshape((n*m, 1))(inp) # if you don't want the last axis with dimension 1, you can also use Flatten layer
# rsh_inp goes through a number of arbitrary layers ...
# reshape back the output
out = Reshape((n,m))(out_rsh_inp)
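For completeness, here is a minimal end-to-end sketch with assumed values for n and m, and a couple of Dense layers standing in for the "arbitrary layers" in between:

from keras.layers import Input, Dense, Reshape
from keras.models import Model

n, m = 4, 5  # example dimensions (assumed)

inp = Input(shape=(n, m))
x = Reshape((n * m, 1))(inp)          # (n, m) -> (n*m, 1)
x = Dense(32, activation='relu')(x)   # stand-in intermediate layer
x = Dense(1)(x)                       # back to one value per position: (n*m, 1)
out = Reshape((n, m))(x)              # (n*m, 1) -> (n, m)

model = Model(inputs=inp, outputs=out)
model.summary()  # final output shape: (None, 4, 5)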
I have a recurrent neural network model that maps a (N,) sequence to a (N,3) sequence. My target outputs are actually (N,N) matrices. However, I have a deterministic function implemented in numpy that converts (N,3) into these (N,N) matrices in a particular way that I want. How can I use this operation in training? I.e. currently my neural network outputs (N,3) sequences; how do I apply my function to convert them to (N,N) before calling keras.fit?
Edit: I should also note that it is much harder to do the reverse function from (N,N) to (N,3), so it's not a viable option to just convert my target outputs to the (N,3) representation.
You can use a Lambda layer as the last layer of your model:
def convert_to_n_times_n(x):
    # transform x from shape (N, 3) to (N, N) here
    ...

transformation_layer = tf.keras.layers.Lambda(convert_to_n_times_n)
You probably want to use "tf-native methods" within your function as much as possible to avoid unnecessary conversions of tensors to numpy arrays and back.
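For instance, here is a small sketch of what such a Lambda could look like; the Gram-matrix product is only a hypothetical stand-in for your own (N,3) -> (N,N) conversion, the point being that everything stays in tf ops:

import tensorflow as tf

def convert_to_n_times_n(x):
    # x has shape (batch, N, 3); the matmul below is just a placeholder
    # that produces a (batch, N, N) tensor -- replace it with your own
    # tf-native conversion logic
    return tf.matmul(x, x, transpose_b=True)

transformation_layer = tf.keras.layers.Lambda(convert_to_n_times_n)

# quick shape check
dummy = tf.random.normal((2, 10, 3))
print(transformation_layer(dummy).shape)  # (2, 10, 10)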
If you only want to use the layer during training, but not during inference, you can achieve that using the functional API:
# create your original model (N,) -> (N, 3)
input_ = Input(shape=(N,))
x = SomeFancyLayer(...)(input_)
x = ...
...
inference_output = OtherFancyLayer(...)(x)
inference_model = Model(inputs=input_, outputs=inference_output)
# create & fit the training model
training_output = transformation_layer(inference_output)
training_model = Model(inputs=input_, outputs=training_output)
training_model.compile(...)
training_model.fit(X, Y)
# run inference using your original model
inference_model.predict(...)
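Note that inference_model and training_model share the same underlying layers, so fitting training_model also updates the weights that inference_model uses; only the final transformation_layer (which has no trainable weights) is left out at inference time.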
I am having trouble figuring out what the dimensions of each CNN layer are.
Let's say my input is a vector, which I then project onto a 4x4x256 matrix using a fully-connected layer, like so...
zP = slim.fully_connected(
    z,
    4*4*256,
    normalizer_fn=slim.batch_norm,
    activation_fn=tf.nn.relu,
    scope='g_project',
    weights_initializer=initializer
)
# Layer is reshaped to a 4x4x256 mapping.
zCon = tf.reshape(zP,[-1,4,4,256])
Where z was my original vector. I then take this 4x4x256 matrix and feed it into a CNN...
gen1 = slim.convolution2d_transpose(
    zCon,
    num_outputs=64,
    kernel_size=[5,5],
    stride=[2,2],
    padding="SAME",
    normalizer_fn=slim.batch_norm,
    activation_fn=tf.nn.relu,
    scope='g_conv1',
    weights_initializer=initializer
)
As you can see, I used a 2D transposed convolution and specified the number of outputs as 64, with a stride of 2 and a filter size of 5. This means I know one of my dimensions will be 64, but I do not know what the other two dimensions will be or how to calculate them.
I tried using the following formula but it is not working out for me...
How can I calculate the remaining dimensions?
The formula you have written is for the convolution operation, Out = (W - F + 2P)/S + 1. A transposed convolution inverts the shape relationship of a convolution, so the formula you need can be derived by rearranging the terms of that equation:
W = (Out-1)*S + F - 2P
Here W is your actual output and Out is your actual input to the transposed convolution.
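A quick way to double-check is to build the layer and inspect the output shape directly; below is a small sketch using tf.keras as a stand-in for the slim call (same kernel size, stride, and padding):

import tensorflow as tf

x = tf.zeros((1, 4, 4, 256))  # the reshaped 4x4x256 input, batch of 1
deconv = tf.keras.layers.Conv2DTranspose(
    filters=64, kernel_size=5, strides=2, padding='same')
print(deconv(x).shape)  # (1, 8, 8, 64)

With padding="SAME" and a stride of 2, TensorFlow scales the spatial size by the stride, so the remaining two dimensions are 8 and 8 and the full output of g_conv1 is 8x8x64.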
Given 2 3D tensors t1 = [?, 1, 1, 1, 2048] and t2 = [?, 3, 1, 1, 256] seen in the image, how would these be concatenated? Currently, I am using:
tf.concat([t1, t2], 4)
However, given that my architecture has a large number of layers with many concatenations, I eventually end up with a tensor that is too large (in terms of channels/features) to initialize. Is this the correct way to implement a concatenation layer?
First of all, the shapes of the tensors in the inception layer are not what you describe. 1x1, 1x3 and 3x1 are the shapes of the filters applied to the image. There are two more parameters in a convolution, padding and stride, and depending on their exact values the resulting shape can be very different.
In this particular case the spatial shape doesn't change; only the channel dimensions differ (2048 and 256), which is why the branches can be concatenated. Concatenating your original t1 and t2 would result in an error.
Is this the correct way to implement a concatenation layer?
Yes, feature map concatenation is one of the key ideas of the inception network, and its implementation indeed uses tf.concat (e.g. see the inception v1 source code).
Note that this tensor will grow in one direction (channels/features) but shrink in the spatial dimensions because of downsampling, so it won't get too large. Also note that this tensor is the transformed input data (the image), hence, unlike the weights, it is not initialized but rather flows through the network. The weights will be the tensors 1x1x2048=2048, 1x3x224=672, 3x1x256=768, etc. As you can see, they are not very big at all, and that's another idea behind the inception network.
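To illustrate, here is a small sketch with made-up branch shapes whose spatial dimensions already match, so only the channel counts differ:

import tensorflow as tf

# two branch outputs with the same spatial shape, differing only in channels
t1 = tf.zeros((8, 7, 7, 2048))
t2 = tf.zeros((8, 7, 7, 256))

merged = tf.concat([t1, t2], axis=-1)  # concatenate along the channel axis
print(merged.shape)  # (8, 7, 7, 2304)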
According to the keras documentation (https://keras.io/layers/convolutional/), the shape of a Conv1D output tensor is (batch_size, new_steps, filters), while the input tensor shape is (batch_size, steps, input_dim). I don't understand how this can be, since it implies that if you pass a 1D input of length 8000 where batch_size = 1 and steps = 1 (I've heard steps means the number of channels in your input), then this layer would have an output of shape (1, 1, X), where X is the number of filters in the Conv layer. But what happens to the input dimension? Since the X filters in the layer are applied to the entire input dimension, shouldn't one of the output dimensions be 8000 (or less, depending on padding), something like (1, 1, 8000, X)? I checked, and Conv2D layers behave in a way that makes more sense: their output_shape is (samples, filters, new_rows, new_cols), where new_rows and new_cols are the dimensions of the input image, again adjusted based on padding. If Conv2D layers preserve their input dimensions, why don't Conv1D layers? Is there something I'm missing here?
Background Info:
I'm trying to visualize the 1D convolutional layer activations of my CNN, but most tools I've found online seem to only work for 2D convolutional layers, so I've decided to write my own code for it. I've got a pretty good understanding of how it works; here is the code I've got so far:
# all the model's activation layer output tensors
activation_output_tensors = [layer.output for layer in model.layers if type(layer) is keras.layers.Activation]
# make a function that computes activation layer outputs
activation_comp_function = K.function([model.input, K.learning_phase()], activation_output_tensors)
# 0 means learning phase = False (i.e. the model isn't learning right now)
activation_arrays = activation_comp_function([training_data[0,:-1], 0])
This code is based off of julienr's first comment in this thread, with some modifications for the current version of keras. Sure enough, when I use it, all the activation arrays have shape (1,1,X)... I spent all day yesterday trying to figure out why, but no luck. Any help is greatly appreciated.
UPDATE: Turns out I mistook the meaning of input_dimension for the steps dimension. This is mostly because the architecture I used came from another group that built their model in Mathematica, and in Mathematica an input shape of (X,Y) to a Conv1D layer means X "channels" (or an input_dimension of X) and Y steps. A thank you to gionni for helping me realize this and for explaining so well how the "input_dimension" becomes the "filter" dimension.
I used to have the same problem with 2D convolutions. The thing is that when you apply a convolutional layer the kernel you are applying is not of size (kernel_size, 1) but actually (kernel_size, input_dim).
If you think about it, if it weren't this way, a 1D convolutional layer with kernel_size = 1 would do nothing to the inputs it received.
Instead, it computes a weighted average of the input features at each time step, using the same weights for every time step (although each filter uses a different set of weights). I think it helps to visualize input_dim as the number of channels in a 2D convolution of an image, where the same reasoning applies (in that case it's the channels that "get lost" and transformed into the number of filters).
To convince yourself of this, you can reproduce the 1D convolution with a 2D convolution layer using kernel_size=(1D_kernel_size, input_dim) and the same number of filters. Here's an example:
from keras.layers import Conv1D, Conv2D
import keras.backend as K
import numpy as np
# create an input with 4 steps and 5 channels/input_dim
channels = 5
steps = 4
filters = 3
val = np.array([list(range(i * channels, (i + 1) * channels)) for i in range(1, steps + 1)])
val = np.expand_dims(val, axis=0)
x = K.variable(value=val)
# 1D convolution. Initialize the kernels to ones so that it's easier to compute the result by hand
conv1d = Conv1D(filters=filters, kernel_size=1, kernel_initializer='ones')(x)
# 2D convolution that replicates the 1D one
# need to add a dimension to the input since conv2d expects 4D inputs; I add it as the last axis (axis=3) since my keras is set up with `channels_last`
val1 = np.expand_dims(val, axis=3)
x1 = K.variable(value=val1)
conv2d = Conv2D(filters=filters, kernel_size=(1, 5), kernel_initializer='ones')(x1)
# evaluate and print the outputs
print(K.eval(conv1d))
print(K.eval(conv2d))
As I said, it took me a while to understand this too, I think mostly because no tutorial explains it clearly.
Thanks, it's very useful.
Here is the same code adapted to a recent version of tensorflow + keras, stacking on axis 0 to build the 4D input.
# %%
from tensorflow.keras.layers import Conv1D, Conv2D
from tensorflow.keras.backend import eval
import tensorflow as tf
import numpy as np
# %%
# create a 3D input with format BLC (Batch, Layer, Channel)
batch = 10
layers = 3
channels = 5
kernel = 2
val3D = np.random.randint(0, 100, size=(batch, layers, channels))
x = tf.Variable(val3D.astype('float32'))
# %%
# 1D convolution. Initialize the kernels to ones so that it's easier to compute the result by hand / compare
conv1d = Conv1D(filters=layers, kernel_size=kernel, kernel_initializer='ones')(x)
# %%
# 2D convolution that replicates the 1D one
# need to add a dimension to the input since conv2d expects 4D inputs
# here 3 copies of the same array are stacked along a new leading axis (axis 0)
val4D = np.stack([val3D,val3D,val3D], axis=0)
x1 = tf.Variable(val4D.astype('float32'))
# %%
# 2D convolution. Set the first kernel dimension to 1 so that it replicates the Conv1D
conv2d = Conv2D(filters=layers, kernel_size=(1, kernel), kernel_initializer='ones')(x1)
# %%
# evaluate and print the outputs
print(eval(conv1d))
print('---------------------------------------------')
# display only one of the stacked
print(eval(conv2d)[0])
I have two massive numpy arrays of weights and biases for a CNN. I can set weights for each layer (using set_weights) but I don't see a way to set the bias for each layer. How do I do this?
You do this by using layer.set_weights(weights). From the documentation:
weights: a list of Numpy arrays. The number of arrays and their shape must match the number of the dimensions of the weights of the layer (i.e. it should match the output of `get_weights`).
You don't just put the filter weights in there, but values for every parameter the layer has. The order in which you have to put the weights depends on layer.weights. You may look at the code, or print the names of the layer's weights with something like
print([p.name for p in layer.weights])
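For example, here is a minimal sketch with a Dense layer and made-up kernel/bias arrays (for a conv layer the idea is the same, just with the conv kernel shape instead):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(4, input_shape=(3,))])
layer = model.layers[0]
print([p.name for p in layer.weights])  # kernel first, then bias

new_kernel = np.ones((3, 4))   # shape (input_dim, units)
new_bias = np.full((4,), 0.5)  # shape (units,)

# kernel and bias are set together, in the same order as layer.weights
layer.set_weights([new_kernel, new_bias])
print(layer.get_weights()[1])  # the bias we just set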