I have two separately designed CNNs for two different features (image and text) of the same data, and the output has two classes.
In the very last layer:
For the image CNN (a ResNet), I would like to use "he_normal" as the initializer:
flatten1 = Flatten()(image_maxpool)
dense = Dense(output_dim=2, kernel_initializer="he_normal")(flatten1)
But for the text CNN, I would like to use the default "glorot_normal":
flatten2 = Flatten()(text_maxpool)
output = Dense(output_dim=2, kernel_initializer="glorot_normal")(flatten2)
The flatten1 and flatten2 layers have sizes:
flatten_1 (Flatten) (None, 512)
flatten_2 (Flatten) (None, 192)
Is there any way I can concatenate these two flatten layers and have a single long dense layer of size 512 + 192 = 704, where the first 512 weights and the last 192 weights use two separate kernel initializers, and produce a 2-class output?
Something like this:
merged_tensor = merge([flatten1, flatten2], mode='concat', concat_axis=1)
output = Dense(output_dim=2,
               kernel_initializer for [:512]='he_normal',
               kernel_initializer for [512:]='glorot_normal')(merged_tensor)
Edit: I think I have gotten this to work with the following code (thanks to @Aechlys):
import tensorflow as tf
from keras import initializers

def my_init(shape, shape1, shape2):
    x = initializers.he_normal()(shape1)
    y = initializers.glorot_normal()(shape2)
    return tf.concat([x, y], 0)
class_num = 2
flatten1 = Flatten()(image_maxpool)
flatten2 = Flatten()(text_maxpool)
merged_tensor = concatenate([flatten1, flatten2], axis=-1)
output = Dense(output_dim=class_num,
               kernel_initializer=lambda shape: my_init(shape,
                                                        shape1=(512, class_num),
                                                        shape2=(192, class_num)),
               activation='softmax')(merged_tensor)
I have to pass the shape sizes 512 and 192 manually, because I failed to get the sizes of flatten1 and flatten2 via the code flatten1.get_shape().as_list(), which gave me [None, None] although it should be [None, 512]. Other than that it works fine.
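(A possible workaround I have not verified: Keras also exposes the static shape via keras.backend.int_shape, which may recover the 512 even when get_shape() does not:)

from keras import backend as K

dim1 = K.int_shape(flatten1)[-1]  # 512, if Keras tracked the static shape
dim2 = K.int_shape(flatten2)[-1]  # 192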
Oh my, have I had fun with this one. You have to create your own kernel initializer:
import tensorflow as tf
import keras

def my_init(shape, dtype=None, *, shape1, shape2):
    x = keras.initializers.he_normal()(shape1, dtype=dtype)
    y = keras.initializers.glorot_normal()(shape2, dtype=dtype)
    return tf.concat([x, y], 0)
Then you call it via a lambda function within the Dense call:
Unfortunately, as you can see, I have not been able to deduce the shape programmatically yet. I may update this answer when I do. But if you know the shapes beforehand, you can pass them as constants:
DENSE_UNITS = 64

input_t = Input((1, 25))
input_i = Input((1, 35))
input_a = Concatenate(axis=-1)([input_t, input_i])
dense = Dense(DENSE_UNITS, kernel_initializer=lambda shape: my_init(shape,
                  shape1=(int(input_t.shape[-1]), DENSE_UNITS),
                  shape2=(int(input_i.shape[-1]), DENSE_UNITS)))(input_a)
tf.keras.Model(inputs=[input_t, input_i], outputs=dense)
Out: <tensorflow.python.keras._impl.keras.engine.training.Model at 0x19ff7baac88>
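A possible refinement (a sketch only, assuming a newer tf.keras; the SplitInitializer name is made up): wrap the logic in a custom Initializer subclass, so the total fan-in is read from the shape Keras passes in and only the split point has to be hard-coded:

import tensorflow as tf

class SplitInitializer(tf.keras.initializers.Initializer):
    # he_normal for the first `split` input rows of the kernel,
    # glorot_normal for the remaining rows
    def __init__(self, split):
        self.split = split

    def __call__(self, shape, dtype=None):
        fan_in, units = shape  # Dense kernel shape: (input_dim, units)
        he = tf.keras.initializers.he_normal()((self.split, units), dtype=dtype)
        gl = tf.keras.initializers.glorot_normal()((fan_in - self.split, units), dtype=dtype)
        return tf.concat([he, gl], axis=0)

# only the split point (512, the size of flatten1) is a constant;
# the total fan-in (704) comes from `shape`
output = Dense(2, kernel_initializer=SplitInitializer(512),
               activation='softmax')(merged_tensor)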
I am trying to build a binary temporal image classifier by combining ResNet18 and an LSTM. However, I have never really used RNNs before and have been struggling to get the correct output shape.
I am using a batch size of 128 and a sequence size of 32. The images are 80x80 grayscale images.
The current model is:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class CNNLSTM(nn.Module):
    def __init__(self):
        super(CNNLSTM, self).__init__()
        self.resnet = models.resnet18(pretrained=False)
        self.resnet.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3)
        self.resnet.fc = nn.Sequential(nn.Linear(in_features=512, out_features=256, bias=True))
        self.lstm = nn.LSTM(input_size=256, hidden_size=256, num_layers=3)
        self.fc1 = nn.Linear(256, 128)
        self.fc2 = nn.Linear(128, 1)

    def forward(self, x_3d):
        # x_3d: torch.Size([128, 32, 1, 80, 80])
        hidden = None
        toret = []
        for t in range(x_3d.size(1)):
            x = self.resnet(x_3d[:, t, :, :, :])
            out, hidden = self.lstm(x.unsqueeze(0), hidden)
            x = self.fc1(out[-1, :, :])
            x = F.relu(x)
            x = self.fc2(x)
            print("x shape: ", x.shape)
            toret.append(x)
        return torch.stack(toret)
This returns a tensor of shape torch.Size([32, 128, 1]), which, as I understand it, means that the nth row represents the nth time step for each element in the batch.
How can I get output of shape 128x1x32 instead?
And is there a better way to do this?
You could permute the dimensions:
a = torch.rand(32, 128, 1)
a = a.permute(1, 2, 0) # these are the indices of the original dimensions
print(a.shape)
>> torch.Size([128, 1, 32])
But you could also set batch_first=True in the LSTM module:
self.lstm = nn.LSTM(input_size=256, hidden_size=256, num_layers=3, batch_first=True)
This will expect that the input to the LSTM has the shape batch-size x seq-len x features and will output a tensor in the same way.
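If you are building the per-step outputs in a list, as in the question's forward() loop, a third option (a minimal sketch) is to stack along a new last dimension, which yields the 128 x 1 x 32 layout directly:

import torch

# toret holds 32 tensors of shape (128, 1), one per time step
toret = [torch.rand(128, 1) for _ in range(32)]

out = torch.stack(toret, dim=2)  # insert the new axis last instead of first
print(out.shape)                 # torch.Size([128, 1, 32])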
The Keras model is like this:
input_x = Input(shape=input_shape)
x = Conv2D(...)(input_x)
...
y_pred1 = Conv2D(...)(x)  # shape of (None, 80, 80, 2)
y_pred2 = Dense(...)(x)   # shape of (None, 4)
y_merged = Concatenate(...)([y_pred1, y_pred2])
model = Model(input_x, y_merged)
y_pred1 and y_pred2 are the results I want the model to learn to predict.
But the loss function fcn1 for the y_pred1 branch needs the y_pred2 predictions, so I have to concatenate the results of the two branches into y_merged so that fcn1 will have access to y_pred2.
The problem is, I want to use the Concatenate layer to concatenate the y_pred1 (None, 80, 80, 2) output with the y_pred2 (None, 4) output, but I don't know how to do that.
How can I reshape the (None, 4) to (None, 80, 80, 1)? For example, by filling the (None, 80, 80, 1) with the 4 elements of y_pred2 and zeros.
Is there any better solution than using the Concatenate layer?
Maybe this extracted piece of code could help you:
tf.print(condi_input.shape)
# shape is TensorShape([None, 1])
condi_i_casted = tf.expand_dims(condi_input, 2)
tf.print(condi_i_casted.shape)
# shape is TensorShape([None, 1, 1])
broadcasted_val = tf.broadcast_to(condi_i_casted, shape=tf.shape(decoder_outputs))
tf.print(broadcasted_val.shape)
# shape is TensorShape([None, 23, 256])
When you want to broadcast a value, first think about what exactly you want to broadcast. In this example, condi_input has shape (None, 1) and served as a condition for my encoder-decoder LSTM network. To match the dimensionality of the LSTM's encoder states, I first had to use tf.expand_dims() to expand the condition value from a shape like [[1]] to [[[1]]].
This is what you need to do first. If your prediction is a softmax from the dense layers, you might want to use tf.argmax() first, so you only have one value, which is much easier to broadcast. However, it is also possible with 4 values, but keep in mind that the dimensions need to match: you cannot broadcast shape (None, 4) to shape (None, 6), and to reach a larger multiple like (None, 8) you would need tf.tile() rather than broadcasting, since broadcasting only expands dimensions of size 1.
Then you can use tf.broadcast_to() to broadcast your value into the desired shape. Then you have two shapes you can concatenate together.
Hope this helps you out.
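Applied to the shapes in the question, a minimal sketch (with a made-up batch size of 8 standing in for None) could look like this:

import tensorflow as tf

y_pred1 = tf.zeros((8, 80, 80, 2))  # stand-in for the Conv2D branch
y_pred2 = tf.zeros((8, 4))          # stand-in for the Dense branch

# (None, 4) -> (None, 1, 1, 4) so the trailing dims line up
y_pred2_e = tf.expand_dims(tf.expand_dims(y_pred2, 1), 1)

# broadcast across the spatial dims, then concatenate on the channel axis
target_shape = tf.concat([tf.shape(y_pred1)[:-1], [4]], axis=0)
y_pred2_b = tf.broadcast_to(y_pred2_e, target_shape)  # (8, 80, 80, 4)
y_merged = tf.concat([y_pred1, y_pred2_b], axis=-1)   # (8, 80, 80, 6)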
Figured it out, the code is like this:
input_x = Input(shape=input_shape)
x = Conv2D(...)(input_x)
...
y_pred1 = Conv2D(...)(x)  # shape of (None, 80, 80, 2)
y_pred2 = Dense(4)(x)     # (None, 4)

# =========transform to concatenate:===========
y_pred2_matrix = Lambda(lambda x: K.expand_dims(K.expand_dims(x, -1)))(y_pred2)  # (None, 4, 1, 1)
y_pred2_matrix = ZeroPadding2D(padding=((0, 76), (0, 79)))(y_pred2_matrix)       # (None, 80, 80, 1)
y_merged = Concatenate(axis=-1)([y_pred1, y_pred2_matrix])                       # (None, 80, 80, 3)
The 4 elements of y_pred2 can then be indexed as y_merged[:, :4, 0, 2].
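With that layout, a hypothetical fcn1 can slice both predictions back out of the merged output (the loss body below is only a placeholder):

import tensorflow as tf

def fcn1(y_true, y_pred):
    pred1 = y_pred[..., :2]      # the Conv2D branch, (None, 80, 80, 2)
    pred2 = y_pred[:, :4, 0, 2]  # the 4 Dense values, (None, 4)
    # ... the real loss for pred1, which uses pred2, goes here ...
    return tf.reduce_mean(tf.square(y_true[..., :2] - pred1))  # placeholder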
I want to obtain the outputs of intermediate sub-model layers with tf2.keras. Here is a model composed of two sub-models:
import numpy as np
import tensorflow as tf

input_shape = (100, 100, 3)

def model1():
    input = tf.keras.layers.Input(input_shape)
    cov = tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=1, name='cov1')(input)
    embedding_model = tf.keras.Model(input, cov, name='model1')
    return embedding_model

def model2(embedding_model):
    input_sequence = tf.keras.layers.Input((None,) + input_shape)
    sequence_embedding = tf.keras.layers.TimeDistributed(embedding_model, name='time_dis1')
    emb = sequence_embedding(input_sequence)
    att = tf.keras.layers.Attention()([emb, emb])
    dense1 = tf.keras.layers.Dense(64, name='dense1')(att)
    outputs = tf.keras.layers.Softmax()(dense1)
    final_model = tf.keras.Model(inputs=input_sequence, outputs=outputs, name='model2')
    return final_model

embedding_model = model1()
model2 = model2(embedding_model)
print(model2.summary())
Output:
Model: "model2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape                   Param #     Connected to
==================================================================================================
input_2 (InputLayer)            [(None, None, 100, 100, 3)]    0
__________________________________________________________________________________________________
time_dis1 (TimeDistributed)     (None, None, 98, 98, 32)       896         input_2[0][0]
__________________________________________________________________________________________________
attention (Attention)           (None, None, 98, 98, 32)       0           time_dis1[0][0]
                                                                           time_dis1[0][0]
__________________________________________________________________________________________________
dense1 (Dense)                  (None, None, 98, 98, 64)       2112        attention[0][0]
__________________________________________________________________________________________________
softmax (Softmax)               (None, None, 98, 98, 64)       0           dense1[0][0]
==================================================================================================
Total params: 3,008
Trainable params: 3,008
Non-trainable params: 0
And then I want to get the outputs of intermediate layers of model1 and model2:
model1_output_layer = model2.get_layer('time_dis1').layer.get_layer('cov1')
output1 = model1_output_layer.get_output_at(0)
output2 = model2.get_layer('dense1').get_output_at(0)
output_tensors = [output1, output2]

model2_input = model2.input
submodel = tf.keras.Model([model2_input], output_tensors)
input_data2 = np.zeros((1, 10, 100, 100, 3))
result = submodel.predict([input_data2])
print(result)
Running on tf2.3, the error I get is:
File "/Users/bouluoyu/anaconda/envs/tf2/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 115, in __init__
self._init_graph_network(inputs, outputs)
File "/Users/bouluoyu/anaconda/envs/tf2/lib/python3.6/site-packages/tensorflow/python/training/tracking/base.py", line 457, in _method_wrapper
result = method(self, *args, **kwargs)
File "/Users/bouluoyu/anaconda/envs/tf2/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 191, in _init_graph_network
self.inputs, self.outputs)
File "/Users/bouluoyu/anaconda/envs/tf2/lib/python3.6/site-packages/tensorflow/python/keras/engine/functional.py", line 931, in _map_graph_network
str(layers_with_complete_input))
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 100, 100, 3), dtype=float32) at layer "cov1". The following previous layers were accessed without issue: ['time_dis1', 'attention', 'dense1']
But the following code works:
model1_input = embedding_model.input
model2_input = model2.input
submodel = tf.keras.Model([model1_input, model2_input], output_tensors)
input_data1 = np.zeros((1, 100, 100, 3))
input_data2 = np.zeros((1, 10, 100, 100, 3))
result = submodel.predict([input_data1, input_data2])
print(result)
But that is not what I want. This is strange: model1 is part of model2, so why do we need to feed an extra tensor? Sometimes it is hard to get such an extra tensor, especially for complex models.
> so why do we need to input an extra tensor
The short answer is, TensorFlow doesn't know to make the connection between the inputs that you expect it to make. The problem arises because you're passing a Model (instead of a Layer) to your TimeDistributed layer. This leaves the Input layer of your model1 hanging, unless you explicitly pass it an input. The TimeDistributed layer is not smart enough to handle models in this way.
My solution would depend on the answer to the following question:
Why do you need model1? All it has is a Conv2D layer. You can easily do
sequence_embedding = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=1, name='cov1'),
    name='time_dis1'
)
If you do this, now you gotta change the following lines,
model1_output_layer = model2.get_layer('time_dis1').layer.get_layer('cov1')
output1 = model1_output_layer.get_output_at(0)
to something like (the exact output you want will depend on what you're actually after)
model1_output_layer = model2.get_layer('time_dis1')
output1 = model1_output_layer.output
# This output1 may need further processing depending on what you need,
# e.g. if you need mean embeddings over the time axis:
output_mean = tf.keras.layers.Lambda(lambda x: tf.reduce_mean(x, axis=1))(output1)
This is because you can't access the output of a layer nested inside a TimeDistributed layer. The layer passed to TimeDistributed doesn't actually do anything by itself, and it doesn't have a defined output; it just sits there as a template the TimeDistributed layer uses to compute its output. So, to get the output from a TimeDistributed layer, you need to access it via that layer.
If you try to do it as you have it (instead of my way), you'll get:
AttributeError: Layer cov1 has no inbound nodes.
You may ask, "why did it work before"?
It's because, before, you had a Model there instead of a Layer. Because the Conv2D layer was wrapped by that model, its output was defined (the model had its own Input layer). And this ties back to why it complained about the missing Input from model1 when you tried to define the submodel.
I know this explanation may make your head spin as the reasons behind this error are quite convoluted. But going through it a few times will hopefully help.
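For completeness, here is a minimal sketch of the whole fix with the same shapes as the question (a sketch, not a drop-in replacement for your full model):

import numpy as np
import tensorflow as tf

input_shape = (100, 100, 3)
input_sequence = tf.keras.layers.Input((None,) + input_shape)
sequence_embedding = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=1, name='cov1'),
    name='time_dis1')
emb = sequence_embedding(input_sequence)
att = tf.keras.layers.Attention()([emb, emb])
dense1 = tf.keras.layers.Dense(64, name='dense1')(att)
outputs = tf.keras.layers.Softmax()(dense1)
model2 = tf.keras.Model(inputs=input_sequence, outputs=outputs, name='model2')

# both intermediate outputs are now reachable from the single input
output1 = model2.get_layer('time_dis1').output
output2 = model2.get_layer('dense1').output
submodel = tf.keras.Model(model2.input, [output1, output2])
result = submodel.predict(np.zeros((1, 10, 100, 100, 3)))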
I'm not sure if this is possible or not in Keras, but I would like to know whether there is any way to concatenate specific values from LSTM layers.
I would like to encode the whole sequence using an LSTM, but use only specific time steps for prediction.
For example:
I'm using two input sequences of length 5 (so my input shapes are (None, 5) and (None, 5)).
Then I embed the sequences, and the shapes become (None, 5, 300) and (None, 5, 300).
Then I encode each sequence with an LSTM layer of 200 units, and the final shapes are (None, 5, 200) and (None, 5, 200).
Now I would like to concatenate not the whole sequences, but only the last 4 words encoded in lstm_1 and the first word in lstm_2.
> input_1 = Input(shape=(5, ))
> emb_1 = Embedding(..., 300, ...)(input_1)
> lstm_1 = CuDNNLSTM(200, ...)(emb_1)
>
> input_2 = Input(shape=(5, ))
> emb_2 = Embedding(..., 300, ...)(input_2)
> lstm_2 = CuDNNLSTM(200, ...)(emb_2)
>
> # here is the problem
> emb = concatenate([lstm_1[??], lstm_2[??]])
>
> d1 = Dense(...)(emb)
> out = Dense(..., activation="softmax")(d1)
Not sure if I'm making any sense, but I would like to know if it's possible using the Keras functional API.
Best regards,
Daniel
So I didn't realize that lstm_1 and lstm_2 are tensors that I can slice. The solution is simple: you just need to wrap the slicing in a Lambda layer.
lstm_1_lambda = Lambda(lambda x: x[:, -4:, :])(lstm_1)
lstm_2_lambda = Lambda(lambda x: x[:, :1, :])(lstm_2)
emb = concatenate([lstm_1_lambda, lstm_2_lambda])
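For reference, a minimal end-to-end sketch (the vocabulary size of 10000 is made up, and plain LSTM stands in for CuDNNLSTM); note that return_sequences=True is required so the LSTM outputs all 5 time steps:

from tensorflow.keras.layers import Input, Embedding, LSTM, Lambda, Dense, concatenate
from tensorflow.keras.models import Model

input_1 = Input(shape=(5,))
emb_1 = Embedding(10000, 300)(input_1)
lstm_1 = LSTM(200, return_sequences=True)(emb_1)   # (None, 5, 200)

input_2 = Input(shape=(5,))
emb_2 = Embedding(10000, 300)(input_2)
lstm_2 = LSTM(200, return_sequences=True)(emb_2)   # (None, 5, 200)

lstm_1_lambda = Lambda(lambda x: x[:, -4:, :])(lstm_1)     # last 4 steps
lstm_2_lambda = Lambda(lambda x: x[:, :1, :])(lstm_2)      # first step
emb = concatenate([lstm_1_lambda, lstm_2_lambda], axis=1)  # (None, 5, 200)

d1 = Dense(64)(emb)
out = Dense(2, activation='softmax')(d1)
model = Model([input_1, input_2], out)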
Best regards,
Daniel
I want to try to implement the neural network architecture of the attached image: 1DCNN_model
Consider that I've got a dataset X which is (N_signals, 1500, 40), where 40 is the number of features I want to do the 1D convolution on.
My Y is (N_signals, 1500, 2) and I'm working with Keras.
Every 1D convolution needs to take one feature vector, as in this picture: 1DCNN_convolution
So it has to take one chunk of the 1500 time samples, pass it through the 1D convolutional layer (sliding along the time axis), then feed all the output features to the LSTM layer.
I tried to implement the first convolutional part with this code, but I'm not sure what it's doing; I can't understand how it can take in one chunk at a time (maybe I need to preprocess my input data first?):
input_shape = (None, 40)
model_input = Input(input_shape, name='input')
layer = model_input
convs = []
for i in range(n_chunks):
    conv = Conv1D(filters=40,
                  kernel_size=10,
                  padding='valid',
                  activation='relu')(layer)
    conv = BatchNormalization(axis=2)(conv)
    pool = MaxPooling1D(40)(conv)
    pool = Dropout(0.3)(pool)
    convs.append(pool)
out = Merge(mode='concat')(convs)
conv_model = Model(input=layer, output=out)
Any advice? Thank you very much.
Thank you very much, I modified my code in this way:
input_shape = (1500, 40)
model_input = Input(shape=input_shape, name='input')
layer = model_input
layer = Conv1D(filters=40,
               kernel_size=10,
               padding='valid',
               activation='relu')(layer)
layer = BatchNormalization(axis=2)(layer)
layer = MaxPooling1D(pool_size=40,
                     padding='same')(layer)
layer = Dropout(self.params.drop_rate)(layer)
layer = LSTM(40, return_sequences=True,
             activation=self.params.lstm_activation)(layer)
layer = Dropout(self.params.lstm_dropout)(layer)
layer = Dense(40, activation='relu')(layer)
layer = BatchNormalization(axis=2)(layer)
model_output = TimeDistributed(Dense(2,
                                     activation='sigmoid'))(layer)
I was actually thinking that maybe I have to permute my axes in order to make the max-pooling layer work on my 40-mel-feature axis...
If you want to perform an individual 1D convolution over the 40 feature channels, you should add a dimension to your input:
(1500, 40, 1)
If you perform a 1D convolution on an input with shape
(1500, 40)
the filters are applied along the time dimension, and the pictures you posted indicate that this is not what you want to do.
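A minimal sketch of that change, following on from the code above (the kernel_size=(10, 1) is my assumption, mirroring the kernel_size=10 from the question): with the extra channel dimension, a Conv2D kernel of height 10 and width 1 slides along time within each of the 40 feature columns independently:

from tensorflow.keras.layers import Input, Reshape, Conv2D

model_input = Input(shape=(1500, 40))
layer = Reshape((1500, 40, 1))(model_input)  # add the channel dimension
layer = Conv2D(filters=40,
               kernel_size=(10, 1),          # slide along the time axis only
               padding='valid',
               activation='relu')(layer)     # -> (None, 1491, 40, 40)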