I am trying to implement a siamese network from Sergey Zagoruyko using Tensorflow
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zagoruyko_Learning_to_Compare_2015_CVPR_paper.pdf
I don't know how to concatenate the 2 input layers and feed them to a top network (fully connected layer + ReLU + fully connected layer).
This may not be what you are looking for, but I recommend trying Keras. It is a flexible, high-level framework built on tensorflow that makes it extremely easy to accomplish what you are attempting. This would be how you could do it in Keras (with 32 inputs and 32 neurons in your FC layers).
from keras.models import Sequential
from keras.layers import Dense, Activation
model = Sequential()
# Declare the 32 inputs on the first layer instead of adding a separate Input.
model.add(Dense(32, input_shape=(32,)))
model.add(Activation("relu"))
model.add(Dense(32))
Alternatively, using just tensorflow you could use this strategy.
import tensorflow as tf
# TF 1.x API: dtype comes first, then shape (here a batch of 32-element inputs).
x = tf.placeholder(tf.float32, shape=[None, 32])
y = tf.layers.dense(x, 32)
y = tf.nn.relu(y)
y = tf.layers.dense(y, 32)
But I personally think Keras is more elegant, and it adds many useful features, such as model.output and model.input. In fact, Keras has recently been built into TensorFlow's contrib module as tf.contrib.keras. Hope that helps!
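Since the question was about joining two input branches, here is a minimal sketch of that step with the Keras functional API. The 64x64 grayscale patch size, the tiny one-convolution branch, and all layer widths are assumptions chosen for illustration, not values from the paper. Calling the same `branch` model on both inputs is what makes the two towers share weights.

```python
import numpy as np
from tensorflow.keras import Model, layers

# Assumed patch size; the paper's actual input sizes are not used here.
patch_shape = (64, 64, 1)

# Shared branch: calling the same model on both inputs reuses its weights.
branch_in = layers.Input(shape=patch_shape)
h = layers.Conv2D(8, (3, 3), activation="relu")(branch_in)
h = layers.MaxPooling2D((4, 4))(h)
h = layers.Flatten()(h)
branch = Model(branch_in, h, name="branch")

# Two inputs, one per patch of the pair.
left = layers.Input(shape=patch_shape, name="left")
right = layers.Input(shape=patch_shape, name="right")

# Concatenate the two branch outputs, then apply the top network:
# fully connected -> ReLU -> fully connected.
merged = layers.Concatenate()([branch(left), branch(right)])
top = layers.Dense(32, activation="relu")(merged)
score = layers.Dense(1)(top)

siamese = Model([left, right], score)

# Forward pass on a dummy batch of 4 patch pairs.
pairs = [np.zeros((4,) + patch_shape, dtype="float32")] * 2
out = siamese.predict(pairs, verbose=0)
print(out.shape)  # (4, 1)
```

If the two branches should not share weights, build two separate branch models instead of reusing one.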
Related
I have found a weird behavior in tensorflow.keras that doesn't occur in classic Keras.
I have these shapes in my dataset.
import numpy as np
x_train = np.random.rand(60,3,1)
y_train = np.random.rand(60,1)
And this LSTM network
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras import Sequential
model = Sequential()
model.add(LSTM(120,input_shape=(3,1)))
model.add(Dense(2,activation="relu"))
model.compile(loss="MSE",optimizer="adam")
model.fit(x_train,y_train,epochs=1)
model.summary()
This shouldn't work, because the output of the network has shape (None, 2) while y_train has shape (60, 1). But it starts training.
But using the classic keras it fails, as I expected.
from keras.layers import Dense, LSTM
from keras import Sequential
model = Sequential()
model.add(LSTM(120,input_shape=(3,1)))
model.add(Dense(2,activation="relu"))
model.compile(loss="MSE",optimizer="adam")
model.fit(x_train,y_train,epochs=1)
model.summary()
I'm using Google Colab with these versions:
Tensorflow: 2.2.0
Keras: 2.3.1
What could be causing this? Is this a bug or a new feature?
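One possible explanation, not verified against these exact versions, is that the tf.keras loss simply broadcasts the (60, 1) targets against the (60, 2) predictions, while standalone Keras checks the shapes before training. The NumPy sketch below only illustrates the broadcasting arithmetic; `y_pred` is a made-up stand-in for the network output.

```python
import numpy as np

y_true = np.random.rand(60, 1)   # same shape as y_train
y_pred = np.random.rand(60, 2)   # stand-in for the Dense(2) output

# Elementwise subtraction broadcasts y_true across both output columns,
# so no shape error is raised.
diff = y_pred - y_true
print(diff.shape)                # (60, 2)

# Mean squared error over the last axis, one value per sample.
per_sample_mse = np.mean(diff ** 2, axis=-1)
print(per_sample_mse.shape)      # (60,)

# Equivalent to comparing each output column with the single target column.
explicit = np.mean((y_pred - np.repeat(y_true, 2, axis=1)) ** 2, axis=-1)
print(np.allclose(per_sample_mse, explicit))  # True
```

If that is what happens, both outputs are being trained toward the same target value, which is probably not what was intended.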
I want to make a video event trigger which outputs 1 on detecting a change in event and 0 for other frames. I am using a CNN followed by an LSTM to do that. Both the CNN and LSTM are in the same model and have to be trained together. Each feature frame is encoded by CNN and this output is provided to the LSTM.
The problem is that, in order to process online videos, I need an online LSTM which is stateful and operates on batch size 1. Even after doing so, Keras gives me the error:
ValueError: If a RNN is stateful, it needs to know its batch size. Specify the batch size of your input tensors:
- If using a Sequential model, specify the batch size by passing a batch_input_shape argument to your first layer.
- If using the functional API, specify the batch size by passing a batch_shape argument to your Input layer.
I need to know where I am going wrong and what can be the solution.
I have tried including the batch size in all the layers, starting from the Input layer in the CNN and even in the LSTM.
The model compiles if I don't use stateful='True' in the LSTM and don't mention the batch size.
I am attaching the complete code.
import time
import matplotlib.pyplot as plt
import numpy as np
from keras.constraints import maxnorm
from keras.models import Sequential,Model
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers import LSTM,Input,TimeDistributed
from keras.layers import Activation
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.normalization import BatchNormalization
from keras.utils import np_utils
'''
DAP is triggered whenever it receives a frame whose optical flow is
very different from the previous optical flow. It uses the optical flow
to determine whether a new event has started to occur in the video frame
or not.
The CNN is responsible for extracting the optical flow. The RNN/LSTM is
used to trigger the event change by tracking the sequence of online
optical flow frames, fed from the CNN at every timestep.
NOTE: Keras does not support the summary of a TimeDistributed model without
specifying the input/output dimensions. Hence this coding approach had to be
used, specifying layer names and segregating the input and output
layers. First we define the model. Then we separately define the CNN.
Then we add the CNN to the model, and then add the LSTM. This is the only
approach that incorporates online video frames (which doesn't require
specifying the frame number or the third dimension of the LSTM).
An alternate, and perhaps easier to read, approach is to wrap each layer
in the CNN model in a TimeDistributed layer when adding it to the main
model.
INPUT: an 80x80 RGB image
OUTPUT: a sequence of 0 and 1s .
1 implies an event trigger. 0 implies continuation of the current event
'''
model=Sequential()
#--------------CNN----------------------------------------
x_input = Input(shape=(80,80,3))
c1=Conv2D(32, (3, 3), padding='same' )(x_input)
a1=Activation('relu')(c1)
c2=Conv2D(32,(3, 3))(a1)
a2=Activation('relu')(c2)
p1=MaxPooling2D(pool_size=(2, 2))(a2)
d1=Dropout(0.25)(p1)
c2=Conv2D(64, (3, 3), padding='same')(d1)
a3=Activation('relu')(c2)
c3=Conv2D(64, (3,3))(a3)
a4=Activation('relu')(c3)
p2=MaxPooling2D(pool_size=(2, 2))(a4)
d2=Dropout(0.25)(p2)
#FCN-----------------------------------------------------
f=Flatten()(d2)
d=Dense(512)(f)
a5=Activation('relu')(d)
x_output=Dropout(0.5)(a5)
#model.add(Dense(2))
#model.add(Activation('softmax'))
#---------------------------------------------------------
cnn = Model(x_input,x_output)
print("CNN SUMMARY")
cnn.summary()
#-----------------LSTM-------------------------------------
model.add(TimeDistributed(cnn,input_shape=(1,80,80,3)))
#model.add(cnn)
#batch size=1,
model.add(LSTM(128,batch_input_shape=
(1,512),stateful='True',activation='relu'))
model.add(Dense(1,activation='sigmoid'))
#----------------------------------------------------------
model.compile(loss='categorical_crossentropy', optimizer='adam')
print("DAP summary")
model.summary()
return model
Also, I need to know if this is the correct approach to coding an online LSTM.
The overall model should input 1 frame of the video and output a 0 or 1 for that frame, for every frame of the video. (one to one)
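The error message suggests giving the batch size to the first layer (or to the Input layer) rather than to the LSTM. A minimal sketch of that arrangement with the functional API follows. It assumes tensorflow.keras, uses a much smaller stand-in CNN than the one above so that it runs quickly, and swaps in binary_crossentropy on the assumption that the single sigmoid output is meant as a binary label. Treat it as a starting point rather than a verified fix.

```python
import numpy as np
from tensorflow.keras import Model, layers

# Small stand-in for the CNN above (assumption: any per-frame encoder works here).
frame_in = layers.Input(shape=(80, 80, 3))
h = layers.Conv2D(8, (3, 3), activation="relu")(frame_in)
h = layers.MaxPooling2D((4, 4))(h)
h = layers.Flatten()(h)
frame_out = layers.Dense(32, activation="relu")(h)
cnn = Model(frame_in, frame_out)

# Full batch shape on the Input layer: (batch=1, timesteps=1, 80, 80, 3).
seq_in = layers.Input(batch_shape=(1, 1, 80, 80, 3))
encoded = layers.TimeDistributed(cnn)(seq_in)

# stateful takes a boolean, not the string 'True'.
state = layers.LSTM(16, stateful=True)(encoded)
trigger = layers.Dense(1, activation="sigmoid")(state)

model = Model(seq_in, trigger)
model.compile(loss="binary_crossentropy", optimizer="adam")

# Feed frames one at a time; the LSTM state carries over between calls.
frame = np.random.rand(1, 1, 80, 80, 3).astype("float32")
first = model.predict(frame, batch_size=1, verbose=0)
second = model.predict(frame, batch_size=1, verbose=0)
print(first.shape, second.shape)
```

Each call consumes one frame and returns one value between 0 and 1, which matches the one-to-one behavior described above.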
I am learning how to create sequential models. I have a model:
model = Sequential()
I then went on to add pooling layers and convolution layers (which were fine). But when creating the dense layer:
model.add(Dense(num_classes, activation = 'softmax'))
the call failed at this line:
tf.nn.softmax(x, axis=axis)
which caused an error as axis was not defined. Both Keras and TensorFlow documentation show that the default axis for softmax is None or -1.
Is this a bug with keras? And is there an easy workaround (If I were to set the axis I am not sure what the input tensor would be)?
I can add the rest of the code if necessary, but it simply consists of the other layers and I don't think it will help much.
I believe your Keras and/or TensorFlow is not up to date, so you should update one or both.
This was a known problem in Keras in the summer of 2017 and was fixed in this commit. See more in this comment on the bug report.
Also, axis was introduced as an argument to TensorFlow's softmax() on November 22, 2017, so if the TensorFlow version is 1.4.0 or less, that would also cause this error.
Which of the two causes the error depends on the rank of the tensor being processed, as the Keras source at the linked commit shows.
This code is fine with the current versions (tested on https://colab.research.google.com):
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD
print( keras.__version__ )
model = Sequential()
model.add( Dense(6, input_shape=(6,), activation = 'softmax' ) )
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])
outputs
2.1.6
but more importantly, compiles the model with no error.
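On the side question of what the axis would be: for a Dense output of shape (batch, num_classes), softmax is normally taken over the last axis, so that each row sums to 1. A NumPy sketch of that computation, with made-up logits, is:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

# Made-up logits: a batch of 2 samples with 3 classes each.
logits = np.array([[1.0, 2.0, 3.0],
                   [0.0, 0.0, 0.0]])

probs = softmax(logits, axis=-1)
print(probs.sum(axis=-1))  # each row sums to 1
```

With axis=-1 the normalization runs over the class dimension regardless of how many leading dimensions the tensor has.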
I was wondering how to implement biLSTM with Batch Normalization (BN) in Keras. I know that BN layer should be between linearity and nonlinearity, i.e., activation. This is easy to implement with CNN or Dense layers. But, how to do this with biLSTM?
Thanks in advance.
If you want to apply BatchNormalization over the linear outputs of an LSTM you can do it as
from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, BatchNormalization
model = Sequential()
model.add(Bidirectional(LSTM(128, activation=None), input_shape=(256,10)))
model.add(BatchNormalization())
Essentially, you are removing the non-linear activations of the LSTM (but not the gate activations), and then applying BatchNormalization to the outputs.
If what you want is to apply BatchNormalization into one of the inside flows of the LSTM, such as recurrent flows, I'm afraid that feature has not been implemented in Keras.
I use Keras to build a model, and write the optimization code and everything else in TensorFlow. When I was using quite simple layers like Dense or Conv2D, everything was straightforward. But adding a BatchNormalization layer to my Keras model complicates things.
Since the BatchNormalization layer behaves differently in the training phase and the testing phase, I figured out that I need K.learning_phase():True in my feed_dict. But the following code is not working well. It runs with no error, but the model's performance isn't getting any better.
import keras.backend as K
...
x_train, y_train = get_data()
sess.run(train_op, feed_dict={x:x_train, y:y_train, K.learning_phase():True})
When I tried training keras model with keras fit function, it worked well.
What should I do to train a keras model with BatchNormalization layer in tensorflow?
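The difference between the two phases can be seen by calling the layer directly. This sketch assumes tensorflow.keras with eager execution (TensorFlow 2.x), which is not the session-based setup in the question, and the input data is made up.

```python
import numpy as np
from tensorflow.keras import layers

bn = layers.BatchNormalization()

# Made-up batch with mean around 10, far from the layer's initial moving mean of 0.
x = (np.random.rand(64, 4) * 10 + 5).astype("float32")

# Training phase: normalize with the statistics of this batch.
out_train = np.asarray(bn(x, training=True))

# Inference phase: normalize with the moving averages, which have barely moved
# away from their initial values after a single training call.
out_infer = np.asarray(bn(x, training=False))

print(out_train.mean(), out_infer.mean())
```

The training-phase output is centered near zero, while the inference-phase output stays close to the raw input until the moving averages have been updated over many batches.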
Actually, I duplicated an existing question that I hadn't seen.
I found the answer here: it just consists of passing a special argument to the BatchNormalization layer call.