Let's say I have an LSTM layer in Keras like this:
x = Input(shape=(input_shape), dtype='int32')
x = LSTM(128,return_sequences=True)(x)
Now I am trying to add Dropout to this layer using:
X = Dropout(0.5)
but this throws an error. I assume the line above is creating a new Dropout layer and binding it to X, instead of applying dropout to the LSTM output.
How do I fix this?
Just call the Dropout layer on x, i.e. x = Dropout(0.5)(x), like this:
x = Input(shape=(input_shape), dtype='int32')
x = LSTM(128,return_sequences=True)(x)
x = Dropout(0.5)(x)
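Note that LSTM also has built-in dropout arguments, in case you want dropout applied to the layer's inputs or recurrent state rather than to its output:
x = LSTM(128, return_sequences=True, dropout=0.5, recurrent_dropout=0.5)(x)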
The model is for binary classification.
This is my model:
im_input= layers.Input(shape=[160,160,3])
x = layers.Conv2D(30,(3,3),strides=(2,2),padding='same')(im_input)
z = layers.DepthwiseConv2D((3,3),strides=2,padding='same',depth_multiplier=10)(im_input)
x = layers.ReLU()(x)
z = layers.ReLU()(z)
x = layers.Conv2D(60,(3,3),strides=(2,2),padding='same')(x)
z = layers.Conv2D(60,(3,3),strides=2,padding='same')(z)
x = layers.ReLU()(x)
z = layers.ReLU()(z)
x = layers.Concatenate()([x,z])
x = layers.Conv2D(120,(3,3),strides=2,padding='same')(x)
x = layers.ReLU()(x)
x = layers.Conv2D(200,(3,3),strides=2,padding='same')(x)
x = layers.ReLU()(x)
x = layers.Conv2D(400,(3,3),strides=1,padding='same')(x)
x = layers.ReLU()(x)
x = layers.Conv2D(900,(3,3),strides=1,padding='same')(x)
x = layers.Flatten()(x)
#x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(100,activation='relu')(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(20, activation='relu')(x)
out = layers.Dense(1,activation='sigmoid')(x)
smodel = tf.keras.Model(inputs=im_input, outputs=out, name="myModel2")
smodel.summary()
and this is the loss function:
cross_entropy = tf.keras.losses.BinaryCrossentropy()
the optimizer:
optimizer = tf.keras.optimizers.SGD(0.001)
Any suggestions for the optimizer?
Why is the model's loss not decreasing? Is there something wrong with the model? Please help.
Instead of SGD, you should try the Adam optimizer.
Also, increase the units in the Dense layer, as this is the final representation of the data.
Finally, use fewer filters; keep the maximum at 512.
If your input size is small, also reduce the number of layers.
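For a concrete sketch, reusing the names from the question (smodel, cross_entropy):
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)  # Adam instead of SGD
smodel.compile(optimizer=optimizer, loss=cross_entropy, metrics=['accuracy'])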
Try changing the optimizer to Adam.
I don't think there is anything wrong with the code.
Also try changing the dense layers: after Flatten, use a Dense layer with 512 units and then go directly to your final output layer. You don't need so many dense layers; a sketch follows below.
Also, can you post your loss value? If it's too large, there may be something wrong with your training labels.
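A sketch of the suggested head, picking up x from the question's model after the last convolution:
x = layers.Flatten()(x)
x = layers.Dense(512, activation='relu')(x)  # one wide Dense layer
out = layers.Dense(1, activation='sigmoid')(x)  # final output layer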
I have a question.
Working with a CNN, is there a way to get the output of a hidden layer during classification?
Example:
common_input = layers.Input(shape=(224, 224, 3))
x = model0(common_input)  # model0 is a model pretrained on ImageNet
x = layers.Flatten()(x)
p = layers.Dense(768, activation="relu")(x)
p = layers.Dropout(0.3)(p)
p = layers.Dense(8, activation="softmax", name="fc_out")(p)
model = Model(inputs=common_input, outputs=p)
My task is classification, so with 2000 images I get a 2000x8 matrix. If I also need the output of the Dense layer, is there a way to get a 2000x768 matrix in the same forward pass?
Thanks
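One way to do this (a sketch, not from the thread; the layer name fc_hidden and the array images are illustrative) is to give the Model a second output pointing at the hidden Dense layer, so a single predict() call returns both matrices:
common_input = layers.Input(shape=(224, 224, 3))
x = model0(common_input)  # model0 is a model pretrained on ImageNet, as above
x = layers.Flatten()(x)
hidden = layers.Dense(768, activation="relu", name="fc_hidden")(x)
p = layers.Dropout(0.3)(hidden)
p = layers.Dense(8, activation="softmax", name="fc_out")(p)
model = Model(inputs=common_input, outputs=[p, hidden])  # two outputs
preds, features = model.predict(images)  # shapes (2000, 8) and (2000, 768)
Both outputs share the same layers, so you can train the usual single-output model first and build this two-output Model afterwards for inference.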
I want to accomplish the equivalent of tf.depth_to_space in a Keras model. Specifically, the data in the Keras model is shaped H x W x 4 (i.e., depth of 4) and I want to permute the data so that the output is sized 2H x 2W x 1, with the mapping done by viewing the 4 input channels as 2x2 blocks; i.e.,
input location is y, x, k
output location is 2*y+(k//2), 2*x+(k%2), 1
I know that I can get the correct shape with:
outputs = keras.layers.Reshape((H*2,W*2,1), input_shape=(H,W,4))(inputs)
But I think that the mapping will be
input location is y, x, k
linear_address is y*W*4 + x*4 + k
output location is linear_address // (H*2), linear_address % (H*2), 1
which is not what I want
I tried directly using
outputs = tf.depth_to_space(inputs, 2)
but that led to an error:
TypeError: Output tensors to a Model must be Keras tensors. Found Tensor("DepthToSpace:0", shape=(?, 1024, 1024, 1), dtype=float32)
The problem can be seen with this simple function:
def simple_net(H=512, W=512):
    inputs = keras.layers.Input((H, W, 4))
    # gets the correct shape but not the correct order
    outputs = keras.layers.Reshape((H*2, W*2, 1), input_shape=(H, W, 4))(inputs)
    # runtime error message:
    # outputs = tf.depth_to_space(inputs, 2)
    model = keras.models.Model(inputs, outputs)
    return model
You should use a Keras Lambda layer:
from keras.layers import Lambda
import tensorflow as tf

Subpixel_layer = Lambda(lambda x: tf.nn.depth_to_space(x, scale))  # scale is the block size
x = Subpixel_layer(x)
A minimal model:
import tensorflow as tf
from keras.layers import Input, Lambda, Conv2D
from keras.models import Model

inp = Input(shape=(32, 32, 3))  # 'in' is a reserved word in Python, so use 'inp'
x = Conv2D(32, (3, 3), activation='relu')(inp)
x = Conv2D(32, (3, 3), activation='relu')(x)
sub_layer = Lambda(lambda x: tf.nn.depth_to_space(x, 2))
x = sub_layer(x)
model = Model(inputs=inp, outputs=x)
# model.compile(optimizer=Adam(), loss='mean_squared_error')
model.summary()
Essentially, I am training an LSTM model with Keras, but when I save it, its size is around 100 MB. My goal is to deploy the model to a web server as an API, but the server cannot run it because the model is too big. After analyzing the parameters in my model, I found that it has 20,000,000 parameters, of which 15,000,000 are untrained because they are word embeddings. Is there any way to shrink the model by removing those 15,000,000 parameters while preserving its performance?
Here is my code for the model:
def LSTModel(input_shape, word_to_vec_map, word_to_index):
    sentence_indices = Input(input_shape, dtype="int32")
    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
    embeddings = embedding_layer(sentence_indices)

    X = LSTM(256, return_sequences=True)(embeddings)
    X = Dropout(0.5)(X)
    X = LSTM(256, return_sequences=False)(X)
    X = Dropout(0.5)(X)
    X = Dense(NUM_OF_LABELS)(X)
    X = Activation("softmax")(X)

    model = Model(inputs=sentence_indices, outputs=X)
    return model
Define the layers you want to save outside the functions and name them. Then create two functions, foo() and bar(). foo() will contain the original pipeline, including the embedding layer; bar() will contain only the part of the pipeline AFTER the embedding layer. Instead of the embedding layer, you will define a new Input() layer in bar() with the dimensions of your embeddings:
lstm1 = LSTM(256, return_sequences=True, name='lstm1')
lstm2 = LSTM(256, return_sequences=False, name='lstm2')
dense = Dense(NUM_OF_LABELS, name='Susie Dense')

def foo(...):
    sentence_indices = Input(input_shape, dtype="int32")
    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
    embeddings = embedding_layer(sentence_indices)

    X = lstm1(embeddings)
    X = Dropout(0.5)(X)
    X = lstm2(X)
    X = Dropout(0.5)(X)
    X = dense(X)
    X = Activation("softmax")(X)

    return Model(inputs=sentence_indices, outputs=X)

def bar(...):
    embeddings = Input(embedding_shape, dtype="float32")

    X = lstm1(embeddings)
    X = Dropout(0.5)(X)
    X = lstm2(X)
    X = Dropout(0.5)(X)
    X = dense(X)
    X = Activation("softmax")(X)

    return Model(inputs=embeddings, outputs=X)

foo_model = foo(...)
bar_model = bar(...)

foo_model.fit(...)
bar_model.save_weights(...)
Now train the original foo() model, then save the weights of the reduced bar() model. When loading the weights, don't forget to specify the by_name=True parameter:
foo_model.load_weights('bar_model.h5', by_name=True)
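At inference time, a hedged sketch (precomputed_embeddings is a hypothetical array of shape (batch, sequence_length, embedding_dim), produced by running the embedding lookup outside the model):
bar_model.load_weights('bar_model.h5', by_name=True)
predictions = bar_model.predict(precomputed_embeddings)  # hypothetical precomputed embeddings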
Aux_input = Input(shape=(wrd_temp.shape[1],1), dtype='float32')#shape (,200)
Main_input = Input(shape=(wrdvec.shape[1],),dtype='float32')#shape(,367)
X = Bidirectional(LSTM(20,return_sequences=True))(Aux_input)
X = Dropout(0.2)(X)
X = Bidirectional(LSTM(28,return_sequences=True))(X)
X = Dropout(0.2)(X)
X = Bidirectional(LSTM(28,return_sequences=False))(X)
Aux_Output = Dense(Opt_train.shape[1], activation= 'softmax' )(X)#total 22 classes
x = keras.layers.concatenate([Main_input, Aux_Output], axis=1)
x = tf.reshape(x, [1, 389, 1])  # 389 is the length of the new input, i.e. Main_input + Aux_Output
x = Bidirectional(LSTM(20,return_sequences=True))(x)
x = Dropout(0.2)(x)
x = Bidirectional(LSTM(28,return_sequences=True))(x)
x = Dropout(0.2)(x)
x = Bidirectional(LSTM(28,return_sequences=False))(x)
Main_Output = Dense(Opt_train.shape[1], activation= 'softmax' )(x)
model = Model(inputs=[Aux_input,Main_input], outputs= [Aux_Output,Main_Output])
The error occurs in the line declaring the model, i.e. model = Model(...); that is where the AttributeError is raised. Also, if there is any other mistake in my implementation, please take note and let me know in the comments.
The problem lies in the fact that every tf operation should be encapsulated by either:
using keras.backend functions,
Lambda layers, or
designated Keras functions with the same behavior.
When you use a raw tf operation, you get a tf tensor object which doesn't have the history field Keras needs. When you use Keras functions, you get Keras tensors.
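Applied to the question above, a minimal sketch of the fix is to wrap the raw tf.reshape call in a Lambda layer, or to use the built-in Reshape layer (which leaves the batch dimension implicit):
from keras.layers import Lambda, Reshape
import tensorflow as tf

# instead of: x = tf.reshape(x, [1, 389, 1])
x = Lambda(lambda t: tf.reshape(t, [-1, 389, 1]))(x)  # -1 keeps the batch dimension flexible
# or, equivalently:
x = Reshape((389, 1))(x)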