I am trying to build a model in Keras with 2 inputs and 2 outputs. The model includes 5 convolutional layers whose weights must be shared between the two inputs, but the layers after the convolutional block should not share weights. I used concatenate(), but it affected the other layers. A figure of the model is shown below. How can I do this?
The Network-Model:
I think you should upload the image of the model architecture again.
The shared-weights part also confused me; I assume you mean something like a single layer with one input, as in:
from keras.layers import Input, Dense, concatenate
from keras.models import Model

input = Input(shape=(64,))
layer_1 = Dense(32, activation="relu")(input)
layer_2 = Dense(16, activation="relu")(layer_1)
layer_3 = Dense(16, activation="relu")(layer_1)
combined = concatenate([layer_2, layer_3])
output = Dense(8, activation="relu")(combined)
output = Dense(1, activation="linear")(output)
model = Model(inputs=[input], outputs=output)
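The snippet above has only one input, so nothing is actually shared between two inputs. If the goal really is two inputs that run through the same convolutional stack while the later layers stay separate, the usual Keras pattern is to instantiate the shared layers once and call them on both inputs. A minimal sketch (my own illustration, with made-up sizes and only two shared conv layers instead of five):
from keras.layers import Input, Conv2D, Flatten, Dense
from keras.models import Model

input_a = Input(shape=(64, 64, 3))
input_b = Input(shape=(64, 64, 3))

# Shared convolutional layers: created once and applied to both inputs,
# so both branches use the same weights.
shared_conv1 = Conv2D(32, (3, 3), activation="relu", padding="same")
shared_conv2 = Conv2D(64, (3, 3), activation="relu", padding="same")

feat_a = Flatten()(shared_conv2(shared_conv1(input_a)))
feat_b = Flatten()(shared_conv2(shared_conv1(input_b)))

# Unshared heads: separate Dense layers per branch, so no weight sharing here.
out_a = Dense(1, activation="linear", name="output_a")(Dense(16, activation="relu")(feat_a))
out_b = Dense(1, activation="linear", name="output_b")(Dense(16, activation="relu")(feat_b))

shared_model = Model(inputs=[input_a, input_b], outputs=[out_a, out_b])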
Could you help me with the code so that, along with the dense layers, the last convolutional layer of EfficientNet is trained as well?
import tensorflow as tf
import tensorflow_hub as hub

features_url = "https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_b3/feature_vector/2"
img_shape = (299, 299, 3)
features_layer = hub.KerasLayer(features_url, input_shape=img_shape)

# The commented-out line below keeps all the CNN layers frozen, so it does not
# work for me at the moment.
# features_layer.trainable = False

model = tf.keras.Sequential([
    features_layer,
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(4, activation='softmax')
])
In addition, how can I save the name of the last convolutional layer in a variable?
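One possible workaround (my suggestion, not from the post): swap the TF-Hub feature vector for the Keras Applications version of the same architecture, which exposes its individual layers. That makes it possible to store the name of the last convolutional layer in a variable and unfreeze only the layers from that point on. A rough sketch under those assumptions:
import tensorflow as tf

base = tf.keras.applications.EfficientNetV2B3(include_top=False,
                                              weights='imagenet',
                                              input_shape=(299, 299, 3),
                                              pooling='avg')

# Save the name of the last convolutional layer in a variable.
last_conv_name = [l.name for l in base.layers
                  if isinstance(l, tf.keras.layers.Conv2D)][-1]

# Freeze everything before that layer; leave the last conv layer
# (and everything after it) trainable.
base.trainable = True
trainable = False
for layer in base.layers:
    if layer.name == last_conv_name:
        trainable = True
    layer.trainable = trainable

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(4, activation='softmax')
])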
As far as I know, a pretrained CNN should do much better than a CNN trained from scratch. I have a dataset of 855 images. I applied a plain CNN and got 94% accuracy. Then I applied pretrained models (VGG16, ResNet50, Inception_V3, MobileNet), also with fine-tuning, but the highest I got was 60%, and two of them performed very badly on classification. Can a plain CNN really do better than a pretrained model, or is my implementation wrong? I resized my images to 100x100 and followed the Keras Applications approach. So what is the issue?
Naive CNN approach:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def cnn_model():
    size = (100, 100, 1)
    num_cnn_layers = 2
    NUM_FILTERS = 32
    KERNEL = (3, 3)
    MAX_NEURONS = 120

    model = Sequential()
    for i in range(1, num_cnn_layers + 1):
        if i == 1:
            model.add(Conv2D(NUM_FILTERS * i, KERNEL, input_shape=size,
                             activation='relu', padding='same'))
        else:
            model.add(Conv2D(NUM_FILTERS * i, KERNEL, activation='relu',
                             padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(int(MAX_NEURONS), activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(int(MAX_NEURONS / 2), activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model
VGG16 approach:
import keras
from keras.models import Sequential

def vgg():
    vgg_model = keras.applications.vgg16.VGG16(weights='imagenet',
                                               include_top=False,
                                               input_shape=(100, 100, 3))
    model = Sequential()
    for layer in vgg_model.layers:
        model.add(layer)
    # Freeze the layers
    for layer in model.layers:
        layer.trainable = False
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(3, activation='softmax'))
    model.compile(optimizer=keras.optimizers.Adam(lr=1e-5),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
What you're referring to as "CNN" in both cases is the same thing: a type of neural network model. The only difference is that the pre-trained model has been trained on some other data instead of the dataset you're working on and trying to classify.
What is usually used here is called transfer learning. Instead of freezing all the layers, try leaving the last few layers open so they can be retrained on your own data; that way the pre-trained model can adjust its weights and biases to your task as well. It could be that the dataset you're trying to classify is foreign to the pre-trained models.
Here's an example from my own work; there are additional pieces of code, but you can make it work with your own code, since the logic remains the same.
# Extract the layer you want to build on, usually one of the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
last_output = last_layer.output
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
# Combine the newly added layers and the pre-trained model.
model = Model(pre_trained_model.input, x)
model.compile(optimizer=RMSprop(lr=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
Adding to what @Ilknur Mustafa mentioned: since your dataset may be foreign to the images used for pre-training, you can try re-training the last few layers of the pre-trained model instead of adding whole new layers. The example code below doesn't add any trainable layer other than the output layer. This way you benefit from retraining the last few layers starting from the existing weights, rather than training from scratch, which is helpful if you don't have a large dataset to train on.
import keras
from keras.applications.vgg16 import VGG16
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import SGD

# load model without classifier layers
vgg = VGG16(include_top=False, input_shape=(100, 100, 3), weights='imagenet', pooling='avg')
# make only the last 2 conv layers trainable
for layer in vgg.layers[:-4]:
    layer.trainable = False
# add output layer
out_layer = Dense(3, activation='softmax')(vgg.layers[-1].output)
model_pre_vgg = Model(vgg.input, out_layer)
# compile model
opt = SGD(lr=1e-5)
model_pre_vgg.compile(optimizer=opt, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])
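A quick way to verify which layers actually ended up trainable (my addition, not part of the original answer):
for layer in model_pre_vgg.layers:
    print(layer.name, layer.trainable)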
I'm using the Inception model in Keras with the pre-trained ImageNet weights.
The problem is that the default input size for this model is 299x299, per the Keras documentation, while my images are 230x350 and I don't want to resize them because that would distort them. So I am trying to find a way to change the input layer size.
Below is the code I have tried so far; however, I doubt the ImageNet weights are being preserved, as I think the architecture changes when I change the input size.
Any ideas?
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input, GlobalAveragePooling2D, Dense
from keras.models import Model
from keras.optimizers import Adam

input = Input(shape=(230, 350, 3), name='image_input')
base_model = InceptionV3(weights='imagenet', include_top=False, input_tensor=input)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(64, activation='relu')(x)
predictions = Dense(1, activation='sigmoid')(x)
model = Model(inputs=input, outputs=predictions)

for layer in base_model.layers:
    layer.trainable = True

model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=0.0001),
              metrics=['accuracy'])
Inception V3 is a fully convolutional model, and you use global pooling on top of the convolutional encoder, so a slight deviation from 299x299 should not be a big deal. If your code runs without error messages, it should be perfectly fine to use it like this.
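A quick sanity check (my addition, not part of the original answer): with include_top=False, all of the ImageNet weights belong to convolution and batch-normalization layers, which do not depend on the spatial input size, so they load unchanged for a 230x350 input:
from keras.applications.inception_v3 import InceptionV3

m = InceptionV3(weights='imagenet', include_top=False, input_shape=(230, 350, 3))
m.summary()  # same layers and weight shapes as the 299x299 model; only the feature-map sizes differ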
I'm using the pre-trained ResNet-50 model and want to feed the output of the penultimate layer to an LSTM network. Here is my sample code containing only the CNN (ResNet-50):
from keras.applications.resnet50 import ResNet50
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

N = NUMBER_OF_CLASSES
# img_size = (224, 224, 3), same as that of ImageNet
base_model = ResNet50(include_top=False, weights='imagenet', pooling=None)
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(1024, activation='relu')(x)
model = Model(inputs=base_model.input, outputs=predictions)
Next, I want to feed it to an LSTM network, as follows:
final_model = Sequential()
final_model.add(model)
final_model.add(LSTM(64, return_sequences=True, stateful=True))
final_model.add(Dense(N, activation='softmax'))
But I'm confused about how to reshape the CNN output for the LSTM input. My original input to the CNN is (224, 224, 3).
Also, should I use TimeDistributed?
Any kind of help is appreciated.
Adding an LSTM after a CNN does not make a lot of sense, as an LSTM is mostly used for temporal/sequence information, whereas your data seems to be purely spatial. However, if you still want to use it, just use:
x = Reshape((1024, 1))(x)
This converts it into a sequence of 1024 timesteps, with 1 feature each.
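Below is a minimal sketch of how that Reshape would sit between the ResNet features and the LSTM (my own wiring, not the asker's code; N stands in for NUMBER_OF_CLASSES, and the stateful/return_sequences options are dropped to keep the sketch simple):
from keras.applications.resnet50 import ResNet50
from keras.layers import GlobalAveragePooling2D, Dense, Reshape, LSTM
from keras.models import Model

N = 10  # placeholder for NUMBER_OF_CLASSES

base_model = ResNet50(include_top=False, weights='imagenet', pooling=None)
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(1024, activation='relu')(x)
x = Reshape((1024, 1))(x)  # 1024 "timesteps" with 1 feature each
x = LSTM(64)(x)
predictions = Dense(N, activation='softmax')(x)
final_model = Model(inputs=base_model.input, outputs=predictions)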
If you are dealing with spatio-temporal data, use TimeDistributed on the ResNet layer and then you can use ConvLSTM2D.
Example of using pretrained network with LSTM:
from keras.applications.vgg16 import VGG16
from keras.layers import Input, TimeDistributed, Flatten, LSTM

inputs = Input(shape=(config.N_FRAMES_IN_SEQUENCE, config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
cnn = VGG16(include_top=False, weights='imagenet', input_shape=(config.IMAGE_H, config.IMAGE_W, config.N_CHANNELS))
x = TimeDistributed(cnn)(inputs)
x = TimeDistributed(Flatten())(x)
x = LSTM(256)(x)
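To turn that snippet into a complete, trainable model, an output layer and a Model wrapper still need to be added. A possible completion (my addition; config.N_CLASSES and the extra imports are assumptions, not part of the original snippet):
from keras.layers import Dense
from keras.models import Model

outputs = Dense(config.N_CLASSES, activation='softmax')(x)  # config.N_CLASSES is a placeholder
model = Model(inputs, outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])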
Suppose I have trained the model below for an epoch:
model = Sequential([
    Dense(32, input_dim=784),  # first number is output_dim
    Activation('relu'),
    Dense(10),  # output_dim; input_dim is inferred from the layer above
    Activation('softmax'),
])
I have the weights dense1_w and biases dense1_b of the first hidden layer (call it dense1), and a single data sample sample.
How do I use these to get the output of dense1 on the sample in Keras?
Thanks!
The easiest way is to use the Keras backend. With the Keras backend you can define a function that gives you the intermediate output of a Keras model, as described in the Keras FAQ (https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer).
So in essence:
from keras import backend as K
# model.layers[1] is the ReLU activation that follows the first Dense layer
get_1st_layer_output = K.function([model.layers[0].input],
                                  [model.layers[1].output])
layer_output = get_1st_layer_output([X])
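An equivalent approach (my addition, not part of the original answer) is to build a sub-model that reuses the already-trained layers and call predict on it:
from keras.models import Model

intermediate_model = Model(inputs=model.input, outputs=model.layers[1].output)
layer_output = intermediate_model.predict(X)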
Just recreate the first part of the model up to the layer whose output you want (in your case only the first dense layer). Afterwards you can load the trained weights of the first part into your newly created model and compile it.
The prediction output of this new model will be the output of that layer (in your case the first dense layer).
from keras.models import Sequential
from keras.layers import Dense, Activation
import numpy as np

model = Sequential([
    Dense(32, input_dim=784),  # first number is output_dim
    Activation('relu'),
    Dense(10),  # output_dim; input_dim is inferred from the layer above
    Activation('softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

# create some random data
n_features = 5
samples = np.random.randint(0, 10, 784 * n_features).reshape(-1, 784)
labels = np.arange(10 * n_features).reshape(-1, 10)

# train your sample model
model.fit(samples, labels)

# create a new model consisting of only the first dense layer (plus activation)
new_model = Sequential([
    Dense(32, input_dim=784),  # first number is output_dim
    Activation('relu')])

# set the weights of the first layer
new_model.set_weights(model.layers[0].get_weights())

# compile it after setting the weights
new_model.compile(optimizer='adam', loss='categorical_crossentropy')

# get the output of the first dense layer
output = new_model.predict(samples)
As for the weights: I had a non-Sequential model. What I did was use model.summary() to get the desired layer's name and then model.get_layer("layer_name").get_weights() to get the weights.
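For example, a minimal usage sketch (the layer name "my_dense" is hypothetical; use whatever name model.summary() actually prints):
model.summary()  # prints every layer with its name
weights, biases = model.get_layer("my_dense").get_weights()  # kernel and bias arrays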