We built a small CNN using Keras with the TensorFlow backend, using Keras's functional API.
We're interested in passing the output of the last convolutional/pooling layer (the one before the fully connected layers) as an input to another CNN.
For simplicity, I suggest the following simplified code to discuss:
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
visible = Input(shape=(64,64,1))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat = Flatten()(pool2) # flatten the conv feature maps before the dense layers
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
model.compile(optimizer='Adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_dataset,
          train_labels,
          epochs=400,
          batch_size=512,
          validation_data=(valid_dataset, valid_labels),
          verbose=1,
          callbacks=[early_stop])
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='convolutional_neural_network.png')
The question is: how can I pass the pool2 layer as an input to some other simple model using Keras, so that it trains simultaneously with the first model described above?
One possible way would be to extend your model so that everything is contained in a single model that ends with two branches. The functional API in Keras allows you to define connections between layers however you want, and it also provides the infrastructure for having multiple outputs and loss functions.
For example:
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
visible = Input(shape=(64,64,1))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
# add your second model here
X = FirstLayer()(pool2) # replace with your actual network layer
# ...
output2 = YourSecondOutput()(X)
flat = Flatten()(pool2) # flatten the conv feature maps before the dense layers
hidden1 = Dense(10, activation='relu')(flat)
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=[output, output2]) # list of outputs
model.compile(optimizer='Adam',
              loss=['sparse_categorical_crossentropy', None], # one loss per output (None means that output is not trained)
              metrics=['accuracy'])
model.fit(train_dataset,
          train_labels,
          epochs=400,
          batch_size=512,
          validation_data=(valid_dataset, valid_labels),
          verbose=1,
          callbacks=[early_stop])
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='convolutional_neural_network.png')
Then you'll just need to update the arguments to fit so that you have labels for each output. You can find more info in the Keras documentation on multi-input and multi-output models.
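For illustration, here is a minimal sketch of what compile and fit could look like once the model has the two outputs above. The second loss and the label array train_labels2 are placeholders for whatever your second branch actually predicts:
model.compile(optimizer='Adam',
              loss=['sparse_categorical_crossentropy', 'binary_crossentropy'], # one loss per output, in output order
              metrics=['accuracy'])
model.fit(train_dataset,
          [train_labels, train_labels2], # one label array per output, in output order
          epochs=400,
          batch_size=512)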
Related
I've seen that in Keras I can use tf.split to split layers. My problem is that I don't understand how to make the connections between the "forked paths" that each layer must take.
Here is an image of an example I'm trying to build. It is basically an input layer that splits into two sub-networks, which then reunite in a layer before the output.
Diagram (the thin black lines and white polygons represent the weight connection matrices): [image not shown]
After googling with better keywords (such as "merge neural networks"), I found that the Keras functional API is the answer.
Helpful links:
https://machinelearningmastery.com/keras-functional-api-deep-learning/
https://www.educative.io/answers/how-to-merge-two-different-models-in-keras
Example code from the first link:
# Shared Input Layer
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
# input layer
visible = Input(shape=(64,64,1))
# first feature extractor
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
flat1 = Flatten()(pool1)
# second feature extractor
conv2 = Conv2D(16, kernel_size=8, activation='relu')(visible)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
flat2 = Flatten()(pool2)
# merge feature extractors
merge = concatenate([flat1, flat2])
# interpretation layer
hidden1 = Dense(10, activation='relu')(merge)
# prediction output
output = Dense(1, activation='sigmoid')(hidden1)
model = Model(inputs=visible, outputs=output)
# summarize layers
print(model.summary())
# plot graph
plot_model(model, to_file='shared_input_layer.png')
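For completeness, here is a minimal sketch of the exact fork-and-merge topology from the diagram above; the layer sizes are placeholders, not taken from the image:
from keras.models import Model
from keras.layers import Input, Dense, concatenate
inp = Input(shape=(8,))
# first sub-network
a = Dense(16, activation='relu')(inp)
# second sub-network
b = Dense(16, activation='relu')(inp)
# reunite the branches in a layer before the output
merged = concatenate([a, b])
hidden = Dense(8, activation='relu')(merged)
out = Dense(1, activation='sigmoid')(hidden)
model = Model(inputs=inp, outputs=out)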
I have read this and this.
I have numerical data in arrays of shape:
input_array = 14674 x 4
output_array = 13734 x 4
Reshaping for an LSTM (batch, timesteps, features) gives:
input_array = (14574, 100, 4)
output_array = (13634, 100, 4)
Now I would like to build a many-to-many LSTM architecture for this data.
Should I use an encoder-decoder or a synced sequence input-and-output architecture?
I am using the following model, but it only works when the input and output shapes are the same:
import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.metrics import Recall, Precision
from tensorflow.keras.layers import Conv1D, Dense, MaxPooling1D, Flatten, RepeatVector, LSTM, TimeDistributed
opt = tensorflow.keras.optimizers.Adam(learning_rate=0.001)
model_enc_dec_cnn = Sequential()
model_enc_dec_cnn.add(Conv1D(filters=64, kernel_size=9, activation='relu', input_shape=(100, 4)))
model_enc_dec_cnn.add(Conv1D(filters=64, kernel_size=11, activation='relu'))
model_enc_dec_cnn.add(MaxPooling1D(pool_size=2))
model_enc_dec_cnn.add(Flatten())
model_enc_dec_cnn.add(RepeatVector(100))
model_enc_dec_cnn.add(LSTM(100, activation='relu', return_sequences=True))
model_enc_dec_cnn.add(TimeDistributed(Dense(4)))
model_enc_dec_cnn.compile(optimizer=opt, loss='mse', metrics=['accuracy'])
history = model_enc_dec_cnn.fit(X, y, epochs=3, batch_size=64)
I'm trying to train a simple model on some picture data belonging to 10 classes.
The images are in B/W format (not grayscale), and I'm using image_dataset_from_directory to import the data into Python as well as split it into validation/training sets.
My code is below:
My Imports
import matplotlib.pyplot as plt
import numpy as np
import os
import PIL
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense
Read Image Data
trainDT = tf.keras.preprocessing.image_dataset_from_directory(
    data_path,
    labels="inferred",
    label_mode="categorical",
    class_names=['0','1','2','3','4','5','6','7','8','9'],
    color_mode="grayscale",
    batch_size=4,
    image_size=(256, 256),
    shuffle=True,
    seed=44,
    validation_split=0.1,
    subset='validation',
    interpolation="bilinear",
    follow_links=False,
)
Model Creation/Compile/Fit
model = Sequential([
    Dense(units=128, activation='relu', input_shape=(256,256,1), name='h1'),
    Dense(units=64, activation='relu', name='h2'),
    Dense(units=16, activation='relu', name='h3'),
    layers.Flatten(name='flat'),
    Dense(units=10, activation='softmax', name='out')
], name='1st')
model.summary()
model.compile(optimizer='adam' , loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=trainDT, validation_data=train_data, epochs=10, verbose=2)
The model training returns an error:
InvalidArgumentError Traceback (most recent call last)
....
/// anaconda paths and anaconda python code snippets in the error reporting \\\
....
InvalidArgumentError: Matrix size-incompatible: In[0]: [1310720,3], In[1]: [1,128]
[[node 1st/h1/Tensordot/MatMul (defined at <ipython-input-38-58d6507e2d35>:1) ]] [Op:__inference_test_function_11541]
Function call stack:
test_function
I don't understand where the size mismatch comes from. I've spent a few hours looking for a solution and trying different things, but nothing seems to work for me.
I'd appreciate any help; thank you in advance!
Dense layers expect flat input (not a 3D tensor), but you are passing a (256,256,1)-shaped tensor into the first Dense layer. If you want to use Dense layers from the beginning, then you need to move Flatten to be the first layer, or you need to properly reshape your data.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
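Or, if you would rather reshape the data instead of adding a Flatten layer, here is a sketch of flattening each batch in the tf.data pipeline itself (assuming the dataset yields (batch, 256, 256, 1) image tensors):
trainDT = trainDT.map(lambda x, y: (tf.reshape(x, (tf.shape(x)[0], -1)), y))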
Also, the Flatten between two Dense layers makes no sense, because the output of a Dense layer is already flat. From the structure of your model (especially the Flatten placement), I assume those Dense layers were supposed to be convolutional layers instead.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax")
])
Convolutional layers can process 2D input, and they will also produce higher-dimensional output, which you need to flatten before passing it to the dense top (note that you can add more convolutional layers).
Hi mhk777, hope you are doing well. I think you are confusing dense layers with convolutional layers. You have to apply some convolutional layers to the image before feeding it to the dense layers. If you don't want to apply convolutions, then you have to give a 2D array to the dense layer, i.e. (number of samples, features).
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
model = models.Sequential()
# Here are the convolutional layers
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(256,256,1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
# Here are your dense layers
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax')) # softmax output to match the categorical_crossentropy loss
model.summary()
model.compile(optimizer='adam' , loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=trainDT, validation_data=train_data, epochs=10, verbose=2)
I created this model:
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D, Input, Dense
from tensorflow.keras.layers import Reshape, Flatten
from tensorflow.keras import Model
def create_DeepCAPCHA(input_shape=(28,28,1), n_prediction=1, n_class=10,
                      optimizer='adam', show_summary=True):
    inputs = Input(input_shape)
    x = Conv2D(filters=32, kernel_size=3, activation='relu', padding='same')(inputs)
    x = MaxPooling2D(pool_size=2)(x)
    x = Conv2D(filters=48, kernel_size=3, activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=2)(x)
    x = Conv2D(filters=64, kernel_size=3, activation='relu', padding='same')(x)
    x = MaxPooling2D(pool_size=2)(x)
    x = Flatten()(x)
    x = Dense(512, activation='relu')(x)
    x = Dense(units=n_prediction*n_class, activation='softmax')(x)
    outputs = Reshape((n_prediction, n_class))(x)
    model = Model(inputs, outputs)
    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    if show_summary:
        model.summary()
    return model
I tried the model on the MNIST dataset:
import tensorflow as tf
import numpy as np
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
inputs = x_train
outputs = tf.keras.utils.to_categorical(y_train, num_classes=10)
outputs = np.expand_dims(outputs,1)
model = create_DeepCAPCHA(input_shape=(28,28,1),n_prediction=1,n_class=10)
model.fit(inputs, outputs, epochs=10, validation_split=0.1)
but it failed to converge (stuck at 10% accuracy, i.e. the same as random guessing). Yet when I remove the padding='same' argument from the Conv2D layers, it works flawlessly:
def working_DeepCAPCHA(input_shape=(28,28,1), n_prediction=1, n_class=10,
                       optimizer='adam', show_summary=True):
    inputs = Input(input_shape)
    x = Conv2D(filters=32, kernel_size=3, activation='relu')(inputs)
    x = MaxPooling2D(pool_size=2)(x)
    x = Conv2D(filters=48, kernel_size=3, activation='relu')(x)
    x = MaxPooling2D(pool_size=2)(x)
    x = Conv2D(filters=64, kernel_size=3, activation='relu')(x)
    x = MaxPooling2D(pool_size=2)(x)
    x = Flatten()(x)
    x = Dense(512, activation='relu')(x)
    x = Dense(units=n_prediction*n_class, activation='softmax')(x)
    outputs = Reshape((n_prediction, n_class))(x)
    model = Model(inputs, outputs)
    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    if show_summary:
        model.summary()
    return model
Does anyone have an idea what the problem is?
Thank you for sharing; this was really interesting to me. So I wrote the code and tested several scenarios. Note that what I'm going to say is just my guess, and I'm not sure about it.
My conclusion from those tests is that no padding (valid padding) works because it produces a (1, 1, 64) output shape for the last conv layer. But if you set the padding to same, it produces (3, 3, 64), and because the next layer is a big Dense layer, this multiplies the number of network parameters by 9 (I expected it to somehow result in overfitting), and it seems to make it much harder for the network to find good values for the parameters. So I tried several different ways to reduce the output of the last conv layer to (1, 1, 64):
using one more conv layer + max pooling
changing the last max pooling to a pool_size of 4
using a stride of 2 for one of the conv layers
changing the filters of the last conv layer to 20
They all worked well. Even changing the Dense units from 512 to 64 helps as well (note that even then you may occasionally get poor results, because of bad initialization, I guess).
Then I changed the shape of the last conv layer to (2, 2, 64), and the chance of getting a good result (more than 90% accuracy) dropped (a lot of the time I got 10% accuracy).
So it seems that too many parameters can confuse the model. But if you want to know why the network does not simply overfit, I have no answer for you.
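To make the size difference concrete, here is a small sketch (my own addition, reusing the conv stack from the question) that prints the input shape and parameter count of the 512-unit Dense layer under both paddings:
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

def conv_head(padding):
    inputs = Input((28, 28, 1))
    x = inputs
    for filters in (32, 48, 64):
        x = Conv2D(filters, 3, activation='relu', padding=padding)(x)
        x = MaxPooling2D(2)(x)
    x = Flatten()(x)
    x = Dense(512, activation='relu')(x)
    return Model(inputs, x)

for padding in ('valid', 'same'):
    dense = conv_head(padding).layers[-1]  # the 512-unit Dense layer
    print(padding, dense.input_shape, dense.count_params())
# valid -> last conv output (1, 1, 64): 64*512 + 512 = 33,280 parameters
# same  -> last conv output (3, 3, 64): 576*512 + 512 = 295,424 parameters (9x the weights)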
This is my code for joining a ResNet50 model with this model (which I want to train on my dataset). I want to freeze the layers of the ResNet50 model (see trainable=False in the code).
Here I'm importing the ResNet50 model:
```
import tensorflow.keras
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
resnet50_imagenet_model = ResNet50(weights="imagenet",
                                   include_top=False,
                                   input_shape=(150, 150, 3),
                                   pooling='max')
```
Here I create my model:
```
# freeze feature layers and rebuild the model
for l in resnet50_imagenet_model.layers:
    l.trainable = False

# build the model
model5 = [
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(12, activation='softmax')
]

# join the two models
model_using_pre_trained_resnet50 = tf.keras.Sequential(resnet50_imagenet_model.layers + model5)
```
The last line doesn't work, and I get this error:
Input 0 of layer conv2_block1_3_conv is incompatible with the layer: expected axis -1 of input shape to have value 64 but received input with shape [None, 38, 38, 256]
Thanks for the help.
You can also use Keras' functional API, as below:
from tensorflow.keras.applications.resnet50 import ResNet50
import tensorflow as tf
resnet50_imagenet_model = ResNet50(include_top=False, weights='imagenet', input_shape=(150, 150, 3))
#Flatten output layer of Resnet
flattened = tf.keras.layers.Flatten()(resnet50_imagenet_model.output)
#Fully connected layer 1
fc1 = tf.keras.layers.Dense(128, activation='relu', name="AddedDense1")(flattened)
#Fully connected layer, output layer
fc2 = tf.keras.layers.Dense(12, activation='softmax', name="AddedDense2")(fc1)
model = tf.keras.models.Model(inputs=resnet50_imagenet_model.input, outputs=fc2)
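Since your original code also froze the pretrained layers, you can do the same here by marking them non-trainable before compiling; a minimal sketch added to the code above:
# freeze the pretrained ResNet50 feature extractor
for layer in resnet50_imagenet_model.layers:
    layer.trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])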
Also refer to this question.