Keras: What is the difference between model and layers? - python

EDIT: This video by Franchois Chollet says that Layer + training eval methods = Model
In keras documentation it says that models are made up of layers. However in this section it shows that a model can be made up of models.
from keras.layers import Conv2D, MaxPooling2D, Input, Dense, Flatten
from keras.models import Model
# First, define the vision modules
digit_input = Input(shape=(27, 27, 1))
x = Conv2D(64, (3, 3))(digit_input)
x = Conv2D(64, (3, 3))(x)
x = MaxPooling2D((2, 2))(x)
out = Flatten()(x)
vision_model = Model(digit_input, out)
# Then define the tell-digits-apart model
digit_a = Input(shape=(27, 27, 1))
digit_b = Input(shape=(27, 27, 1))
# The vision model will be shared, weights and all
out_a = vision_model(digit_a)
out_b = vision_model(digit_b)
concatenated = keras.layers.concatenate([out_a, out_b])
out = Dense(1, activation='sigmoid')(concatenated)
classification_model = Model([digit_a, digit_b], out)
So, what is the effective difference between Model and layers? Is it just for code readability or does it serve some function?

In Keras, a network is a directed acyclic graph (DAG) of layers. A model is a network with added training and evaluation routines.
The framework allows you to build network DAGs out of both individual layers and other DAGs. The latter is what you're seeing in the example and what seems to be causing the confusion.

The difference is that models can be trained (they have a fit method), while layers do not have such method and need to be part of a Model instance so you can train them. Generally speaking, layers in isolation aren't useful.
The idea of the Functional API to use models inside models is that you can define one model, and reuse its weights as part of another model, in a way that weights are shared. This is not possible with layers alone.

Related

How to visualize a keras neural network with trained weights?

I have created a sequential model using keras package similar to this:
from keras.layers import Dense
from keras.models import Sequential
model = Sequential()
# Adding the input layer and the first hidden layer
model.add(Dense(6, activation='relu',input_dim = 11))
# Adding the second hidden layer (The real model contains many hidden layers)
model.add(Dense(6,activation='relu'))
# Adding the output layer
model.add(Dense( 1, activation = 'sigmoid'))
Then I used keras visualizer to get a visualization of the neural network without weights.
# Compiling the ANN
classifier.compile(optimizer = 'Adamax', loss = 'binary_crossentropy',metrics=['accuracy'])
model_history=classifier.fit(X_train, y_train.to_numpy(), batch_size = 10, epochs = 100)
I want to print trained weights of the model for this kind of visualization. Is there any library or module that I can use for that? Any suggestion will be helpful. Here is the picture of the designed neural network without printing weights.
I want to print trained weights of the model to this kind of visualization. Is there any library or module that I can use for that?
Option1: deepreplay
There is a workaround in the form of package\module so-called Deep Replay you can import as a library for resolving your problem.
Thanks to this package, you can visualize\animate and the most probably print trained weights using the following example:
# install FFMPEG (to generate animations)
#!apt-get install ffmpeg
# install actual deepreplay package
#!pip install deepreplay
from keras.initializers import glorot_normal, glorot_uniform, he_normal, he_uniform
from keras.layers import Dense
from keras.models import Sequential
from deepreplay.callbacks import ReplayData
from deepreplay.datasets.ball import load_data
from deepreplay.plot import compose_plots, compose_animations
from deepreplay.replay import Replay
from matplotlib import pyplot as plt
plt.rcParams['animation.ffmpeg_path'] = '/usr/bin/ffmpeg'
X, y = load_data(n_dims=10)
activation = 'relu'
initializer_name = 'he_uniform'
initializer = eval(initializer_name)(seed=13)
title = 'Activation: ReLU - Initializer: {}'.format(initializer_name)
group_name = 'relu_{}'.format(initializer_name)
filename = f'{group_name}_{initializer_name}_{activation}_weight_initializers.h5'
# Model builder function
def build_model(n_layers, input_dim, units, activation, initializer):
if isinstance(units, list):
assert len(units) == n_layers
else:
units = [units] * n_layers
model = Sequential()
# Adds first hidden layer with input_dim parameter
model.add(Dense(units=units[0],
input_dim=input_dim,
activation=activation,
kernel_initializer=initializer,
name='h1'))
# Adds remaining hidden layers
for i in range(2, n_layers + 1):
model.add(Dense(units=units[i-1],
activation=activation,
kernel_initializer=initializer,
name='h{}'.format(i)))
# Adds output layer
model.add(Dense(units=1, activation='sigmoid', kernel_initializer=initializer, name='o'))
# Compiles the model
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['acc'])
return model
replaydata = ReplayData(X, y, filename=filename, group_name=group_name)
# Create the MLP model with 5 layers within 10 input neurons and 100 unists in hidden and output layers
model = build_model(n_layers=5, input_dim=10, units=100, activation=activation, initializer=initializer)
# fit the model over 10 epochs with batch size of 16
model.fit(X, y, epochs=10, batch_size=16, callbacks=[replaydata])
# Plot the results
replay = Replay(replay_filename=filename, group_name=group_name)
fig = plt.figure(figsize=(12, 6))
ax_zvalues = plt.subplot2grid((2, 2), (0, 0))
ax_weights = plt.subplot2grid((2, 2), (0, 1))
ax_activations = plt.subplot2grid((2, 2), (1, 0))
ax_gradients = plt.subplot2grid((2, 2), (1, 1))
wv = replay.build_weights(ax_weights)
gv = replay.build_gradients(ax_gradients)
# Z-values
zv = replay.build_outputs(ax_zvalues, before_activation=True,
exclude_outputs=True, include_inputs=False)
# Activations
av = replay.build_outputs(ax_activations, exclude_outputs=True, include_inputs=False)
# Save plots
fig = compose_plots([zv, wv, av, gv], epoch=0, title=title)
fig.savefig('part2.png', format='png', dpi=120)
# Animate & save mp4
sample_anim = compose_animations([zv, wv, av, gv])
sample_anim.save('part2.mp4', dpi=120, fps=5)
visulize output results using violin plots over 10 epochs for simple:
So the top right subplot shows Weights change through the layers over ten epochs. The other subplots illustrate the performance of Z-values, Activation functions, and Gradients changes.
Note1: if you are interested to interpret violin plots, please check these posts: post1 , post2, post3
Note2: Please notice that the training process starts with some initializers, which can have different weighing at the beginning. The common initialization schemes are as follows:
Random
Xavier / Glorot
He
By default, kernel initializer is glorot_uniform when you use keras module (reference), but you can check this post and this paper Understanding the difficulty of training deep feedforward neural networks for further info. It is also possible to initialize weights in NN manually. You can check this post.
Note3: Recently, this package has a bug and can't be implemented in Google Colab Notebook, which is still an open issue; its GH Repo as well as post in SoF. So it is better to try it on your own local machine, hopefully.
Option2:wandb
There is another ML-based tool, so-called W&B (Weights and Biases) you can import as a library for resolving your problem.
once you sign-up and login into your account based on instructions, you can use this API to track and visualize all the pieces of your ML pipeline, including Weights and Biases and other parameters in your pipeline:
import wandb
from wandb.keras import WandbCallback
# Step1: Initialize W&B run
wandb.init(project='project_name')
# 2. Save model inputs and hyperparameters
config = wandb.config
config.learning_rate = 0.01
# Model training code here ...
import tensorflow as tf
from tensorflow import keras
loss=tf.keras.losses.MeanSquaredError()
Optimiser=tf.keras.optimizers.Adam(learning_rate =0.001)
model.compile(loss=loss, optimizer=Optimiser, metrics=['accuracy'])
wandb.log({"loss": loss})
# Step 3: Add WandbCallback
model.fit(X, y, epochs=10, batch_size=16, callbacks=[WandbCallback()])
once you run your model, you can check graph info in the Model section which is selected\shown on the left side with blue color:
hope this answer helps you out, and if it is so, you can accept it as an answer ✅.

Adding a Concatenated layer to TensorFlow 2.0 (using Attention)

In building a model that uses TensorFlow 2.0 Attention I followed the example given in the TF docs. https://www.tensorflow.org/api_docs/python/tf/keras/layers/Attention
The last line in the example is
input_layer = tf.keras.layers.Concatenate()(
[query_encoding, query_value_attention])
Then the example has the comment
# Add DNN layers, and create Model.
# ...
So it seemed logical to do this
model = tf.keras.Sequential()
model.add(input_layer)
This produces the error
TypeError: The added layer must be an instance of class Layer.
Found: Tensor("concatenate/Identity:0", shape=(None, 200), dtype=float32)
UPDATE (after #thushv89 response)
What I am trying to do in the end is add an attention layer in the following model which works well (or convert it to an attention model).
model = tf.keras.Sequential()
model.add(layers.Embedding(vocab_size, embedding_nodes, input_length=max_length))
model.add(layers.LSTM(20))
#add attention here?
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='mean_squared_error', metrics=['accuracy'])
My data looks like this
4912,5059,5079,0
4663,5145,5146,0
4663,5145,5146,0
4840,5117,5040,0
Where the first three columns are the inputs and the last column is binary and the goal is classification. The data was prepared similarly to this example with a similar purpose, binary classification. https://machinelearningmastery.com/use-word-embedding-layers-deep-learning-keras/
So, first thing is Keras has three APIs when it comes to creating models.
Sequential - (Which is what you're doing here)
Functional - (Which is what I'm using in the solution)
Subclassing - Creating Python classes to represent custom models/layers
The way the model created in the tutorial is not to be used with sequential models but a model from the Functional API. So you got to do the following. Note that, I've taken the liberty of defining the dense layers with arbitrary parameters (e.g. number of output classes, which you can change as needed).
import tensorflow as tf
# Variable-length int sequences.
query_input = tf.keras.Input(shape=(None,), dtype='int32')
value_input = tf.keras.Input(shape=(None,), dtype='int32')
# ... the code in the middle
# Concatenate query and document encodings to produce a DNN input layer.
input_layer = tf.keras.layers.Concatenate()(
[query_encoding, query_value_attention])
# Add DNN layers, and create Model.
# ...
dense_out = tf.keras.layers.Dense(50, activation='relu')(input_layer)
pred = tf.keras.layers.Dense(10, activation='softmax')(dense_out)
model = tf.keras.models.Model(inputs=[query_input, value_input], outputs=pred)
model.summary()

How to bypass portion of neural network in TensorFlow for some (but not all) features

In my TensorFlow model I have some data that I feed into a stack of CNNs before it goes into a few fully connected layers. I have implemented that with Keras' Sequential model. However, I now have some data that should not go into the CNN and instead be fed directly into the first fully connected layer because that data contains some values and labels that are part of the input data but that data should not undergo convolutions as it is not image data.
Is such a thing possible with tensorflow.keras or should I do that with tensorflow.nn instead? As far as I understand Keras' sequential models is that the input goes in one end and comes out the other with no special wiring in the middle.
Am I correct that to do this I have to use tensorflow.concat on the data from the last CNN layer and the data that bypasses the CNNs before feeding it into the first fully connected layer?
Here is an simple example in which the operation is to sum the activations from different subnets:
import keras
import numpy as np
import tensorflow as tf
from keras.layers import Input, Dense, Activation
tf.reset_default_graph()
# this represents your cnn model
def nn_model(input_x):
feature_maker = Dense(10, activation='relu')(input_x)
feature_maker = Dense(20, activation='relu')(feature_maker)
feature_maker = Dense(1, activation='linear')(feature_maker)
return feature_maker
# a list of input layers, of course the input shapes can be different
input_layers = [Input(shape=(3, )) for _ in range(2)]
coupled_feature = [nn_model(input_x) for input_x in input_layers]
# assume you take the sum of the outputs
coupled_feature = keras.layers.Add()(coupled_feature)
prediction = Dense(1, activation='relu')(coupled_feature)
model = keras.models.Model(inputs=input_layers, outputs=prediction)
model.compile(loss='mse', optimizer='adam')
# example training set
x_1 = np.linspace(1, 90, 270).reshape(90, 3)
x_2 = np.linspace(1, 90, 270).reshape(90, 3)
y = np.random.rand(90)
inputs_x = [x_1, x_2]
model.fit(inputs_x, y, batch_size=32, epochs=10)
You can actually plot the model to gain more intuition
from keras.utils.vis_utils import plot_model
plot_model(model, show_shapes=True)
The model of the above code looks like this
With a little remodeling and the functional API you can:
#create the CNN - it can also be a sequential
cnn_input = Input(image_shape)
cnn_output = Conv2D(...)(cnn_input)
cnn_output = Conv2D(...)(cnn_output)
cnn_output = MaxPooling2D()(cnn_output)
....
cnn_model = Model(cnn_input, cnn_output)
#create the FC model - can also be a sequential
fc_input = Input(fc_input_shape)
fc_output = Dense(...)(fc_input)
fc_output = Dense(...)(fc_output)
fc_model = Model(fc_input, fc_output)
There is a lot of space for creativity, this is just one of the ways.
#create the full model
full_input = Input(image_shape)
full_output = cnn_model(full_input)
full_output = fc_model(full_output)
full_model = Model(full_input, full_output)
You can use any of the three models in any way you want. They share the layers and the weights, so internally they are the same.
Saving and loading the full model might be quirky. I'd probably save the other two separately and when loading create the full model again.
Notice also that if you save two models that share the same layers, after loading they will probably not share these layers anymore. (Another reason for saving/loading only fc_model and cnn_model, while creating full_model again from code)

Build a model with Keras Functional API for Tensorflow LITE

I was testing with Keras Functional API because I need to migrate a model to Tensorflow LITE.
I built a model with 3 inputs and 3 outputs.
The model works if all inputs have the same number of observations. I don't understand that point because the are independent.
ValueError: All input arrays (x) should have the same number of samples. Got array shapes: [(10, 5), (20, 5), (30, 5)
I would like to build a model with several inputs with different number of observations. Is that possible?
import numpy as np
from keras.layers import Input, Dense
from keras.models import Model
capas = 3
data = [ np.random.random(size=(50,5)) for i in range(3)]
labels = [ np.random.random(size=(50,2)) for i in range(3)]
visible=[]
preds=[]
for i in range( capas):
visible.append(Input(shape=(5,)))
x=Dense(5, activation='relu')(visible[i])
x=Dense(10, activation='relu')(x)
preds.append( Dense(2)(x))
model = Model(inputs=visible,output=preds)
model.compile(optimizer='adam',
loss='mean_squared_error',
metrics=['accuracy'])
model.fit(data, labels,epochs=50)
It does not matter if the sub-models are each independent, because if you make a multi-input multi-output model, it is trained by combining (weighting) the losses of each model into a single loss from where gradient descent is performed, and this requires the same number of samples in each input and output.
Since you say that the models are each independent, you can train them independently, and then make a new model that combines the three models (and their trained weights) with multiple inputs and outputs.

Output tensors to a Model must be Keras tensors

I was trying to make a model learned from difference between two model output. So I made code like below. But it occurred error read:
TypeError: Output tensors to a Model must be Keras tensors. Found:
Tensor("sub:0", shape=(?, 10), dtype=float32)
I have found related answer including lambda, but I couldn't solve this issue.
Does anyone know this issue?
It might be seen converting tensor to keras's tensor.
Thx in advance.
from keras.layers import Dense
from keras.models import Model
from keras.models import Sequential
left_branch = Sequential()
left_branch.add(Dense(10, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(10, input_dim=784))
diff = left_branch.output - right_branch.output
model = Model(inputs=[left_branch.input, right_branch.input], outputs=[diff])
model.compile(optimizer='rmsprop', loss='binary_crossentropy', loss_weights=[1.])
model.summary(line_length=150)
It's better to keep all operations done by a layer, do not subtract outputs like that (I wouldn't risk hidden errors for doing things differently from what the documentation expects):
from keras.layers import *
def negativeActivation(x):
return -x
left_branch = Sequential()
left_branch.add(Dense(10, input_dim=784))
right_branch = Sequential()
right_branch.add(Dense(10, input_dim=784))
negativeRight = Activation(negativeActivation)(right_branch.output)
diff = Add()([left_branch.output,negativeRight])
model = Model(inputs=[left_branch.input, right_branch.input], outputs=diff)
model.compile(optimizer='rmsprop', loss='binary_crossentropy', loss_weights=[1.])
When joining models like that, I do prefer using the Model way of doing it, with layers, instead of using Sequential:
def negativeActivation(x):
return -x
leftInput = Input((784,))
rightInput = Input((784,))
left_branch = Dense(10)(leftInput) #Dense(10) creates a layer
right_branch = Dense(10)(rightInput) #passing the input creates the output
negativeRight = Activation(negativeActivation)(right_branch)
diff = Add()([left_branch,negativeRight])
model = Model(inputs=[leftInput, rightInput], outputs=diff)
model.compile(optimizer='rmsprop', loss='binary_crossentropy', loss_weights=[1.])
With this, you can create other models with the same layers, they will share the same weights:
leftModel = Model(leftInput,left_branch)
rightModel = Model(rightInput,right_branch)
fullModel = Model([leftInput,rightInput],diff)
Training one of them will affect the others if they share the same layer.
You can train just the right part in the full model by making left_branch.trainable = False before compiling (or compile again for training), for instance.
I think I solve this, but it might be exact solution.
I added some code like below :
diff = left_branch.output - right_branch.output
setattr(diff, '_keras_history', getattr(right_branch.output, '_keras_history'))
setattr(diff, '_keras_shape', getattr(right_branch.output, '_keras_shape'))
setattr(diff, '_uses_learning_phase', getattr(right_branch.output, '_uses_learning_phase'))
The reason why the error occur is diff tensor doesn't have attr named _keras_history so on. So adding intentionally them to diff tensor could prevent above error. I checked original code ran and being possible to learn.

Categories