I need to initialize all layers of a Sequential model with the weights of another, identically sized Sequential model.
E.g. how would I initialize the weights for every layer of the following Sequential model?
model = tf.keras.Sequential([
    Dense(2000, activation='relu', input_shape=(11,)),
    Dense(1, activation='relu'),
    Dropout(0.5),
    Dense(400, activation='relu'),
    Dropout(0.5),
    Dense(150, activation='relu'),
    BatchNormalization(),
    Dense(y_max+1, activation='softmax')
])
I am fairly new to CNN training and have managed to make the above code work through trial and error and extensive research.
The data type is a list / np.array() of dtype np.float64.
The idea is that I grab the weights from one model (same as above) and load them into another model (also the same as above). I just need to be able to visualize how I can initialize the weights and biases of all layers using the following:
weights = model.get_weights()[0]
biases = model.get_weights()[1]
return weights, biases
I have attempted the model.set_weights() method, but I keep getting the following TypeError from the code below:
if iteration == 1:
    for layer in model.layers:
        layer.set_weights(None, None)

TypeError: set_weights() takes 2 positional arguments but 3 were given
I'd be very appreciative of any help, thank you.
In the Sequential example above, each layer's parameters can be accessed and assigned new weights as shown below:
#example of first layer
model.layers[0]
#weights of the first layer,
model.layers[0].weights #gives the weights of kernel and bias of dense in this case
#assign new_weights by
model.layers[0].kernel.assign(tf.Variable(new_kernel_weights))
model.layers[0].bias.assign(tf.Variable(new_bias_weights))
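To transfer the parameters of every layer from one model into another model built with the exact same architecture, a minimal sketch could look like this (source_model and target_model are placeholder names for two instances of the Sequential model above):
# copy everything in one call: get_weights() returns a flat list of all
# kernels and biases, and set_weights() accepts that same list back
target_model.set_weights(source_model.get_weights())

# or layer by layer, e.g. if only some layers should be copied
for src_layer, dst_layer in zip(source_model.layers, target_model.layers):
    dst_layer.set_weights(src_layer.get_weights())
Note that layer.set_weights() takes a single argument, a list of arrays (for a Dense layer that is [kernel, bias]), which is why layer.set_weights(None, None) in the question raises the TypeError about too many positional arguments.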
I am using an LSTM for fake news detection and added an embedding layer to my model.
It is working fine without adding any input_shape in the LSTM function, but I thought the input_shape parameter was mandatory. Could someone help me with why there is no error even without defining input_shape? Is it because the embedding layer implicitly defines the input_shape?
Following is the code:
model = Sequential()
embedding_layer = Embedding(total_words, embedding_dim, weights=[embedding_matrix], input_length=max_length)
model.add(embedding_layer)
model.add(LSTM(64))
model.add(Dense(1, activation='sigmoid'))
opt = SGD(learning_rate=0.01, decay=1e-6)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=['accuracy'])
model.fit(data, train['label'], epochs=30, verbose=1)
You only need to provide an input_length to the Embedding layer. Furthermore, if you use a Sequential model, you do not need to provide an input layer. Avoiding an input layer essentially means that your model's weights are only created when you pass real data, as you did in model.fit(*). If you wanted to see the weights of your model before providing real data, you would have to define an input layer before your Embedding layer like this:
embedding_input = tf.keras.layers.Input(shape=(max_length,))
And yes, as you mentioned, your model infers the input_shape implicitly when you provide the real data. Your LSTM layer does not need an input_shape as it is also derived based on the output of your Embedding layer. If the LSTM layer were the first layer of your model, it would be best to specify an input_shape for clarity. For example:
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(10, 5)))
model.add(tf.keras.layers.Dense(1))
where 10 represents the number of time steps and 5 the number of features. In your example, the input to the LSTM layer has the shape (max_length, embedding_dim). Also here, if you do not specify the input_shape, your model will infer the shape based on your input data.
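If you want to inspect the weights before calling fit, a small sketch (reusing total_words, embedding_dim, embedding_matrix and max_length from the question) could be:
model = Sequential()
model.add(tf.keras.layers.Input(shape=(max_length,)))  # explicit input, so weights get built immediately
model.add(Embedding(total_words, embedding_dim, weights=[embedding_matrix], input_length=max_length))
model.add(LSTM(64))
model.add(Dense(1, activation='sigmoid'))

# the weights exist already, no training data needed
model.summary()
print(model.layers[0].get_weights()[0].shape)  # (total_words, embedding_dim)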
For more information check out the Keras documentation.
I have a simple neural network and I need to get the weights and biases from the model. I have tried a few approaches discussed before, but I keep getting an index-out-of-range error. I'm not sure how to fix this, or what I'm missing.
Network-
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation=tf.nn.relu),
keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.layers[0].get_weights()[1]
Error -
IndexError: list index out of range
This is what has been suggested in a few other questions, but I end up getting the out-of-range error for it.
I have another question: does the index after model.layers[] correspond to the layer?
For instance, does model.layers[1] give the weights of the second layer, something like that?
I've been there; I have been looking at my old code to see if I could remember how I solved that issue.
What I did was print the length of model.layers[index].get_weights()[X] to figure out where Keras was saving the weights I needed.
In my old code, model.layers[0].get_weights()[1] would return the biases, while model.layers[0].get_weights()[0] would return the actual weights.
In any case, take into account that there are layers whose weights aren't saved (as they don't have any), so if asking for model.layers[0].get_weights()[0] doesn't work, try model.layers[1].get_weights()[1]; I'm not sure about Flatten layers, but I do know that Dense layers do save their weights.
The first layer (index 0) in your model is a Flatten layer, which does not have any weights; that's why you get the error.
To get the Dense layer, which is the second layer, you have to use index 1:
model.layers[1].get_weights()[1]
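If you want to see which layers actually carry parameters, a quick sketch (using the model from the question) is to loop over model.layers and print the weight shapes:
for i, layer in enumerate(model.layers):
    shapes = [w.shape for w in layer.get_weights()]
    print(i, layer.name, shapes)

# prints something like:
# 0 flatten []                    <- Flatten has no weights
# 1 dense [(784, 128), (128,)]    <- kernel and bias
# 2 dense_1 [(128, 10), (10,)]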
Just call model.get_weights(); it will return all the weights and biases of your model.
To get the weights and biases of a Keras Sequential model at every epoch, you can do it as in the following example:
# create model
model = Sequential()
model.add(Dense(numHiddenNeurons, activation="tanh", input_dim=4, kernel_initializer="uniform"))
model.add(Dense(1, activation="linear", kernel_initializer="uniform"))

# compile model
model.compile(loss='mse', optimizer='adam', metrics=['accuracy', 'mse', 'mae', 'mape'])

# record the full set of weights and biases at the end of every epoch
weightsBiasDict = {}
weightAndBiasCallback = tf.keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs: weightsBiasDict.update({epoch: model.get_weights()}))

# fit the model
history = model.fit(X1, Y1, epochs=numIterations, batch_size=batch_size, verbose=0,
                    callbacks=[weightAndBiasCallback])
The weights and biases are then accessible for every epoch in the dictionary weightsBiasDict.
If you just need the weight and bias values at the end of training, you can use
model.layers[index].get_weights()[0] for the weights
and
model.layers[index].get_weights()[1] for the biases,
where index is the layer number in your network, starting at zero for the first layer.
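For example, to look at the first Dense layer's kernel and bias recorded after a given epoch, a small sketch using the weightsBiasDict built above (model.get_weights() returns a flat list, so indices 0 and 1 belong to the first layer):
epoch = 0
kernel = weightsBiasDict[epoch][0]   # kernel of the first Dense layer at that epoch
bias = weightsBiasDict[epoch][1]     # its bias vector
print(kernel.shape, bias.shape)      # (4, numHiddenNeurons) and (numHiddenNeurons,)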
I'm trying to get to grips with the basics of neural networks and am struggling to understand keras layers.
Take the following code from tensorflow's tutorials:
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(128, activation=tf.nn.relu),
keras.layers.Dense(10, activation=tf.nn.softmax)
])
So this network has 3 layers? The first is just the 28*28 nodes representing the pixel values. The second is a hidden layer which takes weighted sums from the first, applies relu and then sends these to 10 output nodes which are softmaxed?
But then this model seems to specify the layers' inputs differently:
model = keras.Sequential([
layers.Dense(64, activation=tf.nn.relu, input_shape=[len(train_dataset.keys())]),
layers.Dense(64, activation=tf.nn.relu),
layers.Dense(1)
])
Why does the input layer now have both an input_shape and a value 64? I read that the first parameter specifies the number of nodes in the second layer, but that doesn't seem to fit with the code in the first example. Also, why does the input layer have an activation? Is this just relu-ing the values before they enter the network?
Also, with regards activation functions, why are softmax and relu treated as alternatives? I thought relu applied to all the inputs of a single node, whereas softmax acted on the outputs of all the nodes across a layer?
Any help is really appreciated!
First example is from: https://www.tensorflow.org/tutorials/keras/basic_classification
Second example is from: https://www.tensorflow.org/tutorials/keras/basic_regression
Basically you have two types of API in Keras: the Sequential and the Functional API (https://keras.io/getting-started/sequential-model-guide/).
With the Sequential API, you don't explicitly define an Input layer (https://keras.io/layers/core/#input).
That is why you need to add an input_shape to specify the dimensions of the first layer.
More information: https://jovianlin.io/keras-models-sequential-vs-functional/
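For example, these two definitions build the same network; in the Sequential API the input shape is attached to the first layer, while in the functional API it lives in an explicit Input object. The 64 is simply the number of units in that Dense layer, not the input size. A sketch, assuming 9 input features as a stand-in for len(train_dataset.keys()):
from tensorflow import keras
from tensorflow.keras import layers

n_features = 9  # hypothetical stand-in for len(train_dataset.keys())

# Sequential API: input_shape goes on the first layer
seq_model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(n_features,)),  # 64 units in this layer
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
])

# Functional API: the input is an explicit object
inputs = keras.Input(shape=(n_features,))
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dense(64, activation='relu')(x)
outputs = layers.Dense(1)(x)
func_model = keras.Model(inputs, outputs)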
I am trying to merge 2 Sequential models in Keras. Here is the code:
model1 = Sequential(layers=[
# input layers and convolutional layers
Conv1D(128, kernel_size=12, strides=4, padding='valid', activation='relu', input_shape=input_shape),
MaxPooling1D(pool_size=6),
Conv1D(256, kernel_size=12, strides=4, padding='valid', activation='relu'),
MaxPooling1D(pool_size=6),
Dropout(.5),
])
model2 = Sequential(layers=[
# input layers and convolutional layers
Conv1D(128, kernel_size=20, strides=5, padding='valid', activation='relu', input_shape=input_shape),
MaxPooling1D(pool_size=5),
Conv1D(256, kernel_size=20, strides=5, padding='valid', activation='relu'),
MaxPooling1D(pool_size=5),
Dropout(.5),
])
model = merge([model1, model2], mode = 'sum')
Flatten(),
Dense(256, activation='relu'),
Dropout(.5),
Dense(128, activation='relu'),
Dropout(.35),
# output layer
Dense(5, activation='softmax')
return model
Here is the error log:
File "/nics/d/home/dsawant/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 392, in is_keras_tensor
    raise ValueError('Unexpectedly found an instance of type ' + str(type(x)) + '. ')
ValueError: Unexpectedly found an instance of type <class 'keras.models.Sequential'>. Expected a symbolic tensor instance.
Some more log:
ValueError: Layer merge_1 was called with an input that isn't a symbolic tensor. Received type: <class 'keras.models.Sequential'>. Full input: [<keras.models.Sequential object at 0x2b32d518a780>, <keras.models.Sequential object at 0x2b32d521ee80>]. All inputs to the layer should be tensors.
How can I merge these 2 Sequential models that use different window sizes and apply functions like 'max', 'sum' etc to them?
Using the functional API gives you all the possibilities.
When using the functional API, you need to keep track of inputs and outputs instead of just defining layers.
You define a layer, then you call the layer with an input tensor to get the output tensor. Models and layers can be called in exactly the same way.
For the merge layer, I prefer using other merge layers that are more intuitive, such as Add(), Multiply() and Concatenate() for instance.
from keras.layers import *
mergedOut = Add()([model1.output,model2.output])
#Add() -> creates a merge layer that sums the inputs
#The second parentheses "calls" the layer with the output tensors of the two models
#it will demand that both model1 and model2 have the same output shape
The same idea applies to all the following layers: we keep updating the output tensor, passing it to each layer and getting a new output (if we were interested in creating branches, we would use a different variable for each output of interest to keep track of them):
mergedOut = Flatten()(mergedOut)
mergedOut = Dense(256, activation='relu')(mergedOut)
mergedOut = Dropout(.5)(mergedOut)
mergedOut = Dense(128, activation='relu')(mergedOut)
mergedOut = Dropout(.35)(mergedOut)
# output layer
mergedOut = Dense(5, activation='softmax')(mergedOut)
Now that we have created the "path", it's time to create the Model. Creating the model is just a matter of telling it at which input tensors it starts and where it ends:
from keras.models import Model
newModel = Model([model1.input,model2.input], mergedOut)
#use lists if you want more than one input or output
Notice that since this model has two inputs, you have to train it with two different X_training vars in a list:
newModel.fit([X_train_1, X_train_2], Y_train, ....)
Now, suppose you wanted only one input, and both model1 and model2 would take the same input.
The functional API allows that quite easily by creating an input tensor and feeding it to the models (we call the models as if they were layers):
commonInput = Input(input_shape)
out1 = model1(commonInput)
out2 = model2(commonInput)
mergedOut = Add()([out1,out2])
In this case, the Model would consider this input:
oneInputModel = Model(commonInput,mergedOut)
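As a short usage note, the shared-input model is then trained with a single input array instead of a list of two. A sketch (Y_train has to match whatever output shape mergedOut ends up with, e.g. after appending the same Flatten/Dense stack shown earlier):
oneInputModel.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
oneInputModel.fit(X_train, Y_train, epochs=10, batch_size=32)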
Suppose I have trained the model below for an epoch:
model = Sequential([
Dense(32, input_dim=784), # first number is output_dim
Activation('relu'),
Dense(10), # output_dim, input_dim is taken for granted from above
Activation('softmax'),
])
And I have the weights dense1_w and biases dense1_b of the first hidden layer (let's call it dense1) and a single data sample sample.
How do I use these to get the output of dense1 on the sample in keras?
Thanks!
The easiest way is to use the keras backend. With the keras backend you can define a function that gives you the intermediate output of a keras model as defined here (https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer).
So in essence:
from keras import backend as K

get_1st_layer_output = K.function([model.layers[0].input],
                                  [model.layers[1].output])
layer_output = get_1st_layer_output([X])
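An equivalent approach, without the backend function, is to wrap the intermediate output in a new Model and call predict on it. A sketch, assuming sample is a single vector of length 784 (model.layers[1] is the relu Activation after the first Dense layer):
from keras.models import Model
import numpy as np

dense1_model = Model(inputs=model.input, outputs=model.layers[1].output)
layer_output = dense1_model.predict(np.asarray(sample).reshape(1, 784))  # output of dense1 after relu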
Just recreate the first part of the model up until the layer for which you would like the output (in your case only the first dense layer). Afterwards you can load the trained weights of the first part into your newly created model and compile it.
The output of the prediction with this new model will be the output of the layer (in your case the first dense layer).
from keras.models import Sequential
from keras.layers import Dense, Activation
import numpy as np
model = Sequential([
Dense(32, input_dim=784), # first number is output_dim
Activation('relu'),
Dense(10), # output_dim, input_dim is taken for granted from above
Activation('softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
# create some random toy data: 5 samples with one-hot labels
n_samples = 5
samples = np.random.randint(0, 10, 784 * n_samples).reshape(-1, 784)
labels = np.eye(10)[np.random.randint(0, 10, n_samples)]  # one-hot targets for categorical_crossentropy
#train your sample model
model.fit(samples, labels)
#create new model
new_model= Sequential([
Dense(32, input_dim=784), # first number is output_dim
Activation('relu')])
#set weights of the first layer
new_model.set_weights(model.layers[0].get_weights())
#compile it after setting the weights
new_model.compile(optimizer='adam', loss='categorical_crossentropy')
#get output of the first dense layer
output = new_model.predict(samples)
As for the weights, I had a non-Sequential model. What I did was use model.summary() to get the desired layer's name and then model.get_layer("layer_name").get_weights() to get the weights.
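For example (a short sketch; the layer name "dense_2" is hypothetical, check model.summary() for the actual names in your model):
model.summary()                                    # lists every layer with its name
w, b = model.get_layer("dense_2").get_weights()    # hypothetical layer name
print(w.shape, b.shape)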