Keras Functional API issue with Input layer and first LSTM layer - python

I am trying to build a model with the Functional API instead of the Sequential API. I previously built the model using the Sequential API and it worked just fine. It is an LSTM, and I am having trouble getting the batch_size from the Input layer to the first LSTM layer. The Sequential version was built as follows:
new_model = Sequential()
new_model.add(LSTM(n_neurons, batch_input_shape=(batch_size,train_X.shape[1], train_X.shape[2]), activation='tanh', stateful=True, return_sequences=True))
new_model.add(Dropout(0))
new_model.add(LSTM(n_neurons, batch_input_shape=(batch_size,train_X.shape[1], train_X.shape[2]), activation='tanh', stateful=True))
new_model.add(Dropout(0))
new_model.add(Dense(n_neurons1, activation='tanh'))
new_model.add(Dropout(0.1))
new_model.add(Dense(nm))
new_model.compile(loss='mse', optimizer=optimizer)
The above snippet works fine. The Functional API version I am trying to get working is as follows:
inp = Input(shape = (train_X.shape[1], train_X.shape[2]), batch_size = batch_size)
L1 = LSTM(n_neurons, batch_input_shape=(batch_size,train_X.shape[1], train_X.shape[2]), activation='tanh', stateful=True, return_sequences=True)(inp)
D1 = Dropout(0)(L1)
L2 = LSTM(n_neurons, batch_input_shape=(batch_size,train_X.shape[1], train_X.shape[2]), activation='tanh', stateful=True, return_sequences=True)(D1)
D2 = Dropout(0)(L2)
F1 = Dense(n_neurons1, activation='tanh')(D2)
D3 = Dropout(0.1)(F1)
out = Dense(nm)
new_model = Model(inp,out)
new_model.compile(loss='mse', optimizer=optimizer)
I get an error saying "Input() got an unexpected keyword argument 'batch_size'", even though I know batch_size is an argument for the Input layer. Then, if I remove the argument, I get an error from the first LSTM layer saying:
"If a RNN is stateful, it needs to know its batch size. Specify the batch size of your input tensors:
If using a Sequential model, specify the batch size by passing a batch_input_shape argument to your first layer.
If using the functional API, specify the batch size by passing a batch_shape argument to your Input layer."
I have already tried updating TensorFlow, but that did not fix the Input() issue. Where do I go from here?

You describe passing a batch_size parameter via the functional API and getting an error suggesting "passing a batch_shape argument to your Input layer."
If you try changing batch_size = batch_size in your input layer to
batch_shape = (batch_size,train_X.shape[1], train_X.shape[2])
does that solve it?
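For reference, here is a minimal sketch of how the functional model above could look with that change applied (note that the final Dense layer also needs to be called on D3, which the snippet in the question omits); with batch_shape set on the Input layer, the batch_input_shape arguments on the LSTM layers should no longer be needed:
inp = Input(batch_shape=(batch_size, train_X.shape[1], train_X.shape[2]))
L1 = LSTM(n_neurons, activation='tanh', stateful=True, return_sequences=True)(inp)
D1 = Dropout(0)(L1)
L2 = LSTM(n_neurons, activation='tanh', stateful=True, return_sequences=True)(D1)
D2 = Dropout(0)(L2)
F1 = Dense(n_neurons1, activation='tanh')(D2)
D3 = Dropout(0.1)(F1)
out = Dense(nm)(D3)  # called on D3 so the final Dense layer is wired into the graph
new_model = Model(inp, out)
new_model.compile(loss='mse', optimizer=optimizer)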

Related

Need an Example of tf.keras.Sequential() Weight Initialization

I need to see how I would initialize all layers of a Sequential model with the weights from a same-sized Sequential model.
E.g., how would I initialize the weights for every layer of the following Sequential model?
model = tf.keras.Sequential([
    Dense(2000, activation='relu', input_shape=(11,)),
    Dense(1, activation='relu'),
    Dropout(0.5),
    Dense(400, activation='relu'),
    Dropout(0.5),
    Dense(150, activation='relu'),
    BatchNormalization(),
    Dense(y_max+1, activation='softmax')
])
I am fairly new to CNN training and have managed to make the above code work through trial and error and extensive research.
The data type is a list and np.array() of dtype np.float64.
The idea is that I grab the weights from one model (same as above) and return it to another model (also same as above). I just need to be able to visualize how I can initialize the weights and biases of all layers using the following:
weights = model.get_weights()[0]
biases = model.get_weights()[1]
return weights, biases
I have attempted the model.set_weights() method, but the following code keeps producing the TypeError shown after it:
if iteration == 1:
    for layer in model.layers:
        layer.set_weights(None, None)
TypeError: set_weights() takes 2 positional arguments but 3 were given
I'd be very appreciative of any help, thank you.
In the Sequential example above, each layer's parameters can be accessed and assigned new weights as shown below:
# example: access the first layer
model.layers[0]
# weights of the first layer (the kernel and bias of the Dense layer in this case)
model.layers[0].weights
# assign new weights
model.layers[0].kernel.assign(tf.Variable(new_kernel_weights))
model.layers[0].bias.assign(tf.Variable(new_bias_weights))
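If the goal is simply to initialize every layer of one model with the weights of another, identically built model, a minimal sketch (assuming source_model and target_model are two instances of the Sequential model above) could look like this:
# copy every kernel and bias in one call; both models must have identical architectures
target_model.set_weights(source_model.get_weights())
# or layer by layer, mirroring the loop in the question
for src_layer, dst_layer in zip(source_model.layers, target_model.layers):
    dst_layer.set_weights(src_layer.get_weights())
Note that set_weights() expects a single list of arrays, which is why layer.set_weights(None, None) raises the TypeError above.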

is keras LSTM supposed to work without an input_shape parameter?

I am using an LSTM for fake news detection and added an embedding layer to my model.
It is working fine without adding any input_shape in the LSTM function, but I thought the input_shape parameter was mandatory. Could someone help me with why there is no error even without defining input_shape? Is it because the embedding layer implicitly defines the input_shape?
Following is the code:
model = Sequential()
embedding_layer = Embedding(total_words, embedding_dim, weights=[embedding_matrix], input_length=max_length)
model.add(embedding_layer)
model.add(LSTM(64))
model.add(Dense(1, activation='sigmoid'))
opt = SGD(learning_rate=0.01, decay=1e-6)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=['accuracy'])
model.fit(data, train['label'], epochs=30, verbose=1)
You only need to provide an input_length to the Embedding layer. Furthermore, if you use a Sequential model, you do not need to provide an input layer. Omitting an input layer essentially means that your model's weights are only created when you pass real data, as you did in model.fit(...). If you wanted to see the weights of your model before providing real data, you would have to define an input layer before your Embedding layer like this:
embedding_input = tf.keras.layers.Input(shape=(max_length,))
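For illustration, here is a minimal sketch (with made-up values for max_length, total_words and embedding_dim rather than the question's actual ones) of how an explicit Input layer causes the weights to be created before any data is passed:
import tensorflow as tf
max_length, total_words, embedding_dim = 100, 5000, 64  # assumed example values
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(max_length,)),
    tf.keras.layers.Embedding(total_words, embedding_dim),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.summary()  # the weights already exist here, before model.fit() is ever called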
And yes, as you mentioned, your model infers the input_shape implicitly when you provide the real data. Your LSTM layer does not need an input_shape, as it is derived from the output of your Embedding layer. If the LSTM layer were the first layer of your model, it would be best to specify an input_shape for clarity. For example:
model = tf.keras.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=(10, 5)))
model.add(tf.keras.layers.Dense(1))
where 10 represents the number of time steps and 5 the number of features. In your example, the input to the LSTM layer has the shape (max_length, embedding_dim). Here too, if you do not specify the input_shape, your model will infer the shape from your input data.
For more information, check out the Keras documentation.

LSTM Text generation Input_shape

I am attempting to add another LSTM layer to my model but I am only a beginner and I am not very good. I am using the (Better) - Donal Trump Tweets! dataset on Kaggle for LSTM text generation.
I am struggling to get it to run as it returns an Error:
<ValueError: Input 0 of layer lstm_16 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 128]>
My model is:
print('Building model...')
model2 = Sequential()
model2.add(LSTM(128, input_shape=(maxlen, len(chars)),return_sequences=True))
model2.add(Dropout(0.5))
model2.add(LSTM(128))
model2.add(Dropout(0.5))
model2.add(LSTM(128))
model2.add(Dropout(0.2))
model2.add(Dense(len(chars), activation='softmax'))
# optimizer = RMSprop(lr=0.01)
optimizer = Adam()
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
print('model built')
The model works with only two LSTM layers, two Dropout layers, and one Dense layer. I think something is wrong with my setup for input_shape, but I could be wrong. My model is based on a notebook for the above dataset.
In order to stack RNNs, you will have to use return_sequences=True.
From the error you can see that the layer was expecting a 3-dimensional tensor but received a 2-dimensional one. Here you can read that the return_sequences=True flag makes the layer output a 3-dimensional tensor.
If True, the full sequences of successive outputs for each timestep are returned (a 3D tensor of shape (batch_size, timesteps, output_features)).
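As a quick illustration (using made-up sizes, not the question's data), the difference in output rank can be seen like this:
import tensorflow as tf
x = tf.zeros((1, 40, 57))  # (batch_size, timesteps, features)
print(tf.keras.layers.LSTM(128)(x).shape)                         # (1, 128): 2-D, cannot feed another LSTM
print(tf.keras.layers.LSTM(128, return_sequences=True)(x).shape)  # (1, 40, 128): 3-D, stackable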
Assuming there are no issues with your input layer and the input data is passed in correctly, I propose trying the following model.
print('Building model...')
model2 = Sequential()
model2.add(LSTM(128, input_shape=(maxlen, len(chars)),return_sequences=True))
model2.add(Dropout(0.5))
model2.add(LSTM(128, return_sequences=True))
model2.add(Dropout(0.5))
model2.add(LSTM(128))
model2.add(Dropout(0.2))
model2.add(Dense(len(chars), activation='softmax'))
# optimizer = RMSprop(lr=0.01)
optimizer = Adam()
model2.compile(loss='categorical_crossentropy', optimizer=optimizer)
print('model built')

Input 0 is incompatible with layer conv2d_11: expected ndim=4, found ndim=3 (Keras)

I'm attempting to train a genetic algorithm using Keras Dense layers. The error I am currently experiencing is:
ValueError: Input 0 is incompatible with layer conv2d_13: expected ndim=4, found ndim=3
however the individual layers are:
model.add(Dense(nb_neurons,activation=activation,input_shape=(x_train.shape,input_shape,1)))
The full block of code that is failing is:
embedding_layer = Embedding(num_words,
                            EMBEDDING_DIM,
                            embeddings_initializer=self.Constant(embedding_matrix),
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
# Add each layer.
for i in range(nb_layers):
    # Need input shape for first layer.
    if i == 0:
        model.add(embedded_sequences)
        model.add(Dense(nb_neurons, activation=activation, input_shape=(x_train.shape, input_shape, 1)))
    else:
        model.add(Dense(nb_neurons, activation=activation, input_shape=(x_train.shape, input_shape, 1)))
    model.add(Dropout(0.2))  # hard-coded dropout
# Output layer.
model.add(Dense(nb_classes, activation='softmax', input_shape=(x_train.shape, input_shape, 1)))
model.compile(loss='categorical_crossentropy', optimizer=optimizer,
              metrics=['accuracy'])
I've deliberately set the input shape on each layer in an attempt to prevent this; however, it still produces this error. Any help identifying the cause of the error would be greatly appreciated.

keras - model won't work after pop()

I have taken a standard ResNet50 model:
model = keras.applications.resnet50.ResNet50(include_top=False,
                                             weights='imagenet',
                                             classes=10,
                                             input_shape=(224, 224, 3))
And added several dense layers of my own:
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(2, activation='softmax'))
model = Model(input=model.input, output=top_model(model.output))
This way it works great; however, when I want to delete the last Dense and Dropout layers with model.pop(), Keras won't work well:
model.layers[-1].layers
[<keras.layers.core.Flatten at 0x16b5c00b8>,
<keras.layers.core.Dense at 0x16b5c0320>,
<keras.layers.core.Dropout at 0x16b5c02e8>,
<keras.layers.core.Dense at 0x16b5c0d68>]
model.layers[-1].pop()
model.layers[-1].pop()
model.layers[-1].layers
[<keras.layers.core.Flatten at 0x1ae6e5940>,
<keras.layers.core.Dense at 0x1ae6e9e10>]
model.layers[-1].outputs = [model.layers[-1].layers[-1].output]
model.outputs = model.layers[-1].outputs
model.layers[-1].layers[-1].outbound_nodes = []
Then I just compile the model and when trying to predict, I get an error:
You must feed a value for placeholder tensor 'flatten_7_input_12' with dtype float and shape [?,1,1,2048]
model.pop() takes care of all the underlying settings, including setting the model.output to the output of the new last layer. Therefore, you don't need to handle anything regarding the outputs.
Please also note that you are assigning to the model variable; thus, model.outputs is already referring to the extended model's output.
Here is sample code that works fine on Keras 2.0.6 with the TensorFlow backend (1.4.0):
import keras
from keras.models import Sequential, Model
from keras.layers import *
import numpy as np

model = keras.applications.resnet50.ResNet50(include_top=False,
                                             weights='imagenet',
                                             classes=10,
                                             input_shape=(224, 224, 3))

top_model = Sequential()
top_model.add(keras.layers.Flatten(input_shape=model.output_shape[1:]))
top_model.add(keras.layers.Dense(256, activation='relu'))
top_model.add(keras.layers.Dropout(0.5))
top_model.add(keras.layers.Dense(2, activation='softmax'))

model_extended = Model(input=model.input, output=top_model(model.output))

model_extended.layers[-1].pop()
model_extended.layers[-1].pop()

model_extended.compile(optimizer='rmsprop',
                       loss='categorical_crossentropy',
                       metrics=['accuracy'])

model_extended.predict(np.zeros((1, 224, 224, 3)))
