I have 5 LSTM layers and 2 MLPs that must be concatenated into another MLP that produces the final output. Here is the code I wrote using the functional API approach, which works fine:
lstm_input = Input(shape=(X_dynamic_LSTM.shape[1], X_dynamic_LSTM.shape[2]))
x = LSTM(70, activation='tanh', return_sequences=True)(lstm_input)
x = Dropout(0.3)(x)
x = Dense(1, activation='tanh')(x)
mlp_input = Input(shape=(X_static_MLP.shape[1],))
mlp = Dense(30, activation='relu')(mlp_input)
mlp = Dense(10, activation='relu')(mlp)
merge = Concatenate()([x, mlp])
hidden1 = Dense(5, activation='relu')(merge)
mlp_out = Dense(1, activation='relu')(hidden1)
model = Model(inputs=[lstm_input, mlp_input], outputs=mlp_out)
model.compile(loss='mae', optimizer='Adam')
history = model.fit([X_dynamic_LSTM, X_static_MLP], y_train, batch_size=20,
                    epochs=10, validation_split=0.2)
Now I want to convert this to a format similar to the one below:
x = Sequential()
x.add(LSTM(70, return_sequences=True))
x.add(Dropout(0.3))
x.add(Dense(1, activation='tanh'))
Can anyone help me with how to define the MLP, the Concatenate, and the part corresponding to model = Model(inputs=[lstm_input, mlp_input], outputs=mlp_out)?
My main problem is that I want to add an Embedding layer to the LSTM. When I add the following code to the non-API approach, the model works perfectly:
x.add(Embedding(X_dynamic_LSTM.shape[0], 1,mask_zero=True))
But when I instead used
lstm_input = Embedding(X_dynamic_LSTM.shape[0], 1,mask_zero=True)
it gave me the error TypeError: Inputs to a layer should be tensors, so I have to stick with the non-API approach.
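For reference, in the functional API the Embedding layer has to be called on an Input tensor rather than used in place of one. A minimal sketch (assuming X_dynamic_LSTM holds integer-encoded sequences, as in the non-API version):
# Sketch only: Embedding is a layer applied to an Input tensor, not a replacement for Input
lstm_input = Input(shape=(X_dynamic_LSTM.shape[1],))
x = Embedding(X_dynamic_LSTM.shape[0], 1, mask_zero=True)(lstm_input)
x = LSTM(70, activation='tanh', return_sequences=True)(x)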
I am trying to code a 5-class classifier ANN, and this code returns the following error:
classifier = Sequential()
classifier.add(Dense(units=10, input_dim=14, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=5, kernel_initializer='uniform', activation='softmax'))
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
RD_Model = classifier.fit(X_train, y_train, batch_size=10, epochs=10, verbose=1)
File "c:\Program Files\Python310\lib\site-packages\keras\backend.py", line 5119, in categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
ValueError: Shapes (None, 1) and (None, 5) are incompatible
I figured this is caused by having a probability matrix instead of an actual output, so I have been trying to apply an argmax, but haven't figured out a way.
Can someone help me out?
Have you tried applying:
tf.keras.backend.argmax()
You can define a Lambda layer using the following:
from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K

def argmax_layer(input):
    return K.argmax(input, axis=-1)
Keras provides two paradigms for defining a model topology.
Your code uses the Sequential API. You might have to revert to the Functional API:
input_layer = Input(shape=(14,))
layer_1 = Dense(10, activation="relu")(input_layer)
layer_2 = Dense(6, activation="relu")(layer_1)
layer_3 = Lambda(argmax_layer)(layer_2)
output_layer = Dense(5, activation="linear")(layer_3)
model = Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer='adam',
              loss='categorical_crossentropy', metrics=['accuracy'])
Another option would be to subclass the Keras Layer class:
https://www.tutorialspoint.com/keras/keras_customized_layer.htm
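For illustration, a minimal custom-layer sketch (my own example, not from the tutorial; note that argmax is not differentiable, so no gradients flow through it during training):
import tensorflow as tf

class ArgmaxLayer(tf.keras.layers.Layer):
    def call(self, inputs):
        # reduce the class-probability axis to a single integer index
        return tf.argmax(inputs, axis=-1)

# usage: layer_3 = ArgmaxLayer()(layer_2)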
As Dr. Snoopy mentioned, it was indeed a problem with one-hot encoding. I had missed that step, which resulted in my model not working.
So I just one-hot encoded it:
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils

encoder = LabelEncoder()
encoder.fit(y_train)
encoded_Y = encoder.transform(y_train)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)
And it worked after using dummy_y. Thank you for your help.
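Alternatively, the same shape mismatch can be avoided without one-hot encoding by switching to the sparse variant of the loss, which accepts integer labels directly (a minimal sketch using the encoded_Y labels from above):
# Sparse-loss sketch: integer labels, no to_categorical needed
classifier.compile(optimizer='adam',
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
classifier.fit(X_train, encoded_Y, batch_size=10, epochs=10, verbose=1)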
I am trying to classify images. Those images have different shapes, but this is not a problem.
However, I am trying to create a dataset using the tf.data.Dataset.from_generator function provided by Tensorflow and I have the feeling that something is not working as it should.
Here is the code:
filenames_ds = tf.data.Dataset.from_tensor_slices(categ_img[:1000]['image_name'])
labels_ds = tf.data.Dataset.from_tensor_slices(categ_img[:1000]['category_label'])
images_ds = filenames_ds.map(lambda x: tf.image.decode_jpeg(tf.read_file(x)))
labels_ds = labels_ds.map(lambda x: tf.one_hot(x, NUM_CATEGORIES))
ds = tf.data.Dataset.zip((images_ds, labels_ds)).batch(1)
I also tried to create the labels_ds like this:
labels_ds.map(lambda x: tf.expand_dims(tf.one_hot(x, NUM_CATEGORIES), axis=0))
categ_img is a pandas.DataFrame containing image paths and labels under the image_name and category_label columns respectively.
And I keep getting this error:
InvalidArgumentError: logits and labels must have the same first dimension, got logits shape [1,50] and labels shape [50]
My model is based on a pretrained ResNet model provided by Keras:
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(None, None, 3))
for layer in base_model.layers:
    layer.trainable = False

x = base_model.output
x = GlobalAveragePooling2D()(x)

for fc in FC_LAYERS:
    x = Dense(fc, activation='relu')(x)
    x = Dropout(DROPOUT)(x)
output = Dense(NUM_CATEGORIES, activation='softmax', name='fully-connected')(x)
model = Model(inputs=base_model.input, outputs=output)
optimizer = tf.keras.optimizers.SGD(lr=LEARNING_RATE)
cce = tf.keras.losses.CategoricalCrossentropy()
model.compile(optimizer, loss=cce)
return model
It is trained like this:
model_classification.fit(
    ds,
    epochs=epochs,
    steps_per_epoch=steps
)
Which seems pretty straightforward to me.
Any help would be appreciated.
Thank you.
I finally tried something that worked.
Here is the line you need to change:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
I don't know why, but this made things work.
print("Building model...")
ques1_enc = Sequential()
ques1_enc.add(Embedding(output_dim=64, input_dim=vocab_size, weights=[embedding_weights], mask_zero=True))
ques1_enc.add(LSTM(100, input_shape=(64, seq_maxlen), return_sequences=False))
ques1_enc.add(Dropout(0.3))
ques2_enc = Sequential()
ques2_enc.add(Embedding(output_dim=64, input_dim=vocab_size, weights=[embedding_weights], mask_zero=True))
ques2_enc.add(LSTM(100, input_shape=(64, seq_maxlen), return_sequences=False))
ques2_enc.add(Dropout(0.3))
model = Sequential()
model.add(Merge([ques1_enc, ques2_enc], mode="sum"))
model.add(Dense(2, activation="softmax"))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
print("Building model costs:", time.time() - start)
print("Training...")
checkpoint = ModelCheckpoint(filepath=os.path.join("C:/Users/", "quora_dul_best_lstm.hdf5"), verbose=1, save_best_only=True)
model.fit([x_ques1train, x_ques2train], ytrain, batch_size=32, epochs=1, validation_split=0.1, verbose=2, callbacks=[checkpoint])
print("Training neural network costs:", time.time() - start)
I want to convert the above code into the functional API in Keras, since the Merge() function is not supported in the sequential API. I have been trying for a long time but keep getting errors. About the details of the attributes:
ques_pairs contains the preprocessed data,
word2index contains the word count,
seq_maxlen contains the maximum length of question one or two.
I am trying to implement this model on the Quora Question Pairs dataset: https://www.kaggle.com/c/quora-question-pairs
I will give you a small example, that you can apply to your own model:
from keras.layers import Input, Dense, Add
from keras.models import Model

input1 = Input(shape=(16,))
output1 = Dense(8, activation='relu')(input1)
output1 = Dense(4, activation='relu')(output1)  # add as many layers as you like, like this
input2 = Input(shape=(16,))
output2 = Dense(8, activation='relu')(input2)
output2 = Dense(4, activation='relu')(output2)  # add as many layers as you like, like this
output_full = Add()([output1, output2])
output_full = Dense(1, activation='sigmoid')(output_full)  # add as many layers as you like, like this
model_full = Model(inputs=[input1, input2], outputs=output_full)
You need to define an Input for each of your model parts first, then add layers (as shown in the code) to both branches. Then you can combine them using the Add layer. Finally, you call Model with a list of the input layers and the output layer.
model_full can then be compiled and trained like any other model.
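For example, a quick smoke test with dummy data (the shapes are assumptions matching the two 16-feature inputs above):
import numpy as np

X1 = np.random.rand(100, 16)  # hypothetical data for input1
X2 = np.random.rand(100, 16)  # hypothetical data for input2
y = np.random.randint(0, 2, size=(100, 1))

model_full.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model_full.fit([X1, X2], y, epochs=5, batch_size=16)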
Are you trying to achieve something like the following?
import numpy as np
from keras.layers import *
from keras.models import Sequential, Model
vocab_size = 1000
seq_maxlen = 32
embedding_weights = np.zeros((vocab_size, 64))
print("Building model...")
ques1_enc = Sequential()
ques1_enc.add(Embedding(output_dim=64, input_dim=vocab_size, weights=[embedding_weights], mask_zero=True))
ques1_enc.add(LSTM(100, input_shape=(64, seq_maxlen), return_sequences=False))
ques1_enc.add(Dropout(0.3))
ques2_enc = Sequential()
ques2_enc.add(Embedding(output_dim=64, input_dim=vocab_size, weights=[embedding_weights], mask_zero=True))
ques2_enc.add(LSTM(100, input_shape=(64, seq_maxlen), return_sequences=False))
ques2_enc.add(Dropout(0.3))
merge = Concatenate(axis=1)([ques1_enc.output, ques2_enc.output])
output = Dense(2, activation="softmax")(merge)
model = Model([ques1_enc.input, ques2_enc.input], output)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
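And a quick smoke test with random data (assumed shapes; the real inputs would be the padded, integer-encoded question pairs):
x1 = np.random.randint(1, vocab_size, size=(200, seq_maxlen))
x2 = np.random.randint(1, vocab_size, size=(200, seq_maxlen))
labels = np.eye(2)[np.random.randint(0, 2, size=200)]  # one-hot 2-class targets
model.fit([x1, x2], labels, batch_size=32, epochs=1)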
I understand that the features extracted from an auto-encoder can be fed into an MLP for classification or regression purposes. This is something that I did earlier.
But what if I have 2 auto-encoders? Can I extract the features from the bottleneck layers of the 2 auto-encoders and feed them into an MLP that performs classification based on these features? If yes, then how? I am not sure how to concatenate these two feature sets. I tried numpy.hstack(), which gives me an 'unhashable slice' error, whereas using tf.concat() gives me the error 'Input tensors to a Model must be Keras tensors.' The bottleneck layers of the two auto-encoders are of dimension (None, 100) each, so if I stack them horizontally, I should get (None, 200). The hidden layer of the MLP may contain some (num_hidden=100) neurons. Could anyone please help?
x1 = autoencoder1.get_layer('encoder2').output
x2 = autoencoder2.get_layer('encoder2').output
#inp = np.hstack((x1, x2))
inp = tf.concat([x1, x2], 1)
x = tf.concat([x1, x2], 1)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
y = Dense(1, activation='sigmoid', name='prediction')(h)
mymlp = Model(inputs=inp, outputs=y)
# Compile model
mymlp.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train model
mymlp.fit(x_train, y_train, epochs=20, batch_size=8)
Updated as per @twolffpiggott's suggestion:
from keras.layers import Input, Dense, Dropout
from keras import layers
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy as np
x1 = Data1
x2 = Data2
y = Data3
num_neurons1 = x1.shape[1]
num_neurons2 = x2.shape[1]
# Train-test split
x1_train, x1_test, x2_train, x2_test, y_train, y_test = train_test_split(x1, x2, y, test_size=0.2)
# scale data within [0-1] range
scalar = MinMaxScaler()
x1_train = scalar.fit_transform(x1_train)
x1_test = scalar.transform(x1_test)
x2_train = scalar.fit_transform(x2_train)
x2_test = scalar.transform(x2_test)
x_train = np.concatenate([x1_train, x2_train], axis =-1)
x_test = np.concatenate([x1_test, x2_test], axis =-1)
# Auto-encoder1
encoding_dim1 = 500
encoding_dim2 = 100
input_data = Input(shape=(num_neurons1,))
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
encoded1 = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
decoded = Dense(encoding_dim2, activation='relu', name='decoder1')(encoded1)
decoded = Dense(num_neurons1, activation='sigmoid', name='decoder2')(decoded)
# this model maps an input to its reconstruction
autoencoder1 = Model(inputs=input_data, outputs=decoded)
autoencoder1.compile(optimizer='sgd', loss='mse')
# training
autoencoder1.fit(x1_train, x1_train,
                 epochs=100,
                 batch_size=8,
                 shuffle=True,
                 validation_data=(x1_test, x1_test))
# Auto-encoder2
encoding_dim1 = 500
encoding_dim2 = 100
input_data = Input(shape=(num_neurons2,))
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
encoded2 = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
decoded = Dense(encoding_dim2, activation='relu', name='decoder1')(encoded2)
decoded = Dense(num_neurons2, activation='sigmoid', name='decoder2')(decoded)
# this model maps an input to its reconstruction
autoencoder2 = Model(inputs=input_data, outputs=decoded)
autoencoder2.compile(optimizer='sgd', loss='mse')
# training
autoencoder2.fit(x2_train, x2_train,
                 epochs=100,
                 batch_size=8,
                 shuffle=True,
                 validation_data=(x2_test, x2_test))
# MLP
num_hidden = 100
encoded1.trainable = False
encoded2.trainable = False
encoded1 = autoencoder1(autoencoder1.inputs)
encoded2 = autoencoder2(autoencoder2.inputs)
concatenated = layers.concatenate([encoded1, encoded2], axis=-1)
x = Dropout(0.2)(concatenated)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
h = Dropout(0.5)(h)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model(inputs=[autoencoder1.inputs, autoencoder2.inputs], outputs=y)
# Compile model
myMLP.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Training
myMLP.fit(x_train, y_train, epochs=200, batch_size=8)
# Testing
myMLP.predict(x_test)
giving me an error: unhashable type: 'list' from the line:
myMLP = Model(inputs=[autoencoder1.inputs, autoencoder2.inputs], outputs=y)
The problem is that you're mixing numpy arrays with Keras tensors. That won't work.
There are two approaches.
Predict numpy arrays from each autoencoder, concatenate the arrays, and send them to the third model.
Connect all the models, probably make the autoencoders untrainable, and fit with one input for each autoencoder.
Personally, I'd go for the first. (Assuming the autoencoders are already trained and don't need change).
First approach
numpyOutputFromAuto1 = autoencoder1.predict(numpyInputs1)
numpyOutputFromAuto2 = autoencoder2.predict(numpyInputs2)
inputDataForThird = np.concatenate([numpyOutputFromAuto1,numpyOutputFromAuto2],axis=-1)
inputTensorForMlp = Input(inputDataForThird.shape[1:])
h = Dense(num_hidden, activation='relu', name='hidden')(inputTensorForMlp)
y = Dense(1, activation='sigmoid', name='prediction')(h)
mymlp = Model(inputs=inputTensorForMlp, outputs=y)
....
mymlp.fit(inputDataForThird ,someY)
Second approach
This is a little more complicated, and at first I don't see much reason to do this. (But of course there may be cases where it's a good choice.)
Now we're totally forgetting numpy and working with keras tensors.
Creating the mlp on its own (good if you will use it later without the autoencoders):
inputTensorForMlp = Input(input_shape_compatible_with_concatenated_encoder_outputs)
x = Dropout(0.2)(inputTensorForMlp)
h = Dense(num_hidden, activation='relu', name='hidden')(x)
h = Dropout(0.5)(h)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model(inputs=inputTensorForMlp, outputs=y)
We probably want the bottleneck features of the autoencoders, right? If you happened to create the autoencoders properly, with an encoder model and a decoder model joined together, then it's easier to just use the encoder model. Otherwise:
encodedOutput1 = autoencoder1.layers[bottleneckLayer].output  # or encoder1.output
encodedOutput2 = autoencoder2.layers[bottleneckLayer].output  # or encoder2.output
Creating a joined model. The concatenation must use a keras layer (we're working with keras tensors):
concatenated = Concatenate()([encodedOutput1,encodedOutput2])
output = myMLP(concatenated)
joinedModel = Model([autoencoder1.input,autoencoder2.input],output)
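Then, assuming the autoencoders are already trained and should stay fixed, a hedged training sketch:
# freeze the autoencoders first so only the MLP weights are updated
autoencoder1.trainable = False
autoencoder2.trainable = False
joinedModel.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
joinedModel.fit([x1_train, x2_train], y_train, epochs=20, batch_size=8)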
I'd also go with Daniel's first approach (for simplicity and efficiency), but if you're interested in the second, for instance in running the network end-to-end, you'd approach it like this:
# make autoencoders not trainable
autoencoder1.trainable = False
autoencoder2.trainable = False
# input_data1 / input_data2 are the Input layers used to build each autoencoder
encoded1 = autoencoder1(input_data1)
encoded2 = autoencoder2(input_data2)
concatenated = layers.concatenate([encoded1, encoded2], axis=-1)
h = Dense(num_hidden, activation='relu', name='hidden')(concatenated)
y = Dense(1, activation='sigmoid', name='prediction')(h)
myMLP = Model([input_data1, input_data2], y)
myMLP.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Training
myMLP.fit([x1_train, x2_train], y_train, epochs=200, batch_size=8)
# Testing
myMLP.predict([x1_test, x2_test])
Key edits
The weights of both autoencoders should be frozen end-to-end (otherwise early-stage gradient updates from the randomly initialized MLP will likely result in the loss of much of their learning).
The autoencoder input layers should be assigned to separate variables input_data1 and input_data2 per autoencoder (instead of both to input_data). Even though autoencoder1.inputs returns a tf tensor, this is the source of the unhashable type: list exception, and replacing with [input_data1, input_data2] solves the issue.
When fitting the MLP for the end-to-end model, the input should be a list of x1_train and x2_train rather than the concatenated inputs. Same when predicting.
I am trying to create my first ensemble model in Keras. I have 3 input values and a single output value in my dataset.
from keras.optimizers import SGD,Adam
from keras.layers import Dense,Merge
from keras.models import Sequential
model1 = Sequential()
model1.add(Dense(3, input_dim=3, activation='relu'))
model1.add(Dense(2, activation='relu'))
model1.add(Dense(2, activation='tanh'))
model1.compile(loss='mse', optimizer='Adam', metrics=['accuracy'])
model2 = Sequential()
model2.add(Dense(3, input_dim=3, activation='linear'))
model2.add(Dense(4, activation='tanh'))
model2.add(Dense(3, activation='tanh'))
model2.compile(loss='mse', optimizer='SGD', metrics=['accuracy'])
model3 = Sequential()
model3.add(Merge([model1, model2], mode = 'concat'))
model3.add(Dense(1, activation='sigmoid'))
model3.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
model3.input_shape
The ensemble model (model3) compiles without any error, but while fitting the model I have to pass the same input twice: model3.fit([X,X],y). I think this is an unnecessary step, and instead of passing the input twice I want to have a common input node for my ensemble model. How can I do it?
The Keras functional API seems to be a better fit for your use case, as it allows more flexibility in the computation graph, e.g.:
import numpy as np
from keras.layers import Input, Dense, concatenate
from keras.models import Model
# a single input layer
inputs = Input(shape=(3,))
# model 1
x1 = Dense(3, activation='relu')(inputs)
x1 = Dense(2, activation='relu')(x1)
x1 = Dense(2, activation='tanh')(x1)
# model 2
x2 = Dense(3, activation='linear')(inputs)
x2 = Dense(4, activation='tanh')(x2)
x2 = Dense(3, activation='tanh')(x2)
# merging models
x3 = concatenate([x1, x2])
# output layer
predictions = Dense(1, activation='sigmoid')(x3)
# generate a model from the layers above
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
# Always a good idea to verify it looks as you expect it to
# model.summary()
data = np.array([[1,2,3], [1,1,3], [7,8,9], [5,8,10]])
labels = np.array([0,0,1,1])
# The resulting model can be fit with a single input:
model.fit(data, labels, epochs=50)
Notes:
There might be slight differences in the API between Keras versions (pre- and post- version 2)
The example above specifies a different optimizer and loss function for each of the models. However, since fit() is only being called once (on model3), the same settings, those of model3, will apply to the entire model. In order to have different settings when training the sub-models, they will have to be fit() separately - see the comment by @Daniel.
EDIT: updated notes based on comments
etov's answer is a great option.
But suppose you already have model1 and model2 ready and you don't want to change them, you can create the third model like this:
singleInput = Input((3,))
out1 = model1(singleInput)
out2 = model2(singleInput)
#....
#outN = modelN(singleInput)
out = Concatenate()([out1,out2]) #[out1,out2,...,outN]
out = Dense(1, activation='sigmoid')(out)
model3 = Model(singleInput,out)
And if you already have all the models ready and don't want to change them, you can have something like this (not tested):
singleInput = Input((3,))
output = model3([singleInput,singleInput])
singleModel = Model(singleInput,output)
Define a new input layer and use the model outputs directly (works in the functional API):
assert model1.input_shape == model2.input_shape  # make sure they have the same shape
inp = tf.keras.layers.Input(shape=model1.input_shape[1:])
model = tf.keras.models.Model(inputs=[inp], outputs=[model1(inp), model2(inp)])
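Note that this combined model returns one output per sub-model. A hedged usage sketch with a hypothetical batch:
import numpy as np

X = np.random.rand(8, *model1.input_shape[1:])  # hypothetical input batch
pred1, pred2 = model.predict(X)  # one prediction array per sub-model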