Problem: I have S sequences of T timesteps each, and each timestep contains F features, so collectively I have a dataset of shape (S x T x F), where each sequence s in S is described by 2 values (Target_1 and Target_2).
Goal: Train an LSTM-based architecture to learn a function approximator model M that, given a sequence s, predicts Target_1 and Target_2.
Something like this:
M(s) ~ (Target_1, Target_2)
I'm really struggling to find a way. Below is a Keras implementation of an attempt that probably does not work. I made 2 models, one for the first Target value and one for the second.
model1 = Sequential()
model1.add(Masking(mask_value=-10.0))
model1.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences = True))
model1.add(Flatten())
model1.add(Dense(hidden_units, activation = "relu"))
model1.add(Dense(1, activation = "linear"))
model1.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model1.fit(x_train, y_train[:,0], validation_data=(x_test, y_test[:,0]), epochs=epochs, batch_size=batch, shuffle=False)
model2 = Sequential()
model2.add(Masking(mask_value=-10.0))
model2.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences=True))
model2.add(Flatten())
model2.add(Dense(hidden_units, activation = "relu"))
model2.add(Dense(1, activation = "linear"))
model2.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model2.fit(x_train, y_train[:,1], validation_data=(x_test, y_test[:,1]), epochs=epochs, batch_size=batch, shuffle=False)
I want to somehow make good use of the LSTM's time-dependent memory in order to achieve good regression.
IIUC, you can start off with a simple (naive) approach by using two output layers:
import tensorflow as tf
timesteps, features = 20, 5
inputs = tf.keras.layers.Input((timesteps, features))
x = tf.keras.layers.Masking(mask_value=-10.0)(inputs)
x = tf.keras.layers.LSTM(32, return_sequences=False)(x)
x = tf.keras.layers.Dense(32, activation = "relu")(x)
output1 = tf.keras.layers.Dense(1, activation="linear", name='output1')(x)
output2 = tf.keras.layers.Dense(1, activation="linear", name='output2')(x)
model = tf.keras.Model(inputs, [output1, output2])
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))
x_train = tf.random.normal((500, timesteps, features))
y_train = tf.random.normal((500, 2))
model.fit(x_train, [y_train[:,0],y_train[:,1]] , epochs=5, batch_size=32, shuffle=False)
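Since the model has two output layers, predict returns a list of two arrays, one per target. A minimal usage sketch, assuming a held-out x_test with the same (timesteps, features) shape:
x_test = tf.random.normal((100, timesteps, features))
pred1, pred2 = model.predict(x_test)  # two arrays, each of shape (100, 1)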
def create_model():
    model = Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(40002, 12)))
    model.add(LSTM(50, return_sequences=True))
    model.add(LSTM(50, return_sequences=True))
    model.add(tf.keras.layers.LSTM(30))
    model.add(Dense(2, activation='linear'))

    def rmse(Y_test, prediction):
        return K.sqrt(K.mean(K.square(Y_test - prediction)))

    # compile
    model.compile(optimizer='adam', loss=rmse, metrics=['mean_squared_error', rmse])
    return model

# fit the model
model = create_model()
model.fit(x_train, Y_train, shuffle=False, verbose=1, epochs=10)

# predict with the model
prediction = model.predict(x_test, verbose=0)
print(prediction)
How can I calculate the mean relative error for tensor inputs, i.e. when my Y_test and prediction are tensors? Each row of Y_test and prediction has 2 values.
Example:
Y_test = [[0.2,0.003],
[0.3, 0.008]]
prediction = [[0.4,0.005],
[0.5,0.007]]
mean_relative_error = mean(absolute(0.2-0.4)/0.2 + absolute(0.003-0.005)/0.003), mean(absolute(0.3-0.5)/0.3 + absolute(0.008-0.007)/0.008)
mean_relative_error = [0.533, 0.3925]
Please note that I don't want to use it for backpropagation to improve the network.
I would have done it like this:
from tensorflow.math import reduce_mean, abs, reduce_sum
relative_error = reduce_sum(abs(prediction - Y_test) / prediction, axis=1)
# [0.9, 0.54285717]
mean_relative_error = reduce_mean(relative_error)
# 0.7214286
I couldn't use tf.keras.losses.MeanAbsoluteError(reduction=tf.keras.losses.Reduction.NONE) because of a bug: MeanAbsoluteError still reduces to the mean despite being told not to. The bug is reported HERE.
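As a workaround, a per-sample MAE can be computed directly with tf.math instead of the loss class; a minimal sketch, using the Y_test and prediction values from the example above:
import tensorflow as tf

Y_test = tf.constant([[0.2, 0.003], [0.3, 0.008]])
prediction = tf.constant([[0.4, 0.005], [0.5, 0.007]])

# mean absolute error per row, without reducing over the batch dimension
per_sample_mae = tf.reduce_mean(tf.abs(Y_test - prediction), axis=-1)
# approximately [0.101, 0.1005]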
So, I am creating an AI which predicts how long a user will take to finish exercises. I previously created a NN with Sklearn, but I want to integrate Tensorflow.
I have 6 features as input and 1 output, which is a number.
I tried this, but it does not seem to work:
# Train data
X_train = X[:1500]
y_train = y[:1500]
# Test data
X_test = X[1500:]
y_test = y[1500:]
# Create the TF model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(6,)),
    tf.keras.layers.Dense(256, activation='softmax'),
    tf.keras.layers.Dense(128, activation='softmax'),
    tf.keras.layers.Dense(64, activation='softmax'),
    tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10)
The same setup used to work with a simple MLPClassifier. I also managed to get this nice error, which does not seem to be fixed by changing the layers:
Received a label value of 1209638408 which is outside the valid range of [0, 1).
So I changed it a bit and came up with this:
features_train = features[:1500]
output_train = output[:1500]
features_test = features[1500:]
output_test = output[1500:]
classifier = Sequential()
classifier.add(Dense(units = 16, activation = 'relu', input_dim = 6))
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 64, activation = 'relu'))
classifier.add(Dense(units = 32, activation = 'relu'))
classifier.add(Dense(units = 8, activation = 'relu'))
classifier.add(Dense(units = 2, activation = 'relu'))
classifier.add(Dense(units = 1))
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy')
classifier.fit(features_train, output_train, batch_size = 1, epochs = 10)
But now I get a loss of 100%.
You should use a smaller network. Try with fewer Dense layers, 2 or 3 maximum. If you use the binary_crossentropy loss, use a sigmoid activation in the last Dense layer. You can also pass metrics=['accuracy'] when compiling the model to monitor the accuracy.
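A minimal sketch of the smaller network described above, assuming the binary_crossentropy route (layer sizes are placeholders):
import tensorflow as tf

# a smaller network: 2-3 Dense layers, sigmoid output paired with binary_crossentropy
classifier = tf.keras.Sequential([
    tf.keras.Input(shape=(6,)),                     # 6 input features, as in the question
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])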
I'm trying to add conv layers to the transfer learning code mentioned below, but I am not sure how to proceed. I want to add
a conv layer with max-pooling, a 3x3 filter, stride 3, and ReLU activation, or
a conv layer with max-pooling, a 3x3 filter, stride 3, and LeakyReLU activation
to the transfer learning code below. Let me know if this is possible and, if so, how.
CLASSES = 2
# setup model
base_model = MobileNet(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.4)(x)
predictions = Dense(CLASSES, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
# transfer learning
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
"""##Data augmentation"""
# data prep
"""
## Transfer learning
"""
from tensorflow.keras.callbacks import ModelCheckpoint
filepath="mobilenet/my_model.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint]
EPOCHS = 1
BATCH_SIZE = 32
STEPS_PER_EPOCH = 5
VALIDATION_STEPS = 32
MODEL_FILE = 'mobilenet/filename.model'
history = model.fit_generator(
    train_generator,
    epochs=EPOCHS,
    steps_per_epoch=STEPS_PER_EPOCH,
    validation_data=validation_generator,
    validation_steps=VALIDATION_STEPS,
    callbacks=callbacks_list)
model.save(MODEL_FILE)
backup_model = model
model.summary()
You can do it in several ways; one of them is:
model = Sequential([
    base_model,
    Conv(...),      # the conv/pooling layers you would like to add on top of the base model
    MaxPool(...),
    ...,
    GlobalAveragePooling2D(name='avg_pool'),
    Dropout(0.4),
    Dense(CLASSES, activation='softmax')   # classification head, as in your original code
])
model.compile(...)
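For reference, a filled-in version of that sketch might look like the following; the 3x3 kernel and stride 3 come from the question, while the filter count, pool size, and frozen base model are assumptions:
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dropout, Dense
from tensorflow.keras.models import Sequential

CLASSES = 2
base_model = MobileNet(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base_model.trainable = False  # transfer learning: keep the pretrained weights frozen

model = Sequential([
    base_model,
    Conv2D(64, (3, 3), strides=(3, 3), padding='same', activation='relu'),  # assumed filter count
    MaxPooling2D(pool_size=(2, 2)),
    GlobalAveragePooling2D(name='avg_pool'),
    Dropout(0.4),
    Dense(CLASSES, activation='softmax'),
])
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])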
I think this is what you are after
CLASSES = 2
new_filters = 256  # specify the number of filters you want in the added convolutional layer
img_shape = (224, 224, 3)
base_model = tf.keras.applications.mobilenet.MobileNet(include_top=False, input_shape=img_shape, weights='imagenet', dropout=.4)
x = base_model.output
x = Conv2D(new_filters, 3, padding='same', strides=(3, 3), activation='relu', name='added')(x)
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dropout(0.4)(x)
predictions = Dense(CLASSES, activation='softmax', name='output')(x)
model = Model(inputs=base_model.input, outputs=predictions)
model.summary()
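Note that this version does not freeze the pretrained layers; if you still only want to train the added layers, you can reuse the loop from the question before compiling:
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])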
I'll start with the code and then ask my question.
model1_input= keras.Input(shape=(5,10))
x = layers.Dense(16, activation='relu')(model1_input)
model1_output = layers.Dense(4)(x)
model1= keras.Model(model1_input, model1_output, name='model1')
model1.summary()
# ----
model2_input= keras.Input(shape=(5,10))
y = layers.Dense(16, activation='relu')(model2_input)
model2_output = layers.Dense(4)(y)
model2= keras.Model(model2_input, model2_output, name='model2')
model2.summary()
# ----
model3_input= keras.Input(shape=(5, 10))
layer1 = model1(model3_input)
layer2 = model2(layer1)
model3_output = layers.Dense(1)(layer2)
model3= keras.Model(model3_input, model3_output , name='model3')
model3.summary()
model3.compile(loss='mse', optimizer='adam')
model3.fit(inputs, outputs, epochs=10, batch_size=32)
When I execute this code, what will happen to the model1 and model2 weights? Will they stay untrained?
I would like to use the trained model1 and model2 predictions to train model3. Can I write something like this?
model1_input= keras.Input(shape=(5,10))
x = layers.Dense(16, activation='relu')(model1_input)
model1_output = layers.Dense(4)(x)
model1= keras.Model(model1_input, model1_output, name='model1')
model1.summary()
model1.compile(loss='mse', optimizer='adam')
model1.fit(model1_inputs, model1_outputs, epochs=10, batch_size=32)
# ----
model2_input= keras.Input(shape=(5,10))
y = layers.Dense(16, activation='relu')(model2_input)
model2_output = layers.Dense(4)(y)
model2= keras.Model(model2_input, model2_output, name='model2')
model2.summary()
model2.compile(loss='mse', optimizer='adam')
model2.fit(model2_inputs, model2_outputs, epochs=10, batch_size=32)
# ----
model3_input= keras.Input(shape=(5, 10))
layer1 = model1(model3_input)
layer2 = model2(layer1)
model3_output = layers.Dense(1)(layer2)
model3= keras.Model(model3_input, model3_output , name='model3')
model3.summary()
model3.compile(loss='mse', optimizer='adam')
model3.fit(inputs, outputs, epochs=10, batch_size=32)
I'm afraid that when I train model3, this will change the already trained weights of models 1 and 2. In this case, what will happen to the weights of models 1 and 2?
I am not sure if Keras works that way, but even if it does, it will still change the weights as long as the layers are trainable. Try freezing the layers. These links might help you: 1, 2.
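A minimal sketch of the freezing idea, assuming model1 and model2 have already been trained as in your second snippet:
# freeze the trained sub-models so that fitting model3 does not update their weights
model1.trainable = False
model2.trainable = False

model3_input = keras.Input(shape=(5, 10))
layer1 = model1(model3_input)
layer2 = model2(layer1)
model3_output = layers.Dense(1)(layer2)
model3 = keras.Model(model3_input, model3_output, name='model3')
model3.compile(loss='mse', optimizer='adam')  # compile after setting the trainable flags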
Another option would be to branch the layers like this.
I am trying to predict 2 features. This is what my model looks like:
Defining the model
def my_model():
    input_x = Input(batch_shape=(batch_size, look_back, x_train.shape[2]), name='input')
    drop = Dropout(0.5)

    lstm_1 = LSTM(100, return_sequences=True, batch_input_shape=(batch_size, look_back, x_train.shape[2]), name='3dLSTM', stateful=True)(input_x)
    lstm_1_drop = drop(lstm_1)
    lstm_2 = LSTM(100, batch_input_shape=(batch_size, look_back, x_train.shape[2]), name='2dLSTM', stateful=True)(lstm_1_drop)
    lstm_2_drop = drop(lstm_2)

    y1 = Dense(1, activation='relu', name='op1')(lstm_2_drop)
    y2 = Dense(1, activation='relu', name='op2')(lstm_2_drop)

    model = Model(inputs=input_x, outputs=[y1, y2])
    optimizer = Adam(lr=0.001, decay=0.00001)
    model.compile(loss='mse', optimizer=optimizer, metrics=['mse'])
    model.summary()
    return model

model = my_model()

for j in range(50):
    start = time.time()
    history = model.fit(x_train, [y_11_train, y_22_train], epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
    model.reset_states()
    print("Epoch", j, time.time() - start, "s")

p = model.predict(x_test, batch_size=batch_size)
My data set has 9 features:
x_train (31251, 6, 9)
y_11_train (31251,)
y_22_train (31251,)
x_test (13399, 6, 9)
y_11_test (13399,)
y_22_test (13399,)
I am trying to predict the first (y_11) and second (y_22) targets for my dataset, but I am getting a prediction for only the first one, not the second. Any help on how I can get both predictions instead of one?
First of all, you should not pass the same input shape multiple times:
(batch_size, look_back, x_train.shape[2])
Also, try to concatenate your outputs inside your model, like this:
def my_model():
    from keras.layers import concatenate

    input_x = Input(batch_shape=(batch_size, look_back, x_train.shape[2]), name='input')
    drop = Dropout(0.5)

    lstm_1 = LSTM(100, return_sequences=True, batch_input_shape=(batch_size, look_back, x_train.shape[2]), name='3dLSTM', stateful=True)(input_x)
    lstm_1_drop = drop(lstm_1)
    lstm_2 = LSTM(100, name='2dLSTM', stateful=True)(lstm_1_drop)
    lstm_2_drop = drop(lstm_2)

    y1 = Dense(1, activation='linear', name='op1')(lstm_2_drop)
    y2 = Dense(1, activation='linear', name='op2')(lstm_2_drop)
    y = concatenate([y1, y2])

    model = Model(inputs=input_x, outputs=y)
    optimizer = Adam(lr=0.001, decay=0.00001)
    model.compile(loss='mse', optimizer=optimizer, metrics=['mse'])
    model.summary()
    return model
EDIT
I think you should fit like this:
y_11_train = y_11_train.reshape(y_11_train.shape[0],1)
y_22_train = y_22_train.reshape(y_22_train.shape[0],1)
model = my_model()
model.fit(x_train,np.concatenate((y_11_train,y_22_train),axis=1),...)
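With the concatenated output, model.predict returns a single array of shape (n, 2); a small sketch of splitting it back into the two targets:
p = model.predict(x_test, batch_size=batch_size)
y_11_pred = p[:, 0]  # prediction for the first target
y_22_pred = p[:, 1]  # prediction for the second target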