In the following code, I have XE, XW, YE, and YW of shapes (474077, 32), (474077, 32), (474077, 1), and (474077, 1), respectively.
After separately training modelE and modelW on 32 inputs and 1 output each, I add a Lambda layer that minimizes the difference between the outputs of both models. This code ran without errors.
I'm assuming this Lambda layer updates the weights and biases of modelE and modelW to minimize the difference between their outputs. How do I use the new updated weights and biases of modelE and modelW to predict their new outputs? I want to compare the initial outputs of the models and their outputs after the Lambda layer minimized the difference between them.
XtrainE, XtestE, YtrainE, YtestE = train_test_split(XE, YE, test_size=.5)
XtrainW, XtestW, YtrainW, YtestW = train_test_split(XW, YW, test_size=.5)
modelE = Sequential()
modelE.add(Dense(50, activation='relu', input_dim=32))
modelE.add(Dense(20, activation='relu'))
modelE.add(Dense(1, activation='relu'))
modelW = Sequential()
modelW.add(Dense(50, activation='relu', input_dim=32))
modelW.add(Dense(20, activation='relu'))
modelW.add(Dense(1, activation='relu'))
modelE.compile(loss='mse', optimizer='rmsprop')
modelW.compile(loss='mse', optimizer='rmsprop')
historyE= modelE.fit(XtrainE, YtrainE, validation_data=(XtestE,YtestE), epochs=200, batch_size=100, verbose=1)
historyW= modelW.fit(XtrainW, YtrainW, validation_data=(XtestW,YtestW), epochs=200, batch_size=100, verbose=1)
YpredE = modelE.predict(XtestE)
YpredW = modelW.predict(XtestW)
difference = Lambda(lambda x: x[0] - x[1])([modelE.output, modelW.output])
diffModel = Model(modelE.inputs + modelW.inputs, difference)
diffModel.compile(optimizer = 'adam', loss='mse')
diffModel.fit([XE,XW], np.zeros(YE.shape), epochs=200, batch_size=100, verbose=1)
I tried:
YpredWnew = modelW.predict(XtestW)
YpredEnew = modelE.predict(XtestE)
for i in range(len(YpredWnew)):
    print("oldE= %.2f, newE= %.2f, oldW= %.2f, newW= %.2f" % (YpredE[i], YpredEnew[i], YpredW[i], YpredWnew[i]))
but this gives back the same value for all i in YpredEnew[i]
Thanks
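For reference, a minimal sketch of how the before/after comparison could be structured, assuming the models and data defined above (the variable names such as YpredE_before are mine): take a snapshot of the predictions before fitting diffModel, fit it, then predict again with the same test inputs. Since diffModel is built from the layers of modelE and modelW, fitting it updates their weights in place.
YpredE_before = modelE.predict(XtestE)   # snapshots taken before the joint training
YpredW_before = modelW.predict(XtestW)

diffModel.fit([XE, XW], np.zeros(YE.shape), epochs=200, batch_size=100, verbose=1)

YpredE_after = modelE.predict(XtestE)    # diffModel shares modelE/modelW layers, so their weights changed
YpredW_after = modelW.predict(XtestW)

for i in range(len(YpredE_after)):
    print("oldE= %.2f, newE= %.2f, oldW= %.2f, newW= %.2f" %
          (YpredE_before[i, 0], YpredE_after[i, 0], YpredW_before[i, 0], YpredW_after[i, 0]))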
Problem: I have S sequences of T timesteps each, and each timestep contains F features, so collectively the dataset has shape (S x T x F). Each sequence s in S is described by 2 values (Target_1 and Target_2).
Goal: Train an architecture using LSTMs to learn a function approximator M that, given a sequence s, predicts Target_1 and Target_2.
Something like this:
M(s) ~ (Target_1, Target_2)
I'm really struggling to find a way; below is a Keras implementation of an example that probably does not work. I made 2 models, one for the first target value and one for the second.
model1 = Sequential()
model1.add(Masking(mask_value=-10.0))
model1.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences = True))
model1.add(Flatten())
model1.add(Dense(hidden_units, activation = "relu"))
model1.add(Dense(1, activation = "linear"))
model1.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model1.fit(x_train, y_train[:,0], validation_data=(x_test, y_test[:,0]), epochs=epochs, batch_size=batch, shuffle=False)
model2 = Sequential()
model2.add(Masking(mask_value=-10.0))
model2.add(LSTM(1, input_shape=(batch, timesteps, features), return_sequences=True))
model2.add(Flatten())
model2.add(Dense(hidden_units, activation = "relu"))
model2.add(Dense(1, activation = "linear"))
model2.compile(loss='mse', optimizer=Adam(learning_rate=0.0001))
model2.fit(x_train, y_train[:,1], validation_data=(x_test, y_test[:,1]), epochs=epochs, batch_size=batch, shuffle=False)
I want to somehow make good use of the LSTM's time-relevant memory in order to achieve good regression.
IIUC, you can start off with a simple (naive) approach by using two output layers:
import tensorflow as tf
timesteps, features = 20, 5
inputs = tf.keras.layers.Input((timesteps, features))
x = tf.keras.layers.Masking(mask_value=-10.0)(inputs)
x = tf.keras.layers.LSTM(32, return_sequences=False)(x)
x = tf.keras.layers.Dense(32, activation = "relu")(x)
output1 = tf.keras.layers.Dense(1, activation="linear", name='output1')(x)
output2 = tf.keras.layers.Dense(1, activation="linear", name='output2')(x)
model = tf.keras.Model(inputs, [output1, output2])
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001))
x_train = tf.random.normal((500, timesteps, features))
y_train = tf.random.normal((500, 2))
model.fit(x_train, [y_train[:,0],y_train[:,1]] , epochs=5, batch_size=32, shuffle=False)
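If I'm not mistaken, calling predict on a model with two output heads like this returns one array per head, for example:
preds = model.predict(x_train[:3])              # returns a list: [predictions for output1, predictions for output2]
pred_target1, pred_target2 = preds
print(pred_target1.shape, pred_target2.shape)   # (3, 1) (3, 1)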
I am trying to use a custom loss function for my model. I scale the y values beforehand, and in my loss function I inverse-scale them (using the answer from "scaling back data in customized keras training loss function"). After a random number of epochs the loss becomes NaN, and mean_absolute_error, val_mean_absolute_error, and val_loss are all NaN as well. Here's my model and custom loss function:
model = Sequential()
model.add(LSTM(units=512, activation="tanh", return_sequences=True, input_shape=(X_train.shape[1],X_train.shape[2])))
model.add(Dropout(0.2))
model.add(LSTM(units=256, activation="tanh", return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=128, activation="tanh", return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=64, activation="tanh"))
model.add(Dropout(0.2))
model.add(Dense(units = 2))
model.compile(optimizer = "Adam", loss = my_loss_function , metrics=['mean_absolute_error'])
model.summary()
I have 2 outputs as you can see.
def my_loss_function(y_actual, y_predicted):
    y_actual = (y_actual - K.constant(y_scaler.min_)) / K.constant(y_scaler.scale_)
    y_predicted = (y_predicted - K.constant(y_scaler.min_)) / K.constant(y_scaler.scale_)
    a_loss = abs(y_actual[0] - y_predicted[0]) * 128000
    b_loss = abs(y_actual[1] - y_predicted[1]) * 27000
    loss = tf.math.sqrt(tf.square(a_loss) + tf.square(b_loss))
    return loss
y_scaler is used earlier:
y_scaler = MinMaxScaler(feature_range = (0, 1))
y_scaler.fit(y_data)
y_data=y_scaler.transform(y_data)
y_testdata=y_scaler.transform(y_testdata)
When I use MSE, MAE, etc., it works fine. Can anyone help?
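Not a full answer, but one common source of NaN with a loss like this is the square root: the gradient of tf.math.sqrt is infinite at zero, so if the squared terms ever sum to exactly zero the backward pass produces NaN. Below is a sketch of a guarded variant, reusing the same tf, K, and y_scaler objects as above; the 1e-12 epsilon is my own arbitrary choice, and I index the two output columns with [:, 0] and [:, 1] on the assumption that each scale factor belongs to one output.
def my_loss_function(y_actual, y_predicted):
    y_actual = (y_actual - K.constant(y_scaler.min_)) / K.constant(y_scaler.scale_)
    y_predicted = (y_predicted - K.constant(y_scaler.min_)) / K.constant(y_scaler.scale_)
    a_loss = tf.abs(y_actual[:, 0] - y_predicted[:, 0]) * 128000
    b_loss = tf.abs(y_actual[:, 1] - y_predicted[:, 1]) * 27000
    # small epsilon inside the sqrt keeps the gradient finite when both terms are zero
    return tf.math.sqrt(tf.square(a_loss) + tf.square(b_loss) + 1e-12)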
I am learning Keras through audio classification. I am implementing the code, with modifications, from https://github.com/deepsound-project/genre-recognition/blob/master/train_model.py.
The shape of the dataset is
X_train shape = (800, 32, 1)
y_train shape = (800, 10)
X_test shape = (200, 32, 1)
y_test shape = (200, 10)
The model
model = Sequential()
model.add(Conv1D(filters=256, kernel_size=5, input_shape=(32,1), activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Conv1D(filters=256, kernel_size=5, activation="relu"))
model.add(BatchNormalization(momentum=0.9))
model.add(MaxPooling1D(2))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation="relu", ))
model.add(Dense(10, activation='softmax'))
model.compile(
    loss='categorical_crossentropy',
    optimizer=Adam(lr=0.001),
    metrics=['accuracy'],
)
model.summary()
red_lr= ReduceLROnPlateau(monitor='val_loss',patience=2,verbose=2,factor=0.5,min_delta=0.01)
check=ModelCheckpoint(filepath=r'/content/drive/My Drive/Colab Notebooks/gen/cnn.hdf5', verbose=1, save_best_only = True)
History = model.fit(X_train,
                    y_train,
                    epochs=100,
                    #batch_size=512,
                    validation_data=(X_test, y_test),
                    verbose=2,
                    callbacks=[check, red_lr],
                    shuffle=True)
[Accuracy and loss graphs omitted]
I do not understand why the val_acc stays in the range of 70%. I tried modifying the model architecture, including the optimizer, but there was no improvement.
Also, is it fine to have a large difference between loss and val_loss?
How can I improve the accuracy above 80? Any help is appreciated.
Thank you
I found it: I used the Concatenate layer from Keras to concatenate all the convolution layers, and it gives the best performance.
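The answer above doesn't show the code, but here is a minimal sketch of what concatenating parallel convolution branches could look like with the functional API; the branch sizes and kernel widths are placeholders of mine, not the actual architecture used.
import tensorflow as tf

inputs = tf.keras.layers.Input(shape=(32, 1))
branch1 = tf.keras.layers.Conv1D(256, 5, activation='relu', padding='same')(inputs)
branch2 = tf.keras.layers.Conv1D(256, 3, activation='relu', padding='same')(inputs)
merged = tf.keras.layers.Concatenate()([branch1, branch2])   # join the conv branches along the channel axis
x = tf.keras.layers.GlobalMaxPooling1D()(merged)
x = tf.keras.layers.Dense(128, activation='relu')(x)
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])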
I am new to machine learning. I am having trouble getting my data into my network.
This is the error that I am receiving:
ValueError: Error when checking input: expected cu_dnnlstm_22_input to have 3 dimensions, but got array with shape (2101, 17)
I have tried adding model.add(Flatten()) before the dense layer. I would really appreciate your help!
BATCH_SIZE = 64
test_size_length = int(len(main_df)*TESTING_SIZE)
training_df = main_df[:test_size_length]
validation_df = main_df[test_size_length:]
train_x, train_y = training_df.drop('target',1).to_numpy(), training_df['target'].tolist()
validation_x, validation_y = validation_df.drop('target',1).to_numpy(), validation_df['target'].tolist()
#train_x.shape is (2101, 17)
model = Sequential()
# model.add(Flatten())
model.add(CuDNNLSTM(128, input_shape=(train_x.shape), return_sequences=True))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(CuDNNLSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(CuDNNLSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(2, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
# Compile model
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer=opt,
    metrics=['accuracy']
)
tensorboard = TensorBoard(log_dir="logs/{}".format(NAME))
filepath = "RNN_Final-{epoch:02d}-{val_acc:.3f}" # unique file name that will include the epoch and the validation acc for that epoch
checkpoint = ModelCheckpoint("models/{}.model".format(filepath), monitor='val_acc', verbose=1, save_best_only=True, mode='max')  # saves only the best ones
# Train model
history = model.fit(
    train_x, train_y,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=(validation_x, validation_y),
    callbacks=[tensorboard, checkpoint],
)
# Score model
score = model.evaluate(validation_x, validation_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
# Save model
model.save("models/{}".format(NAME))
The input to your LSTM layer (CuDNNLSTM) should have shape: (batch_size, timesteps, input_dim).
It looks like you are missing one of these dimensions.
Often we can overlook the last dimension in the case where the input dimension is 1. If this is the case with your model (if you are predicting from a sequence of single numbers), then you might consider expanding the dimensions before the CuDNNLSTM layer with something like this:
model.add(Lambda(lambda t: tf.expand_dims(t, axis=-1)))
model.add(CuDNNLSTM(128))
Without knowing the problem you are working on, it's hard to say whether this is a valid way forward, but you should certainly keep in mind the required input shape of an LSTM layer and reshape/expand dims accordingly.
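If the 17 columns are meant to be treated as a sequence of 17 single-valued timesteps, the same expansion can also be done directly on the numpy arrays before calling fit. This is only a sketch; whether the columns should be interpreted as timesteps or as features of one timestep depends on your data.
import numpy as np

# (2101, 17) -> (2101, 17, 1): 17 timesteps of one feature each
train_x = np.expand_dims(train_x, axis=-1)
validation_x = np.expand_dims(validation_x, axis=-1)

# the first LSTM layer then takes the (timesteps, input_dim) shape, without the batch dimension
model.add(CuDNNLSTM(128, input_shape=(train_x.shape[1], train_x.shape[2]), return_sequences=True))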
I want to build a binary classifier using a Keras CNN.
I have about 6000 rows of input data which looks like this:
>> print(X_train[0])
[[[-1.06405307 -1.06685851 -1.05989663 -1.06273152]
[-1.06295958 -1.06655996 -1.05969803 -1.06382503]
[-1.06415248 -1.06735609 -1.05999593 -1.06302975]
[-1.06295958 -1.06755513 -1.05949944 -1.06362621]
[-1.06355603 -1.06636092 -1.05959873 -1.06173742]
[-1.0619655 -1.06655996 -1.06039312 -1.06412326]
[-1.06415248 -1.06725658 -1.05940014 -1.06322857]
[-1.06345662 -1.06377347 -1.05890365 -1.06034568]
[-1.06027557 -1.06019084 -1.05592469 -1.05537518]
[-1.05550398 -1.06038988 -1.05225064 -1.05676692]]]
>>> print(y_train[0])
[1]
And then I've built a CNN this way:
model = Sequential()
model.add(Convolution1D(input_shape=(10, 4),
                        nb_filter=16,
                        filter_length=4,
                        border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Convolution1D(nb_filter=8,
                        filter_length=4,
                        border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dense(1))
model.add(Activation('softmax'))
reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9, patience=30, min_lr=0.000001, verbose=0)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(X_train, y_train,
                    nb_epoch=100,
                    batch_size=128,
                    verbose=0,
                    validation_data=(X_test, y_test),
                    callbacks=[reduce_lr],
                    shuffle=True)
y_pred = model.predict(X_test)
But it returns the following:
>> print(confusion_matrix(y_test, y_pred))
[[ 0 362]
[ 0 608]]
Why are all the predictions ones? Why does the CNN perform so badly?
[Loss and accuracy charts omitted]
It always predicts one because of the output layer in your network. You have a Dense layer with one neuron and a softmax activation. Softmax normalizes by the sum of the exponentials of the outputs, so with a single output unit the only possible value is 1.0.
For a binary classifier you can either use a sigmoid activation with the "binary_crossentropy" loss, or put two output units in the last layer, keep using softmax, and change the loss to categorical_crossentropy.
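A minimal sketch of the first option applied to the model above (only the final layers, the compile call, and the prediction step change; everything else stays as in the question):
model.add(Dense(1))
model.add(Activation('sigmoid'))        # sigmoid instead of softmax for a single output unit

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# predict() now returns probabilities in (0, 1); threshold them before the confusion matrix
y_pred = (model.predict(X_test) > 0.5).astype(int)
print(confusion_matrix(y_test, y_pred))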