Regarding loss weighting in Keras regression problem for multiple outputs - python

I am running a Hyperas optimization for a regression problem with 3 predictors (X) and 2 targets (Y).
After ingesting the raw data, I did this:
X_train, X_val, Y_train, Y_val = train_test_split(X, Y, test_size=0.2, random_state=111)

# Input layer and hidden layers
model = Sequential()
model.add(Dense({{choice([np.power(2,1), np.power(2,2), np.power(2,3), np.power(2,4), np.power(2,5)])}}, input_dim=X_train.shape[1]))
model.add(Activation({{choice(['tanh', 'relu', 'sigmoid'])}}))
model.add(Dropout({{uniform(0, 1)}}))
model.add(Dense({{choice([np.power(2,1), np.power(2,2), np.power(2,3), np.power(2,4), np.power(2,5)])}}))
model.add(Activation({{choice(['tanh', 'relu', 'sigmoid'])}}))
model.add(Dropout({{uniform(0, 1)}}))

# Output layer
model.add(Dense(Y_train.shape[1]))
model.add(Activation('linear'))

model.compile(loss='mae', metrics=['mae'], optimizer=optimizer, loss_weights=[0.6, 0.4])

history = model.fit(X_train, Y_train,
                    batch_size={{choice([16, 32, 64, 128])}},
                    epochs={{choice([20000])}},
                    verbose=2,
                    validation_data=(X_val, Y_val),
                    callbacks=callbacks_list)
However, when running this, it says:
ValueError: When passing a list as loss_weights, it should have one entry per model output. The model has 1 outputs, but you passed loss_weights=[1, 1]
I'm guessing it's due to the format of my inputs and outputs; however, I can't figure out the proper format in which I am supposed to feed them into the model.
Appreciate your advice please, thank you.
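One hedged sketch of a workaround, assuming the goal is to weight the two targets differently: a single Dense(2) layer counts as one model output, so Keras rejects a two-entry loss_weights list. Splitting Y into two single-unit outputs with the functional API makes the list valid (the layer size and the adam optimizer below are placeholders, not the original Hyperas choices; use Y_train.values first if Y_train is a DataFrame):
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(X_train.shape[1],))
hidden = Dense(32, activation='relu')(inputs)                    # placeholder hidden layer
out_a = Dense(1, activation='linear', name='target_a')(hidden)   # first target
out_b = Dense(1, activation='linear', name='target_b')(hidden)   # second target

model = Model(inputs=inputs, outputs=[out_a, out_b])
# two outputs, so a two-entry loss_weights list is now valid
model.compile(loss='mae', optimizer='adam', loss_weights=[0.6, 0.4])
model.fit(X_train, [Y_train[:, 0], Y_train[:, 1]],
          validation_data=(X_val, [Y_val[:, 0], Y_val[:, 1]]))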

Related

What is the best approach to build a model with multiple targets on the Y set?

I need to create a model to predict multiple labels based on sixteen (16) input features. The dataset has 4486 instances; each instance has a different number of labels, drawn from 48 distinct labels.
This is how the data looks:
[image: X data example]
[image: Y data example]
The challenge is to predict the labels of a new instance. I know the learning is the problem: the imbalance in the number of labels makes the learning a bit difficult.
I would appreciate comments and advice on how to tackle this issue.
My best result is 30% accuracy, but I've noticed the model sometimes predicts the same labels for everything and has not given any satisfactory results so far.
This is the model I've implemented:
import math
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
n_inputs, n_outputs = X_train.shape[1], y_train.shape[1]
# geometric-mean heuristic for the hidden-layer size; Dense needs an int
nodes = int(math.sqrt(n_inputs * n_outputs))

model = Sequential()
model.add(Dense(nodes, input_dim=n_inputs, activation='relu'))
model.add(Flatten())  # no-op on 2-D input, kept from the original
# one sigmoid unit per label: independent probabilities for multi-label output
model.add(Dense(n_outputs, activation='sigmoid'))
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.AUC(), 'accuracy'])
history = model.fit(X_train, y_train, epochs=300, verbose=1, shuffle=True,
                    validation_data=(X_test, y_test), batch_size=8)
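One hedged idea for the label imbalance described above is to up-weight rare positive labels in the loss. A minimal sketch, assuming y_train is a binary indicator matrix of shape (samples, 48); the weighting scheme is illustrative, not from the original post:
import numpy as np
import tensorflow as tf

# per-label weight for positives: rarer labels get a larger weight
pos_counts = y_train.sum(axis=0)
pos_weight = tf.constant((len(y_train) - pos_counts) / np.maximum(pos_counts, 1),
                         dtype=tf.float32)

def weighted_bce(y_true, y_pred):
    # clip to avoid log(0); standard BCE with the positive term scaled per label
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
    loss = -(pos_weight * y_true * tf.math.log(y_pred)
             + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    return tf.reduce_mean(loss)

model.compile(optimizer='adam', loss=weighted_bce,
              metrics=[tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.AUC()])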

RNN model predicting only one class?

I am trying to use GloVe embeddings to train an RNN model based on this article.
I have labeled data: text (tweets) in one column and labels in another (hate, offensive, or neither).
However, the model seems to predict only one class.
This is the LSTM model:
model = Sequential()
hidden_layer = 3
gru_node = 32
# model embedding matrix here....
for i in range(0, hidden_layer):
    model.add(GRU(gru_node, return_sequences=True, recurrent_dropout=0.2))
    model.add(Dropout(dropout))
model.add(GRU(gru_node, recurrent_dropout=0.2))
model.add(Dropout(dropout))
model.add(Dense(64, activation='softmax'))
model.add(Dense(nclasses, activation='softmax'))
start = time.time()
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Fitting the model:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
X_train_Glove, X_test_Glove, word_index, embeddings_index = loadData_Tokenizer(X_train, X_test)
model_RNN = Build_Model_RNN_Text(word_index, embeddings_index, 20)
model_RNN.fit(X_train_Glove, y_train,
              validation_data=(X_test_Glove, y_test),
              epochs=4,
              batch_size=128,
              verbose=2)
y_preds = model_RNN.predict_classes(X_test_Glove)
print(metrics.classification_report(y_test, y_preds))
Results:
[image: classification report]
[image: confusion matrix]
Am I missing something here?
Update: this is what the distribution looks like:
[image: label distribution]
and the model summary, more or less:
[image: model summary]
What does the distribution of your data look like? The first suggestion is to stratify the train/test split (see the stratify parameter of train_test_split in the scikit-learn documentation).
The second question is: how much data do you have relative to the complexity of the model? Maybe your model is so complex that it simply overfits. You can use model.summary() to see the number of trainable parameters.
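A minimal sketch of the stratified split, assuming y holds the class labels:
from sklearn.model_selection import train_test_split

# stratify=y keeps the class proportions the same in the train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1, stratify=y)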

Keras LSTM - Input shape for time series prediction

I am trying to predict the output of a function (eventually it will be multi-input multi-output), but for now, just to get the mechanics right, I am trying to predict the output of the sin function. My dataset is as follows:
t0 t1
0 0.000000 0.125333
1 0.125333 0.248690
2 0.248690 0.368125
3 0.368125 0.481754
4 0.481754 0.587785
5 0.587785 0.684547
6 0.684547 0.770513
7 0.770513 0.844328
8 0.844328 0.904827
9 0.904827 0.951057
.....
There are 100 values in total. t0 is the current input; t1 is the next output I want to predict. The data is then split into train/test via scikit-learn:
x_train, x_test, y_train, y_test = train_test_split(wave["t0"].values, wave["t1"].values, test_size=0.20)
The problem happens in fit: I get an error saying the input has the wrong dimensions.
model = Sequential()
model.add(LSTM(128, input_shape=???, stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
          batch_size=10, epochs=100,
          validation_data=(x_test, y_test))
I've tried the fixes from other questions on the site, but no matter what I try I cannot get Keras to accept the input.
The LSTM expects input data of shape (batch_size, time_steps, num_features). For sine-wave prediction, num_features is 1, and time_steps is how many previous time points the LSTM should use for prediction. In the example below, batch_size is 1, time_steps is 2, and num_features is 1.
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# dummy data with the expected shapes: (batch, time_steps, features) and (batch, 1)
x_train = np.ones((1, 2, 1))
y_train = np.ones((1, 1))
x_test = np.ones((1, 2, 1))
y_test = np.ones((1, 1))

model = Sequential()
model.add(LSTM(128, input_shape=(2, 1)))
# for stateful:
# model.add(LSTM(128, batch_input_shape=(1, 2, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train,
          batch_size=1, epochs=100,
          validation_data=(x_test, y_test))
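To apply this to the actual series, the 1-D column has to be windowed into that 3-D shape. A rough sketch, assuming wave is the DataFrame from the question:
import numpy as np

time_steps = 2
series = wave["t0"].values  # the 1-D input series
# sliding windows of length time_steps, each paired with the next value
x = np.array([series[i:i + time_steps] for i in range(len(series) - time_steps)])
y = series[time_steps:]
x = x.reshape(-1, time_steps, 1)  # (samples, time_steps, num_features)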

python - returning nan when trying to predict with Keras

For a school project, I'm trying to predict data using the Keras framework, but the model returns 'nan' loss and 'nan' values when I try to get predictions.
Source code :
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)
# create model
model = Sequential()
model.add(Dense(950, input_shape=(425,), activation='relu'))
model.add(Dense(425, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile model
sgd = optimizers.SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer='sgd')
# Fit the model
model.fit(X_train, y_train, epochs=20, batch_size=1, verbose=1)
#evaluate the model
y_pred = model.predict(X_test)
score = model.evaluate(X_test, y_test,verbose=1)
print(score)
# calculate predictions
predictions = model.predict(X_pred)
Data :
X_train and X_test are pandas DataFrames of 5000 rows (number of samples) × 425 columns (number of features).
y_train and y_test look like :
array([ 1.17899644, 1.46080518, 0.9662137 , ..., 2.40157461,
0.53870386, 1.3192718 ])
Can you help me with that?
Thank you for your help!
Usually, this means that something is diverging to infinity. As @desertnaut pointed out in the comments, reducing the learning rate might help.
But the root of the issue is your input data. What do these 425 data points mean? Are they from different sources, different features, different parameters? Finding outliers or normalizing the data could help.
Your code looks fine otherwise.
Make sure your target output is in the range (0, 1), as you have a sigmoid in the last layer.
A sigmoid outputs values between zero and one, so if the target output is not in this range, then (a) change the activation function or (b) normalize the outputs into the required range.
Make sure this model is actually intended for regression.
After considering the above three points, play around with the learning rate (decrease it) and the optimizer (replace it with another).
Try changing your optimizer to Adam instead of SGD.
You initialized your SGD optimizer in the variable sgd, but you're not using it in compile: you passed the string 'sgd' (Keras defaults) instead. A sketch combining these fixes is below.
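A minimal sketch of those fixes, assuming the task really is regression on unbounded targets (values such as 2.40 in the y arrays above fall outside sigmoid's range):
from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers

model = Sequential()
model.add(Dense(950, input_shape=(425,), activation='relu'))
model.add(Dense(425, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(50, activation='relu'))
# linear head: unbounded regression targets cannot be produced by a sigmoid
model.add(Dense(1, activation='linear'))

sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)  # reduced learning rate
model.compile(loss='mean_squared_error', optimizer=sgd)  # pass the instance, not the string 'sgd'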

Why does my model work OK with test data from train_test_split but not with new data?

I am new to machine learning.
I have a continuous dataset and am trying to model the target label using several features. I use the train_test_split function to separate the train and test data. I am training and testing the model using the code below:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = Sequential()
model.add(Dense(128, input_dim=X.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
# compile step was missing from the snippet; a regression loss is assumed
model.compile(loss='mean_squared_error', optimizer='adam')
hist = model.fit(X_train.values, y_train.values, validation_data=(X_test.values, y_test.values),
                 epochs=200, batch_size=64, verbose=1)
I can get good results when I use X_test and y_test for validation data:
https://drive.google.com/open?id=0B-9aw4q1sDcgNWt5TDhBNVZjWmc
However, when I use this model to predict on other data (X_real, y_real), which are not so different from X_test and y_test except that they were not randomly chosen by train_test_split, I get bad results:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = Sequential()
model.add(Dense(128, input_dim=X.shape[1], kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal'))
# compile step was missing from the snippet; a regression loss is assumed
model.compile(loss='mean_squared_error', optimizer='adam')
hist = model.fit(X_train.values, y_train.values, validation_data=(X_real.values, y_real.values),
                 epochs=200, batch_size=64, verbose=1)
https://drive.google.com/open?id=0B-9aw4q1sDcgYWFZRU9EYzVKRFk
Is it an issue of overfitting? If so, why does my model work OK with the X_test and y_test generated by train_test_split?
It seems that your "real data" differs from your train and test data.
Why do you have separate "real" and "training" data in the first place?
My approach would be:
1: Mix up all the data you have
2: Divide your data randomly into 3 sets (train, test, and validate); see the sketch after this answer
3: Use train and test as you do now and optimize your classifier
4: When it's good enough, validate the classifier with your validation set to make sure no overfitting occurs
If you have little data, I would suggest trying a different algorithm; neural networks generally need a lot of data to get the weights right.
Also, your real data doesn't seem to come from the same distribution as the train and test data. Don't keep anything hidden: shuffle everything and use train/validation/test splits.
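A minimal sketch of that mix-and-split approach, assuming X_all and y_all hold the pooled, shuffled data (the names are placeholders):
from sklearn.model_selection import train_test_split

# 60/20/20 train/test/validate split of the pooled data
X_rest, X_valid, y_rest, y_valid = train_test_split(X_all, y_all, test_size=0.2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)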
