I used doc2vec feature vectors to train an MLP, but the validation accuracy is always 0 in every epoch.
The MLP model is:
def MySimpleMLP(X_train=None):
    lengths = sorted([len(X) for X in X_train])
    percentile = 0.90
    seq_cutoff = lengths[int(len(lengths) * percentile)]
    vocab = 50
    N = 256
    size = 3
    seq_indices = Input(shape=(seq_cutoff,), name='seq_input')
    seq_embedded = Embedding(input_dim=vocab + 1, output_dim=EMBEDDING_DIM,
    seq_conv = Conv1D(N, size, activation='relu')(Dropout(0.2)(seq_embedded))
    max_conv = GlobalMaxPooling1D()(seq_conv)
    hidden_repr = Dense(N, activation='relu')(max_conv)
    sentiment = Dense(1, activation='sigmoid')(Dropout(0.2)(hidden_repr))
    model = Model(inputs=[seq_indices], outputs=[sentiment])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
The training data is:
Doc2VecArrayFilePath = "../data/mpk/tools/Doc2VecArray.pkl"
with open(Doc2VecArrayFilePath, "rb+") as f:
    X_train, Y_train = pickle.load(f)
X_train has shape (25000, 50):
[[ 1.54979062 0.99996233 0.10931063 ... -1.12303877 -1.30322146
[ 0.90919989 -1.39264524 -1.69380188 ... -0.35270166 1.00891471
[-0.66494519 0.76236057 1.37783039 ... 0.69574219 1.99134898
[ 1.08792138 0.00841406 -0.27354664 ... -0.18176237 0.76443428
[ 0.78027207 -0.80181849 -1.21321726 ... -0.14031847 0.55475223
[ 0.59591568 -0.57823026 -0.91873246 ... -0.22376266 1.16658998
The MLP training results are:
Epoch 2/10
- 2s - loss: 0.6586 - accuracy: 0.6250 - val_loss: 0.9066 - val_accuracy: 0.0000e+00
Epoch 3/10
- 2s - loss: 0.6588 - accuracy: 0.6250 - val_loss: 0.8222 - val_accuracy: 0.0000e+00
Epoch 4/10
- 2s - loss: 0.6582 - accuracy: 0.6250 - val_loss: 0.9356 - val_accuracy: 0.0000e+00
Epoch 5/10
- 2s - loss: 0.6563 - accuracy: 0.6250 - val_loss: 0.8692 - val_accuracy: 0.0000e+00
The rest of the code is:
mlp = MySimpleMLP(X_train=X_train)
mlp.fit(np.array(X_train, dtype='int32'), Y_train, validation_split=0.2, epochs=10, batch_size=64, verbose=2)
How can I modify the MLP model so that it accepts doc2vec input and reaches good accuracy? Any help is appreciated.
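A hedged sketch of two things in the code above that may explain the log, using only NumPy and synthetic stand-in data. The array names, and the assumption that the pickle stores all samples of one class first, are hypothetical and not taken from the post. First, casting real-valued doc2vec vectors to int32 truncates them toward zero, and an Embedding layer expects non-negative integer token indices rather than dense vectors. Second, Keras takes the validation_split samples from the end of the arrays before shuffling, so label-sorted data yields a single-class validation set; a 12500/7500 class mix in the remaining 20000 training rows would give exactly the 0.6250 training accuracy of a constant predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the doc2vec matrix: 25000 documents, 50 real-valued features.
X = rng.normal(size=(25000, 50)).astype("float32")
# Assumption: labels are stored sorted, all 0s first and then all 1s.
y = np.concatenate([np.zeros(12500, dtype="int64"), np.ones(12500, dtype="int64")])

# 1) Casting to int32 truncates toward zero and destroys most of the signal.
X_int = X.astype("int32")
frac_zero = float(np.mean(X_int == 0))   # every value in (-1, 1) becomes 0
has_negative = bool((X_int < 0).any())   # negative values are not valid Embedding indices

# 2) validation_split=0.2 uses the LAST 20% of the rows, taken before shuffling.
n_val = int(len(X) * 0.2)
val_labels_sorted = y[-n_val:]
train_labels_sorted = y[:-n_val]
majority_acc = float(np.mean(train_labels_sorted == 0))  # constant "0" predictor

# Shuffling features and labels together before fitting mixes the classes.
perm = rng.permutation(len(X))
X_shuf, y_shuf = X[perm], y[perm]
val_labels_shuffled = y_shuf[-n_val:]

print("fraction truncated to 0:", round(frac_zero, 3))
print("validation classes, sorted data:", np.unique(val_labels_sorted))
print("training accuracy of a constant predictor:", majority_acc)
print("validation classes, shuffled data:", np.unique(val_labels_shuffled))
```

If the vectors are kept as floats, a plain stack of Dense layers on the 50-dimensional input is probably a more natural fit for doc2vec features than an Embedding followed by Conv1D.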
I'm trying to create a small transformer model with Keras to model stock prices, based off of this tutorial from the Keras docs. The problem is, my test loss is massive and barely changes between epochs, unsurprisingly resulting in severe underfitting, with my outputs all the same arbitrary value.
My code is below:
def transformer_encoder_block(inputs, head_size, num_heads, filters, dropout=0):
    # Normalization and Attention
    x = layers.LayerNormalization(epsilon=1e-6)(inputs)
    x = layers.MultiHeadAttention(
        key_dim=head_size, num_heads=num_heads, dropout=dropout
    )(x, x)
    x = layers.Dropout(dropout)(x)
    res = x + inputs
    # Feed Forward Part
    x = layers.LayerNormalization(epsilon=1e-6)(res)
    x = layers.Conv1D(filters=filters, kernel_size=1, activation="relu")(x)
    x = layers.Dropout(dropout)(x)
    x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x)
    return x + res
data = ...
input = np.array(
    keras.preprocessing.sequence.pad_sequences(data["input"], padding="pre", dtype="float32"))
output = np.array(
    keras.preprocessing.sequence.pad_sequences(data["output"], padding="pre", dtype="float32"))
# Input shape: (723, 36, 22)
# Output shape: (723, 36, 1)
# Train data
train_features = input[100:]
train_labels = output[100:]
train_labels = tf.keras.utils.to_categorical(train_labels, num_classes=3)
# Test data
test_features = input[:100]
test_labels = output[:100]
test_labels = tf.keras.utils.to_categorical(test_labels, num_classes=3)
inputs = keras.Input(shape=(None,22), dtype="float32", name="inputs")
# Ignore padding in inputs
x = layers.Masking(mask_value=0)(inputs)
x = transformer_encoder_block(x, head_size=64, num_heads=16, filters=3, dropout=0.2)
# Multiclass = Softmax (decrease, no change, increase)
outputs = layers.TimeDistributed(layers.Dense(3, activation="softmax", name="outputs"))(x)
# Create model
model = keras.Model(inputs=inputs, outputs=outputs)
# Compile model
model.compile(loss="categorical_crossentropy", optimizer=(tf.keras.optimizers.Adam(learning_rate=0.005)), metrics=['accuracy'])
# Train model
history = model.fit(train_features, train_labels, epochs=10, batch_size=32)
# Evaluate on the test data
test_loss = model.evaluate(test_features, test_labels, verbose=0)
print("Test loss:", test_loss)
out = model.predict(test_features)
After padding, input has shape (723, 36, 22) and output has shape (723, 36, 1) (before converting output to one-hot, after which there are 3 output classes).
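For reference, a NumPy-only sketch of what the one-hot step does to the label shape. It assumes the labels are integer class ids 0, 1 and 2 (the post does not show the actual encoding), and the helper name one_hot is hypothetical. It also shows that pre-padding with 0 makes padded label steps indistinguishable from class 0, which could be worth checking here.

```python
import numpy as np

def one_hot(labels, num_classes):
    """One-hot encode integer labels of shape (..., 1) into (..., num_classes)."""
    labels = np.asarray(labels).astype("int64")
    if labels.shape[-1] == 1:
        labels = labels[..., 0]          # drop the trailing singleton axis
    return np.eye(num_classes, dtype="float32")[labels]

# Two hypothetical label sequences of different length, class ids in {0, 1, 2}.
seqs = [np.array([2, 1, 0, 2]), np.array([1, 2])]
max_len = max(len(s) for s in seqs)

# Pre-pad with 0, mirroring padding="pre", and keep a mask of the real steps.
padded = np.zeros((len(seqs), max_len, 1), dtype="float32")
mask = np.zeros((len(seqs), max_len), dtype=bool)
for i, s in enumerate(seqs):
    padded[i, max_len - len(s):, 0] = s
    mask[i, max_len - len(s):] = True

encoded = one_hot(padded, num_classes=3)
print(encoded.shape)   # (2, 4, 3)
print(encoded[1, 0])   # a padded step, encoded exactly like class 0
print(mask)            # True only where a real label exists
```

A mask like this could be passed on as per-step sample weights so that padded steps do not count toward the loss; whether that changes the result here is untested.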
Here's an example output for ten epochs (trust me, more than ten doesn't make it better):
Epoch 1/10
20/20 [==============================] - 2s 62ms/step - loss: 10.7436 - accuracy: 0.3335
Epoch 2/10
20/20 [==============================] - 1s 62ms/step - loss: 10.7083 - accuracy: 0.3354
Epoch 3/10
20/20 [==============================] - 1s 60ms/step - loss: 10.6555 - accuracy: 0.3392
Epoch 4/10
20/20 [==============================] - 1s 62ms/step - loss: 10.7846 - accuracy: 0.3306
Epoch 5/10
20/20 [==============================] - 1s 60ms/step - loss: 10.7600 - accuracy: 0.3322
Epoch 6/10
20/20 [==============================] - 1s 59ms/step - loss: 10.7074 - accuracy: 0.3358
Epoch 7/10
20/20 [==============================] - 1s 59ms/step - loss: 10.6569 - accuracy: 0.3385
Epoch 8/10
20/20 [==============================] - 1s 60ms/step - loss: 10.7767 - accuracy: 0.3314
Epoch 9/10
20/20 [==============================] - 1s 61ms/step - loss: 10.7346 - accuracy: 0.3341
Epoch 10/10
20/20 [==============================] - 1s 62ms/step - loss: 10.7093 - accuracy: 0.3354
Test loss: [10.073813438415527, 0.375]
4/4 [==============================] - 0s 22ms/step
Using the same data on a simple LSTM model with the same shape yielded a desirable prediction with a constantly decreasing loss.
Tweaking the learning rate appears to have no effect, nor does stacking more transformer_encoder_block()s.
If anyone has any suggestions for how I can solve this, please let me know.
I have made a model that tries to predict the probability of each piano key being played at a time step, given all the time steps before it. I tried a GRU network with 88 outputs (one for every piano key).
Input shape: (600, 88)
Desired output/label shape: (88,)
import numpy as np
import midi_processer
from keras import models
from keras import layers
x_train, x_test = np.load("samples.npy", mmap_mode='r'), np.load("test_samples.npy", mmap_mode='r')
y_train, y_test = np.load("labels.npy", mmap_mode='r'), np.load("test_labels.npy", mmap_mode='r')
def build_model():
    model = models.Sequential()
    model.add(layers.GRU(512, activation='tanh', recurrent_activation='hard_sigmoid'))
    model.add(layers.Dense(88, activation='sigmoid'))
    return model
x_partial, x_val = x_train[:13000], x_train[13000:]
y_partial, y_val = y_train[:13000], y_train[13000:]
model = build_model()
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(x_partial, y_partial, batch_size = 50, epochs = , validation_data= (x_val,y_val))
Instead of learning normally, my model stays at an almost constant accuracy throughout all of the epochs:
Epoch 1/15
260/260 [==============================] - 998s 4s/step - loss: -0.1851 - accuracy: 0.0298 - val_loss: -8.8735 - val_accuracy: 0.0310
Epoch 2/15
260/260 [==============================] - 827s 3s/step - loss: -33.6520 - accuracy: 0.0382 - val_loss: -56.0122 - val_accuracy: 0.0310
Epoch 3/15
260/260 [==============================] - 844s 3s/step - loss: -78.6130 - accuracy: 0.0382 - val_loss: -98.2798 - val_accuracy: 0.0310
Epoch 4/15
260/260 [==============================] - 906s 3s/step - loss: -121.0963 - accuracy: 0.0382 - val_loss: -139.3440 - val_accuracy: 0.0310
Epoch 5/15
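One hedged observation on the log above: binary cross-entropy cannot be negative when every label lies in [0, 1], so a steadily more negative loss suggests the label arrays contain values above 1 (for example raw MIDI velocities rather than 0/1 key states; that is an assumption, since the post does not show the labels). The NumPy sketch below uses a hypothetical helper, binary_crossentropy, to illustrate the effect.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy, with predictions clipped away from 0 and 1."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))

y_pred = np.array([0.9, 0.1, 0.99, 0.01])

# Labels in {0, 1}: the loss is always >= 0.
binary_labels = np.array([1.0, 0.0, 1.0, 0.0])
loss_ok = binary_crossentropy(binary_labels, y_pred)

# Hypothetical velocity-like labels in 0..127: the loss becomes negative, and it
# keeps falling as the predictions for those keys move closer to 1.
velocity_labels = np.array([100.0, 0.0, 64.0, 0.0])
loss_bad = binary_crossentropy(velocity_labels, y_pred)

# Binarizing the labels (key pressed or not) restores a well-defined loss.
loss_fixed = binary_crossentropy((velocity_labels > 0).astype("float64"), y_pred)

print(loss_ok, loss_bad, loss_fixed)
```

Checking `y_train.min()` and `y_train.max()` on the real arrays would confirm or rule this out quickly.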
I am working on an image classification problem. My training and testing split uses the same random_state, and the model definition is the same. However, when I run the model multiple times, three out of four times it does not learn and the loss does not go down; one out of four times it learns and I get good classification results. I suspect the randomness comes from ImageDataGenerator(), but I cannot figure out how to make the model learn every time.
I have a relatively small labeled dataset and no way to increase its size.
I tried different optimizers and different batch sizes. It doesn't help. I found that when I reduce the number of trainable layers and make the later fully-connected layers smaller (down to 256 units), the model starts to learn every time. But why does a big network not learn well even on the training set? My understanding is that a big model should overfit, so why is it not learning at all in this case?
filenames = os.listdir(r"XXX")
ref_db= pd.read_csv(r"XXX")
ref_db['obj_id']= [str(i)+ '.tif' for i in ref_db.OBJECTID.values ]
ref_db2= ref_db[['label', 'obj_id' ]]
ref_db2['label'] = ref_db2['label'].apply(str)
train_df, validate_df = train_test_split(ref_db2, test_size=0.20, random_state=42)
train_df = train_df.reset_index(drop=True)
validate_df = validate_df.reset_index(drop=True)
total_train = train_df.shape[0]
total_validate = validate_df.shape[0]
train_datagen = ImageDataGenerator(
train_generator = train_datagen.flow_from_dataframe(
inputs= Input(shape=(IMAGE_WIDTH, IMAGE_HEIGHT, 3))
base_model = VGG19(weights='imagenet', include_top=False,)
for layer in base_model.layers[:-3]:
    layer.trainable = False
x = base_model(inputs)
x = Flatten()(x)
x = Dense(1024, activation="relu")(x)
#x = Dropout(0.5)(x)
x = Dense(512, activation="relu")(x)
predictions = Dense(1, activation="sigmoid")(x)
model_vgg= Model(inputs=inputs , outputs=predictions)
model_vgg.compile(optimizer='Adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model_vgg.fit_generator(
This is the unwanted behavior: the model is not learning, all observations are predicted as 1, and the loss is not dropping.
Found 756 validated image filenames belonging to 2 classes.
Found 190 validated image filenames belonging to 2 classes.
Epoch 1/50
- 4s - loss: 4.0464 - acc: 0.6203 - val_loss: 4.9820 - val_acc: 0.6875
Epoch 2/50
- 2s - loss: 4.3811 - acc: 0.7252 - val_loss: 4.8856 - val_acc: 0.6935
Epoch 3/50
- 2s - loss: 5.0209 - acc: 0.6851 - val_loss: 5.3556 - val_acc: 0.6641
Epoch 4/50
- 2s - loss: 4.3583 - acc: 0.7266 - val_loss: 4.1142 - val_acc: 0.7419
Epoch 5/50
- 2s - loss: 4.9317 - acc: 0.6907 - val_loss: 4.7329 - val_acc: 0.7031
Epoch 6/50
- 2s - loss: 4.6275 - acc: 0.7097 - val_loss: 5.3998 - val_acc: 0.6613
Epoch 7/50
This is the expected behavior: the model is learning, both 1 and 0 are predicted, and the loss is dropping.
Found 756 validated image filenames belonging to 2 classes.
Found 190 validated image filenames belonging to 2 classes.
Epoch 1/50
- 4s - loss: 2.1181 - acc: 0.6484 - val_loss: 0.8013 - val_acc: 0.6562
Epoch 2/50
- 2s - loss: 0.6609 - acc: 0.7096 - val_loss: 0.5670 - val_acc: 0.7581
Epoch 3/50
- 2s - loss: 0.6539 - acc: 0.6912 - val_loss: 0.5923 - val_acc: 0.6953
Epoch 4/50
- 2s - loss: 0.5695 - acc: 0.7083 - val_loss: 0.5426 - val_acc: 0.6774
Epoch 5/50
- 2s - loss: 0.5262 - acc: 0.7176 - val_loss: 0.5386 - val_acc: 0.6875
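A hedged reading of the first log: when a sigmoid output saturates at 1 for every sample, each misclassified sample contributes roughly -log(1e-7), about 16, to the clipped binary cross-entropy, so the loss sits near 16 times the error rate and the gradients through the saturated output are close to zero. The sketch below checks that arithmetic with NumPy. The clipping constant 1e-7 is the Keras backend default epsilon as far as I know, and the class fraction is read off the reported accuracy, so treat both as assumptions.

```python
import numpy as np

EPS = 1e-7  # assumed clipping constant (Keras backend default epsilon)

def clipped_bce(y_true, y_pred, eps=EPS):
    """Mean binary cross-entropy with predictions clipped to [eps, 1 - eps]."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))

# Hypothetical label mix: about 70% of samples in class 1, as in the log.
n = 1000
y_true = np.zeros(n)
y_true[:700] = 1.0

# A saturated model predicts exactly 1.0 for every sample.
y_pred = np.ones(n)

loss = clipped_bce(y_true, y_pred)
accuracy = float(np.mean((y_pred > 0.5) == (y_true == 1.0)))
per_error = float(-np.log(EPS))

print("accuracy:", accuracy)                  # 0.7
print("loss:", round(loss, 2))                # close to 0.3 * per_error
print("cost per wrong sample:", round(per_error, 2))
```

The runs that do learn show losses below 1 from the second epoch on, which is consistent with the output staying out of saturation in those runs.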
I am trying to train a neural network to make Inverse Kinematics calculations for a robotic arm with predefined segment lengths. I am not including the segment lengths in neural network inputs but rather through the training data. The training data is a pandas dataframe with the spatial mappings of the arm, with labels being the angles of rotation for the three segments of the arm and the features being the solutions of the x and y coordinates of where the endpoint of the last segment would end up in.
I am using Keras with Theano as the Backend.
model = Sequential([
Dense(3, input_shape=(2,), activation="relu"),
Dense(3, activation="relu"),
model.compile(Adam(lr=0.001), loss='mean_squared_error', metrics=['accuracy'])
model.fit(samples, labels, validation_split=0.2, batch_size=1000, epochs=10,shuffle=True, verbose=1)
score = model.evaluate(samples, labels, batch_size=32, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])
weights = model.get_weights()
predictions = model.predict(samples, verbose=1)
print(predictions)
Train on 6272736 samples, validate on 1568184 samples
Epoch 1/10
- 5s - loss: 10198.7558 - acc: 0.9409 - val_loss: 12149.1703 - val_acc: 0.9858
Epoch 2/10
- 5s - loss: 4272.9105 - acc: 0.9932 - val_loss: 12117.0527 - val_acc: 0.9858
Epoch 3/10
- 5s - loss: 4272.7862 - acc: 0.9932 - val_loss: 12113.3804 - val_acc: 0.9858
Epoch 4/10
- 5s - loss: 4272.7567 - acc: 0.9932 - val_loss: 12050.8211 - val_acc: 0.9858
Epoch 5/10
- 5s - loss: 4272.7271 - acc: 0.9932 - val_loss: 12036.5538 - val_acc: 0.9858
Epoch 6/10
- 5s - loss: 4272.7350 - acc: 0.9932 - val_loss: 12103.8665 - val_acc: 0.9858
Epoch 7/10
- 5s - loss: 4272.7553 - acc: 0.9932 - val_loss: 12175.0442 - val_acc: 0.9858
Epoch 8/10
- 5s - loss: 4272.7282 - acc: 0.9932 - val_loss: 12161.4815 - val_acc: 0.9858
Epoch 9/10
- 5s - loss: 4272.7213 - acc: 0.9932 - val_loss: 12101.4021 - val_acc: 0.9858
Epoch 10/10
- 5s - loss: 4272.7909 - acc: 0.9932 - val_loss: 12152.4966 - val_acc: 0.9858
Test score: 5848.549130022683
Test accuracy: 0.9917127071823204
[[ 59.452095 159.26912 258.94424 ]
[ 59.382706 159.41936 259.25183 ]
[ 59.72419 159.69777 259.48584 ]
[ 59.58721 159.33467 258.9603 ]
[ 59.51745 159.69331 259.62595 ]
[ 59.984367 160.5533 260.7689 ]]
Both the test accuracy and the validation accuracy seem good, but they don't reflect reality. The predictions should have looked something like this:
[[ 0 0 0]
[ 0 0 1]
[ 0 0 2]
[358 358 359]
[358 359 359]
[359 359 359]]
That is what I expected, since I fed back the same features and expected to get the same labels. Instead, I'm getting these numbers for some reason:
[[ 59.452095 159.26912 258.94424 ]
[ 59.382706 159.41936 259.25183 ]
[ 59.72419 159.69777 259.48584 ]
[ 59.58721 159.33467 258.9603 ]
[ 59.51745 159.69331 259.62595 ]
[ 59.984367 160.5533 260.7689 ]]
Thank you for your time.
First of all, your metric is accuracy while you are predicting continuous values. You get predictions, but they don't make any sense. Your problem is a regression and your metric is for classification. You could use MSE, R², or another regression metric instead:
from keras import metrics
model.compile(loss='mse', optimizer='adam', metrics=[metrics.mean_squared_error, metrics.mean_absolute_error])
Additionally, you should consider increasing the number of neurons, and if your input data really has only 2 dimensions, think about shallow models instead of ANNs (e.g. an SVM with a Gaussian kernel).
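To illustrate why the reported accuracy is misleading, here is a NumPy sketch under two assumptions: that Keras resolves 'accuracy' to an argmax comparison when the model has several outputs and a non-crossentropy loss, and that the labels enumerate angle triples in sorted order as in the expected output in the question. The data is synthetic and the variable names are hypothetical.

```python
import numpy as np

# Synthetic labels: all angle triples (a, b, c) with a <= b <= c on a coarse grid.
angles = np.arange(0, 360, 10)
labels = np.array([(a, b, c) for a in angles for b in angles for c in angles
                   if a <= b <= c], dtype="float64")

# A model that ignores its input and always predicts the same triple,
# similar to the near-constant predictions in the question.
predictions = np.tile([59.5, 159.5, 259.5], (len(labels), 1))

# Argmax-style "accuracy": does the largest output sit in the same column
# as the largest label? It says nothing about how close the angles are.
argmax_accuracy = float(np.mean(predictions.argmax(axis=1) == labels.argmax(axis=1)))

# Regression metrics expose the real error.
mae = float(np.mean(np.abs(predictions - labels)))
ss_res = np.sum((labels - predictions) ** 2, axis=0)
ss_tot = np.sum((labels - labels.mean(axis=0)) ** 2, axis=0)
r2 = 1.0 - ss_res / ss_tot   # one R^2 value per output column

print("argmax accuracy:", round(argmax_accuracy, 3))
print("mean absolute error (degrees):", round(mae, 1))
print("R^2 per output:", np.round(r2, 3))
```

A constant prediction scores high on the argmax comparison while its R² is at or below zero for every output, which is the mismatch described above.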
I'm trying to do a simple Keras Neural Network but the model doesn't fit:
Train on 562 samples, validate on 188 samples
Epoch 1/20
562/562 [==============================] - 1s 1ms/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 2/20
562/562 [==============================] - 0s 298us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 3/20
562/562 [==============================] - 0s 295us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 4/20
562/562 [==============================] - 0s 282us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 5/20
562/562 [==============================] - 0s 289us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
Epoch 6/20
562/562 [==============================] - 0s 265us/step - loss: 8.1130 - acc: 0.4911 - val_loss: 7.6320 - val_acc: 0.5213
The database is stored in a CSV file structured like this:
doc  venda   img1    img2   v1                   v2                   gt
RG   venda1  img123  img12  [3399, 162675, ...]  [3399, 162675, ...]  1
My intent is to use the difference between the v1 and v2 vectors to tell whether img1 and img2 are from the same class.
The code:
from sklearn.model_selection import train_test_split
(X_train, X_test, Y_train, Y_test) = train_test_split(train, train_labels, test_size=0.25, random_state=42)
# create the model
model = Sequential()
model.add(Dense(10, activation="relu", input_dim=10, kernel_initializer="uniform"))
model.add(Dense(6, activation="relu", kernel_initializer="uniform"))
model.add(Dense(1, activation='sigmoid'))
validation_data=(np.array(X_test), np.array(Y_test)),
What am I doing wrong?
Divide the difference vector by some constant so that the feature values fall in the range 0 to 1 or -1 to 1. Right now the values are too big and the loss comes out high. A network learns faster when the data is normalized properly.
I have had success normalizing features using this function. I forget exactly why I use the same mu and sigma from the training set on the test and validation sets, but I am pretty sure I learned it in the deeplearning.ai course on Coursera.
def normalize_features(dataset):
    mu = np.mean(dataset, axis=0)    # per-column mean
    sigma = np.std(dataset, axis=0)  # per-column standard deviation
    norm_parameters = {'mu': mu,
                       'sigma': sigma}
    return (dataset - mu) / (sigma + 1e-10), norm_parameters

# Normalize X data, reusing mu and sigma from the training set
x_train, norm_parameters = normalize_features(x_train)
x_val = (x_val - norm_parameters['mu']) / (norm_parameters['sigma'] + 1e-10)
x_test = (x_test - norm_parameters['mu']) / (norm_parameters['sigma'] + 1e-10)
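A self-contained usage sketch of the function above on synthetic data (the array names and value ranges are made up for illustration). The statistics are computed on the training split only and reused for the other splits, so that all splits go through the same transformation and no information from the held-out data leaks into preprocessing.

```python
import numpy as np

def normalize_features(dataset):
    """Standardize columns; return the scaled data and the statistics used."""
    mu = np.mean(dataset, axis=0)
    sigma = np.std(dataset, axis=0)
    return (dataset - mu) / (sigma + 1e-10), {"mu": mu, "sigma": sigma}

rng = np.random.default_rng(42)
# Synthetic features with large raw magnitudes, like unscaled difference vectors.
x_train = rng.uniform(0, 200000, size=(562, 10))
x_test = rng.uniform(0, 200000, size=(188, 10))

x_train_norm, params = normalize_features(x_train)
# Reuse the training statistics; do not recompute them on the test split.
x_test_norm = (x_test - params["mu"]) / (params["sigma"] + 1e-10)

print(x_train_norm.mean(axis=0).round(3))   # approximately 0 per column
print(x_train_norm.std(axis=0).round(3))    # approximately 1 per column
print(x_test_norm.mean(axis=0).round(3))    # near 0, but not exactly
```

The test split ends up close to, but not exactly at, zero mean and unit variance, which is expected because its own statistics were never used.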