I am not sure where I went wrong in this code. My goal is to train a binary classifier on my dataset using LSTM and GRU layers.
[The model summary shows a module wrapper and the GRU does not appear to execute; please check the image][1]
```
#BUILD THE MODEL
top_words = 10000
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=X.shape[1]))
#model.add(Dropout(0.2))
model.add(GRU(100,dropout=0.2, recurrent_dropout=0.2, return_sequences=True))
model.add(LSTM(100,dropout=0.2, recurrent_dropout=0.2))
#model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='Adam',
metrics=['accuracy'])
print(model.summary())
model.summary()
```
[1]: https://i.stack.imgur.com/14pyl.jpg
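One way to check from `model.summary()` whether the GRU is really part of the model, whatever name its row carries, is to compare the reported parameter counts with hand-computed ones. The sketch below uses the usual parameter formulas for these layer types. The helper names are hypothetical, and it assumes the GRU uses `reset_after=True` (the default in TensorFlow 2 Keras, as far as I know; older Keras versions defaulted to `False`). A row labelled `module_wrapper` is commonly reported when layers are imported from `keras` while `Sequential` comes from `tensorflow.keras` (or the reverse). That is only a guess here, since the imports are not shown.

```python
def embedding_params(vocab_size, embed_dim):
    """One embedding vector per vocabulary entry."""
    return vocab_size * embed_dim


def gru_params(input_dim, units, reset_after=True):
    """Three gates; reset_after=True keeps separate input and recurrent biases."""
    biases = 2 if reset_after else 1
    return 3 * units * (input_dim + units + biases)


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and one bias."""
    return 4 * units * (input_dim + units + 1)


def dense_params(input_dim, units):
    """Weight matrix plus one bias per unit."""
    return units * (input_dim + 1)


# Layer sizes taken from the code in the question.
expected = {
    "embedding": embedding_params(10000, 32),
    "gru": gru_params(32, 100),
    "lstm": lstm_params(100, 100),
    "dense_1": dense_params(100, 1),
    "dense_2": dense_params(1, 1),
}

for name, count in expected.items():
    print(f"{name}: {count}")
```

If the second row of the summary shows the GRU count (40200 under the assumption above, 39900 with `reset_after=False`), the layer is in the model even if its name looks unfamiliar.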
I have the following code for training a model on some numeric data:
from numpy import loadtxt
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from time import sleep
dataset = loadtxt("data.csv", delimiter=",")
X = dataset[:,0:2]
y = dataset[:,2]
model = Sequential()
model.add(Dense(196, input_dim=2, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=600, batch_size=10)
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
For reference, here is some of the data it is being presented with:
433,866,1299,1732
421,842,1263,1684
443,886,1329,1772
142,284,426,568
437,874,1311,1748
455,910,1365,1820
172,344,516,688
219,438,657,876
101,202,303,404
289,578,867,1156
110,220,330,440
421,842,1263,1684
472,944,1416,1888
121,242,363,484
215,430,645,860
134,268,402,536
488,976,1464,1952
467,934,1401,1868
418,836,1254,1672
134,268,402,536
241,482,723,964
116,232,348,464
395,790,1185,1580
438,876,1314,1752
396,792,1188,1584
57,114,171,228
218,436,654,872
372,744,1116,1488
305,610,915,1220
462,924,1386,1848
455,910,1365,1820
42,84,126,168
347,694,1041,1388
394,788,1182,1576
184,368,552,736
302,604,906,1208
326,652,978,1304
333,666,999,1332
335,670,1005,1340
176,352,528,704
168,336,504,672
62,124,186,248
26,52,78,104
335,670,1005,1340
(The first three numbers should be the inputs and the last one the output.)
The Keras program keeps training but only ever reports an accuracy of 0. What am I doing wrong?
As discussed in the comments, this is a regression problem (not classification), so we can use, for example, MSE (mean squared error) as the loss function and change the activation of the last layer to linear. The inputs also need to cover the first three columns, with the fourth as the target:
X = dataset[:,0:3]
y = dataset[:,3]
model = Sequential()
model.add(Dense(196, input_dim=3, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mse', optimizer='adam')
model.fit(X, y, epochs=600, batch_size=10)
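To see why the original setup reported an accuracy of 0, here is a small plain-Python sketch. It assumes that `metrics=['accuracy']` combined with `binary_crossentropy` is evaluated as binary accuracy, i.e. the prediction is thresholded at 0.5 and compared with the target. The `binary_accuracy` helper is a hypothetical stand-in for that metric, not the Keras implementation, and the pre-activation values are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of thresholded predictions that equal the target exactly."""
    hits = 0
    for target, prob in zip(y_true, y_pred):
        predicted_class = 1.0 if prob > threshold else 0.0
        if predicted_class == target:
            hits += 1
    return hits / len(y_true)


# Fourth column of the first sample rows shown in the question.
y_true = [1732.0, 1684.0, 1772.0, 568.0]

# Whatever the last layer computes, a sigmoid squashes it into [0, 1].
y_pred = [sigmoid(z) for z in (-3.0, 0.0, 2.5, 50.0)]

print(binary_accuracy(y_true, y_pred))  # 0.0
```

A thresholded sigmoid output is always 0 or 1, so it can never equal a target such as 1732. For regression it is more informative to watch the loss itself (or a metric such as mean absolute error) than accuracy.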
I have built a NN for classification, but when I try to fit it I run into problems with the dimensions of the input and output:
from keras.models import Sequential
from keras.layers import Dense
# data split into input (X) and output (y) variables
model = Sequential()
model.add(Dense(12, input_dim=456, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(8, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
Here are the dimensions of my y and X:
print(y.shape, X.shape)
(8000, 1) (8000, 456, 3)
I have 8000 subsets, each containing 456 particles (x, y, z),
and labels in y ranging from 0 to 7, which is why my output layer has 8 nodes.
But when I fit with
model.fit(X, y, epochs=15, batch_size=10)
I do not get why this error occurs:
ValueError: Error when checking input: expected dense_26_input to have 2 dimensions, but got array with shape (8000, 456, 3)
Any suggestions?
To answer your question, you can achieve what you want with:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(12, input_shape=(456,3), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(8, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
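It may help to trace the shapes here. `Dense` only transforms the last axis of its input, so with `input_shape=(456, 3)` every layer keeps the 456 axis and the model emits 8 values per particle rather than 8 per sample. The sketch below reproduces that bookkeeping in plain Python. `dense_shape` and `flatten_shape` are hypothetical helpers written for illustration, not Keras functions.

```python
def dense_shape(input_shape, units):
    """Dense acts on the last axis only; all leading axes pass through."""
    return input_shape[:-1] + (units,)


def flatten_shape(input_shape):
    """Flatten collapses every axis after the batch axis into one."""
    size = 1
    for dim in input_shape[1:]:
        size *= dim
    return (input_shape[0], size)


batch = 8000

# Three stacked Dense layers applied directly to the 3-D input.
shape = (batch, 456, 3)
for units in (12, 8, 8):
    shape = dense_shape(shape, units)
print(shape)  # (8000, 456, 8)

# Same stack with a Flatten after the first Dense layer.
flat = dense_shape((batch, 456, 3), 12)
flat = flatten_shape(flat)
for units in (8, 8):
    flat = dense_shape(flat, units)
print(flat)  # (8000, 8)
```

The first result does not line up with one label per sample, which is why flattening before the final layers matters.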
Edit:
I think what you're looking for is this type of architecture:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten
model = Sequential()
model.add(Dense(12, input_shape=(456,3), activation='relu'))
model.add(Flatten())
model.add(Dense(8, activation='relu'))
model.add(Dense(8, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
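That way the model outputs only the 8 class probabilities. Note that `categorical_crossentropy` expects one-hot encoded labels of shape (8000, 8) rather than integer labels of shape (8000, 1).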
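Since the labels in the question are integers from 0 to 7 with shape (8000, 1), they need converting before a softmax plus `categorical_crossentropy` model can train on them. Below is a minimal plain-Python sketch of that conversion. `to_one_hot` is a hypothetical helper and the three labels are made up. In practice `to_categorical` from `tensorflow.keras.utils` does the same job, and switching the loss to `sparse_categorical_crossentropy` avoids the conversion altogether.

```python
def to_one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0 at the class index."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[int(label)] = 1.0
        rows.append(row)
    return rows


# Made-up labels laid out like the (n, 1) array in the question.
y = [[0], [3], [7]]

# Drop the trailing axis, then expand each label to 8 columns.
flat_labels = [row[0] for row in y]
y_one_hot = to_one_hot(flat_labels, num_classes=8)

for row in y_one_hot:
    print(row)
```

Each row now has the same length as the softmax output, so the loss can compare them element by element.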
I am training a simple neural network in Keras with the Theano backend, consisting of 4 dense layers connected to a Merge layer and then to a softmax classifier layer. Using Adam for training, the first few epochs take about 60s each (on the CPU), but after that the training time per epoch starts increasing, reaching more than 400s by epoch 70, which makes it unusable.
Is there anything wrong with my code or is this supposed to happen?
This only happens when using Adam, not with sgd, adadelta, rmsprop or adagrad. I'd use any of the other methods but Adam produces far better results.
The code:
modela = Sequential()
modela.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelb = Sequential()
modelb.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modelc = Sequential()
modelc.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
modeld = Sequential()
modeld.add(Dense(700, input_dim=40, init='uniform', activation='relu'))
model = Sequential()
model.add(Merge([modela, modelb, modelc, modeld], mode='concat', concat_axis=1))
model.add(Dense(258, init='uniform', activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
hist = model.fit([Xa, Xb, Xc, Xd], Ycat, validation_split=.25, nb_epoch=80, batch_size=100, verbose=2)
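For what it is worth, nothing in the Adam update rule itself should make later epochs more expensive: each step does a fixed amount of arithmetic per parameter and keeps two extra buffers (first and second moment estimates). The plain-Python sketch below illustrates that, using the default hyperparameters from the Adam paper. `adam_step` is a hypothetical helper, not Keras or Theano code, so it says nothing about where a slowdown inside the backend might come from.

```python
import math


def adam_step(params, grads, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update in place and return the new step count."""
    t += 1
    for i, g in enumerate(grads):
        m[i] = beta1 * m[i] + (1 - beta1) * g       # first moment estimate
        v[i] = beta2 * v[i] + (1 - beta2) * g * g   # second moment estimate
        m_hat = m[i] / (1 - beta1 ** t)             # bias correction
        v_hat = v[i] / (1 - beta2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return t


# Toy problem: minimise (x - 3)^2, whose gradient is 2 * (x - 3).
x = [0.0]
m_buf = [0.0]
v_buf = [0.0]
step = 0
for _ in range(200):
    step = adam_step(x, [2 * (x[0] - 3)], m_buf, v_buf, step, lr=0.1)

print(step, x[0])
```

The work per call is the same at step 1 and at step 200, and the buffers never grow, so a steady increase in epoch time points at something outside the update rule.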
I'm trying to fit a simple neural network to predict a binary target using keras-1.0.6. The output saturates after the very first epoch. I tried playing around with the learning rate (from 0.1 to 1e-6), the decay and momentum of the SGD optimizer, and the network's layers (10-512 hidden neurons and 1-2 hidden layers) and their activation functions, but nothing worked: the prediction accuracy stayed the same.
My training set has shape (13602, 115) and my validation set has shape (3400, 115). The target variables y_train and y_test have values encoded as 1 and 0 (60% are 1's and 40% are 0's). At first the data was not normalized, but normalizing it gave the same results.
Verifying the output, I see that the model is predicting only 1 class. Sometimes it predicts only 1's and other times only 0's (when I tweak the model).
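With a 60/40 class split, a model that always outputs the same class already scores 60% or 40%, which would explain an accuracy that does not move between experiments. Here is a small sketch for checking this, using made-up labels and probabilities with the same ratio (not the real data). Both helper names are hypothetical.

```python
from collections import Counter


def constant_accuracy(y_true, predicted_class):
    """Accuracy obtained by predicting the same class for every sample."""
    return sum(1 for t in y_true if t == predicted_class) / len(y_true)


def predicted_class_counts(probabilities, threshold=0.5):
    """Count how many samples fall on each side of the decision threshold."""
    return Counter(int(p > threshold) for p in probabilities)


# Hypothetical labels with the 60/40 ratio described in the question.
y_true = [1] * 6 + [0] * 4
print(constant_accuracy(y_true, 1))  # 0.6
print(constant_accuracy(y_true, 0))  # 0.4

# Hypothetical sigmoid outputs from a saturated model: all on one side of 0.5.
probs = [0.61, 0.60, 0.62, 0.61, 0.60, 0.61, 0.62, 0.60, 0.61, 0.60]
print(predicted_class_counts(probs))  # Counter({1: 10})
```

Looking at the raw probabilities rather than the thresholded classes also shows whether the outputs are truly identical or merely all on the same side of 0.5.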
I also tried to encode the target variable in the shape (n_sample, 2) but the output was the same.
I followed some questions here and some Google results that suggest tuning the learning rate and not using 'softmax' activation, but couldn't improve the results.
Some of the models I tried are below:
The simplest model:
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
Model 2:
model = Sequential()
model.add(Dense(512, input_dim=X_train.shape[1]))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation('sigmoid'))
Model 3:
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
Model 4:
model.add(Dense(64, input_dim=X_train.shape[1], init='uniform', activation='sigmoid'))
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
and to compile and fit the model:
sgd = SGD(lr=0.01, decay=0.1, momentum=0.0, nesterov=True)
model.compile(optimizer=sgd, loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train2, nb_epoch=5, batch_size=50, validation_split=0.2)
model.predict(X_test)
The output gives either [0,0,0,0,0,0,0,...] or [1,1,1,1,1,1,1,1,...]
Does anybody have a clue on what's going on here?
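One thing worth checking is the `decay=0.1` argument. As far as I understand the legacy Keras SGD optimizer, the decay is applied per batch update as `lr / (1 + decay * iterations)`, so a value of 0.1 shrinks the learning rate very quickly. The sketch below works through the numbers from the question. `sgd_decayed_lr` is a hypothetical helper mirroring that formula, and the way the training rows are counted after `validation_split` is an assumption about how Keras splits the data.

```python
import math


def sgd_decayed_lr(base_lr, decay, iterations):
    """Time-based decay, applied once per batch update."""
    return base_lr / (1.0 + decay * iterations)


# 13602 training rows, validation_split=0.2, batch_size=50 (from the question).
train_rows = int(13602 * (1 - 0.2))
updates_per_epoch = math.ceil(train_rows / 50)
print(train_rows, updates_per_epoch)  # 10881 218

# Effective learning rate at the end of each of the 5 epochs.
for epoch in range(1, 6):
    lr = sgd_decayed_lr(0.01, 0.1, epoch * updates_per_epoch)
    print(f"after epoch {epoch}: lr = {lr:.6f}")
```

Under these assumptions the learning rate falls from 0.01 to roughly 0.0004 within the first epoch, which could leave the weights close to their initial values. Setting `decay` to a much smaller value (or to 0) would be a cheap experiment.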