keras and sigmoid not predict a right class - python

I'm trying to fit a model that predict a target class that can be: 0, 1, 2, 3
during fitting his val_accuracy is: 1.0
but his prediction is like:
array([[1.2150223e-09]], dtype=float32)
X_train.shape
#(1992, 1, 68)
model = Sequential()
model.add(LSTM(128, input_shape=(1,X_train.shape[2])))
model.add(Dense(128, activation="relu",kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4)))
model.add(Dropout(0.4))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer='adam',loss='mae', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100,batch_size=16, validation_split=0.1, shuffle=True
X_test = np.expand_dims(X_test,1)
y_test = np.expand_dims(y_test,1)
model.evaluate(X_test,y_test)
#[0.0010176461655646563, 1.0]
data = np.expand_dims(data, 1)
model.predict(data) #array([[1.2150223e-09]], dtype=float32) <---- here expected was 0, 1, 2 or 3
data.shape #(1, 1, 68)
I can't undestand what is wrong

Your model has only one output however you have four classes, so you need to change the last Dense layer to model.add(Dense(4, activation="softmax")), sigmoid is usually for binary classification. In your case, there 4 classes, so softmax need to be used. Then you will get probabilities by probabilities = model.predict(data) then use
CATEGORIES[np.argmax(probabilities)] it gives predicted class. By the way, when you want to calculate the loss, use multi-class cross entropy loss: model.compile(.., loss='categorical_crossentropy', ..)
model.add(Dense(4, activation="softmax"))
probabilities = model.predict(data)
print(CATEGORIES[np.argmax(probabilities)])

Related

Why is my multiclass neural model not training (accuracy and loss staying same)?

I am learning neural networks. I get 98% accuracy with classical ML methods, so I think I made a coding error. The neural networks model is not learning.
Things I tried:
Changing X and y to float64 or float32
Normalizing data
Changing the activation to "linear" or "relu"
Removing Flatten()
Adding hidden layers
Using stochastic gradient descent as optimizer, instead of "adam".
Changing the y label with another label
There are 9 labels in X_train and 8 different classes in y_train.
X_train:
y_train:
Code:
model = keras.models.Sequential()
model.add(keras.layers.Input(shape=(9,)))
model.add(keras.layers.Dense(8, activation='softmax'))
model.add(layers.Flatten())
model.compile(optimizer= 'adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
Fitting:
I tried these lines by changing the target label. None of them help training the model. Some give "nan" loss, some go slightly up and down, but all of them are below 0.1% accuracy:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=20, batch_size=24)
or this:
model = tf.keras.Sequential()
model.add(layers.Input(shape=(9,)))
model.add(layers.Dense(3, activation='relu', name='relu1'))
model.add(layers.Dense(16, activation='relu', name='relu2'))
model.add(layers.Dense(16, activation='relu', name='relu3'))
model.add(layers.Dense(1, name='dense1'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history = model.fit(x=X_train, y=y_train, epochs=20)

Preparing Pandas DataFrame for LSTM

I'm trying to fit a LSTM classifier using Keras but don't understand how to prepare the data for training.
I currently have two dataframes for the training data. X_train contains 48 hand-crafted temporal features from IMU data, and y_train contains corresponding labels (4 kinds) representing terrain. The shape of these dataframes is given below:
X_train = X_train.values.reshape(X_train.shape[0],X_train.shape[1],1)
print(X_train.shape, y_train.shape)
**(268320, 48, 1) (268320,)**
Model using batch_size = (32,5,48):
def def_model():
model = Sequential()
model.add(LSTM(units=144,batch_size=(32, 5, 48),return_sequences=True))
model.add(Dropout(0.5))
model.add(Dense(144, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
return model
model_LSTM = def_model()
LSTM_history = model_LSTM.fit(X_train, y_train, epochs=15, validation_data=(X_valid, y_valid), verbose=1)
The error that I am getting:
ValueError: Shapes (32, 1) and (32, 48, 4) are incompatible
Any insight into how to fix this particular error and any intuition into what Keras is expecting?
What is the 5 in your batch size ? The batch_size argument in the LSTM layer indicates that your data should be in the form (batch_size, time_steps, feature_per_time_step). If I am understanding correctly, your data has time_steps = 1 and feature_per_time_step = 48.
Here is a sample of working code and the shape of each of them.
def def_model():
model = Sequential()
model.add(LSTM(units=144,batch_size=(32, 1, 48),return_sequences=True))
model.add(Dropout(0.5))
model.add(Dense(144, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax'))
model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['categorical_accuracy'])
return model
model_LSTM = def_model()
X_train = np.random.random((10000,1,48))
y_train = np.random.random((10000,4))
y_train = y_train.reshape(-1,1,4)
data = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(32)
model_LSTM.fit(data, epochs=15, verbose=1)
Passing data instead of x_train and y_train in your fit function will fit the model properly.
If you want to have 5 timesteps in your data, you will have to create your X_train in such a way to have it have a shape (n_samples,5,48).

ValueError: logits and labels must have the same shape ((1, 7, 7, 2) vs (1, 2))

I'm quite new to CNN.
I'm trying to create a the following model. but I get the following error: "ValueError: logits and labels must have the same shape ((1, 7, 7, 2) vs (1, 2))"
Below the code I'm trying to implement
#create the training data set
train_data=scaled_data[0:training_data_len,:]
#define the number of periods
n_periods=28
#split the data into x_train and y_train data set
x_train=[]
y_train=[]
for i in range(n_periods,len(train_data)):
x_train.append(train_data[i-n_periods:i,:28])
y_train.append(train_data[i,29])
x_train=np.array(x_train)
y_train=np.array(y_train)
#Reshape the train data
x_train=x_train.reshape(x_train.shape[0],x_train.shape[1],x_train.shape[2],1)
x_train.shape
y_train = keras.utils.to_categorical(y_train,2)
# x_train as the folllowing shape (3561, 28, 28, 1)
# y_train as the following shape (3561, 2, 2)
#Build the 2 D CNN model for regression
model= Sequential()
model.add(Conv2D(32,kernel_size=(3,3),padding='same',activation='relu',input_shape=(x_train.shape[1],x_train.shape[2],1)))
model.add(Conv2D(64,kernel_size=(3,3),padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=(4,4)))
model.add(Dropout(0.25))
model.add(Dense(128,activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='sigmoid'))
model.add(Dense(2, activation='sigmoid'))
model.summary()
#compile the model
model.compile(optimizer='ADADELTA', loss='binary_crossentropy', metrics=['accuracy'])
#train the model
model.fit(x_train, y_train, batch_size=1, epochs=1, verbose=2)
There are two problems in your approach:
You're using Convolutional/MaxPooling layers in which the inputs/outputs are as matrices, i.e., with the shape of (Batch_Size, Height, Width, Depth). You then add some Dense layers which usually expect vectors, not matrices as inputs. Therefore, you have to first flatten the outputs of MaxPooling before giving it to Dense layer, i.e., add a model.add(Flatten()) after model.add(Dropout(0.25)) and before model.add(Dense(128,activation='relu')).
You are doing binary classification, i.e., you have two classes. You are using binary_crossentropy as the loss function, for this to work, you should keep your targets as they are (0 and 1) and not use y_train = keras.utils.to_categorical(y_train,2). Your final layer should have 1 neuron and not 2 (Change model.add(Dense(2, activation='sigmoid')) into model.add(Dense(1, activation='sigmoid')) )

How to use Embedding layer for RNN with a categorical feature - Classification Task for RecoSys

I would like to build a model (RNN >> LSTM) with an Embedding layer for a categorical feature (Item ID), My training set looks so:
train_x = [[[184563.1], [184324.1], [187853.1], [174963.1], [181663.1]], [[…],[…],[…],[…],[…]], …]
I predict the sixth item ID:
train_y = [0,1,2, …., 12691]
I have 12692 unique item IDs, length of timesteps = 5 and this is a classification task.
This is a brief summary for what I've done so far: (Please correct me if I'm wrong)
One-hot-encoding for the categorical feature:
train_x = [[[1 0 0 … 0 0 0], [0 1 0 … 0 0 0], [0 0 1 … 0 0 0], […], […]], [[…],[…],[…],[…],[…]], …]
Build model:
model = Sequential()
model.add(Embedding(input_dim=12692 , output_dim=250, input_length=5))
model.add(LSTM(128, return_sequences=True)
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(LSTM(128, return_sequences=True))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(12692, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=0.001, decay=1e-6)
model.compile(
loss='sparse_categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
print(model.summary())
history = model.fit(
train_x, train_y,
batch_size=64,
epochs=epochs,
validation_data=(validation_x, validation_y))
score = model.evaluate(validation_x, validation_y, verbose=0)
I get this model summary:
Train on 131204 samples, validate on 107904 samples
But after that, this error appears:
ValueError: Error when checking input: expected embedding_input to have 2 dimensions, but got array with shape (131204, 5, 12692)
Where is my mistake and what would be the solution?
The embedding layer turns positive integers (indexes) into dense vectors of fixed size (Docs). So your train_x is not one-hot-encoded but the integer representing its index in the vocab. It will be the integer corresponding to the categorical feature.
train_x.shape will be (No:of sample X 5) --> Each representing the index of of the categorical feature
train_y.shape will be (No:of sample) --> Each representing the index of the sixth item in your time series.
Working sample
import numpy as np
import keras
from keras.layers import Embedding, LSTM, Dense
n_samples = 100
train_x = np.random.randint(0,12692,size=(n_samples ,5))
train_y = np.random.randint(0,12692,size=(n_samples))
model = keras.models.Sequential()
model.add(Embedding(input_dim=12692+1, output_dim=250, input_length=5))
model.add(LSTM(128, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(32, activation='relu'))
model.add(Dense(12692, activation='softmax'))
opt = keras.optimizers.Adam(lr=0.001, decay=1e-6)
model.compile(
loss='sparse_categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
print(model.summary())
history = model.fit(
train_x, train_y,
batch_size=64,
epochs=32)

Why does a binary Keras CNN always predict 1?

I want to build a binary classifier using a Keras CNN.
I have about 6000 rows of input data which looks like this:
>> print(X_train[0])
[[[-1.06405307 -1.06685851 -1.05989663 -1.06273152]
[-1.06295958 -1.06655996 -1.05969803 -1.06382503]
[-1.06415248 -1.06735609 -1.05999593 -1.06302975]
[-1.06295958 -1.06755513 -1.05949944 -1.06362621]
[-1.06355603 -1.06636092 -1.05959873 -1.06173742]
[-1.0619655 -1.06655996 -1.06039312 -1.06412326]
[-1.06415248 -1.06725658 -1.05940014 -1.06322857]
[-1.06345662 -1.06377347 -1.05890365 -1.06034568]
[-1.06027557 -1.06019084 -1.05592469 -1.05537518]
[-1.05550398 -1.06038988 -1.05225064 -1.05676692]]]
>>> print(y_train[0])
[1]
And then I've build a CNN by this way:
model = Sequential()
model.add(Convolution1D(input_shape = (10, 4),
nb_filter=16,
filter_length=4,
border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Convolution1D(nb_filter=8,
filter_length=4,
border_mode='same'))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64))
model.add(BatchNormalization())
model.add(LeakyReLU())
model.add(Dense(1))
model.add(Activation('softmax'))
reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.9, patience=30, min_lr=0.000001, verbose=0)
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
history = model.fit(X_train, y_train,
nb_epoch = 100,
batch_size = 128,
verbose=0,
validation_data=(X_test, y_test),
callbacks=[reduce_lr],
shuffle=True)
y_pred = model.predict(X_test)
But it returns the following:
>> print(confusion_matrix(y_test, y_pred))
[[ 0 362]
[ 0 608]]
Why all predictions are ones? Why does the CNN perform so bad?
Here are the loss and acc charts:
It always predicts one because of the output in your network. You have a Dense layer with one neuron, with a Softmax activation. Softmax normalizes by the sum of exponential of each output. Since there is one output, the only possible output is 1.0.
For a binary classifier you can either use a sigmoid activation with the "binary_crossentropy" loss, or put two output units at the last layer, keep using softmax and change the loss to categorical_crossentropy.

Categories