CNN training for multi-label classification not working - python

I am trying to predict labels for texture images; an image can contain two labels like ['banded', 'striped'], though most of them have only one label.
The reported accuracy is extremely high (the first epoch already reaches 0.96 acc), but the predicted values are all close to 0, which is wrong; at least one value should be close to 1.
Can someone help me?
Thank you!!
Here is the code.
Input images: read with OpenCV, then divided by 255.
Multi-labels: first converted to integers with LabelEncoder, then one-hot encoded with keras.to_categorical.
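For reference, a minimal sketch of the encoding described above, with hypothetical label values. Note that LabelEncoder followed by to_categorical produces strictly one-hot rows; a MultiLabelBinarizer-style multi-hot encoding is what lets a single image carry both 'banded' and 'striped':
from sklearn.preprocessing import LabelEncoder, MultiLabelBinarizer
from tensorflow.keras.utils import to_categorical

# as described above: single string label -> integer -> one-hot row
le = LabelEncoder()
y_int = le.fit_transform(['banded', 'striped', 'banded'])      # hypothetical labels
y_onehot = to_categorical(y_int)

# multi-hot alternative for images that carry two labels at once
mlb = MultiLabelBinarizer()
y_multihot = mlb.fit_transform([['banded', 'striped'], ['striped'], ['banded']])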
Then I built a CNN model as follows:
X_train, X_test, y_train, y_test = train_test_split(img_array, test_value, test_size=0.1)
model = Sequential()
model.add(Conv2D(filters=64, kernel_size=(5, 5), padding='Same', data_format='channels_last',
                 activation='relu', input_shape=(300, 300, 3)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(filters=32, kernel_size=(3, 3), padding='Same', activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(300, init='uniform', activation='relu'))
model.add(Dense(285, init='uniform', activation='sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size= 24, epochs=10, validation_split=0.15)

If your model has only 2 labels, the last layer should be
model.add(Dense(2, init = 'uniform',activation='sigmoid'))
However, class imbalance can also affect the accuracy. If the imbalance is severe, your model can show 95%+ training, validation, and testing accuracies while the per-class accuracies remain low, and the model will not work on real-world data.
The detailed, per-class accuracy can be inspected with classification_report:
import numpy as np
from sklearn.metrics import classification_report

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.30)
X_test1, X_valid, y_test1, y_valid = train_test_split(X_test, y_test, test_size=0.30)
model.fit(X_train, y_train, batch_size=64, epochs=8, shuffle=True,
          validation_data=(X_test1, y_test1), callbacks=[metrics])  # 'metrics' is a callback defined elsewhere
Y_TEST = np.argmax(y_valid, axis=1)
y_pred = model.predict_classes(X_valid)
print("#" * 50, "\n", classification_report(Y_TEST, y_pred))
Please share your class distribution for further understanding.

Not sure why the number of neurons in the Dense layer is 285. If there are 47 categories, the output Dense layer should have 47 neurons. Also, use a kernel initializer like he_normal instead of uniform. https://github.com/keras-team/keras-applications/blob/master/keras_applications/resnet50.py
model.add(Dense(47, activation='sigmoid'))
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
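With the suggested initializer, the layers could look like this (a sketch; kernel_initializer is the Keras 2 argument name that replaces the older init):
model.add(Dense(300, kernel_initializer='he_normal', activation='relu'))
model.add(Dense(47, kernel_initializer='he_normal', activation='sigmoid'))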
This is a multi-label classification example with 5 classes.
https://github.com/suraj-deshmukh/Keras-Multi-Label-Image-Classification

Related

What is the best approach to build a model with multiple targets on the Y set?

I need to create a model to predict multiple labels based on sixteen (16) input features. The dataset has 4486 instances; each instance has a different number of labels (48 distinct labels in total).
This is how the data looks:
(X data example and Y data example images not shown)
The challenge is to predict labels for a new instance. I know learning is the problem because of the imbalance in the number of labels, which makes it difficult.
I would appreciate comments and advice on how to tackle this issue.
My best result is 30% accuracy, but I've noticed the model sometimes predicts the same labels and has not given any satisfactory results so far...
This is the model I've implemented:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
n_inputs, n_outputs = X_train.shape[1], y_train.shape[1]
nodes = math.sqrt(n_inputs*n_outputs)
model = Sequential()
model.add(Dense(nodes, activation='relu'))
model.add(Flatten())
model.add(Dense(n_outputs, activation = 'sigmoid'))
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.BinaryAccuracy(), tf.keras.metrics.AUC(), 'accuracy'])
history = model.fit(X_train, y_train, epochs=300, verbose=1, shuffle=True,
                    validation_data=(X_test, y_test), batch_size=8)
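Since the question hinges on label imbalance, a quick way to inspect the per-label distribution (a sketch, assuming y is the (4486, 48) indicator matrix used above) is:
import numpy as np
label_counts = np.asarray(y).sum(axis=0)   # number of positive instances per label
print(sorted(label_counts))                # shows how skewed the 48 labels are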

Conv1D for classifying a non-image dataset shows ValueError: `logits` and `labels` must have the same shape

I found this paper; it presents that a Convolutional Neural Network can get the best accuracy for non-image classification. So I want to use a CNN with a non-image dataset. I downloaded the Early Stage Diabetes Risk Prediction Dataset from Kaggle and created a CNN model with this code.
dataset = loadtxt('diabetes_data_upload.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:16]
Y = dataset[:,16]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = Sequential()
model.add(Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=10)
It shows this error:
ValueError: `logits` and `labels` must have the same shape, received ((None, 15, 1) vs (None,)).
How can I fix it?
You can use tf.keras.layers.Flatten(). Something like the code below can solve your problem.
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np
X = np.random.rand(100, 16)
Y = np.random.randint(0, 2, size=100)  # <- because you have two labels, I generate random 0/1
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1, batch_size=10)
Update (thanks Ameya): we can also solve this problem by using only tf.keras.layers.GlobalAveragePooling1D().
(Thanks to Djinn and his comment, but consider: these are two different approaches that do different things. Flatten() preserves all the data and just converts the input tensor to a 1D tensor, whereas GlobalAveragePooling1D() tries to generalize and loses information. Pooling layers with non-image data can significantly affect performance, but I've noticed average pooling does the least "damage".)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.GlobalAveragePooling1D())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
7/7 [==============================] - 0s 2ms/step - loss: 0.6954 - accuracy: 0.0000e+00
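For intuition on the difference between the two fixes, here is a shape walk-through for this particular model (a sketch inferred from the layer definitions above):
# input:                    (None, 16, 1)
# Conv1D(16, 2):            (None, 15, 16)
# Flatten():                (None, 240)  - keeps every value
# GlobalAveragePooling1D(): (None, 16)   - averages over the 15 steps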

TensorFlow (Keras) model always gives me the same value

I am calling model.predict(x), where x is the same NumPy array I used to train the model (x obviously excludes the validation values).
Running this, I get the same value for all 1733 rows of the NumPy array. If you need code or an example of the arrays used, ask me.
The model is:
dataset = pd.read_csv('BNB.csv')
x = dataset.drop(columns=["Valuable"])
x = np.asarray(x).astype('float32')
y = dataset["Valuable"]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(256, input_shape=x_train.shape, activation='sigmoid'))
model.add(tf.keras.layers.Dense(256, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1000)
The numpy array (csv file) I used to train and test looks like this:
Valuable,Open,High,Low,Close,EMA8,EMA14,EMA50,ht,sar,MorningStar,Engulfing
-1,355.48,355.82,355.21,355.76,355.21,355.51,357.96,356.63,351.08,0,0
0,355.77,356.2,355.52,355.79,355.34,355.54,357.87,356.51,351.08,0,0
0,355.82,356.61,355.5,356.23,355.54,355.63,357.81,356.44,351.08,0,0
0,356.14,356.17,354.63,354.92,355.4,355.54,357.69,356.46,351.08,0,0
0,354.88,355.54,354.81,354.96,355.3,355.46,357.59,356.55,351.08,0,0
0,354.91,354.91,353.71,354.11,355.04,355.28,357.45,356.59,351.08,0,0
0,354.12,354.93,353.89,354.72,354.97,355.21,357.34,356.44,351.08,0,0
0,354.72,355.2,354.01,354.7,354.91,355.14,357.24,356.21,351.08,0,0
0,354.69,355.46,354.43,355.23,354.98,355.15,357.16,355.9,351.08,0,100
0,355.27,355.47,354.54,355.39,355.07,355.18,357.09,355.57,351.08,0,0
0,355.37,356.0,355.22,355.81,355.24,355.27,357.04,355.31,351.08,0,0
0,355.79,356.23,355.11,355.54,355.3,355.3,356.98,355.15,351.08,0,0
0,355.56,355.67,354.78,355.21,355.28,355.29,356.91,355.08,351.08,0,0
0,355.2,355.63,354.88,355.2,355.26,355.28,356.84,355.06,351.08,0,0
0,355.2,355.99,355.2,355.76,355.37,355.34,356.8,355.08,351.08,0,0
0,355.74,355.97,355.17,355.37,355.37,355.35,356.75,355.14,351.08,0,0
0,355.37,355.38,354.51,354.69,355.22,355.26,356.67,355.19,351.08,0,0
0,354.78,355.4,354.64,355.02,355.18,355.23,356.6,355.23,351.08,0,0
I want to predict whether Valuable is 0, -1, -2, 1 or 2 (my csv file is about 1700 lines long).
There are a few problems with your model.
First:
You should use sparse categorical cross-entropy loss instead of binary cross-entropy if you have more than two classes in the output.
Second:
Use softmax activation for the last/output layer.
Third:
Use as many neurons in the last layer as there are classes.
I assume the distinct values in the Valuable column are: [-1, -2, 0, 1, 2].
First encode your target column like this:
y = dataset["Valuable"] # after this
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y = le.fit_transform(y)
Then change your model definition like this:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
model = tf.keras.models.Sequential()
# changes
model.add(tf.keras.layers.Dense(256, input_shape=(x_train.shape[1],), activation="relu"))  # pass only the feature dimension, not the full array shape
model.add(tf.keras.layers.Dense(256, activation="relu"))
model.add(tf.keras.layers.Dense(5, activation="softmax"))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1000)
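To map the softmax outputs back to the original Valuable values, one option (a sketch reusing the LabelEncoder fitted above) is:
import numpy as np
probs = model.predict(x_test)                      # shape (n_samples, 5), rows sum to 1
pred_classes = np.argmax(probs, axis=1)            # integer class indices 0..4
pred_values = le.inverse_transform(pred_classes)   # back to the original Valuable values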

Hyperparameter tuning in keras using nested k-fold cross-validation

RandomizedSearchCV accepts only a one-dimensional target variable, but for this binary classification I need to convert y_train and y_test to one-hot variables for Keras. I got the error 'Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead.' Could anyone give me some tips? Many thanks!
def create_baseline():
    model = Sequential()
    model.add(Reshape((TIME_PERIODS, num_sensors), input_shape=(input_shape,)))
    model.add(Conv1D(100, 6, activation='relu', input_shape=(TIME_PERIODS, num_sensors)))
    #model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(MaxPooling1D(3))
    model.add(Conv1D(100, 6, activation='relu'))
    model.add(Dropout(0.5))
    model.add(MaxPooling1D(3))
    # LSTM
    model.add(LSTM(64, return_sequences=True))
    model.add(Dropout(0.5))
    model.add(LSTM(32, return_sequences=True))
    model.add(Dropout(0.5))
    model.add(Dense(128, activation="sigmoid", kernel_initializer="uniform"))
    model.add(Dropout(0.5))
    model.add(GlobalAveragePooling1D())
    model.add(Flatten())
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model
from sklearn.model_selection import StratifiedKFold,KFold
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
seed=42
#y_train = np_utils.to_categorical(y_train, num_classes)
estimator = KerasClassifier(build_fn=create_baseline, epochs=30, batch_size=800, verbose=1)
# Nested k-fold cross-validation (Subject_dependent)
from sklearn.model_selection import GridSearchCV,cross_val_score, StratifiedKFold,RandomizedSearchCV
#train/validation/test=0.8/0.2/0.2
inner_cv = StratifiedKFold(n_splits = 4,shuffle=True,random_state=42)
outer_cv = StratifiedKFold(n_splits = 5,shuffle=True,random_state=42)
accuracy=[]
p_grid=[]
estimators=[]
#p_grid={'batch_size':[400,800]}
from sklearn.preprocessing import LabelEncoder
#def get_new_labels(y):
#y = LabelEncoder().fit_transform([''.join(str(l)) for l in y])
#return y
#y = get_new_labels(y)
for train_index, test_index in outer_cv.split(x, y):
    print('Train Index:', train_index, '\n')
    print('Test Index:', test_index)
    x_train, x_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]
    y_train = np_utils.to_categorical(y_train, num_classes)
    y_test = np_utils.to_categorical(y_test, num_classes)
    grid = RandomizedSearchCV(estimator=estimator,
                              param_distributions=p_grid,
                              cv=inner_cv,
                              refit='roc_auc_scorer',
                              return_train_score=True,
                              verbose=1, n_jobs=-1, n_iter=20)
    grid.fit(x_train, y_train)
    estimators.append(grid.best_estimator_)
    prediction = grid.predict(x_test)
    accuracy.append(grid.score(x_test, y_test))
    print('Accuracy:{}'.format(accuracy))
In binary classification it's either a dog, or not a dog, and your encoded labels would just be 1's or 0's:
[[0] <- single row label
[1] <- single row label
[0]]
In multiclass classification it's either a dog, a cat, or a bird, and not more than one, i.e. they are mutually exclusive. So your encoded labels look like:
[[0,0,1] <-- a single row's encoded label
[1,0,0] <-- another row's encoded label
[0,1,0]]
Multilabel classification is different: it accepts label sets which are not mutually exclusive, i.e. something can be a building, as well as a house, as well as an office:
[[1,1,1]
[0,0,1]
[1,0,1]]
The problem here is that it looks like you're passing multilabel targets to your classifier; you should double-check your labels and make sure there is only a 1 or a 0 for each row of training data, if that is what you need.
Using to_categorical for binary classification is fine; however, you might want to double-check that num_classes=2.
Also, if it is a binary classification problem, your final Dense layer activation needs to be 'sigmoid', not 'softmax'. See here for notes.
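If the labels have already been one-hot encoded, one way to get back to the one-dimensional form that StratifiedKFold and RandomizedSearchCV expect (a sketch, not taken from the original answer) is:
import numpy as np
y_1d = np.argmax(y, axis=1)   # one-hot (n_samples, 2) -> integer labels of shape (n_samples,)
# pass y_1d to the outer/inner StratifiedKFold splits; one-hot encode per fold only if the model needs it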

Very simple Keras binary classification doesn't work

Can someone please explain why the following code achieves only about 50% classification accuracy?
I am trying to classify lists of 20 items into 0 or 1. The lists are all 5s or all 6s.
import numpy as np
import keras
from sklearn.model_selection import train_test_split
positive_samples = [[5]*20]*100
negative_samples = [[6]*20]*100
x_list = np.array(positive_samples+negative_samples, dtype=np.float32)
y_list = np.array([1]*len(positive_samples)+[0]*len(negative_samples), dtype=np.float32)
x_train, x_test, y_train, y_test = train_test_split(x_list, y_list, test_size=0.20, random_state=42)
y_train = keras.utils.to_categorical(y_train, 2)
y_test = keras.utils.to_categorical(y_test, 2)
model = keras.models.Sequential()
model.add(keras.layers.Dense(10, input_dim=x_train.shape[1], kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(5, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(2, kernel_initializer='normal', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=20, verbose=2, validation_data=(x_test, y_test))
print (model.evaluate(x_test, y_test, verbose=0))
Since the last output layer has 2 values per sample, you need to use a softmax activation instead of sigmoid.
Also, that means binary_crossentropy cannot be used, and you have to use categorical_crossentropy.
I have also normalized the dataset x_list by dividing with the maximum (6).
x_list /= x_list.max()
Also, you need to shuffle the dataset, by passing shuffle=True in train_test_split.
import numpy as np
import keras
from sklearn.model_selection import train_test_split
positive_samples = [[5]*20]*100
negative_samples = [[6]*20]*100
x_list = np.array(positive_samples+negative_samples, dtype=np.float32)
y_list = np.array([1]*len(positive_samples)+[0]*len(negative_samples), dtype=np.float32)
x_list /= x_list.max()
x_train, x_test, y_train, y_test = train_test_split(x_list, y_list, test_size=0.20, shuffle=True, random_state=42)
y_train = keras.utils.to_categorical(y_train, 2)
y_test = keras.utils.to_categorical(y_test, 2)
model = keras.models.Sequential()
model.add(keras.layers.Dense(10, input_dim=x_train.shape[1], kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(5, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(2, kernel_initializer='normal', activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=100, verbose=2, validation_data=(x_test, y_test))
print (model.evaluate(x_test, y_test, verbose=0))
A sigmoid activation in the output makes sense only when there is one output, whose value lies in the range [0, 1] and signifies the probability of the instance being a 1.
With 2 (or more) output neurons, the probabilities need to be normalized to sum to 1, so we use a softmax layer instead.
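As a quick numeric illustration of the normalization softmax performs (a standalone sketch, not part of the original answer):
import numpy as np
logits = np.array([2.0, 0.5])
probs = np.exp(logits) / np.exp(logits).sum()   # ~[0.82, 0.18], sums to 1
# per-neuron sigmoids would instead give ~[0.88, 0.62], which does not sum to 1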
Data should be normalized before feeding it to the network; this is normally done by scaling the values to be between 0 and 1 or between -1 and 1. Setting the input to:
positive_samples = [[1]*20]*100
negative_samples = [[-1]*20]*100
works, or the model could be changed to:
model = keras.models.Sequential()
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(10, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(5, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(2, kernel_initializer='normal', activation='sigmoid'))
