I found a paper reporting that a Convolutional Neural Network (CNN) can achieve the best accuracy for non-image classification, so I want to use a CNN with a non-image dataset. I downloaded the Early Stage Diabetes Risk Prediction Dataset from Kaggle and created a CNN model with this code:
dataset = loadtxt('diabetes_data_upload.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:16]
Y = dataset[:,16]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = Sequential()
model.add(Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=10)
It shows an error like this:
ValueError: `logits` and `labels` must have the same shape, received ((None, 15, 1) vs (None,)).
How can I fix it?
You can use tf.keras.layers.Flatten(). Something like the code below should solve your problem.
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np
X = np.random.rand(100, 16)
Y = np.random.randint(0, 2, size=100)  # you have two labels, so generate random 0/1 labels
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1, batch_size=10)
Update, thanks to Ameya: we can also solve this problem using only tf.keras.layers.GlobalAveragePooling1D().
(Thanks to Djinn and his comment, but consider: these are two different approaches that do different things. Flatten() preserves all the data and just reshapes each sample into a 1D tensor, whereas GlobalAveragePooling1D() averages over the steps and loses data. Pooling layers can significantly affect performance with non-image data, but I've noticed that AveragePooling does the least "damage".)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.GlobalAveragePooling1D())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
7/7 [==============================] - 0s 2ms/step - loss: 0.6954 - accuracy: 0.0000e+00
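To see where the shape mismatch in the error message comes from, here is a NumPy-only sketch of the tensor shapes involved. It is an illustration rather than Keras itself: the random array stands in for the Conv1D output, and the helper name dense_last_axis is made up for this example. A Dense layer acts on the last axis only, so without Flatten() or pooling the output keeps its 15 steps.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 10

# Stand-in for the Conv1D output: 16 input steps with kernel size 2
# give 15 steps, each with 16 filter values.
conv_out = rng.random((batch, 15, 16))


def dense_last_axis(x, units, rng):
    """Hypothetical helper: like Dense, multiply the last axis by a weight matrix."""
    weights = rng.random((x.shape[-1], units))
    return x @ weights


# Dense(1) applied directly to the 3D tensor keeps the step axis,
# so the result cannot match labels of shape (batch,).
no_flatten = dense_last_axis(conv_out, 1, rng)

# Flatten keeps every value: (batch, 15 * 16), then Dense(1) gives (batch, 1).
flattened = conv_out.reshape(batch, -1)
with_flatten = dense_last_axis(flattened, 1, rng)

# Global average pooling averages over the step axis: (batch, 16),
# then Dense(1) gives (batch, 1).
pooled = conv_out.mean(axis=1)
with_pooling = dense_last_axis(pooled, 1, rng)

print(no_flatten.shape, flattened.shape, pooled.shape)
```

Flatten feeds 240 values per sample into the final layer, while pooling feeds 16 averages, which is the data loss mentioned above.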
I call model.predict(x), where x is the same NumPy array I used to train the model (x is obviously without the validation values).
Running this, I just get the same value for all 1733 lines of the NumPy array. If you need code or an example of the arrays used, ask me.
The model is:
dataset = pd.read_csv('BNB.csv')
x = dataset.drop(columns=["Valuable"])
x = np.asarray(x).astype('float32')
y = dataset["Valuable"]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Dense(256, input_shape=x_train.shape, activation='sigmoid'))
model.add(tf.keras.layers.Dense(256, activation='sigmoid'))
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1000)
The numpy array (csv file) I used to train and test looks like this:
Valuable,Open,High,Low,Close,EMA8,EMA14,EMA50,ht,sar,MorningStar,Engulfing
-1,355.48,355.82,355.21,355.76,355.21,355.51,357.96,356.63,351.08,0,0
0,355.77,356.2,355.52,355.79,355.34,355.54,357.87,356.51,351.08,0,0
0,355.82,356.61,355.5,356.23,355.54,355.63,357.81,356.44,351.08,0,0
0,356.14,356.17,354.63,354.92,355.4,355.54,357.69,356.46,351.08,0,0
0,354.88,355.54,354.81,354.96,355.3,355.46,357.59,356.55,351.08,0,0
0,354.91,354.91,353.71,354.11,355.04,355.28,357.45,356.59,351.08,0,0
0,354.12,354.93,353.89,354.72,354.97,355.21,357.34,356.44,351.08,0,0
0,354.72,355.2,354.01,354.7,354.91,355.14,357.24,356.21,351.08,0,0
0,354.69,355.46,354.43,355.23,354.98,355.15,357.16,355.9,351.08,0,100
0,355.27,355.47,354.54,355.39,355.07,355.18,357.09,355.57,351.08,0,0
0,355.37,356.0,355.22,355.81,355.24,355.27,357.04,355.31,351.08,0,0
0,355.79,356.23,355.11,355.54,355.3,355.3,356.98,355.15,351.08,0,0
0,355.56,355.67,354.78,355.21,355.28,355.29,356.91,355.08,351.08,0,0
0,355.2,355.63,354.88,355.2,355.26,355.28,356.84,355.06,351.08,0,0
0,355.2,355.99,355.2,355.76,355.37,355.34,356.8,355.08,351.08,0,0
0,355.74,355.97,355.17,355.37,355.37,355.35,356.75,355.14,351.08,0,0
0,355.37,355.38,354.51,354.69,355.22,355.26,356.67,355.19,351.08,0,0
0,354.78,355.4,354.64,355.02,355.18,355.23,356.6,355.23,351.08,0,0
I want to predict whether Valuable is 0, -1, -2, 1 or 2 (my csv file is about 1700 lines long).
There are a few problems with your model.
First:
You should use sparse categorical cross entropy loss instead of binary loss for your model if you have more than two classes in output.
Second:
Use softmax activation for the last/output layer.
Third:
Use as many neurons in the last layer as there are classes.
I assume the distinct values in the Valuable column are [-2, -1, 0, 1, 2].
First, encode your target column like this:
y = dataset["Valuable"]  # add the following lines after this one
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y = le.fit_transform(y)
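As a rough picture of what the encoding step produces, the following NumPy sketch maps the assumed label values to class indices 0 to 4, which is the form sparse_categorical_crossentropy expects. The sample array y_raw is made up for illustration, and np.unique with return_inverse=True is used here as a stand-in that yields the same sorted mapping.

```python
import numpy as np

# Made-up sample of the target column, using the assumed label values.
y_raw = np.array([-1, 0, 0, 2, -2, 1, 0])

# classes holds the sorted distinct labels; y_encoded holds, for each sample,
# the index of its label within classes.
classes, y_encoded = np.unique(y_raw, return_inverse=True)

# Indexing classes with the encoded values recovers the original labels.
y_decoded = classes[y_encoded]

print(classes)
print(y_encoded)
```

The negative labels are the reason this step matters: class indices must start at 0, so -2 becomes 0 and 2 becomes 4.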
Then change your model definition like this:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
model = tf.keras.models.Sequential()
# changes: per-sample input shape, relu hidden layers, 5-unit softmax output, sparse categorical loss
model.add(tf.keras.layers.Dense(256, input_shape=(x_train.shape[1],), activation="relu"))
model.add(tf.keras.layers.Dense(256, activation="relu"))
model.add(tf.keras.layers.Dense(5, activation="softmax"))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=1000)
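To turn the softmax output back into the original labels, take the index of the largest probability in each row and look it up in the encoder's classes. The sketch below uses NumPy with made-up logits, and the classes array is an assumption standing in for le.classes_; with the fitted encoder, le.inverse_transform performs the same lookup.

```python
import numpy as np

# Assumed class values; stands in for le.classes_ from the fitted LabelEncoder.
classes = np.array([-2, -1, 0, 1, 2])


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Made-up raw outputs for three samples and five classes.
logits = np.array([[0.1, 2.0, 0.3, 0.0, -1.0],
                   [0.0, 0.0, 3.0, 0.5, 0.2],
                   [-1.0, 0.2, 0.1, 0.4, 2.5]])
probs = softmax(logits)

# Predicted class index per row, then the original label for that index.
pred_idx = probs.argmax(axis=1)
pred_labels = classes[pred_idx]

# Sparse categorical cross-entropy: negative log-probability of the true class index.
y_true = np.array([1, 2, 4])
loss = -np.log(probs[np.arange(len(y_true)), y_true]).mean()

print(pred_labels, loss)
```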
RandomizedSearchCV accepts only a one-dimensional target variable, but for this binary classification I need to convert y_train and y_test to one-hot variables for Keras. I get the error 'Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead.' Could anyone give me some tips? Many thanks!
def create_baseline():
    model = Sequential()
    model.add(Reshape((TIME_PERIODS, num_sensors), input_shape=(input_shape,)))
    model.add(Conv1D(100, 6, activation='relu', input_shape=(TIME_PERIODS, num_sensors)))
    #model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(MaxPooling1D(3))
    model.add(Conv1D(100, 6, activation='relu'))
    model.add(Dropout(0.5))
    model.add(MaxPooling1D(3))
    # LSTM
    model.add(LSTM(64,return_sequences=True))
    model.add(Dropout(0.5))
    model.add(LSTM(32,return_sequences=True))
    model.add(Dropout(0.5))
    model.add(Dense(128, activation="sigmoid", kernel_initializer="uniform"))
    model.add(Dropout(0.5))
    model.add(GlobalAveragePooling1D())
    model.add(Flatten())
    model.add(Dense(num_classes, activation='softmax'))
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model
from sklearn.model_selection import StratifiedKFold,KFold
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
seed=42
#y_train = np_utils.to_categorical(y_train, num_classes)
estimator = KerasClassifier(build_fn=create_baseline, epochs=30, batch_size=800, verbose=1)
# Nested k-fold cross-validation (Subject_dependent)
from sklearn.model_selection import GridSearchCV,cross_val_score, StratifiedKFold,RandomizedSearchCV
#train/validation/test=0.8/0.2/0.2
inner_cv = StratifiedKFold(n_splits = 4,shuffle=True,random_state=42)
outer_cv = StratifiedKFold(n_splits = 5,shuffle=True,random_state=42)
accuracy=[]
p_grid=[]
estimators=[]
#p_grid={'batch_size':[400,800]}
from sklearn.preprocessing import LabelEncoder
#def get_new_labels(y):
#y = LabelEncoder().fit_transform([''.join(str(l)) for l in y])
#return y
#y = get_new_labels(y)
for train_index, test_index in outer_cv.split(x,y):
    print('Train Index:',train_index,'\n')
    print('Test Index:',test_index)
    x_train, x_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]
    y_train = np_utils.to_categorical(y_train, num_classes)
    y_test = np_utils.to_categorical(y_test, num_classes)
    grid = RandomizedSearchCV(estimator=estimator,
                              param_distributions=p_grid,
                              cv=inner_cv,
                              refit='roc_auc_scorer',
                              return_train_score=True,
                              verbose=1,n_jobs=-1,n_iter=20)
    grid.fit(x_train, y_train)
    estimators.append(grid.best_estimator_)
    prediction = grid.predict(x_test)
    accuracy.append(grid.score(x_test,y_test))
print('Accuracy:{}'.format(accuracy))
In binary classification it's either a dog, or not a dog, and your encoded labels would just be 1's or 0's:
[[0] <- single row label
[1] <- single row label
[0]]
In multiclass classification it's either a dog, a cat, or a bird, and not more than one, i.e. the classes are mutually exclusive. So your encoded labels look like:
[[0,0,1] <-- a single row's encoded label
[1,0,0] <-- another row's encoded label
[0,1,0]]
Multilabel classification is different: it accepts label sets which are not mutually exclusive, i.e. something can be a building as well as a house as well as an office:
[[1,1,1]
[0,0,1]
[1,0,1]]
The problem here is that it looks like you're passing multilabel labels to your classifier - you should double check your labels and make sure that there is only a 1 or a 0 for each row of training data if that is what you need.
Using to_categorical for binary classification is fine, however you might want to double check that num_classes=2 for binary classification.
Also, if it is a binary classification problem with a single output unit, your final Dense layer activation needs to be 'sigmoid', not 'softmax'. With two one-hot output columns, 'softmax' is the appropriate choice.
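The error message in the question comes from scikit-learn, which treats a two-dimensional one-hot array as 'multilabel-indicator'. One possible workaround, sketched here with NumPy only and made-up labels, is to keep the labels as a 1D integer array for the scikit-learn splitter and to build the one-hot form only where Keras needs it. np.eye is used as a stand-in for to_categorical, and the variable names are assumptions for this example.

```python
import numpy as np

num_classes = 2

# 1D integer labels: the form the scikit-learn splitters accept.
y = np.array([0, 1, 1, 0, 1])

# One-hot form (stand-in for to_categorical): shape (n_samples, num_classes).
y_one_hot = np.eye(num_classes, dtype=np.float32)[y]

# Every row contains exactly one 1, so the classes are mutually exclusive.
rows_sum_to_one = bool((y_one_hot.sum(axis=1) == 1).all())

# argmax over the class axis recovers the 1D labels from the one-hot rows.
y_back = y_one_hot.argmax(axis=1)

print(y_one_hot.shape, rows_sum_to_one, y_back)
```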
Can someone please explain why the following code achieves only about 50% classification accuracy?
I am trying to classify lists of 20 items into 0 or 1. The lists are all 5s or all 6s.
import numpy as np
import keras
from sklearn.model_selection import train_test_split
positive_samples = [[5]*20]*100
negative_samples = [[6]*20]*100
x_list = np.array(positive_samples+negative_samples, dtype=np.float32)
y_list = np.array([1]*len(positive_samples)+[0]*len(negative_samples), dtype=np.float32)
x_train, x_test, y_train, y_test = train_test_split(x_list, y_list, test_size=0.20, random_state=42)
y_train = keras.utils.to_categorical(y_train, 2)
y_test = keras.utils.to_categorical(y_test, 2)
model = keras.models.Sequential()
model.add(keras.layers.Dense(10, input_dim=x_train.shape[1], kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(5, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(2, kernel_initializer='normal', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=20, verbose=2, validation_data=(x_test, y_test))
print (model.evaluate(x_test, y_test, verbose=0))
Since the last output layer has 2 values per sample, you need to use a softmax activation instead of sigmoid.
Also, that means binary_crossentropy cannot be used, and you have to use categorical_crossentropy.
I have also normalized the dataset x_list by dividing by the maximum (6).
x_list /= x_list.max()
Also, make sure the dataset is shuffled before splitting. train_test_split does this by default (shuffle=True), so passing it explicitly just makes the intent visible.
import numpy as np
import keras
from sklearn.model_selection import train_test_split
positive_samples = [[5]*20]*100
negative_samples = [[6]*20]*100
x_list = np.array(positive_samples+negative_samples, dtype=np.float32)
y_list = np.array([1]*len(positive_samples)+[0]*len(negative_samples), dtype=np.float32)
x_list /= x_list.max()
x_train, x_test, y_train, y_test = train_test_split(x_list, y_list, test_size=0.20, shuffle=True, random_state=42)
y_train = keras.utils.to_categorical(y_train, 2)
y_test = keras.utils.to_categorical(y_test, 2)
model = keras.models.Sequential()
model.add(keras.layers.Dense(10, input_dim=x_train.shape[1], kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(5, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(2, kernel_initializer='normal', activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=100, verbose=2, validation_data=(x_test, y_test))
print (model.evaluate(x_test, y_test, verbose=0))
A sigmoid activation in the output layer makes sense only when there is 1 output, in which case the value lies in the range [0, 1] and signifies the probability of the instance being a 1.
With 2 (or more) output neurons, the probabilities must be normalized to sum up to 1, so we use a softmax layer instead.
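A small NumPy sketch of the difference, using made-up logits for two output neurons: sigmoid squashes each output independently, so a row need not sum to 1, while softmax normalizes each row into a probability distribution.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function; each output is treated independently."""
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)


# Made-up raw outputs of a 2-neuron layer for two samples.
logits = np.array([[2.0, 1.0],
                   [-0.5, 1.5]])

sig = sigmoid(logits)
soft = softmax(logits)

# With two classes, the softmax probability of class 1 equals a sigmoid
# of the difference between the two logits.
class1_via_sigmoid = sigmoid(logits[:, 1] - logits[:, 0])

print(sig.sum(axis=1))   # rows do not sum to 1
print(soft.sum(axis=1))  # rows sum to 1
```

The last identity is why a single sigmoid unit and a two-unit softmax layer can express the same binary classifier.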
Data should be normalized before feeding it to the network. This is normally done by scaling the values to lie between 0 and 1 or between -1 and 1. Setting the input to:
positive_samples = [[1]*20]*100
negative_samples = [[-1]*20]*100
works, or the model could be changed to:
model = keras.models.Sequential()
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.Dense(10, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(5, kernel_initializer='normal', activation='relu'))
model.add(keras.layers.Dense(2, kernel_initializer='normal', activation='sigmoid'))
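For the dataset in the question, standardizing the inputs (subtracting the mean and dividing by the standard deviation) yields exactly the -1 and 1 values suggested above. A minimal NumPy sketch, assuming the same all-5s and all-6s samples:

```python
import numpy as np

positive_samples = [[5] * 20] * 100
negative_samples = [[6] * 20] * 100
x_list = np.array(positive_samples + negative_samples, dtype=np.float32)

# Per-feature mean and standard deviation over all samples.
mean = x_list.mean(axis=0)
std = x_list.std(axis=0)

# Standardize: zero mean and unit variance for every feature.
x_scaled = (x_list - mean) / std

print(np.unique(x_scaled))
```

In practice the mean and standard deviation would be computed on the training split only and then reused for the test split.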