I'm trying to build a model for multi-class classification, but I don't understand how to set the correct input shape. My training set has shape (5420, 212), and this is the model I built:
model = models.Sequential()
model.add(layers.Dense(64, activation='relu', input_shape = (5420,)))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(5, activation='softmax'))
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
history = model.fit(X_train, y_train, epochs=20, batch_size=512)
When I run it I get the error:
ValueError: Input 0 of layer sequential_9 is incompatible with the layer: expected axis -1 of input shape to have value 5420 but received input with shape (None, 212)
Why? Isn't the input value correct?
The input shape should equal the size of the second dimension of the input X, while the output shape should equal the size of the second dimension of the output Y (assuming that both X and Y are 2-dimensional, i.e. that they don't have higher dimensions).
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.datasets import make_classification
from sklearn.preprocessing import OneHotEncoder
tf.random.set_seed(0)
# generate some data
X, y = make_classification(n_classes=5, n_samples=5420, n_features=212, n_informative=212, n_redundant=0, random_state=42)
print(X.shape, y.shape)
# (5420, 212) (5420,)
# one-hot encode the target
Y = OneHotEncoder(sparse=False).fit_transform(y.reshape(-1, 1))
print(X.shape, Y.shape)
# (5420, 212) (5420, 5)
# extract the input and output shapes
input_shape = X.shape[1]
output_shape = Y.shape[1]
print(input_shape, output_shape)
# 212 5
# define the model
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(input_shape,)))
model.add(Dense(64, activation='relu'))
model.add(Dense(output_shape, activation='softmax'))
# compile the model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# fit the model
history = model.fit(X, Y, epochs=3, batch_size=512)
# Epoch 1/3
# 11/11 [==============================] - 0s 1ms/step - loss: 4.8206 - accuracy: 0.2208
# Epoch 2/3
# 11/11 [==============================] - 0s 1ms/step - loss: 2.8060 - accuracy: 0.3229
# Epoch 3/3
# 11/11 [==============================] - 0s 1ms/step - loss: 2.0705 - accuracy: 0.3989
Related
I am getting zero validation accuracy on my LSTM model.
As my model is many-to-one, I am using a single unit in the last dense layer. But it gives me the accuracy below.
536/536 [==============================] - 6s 8ms/step - loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
<keras.callbacks.History at 0x7efd6b9bc5d0>
My model is:
classifier1 = Sequential()
classifier1.add(CuDNNLSTM(100, input_shape = (x1_train.shape[1], x1_train.shape[2]), return_sequences= True))
# classifier1.add(Dropout(0.02))
classifier1.add(CuDNNLSTM(100))
classifier1.add(Dropout(0.02))
classifier1.add(Dense(100, activation = 'sigmoid'))
# classifier1.add(Dense(300))
classifier1.add(Dropout(0.02))
classifier1.add(Dense(1, activation='softmax'))
# classifier1.add(Dropout(0.02))
# classifier1.add(Dense(1))
early_stopping = EarlyStopping(monitor='val_loss',patience=3, verbose = 1)
callback = [early_stopping]
classifier1.compile(
    loss='sparse_categorical_crossentropy',
    # loss = 'mean_squared_error',
    # optimizer=Adam(learning_rate=0.05, decay= 1e-6),
    optimizer='rmsprop',
    metrics=['accuracy'])
classifier1.fit(x1_train, y1_train, epochs=1,
                validation_data=(x1_test, y1_test),
                batch_size=50
                # class_weight= 'balanced'
                # callbacks = callback)
                )
The output of your network has dimension 1, so the last layer should use a sigmoid activation instead of a softmax, and, more importantly, the categorical cross-entropy should be replaced by binary cross-entropy, because this is a two-class problem. Then it should work.
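As a sketch (reusing classifier1 from the question; only the final layer and the compile call change):
# Suggested fix: one sigmoid unit plus binary cross-entropy.
# A softmax over a single unit always outputs 1.0, which is why the
# original model cannot learn anything.
classifier1.add(Dense(1, activation='sigmoid'))
classifier1.compile(loss='binary_crossentropy',
                    optimizer='rmsprop',
                    metrics=['accuracy'])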
I am pretty new to ML and I'm trying a typical fashion_mnist classification. The problem is that the accuracy after I run the code is only 0.1 and the loss is below 0, so I guess the model is not learning, but I don't know what the problem is.
Thanks!
from tensorflow.keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train = x_train.astype('float32')
print(type(x_train))
x_train =x_train.reshape(60000,784)
x_train = x_train / 255.0
x_test =x_test.reshape(60000,784)
x_test= x_test/ 255.0
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
model= Sequential()
model.add(Dense(100, activation="sigmoid", input_shape=(784,)))
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer='sgd', loss="binary_crossentropy", metrics=["accuracy"])
model.fit(
    x_train,
    y_train,
    epochs=10,
    batch_size=1000)
Multiple issues with your code -
There is an error in your reshape: it should be x_test = x_test.reshape(10000,784), since the test set has only 10000 images.
You are using a sigmoid activation in the first dense layer, which is not a good practice. Instead, use relu.
Your output Dense has only 1 node, but you are working with a dataset that has 10 unique classes, so the output has to be Dense(10). Please understand that even though y_train has classes 0-9, a neural network can't predict integer values with a softmax or sigmoid activation. What you are trying to do instead is predict the probability values for EACH of the 10 classes.
You are using the incorrect activation in the final layer for multi-class classification. Use softmax.
You are using the incorrect loss function. For multi-class classification use categorical_crossentropy. Since your output is a 10-dimensional probability distribution but your y_train holds a single label value per sample, you can use sparse_categorical_crossentropy instead, which is the same thing but handles label-encoded y.
Try using a better optimizer, such as adam, to avoid getting stuck in local minima.
It's preferred to use CNNs for image data since a simple Dense layer will not be able to capture the spatial features that make up the image. Since the images are small (28,28) and this is a toy example, it's ok the way it is.
Please refer to the table below to check what to use. You have to ensure you know what problem you are solving in the first place, though.
In your case, you want to do a multi-class single label classification but you are instead doing a multi-class multi-label classification by using the incorrect loss and output layer activation.
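A summary of the usual pairings (reconstructed from memory, since the original table is linked rather than shown):
Problem type                      Last-layer activation   Loss
Binary classification             sigmoid                 binary_crossentropy
Multi-class, single-label         softmax                 categorical_crossentropy (or sparse_categorical_crossentropy for integer labels)
Multi-class, multi-label          sigmoid                 binary_crossentropy
Regression to arbitrary values    none (linear)           mse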
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
#Load data
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
#Normalize
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
#Reshape
x_train = x_train.reshape(60000,784)
x_train = x_train / 255.0
x_test = x_test.reshape(10000,784)
x_test = x_test / 255.0
print('Data shapes->',[i.shape for i in [x_train, y_train, x_test, y_test]])
#Construct computation graph
model = Sequential()
model.add(Dense(100, activation="relu", input_shape=(784,)))
model.add(Dense(10, activation="softmax"))
#Compile with loss as cross_entropy and optimizer as adam
model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=["accuracy"])
#Fit model
model.fit(x_train, y_train, epochs=10, batch_size=1000)
Data shapes-> [(60000, 784), (60000,), (10000, 784), (10000,)]
Epoch 1/10
60/60 [==============================] - 0s 5ms/step - loss: 0.8832 - accuracy: 0.7118
Epoch 2/10
60/60 [==============================] - 0s 6ms/step - loss: 0.5125 - accuracy: 0.8281
Epoch 3/10
60/60 [==============================] - 0s 6ms/step - loss: 0.4585 - accuracy: 0.8425
Epoch 4/10
60/60 [==============================] - 0s 6ms/step - loss: 0.4238 - accuracy: 0.8547
Epoch 5/10
60/60 [==============================] - 0s 7ms/step - loss: 0.4038 - accuracy: 0.8608
Epoch 6/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3886 - accuracy: 0.8656
Epoch 7/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3788 - accuracy: 0.8689
Epoch 8/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3669 - accuracy: 0.8725
Epoch 9/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3560 - accuracy: 0.8753
Epoch 10/10
60/60 [==============================] - 0s 6ms/step - loss: 0.3451 - accuracy: 0.8794
I am also adding code for your reference with convolutional layers, using categorical_crossentropy and the functional API instead of Sequential. Please read the comments inline in the code for more clarity. This should give you an idea of some good practices when working with Keras.
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras import layers, Model, utils
#Load data
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
#Normalize
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
#Reshape
x_train = x_train.reshape(60000,28,28,1)
x_train = x_train / 255.0
x_test = x_test.reshape(10000,28,28,1)
x_test = x_test / 255.0
#Set y to onehot instead of label encoded
y_train = utils.to_categorical(y_train)
y_test = utils.to_categorical(y_test)
#print([i.shape for i in [x_train, y_train, x_test, y_test]])
#Construct computation graph
inp = layers.Input((28,28,1))
x = layers.Conv2D(32, (3,3), activation='relu', padding='same')(inp)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Conv2D(32, (3,3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2,2))(x)
x = layers.Flatten()(x)
out = layers.Dense(10, activation='softmax')(x)
#Define model
model = Model(inp, out)
#Compile with loss as cross_entropy and optimizer as adam
model.compile(optimizer='adam', loss="categorical_crossentropy", metrics=["accuracy"])
#Fit model
model.fit(x_train, y_train, epochs=10, batch_size=1000)
utils.plot_model(model, show_layer_names=False, show_shapes=True)
I am trying to train a simple CNN model for a binary classification task in Keras with a dataset of images I mined. The problem is that I am getting constant accuracy, val_accuracy and loss after a couple of epochs. Am I processing the data the wrong way? Or is it something in the model settings?
At the beginning I was using softmax as the final activation function and categorical crossentropy, and I was also using the to_categorical function on the labels.
After reading up on what usually causes this to happen, I decided to use sigmoid and binary_crossentropy instead and not use to_categorical. Still the problem persists, and I am starting to wonder whether the problem is my data (the two classes are too similar) or the way I am feeding the image arrays.
conkeras1 = []
pics = os.listdir("/Matrices/")
# I do this for the images of both classes, just keeping it short.
for x in range(len(pics)):
    img = image.load_img("Matrices/"+pics[x])
    conkeras1.append(img)
conkeras = conkeras1+conkeras2
conkeras = np.array([image.img_to_array(x) for x in conkeras]).astype("float32")
conkeras = conkeras / 255 # I also tried normalizing with a z-score with no success
yecs1 = [1]*len(conkeras1)
yecs2 = [0]*len(conkeras2)
y_train = yecs1+yecs2
y_train = np.array(y_train).astype("float32")
model = Sequential([
    Conv2D(64, (3, 3), input_shape=conkeras.shape[1:], padding="same", activation="relu"),
    Conv2D(32, (3, 3), activation="relu", padding="same"),
    Flatten(),
    Dense(500, activation="relu"),
    #Dense(4096, activation="relu"),
    Dense(1, activation="sigmoid")
])
model.compile(loss=keras.losses.binary_crossentropy,
              optimizer=keras.optimizers.Adam(lr=0.001),
              metrics=['accuracy'])
history = model.fit(conkeras, y_train,
                    batch_size=32,
                    epochs=32, shuffle=True,
                    verbose=1,
                    callbacks=[tensorboard])
The output I get is this:
Epoch 1/32
975/975 [==============================] - 107s 110ms/step - loss: 8.0022 - acc: 0.4800
Epoch 2/32
975/975 [==============================] - 99s 101ms/step - loss: 8.1756 - acc: 0.4872
Epoch 3/32
975/975 [==============================] - 97s 100ms/step - loss: 8.1756 - acc: 0.4872
Epoch 4/32
975/975 [==============================] - 97s 99ms/step - loss: 8.1756 - acc: 0.4872
and these are the shapes of the training set and labels
>>> conkeras.shape
(975, 100, 100, 3)
>>> y_train.shape
(975,)
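For reference, a minimal sketch (on dummy data, not the asker's images) of the two consistent setups the question alternates between; mixing elements of the two is a common cause of stuck training:
import numpy as np
from tensorflow.keras import Sequential, utils
from tensorflow.keras.layers import Dense

X = np.random.random((100, 8)).astype("float32")  # dummy features
y = np.random.randint(0, 2, size=(100,))          # integer labels 0/1

# Setup A: one sigmoid unit + binary_crossentropy + integer labels
model_a = Sequential([Dense(16, activation="relu", input_shape=(8,)),
                      Dense(1, activation="sigmoid")])
model_a.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model_a.fit(X, y, epochs=1, verbose=0)

# Setup B: two softmax units + categorical_crossentropy + one-hot labels
model_b = Sequential([Dense(16, activation="relu", input_shape=(8,)),
                      Dense(2, activation="softmax")])
model_b.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model_b.fit(X, utils.to_categorical(y), epochs=1, verbose=0)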
I'm new to TensorFlow, Keras, and the Dataset API. Can anyone help me understand why the following code doesn't work?
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
from tensorflow.python.data.ops import dataset_ops
from tensorflow.python.data.ops import iterator_ops
from tensorflow.python.keras.utils import multi_gpu_model
from tensorflow.python.keras import backend as K
data = np.random.random((1000,32))
labels = np.random.random((1000,10))
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
print(dataset)
print(dataset.output_types)
print(dataset.output_shapes)
dataset.batch(10)
dataset.repeat(100)
inputs = keras.Input(shape=(32,)) # Returns a placeholder tensor
# A layer instance is callable on a tensor, and returns a tensor.
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(64, activation='relu')(x)
predictions = keras.layers.Dense(10, activation='softmax')(x)
# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)
# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Trains for 5 epochs
model.fit(dataset, epochs=5, steps_per_epoch=100)
It failed with the following error:
model.fit(x=dataset, y=None, epochs=5, steps_per_epoch=100)
  File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1510, in fit
    validation_split=validation_split)
  File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 994, in _standardize_user_data
    class_weight, batch_size)
  File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1113, in _standardize_weights
    exception_prefix='input')
  File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training_utils.py", line 325, in standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)
According to tf.keras guide, I should be able to directly pass the dataset to model.fit, as this example shows:
Input tf.data datasets
Use the Datasets API to scale to large datasets or multi-device training. Pass a tf.data.Dataset instance to the fit method:
# Instantiates a toy dataset instance:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()
Don't forget to specify steps_per_epoch when calling fit on a dataset.
model.fit(dataset, epochs=10, steps_per_epoch=30)
Here, the fit method uses the steps_per_epoch argument—this is the number of training steps the model runs before it moves to the next epoch. Since the Dataset yields batches of data, this snippet does not require a batch_size.
Datasets can also be used for validation:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32).repeat()
model.fit(dataset, epochs=10, steps_per_epoch=30,
          validation_data=val_dataset,
          validation_steps=3)
What's the problem with my code, and what's the correct way of doing it?
To your original question as to why you're getting the error:
Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)
The reason your code breaks is that you haven't assigned the result of .batch() back to the dataset variable, like so:
dataset = dataset.batch(10)
You simply called dataset.batch(10), which does nothing on its own: tf.data transformations return a new dataset rather than modifying the existing one in place. Without the batch(), the dataset yields unbatched tensors of shape (32,) instead of batched ones of shape (batch_size, 32).
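A minimal corrected sketch of the pipeline (model and data as defined in the question, unchanged):
# Reassign each transformation: tf.data ops return new datasets.
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(10)
dataset = dataset.repeat(100)
model.fit(dataset, epochs=5, steps_per_epoch=100)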
You are missing an iterator definition, which is the reason for the error.
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Input, Dense

data = np.random.random((1000,32))
labels = np.random.random((1000,10))
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
dataset = dataset.batch(10).repeat()
inputs = Input(shape=(32,)) # Returns a placeholder tensor
# A layer instance is callable on a tensor, and returns a tensor.
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)
# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Trains for 5 epochs
model.fit(dataset.make_one_shot_iterator(), epochs=5, steps_per_epoch=100)
Epoch 1/5
100/100 [==============================] - 1s 8ms/step - loss: 11.5787 - acc: 0.1010
Epoch 2/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4846 - acc: 0.0990
Epoch 3/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4690 - acc: 0.1270
Epoch 4/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4611 - acc: 0.1300
Epoch 5/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4546 - acc: 0.1360
This is the result on my system.
Please tell me what I did wrong: why does the accuracy not increase?
I tried everything: added layers, increased and decreased the number of iterations, even tried adding dropout (even though I do not have overfitting here), but it does not work :(
from __future__ import print_function
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
from keras.layers import Dropout
np.random.seed()
NB_EPOCH = 100
VERBOSE = 1
NB_CLASSES = 2
X_in = [[1,0],[1,1],[0,0],[0,1],[1,1],[0,0],[1,1]]
X_answer = [1,0,0,1,0,0,0]
X_in = np.asarray(X_in, dtype=np.float32)
X_answer = np.asarray(X_answer, dtype=np.float32)
X_answer = np_utils.to_categorical(X_answer, NB_CLASSES)
model = Sequential()
model.add(Dense(300, input_dim = 2, activation='relu'))
model.add(Dense(300, input_dim = 300, activation='softmax'))
model.add(Dense(2, input_dim = 300, activation='relu'))
#model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_in, X_answer, epochs=NB_EPOCH, verbose=VERBOSE)
There are only 4 possible outcomes of the XOR operation. I have changed your source a bit so it works just fine now, but it requires a few hundred iterations to learn the necessary stuff:
#!/usr/bin/env python
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.utils import np_utils
from keras.layers import Dropout
np.random.seed()
NB_EPOCH = 1000
VERBOSE = 1
NB_CLASSES = 2
X_in = [[1,0],[1,1],[0,0],[0,1]]
X_answer = [[0,1],[1,0],[1,0],[0,1]]
X_in = np.asarray(X_in, dtype=np.float32)
X_answer = np.asarray(X_answer, dtype=np.float32)
#X_answer = np_utils.to_categorical(X_answer, NB_CLASSES)
model = Sequential()
model.add(Dense(10, input_dim = 2, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(2, activation='softmax'))
#model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_in, X_answer, epochs=NB_EPOCH, verbose=VERBOSE)
print(model.predict(X_in))
The result is:
Epoch 995/1000
4/4 [==============================] - 0s - loss: 0.1393 - acc: 1.0000
Epoch 996/1000
4/4 [==============================] - 0s - loss: 0.1390 - acc: 1.0000
Epoch 997/1000
4/4 [==============================] - 0s - loss: 0.1387 - acc: 1.0000
Epoch 998/1000
4/4 [==============================] - 0s - loss: 0.1385 - acc: 1.0000
Epoch 999/1000
4/4 [==============================] - 0s - loss: 0.1383 - acc: 1.0000
Epoch 1000/1000
4/4 [==============================] - 0s - loss: 0.1380 - acc: 1.0000
[[ 0.00492113 0.9950788 ]
[ 0.99704748 0.0029525 ]
[ 0.99383503 0.00616499]
[ 0.00350395 0.99649602]]
which is really close to the required [0,1],[1,0],[1,0],[0,1] (X_answer)
There are quite a few problems here. No, it's not impossible (*).
Dropout has nothing to do with your problem (**). You're using a softmax and then a relu? That seems odd to me. Why are you casting to categorical? And why are you using such a big network when you have such small input (7 samples, but 300 units in the hidden layer)?
From here, a minimal example of xor with keras:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.optimizers import SGD

X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([[0],[1],[1],[0]])

model = Sequential()
model.add(Dense(8, input_dim=2))
model.add(Activation('tanh'))
model.add(Dense(1))
model.add(Activation('sigmoid'))

sgd = SGD(lr=0.1)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])

model.fit(X, y, batch_size=1, epochs=1000)
print(model.predict(X))
(*) You're using a multilayer perceptron, but the title grabbed my attention because there's a (relatively) famous proof that an NN without a hidden layer can't learn xor: such a network can only draw a linear decision boundary, and no single line separates {(0,0),(1,1)} from {(0,1),(1,0)}. Proof example here
(**) Dropout helps with generalization when you have deep networks. This is vital when training large models to detect complex high dimensional structures like humans in images. It is going to make your life a lot more difficult when trying to fit xor.
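For illustration, a sketch of where dropout typically sits in a larger model (hypothetical layer sizes):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Dropout randomly zeroes a fraction of activations during training,
# regularizing big networks; on a 4-sample problem like xor it only
# makes fitting harder.
model = Sequential([
    Dense(256, activation='relu', input_shape=(100,)),
    Dropout(0.5),   # drop 50% of this layer's activations while training
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')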