I am training an image classification model to classify certain images containing mountain, waterfall and people classes. Currently, I am using Vgg-net (transfer learning) to train this data. While training, I am getting almost 93% accuracy on training data and 90% accuracy on validation data. However, when I want to check the classes being wrongly classified during training by using confusion matrix on the training data, the accuracy seems to be much less. Only about 30% of the data is classified correctly.
I have tried checking the confusion matrix on other similar image classification problems but there it seems to be showing the accuracy that we see while training.
Code to create ImageDataGenerator objects for training and validation
from keras.applications.inception_v3 import preprocess_input, decode_predictions
#Train DataSet Generator with Augmentation
print("\nTraining Data Set")
train_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
train_flow = train_generator.flow(
    train_images, train_labels,
    batch_size=BATCH_SIZE
)
#Validation DataSet Generator with Augmentation
print("\nValidation Data Set")
val_generator = ImageDataGenerator(preprocessing_function=preprocess_input)
val_flow = val_generator.flow(
    validation_images, validation_labels,
    batch_size=BATCH_SIZE
)
Code to build the model and compile it
# Initialize VGG16 with transfer learning
base_model = applications.vgg16.VGG16(weights='imagenet',
                                      include_top=False,
                                      input_shape=(WIDTH, HEIGHT, 3))
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# and a dense layer
x = Dense(1024, activation='relu')(x)
predictions = Dense(len(categories), activation='softmax')(x)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional VGG16 layers
for layer in base_model.layers:
    layer.trainable = False
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer=optimizers.RMSprop(lr=1e-4), metrics=['accuracy'], loss='categorical_crossentropy')
model.summary()
Code to fit the model to the training dataset and validate using validation dataset
import math
top_layers_file_path=r"/content/drive/My Drive/intel-image-classification/intel-image-classification/top_layers.iv3.hdf5"
checkpoint = ModelCheckpoint(top_layers_file_path, monitor='loss', verbose=1, save_best_only=True, mode='min')
tb = TensorBoard(log_dir=r'/content/drive/My Drive/intel-image-classification/intel-image-classification/logs', batch_size=BATCH_SIZE, write_graph=True, update_freq='batch')
early = EarlyStopping(monitor="loss", mode="min", patience=5)
csv_logger = CSVLogger(r'/content/drive/My Drive/intel-image-classification/intel-image-classification/logs/iv3-log.csv', append=True)
history = model.fit_generator(train_flow,
                              epochs=30,
                              verbose=1,
                              validation_data=val_flow,
                              validation_steps=math.ceil(validation_images.shape[0]/BATCH_SIZE),
                              steps_per_epoch=math.ceil(train_images.shape[0]/BATCH_SIZE),
                              callbacks=[checkpoint, early, tb, csv_logger])
The training steps show the following accuracy:
Epoch 1/30
91/91 [==============================] - 44s 488ms/step - loss: 0.6757 - acc: 0.7709 - val_loss: 0.4982 - val_acc: 0.8513
Epoch 2/30
91/91 [==============================] - 32s 349ms/step - loss: 0.4454 - acc: 0.8395 - val_loss: 0.3980 - val_acc: 0.8557
.
.
Epoch 20/30
91/91 [==============================] - 32s 349ms/step - loss: 0.2026 - acc: 0.9238 - val_loss: 0.2684 - val_acc: 0.8940
.
.
Epoch 30/30
91/91 [==============================] - 32s 349ms/step - loss: 0.1739 - acc: 0.9364 - val_loss: 0.2616 - val_acc: 0.8984
Ran predictions on the training dataset itself:
import math
import numpy as np
predictions = model.predict_generator(
    train_flow,
    verbose=1,
    steps=math.ceil(train_images.shape[0]/BATCH_SIZE))
predicted_classes = [x[0] for x in enc.inverse_transform(predictions)]
true_classes = [x[0] for x in enc.inverse_transform(train_labels)]
(Here enc is a OneHotEncoder.)
However, the confusion matrix looks like the following:
import sklearn
sklearn.metrics.confusion_matrix(
    true_classes,
    predicted_classes,
    labels=['mountain', 'people', 'waterfall'])
Confusion matrix:
[[315, 314, 283],
 [334, 309, 273],
 [337, 280, 263]]
The complete code has been uploaded on https://nbviewer.jupyter.org/github/paridhichoudhary/scene-image-classification/blob/master/Classification_v7.ipynb
I believe it's because train_generator.flow has shuffle=True by default, so the order of predicted_classes doesn't match the order of train_labels.
Setting shuffle=False on train_generator.flow should help, or you could instead use something like the following, which might be easier to understand.
predicted_classes = []
true_classes = []
# you can just use len(train_flow) instead of math.ceil(train_images.shape[0]/BATCH_SIZE)
for i in range(len(train_flow)):
    x, y = train_flow[i]  # indexing the flow like this yields one matched (images, labels) batch
    z = model.predict(x)
    for j in range(x.shape[0]):
        # convert probabilities / one-hot labels to class indices so they can
        # be fed to sklearn's confusion_matrix
        predicted_classes.append(np.argmax(z[j]))
        true_classes.append(np.argmax(y[j]))
I haven't tried this yet, but it should work.
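For reference, a minimal sketch of the shuffle=False variant might look like this; it assumes the same train_images, train_labels, preprocess_input, BATCH_SIZE and model as in the question:
import numpy as np
from sklearn.metrics import confusion_matrix
from keras.preprocessing.image import ImageDataGenerator
# a separate, non-shuffling flow used only for evaluation
eval_flow = ImageDataGenerator(preprocessing_function=preprocess_input).flow(
    train_images, train_labels,
    batch_size=BATCH_SIZE,
    shuffle=False)  # keep the sample order fixed so predictions align with labels
probs = model.predict_generator(eval_flow, steps=len(eval_flow), verbose=1)
predicted_classes = np.argmax(probs, axis=1)
true_classes = np.argmax(train_labels, axis=1)
print(confusion_matrix(true_classes, predicted_classes))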
I am calling model.fit() several times, each call training one block of layers while the other layers are frozen.
CODE
# create the base pre-trained model
base_model = efn.EfficientNetB0(input_tensor=input_tensor,weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# add a fully-connected layer
x = Dense(x.shape[1], activation='relu', name='first_dense')(x)
x = Dropout(0.5)(x)
x = Dense(x.shape[1], activation='relu', name='output')(x)
x = Dropout(0.5)(x)
no_classes = 10
predictions = Dense(no_classes, activation='softmax')(x)
# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional layers
for layer in base_model.layers:
    layer.trainable = False
#FIRST COMPILE
model.compile(optimizer='Adam', loss=loss_function,
              metrics=['accuracy'])
#FIRST FIT
model.fit(features[train], labels[train],
          batch_size=batch_size,
          epochs=top_epoch,
          verbose=verbosity,
          validation_split=validation_split)
# Generate generalization metrics
scores = model.evaluate(features[test], labels[test], verbose=1)
print(scores)
#Let all layers be trainable
for layer in model.layers:
    layer.trainable = True
from tensorflow.keras.optimizers import SGD
#SECOND COMPILE
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss=loss_function,
              metrics=['accuracy'])
#SECOND FIT
model.fit(features[train], labels[train],
          batch_size=batch_size,
          epochs=no_epochs,
          verbose=verbosity,
          validation_split=validation_split)
What is weird is that in the second fit, the accuracy after the first epoch is much lower than the accuracy at the last epoch of the first fit.
RESULT
Epoch 40/40
6286/6286 [==============================] - 14s 2ms/sample - loss: 0.2370 - accuracy: 0.9211 - val_loss: 1.3579 - val_accuracy: 0.6762
874/874 [==============================] - 2s 2ms/sample - loss: 0.4122 - accuracy: 0.8764
Train on 6286 samples, validate on 1572 samples
Epoch 1/40
6286/6286 [==============================] - 60s 9ms/sample - loss: 5.9343 - accuracy: 0.5655 - val_loss: 2.4981 - val_accuracy: 0.5115
I think the second fit does not start from the weights learned in the first fit.
Thanks in advance!!!
I think this is the result of using a different optimizer. You used Adam in the first fit and SGD in the second fit. Try using Adam in the second fit as well and see if it works correctly.
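If it helps, here is a minimal sketch of that suggestion, reusing the names from the question; the lower learning rate is my own assumption for the fine-tuning phase, not part of the original answer:
from tensorflow.keras.optimizers import Adam
# unfreeze everything for fine-tuning
for layer in model.layers:
    layer.trainable = True
# keep the same optimizer family as the first fit, just with a smaller step size
model.compile(optimizer=Adam(lr=1e-4), loss=loss_function,
              metrics=['accuracy'])
# second fit, now continuing from the weights learned in the first fit
model.fit(features[train], labels[train],
          batch_size=batch_size,
          epochs=no_epochs,
          verbose=verbosity,
          validation_split=validation_split)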
I solved this by removing the second compile call.
I am currently working on a project that involves training a regression model, saving it, and then loading it to make further predictions with it. However, I have a problem: every time I call model.predict on images, it gives out the same prediction for all of them. I am not entirely sure what the problem is; maybe it's in the training stage, or maybe I'm just doing something wrong.
I was following this tutorial
All of the files are in this github repo
Here are some bits from the code:
(This part is training the model and saving it)
model = create_cnn(400, 400, 3, regress=True)
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="mean_absolute_percentage_error", optimizer=opt)
model.fit(X, Y, epochs=70, batch_size=8)
model.save("D:/statispic2/final-statispic_model.hdf5")
The next code part is from loading the model and making predictions.
model = load_model("D:/statispic2/statispic_model.hdf5")  # Loading the model
# images_ready_for_prediction is a numpy array loaded with the images
# in exactly the same way as for the training stage.
prediction = model.predict(images_ready_for_prediction)
print(prediction)
After trying it out this is the output prediction from the model:
[[0.05169942] # I gave it 5 images as parameters
[0.05169942]
[0.05169942]
[0.05169942]
[0.05169942]]
If anything is unclear, or you would like to see some more code, please let me know.
People saying regression and CNN are two completely different things have clearly missed some basics in their ML course. Yes, they are different concepts, but they should not be compared against each other ;)
A CNN is a type of deep neural network that became famous for its use on images. It is a modelling framework, and it can solve both regression AND classification problems.
Regression refers to the type of output you are predicting. So comparing the two directly is quite pointless, to be honest.
I can't comment on the specific answers misleading you here, since I need a certain number of reputation points to do so.
However, back to the problem: do you encounter this behaviour before or after saving the model? If before, I would try scaling your output values to an easier distribution. If it only happens after saving, I would look into the versions of your framework and the documentation of how it saves models.
It could also just be that there is no information in the pictures.
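For the "scale your output values" suggestion, a rough sketch could look like this; the MinMaxScaler and the reshape are my own assumptions about the shape of Y, not the asker's actual code:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
# squeeze the regression targets into [0, 1] before training
y_scaler = MinMaxScaler()
Y_scaled = y_scaler.fit_transform(np.asarray(Y).reshape(-1, 1))
model.fit(X, Y_scaled, epochs=70, batch_size=8)
# at prediction time, map the outputs back to the original scale
pred_scaled = model.predict(images_ready_for_prediction)
pred = y_scaler.inverse_transform(pred_scaled)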
No, no, no! Regression is completely different from CNN. Do a little research and the differences will quickly become apparent. In the meantime, I'll share two code samples with you right here.
Regression:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
#%matplotlib inline
import sklearn
from sklearn.datasets import load_boston
boston = load_boston()
# Now we will load the data into a pandas dataframe and then will print the first few rows of the data using the head() function.
bos = pd.DataFrame(boston.data)
bos.head()
bos.columns = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']
bos.head()
bos['MEDV'] = boston.target
bos.describe()
bos.isnull().sum()
sns.distplot(bos['MEDV'])
plt.show()
sns.pairplot(bos)
corr_mat = bos.corr().round(2)
sns.heatmap(data=corr_mat, annot=True)
sns.lmplot(x = 'RM', y = 'MEDV', data = bos)
X = bos[['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX','PTRATIO', 'B', 'LSTAT']]
y = bos['MEDV']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 10)
# Training the Model
# We will now train our model using the LinearRegression function from the sklearn library.
from sklearn.linear_model import LinearRegression
lm = LinearRegression()
lm.fit(X_train, y_train)
# Prediction
# We will now make prediction on the test data using the LinearRegression function and plot a scatterplot between the test data and the predicted value.
prediction = lm.predict(X_test)
plt.scatter(y_test, prediction)
df1 = pd.DataFrame({'Actual': y_test, 'Predicted':prediction})
df2 = df1.head(10)
df2
df2.plot(kind = 'bar')
from sklearn import metrics
from sklearn.metrics import r2_score
print('MAE', metrics.mean_absolute_error(y_test, prediction))
print('MSE', metrics.mean_squared_error(y_test, prediction))
print('RMSE', np.sqrt(metrics.mean_squared_error(y_test, prediction)))
print('R squared error', r2_score(y_test, prediction))
Result:
MAE 4.061419182954711
MSE 34.413968453138565
RMSE 5.866341999333023
R squared error 0.6709339839115628
CNN:
# keras imports for the dataset and building our neural network
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, MaxPool2D, Flatten
from keras.utils import np_utils
# to calculate accuracy
from sklearn.metrics import accuracy_score
# loading the dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# building the input vector from the 28x28 pixels
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# normalizing the data to help with the training
X_train /= 255
X_test /= 255
# one-hot encoding using keras' numpy-related utilities
n_classes = 10
print("Shape before one-hot encoding: ", y_train.shape)
Y_train = np_utils.to_categorical(y_train, n_classes)
Y_test = np_utils.to_categorical(y_test, n_classes)
print("Shape after one-hot encoding: ", Y_train.shape)
# building a linear stack of layers with the sequential model
model = Sequential()
# convolutional layer
model.add(Conv2D(25, kernel_size=(3,3), strides=(1,1), padding='valid', activation='relu', input_shape=(28,28,1)))
model.add(MaxPool2D(pool_size=(1,1)))
# flatten output of conv
model.add(Flatten())
# hidden layer
model.add(Dense(100, activation='relu'))
# output layer
model.add(Dense(10, activation='softmax'))
# compiling the sequential model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# training the model for 10 epochs
model.fit(X_train, Y_train, batch_size=128, epochs=10, validation_data=(X_test, Y_test))
Result:
Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 27s 451us/step - loss: 0.2037 - accuracy: 0.9400 - val_loss: 0.0866 - val_accuracy: 0.9745
Epoch 2/10
60000/60000 [==============================] - 27s 451us/step - loss: 0.0606 - accuracy: 0.9819 - val_loss: 0.0553 - val_accuracy: 0.9812
Epoch 3/10
60000/60000 [==============================] - 27s 445us/step - loss: 0.0352 - accuracy: 0.9892 - val_loss: 0.0533 - val_accuracy: 0.9824
Epoch 4/10
60000/60000 [==============================] - 27s 446us/step - loss: 0.0226 - accuracy: 0.9930 - val_loss: 0.0572 - val_accuracy: 0.9825
Epoch 5/10
60000/60000 [==============================] - 27s 448us/step - loss: 0.0148 - accuracy: 0.9959 - val_loss: 0.0516 - val_accuracy: 0.9834
Epoch 6/10
60000/60000 [==============================] - 27s 443us/step - loss: 0.0088 - accuracy: 0.9976 - val_loss: 0.0574 - val_accuracy: 0.9824
Epoch 7/10
60000/60000 [==============================] - 26s 442us/step - loss: 0.0089 - accuracy: 0.9973 - val_loss: 0.0526 - val_accuracy: 0.9847
Epoch 8/10
60000/60000 [==============================] - 26s 440us/step - loss: 0.0047 - accuracy: 0.9988 - val_loss: 0.0593 - val_accuracy: 0.9838
Epoch 9/10
60000/60000 [==============================] - 28s 469us/step - loss: 0.0056 - accuracy: 0.9986 - val_loss: 0.0559 - val_accuracy: 0.9836
Epoch 10/10
60000/60000 [==============================] - 27s 449us/step - loss: 0.0059 - accuracy: 0.9981 - val_loss: 0.0663 - val_accuracy: 0.9820
CNN is deep learning. You use regression models for calculating a number, like the price of a car.
I had the exact same issue after pickle.dump and pickle.load of my model. The problem I was missing was that I was not normalizing the features (vector X) before predicting with the model. I hope this will help you.
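In code, for the image case in this question, that point boils down to something like the sketch below; dividing by 255 is an illustrative choice of normalization, not necessarily what the tutorial used:
import numpy as np
# apply the same normalization at training time...
X = X.astype("float32") / 255.0
model.fit(X, Y, epochs=70, batch_size=8)
# ...and at prediction time, otherwise the model sees a very different input range
images_ready_for_prediction = images_ready_for_prediction.astype("float32") / 255.0
prediction = model.predict(images_ready_for_prediction)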
Changing the optimizer from Adam() to RMSprop() with a learning rate of >0.001 worked for me.
I made a Keras NN model for fake news detection. My features are the average length of the words, the average length of the sentences, the number of punctuation signs, the number of capitalized words, the number of questions, etc. I have 34 features. I have one output, 0 or 1 (0 for fake and 1 for real news).
I have used 50000 samples for training, 10000 for testing and 2000 for validation. The values in my data range from -1 to 10, so there is no big difference between values. I have used a StandardScaler like this:
x_train, x_test, y_train, y_test = train_test_split(features, results, test_size=0.20, random_state=0)
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
validation_features = scaler.transform(validation_features)
My NN:
model = Sequential()
model.add(Dense(34, input_dim = x_train.shape[1], activation = 'relu')) # input layer requires input_dim param
model.add(Dense(150, activation = 'relu'))
model.add(Dense(150, activation = 'relu'))
model.add(Dense(150, activation = 'relu'))
model.add(Dense(150, activation = 'relu'))
model.add(Dense(150, activation = 'relu'))
model.add(Dense(1, activation='sigmoid')) # sigmoid instead of relu for final probability between 0 and 1
model.compile(loss="binary_crossentropy", optimizer= "adam", metrics=['accuracy'])
es = EarlyStopping(monitor='val_loss', min_delta=0.0, patience=0, verbose=0, mode='auto')
model.fit(x_train, y_train, epochs = 15, shuffle = True, batch_size=64, validation_data=(validation_features, validation_results), verbose=2, callbacks=[es])
scores = model.evaluate(x_test, y_test)
print(model.metrics_names[0], round(scores[0]*100,2), model.metrics_names[1], round(scores[1]*100,2))
Results:
Train on 50407 samples, validate on 2000 samples
Epoch 1/15
- 3s - loss: 0.3293 - acc: 0.8587 - val_loss: 0.2826 - val_acc: 0.8725
Epoch 2/15
- 1s - loss: 0.2647 - acc: 0.8807 - val_loss: 0.2629 - val_acc: 0.8745
Epoch 3/15
- 1s - loss: 0.2459 - acc: 0.8885 - val_loss: 0.2602 - val_acc: 0.8825
Epoch 4/15
- 1s - loss: 0.2375 - acc: 0.8930 - val_loss: 0.2524 - val_acc: 0.8870
Epoch 5/15
- 1s - loss: 0.2291 - acc: 0.8960 - val_loss: 0.2423 - val_acc: 0.8905
Epoch 6/15
- 1s - loss: 0.2229 - acc: 0.8976 - val_loss: 0.2495 - val_acc: 0.8870
12602/12602 [==============================] - 0s 21us/step
loss 23.95 acc 88.81
Accuracy check:
prediction = model.predict(validation_features , batch_size=64)
res = []
for p in prediction:
    res.append(p[0].round(0))
# Accuracy with sklearn
acc_score = accuracy_score(validation_results, res)
print("Sklearn acc", acc_score) # 0.887
Saving the model:
model.save("new keras fake news acc 88.7.h5")
scaler_filename = "keras nn scaler.save"
joblib.dump(scaler, scaler_filename)
I have saved that model and that scaler.
When I load the model and the scaler and make predictions, I get an accuracy of 52%, which is very low given that I had an accuracy of 88.7% while training the model.
I applied .transform to my new data for testing.
validation_df = pd.read_csv("validation.csv")
validation_features = validation_df.iloc[:,:-1]
validation_results = validation_df.iloc[:,-1].tolist()
scaler = joblib.load("keras nn scaler.save")
validation_features = scaler.transform(validation_features)
my_model_1 = load_model("new keras fake news acc 88.7.h5")
prediction = my_model_1.predict(validation_features , batch_size=64)
res = []
for p in prediction:
    res.append(p[0].round(0))
# Accuracy with sklearn - much lower
acc_score = accuracy_score(validation_results, res)
print("Sklearn acc", round(acc_score,2)) # 0.52
Can you tell me what I am doing wrong? I have read a lot about this on GitHub and Stack Overflow, but I couldn't find the answer.
It is difficult to answer that without your actual data. But there is a smoking gun, raising suspicions that your validation data might be (very) different from your training & test ones; and it comes from your previous question on this:
If i use fit_transform on my [validation set] features, I do not get an error, but I get accuracy of 52%, and that's terrible (because I had 89.1 %).
Although using fit_transform on the validation data is indeed wrong methodology (the correct one being what you do here), in practice, it should not lead to such a high discrepancy in the accuracy.
In other words, I have actually seen many cases where people erroneously apply such fit_transform approaches to their validation/deployment data without ever realizing the mistake, simply because they don't get any performance discrepancy - hence they are not alerted. And such a situation is expected, if indeed all these data are qualitatively similar.
But discrepancies such as yours here lead to strong suspicions that your validation data are actually (very) different from your training & test ones. If this is the case, such performance discrepancies are to be expected: the whole ML practice is founded upon the (often implicit) assumption that our data (training, validation, test, real-world deployment ones etc) do not change qualitatively, and they all come from the same statistical distribution.
So, the next step here is to perform an exploratory analysis to both your training & validation data to investigate this (actually, this is always assumed to be the step #0 in any predictive task). I guess that even elementary measures (mean & max/min values etc) will show if there are strong differences between them, as I suspect.
In particular, scikit-learn's StandardScaler uses
z = (x - u) / s
for the transformation, where u is the mean value and s the standard deviation of the data. If these values are significantly different between your training and validation sets, the performance discrepancy should come as no surprise.
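As a concrete starting point for that exploratory step, a quick sketch could look like this; it assumes access to the unscaled x_train and validation_features from the question:
import pandas as pd
# summary statistics of the (unscaled) training and validation features
train_stats = pd.DataFrame(x_train).describe().loc[["mean", "std", "min", "max"]]
val_stats = pd.DataFrame(validation_features).describe().loc[["mean", "std", "min", "max"]]
print(train_stats)
print(val_stats)
# large gaps between the two tables point to a distribution shift
# between the training and validation data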
I'm new to TensorFlow, Keras, and the Dataset API. Can anyone help me understand why the following code doesn't work?
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
from tensorflow.python.data.ops import dataset_ops
from tensorflow.python.data.ops import iterator_ops
from tensorflow.python.keras.utils import multi_gpu_model
from tensorflow.python.keras import backend as K
data = np.random.random((1000,32))
labels = np.random.random((1000,10))
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
print( dataset)
print( dataset.output_types)
print( dataset.output_shapes)
dataset.batch(10)
dataset.repeat(100)
inputs = keras.Input(shape=(32,)) # Returns a placeholder tensor
# A layer instance is callable on a tensor, and returns a tensor.
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(64, activation='relu')(x)
predictions = keras.layers.Dense(10, activation='softmax')(x)
# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)
# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Trains for 5 epochs
model.fit(dataset, epochs=5, steps_per_epoch=100)
It failed with the following error:
model.fit(x=dataset, y=None, epochs=5, steps_per_epoch=100)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1510, in fit
validation_split=validation_split)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 994, in _standardize_user_data
class_weight, batch_size)
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training.py", line 1113, in _standardize_weights
exception_prefix='input')
File "/home/wuxinyu/pyEnv/lib/python3.5/site-packages/tensorflow/python/keras/engine/training_utils.py", line 325, in standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)
According to tf.keras guide, I should be able to directly pass the dataset to model.fit, as this example shows:
Input tf.data datasets
Use the Datasets API to scale to large datasets or multi-device training. Pass a tf.data.Dataset instance to the fit method:
# Instantiates a toy dataset instance:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()
Don't forget to specify steps_per_epoch when calling fit on a dataset.
model.fit(dataset, epochs=10, steps_per_epoch=30)
Here, the fit method uses the steps_per_epoch argument—this is the number of training steps the model runs before it moves to the next epoch. Since the Dataset yields batches of data, this snippet does not require a batch_size.
Datasets can also be used for validation:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32).repeat()
val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32).repeat()
model.fit(dataset, epochs=10, steps_per_epoch=30,
          validation_data=val_dataset,
          validation_steps=3)
What's the problem with my code, and what's the correct way of doing it?
To your original question as to why you're getting the error:
Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (32,)
The reason your code breaks is that you haven't assigned the result of .batch() back to the dataset variable, like so:
dataset = dataset.batch(10)
You simply called dataset.batch(10) without the assignment.
This breaks because without the batch() the output tensors are not batched, i.e. you get tensors of shape (32,) instead of (1, 32).
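Put together, the fix is simply to keep the transformed dataset; a minimal sketch with the question's data and model:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(10)    # assign the batched dataset back to the variable
dataset = dataset.repeat()     # repeat so the dataset is not exhausted during training
model.fit(dataset, epochs=5, steps_per_epoch=100)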
You are missing the definition of an iterator, which is the reason for the error.
import numpy as np
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.layers import Input, Dense

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(10).repeat()
inputs = Input(shape=(32,))  # Returns a placeholder tensor
# A layer instance is callable on a tensor, and returns a tensor.
x = Dense(64, activation='relu')(inputs)
x = Dense(64, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)
# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# Trains for 5 epochs
model.fit(dataset.make_one_shot_iterator(), epochs=5, steps_per_epoch=100)
Epoch 1/5
100/100 [==============================] - 1s 8ms/step - loss: 11.5787 - acc: 0.1010
Epoch 2/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4846 - acc: 0.0990
Epoch 3/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4690 - acc: 0.1270
Epoch 4/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4611 - acc: 0.1300
Epoch 5/5
100/100 [==============================] - 0s 4ms/step - loss: 11.4546 - acc: 0.1360
This is the result on my system.
I'm using ImageDataGenerator and flow_from_directory to generate my data, and model.fit_generator to fit the data.
This defaults to outputting the accuracy for the training data set only.
There doesn't seem to be an option to output the validation accuracy to the terminal.
Here is the relevant portion of my code:
#train data generator
print('Starting Preprocessing')
train_datagen = ImageDataGenerator(preprocessing_function = preprocess)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')
#same for validation
val_datagen = ImageDataGenerator(preprocessing_function = preprocess)
validation_generator = val_datagen.flow_from_directory(
validation_data_dir,
target_size = (img_height, img_width),
batch_size=batch_size,
class_mode='categorical')
########################Model Creation###################################
#create the base pre-trained model
print('Finished Preprocessing, starting model creating \n')
base_model = InceptionV3(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(12, activation='softmax')(x)
model = Model(input=base_model.input, output=predictions)
for layer in model.layers[:-34]:
    layer.trainable = False
for layer in model.layers[-34:]:
    layer.trainable = True
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.001, momentum=0.92),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
#############SAVE Model #######################################
file_name = str(datetime.datetime.now()).split(' ')[0] + '_{epoch:02d}.hdf5'
filepath = os.path.join(save_dir, file_name)
checkpoints = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                              save_best_only=False, save_weights_only=False,
                              mode='auto', period=2)
###############Fit Model #############################
model.fit_generator(
    train_generator,
    steps_per_epoch=total_samples//batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=total_validation//batch_size,
    callbacks=[checkpoints],
    shuffle=True)
UPDATE OUTPUT:
Throughout training, I'm only getting the training accuracy as output, but at the end of training I'm getting both the training and the validation accuracy.
Epoch 1/10
1/363 [..............................] - ETA: 1:05:58 - loss: 2.4976 - acc: 0.0640
2/363 [..............................] - ETA: 51:33 - loss: 2.4927 - acc: 0.0760
3/363 [..............................] - ETA: 48:55 - loss: 2.5067 - acc: 0.0787
4/363 [..............................] - ETA: 47:26 - loss: 2.5110 - acc: 0.0770
5/363 [..............................] - ETA: 46:30 - loss: 2.5021 - acc: 0.0824
6/363 [..............................] - ETA: 45:56 - loss: 2.5063 - acc: 0.0820
The idea is that you go through your validation set after each epoch, not after each batch.
If you had to evaluate the model's performance on the whole validation set after every batch, you would lose a lot of time.
After each epoch, you will have the corresponding losses and accuracies for both training and validation. But during an epoch, you only have access to the training loss and accuracy.
Validation loss and validation accuracy get printed for every epoch once you specify validation_split.
model.fit(X, Y, epochs=1000, batch_size=10, validation_split=0.2)
I have used the above in my code, and val_loss and val_acc are getting printed for every epoch, but not after every batch.
Hope that answers your question.
Epoch 1/500
1267/1267 [==============================] - 0s 376us/step - loss: 0.6428 - acc: 0.6409 - val_loss: 0.5963 - val_acc: 0.6656
In fit_generator,
fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, validation_freq=1, class_weight=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True, initial_epoch=0)
there is no validation_split parameter, so you can create two different ImageDataGenerator flows, one for training and one for validation, and then pass that validation generator to validation_data. Then it will print the validation loss and accuracy for every epoch.
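Applied to the code in the question, that amounts to something like the following sketch, reusing the question's generators and variable names:
model.fit_generator(
    train_generator,
    steps_per_epoch=total_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,            # the second flow_from_directory generator
    validation_steps=total_validation // batch_size)
# with validation_data set, Keras prints val_loss and val_acc at the end of
# every epoch, in addition to the per-batch training metrics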