traning MNIST model with proper hyper params in Python

traning MNIST model with proper hyper params in Python - python

here is my code, i don't know why it gives me 0.3% accuracy
can anyone tell me what is the problem with this code?
def train_mnist():
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10)
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=5)
return history.epoch, history.history['acc'][-1]
train_mnist()
thanks in Adavnce

this will work! try this
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam',
loss=loss_fn,
metrics=['accuracy'])

The problem seem to be your loss function
Try this:
Method 1
You could use categorical_crossentropy as loss but the last layer should be
tf.keras.layers.Dense(10,activation='softmax')
and then
model.compile(optimizer = 'adam',
loss"categorical_crossentropy",
metrics=["accuracy"])
Method 2
In your case, the sparse_categorical_crossentropy loss need to define
tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True,name='sparse_categorical_crossentropy')
To understand the difference b\w these two see this

Related

An FFNN in pyhton for recognition of nanocrystals

I tried with this one but i think its not really good working so any idea ?
the code:
import tensorflow as tf
import numpy as np
(x_train, y_train), (x_test, y_test) = load_data_MEB()
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print('Test accuracy:', test_acc)
The project is the recognition of nanocrystals (present in cement) on SEM.
We have a database of 1000 SEM pictures where we can see the crystals, we need a program in python or any other programming language (CNN but ideally FFNN) to recognize these crystals and eventually their shape (square and triangular).

How to calculate mean relative error on test datasets

def create_model():
model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape=(40002, 12)))
model.add(LSTM(50, return_sequences= True))
model.add(LSTM(50, return_sequences= True))
model.add(tf.keras.layers.LSTM(30))
model.add(Dense(2, activation='linear'))
def rmse(Y_test, prediction):
return K.sqrt(K.mean(K.square(Y_test-prediction)))
# compile
model.compile(optimizer='adam', loss=rmse, metrics=['mean_squared_error', rmse])
return model
# fit the model
model = create_model()
model.fit(x_train, Y_train, shuffle=False, verbose=1, epochs=10)
# # predict model
prediction = model.predict(x_test, verbose=0)
print(prediction)
How to calculate mean relative error for tensor inputs i.e my Y_test and prediction are tensor.
Y_test and prediction as 2 values
Example:
Y_test = [[0.2,0.003],
[0.3, 0.008]]
prediction = [[0.4,0.005],
[0.5,0.007]]
mean_relative_error = mean(absolute(0.2-0.4)/0.2 + absolute(0.003-0.005)/0.003), mean(absolute(0.3-0.5)/0.3 + absolute(0.008-0.007)/0.008)
mean_relative_error = [0.533, 0.3925]
Please note that I don't want to use it for backpropagation to improve the network.

Would have added like this:
from tensorflow.math import reduce_mean, abs, reduce_sum
relative_error = reduce_mean(reduce_sum(abs(prediction-Y_test)/prediction, axis=1))
# [0.9, 0.54285717]
mean_relative_error = reduce_mean(relative_error)
# 0.7214286
I couldn't use tf.keras.losses.MeanAbsoluteError(reduction=tf.keras.losses.Reduction.NONE) because of a bug. The MeanAbsoluteError still does reduce to mean despite specifying it not to. The bug reported HERE

Get different accuracy on the same evaluating dataset after loading saved model

I just simply use MNIST dataset to implement a simple ML application. My code is
import tensorflow as tf
import numpy as np
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10)
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam',
loss=loss_fn,
metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
print('Before saving')
model.evaluate(x_test, y_test, verbose=2)
model.save('model.h5')
# load model again
loaded_model = tf.keras.models.load_model('model.h5')
# evaluate on the same data
print('After loading')
loaded_model.evaluate(x_test, y_test, verbose=2)
The accuracies on the same dataset are different after loading

This is a known issue: https://github.com/tensorflow/tensorflow/issues/42045
Compile the model with metrics='sparse_categorical_accuracy' instead of just 'accuracy'.

See all correctly and incorrectly identified images when training on the mnist dataset

I'm trying to find a way to visualize which numbers in the mnist dataset a model was able to correctly identify and which ones it wasn't.
What I can't seem to find is if such a visualization is possible in tensorboard or if I would need to use/create something else to achieve it.
I'm currently working from the basic tutorial provided for tensorflow 2.0 with tensorboard added.
import datetime
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([
tf.keras.layers.Flatten(input_shape=(28, 28)),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
log_dir="logs/fit/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)
model.fit(x_train,
y_train,
epochs=5,
validation_data=(x_test, y_test),
callbacks=[tensorboard_callback])
model.evaluate(x_test, y_test)

It appears the what-if tool is what I was looking for, it allows you to visually sort testing data depending on whether it was correctly or incorrectly identified by the model.
If you want to test it out here is their demo that I used to get the above image and they have multiple other demos on the tools site.

Hyperparapeters optimization with grid_search in keras and flow_from_directory

I tried to optimize hyperparameters in my keras CNN made for image classification. I decided to use grid search from sklearn. I overcame the fundamental difficulty with making x and y out of keras flow_from_directory but it still doesn't work.
Error in the last line
ValueError: dropout is not a legal parameter
def grid_model(optimizer='adam',
kernel_initializer='random_uniform',
dropout=0.2,
loss='categorical_crossentropy'):
model = Sequential()
model.add(Conv2D(6,(5,5),activation="relu",padding="same",
input_shape=(img_width, img_height, 3)))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(dropout))
model.add(Conv2D(16,(5,5),activation="relu"))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(dropout))
model.add(Flatten())
model.add(Dense(120, activation='relu', kernel_initializer=kernel_initializer))
model.add(Dropout(dropout))
model.add(Dense(84, activation='relu', kernel_initializer=kernel_initializer))
model.add(Dropout(dropout))
model.add(Dense(10, activation='softmax'))
model.compile(loss=loss,
optimizer=optimizer,
metrics=['accuracy'])
return model
train_generator = ImageDataGenerator(rescale=1/255)
validation_generator = ImageDataGenerator(rescale=1/255)
# Retrieve images and their classes for train and validation sets
train_flow = train_generator.flow_from_directory(directory=train_data_dir,
batch_size=batch_size,
target_size=(img_height,img_width))
validation_flow = validation_generator.flow_from_directory(directory=validation_data_dir,
batch_size=batch_size,
target_size=(img_height,img_width),
shuffle = False)
clf = KerasClassifier(build_fn=grid_model(), epochs=epochs, verbose=0)
param_grid = {
'clf__optimizer':['adam', 'Nadam'],
'clf__epochs':[100, 200],
'clf__dropout':[0.1, 0.2, 0.5],
'clf__kernel_initializer':['normal','uniform'],
'clf__loss':['categorical_crossentropy',
'sparse_categorical_crossentropy',
'kullback_leibler_divergence']
}
pipeline = Pipeline([('clf',clf)])
(X_train, Y_train) = train_flow.next()
grid = GridSearchCV(pipeline, cv=2, param_grid=param_grid)
grid.fit(X_train, Y_train)

The problem is in this line:
clf = KerasClassifier(build_fn=grid_model(), epochs=epochs, verbose=0)
change it to
clf = KerasClassifier(build_fn=grid_model, epochs=epochs, verbose=0)
The grid_model method should not be invoked but a reference to it should be passed.
Also, in the list of losses, 'sparse_categorical_crossentropy'(integer) cannot be used because the output shape required of the model is incompatible with that of 'categorical_crossentropy'(one-hot).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

traning MNIST model with proper hyper params in Python - python

this will work! try this loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True) model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])

Related

An FFNN in pyhton for recognition of nanocrystals

How to calculate mean relative error on test datasets

Get different accuracy on the same evaluating dataset after loading saved model

See all correctly and incorrectly identified images when training on the mnist dataset

Hyperparapeters optimization with grid_search in keras and flow_from_directory

Categories

Resources