I am trying to develop a CNN for signature recognition to identify which person a given signature belongs to. There are 3 different classes (persons) and 23 signatures for each of them. Given this small number of samples, I decided to use the Keras ImageDataGenerator to create additional images.
However, testing the CNN on different machines (Windows 10 and macOS) gives different accuracy scores when evaluating the model on the test data: 100% on Windows and 93% on macOS. Both machines run Python 3.7.3 64-bit.
The data is split using train_test_split from sklearn.model_selection with 0.8 for training and 0.2 for testing, and random_state set to 1. Data and labels are properly normalised and adjusted to fit into the CNN. Various values for steps_per_epoch, batch_size and epochs have been tried.
I have tried using both np.random.seed and tensorflow.set_random_seed; seed(1) gives 100% accuracy on the test data on the Windows PC, however the same seed on the other machine still yields a different accuracy score.
Here is the CNN architecture along with the method call for creating additional images. The following code yields an accuracy of 100% on one machine and 93.33% on the other.
# Assumed imports for the snippet below
from numpy.random import seed
from tensorflow import set_random_seed  # TF 1.x API
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

seed(185)               # fix the NumPy RNG
set_random_seed(185)    # fix the TensorFlow RNG

X_train, X_test, y_train, y_test = train_test_split(
    data, labels, train_size=0.8, test_size=0.2, random_state=1)

datagen = ImageDataGenerator()  # no augmentation transforms configured here

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(3, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit_generator(datagen.flow(X_train, y_train, batch_size=64),
                    validation_data=(X_test, y_test),
                    steps_per_epoch=30, epochs=10)
model.evaluate(X_test, y_test)
EDIT
So after more research I've discovered that using different hardware, specifically different graphics cards, will result in varying accuracies.
Saving the trained model and using it to evaluate data on different machines is the ideal solution.
Providing the solution here (Answer section), even though it is present in the Question section, for the benefit of the community.
Using different hardware, specifically different graphics cards, will result in varying accuracies even though we use the same seeds, code and dataset. Saving the trained model and using it to evaluate data on different machines is the ideal solution.
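A minimal sketch of that save-and-reuse approach, assuming a Keras version that supports saving to HDF5 (h5py installed); the filename signature_cnn.h5 is just a placeholder:
# On the machine where training happened
model.save('signature_cnn.h5')   # stores architecture, weights and optimizer state

# On any other machine
from keras.models import load_model
restored = load_model('signature_cnn.h5')   # placeholder filename
restored.evaluate(X_test, y_test)           # evaluation no longer depends on local training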
Related
I'm working on a classification problem (human activity classification) and I used a CNN. The model code is:
# Assumed imports for the snippet below (tf.keras)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Dropout, Flatten
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(Conv2D(100, (2, 2), activation='relu', input_shape=X_train[0].shape))
model.add(Dropout(0.1))
# pooling layer
model.add(MaxPool2D(2, 2))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
Compiling and fitting:
model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
history = model.fit(X_train, y_train, epochs = 20, validation_data= (X_test, y_test), verbose=1)
The accuracy curves looked like this (plot not included).
How could I increase the final accuracy value, and why does the curve rise so quickly?
There are a few avenues you can pursue here, specifically finding answers to the following questions for your particular problem. Here's a great video; although it is not about TensorFlow, the question you are asking is general enough for it to apply.
What is the right amount of time to train for? Likely the answer here is somewhere between 20 and 90 epochs; more specifically, it's where the two series in your plot start to diverge. In other words, your model starts to memorize the training data at the point of divergence. TensorFlow has early stopping mechanisms to help with this.
What is the performance of a naïve guesser? Is the complexity of your model proportional to the complexity/dimensionality of the problem?
What is the human insight that you can bring to the problem? Are there things you can do to the features that will help the model create separability in higher dimensions? For example, let's say your model is going to predict what activity a person is going to do at a given point in time. In this case, information related to people might be separate from time and activity data. You can create features that represent combinations of other features (assuming you have a lot of data), and encode this and feed it to your model. You can create embeddings in your model to get your model to deal with the sparsity that occurs when you combine such categorical features.
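A minimal sketch of the embedding idea, assuming the combined categorical feature has been encoded as an integer ID (num_categories and category_input are hypothetical names, and 50 is a made-up vocabulary size):
import tensorflow as tf

num_categories = 50   # hypothetical vocabulary size for the combined categorical feature
category_input = tf.keras.layers.Input(shape=(1,), dtype='int32')
embedded = tf.keras.layers.Embedding(input_dim=num_categories, output_dim=8)(category_input)
embedded = tf.keras.layers.Flatten()(embedded)
# 'embedded' can then be concatenated with the other feature branches
# via tf.keras.layers.Concatenate() before the classification head.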
Another aspect of this that I think is very important to answer is "Why am I solving this problem?". In some cases, the answer might be "I want to learn X", in which case you might approach it differently. For example, if it's all tabular data, you might have more interpretable/better results using something like scikit-learn using a tree based model. It also, of course, depends on the amount and type of data you have. Nested cross-validation can give you great insight into what are the combinations of hyperparameters and features that will produce a model that generalizes, and also about the variation you can expect to see on unseen data.
Best of luck!
I am writing a program for classifying images into two categories: "Wires" and "non-Wires". I have hand-labeled around 5000 microscope images, examples:
non-wire
wire
The neural network I am using is adapted from "Deep Learning with Python", the chapter about convolutional networks (I don't think convolutional networks are necessary here because there are no obvious hierarchies; Dense networks should be more suitable):
# Assumed imports for the snippet below
from keras import models, layers

model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(200, 200, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(2, activation='softmax'))
However, test accuracy after 10 epochs of training does not go over 92% when playing around with the parameters of the network. The training images contain about 1/3 wires and 2/3 non-wires. My question: do you see any obvious mistakes in this neural network design that inhibit accuracy, or do you think I am limited by the image quality? I have about 4000 training and 1000 test images.
You might get some improvement by handling the class imbalance with a class-weight dictionary. If the label for non-wire is 0 and the label for wire is 1, the weight dictionary would be
weight_dict = {0: .5, 1: 1}
In model.fit set class_weight=weight_dict.
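Alternatively, a hedged sketch of deriving the weights from the label frequencies instead of hard-coding them, assuming integer labels in y_train with 0 = non-wire and 1 = wire:
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

classes = np.unique(y_train)
weights = compute_class_weight(class_weight='balanced', classes=classes, y=y_train)
weight_dict = dict(zip(classes, weights))   # roughly {0: 0.75, 1: 1.5} for a 2:1 imbalance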
Without seeing the results of training (training loss and validation loss) I can't tell what else to do. If you are overfitting, try adding some dropout layers. I also recommend trying an adjustable learning rate with the Keras callback ReduceLROnPlateau, and early stopping with the Keras callback EarlyStopping. Documentation is here. Set each callback to monitor validation loss. My suggested code is shown below:
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.5, patience=2, verbose=1)
e_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, verbose=0, restore_best_weights=True)
callbacks = [reduce_lr, e_stop]
In model.fit, include callbacks=callbacks.
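Putting the pieces together, a minimal sketch of the fit call (the epoch count and batch size are placeholder values):
history = model.fit(
    X_train, y_train,
    validation_data=(X_test, y_test),
    epochs=30,                 # placeholder value
    batch_size=32,             # placeholder value
    class_weight=weight_dict,
    callbacks=callbacks,
    verbose=1)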
If you want to give a convolutional network a try, I recommend transfer learning with the MobileNet model. Documentation for that is here. My recommended code for that is below:
# Assumed imports for the snippet below
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adamax

base_model = tf.keras.applications.mobilenet.MobileNet(
    include_top=False, input_shape=(200, 200, 3),
    pooling='max', weights='imagenet', dropout=.4)
x = base_model.output
x = keras.layers.BatchNormalization(axis=-1, momentum=0.99, epsilon=0.001)(x)
x = Dense(1024, activation='relu')(x)
x = Dropout(rate=.3, seed=123)(x)
output = Dense(2, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=output)
model.compile(Adamax(lr=.001), loss='categorical_crossentropy',
              metrics=['accuracy'])
In model.fit include the callbacks as shown above.
I am currently running a simple script to train on the MNIST dataset.
Running the training on my CPU via TensorFlow gives me 49us/sample and a 3s epoch, using the following code:
# CPU
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3)
When I run the dataset on my AMD Radeon Pro 580 via the opencl_amd_radeon_pro_580_compute_engine from the plaidml setup, I get 249us/sample with a 15s epoch, using the following code:
# GPU
import plaidml.keras
plaidml.keras.install_backend()
import keras
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = keras.utils.normalize(x_train, axis=1)
x_test = keras.utils.normalize(x_test, axis=1)
model = keras.models.Sequential()
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3)
I can see my CPU firing up during the CPU test and my GPU maxing out during the GPU test, but I am very confused as to why the CPU is outperforming the GPU by a factor of 5.
Is this the expected result?
Am I doing something wrong in my code?
It seems I've found the right solution, at least for the macOS/Keras/AMD GPU setup.
TL;DR:
Do not use OpenCL; use metal instead.
Do not use TensorFlow 2.0; use the Keras-only API.
Here are the details:
Run plaidml-setup and pick metal; this is important!
...
Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:
1 : llvm_cpu.0
2 : metal_intel(r)_uhd_graphics_630.0
3 : metal_amd_radeon_pro_560x.0
Default device? (1,2,3)[1]:3
...
Make sure you save the changes:
Save settings to /Users/alexanderegorov/.plaidml? (y,n)[y]:y
Success!
Now run MNIST example, you should see something like:
INFO:plaidml:Opening device "metal_amd_radeon_pro_560x.0"
This is it. I have made a comparison using plaidbench keras mobilenet:
metal_amd_radeon_pro_560x.0 FASTEST!
Example finished, elapsed: 0.435s (compile), 8.057s (execution)
opencl_amd_amd_radeon_pro_560x_compute_engine.0
Example finished, elapsed: 3.197s (compile), 14.620s (execution)
llvm_cpu.0
Example finished, elapsed: 3.619s (compile), 47.837s (execution)
I think there are two aspects of the observed situation:
plaidml is not that great in my experience, and I've had similar results, sadly.
Moving data to the GPU is slow. In this case, the MNIST data is really small and the time to move it there outweighs the "benefit" of parallelizing the computation. The TF CPU backend probably does parallel matrix multiplication as well, but it's much faster because the data is small and closer to the processing unit.
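One hedged way to test that explanation is to give the GPU more work per step, for example a much larger batch size, and compare the per-sample time again (1024 is just an illustrative value):
model.fit(x_train, y_train, epochs=3, batch_size=1024)  # vs. the default batch_size=32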
I used OpenCV to crop images from the photo.
From this:
to this :
Then I crop it into 5 different parts with different thresholds and angles (using a 2D rotation matrix) for training a neural network.
Now I have 45 similar jpg files for each digit from 0 to 9.
But I can't understand how I could train the network with my own data instead of the MNIST dataset.
Please help me build a digit recognition program. I need to extract all the digits from an image as text.
If you are going for the NN approach, I would first start with a small network and see how well it does; you can use the MNIST toy example from here.
Just note that you will need to use your own data instead of MNIST:
import tensorflow as tf

x_train, y_train = load_train_data()
x_test, y_test = load_test_data()

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
Note that I 'invented' two functions, load_train_data() and load_test_data(); you need to implement them for your data and return a tuple of ((samples, x, y), labels) from each one (a sketch follows below).
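A hedged sketch of what such a loader might look like, assuming the cropped JPGs are stored in one folder per digit (the data/train/0 ... data/train/9 layout is hypothetical) and resized to 28x28 grayscale:
import os
import cv2
import numpy as np

def load_train_data(root='data/train'):   # hypothetical directory layout
    images, labels = [], []
    for digit in range(10):
        folder = os.path.join(root, str(digit))
        for name in os.listdir(folder):
            img = cv2.imread(os.path.join(folder, name), cv2.IMREAD_GRAYSCALE)
            img = cv2.resize(img, (28, 28)) / 255.0   # normalise to [0, 1]
            images.append(img)
            labels.append(digit)
    return np.array(images), np.array(labels)

load_test_data() can be the same function pointed at a separate test folder.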
Once you get the feeling for it, I would explore somewhat more advanced networks; you can look here: https://towardsdatascience.com/a-simple-2d-cnn-for-mnist-digit-recognition-a998dbc1e79a. It's a nice tutorial for a 2D CNN; just use your data-loading functions instead of MNIST.
As you will probably now hit a wall because you don't have enough data, you will need to apply some data augmentation (see the sketch below).
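A minimal augmentation sketch using Keras' ImageDataGenerator; the parameter values are just illustrative starting points, and it assumes a TF 2.x Keras where model.fit accepts a generator:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rotation_range=10,        # small random rotations
    width_shift_range=0.1,    # small horizontal shifts
    height_shift_range=0.1,   # small vertical shifts
    zoom_range=0.1)

# datagen.flow expects a channel axis, e.g. shape (n, 28, 28, 1)
model.fit(datagen.flow(x_train[..., None], y_train, batch_size=32), epochs=20)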
There is a very nice approach, 'Deep Diffeomorphic Transformer Networks' from the last CVPR, which performs very well on digit classification with a low number of samples. You can find the MNIST code here; again, use your functions for the data.
I have two GPUs installed on two different machines. I want to build a cluster that allows me to train a Keras model using the two GPUs together.
The Keras blog shows two snippets of code in the Distributed training section and links to the official TensorFlow documentation.
My problem is that I don't know how to train my model and put into practice what is described in the TensorFlow documentation.
For example, what should I do if I want to execute the following code on a cluster of multiple GPU?
# Assumed imports for the snippet below
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# For a single-input model with 2 classes (binary classification):
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Generate dummy data
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, epochs=10, batch_size=32)
In the first and second parts of the blog post, the author explains how to use Keras models with TensorFlow.
I also found this example of Keras with distributed training.
And here is another with Horovod.
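Beyond those links, a hedged sketch of one way to do this with a recent TF 2.x release (not what the linked posts show): tf.distribute.MultiWorkerMirroredStrategy wraps the model definition, and each machine runs the same script with its own TF_CONFIG environment variable describing the cluster. The worker addresses below are placeholders:
import os, json
import numpy as np
import tensorflow as tf

# Placeholder cluster description; each machine runs the same script with its own "index".
os.environ['TF_CONFIG'] = json.dumps({
    'cluster': {'worker': ['machine1.example:12345', 'machine2.example:12345']},
    'task': {'type': 'worker', 'index': 0}   # use index 1 on the second machine
})

strategy = tf.distribute.MultiWorkerMirroredStrategy()
with strategy.scope():   # model variables are created and replicated under the strategy
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu', input_shape=(100,)),
        tf.keras.layers.Dense(1, activation='sigmoid')])
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))
model.fit(data, labels, epochs=10, batch_size=32)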