I'm running a Keras model for binary classification on two separate computers. The first runs Python 2.7.5 with TensorFlow 1.0.1 and Keras 2.0.2 and computes on the CPU; the second runs Python 2.7.5 with TensorFlow 1.2.1 and Keras 2.0.6 and uses the GPU.
My model is modified from the siamese network model at https://gist.github.com/mmmikael/0a3d4fae965bdbec1f9d. I added regularization (activity_regularizer=keras.regularizers.l1 on the Dense layers), but everything else uses the same structure as the MNIST example.
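For reference, the only structural change is roughly the following (a sketch of the shared base network assuming the 128-unit Dense layers from the gist; the l1 factor here is a placeholder, not my actual value):

from keras import regularizers
from keras.layers import Dense, Dropout, Input
from keras.models import Model

def create_base_network(input_dim):
    # Shared base of the siamese pair; the only change from the gist is
    # the activity_regularizer on each Dense layer.
    inp = Input(shape=(input_dim,))
    x = Dense(128, activation='relu',
              activity_regularizer=regularizers.l1(1e-4))(inp)  # placeholder l1 factor
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu',
              activity_regularizer=regularizers.l1(1e-4))(x)
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu',
              activity_regularizer=regularizers.l1(1e-4))(x)
    return Model(inp, x)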
I use the exact same code and training data on both computers, but the first one gives me a classification accuracy of 86% and recall of 88% on the test set, while the other gives me an accuracy of 52% and recall of 100% (it classifies every test sample as "positive"). These results are reproducible across separate initializations.
I'm not even sure where to start looking for why the performance is so vastly different. I've been reading through the keras/tensorflow release notes to see if any of the changes pertain to something in my model, but I don't see anything that looks helpful. It doesn't make sense that a version change in tensorflow/keras would cause that much of a difference in the performance. Any sort of help figuring this out would be greatly appreciated.
Related
Is there a performance loss when converting TensorFlow models to the TensorFlow Lite format?
I got different results from different edge devices.
Does it make sense that the Nvidia Jetson reaches a higher accuracy with the TensorFlow model (TensorRT-optimized) than the Raspberry Pi does with the TensorFlow Lite model?
Normally there is a performance loss, but not a significant one; it is around 3% in accuracy for certain models, for instance, but you have to test it yourself to check the accuracy.
Models converted with TensorRT and models converted with TensorFlow Lite do not go through the exact same conversion steps (otherwise they would be identical), so it is to be expected that a difference is noticeable.
To conclude: the gain in speed matters much more than the performance loss (at most around 3%), but each and every assumption should be checked with your own tests.
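As a reference point, a basic TensorFlow Lite conversion with no quantization looks roughly like this (a sketch assuming a TF 2.x setup, where model is your trained Keras model and x_test is your test data); running the original and the converted model on the same test set is the only reliable way to measure the actual accuracy drop:

import numpy as np
import tensorflow as tf

# Convert the trained Keras model to TFLite (no quantization here;
# enabling optimizations can change the accuracy further).
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Run the converted model on a single sample and compare the outputs.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
interpreter.set_tensor(inp['index'], x_test[:1].astype(np.float32))
interpreter.invoke()
print(interpreter.get_tensor(out['index']), model.predict(x_test[:1]))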
This article is also a good read: https://www.hackster.io/news/benchmarking-tensorflow-and-tensorflow-lite-on-the-raspberry-pi-43f51b796796
I am trying to convert a Keras model to a TPU model in Google Colab, but this model has another model inside it.
Take a look at the code:
https://colab.research.google.com/drive/1EmIrheKnrNYNNHPp0J7EBjw2WjsPXFVJ
This is a modified version of one of the examples in the google tpu documentation:
https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/fashion_mnist.ipynb
If the sub_model is converted and used directly it works, but if the sub-model is nested inside another model it does not. I need this kind of nested network because I am trying to train a GAN, which has two networks inside (gan = generator + discriminator), so if this test works it will probably work with the GAN too.
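For reference, the kind of nesting I mean looks roughly like this in plain Keras (layer sizes are made up for illustration; only the structure matters):

from tensorflow.keras import layers, models

latent_dim = 100  # illustrative latent size

# Sub-model 1: the generator (a Model that is later used as a layer).
generator = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(latent_dim,)),
    layers.Dense(784, activation='tanh'),
])

# Sub-model 2: the discriminator.
discriminator = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),
    layers.Dense(1, activation='sigmoid'),
])

# Parent model: discriminator(generator(z)). This nesting of one model
# inside another is the part that fails when converting to a TPU model.
z = layers.Input(shape=(latent_dim,))
gan = models.Model(z, discriminator(generator(z)))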
I have tried several things:
Converting the model to TPU without converting the sub-model: in that case, when training starts, an error is raised related to the inputs of the sub-model.
Converting both the model and the sub-model to TPU: in that case, an error is raised when converting the "parent" model, and the exception message only ends with "layers".
Converting only the sub-model to TPU: in that case no error is raised, but training is not accelerated by the TPU and is extremely slow, as if no conversion had been made at all.
Using a fixed batch size or not makes no difference; either way, the model does not work.
Any ideas? Thanks a lot.
Divide the problem into parts: first use only the sub-model on the TPU. Then put something simple in place of the sub-model and run the parent model on the TPU. If this does not work, create something very simple with a similar structure out of models you are sure work, and then add things step by step until you converge to the complex model you want to run on the TPU.
I struggle with such things myself. What I did at the very beginning, using MNIST, was to train the model, extract the coefficients, rewrite relu, dense, dropout and the NN matrices myself, and run the model using numpy, then cupy, then pyopencl. Then I replaced the functions with my own raw CUDA C and OpenCL functions, so that by going deeper and simpler I could find what is wrong when something does not work. Finally I wrote my own genetic selective training algorithm and learned a lot.
Most importantly, it gave me the opportunity to try some crazy ideas for training, modelling, manipulating and making sense of NN coefficients.
The problem, in my opinion, is that TF, Keras, etc. are too high-level. With the optimizers and solvers there is too much that is unknown, and even the neural networks themselves are not under your control. GANs are problematic: while training they do not converge every time and take days to train most of the time, and even when training succeeds you have no idea how it converges. Most of the tricks and techniques that protect you from vanishing gradients are not mathematically backed, yet they nevertheless work amazingly well. (?!?)
**Go simpler, go deeper, and add complexity step by step. Follow a practice in which you comprehend as much as you can.** It will cost some time and energy, but in my opinion you will benefit from it tremendously.
Hi, I have a problem with Keras on Python 3.6.
My environment is Keras with Python, CPU only.
The problem is that when I repeatedly call predict on the same Keras model with different inputs, it gets slower and slower.
My code is as simple as this:
for i in range(100):
    model.predict(x)
The first run is fast, maybe 2 seconds, but the second run takes 3 seconds and the third takes 5 seconds... it keeps getting slower, even if I use the same input.
How can I keep repeated predict calls on a Keras model fast? I can't have it getting slower; this is very critical.
How can I fix it?
Try using the __call__ method directly. The documentation of the predict method states the following:
For small numbers of inputs that fit in one batch, directly use __call__() for faster execution, e.g., model(x).
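A minimal sketch of what that looks like, assuming a tf.keras model and an input x that already fits in a single batch (converting the input to a tensor once also avoids re-wrapping it on every iteration):

import tensorflow as tf

x_tensor = tf.convert_to_tensor(x)  # x is your existing input batch

for i in range(100):
    # model(x) runs the forward pass directly and skips the extra
    # per-call overhead that model.predict() adds for small inputs.
    y = model(x_tensor, training=False)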
I see that performance is critical in this case. If the above doesn't help, you could use OpenVINO, which is optimized for Intel hardware but should work with any CPU. Your performance should be much better than using Keras directly.
It's rather straightforward to convert the Keras model to OpenVINO. The full tutorial on how to do it can be found here. Some snippets below.
Install OpenVINO
The easiest way to do it is using PIP. Alternatively, you can use this tool to find the best way in your case.
pip install openvino-dev[tensorflow2]
Save your model as SavedModel
OpenVINO is not able to convert an HDF5 model directly, so you have to save it as a SavedModel first.
import tensorflow as tf
from custom_layer import CustomLayer  # only needed if your model uses custom layers

# Load the HDF5 model and re-save it in the SavedModel format
model = tf.keras.models.load_model('model.h5', custom_objects={'CustomLayer': CustomLayer})
tf.saved_model.save(model, 'model')
Use Model Optimizer to convert SavedModel model
The Model Optimizer is a command-line tool that comes with the OpenVINO Development Package. It converts the TensorFlow model to IR, the default format for OpenVINO. You can also try FP16 precision, which should give you better performance without a significant accuracy drop (change data_type accordingly). Run in the command line:
mo --saved_model_dir "model" --data_type FP32 --output_dir "model_ir"
Run the inference
The converted model can be loaded by the runtime and compiled for a specific device, e.g. CPU or GPU (the graphics integrated into your CPU, such as Intel HD Graphics). If you don't know what the best choice for you is, use AUTO.
from openvino.runtime import Core

# Load the network
ie = Core()
model_ir = ie.read_model(model="model_ir/model.xml")
compiled_model_ir = ie.compile_model(model=model_ir, device_name="CPU")
# Get the output layer
output_layer_ir = compiled_model_ir.output(0)
# Run inference on the input image
result = compiled_model_ir([input_image])[output_layer_ir]
Disclaimer: I work on OpenVINO.
If your model calls the fit function in batches, different samples within the same batch take slightly different amounts of time over the course of the iteration, so if you then request more and more groups of predictions again and again, the model's prediction time will get longer and longer.
For learning purposes, I am trying to implement a CNN from scratch, but the results do not seem to improve over random guessing. I know this is not the best approach on home hardware, and following course.fast.ai I have obtained much better results via transfer learning, but for a deeper understanding I would like to see, at least in theory, how one could do it otherwise.
Testing on CIFAR-10 posed no issues - a small CNN trained from scratch in a matter of minutes with an error of less than 0.5%.
However, when testing against the Cats vs. Dogs Kaggle dataset, the results did not budge from 50% accuracy. The architecture is basically a copy of AlexNet, including the non-state-of-the-art choices (large filters, histogram equalization, Nesterov-SGD optimizer). For more details, I put the code in a notebook on GitHub:
https://github.com/mspinaci/deep-learning-examples/blob/master/dogs_vs_cats_with_AlexNet.ipynb
(I also tried different architectures, more VGG-like and using Adam optimizer, but the result was the same; the reason why I followed the structure above was to match as closely as possible the Caffe procedure described here:
https://github.com/adilmoujahid/deeplearning-cats-dogs-tutorial
and that seems to converge quickly enough, according to the author's description here: http://adilmoujahid.com/posts/2016/06/introduction-deep-learning-python-caffe/).
I was expecting some fitting to happen quickly and then possibly flatten out due to the many suboptimal choices made (e.g. small dataset, no data augmentation). Instead, I saw no improvement at all, as the notebook shows.
So I thought that maybe I was simply overestimating my GPU and my patience, and that the model was too complicated to even overfit my data in a few hours (I ran 70 epochs, each with roughly 360 batches of 64 images). Therefore I tried to overfit as hard as I could, running these other models:
https://github.com/mspinaci/deep-learning-examples/blob/master/Ridiculously%20overfitting%20models...%20or%20maybe%20not.ipynb
The purely linear model started showing some overfitting, around 53.5% training accuracy vs 52% validation accuracy (which I guess is thus my best result so far). That matched my expectations. However, to overfit as hard as I could, the second model is a simple two-layer feedforward neural network without any regularization, trained on just 2000 images with a batch size of up to 500. I was expecting the NN to overfit wildly, quickly reaching 100% training accuracy (after all, it has 77M parameters for 2k pictures!). Instead, nothing happened, and the accuracy quickly flattened out at 50%.
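For concreteness, the overfitting sanity check I have in mind is roughly the following (a sketch, not the exact notebook code; x_train and y_train stand for the preprocessed Cats vs. Dogs images and labels):

from keras.models import Sequential
from keras.layers import Dense, Flatten

# Tiny subset: 2000 images only, no regularization anywhere.
x_small = x_train[:2000]
y_small = y_train[:2000]

model = Sequential([
    Flatten(input_shape=x_small.shape[1:]),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# With this many parameters and so few samples, training accuracy should
# climb towards 100% if the pipeline is healthy; mine stays around 50%.
model.fit(x_small, y_small, batch_size=500, epochs=50, verbose=2)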
Any tip about why none of the "multi-layer" models seems able to pick up any feature (be it a "true" one or one due to overfitting) would be very much appreciated!
Note on versions etc.: the notebooks were run on Python 2.7, Keras 2.0.8 and Theano 0.9.0. The OS is Windows 10, and the GPU is a GeForce GTX 960M, not particularly powerful but sufficient for basic tasks.
I am using the LightGBM 2.0.6 Python API. My training data has around 80K samples and 400 features, and I am training a multi-class classification model (#classes = 10) with ~2000 iterations. When the model is trained and I call model.feature_importance(), I encounter a segmentation fault.
I tried generating artificial data for testing (with the same number of samples, classes, iterations and hyperparameters), and with it I can successfully obtain the list of feature importances. Therefore I suspect that whether the problem occurs depends on the training data.
I would like to know if someone else has encountered this problem and, if so, how it was overcome. Thank you.
This is a bug in LightGBM; 2.0.4 doesn't have this issue, and it should also be fixed in LightGBM master. So either downgrade to 2.0.4, wait for the next release, or use LightGBM master.
The problem does indeed depend on the training data: feature_importance segfaults only when there are "constant" trees in the trained ensemble, i.e. trees with a single leaf and no splits.
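If you want to check whether a given model is affected before switching versions, you can count the single-leaf trees in the dumped model (a sketch, assuming bst is the trained Booster returned by lgb.train and that the dump contains the usual tree_info/num_leaves fields):

dump = bst.dump_model()
constant_trees = [t['tree_index'] for t in dump['tree_info'] if t['num_leaves'] <= 1]
print('single-leaf trees:', constant_trees)  # a non-empty list means the model triggers the bug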