How to use a trained neural network on a different platform/technology? (Python)

Say I trained a simple neural network using TensorFlow and Python on my laptop, and I want to use this model on my phone in a C++ app.
Is there a compatible format I can use? What is the minimal framework needed to run neural networks (for inference, not training)?
Update: I'm also interested in TensorFlow to non-TensorFlow compatibility. Do I need to build it from scratch, or are there any best practices?

Yes, if you are targeting iOS or Android. Depending on your specific needs, you have a choice between TensorFlow Mobile and TensorFlow Lite:
https://www.tensorflow.org/mobile/
In particular, see the guide on preparing pre-trained models:
https://www.tensorflow.org/mobile/prepare_models
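For the TensorFlow Lite path, a minimal conversion sketch (assuming a TF 1.13+/2.x install and a SavedModel export; the paths are placeholders):

    import tensorflow as tf

    # Convert a SavedModel directory (placeholder path) into a TFLite
    # flatbuffer; tf.lite.TFLiteConverter is the TF 1.13+ / 2.x API.
    converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/my_saved_model")
    tflite_model = converter.convert()

    # The resulting file is what the mobile runtime loads on the phone.
    with open("model.tflite", "wb") as f:
        f.write(tflite_model)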

Technically, you don't need a framework at all. A conventional fully connected neural network is simple enough that you can implement it in plain C++: about 100 lines of code for the matrix multiplication and a dozen or so for the non-linear activation function.
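For a sense of how little is involved, here is a minimal NumPy sketch of such a forward pass (the layer sizes and weights are made up; a C++ version is a direct translation of these few operations):

    import numpy as np

    def relu(x):
        return np.maximum(x, 0.0)

    def forward(x, layers):
        """Run a fully connected network; layers is a list of (W, b) pairs."""
        for W, b in layers[:-1]:
            x = relu(x @ W + b)
        W, b = layers[-1]
        return x @ W + b  # raw outputs of the final layer

    # Toy 4-8-3 network with random weights standing in for trained ones.
    rng = np.random.default_rng(0)
    layers = [(rng.normal(size=(4, 8)), np.zeros(8)),
              (rng.normal(size=(8, 3)), np.zeros(3))]
    print(forward(rng.normal(size=(1, 4)), layers))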
The biggest part is figuring out how to parse a serialized TensorFlow model, especially given that there are quite a few ways to serialize one. You will probably want to freeze your TensorFlow model first; freezing bakes the weights from the latest training checkpoint into the graph as constants.
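For freezing, a sketch using the TF 1.x freeze_graph tool (every path and the output node name are placeholders for your own model):

    from tensorflow.python.tools import freeze_graph

    # Bake the checkpoint variables into the graph as constants (TF 1.x API).
    freeze_graph.freeze_graph(
        input_graph="/tmp/model/graph.pbtxt",
        input_saver="",
        input_binary=False,
        input_checkpoint="/tmp/model/model.ckpt",
        output_node_names="softmax_output",  # placeholder node name
        restore_op_name="save/restore_all",
        filename_tensor_name="save/Const:0",
        output_graph="/tmp/model/frozen.pb",
        clear_devices=True,
        initializer_nodes="")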

Related

Does Python have a model similar to nnetar in R's forecast package?

R's forecast package has a function nnetar, which uses feed-forward neural networks with a single hidden layer for time-series prediction.
Now I am using Python to do a similar analysis. I want to use a neural network that does not need to be as complex as deep learning; maybe two layers and a couple of nodes are good enough for my case.
So, does Python have a simple neural network model that can be used on time series like nnetar? If not, how should I deal with this problem?
Any NN model with one or more hidden layers is a multi-layer perceptron, and in that case it is trivial to extend it to N layers, so any library you pick will support it. My guess is that you are avoiding a complex library like PyTorch/TensorFlow because of its size.
TensorFlow does have TF Lite, which can work on smaller IoT devices.
scikit-learn has MLPRegressor, which can train NNs if that is more to your liking (see the sketch after this list).
You can always write your own model. There are plenty of examples for this that use NumPy and are plenty fast for CPU computation (a single-hidden-layer NN will, I am guessing, be more memory-bound than compute-bound).
Use another ML algorithm. Single-hidden-layer NNs will not perform nearly as well as other, simpler algorithms.
If there are other reasons for not using a standard library like TensorFlow/PyTorch, you should mention them.
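As a concrete sketch of the scikit-learn option: like nnetar, it regresses the next value on a few lagged values. The lag count, layer size, and toy data below are made up for illustration:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def make_lagged(series, n_lags):
        """Turn a 1-D series into (X, y): windows of past values -> next value."""
        X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
        y = series[n_lags:]
        return X, y

    # Toy series standing in for real data.
    series = np.sin(np.linspace(0, 20, 200)) + np.random.normal(0, 0.1, 200)
    X, y = make_lagged(series, n_lags=5)

    # One small hidden layer, in the spirit of nnetar's single-hidden-layer nets.
    model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
    model.fit(X, y)

    # One-step-ahead forecast from the last observed window.
    print(model.predict(series[-5:].reshape(1, -1)))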

Is there an established way to convert a TensorFlow network architecture written with graph/session code to the Keras API for TPU use?

In order to use Google TPU, your code must either use the Estimator API or Keras API.
Converting a graph to use the Estimator API is pretty straightforward, as you mostly allocate the code among model_fn, features, input_fn, etc.
Converting a graph to Keras is not as straightforward, as Keras has unique functions and datatypes to handle various operations. However, in TensorFlow's blog post they seem to recommend Keras over Estimator:
https://medium.com/tensorflow/standardizing-on-keras-guidance-on-high-level-apis-in-tensorflow-2-0-bad2b04c819a.
By establishing Keras as the high-level API for TensorFlow, we are making it easier for developers new to machine learning to get started with TensorFlow.
That said, if you are working on custom architectures, we suggest using tf.keras to build your models instead of Estimator.
The TensorFlow Keras API is built on top of TensorFlow, though, so I'm guessing there is a way to express anything written in TensorFlow in Keras as well, especially since there seem to be functions that directly wrap a TensorFlow object for Keras, such as keras.optimizers.TFOptimizer(<tensorflow optimizer>).
Is there an established way to convert any architecture coded using graph/session to the Keras API?
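For reference, a sketch of what that wrapping looks like under TF 1.x (the toy model is made up; TFOptimizer adapts a tf.train optimizer to Keras's interface):

    import tensorflow as tf
    from tensorflow import keras

    # A made-up toy model, just to show the optimizer wrapping.
    model = keras.Sequential([keras.layers.Dense(1, input_shape=(4,))])

    # Wrap a native TensorFlow optimizer so Keras's fit() can drive it (TF 1.x).
    tf_opt = keras.optimizers.TFOptimizer(tf.train.AdamOptimizer(learning_rate=1e-3))
    model.compile(optimizer=tf_opt, loss="mse")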

Is it possible to remove categories in a pretrained TensorFlow model?

I am currently using the TensorFlow Object Detection API for my human detection app.
I tried filtering in the API itself, which worked, but I am still not content with it because it's slow. So I'm wondering whether I could remove the other categories from the model itself to make it faster.
If that is not possible, can you please give me other suggestions to make the API faster, since I will be using two cameras? Thanks in advance, and pardon my English :)
Your question addresses several topics around using pretrained neural network models.
Theoretical methods
In general, you can always neutralize categories by removing the corresponding neurons in the softmax layer and computing a new softmax layer with only the relevant rows of the weight matrix (see the sketch after this list).
This method will surely work (maybe that is what you meant by filtering) but will not accelerate the network's computation time by much, since most of the FLOPs (multiplications and additions) remain.
As with decision trees, pruning is possible but may reduce performance. I explain what pruning means below; note that the accuracy on your categories may be preserved, since you are not just trimming, you are also predicting fewer categories.
Transfer learning: adapt the network to your problem. See Stanford's course in computer vision. What I have most often seen work well is keeping the convolutional layers as-is and preparing a medium-sized dataset of the objects you'd like to detect.
I will add more theoretical methods if you request, but the above are the most common and accurate ones I know.
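As an illustration of the first method, a NumPy sketch of slicing the final layer so only the wanted categories remain (all shapes and indices are made up):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    # Toy final layer: 128 features -> 90 categories (sizes are made up).
    rng = np.random.default_rng(0)
    W, b = rng.normal(size=(128, 90)), np.zeros(90)

    keep = [0, 2]                      # e.g. indices of "person" and "car"
    W_small, b_small = W[:, keep], b[keep]

    features = rng.normal(size=(1, 128))
    probs = softmax(features @ W_small + b_small)  # renormalized over kept classes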
Practical methods
Make sure you are serving your TensorFlow model (e.g. with TensorFlow Serving) rather than just running ad-hoc Python inference code. This can significantly improve performance; a SavedModel export sketch follows this list.
You can export the parameters of the network and load them into a faster framework such as CNTK or Caffe. These frameworks work in C++/C# and can run inference much faster. Make sure you load the weights correctly; some frameworks use a different order of tensor dimensions when saving/loading (little/big-endian-like issues).
If your application performs inference on several images, you can distribute the computation across several GPUs. This can also be done in TensorFlow; see the Using GPUs guide.
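As a sketch of the serving route (TF 1.x API; the graph below is a stand-in for your real detector), exporting a SavedModel that TensorFlow Serving can load:

    import tensorflow as tf

    # Stand-in graph (TF 1.x); in practice these would be your detector's
    # real input and output tensors.
    images = tf.placeholder(tf.float32, shape=[None, 300, 300, 3], name="images")
    detections = tf.identity(images, name="detections")

    with tf.Session() as sess:
        # Writes a SavedModel directory that TensorFlow Serving can load.
        tf.saved_model.simple_save(
            sess, "/tmp/export/1",
            inputs={"images": images},
            outputs={"detections": detections})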
Pruning a neural network
Maybe this is the most interesting method of adapting big networks to simple tasks. You can see a beginner's guide here.
Pruning means that you remove parameters from your network, specifically whole nodes/neurons in a decision tree/neural network (respectively). The simplest way to do this for object detection is as follows:
Randomly prune neurons from the fully connected layers.
Train one more epoch (or more) with a low learning rate, only on the objects you'd like to detect.
(Optional) Repeat the above several times, validate, and choose the best network.
The above procedure is the most basic one, but you can find plenty of papers that suggest algorithms for it, for example Automated Pruning for Deep Neural Network Compression and An Iterative Pruning Algorithm for Feedforward Neural Networks.
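A NumPy sketch of step 1, randomly pruning neurons from a fully connected layer (sizes and pruning rate are arbitrary):

    import numpy as np

    def prune_neurons(W, b, W_next, drop_frac=0.2, rng=None):
        """Randomly drop a fraction of this layer's neurons.

        Removing neuron j means deleting column j of (W, b) and row j of
        the next layer's weight matrix."""
        if rng is None:
            rng = np.random.default_rng()
        n = W.shape[1]
        keep = np.sort(rng.choice(n, size=int(n * (1 - drop_frac)), replace=False))
        return W[:, keep], b[keep], W_next[keep, :]

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(256, 128)), np.zeros(128)
    W2 = rng.normal(size=(128, 10))
    W1p, b1p, W2p = prune_neurons(W1, b1, W2, drop_frac=0.2, rng=rng)
    print(W1p.shape, W2p.shape)  # (256, 102) (102, 10)

After pruning, you would fine-tune for an epoch or more (step 2) to recover accuracy on the remaining categories.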

How to deploy a trained TensorFlow network on e.g. a Raspberry Pi

I'm trying to make a simple gesture recognition system to use with my Raspberry Pi, which is equipped with a camera. I would like to train a neural network with TensorFlow on my more powerful laptop and then transfer it to the RPi for prediction (as part of a Magic Mirror). Is there a way to export the trained network and weights and use a lightweight version of TensorFlow for the linear algebra and prediction, without the overhead of all the symbolic graph machinery that is necessary for training? I have seen the tutorials on TensorFlow Serving, but I'd rather not set up a server and just have the prediction run on the RPi.
Yes, this is possible, and the tooling is available in the source repository. It allows you to deploy and run a model trained on your laptop. Note that this is the same model, which can be big.
To deal with size and efficiency, TF is currently moving toward a quantization approach. After your model is trained, a few extra steps let you "translate" it into a lighter model with similar accuracy. Currently, the implementation is quite slow, though. There is a recent post that shows the whole process for iOS, which is pretty similar to the Raspberry Pi overall.
The Makefile contribution is also quite relevant for tuning and extra configuration.
Beware that this code moves often and breaks. It is sometimes useful to check out an old "release" tag to get something that works end to end.
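As a sketch of the inference-only side (TF 1.x API; the file and tensor names are placeholders), loading a frozen graph on the RPi and running a single prediction:

    import numpy as np
    import tensorflow as tf

    # Load a frozen GraphDef exported on the laptop (placeholder path).
    with tf.gfile.GFile("frozen.pb", "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")

    # Tensor names are placeholders for your own input/output nodes.
    x = graph.get_tensor_by_name("input:0")
    y = graph.get_tensor_by_name("output:0")

    with tf.Session(graph=graph) as sess:
        print(sess.run(y, feed_dict={x: np.zeros((1, 64, 64, 3), np.float32)}))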

How to export a ConvNet trained using Python's Theano/Lasagne to iOS?

I trained a convolutional neural net with the Lasagne and Theano frameworks in Python.
I am satisfied with the architecture and the performance of the net on test data and I want to use it on an iPad application.
I was wondering if there is any simple way to take that net and use it on iOS without rewriting the code in another framework and/or training it again.
As far as I know, there is no way to convert the Python code directly to C/C++ (which might otherwise have been a method for exporting the net/training code).
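One workaround is to export just the weights and re-implement the forward pass on the device. A sketch with Lasagne's parameter API (the toy network below stands in for the trained ConvNet):

    import numpy as np
    import lasagne
    from lasagne.layers import InputLayer, DenseLayer

    # Toy network standing in for the trained ConvNet.
    net = InputLayer(shape=(None, 32))
    net = DenseLayer(net, num_units=10)

    # Dump every parameter (weights and biases) as plain NumPy arrays; the
    # iOS-side code can read these back and run the forward pass itself.
    params = lasagne.layers.get_all_param_values(net)
    np.savez("convnet_weights.npz", *params)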
