I'm in the process of getting my feet wet with TensorFlow. I've successfully run the MNIST classifier from the layers tutorial on my computer, and now my objective is to play around with the parameters of the tutorial program (step size, layer properties). To do this, I want to collect the data from each run of the convolutional neural network and see whether I can learn something from how the network's output changes as its design changes.
At the end of each run of the program (CNN MNIST Classifier), TensorFlow prints numerous log lines describing what the classifier is doing. This one caught my eye:
INFO:tensorflow:Saving dict for global step 161: accuracy = 0.1663, global_step = 161, loss = 2.286773
Now, I can copy/screenshot this line every time I want to record my run, but I'd like to figure out how to access the dictionary where this data is being collected.
I tried running find tensorflow in bash, but that wasn't very helpful.
My question is:
How would I find the run history for this classifier? Can you give me any advice on how to record/access this information in the future?
Related
I am working on an object detector using the TensorFlow Object Detection API. I downloaded a model from the model zoo repository, specifically ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8, to train on my custom dataset, which contains 2500 images split 70:30 between training and testing. I edited the pipeline.config file following a tutorial, adding the label_maps paths and other requirements. I trained this model for 50,000 steps and monitored the training process on TensorBoard, which showed good training progress. The issue occurred when I went to evaluate the model. For some reason, the evaluation results are the following:
What could possibly be the issue, given that the loss graphs, and even the learning rate graphs, look good? Thanks
I am in the process of writing an AI for the game 2048. At the moment, I can pull the game state from the browser and send moves to the game, but I don't know how to integrate that with TensorFlow. The nature of the project isn't conducive to training data, so I was wondering if it's possible to pass in the state of the game, have the network chuck out a move, run the move, repeat until the game is over, and then have it do the training?
This is certainly possible and trivial. You'll have to set up the model you want to use and I will assume that's been built.
From the perspective of interacting with a TensorFlow model, you just need to marshal your data into numpy arrays and pass them in via the feed_dict argument of sess.run.
To pass an input to tensorflow and get a result you would run something like this:
result = sess.run([logits], feed_dict={x:input_data})
This would perform a forward pass producing the output of the model without making any update. Now you'll take the results and use them to take the next step in the game.
Now that you have the result of your action (e.g. labels) you can perform an update step:
sess.run([update_op], feed_dict={x:input_data, y:labels})
It's as simple as that. Notice that your model will have an optimizer defined (update_op in this example), but if you don't ask tensorflow to compute it (as in the first code sample) no updates will occur. Tensorflow is all about a dependency graph. The optimizer is dependent on the output logits, but computing logits is not dependent on the optimizer.
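To make the dependency point concrete, here is a minimal framework-free sketch. It is not TensorFlow code: Node, run, apply_update and the weights dict are hypothetical names made up for illustration, and the "model" is a single multiplication. It only shows the idea that fetching logits evaluates logits alone, while fetching update_op pulls in logits and the label as well.

```python
# Toy stand-in for a dependency graph. This is NOT the TensorFlow API.
class Node:
    def __init__(self, name, fn=None, deps=()):
        self.name = name
        self.fn = fn              # None marks a placeholder that must be fed
        self.deps = tuple(deps)


def run(fetch, feed, evaluated):
    """Compute `fetch` and only the nodes it depends on."""
    if fetch.fn is None:
        return feed[fetch.name]   # KeyError if a needed placeholder is not fed
    args = [run(dep, feed, evaluated) for dep in fetch.deps]
    evaluated.append(fetch.name)  # record which nodes actually ran
    return fetch.fn(*args)


weights = {"w": 0.5}              # the only trainable value in this toy model


def apply_update(x_val, out, target, learning_rate=0.1):
    # Gradient step on the squared error (out - target) ** 2 with respect to w.
    weights["w"] -= learning_rate * 2 * (out - target) * x_val
    return weights["w"]


x = Node("x")
y = Node("y")
logits = Node("logits", lambda x_val: weights["w"] * x_val, deps=[x])
update_op = Node("update_op", apply_update, deps=[x, logits, y])

# Forward pass: only `logits` runs, so the weight is untouched.
evaluated_forward = []
forward_result = run(logits, {"x": 2.0}, evaluated_forward)
weight_after_forward = weights["w"]

# Update step: `update_op` pulls in `logits` and needs the label `y` as well.
evaluated_update = []
run(update_op, {"x": 2.0, "y": 3.0}, evaluated_update)
weight_after_update = weights["w"]
```

Note that the forward pass never needs y in its feed, which mirrors the first sess.run call above.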
Presumably you'll initialize this model randomly, so the first results will be randomly generated, but each step after that will benefit from the previous updates being applied.
If you're using a reinforcement learning model, the reward only arrives at some indeterminate time in the future, so the point at which you run the update will differ a little from this example, but the general nature of the problem remains the same.
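For the "play the whole game first, then train" variant, a rough sketch could look like the following. Everything here is an assumption for illustration: the game is a made-up three-step corridor rather than 2048, the "network" is a small table of logits in plain Python, and the update is a basic REINFORCE-style rule, which is one common choice and not necessarily what you would end up using.

```python
import math
import random

NUM_ACTIONS = 2                      # 0 = left, 1 = right (hypothetical toy game)
GOAL, MAX_MOVES = 3, 6               # reach position 3 within 6 moves to win


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def play_episode(policy_logits, rng):
    """Play one full game with the current policy; nothing is learned here."""
    position, history = 0, []
    for _ in range(MAX_MOVES):
        probs = softmax(policy_logits[position])
        action = rng.choices(range(NUM_ACTIONS), weights=probs)[0]
        history.append((position, action, probs))
        position = max(0, position + (1 if action == 1 else -1))
        if position == GOAL:
            return history, 1.0      # the reward is only known at the end
    return history, 0.0


def train_on_episode(policy_logits, history, reward, learning_rate=0.5):
    """REINFORCE-style update, applied once the game is over."""
    for position, action, probs in history:
        for a in range(NUM_ACTIONS):
            indicator = 1.0 if a == action else 0.0
            # Gradient of log pi(action) with respect to logit a is (indicator - probs[a]).
            policy_logits[position][a] += learning_rate * reward * (indicator - probs[a])


rng = random.Random(0)
policy_logits = [[0.0] * NUM_ACTIONS for _ in range(GOAL)]   # one row per non-goal position
for _ in range(500):
    history, reward = play_episode(policy_logits, rng)       # forward passes only
    train_on_episode(policy_logits, history, reward)         # update after the game

prob_right_at_start = softmax(policy_logits[0])[1]
```

The split between play_episode and train_on_episode corresponds to the two sess.run calls above: the first only asks for the output, the second asks for the update.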
I am trying to use the ssd_inception_v2_coco pre-trained model from the TensorFlow API, training it on a single-class dataset and applying transfer learning. I trained the net for around 20k steps (total loss around 1) and, using the checkpoint data, created inference_graph.pb and used it in the detection code.
To my surprise, when I tested the net with the training data the graph is not able to detect even 1 out of 11 cases (0/11). I am lost in finding the issue.
What might be the possible mistake?
P.S.: I am not able to run train.py and eval.py at the same time, due to memory issues, so I don't have precision information from TensorBoard.
Has anyone faced a similar issue?
I am writing a neural network in tensorflow and I want to be able to export my final trained network and import it in another program to play a game. I have found multiple forum posts like:
Tensorflow: How to use a trained model in a application?
Tensorflow: how to save/restore a model?
I also saw that the TensorFlow documentation uses estimators to save the model, but I am not sure if that is what I'm looking for or how to apply it.
But those posts talk about exporting the entire session, importing it into the application, and using Session.run. As I understand it, that requires passing in the expected output and will run another training step on my network. I don't want to continue training my network; it's finished. I now only want to evaluate a specific state given to me by the game.
Thanks in advance for any help available.
As far as I know, there are two ways of doing it:
checkpoint files (metagraph)
SavedModel
SavedModel is very convenient, but its learning curve is steeper than that of checkpoint files. You can check this tutorial.
Importing a model does not continue the training; it basically restores all the variables you learned.
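As a framework-free analogy for that last point, the sketch below saves learned values in one "program" and restores them in another, then evaluates an input without supplying any target. It deliberately avoids the TensorFlow API: forward, train_step and the JSON file are stand-ins I made up for the model, the training op and the checkpoint.

```python
import json
import os
import tempfile


def forward(weights, features):
    """Inference only: reads the weights and never modifies them."""
    return sum(w * f for w, f in zip(weights["w"], features)) + weights["b"]


def train_step(weights, features, target, learning_rate=0.1):
    """Training: needs a target and changes the weights in place."""
    error = forward(weights, features) - target
    weights["w"] = [w - learning_rate * error * f for w, f in zip(weights["w"], features)]
    weights["b"] -= learning_rate * error


# "Training program": fit a single example, then save the learned values.
trained = {"w": [0.0, 0.0], "b": 0.0}
for _ in range(200):
    train_step(trained, [1.0, 2.0], target=5.0)

checkpoint_path = os.path.join(tempfile.mkdtemp(), "toy_checkpoint.json")
with open(checkpoint_path, "w") as handle:
    json.dump(trained, handle)

# "Game program": restore the values and evaluate a state, with no target.
with open(checkpoint_path) as handle:
    restored = json.load(handle)

before = json.dumps(restored)
prediction = forward(restored, [1.0, 2.0])
after = json.dumps(restored)      # identical to `before`: nothing was trained
```

In a session-based TensorFlow program the same separation applies: after restoring, you fetch only the output tensor and feed only the input, so no training op is ever run.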
I'm attempting to make multiple sequential predictions from a tensorflow network, but performance seems very poor (~500ms per prediction for a 2-layer 8x8 convolutional network) even for a CPU. I suspect that part of the problem is that it appears to be reloading the network parameters every time. Each call to classifier.predict in the code below results in the following line of output - which I therefore see hundreds of times.
INFO:tensorflow:Restoring parameters from /tmp/model_data/model.ckpt-102001
How can I reuse the checkpoint that is already loaded?
(I can't do batch predictions here because the output of the network is a move to play in a game, which then needs to be applied to the current state before feeding in the new game state.)
Here's the loop that's doing the predictions.
def rollout(classifier, state):
    while not state.terminated:
        predict_input_fn = tf.estimator.inputs.numpy_input_fn(x={"x": state.as_nn_input()}, shuffle=False)
        prediction = next(classifier.predict(input_fn=predict_input_fn))
        index = np.random.choice(NUM_ACTIONS, p=prediction["probabilities"]) # Select a move according to the network's output probabilities
        state.apply_move(index)
classifier is a tf.estimator.Estimator created with...
classifier = tf.estimator.Estimator(
model_fn=cnn_model_fn, model_dir=os.path.join(tempfile.gettempdir(), 'model_data'))
The Estimator API is a high-level API.
The tf.estimator framework makes it easy to construct and train
machine learning models via its high-level Estimator API. Estimator
offers classes you can instantiate to quickly configure common model
types such as regressors and classifiers.
The Estimator API abstracts away lots of the complexity of TensorFlow, but loses some generality in the process. Having read the code, it's clear that there's no way to run multiple sequential predictions without reloading the model each time. The low-level TensorFlow APIs allow this behaviour. But...
Keras is a high-level framework that supports this use case. Simply define the model and then call predict repeatedly.
def rollout(model, state):
    while not state.terminated:
        predictions = model.predict(state.as_nn_input())
        for prediction in predictions:
            index = np.random.choice(bt.ACTIONS, p=prediction)
            state.apply_move(index)
Unscientific benchmarking shows that this is ~100x faster.
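The pattern that makes the difference is "load once, predict many times". Here is a small self-contained sketch of it, with hypothetical stand-ins (CountingModel, ToyState) instead of a real network and game, and random.choices from the standard library in place of np.random.choice. The counter is only there to show that the expensive load happens a single time for the whole rollout.

```python
import random

NUM_ACTIONS = 4


class CountingModel:
    """Hypothetical stand-in for a network; counts how often it is loaded."""
    loads = 0

    def __init__(self):
        CountingModel.loads += 1     # pretend this is the expensive parameter restore

    def predict(self, nn_input):
        return [1.0 / NUM_ACTIONS] * NUM_ACTIONS   # uniform move probabilities


class ToyState:
    """Hypothetical game state that terminates after a fixed number of moves."""

    def __init__(self, moves_left):
        self.moves_left = moves_left
        self.moves = []

    @property
    def terminated(self):
        return self.moves_left == 0

    def as_nn_input(self):
        return [float(self.moves_left)]

    def apply_move(self, index):
        self.moves.append(index)
        self.moves_left -= 1


def rollout(model, state, rng):
    while not state.terminated:
        probabilities = model.predict(state.as_nn_input())
        # Sample a move according to the model's output probabilities.
        index = rng.choices(range(NUM_ACTIONS), weights=probabilities)[0]
        state.apply_move(index)


model = CountingModel()              # loaded once, outside the loop
state = ToyState(moves_left=50)
rollout(model, state, random.Random(0))
```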