Regression through reinforcement learning - python

I'm trying to build an agent that can play Pocket Tanks using RL. The problem I'm facing now is how to train a neural network to output the correct power and angle. Instead of classifying actions, I want a regression.

In order to output the correct power and angle, all you need to do is go into your neural network architecture and change the activation of your last layer.
In your question, you stated that you are currently using an action classification output, so it is most likely a softmax output layer. We can do two things here:
If the power and angle have hard constraints, e.g. the angle cannot be greater than 360°, or the power cannot exceed 700 kW, we can change the softmax output to a tanh (hyperbolic tangent) output and multiply it by the constraint of the power/angle. This creates a "scaling effect" because tanh's output is between -1 and 1. Multiplying tanh's output by the constraint ensures that the magnitude of the output never exceeds the constraint. Note that the result spans -constraint to +constraint; if negative values are not valid, a sigmoid, whose output lies between 0 and 1, can be scaled in the same way.
If there are no constraints in your problem, we can simply delete the softmax output altogether. Removing the softmax means the output is no longer constrained between 0 and 1. The last layer of the neural network will simply act as a linear mapping, i.e., y = Wx + b.
I hope this helps!
EDIT: In both cases, the loss function used to train your neural network can simply be an MSE loss. Example: loss = (real_power - estimated_power)^2 + (real_angle - estimated_angle)^2
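As a minimal, framework-free sketch of the bounded case and the loss from the EDIT above: the bounds `MAX_POWER` and `MAX_ANGLE`, the helper names, and the sample numbers are all assumptions for illustration, not values from the game. The sketch also maps tanh's (-1, 1) range onto [0, upper] rather than multiplying directly, which is a small variation for quantities that cannot be negative.

```python
import math

# Hypothetical bounds, chosen only for illustration.
MAX_POWER = 100.0
MAX_ANGLE = 180.0

def scale_output(raw, upper):
    """Squash an unbounded pre-activation into [0, upper] via tanh."""
    return 0.5 * (math.tanh(raw) + 1.0) * upper

def squared_error_loss(pred, target):
    """Sum of squared errors over (power, angle)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

# Pretend these are the two pre-activations of the last layer.
raw_power, raw_angle = 0.3, -1.2
pred = (scale_output(raw_power, MAX_POWER), scale_output(raw_angle, MAX_ANGLE))
loss = squared_error_loss(pred, target=(60.0, 45.0))
```

However large or small the raw values get, `pred` stays inside the assumed bounds, so the constraint holds by construction rather than by clipping.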

Related

Siamese Network in Tensorflow -- Changing Sigmoid Activation Cutoffs

I've built the siamese network from the PyImageSearch.com tutorial here:
https://pyimagesearch.com/2020/11/30/siamese-networks-with-keras-tensorflow-and-deep-learning/
I'm augmenting the CNN to detect whether an image is identical (instead of the same number, as in the tutorial). This just calls for augmenting one line of code within the make_pairs() function in utils.py:
old line: pairImages.append([currentImage, posImage])
new line: pairImages.append([currentImage, currentImage])
I thought this would be pretty easy for the network to learn 100% accuracy quickly because the images are identical. However, it doesn't achieve accuracy beyond ~95% in 10 epochs.
I think the issue is in the output layer. A euclidean distance is passed to a Dense output layer that contains a sigmoid function. The positive class is identical images, so the euclidean distance between these two images is 0. I'm thinking that if I could change the sigmoid activation from 0.5 to something close to 1.0, it would speed up learning. Identical images would have a euclidean distance of 0, so I imagine this should work.
My question is then:
How do I change the cutoff for the sigmoid function from the default of 0.5?
Would changing the sigmoid cutoff have the intended effect?
Try using rectified linear unit (ReLU) activation instead of sigmoid.
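On the first question, the 0.5 cutoff is not part of the sigmoid itself; it is applied afterwards, when the sigmoid outputs are turned into class labels, so it can be changed at prediction time. A minimal sketch, assuming `probs` holds the network's sigmoid outputs (the values, the helper name, and the 0.9 threshold are made up for illustration):

```python
def classify(probs, threshold=0.5):
    """Label each sigmoid output 1 if it reaches the threshold, else 0."""
    return [1 if p >= threshold else 0 for p in probs]

probs = [0.97, 0.62, 0.45, 0.91]        # made-up sigmoid outputs
default_labels = classify(probs)         # cutoff 0.5
strict_labels = classify(probs, 0.9)     # cutoff 0.9
```

Changing the cutoff this way affects how predictions are labeled and how accuracy is counted; it does not by itself change what a cross-entropy loss optimizes during training.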

Tensorflow how to compute the gradient of output with respect to the input?

Recently I have been running some experiments with a neural network D(x), where x is the input image with batch size 64. I want to compute the gradient of D(x) with respect to x. Should I do the computation as follows?
grad = tf.gradients(D(x), [x])
Thank you everybody!
Yes, you will need to use tf.gradients. For more details see https://www.tensorflow.org/api_docs/python/tf/gradients.
During the training of a neural network, the gradient is generally computed of a loss function with respect to the network's parameters. This is because the loss function, along with its gradient, is well defined.
However, the gradient of your output D(x) is a different matter; I assume the output is some set of vector(s). You will need to define how the gradient will be computed with respect to its input (i.e. the layer which generates the output).
The exact details of that implementation depend on the framework you are using.
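To make concrete what "the gradient of D(x) with respect to x" means, here is a framework-free sketch for a toy scalar D; tf.gradients performs this kind of differentiation automatically for a real network. The logistic form of D, the parameter values, and the helper names are all assumptions for illustration.

```python
import math

def D(x, w, b):
    """Toy scalar 'network': a logistic unit over the input vector x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def grad_D_wrt_x(x, w, b):
    """Analytic gradient dD/dx = D(x) * (1 - D(x)) * w."""
    d = D(x, w, b)
    return [d * (1.0 - d) * wi for wi in w]

def numeric_grad(x, w, b, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    grads = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grads.append((D(hi, w, b) - D(lo, w, b)) / (2 * eps))
    return grads

w, b = [0.5, -1.0, 2.0], 0.1   # made-up parameters
x = [1.0, 0.2, -0.3]           # one made-up input
analytic = grad_D_wrt_x(x, w, b)
numeric = numeric_grad(x, w, b)
```

The gradient has the same shape as the input: one partial derivative per input component.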

Tensorflow neural network loss value NaN

I'm trying to build a simple multilayer perceptron model on a large data set, but I'm getting the loss value as NaN. The weird thing is: after the first training step, the loss value is not NaN and is about 46 (which is oddly low; when I run a logistic regression model, the first loss value is about 3600). But right after that, the loss value is constantly NaN. I used tf.Print to try to debug it as well.
The goal of the model is to predict ~4500 different classes, so it's a classification problem. When using tf.Print, I see that after the first training step (or feed forward through the MLP), the predictions coming out of the last fully connected layer seem right (all varying numbers between 1 and 4500). But after that, the outputs from the last fully connected layer go to either all 0's or some other constant number (0 0 0 0 0).
For some information about my model:
3 layer model. all fully connected layers.
batch size of 1000
learning rate of .001 (I also tried .1 and .01 but nothing changed)
using CrossEntropyLoss (I did add an epsilon value to prevent log(0))
using AdamOptimizer
learning rate decay is .95
The exact code for the model is below: (I'm using the TF-Slim library)
input_layer = slim.fully_connected(model_input, 5000, activation_fn=tf.nn.relu)
hidden_layer = slim.fully_connected(input_layer, 5000, activation_fn=tf.nn.relu)
output = slim.fully_connected(hidden_layer, vocab_size, activation_fn=tf.nn.relu)
output = tf.Print(output, [tf.argmax(output, 1)], 'out = ', summarize = 20, first_n = 10)
return {"predictions": output}
Any help would be greatly appreciated! Thank you so much!
Two (possibly more) reasons why it doesn't work:
You skipped or inappropriately applied feature scaling of your inputs and outputs. Consequently, the data may be difficult for TensorFlow to handle.
Using ReLU, which is unbounded above and not differentiable at zero, may raise issues. Try using other activation functions, such as tanh or sigmoid.
For some reason, your training process has diverged, and you may have infinite values in your weights, which gives NaN losses. There can be many reasons; try changing your training parameters (use smaller batches for testing).
Also, using a ReLU for the last output in a classifier is not the usual method; try using a sigmoid (or, for mutually exclusive classes, a softmax).
From my understanding, ReLU doesn't put an upper bound on the activations, so the network is more likely to diverge, depending on its implementation.
Try switching all the activation functions to tanh or sigmoid. ReLU is generally used for convolution in CNNs.
It's also difficult to determine whether you are diverging due to cross entropy, as we don't know how you affected it with your epsilon value. Try just using the residual; it's much simpler but still effective.
Also, a 5000-5000-4500 neural network is huge. It's unlikely you actually need a network that large.
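One common source of NaN in a hand-written cross-entropy is taking exp or log of extreme values. As a minimal sketch, a softmax cross-entropy can be computed from raw (linear, non-ReLU) outputs with the log-sum-exp trick, which needs no epsilon. The function name and sample logits are illustrative assumptions, not taken from the question's code.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label.

    Subtracting the max logit before exponentiating keeps exp() from
    overflowing, and staying in log space avoids ever computing log(0).
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

small = softmax_cross_entropy([2.0, 1.0, 0.1], label=0)
large = softmax_cross_entropy([1000.0, 0.0, -1000.0], label=1)
```

With the extreme logits the loss stays finite, whereas a naive `math.exp(1000.0)` raises an OverflowError in Python.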

What activation function to use or modifications to make when neural network gives same output on regression with PyBrain?

I have a neural network with one input, three hidden neurons and one output. I have 720 input and corresponding target values, 540 for training, 180 for testing.
When I train my network using Logistic Sigmoid or Tan Sigmoid function, I get the same outputs while testing, i.e. I get same number for all 180 output values. When I use Linear activation function, I get NaN, because apparently, the value gets too high.
Is there any activation function to use in such a case? Or any improvements to be done? I can update the question with details and code if required.
Neural nets are not stable when fed input data on arbitrary scales (such as between approximately 0 and 1000 in your case). If your output units are tanh, they can't even predict values outside the range -1 to 1 (or 0 to 1 for logistic units)!
You should try recentering/scaling the data (making it have mean zero and unit variance - this is called standard scaling in the datascience community). Since it is a lossless transformation you can revert back to your original scale once you've trained the net and predicted on the data.
Additionally, a linear output unit is probably the best as it makes no assumptions about the output space and I've found tanh units to do much better on recurrent neural networks in low dimensional input/hidden/output nets.
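The recentering/scaling step and its inverse can be sketched with the standard library alone. The helper names and the sample values (on a roughly 0 to 1000 scale) are assumptions for illustration; in practice the mean and standard deviation would be computed on the training split only and reused for the test split.

```python
import statistics

def fit_scaler(values):
    """Return the (mean, std) used to standardize a list of numbers."""
    return statistics.mean(values), statistics.pstdev(values)

def transform(values, mean, std):
    """Map values to zero mean and unit variance."""
    return [(v - mean) / std for v in values]

def inverse_transform(scaled, mean, std):
    """Undo transform(): map scaled values back to the original scale."""
    return [s * std + mean for s in scaled]

targets = [80.0, 200.0, 320.0, 560.0, 1000.0]   # made-up values
mean, std = fit_scaler(targets)
scaled = transform(targets, mean, std)
restored = inverse_transform(scaled, mean, std)
```

Because the transformation is invertible, predictions made on the scaled targets can be passed through `inverse_transform` to read them on the original scale.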
Newmu is right that the scaling is probably the issue here; you need to scale your inputs to lie in the valid range. (Standardization to zero mean, unit variance, as they suggest, though, isn't a great choice since that means about a third of your data will lie outside [-1, 1]....) I don't know about pybrain, but in scikit-learn you'd want sklearn.preprocessing.MinMaxScaler.
But, also, in the comments you said your dataset looks like this:
[figure: plot of inputs against targets]
where the horizontal axis is inputs, vertical is targets. So, when you see an input of 200, you have one training example saying it's 80 and one saying it's 320; what do you want it to say then? An "optimal" neural network (which may be hard to achieve) would predict 200 or so.
You may need to think about how to reframe your learning problem to be a more-consistent function from inputs to targets.

Python Neurolab - fixing output range

I am learning a model from examples $\{((x_{i1}, x_{i2}, \ldots, x_{ip}), y_i)\}_{i=1}^{N}$ using a feed-forward multilayer perceptron (newff) from the Python library neurolab. I expect the output of the NN to be positive for any further simulation of the NN.
How can I make sure that the results of simulation of my learned NN are always positive?
(How do I do it in neurolab?)
Simply use a standard sigmoid/logistic activation function on the output neuron. sigmoid(x) > 0 for all real-valued x, so that should do what you want.
By default, many neural network libraries will use either linear or symmetric sigmoid outputs (which can go negative).
Just note that it takes longer to train networks with a standard sigmoid output function. It's usually better in practice to let the values go negative and instead transform the outputs from the network into the range [0,1] after the fact (shift up by the minimum, divide by the range (aka max-min)).
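The after-the-fact transformation (shift up by the minimum, divide by the range) can be sketched as follows. The function name and the sample outputs are made up for illustration, and the sketch assumes the minimum and maximum are taken over a fixed set of network outputs.

```python
def rescale_to_unit_interval(outputs):
    """Shift by the minimum and divide by the range (max - min)."""
    lo, hi = min(outputs), max(outputs)
    span = hi - lo
    if span == 0:
        # All outputs identical: the range is zero, so map everything to 0.
        return [0.0 for _ in outputs]
    return [(o - lo) / span for o in outputs]

raw_outputs = [-1.5, 0.0, 0.5, 2.5]   # made-up network outputs
unit = rescale_to_unit_interval(raw_outputs)
```

Note that the smallest output maps to exactly 0, so if strictly positive values are required, a small offset would have to be added afterwards.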
