I am trying to write a simple neural network that can come up with weights for, say, the y=x function. Here's my code:
http://codepad.org/rPdZ7fOz
As you can see, the error level never really goes down much. I tried changing the momentum and learning rate but it did not help much. Is my number of input, hidden and output correct for what I want to do? If not, what should it be? If so, what else could be wrong?
You're attempting to train the network to give output values 1,2,3,4 as far as I understood. Yet, at the output you use a sigmoid (math.tanh(..)) whose values are always between -1 and 1.
So the output of your Neural network is always between -1 and 1 and thus you always get a large error when trying to fit output values outside that range.
(I just checked that when scaling your input and output values by 0.1, training progresses nicely and I get at the end: error 0.00025)
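To put a number on the "large error" above, here is a small worked example. It assumes the error is measured as a mean squared error (the original code may use a different scaling) and uses the targets 1, 2, 3, 4 mentioned above. A tanh output can at best approach the target clipped to [-1, 1], which gives a floor on the error:

```python
# Targets from the question; a tanh output unit can only emit values in [-1, 1].
targets = [1.0, 2.0, 3.0, 4.0]

# The closest a bounded output can get to each target is the target clipped to [-1, 1].
best_outputs = [max(-1.0, min(1.0, t)) for t in targets]

# Mean squared error of that best case: training cannot push the error below this.
error_floor = sum((t - o) ** 2 for t, o in zip(targets, best_outputs)) / len(targets)
print(error_floor)  # 3.5
```

Scaling the targets by 0.1 moves them to 0.1 ... 0.4, well inside the reachable range, which is consistent with the improvement reported above.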
The Neural Network you're using is useful if you want to do classification (e.g. assign the data point to class A if the NN output is < 0 or B if it is > 0). It looks like what you want to do is regression (fit a real-valued function).
You can remove the sigmoid at the output node but you will have to slightly modify your backpropagation procedure to take this into account.
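As a rough sketch of what that modification could look like (this is not the code from the question; the layer sizes, learning rate and all names are assumptions chosen for illustration), here is a small NumPy network with a tanh hidden layer and a linear output. The only change in backpropagation is that the output delta no longer contains the tanh derivative factor:

```python
import numpy as np

def train_linear_output(x, y, hidden=3, lr=0.005, epochs=5000, seed=0):
    """Fit y from x with a tanh hidden layer and a linear (identity) output."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)          # hidden layer keeps its tanh
        out = h @ w2 + b2                 # linear output: no squashing
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        d_out = 2.0 * err / len(x)        # output delta: no (1 - out**2) factor
        d_h = (d_out @ w2.T) * (1.0 - h ** 2)  # hidden delta still uses tanh'
        w2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        w1 -= lr * (x.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return losses

x = np.array([[1.0], [2.0], [3.0], [4.0]])
y = x.copy()                              # the y = x function, unscaled targets
losses = train_linear_output(x, y)
print(losses[0], losses[-1])
```

With a linear output the network is no longer confined to [-1, 1], so the unscaled targets become reachable; scaling the inputs is usually still a good idea.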
I am trying to build a Neural Network from scratch, using only numpy. I have the following code and functions. However, the output after the training is not matching the expected output that i have (using XOR as an example). I think one of my functions is not correct but cannot figure out the mistake. The output I get is, for example: [[0.73105858], [0.53336314],[0.79343002],[0.5786911 ]], which is not close to the expected output [0,0,0,1]
I don't see any issues with your code, but here are some things you should keep in mind:
Your neural network is trained for 2 iterations with a learning rate of 0.01. This means the network is only updated 2 times, each with a small step, which results in an undertrained network. Also, you're always using a tensor of size 4*4 as input, meaning that the network is only updated with the average over all samples, hence a result that looks like an average.
To improve this, my suggestion would be to increase the number of iterations and the number of samples per iteration, and to make sure that each iteration performs more than one update. Still, I believe you won't get 100% accurate results, since you are only using one linear layer for XOR, which can't be solved by a single linear system. You could consider adding another layer for better results.
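The claim that one linear layer cannot solve XOR can be checked directly. The sketch below is not the code from the question: it leaves out the sigmoid and simply computes the best possible linear fit (with a bias term) to the XOR truth table using least squares.

```python
import numpy as np

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
xor_targets = np.array([0, 1, 1, 0], dtype=float)

# Append a constant column so the linear model has a bias term.
design = np.hstack([inputs, np.ones((4, 1))])

# Best linear fit in the least-squares sense.
weights, *_ = np.linalg.lstsq(design, xor_targets, rcond=None)
predictions = design @ weights
print(predictions)  # every entry is approximately 0.5
```

The best linear model predicts 0.5 for all four inputs, so it carries no information about the target. A hidden layer with a nonlinearity is what makes XOR learnable.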
I've coded a simple neural network for XOR in python. While there is loads of information online about how to program this, there isn't much on how to feed the data through it. I've tested the change in weights after one cycle for inputs [1,1] to compare my results with my lecture slides and it's 100% the same, so I believe the code works. I can train the network for that same input, but when I change the input (and corresponding target) every cycle the error doesn't go down.
Should I allow changing the weights and inputs after every cycle or should I run through all the possible inputs first, get an average error and then change the weights? (But changing weights are dependent on the output, so what output would I use then)
I can share my code, if needed, but I'm pretty certain it's correct.
Please give me some advice? Thank you in advance.
So, you're saying you implemented a neural network on your own?
Well, in this case, each neuron in the input layer is assigned one feature of a given row; then you just iterate through each layer and each neuron in that layer and calculate as instructed.
I'm sure you are familiar with the back-propagation algorithm, so you'll know when to stop.
Once you're done with that row, do the same for the next row: assign each feature to an input neuron and start the iterations again.
Once you're done with all records, that's an epoch.
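The row-by-row procedure above can be sketched as follows. To keep it short, this is deliberately not an XOR network: it trains a single sigmoid neuron on OR with a cross-entropy gradient, and every name and setting is an assumption for illustration. The `online` flag switches between updating after every row and accumulating the gradient over all rows, then updating once per epoch with the average, which are the two options raised in the question.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, row):
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)

def train(rows, targets, epochs=2000, lr=0.5, online=True):
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):                    # one epoch = one pass over all rows
        grad_w, grad_b = [0.0, 0.0], 0.0
        for row, target in zip(rows, targets):
            delta = predict(weights, bias, row) - target
            if online:                         # update right after this row
                weights = [w - lr * delta * x for w, x in zip(weights, row)]
                bias -= lr * delta
            else:                              # only accumulate the gradient
                grad_w = [g + delta * x for g, x in zip(grad_w, row)]
                grad_b += delta
        if not online:                         # one averaged update per epoch
            n = len(rows)
            weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
            bias -= lr * grad_b / n
    return weights, bias

rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
or_targets = [0, 1, 1, 1]
online_model = train(rows, or_targets, online=True)
batch_model = train(rows, or_targets, online=False)
print([round(predict(*online_model, r)) for r in rows])
print([round(predict(*batch_model, r)) for r in rows])
```

Both schedules are legitimate. In the batch variant each row's delta is computed from that row's own output, and only the weight change is averaged.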
I hope that answers your question.
Also, I would recommend trying out Keras; it's easy to use and a good tool to have experience with.
I have trained a NN with a vector input and scalar output (regression).
Now I want to find the global minimum of the NN using GD with pytorch.
I’m new to programming in general, python specifically, and pytorch even more specifically.
I believe what I’m trying to do must have been done a thousand times before, if not ten thousand times. I’ll be super happy and grateful if anyone could point me to some code somewhere (maybe in github) where there’s an example of what I’m trying to do that I could adjust to my needs.
You do what you've done to train your network but instead of updating the weights, you update the input:
import torch

# `model` is your trained network.
input = torch.zeros([1,3,32,32], requires_grad=True) # Whatever the expected input size of your network is
output = model(input)
target = 0.0 # What kind of target should your network reach?
loss = ((target - output) ** 2).sum() # Replace this with the loss you are using; reduced to a scalar for autograd
grad, = torch.autograd.grad(loss, input) # autograd.grad returns a tuple, one entry per input
You can apply the gradient (maybe multiplied with a learning rate) to the input and repeat this step many times. I've updated this from https://discuss.pytorch.org/t/gradient-of-loss-of-neural-network-with-respect-to-input/9482
Keep in mind that the optimized "inputs" are likely to look quite noisy, so you should think about what your initial input should be. Google has done something similar before, see for example https://www.networkworld.com/article/2974718/software/deep-dream-artificial-intelligence-meets-hallucinations.html
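Putting the repeated step into a loop might look like the sketch below. The small `torch.nn.Sequential` network is only a stand-in for your trained model (its sizes, the learning rate and the step count are assumptions), and the loss is simply the scalar output, since the goal is to minimize it. Note that plain gradient descent finds a local minimum near the starting point; restarting from several initial inputs is one common way to look for a better one, and nothing here guarantees the global minimum.

```python
import torch

torch.manual_seed(0)

# Stand-in for the trained regression network (vector in, scalar out).
model = torch.nn.Sequential(
    torch.nn.Linear(3, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1)
)
for p in model.parameters():
    p.requires_grad_(False)          # the weights stay fixed; only the input moves

x = torch.zeros(1, 3, requires_grad=True)   # initial guess for the input
optimizer = torch.optim.SGD([x], lr=0.1)    # optimize the input, not the weights

initial_value = model(x).item()
for _ in range(200):
    optimizer.zero_grad()
    value = model(x).sum()           # scalar to minimize
    value.backward()                 # gradient of the output w.r.t. the input
    optimizer.step()
final_value = model(x).item()

print(initial_value, final_value)
print(x.detach())
```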
I use the Elman recurrent network from neurolab to predict a time series of continuous values. The network is trained from a sequence such that the input is the value at index i and the target is the value at index i+1.
To make predictions beyond the immediate next time step, the output of the net is fed back as input. If, for example, I intend to predict the value at i+5, I proceed as follows.
1. Input the value from i
2. Take the output and feed it to the net as the next input value (e.g. i+1)
3. Repeat steps 1 and 2 four more times, each time using the latest output as the input
4. The output is a prediction of the value at i+5
So for predictions beyond the immediate next time step, recurrent networks must be activated with the output from a previous activation.
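The feedback procedure described above can be sketched independently of any library. Here `step_fn` is a hypothetical placeholder for "activate the network once with a single input"; it is not a neurolab function, and the stand-in predictor is only there to make the example runnable.

```python
def predict_ahead(step_fn, start_value, horizon):
    """Feed each output back in as the next input, `horizon` times."""
    value = start_value
    for _ in range(horizon):
        value = step_fn(value)   # the previous output becomes the next input
    return value

# Stand-in "network" for illustration only: a perfect one-step predictor
# for the series 0, 1, 2, ... A trained network would replace it.
def next_in_series(v):
    return v + 1

print(predict_ahead(next_in_series, 10, 5))  # 15
```

With a real recurrent network, `step_fn` also has to carry the hidden state from one call to the next, which is what the rest of the question is about.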
In most examples, however, the network is fed with an already complete sequence. See, for example, the functions train and sim in the example behind the link above. The first function trains the network with an already complete list of examples and the second function activates the network with a complete list of input values.
After some digging in neurolab, I found the function step to return a single output for a single input. Results from using step suggest, however, that the function does not retain the activation of the recurrent layer, which is crucial to recurrent networks.
How can I activate a recurrent Elman network in neurolab with a single input such that it maintains its internal state for the next single input activation?
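To make the requirement concrete, here is a minimal NumPy sketch of an Elman-style cell that keeps its hidden state between single-input activations. It is not neurolab code and does not use neurolab's API; the class name, sizes and the random, untrained weights are all assumptions for illustration.

```python
import numpy as np

class TinyElman:
    """Untrained Elman-style cell: scalar in, scalar out, persistent hidden state."""

    def __init__(self, hidden=4, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.normal(0.0, 0.5, hidden)             # input -> hidden
        self.w_rec = rng.normal(0.0, 0.5, (hidden, hidden))  # hidden -> hidden
        self.w_out = rng.normal(0.0, 0.5, hidden)            # hidden -> output
        self.reset()

    def reset(self):
        self.state = np.zeros_like(self.w_in)

    def step(self, value):
        # The new state depends on the input AND the previous state.
        self.state = np.tanh(self.w_in * value + self.w_rec @ self.state)
        return float(self.w_out @ self.state)

net = TinyElman()
first = net.step(1.0)
second = net.step(1.0)      # same input, different output: state was retained
net.reset()
after_reset = net.step(1.0) # back to the initial state, same as `first`
print(first, second, after_reset)
```

If a single-step function returns the same output for the same input regardless of what came before, that is a sign the recurrent state is being reset or ignored between calls.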
It turns out it is quite normal for output which is generated from previous output sooner or later to converge towards a constant value. In effect, the output of a network cannot depend only on its previous output.
I obtain the same result: a constant. However, I noticed something:
-> If you use 0 and 1 data (0 for a decrease, 1 for an increase), results improve. The result is no longer a constant.
-> Try to use another variable to explain the targeted one, as one of our colleagues already mentioned.
I have a neural network with one input, three hidden neurons and one output. I have 720 input and corresponding target values, 540 for training, 180 for testing.
When I train my network using Logistic Sigmoid or Tan Sigmoid function, I get the same outputs while testing, i.e. I get same number for all 180 output values. When I use Linear activation function, I get NaN, because apparently, the value gets too high.
Is there any activation function to use in such a case? Or any improvements to be done? I can update the question with details and code if required.
Neural nets are not stable when fed input data on arbitrary scales (such as between approximately 0 and 1000 in your case). If your output units are tanh, they can't even predict values outside the range -1 to 1 (or 0 to 1 for logistic units)!
You should try recentering/scaling the data (making it have mean zero and unit variance - this is called standard scaling in the datascience community). Since it is a lossless transformation you can revert back to your original scale once you've trained the net and predicted on the data.
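A minimal NumPy sketch of standard scaling and its inverse is shown below. The sample values are made up; in practice the mean and standard deviation would be computed on the training set only and reused for the test set.

```python
import numpy as np

raw = np.array([0.0, 250.0, 500.0, 750.0, 1000.0])   # made-up values on a 0..1000 scale

mean, std = raw.mean(), raw.std()
scaled = (raw - mean) / std          # zero mean, unit variance
restored = scaled * std + mean       # inverse transform, back to the original scale

print(scaled)
print(restored)
```

The same inverse transform is applied to the network's predictions when the targets were scaled for training.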
Additionally, a linear output unit is probably the best as it makes no assumptions about the output space and I've found tanh units to do much better on recurrent neural networks in low dimensional input/hidden/output nets.
Newmu is right that the scaling is probably the issue here; you need to scale your inputs to lie in the valid range. (Standardization to zero mean, unit variance, as they suggest, though, isn't a great choice since that means about a third of your data will lie outside [-1, 1]....) I don't know about pybrain, but in scikit-learn you'd want sklearn.preprocessing.MinMaxScaler.
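The "about a third" figure can be checked numerically, assuming roughly Gaussian data (the synthetic sample below is made up for illustration). The second half does min-max scaling to [-1, 1] by hand; scikit-learn's `MinMaxScaler` with `feature_range=(-1, 1)` performs the same kind of transformation.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=500.0, scale=200.0, size=100_000)  # synthetic, roughly Gaussian

# Standardization: a sizeable share of points falls outside [-1, 1].
standardized = (data - data.mean()) / data.std()
fraction_outside = np.mean(np.abs(standardized) > 1.0)
print(fraction_outside)  # close to 0.32 for Gaussian data

# Min-max scaling: every point lands inside [-1, 1].
lo, hi = data.min(), data.max()
minmax = 2.0 * (data - lo) / (hi - lo) - 1.0
print(minmax.min(), minmax.max())
```

Min-max scaling is sensitive to outliers, since a single extreme value compresses everything else into a narrow band.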
But, also, in the comments you said your dataset looks like this:
where the horizontal axis is inputs, vertical is targets. So, when you see an input of 200, you have one training example saying it's 80 and one saying it's 320; what do you want it to say then? An "optimal" neural network (which may be hard to achieve) would predict 200 or so.
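The "200 or so" follows from the squared-error loss: when the same input comes with conflicting targets, the prediction that minimizes the mean squared error is their mean. A small check, assuming exactly the two targets from the example above:

```python
conflicting_targets = [80.0, 320.0]   # two training targets for the same input (200)

def mse(prediction):
    return sum((t - prediction) ** 2 for t in conflicting_targets) / len(conflicting_targets)

# Try every integer prediction from 0 to 400 and keep the one with the lowest error.
best_prediction = min(range(0, 401), key=mse)
print(best_prediction, mse(best_prediction))  # 200 14400.0
```

Even the best prediction leaves a large error of 120 on each of the two examples, which no network architecture or activation function can remove.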
You may need to think about how to reframe your learning problem to be a more-consistent function from inputs to targets.