I've coded a simple neural network for XOR in Python. While there is loads of information online about how to program this, there isn't much on how to feed the data through it. I've tested the change in weights after one cycle for inputs [1,1] to compare my results with my lecture slides, and they match exactly, so I believe the code works. I can train the network for that same input, but when I change the input (and corresponding target) every cycle, the error doesn't go down.
Should I change the weights and inputs after every cycle, or should I run through all the possible inputs first, get an average error, and then change the weights? (But the weight changes depend on the output, so which output would I use then?)
I can share my code if needed, but I'm pretty certain it's correct.
Could you give me some advice? Thank you in advance.
So, you're saying you implemented a neural network on your own?
Well, in this case, each neuron in the input layer is assigned one feature of a given row; then you just iterate through each layer and each neuron in that layer and calculate as instructed.
I'm sure you are familiar with the back-propagation algorithm, so you'll know when to stop.
Once you're done with that row, do the same for the next row: assign each feature to the corresponding input neuron and start the iterations again.
Once you're done with all records, that's an epoch.
I hope that answers your question.
Also, I would recommend trying out Keras; it's easy to use and a good tool to have experience with.
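If you want to see the per-row (online) update in practice, here is a minimal Keras sketch of training on XOR. The layer sizes, learning rate, and epoch count are purely illustrative choices, not taken from your slides:

```python
import numpy as np
from tensorflow import keras

# The four XOR rows and their targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(4, activation="tanh"),      # hidden layer
    keras.layers.Dense(1, activation="sigmoid"),   # output layer
])
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.5), loss="mse")

# batch_size=1 updates the weights after every single row (online learning);
# batch_size=4 would do one update per epoch using the average gradient (batch learning).
model.fit(X, y, epochs=2000, batch_size=1, verbose=0)
print(model.predict(X, verbose=0))
```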
I am trying to build a neural network from scratch, using only numpy. I have the following code and functions. However, the output after training does not match the expected output I have (using XOR as an example). I think one of my functions is not correct, but I cannot figure out the mistake. The output I get is, for example: [[0.73105858], [0.53336314], [0.79343002], [0.5786911]], which is not close to the expected output [0,0,0,1].
I don't see any issues with your code, but here are some things you should keep in mind:
Your neural network is trained for only 2 iterations, with a learning rate of 0.01. This means that your network is updated just 2 times with a small step size, resulting in an undertrained neural network. Also, you're always using a tensor of size 4*4 as input, meaning that the network is only updated with the average over all samples, hence a result that just looks like an average.
For improvement, my suggestion would be to increase the number of iterations and also the number of samples used per iteration, making sure that each iteration performs more than one update. Still, I believe you won't get 100% accurate results, since you are only using one linear layer for XOR, and XOR can't be solved with just one linear layer. You could consider adding another (hidden) layer for better results.
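For illustration, here is a rough numpy-only sketch along those lines (a hidden layer, per-sample updates, and far more iterations). All sizes, the learning rate, and the iteration count are arbitrary choices, not taken from your code:

```python
import numpy as np

# XOR inputs and targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))   # hidden layer (the extra layer)
W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))   # output layer
lr = 0.5

for _ in range(10000):                               # far more than 2 iterations
    for x_row, y_row in zip(X, y):                   # one weight update per sample
        x_row, y_row = x_row.reshape(1, -1), y_row.reshape(1, -1)
        # Forward pass.
        h = sigmoid(x_row @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward pass for the squared-error loss.
        d_out = (out - y_row) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        # Gradient-descent updates.
        W2 -= lr * h.T @ d_out
        b2 -= lr * d_out
        W1 -= lr * x_row.T @ d_h
        b1 -= lr * d_h

h = sigmoid(X @ W1 + b1)
print(sigmoid(h @ W2 + b2).round(3))                 # should be close to [[0], [1], [1], [0]]
```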
I saw a YouTube video on genetic algorithms by davidrandallmiller and wanted to see if I could code it myself, but I'm not sure how to go about doing it.
If I generate a bunch of random binary-encoded genes, every gene describes a single connection between two neurons. If I understood the video correctly, every neuron calculates its own output by taking the weighted sum of all of its inputs, running it through a tanh function, and passing that output on to the next neuron.
But I don't understand how a neuron can connect to itself.
And how do I find the order in which to calculate neuron outputs? I feel like a neuron shouldn't be activated (calculate its output and pass it on) unless all of its incoming connections have already been activated.
I'm not sure if that makes sense; maybe there is some other technique I can use to combine a genetic algorithm with a neural network?
Any help is appreciated.
How can I use Long Short-Term Memory (LSTM) to predict a future value x(t+1) (out-of-sample prediction) based on a historical dataset? I have read and tried many web tutorials on forecasting and prediction with LSTMs, but I am still far from the goal. What is the exact procedure for this prediction? Is it just as simple as shifting the target array n steps, where n is the number of future predictions, and doing the prediction operation? Or are there other techniques?
Please help or leave a suggestion.
Can you tell us which framework you are using? TensorFlow? PyTorch? Which web tutorials specifically?
Assuming you are going with TensorFlow, you can copy and paste code from one of these, test that it works on the provided dataset, then modify the input-encoding functions to fit your data, and run it on your dataset.
https://github.com/llSourcell/How-to-Predict-Stock-Prices-Easily-Demo (best)
https://github.com/sebastianheinz/stockprediction
https://github.com/talolard/MarketVectors/blob/master/preparedata.ipynb (you will have to replace fc layers with lstm, and fiddle with inputs)
In general, the procedure is something like this (assuming TensorFlow; a rough sketch follows the list):
Download Dataset
Create a function to load batches of data
Create a function to encode batch of data (normalization, other transforms)
Create an LSTM layer to receive the series of inputs.
Create an output layer (usually fully connected) to take the last LSTM state and predict an output of your desired size.
Create a tf session to wire everything together, and hit run.
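As a rough sketch of those steps, here is a compact version using tf.keras rather than a raw session. The dataset, window size, and all hyperparameters below are placeholders, not recommendations:

```python
import numpy as np
import tensorflow as tf

# Stand-in for "Download Dataset": a toy univariate series.
series = np.sin(np.linspace(0.0, 100.0, 2000)).astype("float32")

# "Load/encode batches": turn the series into (samples, window, 1) inputs
# with the next value as the target.
def make_windows(data, window):
    x = np.stack([data[i:i + window] for i in range(len(data) - window)])
    y = data[window:]
    return x[..., None], y

window = 50
x_train, y_train = make_windows(series, window)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window, 1)),
    tf.keras.layers.LSTM(32),    # LSTM layer receiving the series of inputs
    tf.keras.layers.Dense(1),    # fully connected head on the last LSTM state
])
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, y_train, epochs=5, batch_size=64, verbose=0)

# Out-of-sample prediction: feed the most recent window to get x(t+1).
next_value = model.predict(series[-window:][None, :, None], verbose=0)
print(next_value)
```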
Some questions to ask yourself conceptually about which network to use:
How many inputs map to how many outputs? See these excellent slides by Karpathy: http://cs231n.stanford.edu/slides/2016/winter1516_lecture10.pdf
How far back do you look at the stock prices, e.g. {t-100, ..., t} or {t-10, ..., t}? This may dictate the size of the hidden layers.
What other information do you think is relevant to the model? Does stock A influence stock B? In that case you may have two LSTMs outputting a state to your fully connected layer...
For several days now, I have been trying to build a simple sine-wave sequence generator using an LSTM, without any glimpse of success so far.
I started from the time sequence prediction example
All I wanted to do differently is:
Use different optimizers (e.g. RMSprop) instead of LBFGS
Try different signals (more sine-wave components)
This is the link to my code. "experiment.py" is the main file
What I do is:
I generate artificial time-series data (sine waves)
I cut those time-series data into small sequences
The input to my model is a sequence of time 0...T, and the output is a sequence of time 1...T+1
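Roughly, the data preparation described above amounts to something like this (purely illustrative numpy, not the actual contents of experiment.py):

```python
import numpy as np

# Artificial sine waves: 100 series of length 1000 with random phases.
t = np.arange(1000)
phases = np.random.rand(100, 1) * 2 * np.pi
data = np.sin(0.02 * t[None, :] + phases)

# Inputs are times 0 .. T, targets are the same sequences shifted by one step (1 .. T+1).
inputs = data[:, :-1]
targets = data[:, 1:]
```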
What happens is:
The training and validation losses go down smoothly
The test loss is very low
However, when I try to generate arbitrary-length sequences, starting from a seed (a random sequence from the test data), everything goes wrong. The output always flattens out.
I simply don't see what the problem is. I have been playing with this for a week now, with no progress in sight.
I would be very grateful for any help.
Thank you
This is normal behaviour and happens because your network is too confident in the quality of the input and doesn't learn to rely enough on the past (on its internal state), relying solely on the input. When you apply the network to its own output in the generation setting, the input to the network is not as reliable as it was in the training or validation case, where it got the true input.
I have two possible solutions for you:
The first is the simpler but less intuitive one: add a little bit of Gaussian noise to your input. This will force the network to rely more on its hidden state.
The second is the more obvious solution: during training, feed the network not the true input but its own generated output with a certain probability p. Start training with p=0 and gradually increase it, so that it learns to generate longer and longer sequences independently. This is called scheduled sampling, and you can read more about it here: https://arxiv.org/abs/1506.03099 .
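Minimal sketches of both ideas, assuming a PyTorch setup like the example you started from. The model interface here (one step of input plus a hidden state in, a prediction plus the new state out) is an assumption and may not match your code exactly:

```python
import torch
import torch.nn.functional as F

def train_step(model, seq, p, noise_std=0.0):
    """seq: (batch, T) ground-truth series; p: scheduled-sampling probability."""
    hidden = None
    loss = 0.0
    inp = seq[:, 0:1]
    for t in range(seq.size(1) - 1):
        # Solution 1: a little Gaussian noise on the input.
        noisy_inp = inp + noise_std * torch.randn_like(inp)
        pred, hidden = model(noisy_inp, hidden)       # assumed model interface
        loss = loss + F.mse_loss(pred, seq[:, t + 1:t + 2])
        # Solution 2: with probability p, feed the model its own output
        # instead of the true next value (scheduled sampling).
        if torch.rand(()) < p:
            inp = pred.detach()
        else:
            inp = seq[:, t + 1:t + 2]
    return loss / (seq.size(1) - 1)
```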
I use the Elman recurrent network from neurolab to predict a time series of continuous values. The network is trained from a sequence such that the input is the value at index i and the target is the value at index i+1.
To make predictions beyond the immediate next time step, the output of the net is fed back as input. If, for example, I intend to predict the value at i+5, I proceed as follows.
1. Input the value from i.
2. Take the output and feed it back to the net as the next input value (i.e. i+1).
3. Repeat steps 1 and 2 four more times.
4. The output is a prediction of the value at i+5.
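As a loop, the procedure above looks roughly like this, where predict_one is a hypothetical stand-in for a single activation of the net (it is not a neurolab function):

```python
def forecast(predict_one, value_at_i, steps=5):
    """Feed each prediction back in as the next input."""
    x = value_at_i
    for _ in range(steps):
        x = predict_one(x)   # output for i+1, then i+2, ... as it is fed back in
    return x                 # prediction for i+steps, e.g. i+5
```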
So for predictions beyond the immediate next time step, recurrent networks must be activated with the output from a previous activation.
In most examples, however, the network is fed with an already complete sequence. See, for example, the functions train and sim in the example behind the link above. The first function trains the network with an already complete list of examples and the second function activates the network with a complete list of input values.
After some digging in neurolab, I found the function step, which returns a single output for a single input. Results from using step suggest, however, that the function does not retain the activation of the recurrent layer, which is crucial to recurrent networks.
How can I activate a recurrent Elman network in neurolab with a single input such that it maintains its internal state for the next single input activation?
It turns out it is quite normal for output that is generated from previous output to converge, sooner or later, towards a constant value. In effect, the output of a network cannot depend only on its previous output.
I obtain the same result, a constant. However, I noticed something:
-> If you use 0 and 1 data (0 for a decrease, 1 for an increase), the results improve and the output is no longer a constant.
-> Try to use another variable to explain the targeted one, as one of our colleagues already mentioned.