I am using the TensorFlow library to solve a time series problem.
I build the features (the dimensions or attributes) by taking the difference between the current value and the previous value (differencing, as described in this article).
The article provides the data needed for forecasting, and it chooses the training and test splits, so there are no problems there.
But my question is: how can I predict the future? If, say, I want to forecast 5 months ahead, there will be no features or attributes to pass to the forecast function.
If you have a better source, please share it. Thanks in advance.
If you have a lot of data it could be possible: the model has seen many samples, so it can generalize to new data because it can find a known pattern. If you have a poor model, it will throw out bad predictions, because the new input is unfamiliar and the model can't find a known pattern.
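To make that concrete, here is a minimal sketch (not the article's code) of forecasting several steps ahead by feeding each predicted difference back in and then undoing the differencing. Here model is assumed to be an already-trained Keras model that maps a window of window_size differences to the next difference, and all names are placeholders:

import numpy as np

def forecast_future(model, series, window_size, n_steps):
    diffs = np.diff(series)                    # first-order differencing, as in the article
    window = diffs[-window_size:].tolist()     # the last known differences
    preds = []
    for _ in range(n_steps):
        x = np.array(window[-window_size:]).reshape(1, window_size, 1)
        next_diff = float(model.predict(x, verbose=0)[0, 0])
        preds.append(next_diff)
        window.append(next_diff)               # feed the prediction back in as input
    # undo the differencing: last real value plus the cumulative sum of predicted differences
    return series[-1] + np.cumsum(preds)

Whether the results are usable this far ahead depends entirely on how well the model generalizes, as noted above.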
I'm new to Machine Learning and I'm a bit confused about how data is read during the training/testing process. My data is indexed by date, and I want the model to read the older dates before it gets to the newer dates; the data is saved with the earliest date on line 1 and the latest date on line n. I'm assuming that data is naturally read from line 1 down to line n, but I just need to be sure about it. And is there any way to make a model (e.g. Logistic Regression) read the data in whichever direction I want?
In supervised learning, a machine learning model learns from all samples in no particular order; in fact, shuffling the samples during training is encouraged.
Most of the time the model is not fed all samples at once: the training set is split into batches, either batches of random samples or batches taken in whatever order the training set happens to be in.
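As a rough illustration of that last point, here is how you can control the sample order explicitly with tf.data (the arrays here are made-up placeholders). Note that for a plain scikit-learn LogisticRegression the default solver optimizes over the whole dataset at once, so row order does not change the fitted model at all:

import numpy as np
import tensorflow as tf

# Hypothetical data: one row per date, row 0 = earliest date.
X_all = np.random.rand(1000, 10).astype("float32")
y_all = np.random.randint(0, 2, size=(1000,))

# Default: iterate in storage order (line 1 first), in batches.
ds_in_order = tf.data.Dataset.from_tensor_slices((X_all, y_all)).batch(32)

# Reverse the arrays first if you want the latest dates seen first.
ds_reversed = tf.data.Dataset.from_tensor_slices((X_all[::-1], y_all[::-1])).batch(32)

# Recommended for most supervised models: shuffle so the order doesn't matter at all.
ds_shuffled = tf.data.Dataset.from_tensor_slices((X_all, y_all)).shuffle(buffer_size=1000).batch(32)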
I'm new to TensorFlow and have a general question:
I have a certain amount of training data and want to do a time series prediction.
The interval of my training data is one minute, and I want to predict the following minutes based on new input data which is provided via a REST API.
What I don't understand is this:
Let's say I train the model with all the data up to yesterday; this means I can predict the first values of today to a certain extent. But the new values of today have not been observed by the model that was built yesterday.
How would you solve this problem?
Thanks
Djboblo,
I assume that you need to predict the whole next day's values on a per-minute basis.
In that case your options are:
recursive prediction, i.e. using predicted data as input for the next prediction
structuring the model to provide you with a prediction for the whole next day
If it is just a matter of predicting a single minute forward and your model is trained on a reasonably large amount of data, don't worry: just feed it the values up to the prediction minute. Periodically you may re-train the model using new data.
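For completeness, the recursive option (the first bullet above) might look roughly like this; model is assumed to be a trained Keras model that maps the last lookback minutes to the next minute, and recent_values (e.g. the values fetched from your REST API) is a placeholder name:

import numpy as np

def predict_minutes_ahead(model, recent_values, lookback, n_minutes):
    window = list(recent_values[-lookback:])
    predictions = []
    for _ in range(n_minutes):
        x = np.asarray(window[-lookback:], dtype="float32").reshape(1, lookback, 1)
        y_hat = float(model.predict(x, verbose=0)[0, 0])
        predictions.append(y_hat)
        window.append(y_hat)    # the prediction becomes input for the next minute
    return predictions

The further ahead you go, the more the prediction error accumulates, which is why re-training periodically on the newly observed minutes is worthwhile.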
What I was looking for is this:
How to use a Keras RNN model to forecast for future dates or events?
to predict stateful events
and after a while use the .fit method to update the network with new data
See: https://machinelearningmastery.com/update-lstm-networks-training-time-series-forecasting/
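A hedged sketch of that update step, assuming model is the already-compiled and trained Keras network and X_new / y_new are the windowed samples observed since the last update (both names are made up here):

# continue training on the newly observed data only
model.fit(
    X_new, y_new,
    epochs=3,          # a few extra epochs; consider a lower learning rate
    batch_size=32,
    shuffle=False,     # keep temporal order if the model is stateful
    verbose=0,
)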
This is a question about a general approach rather than a specific coding problem. I'm trying to do time series forecasting with Tensorflow where features of the label timestep are known to the model. E.g. a human trying to predict a variable a week from now would know things that are going to happen in the next week that will affect that variable. So a window of 20 timesteps where the label is the 20th timestep would look something like this:
Timesteps 1-19 would each have a set of features plus the timeseries data
Timestep 20 would have a set of features which are known, plus the timeseries label which is unknown
Is there a model that could handle this sort of data? I've gone through the Tensorflow time series forecasting tutorial, done a Coursera course on Tensorflow time series forecasting and searched elsewhere but I can't find anything. I'm fairly new to this so apologies for any imprecise language.
I once tried to do this kind of time series problem by stacking a multivariate model and another machine learning model. My idea was to take the normal time series model's output and add it as another feature to a second model that only takes the last time step's information as input. But it is complicated and can overfit a lot even if the second model is carefully regularized. The idea is to use the information from step 1 to window_size - 1 to predict a rough output at step window_size, then use the information at step window_size to reduce the residual between my time series model's output and the actual label. But I don't think this approach is theoretically correct, and the result might be worse than using a time series model without feeding in the target step's information.
I don't think TensorFlow has any API for your problem, because this type of problem is not a normal time series problem. Usually people just treat this kind of problem as a regression or classification problem.
I am not an expert on this problem either; I just happened to attempt to solve the exact same problem, so this is just my personal experience...
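For what it's worth, the stacking idea described above could be sketched roughly like this; X_past (samples of steps 1 to window_size - 1), X_last (the known features at the target step) and y (the labels) are placeholder names, and the layer sizes are arbitrary:

import numpy as np
import tensorflow as tf
from sklearn.linear_model import Ridge

# Stage 1: an ordinary sequence model trained on steps 1 .. window_size - 1.
ts_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=X_past.shape[1:]),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
ts_model.compile(optimizer="adam", loss="mse")
ts_model.fit(X_past, y, epochs=10, verbose=0)

# Stage 2: a heavily regularized regressor that sees the stage-1 prediction
# plus the features that are already known at the target step.
rough_pred = ts_model.predict(X_past, verbose=0)       # shape (n_samples, 1)
stacked_features = np.hstack([rough_pred, X_last])
second_model = Ridge(alpha=1.0)
second_model.fit(stacked_features, y)

As noted above, this can overfit easily; one reason is that the stage-2 model is trained on predictions the stage-1 model made for data it had already seen.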
I'm working with a dataset of a bunch of different features for stock forecasting for a project to help teach myself programming and to get into machine learning. I am running into some issues with predicting future data (or time series forecasting) and I was hoping someone out there could give me some advice! Any advice or criticism you could provide will be greatly appreciated.
Below I've listed detailed examples of the three implementations I have tried for forecasting time series data. I could definitely be wrong on this, but I don't believe this is a mechanical code issue, because all of the results are consistent despite me re-coding it a few times (the only thing I can really think of here is not using MinMaxScaler correctly; see closing thoughts). It could, however, be a larger mistake in my overall approach.
I didn't post any code for the project here because it was starting to turn into a wall of words and I had three separate examples, but if you have any questions, or think it would help to see the code or the data used for any of the examples below, feel free to let me know and I'll link whatever's needed.
The three forecasting implementations I have tried:
1) - A sliding window implementation. Input data is shifted backwards in timesteps (x-1, x-2, ...) and the target data is the current timestep (x). The data used for the first forecast is n rows of test data shifted in the same manner as the input data. For every subsequent prediction the oldest timestep is removed and the new prediction is appended to the end of the prediction row, maintaining the same total number of timesteps but progressing forward in time.
2) - Input data is just x, and the target data is shifted 30 timesteps forward (y+1, y+2, ..., y+30). I attempt to forecast the future by taking the first sample of x in the test data and predicting 30 steps into the future from it.
3) - A combination of both methods: input data is shifted backward, and in the example shown below 101 timesteps including the present timestep (x-100, x-99, ..., x) were used. The target data, similar to implementation 2, is shifted 30 timesteps into the future (y+1, y+2, ..., y+30). With this, I am attempting to forecast the future by taking 101 timesteps from the first n rows of test data and predicting 30 steps into the future from there. (A rough sketch of this windowing follows the list.)
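Since all three implementations differ only in how the input and target windows are built, here is a rough NumPy sketch of the windowing for implementation 3 (implementations 1 and 2 are special cases of it); data is a hypothetical (n_rows, n_features) array ordered oldest to newest, and close_idx is the column index of the close price:

import numpy as np

def make_windows(data, close_idx, lookback=101, horizon=30):
    X, y = [], []
    for t in range(lookback - 1, len(data) - horizon):
        X.append(data[t - lookback + 1 : t + 1])            # x-100 ... x
        y.append(data[t + 1 : t + 1 + horizon, close_idx])  # y+1 ... y+30
    return np.array(X), np.array(y)

# X has shape (n_samples, 101, n_features); y has shape (n_samples, 30)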
For all tests, I cut off the end of my dataset at an arbitrary point (the last ~10% of the total dataset), split everything before the cutoff into training/validation (80/20), and saved everything after the cutoff for testing & forecasting purposes.
As for network architectures, I've tried a bunch of different ones, from a bidirectional LSTM to a multi-input CNN/GRU to a WaveNet-like CNN implementation, and all produce prediction results that are bad in a similar enough way that I feel this is either a data-manipulation problem or a problem of me not understanding how model.predict(), or my model's output, works.
The architectures I will be using for each implementation below are:
1) a causal dilated CNN
2) a two-layer LSTM
neural network architecture diagrams here: https://imgur.com/a/cY2RWNG
For every example below, the model's weights were tuned by training on the training data (the first 80% of the dataset) while attempting to achieve the lowest possible validation loss on the validation data (the last 20% of the dataset).
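For reference, that set-up corresponds to something like the following sketch, where model and the split arrays (X_train, y_train, X_val, y_val) are placeholders and an EarlyStopping callback is one common way to keep the weights with the lowest validation loss:

import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=10,
    restore_best_weights=True,
)
model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=200,
    batch_size=32,
    callbacks=[early_stop],
)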
--- First Implementation ---
(unfortunately, there's an image limit on Stack Overflow at my current reputation level, so I've put each implementation into its own album)
Implementation 1 - Graphs for both CNN/LSTM: Images 1-7 https://imgur.com/a/36DZCIf
In every training/validation test graph, black represents the actual data and red the predicted data. In the forecasting predictions, blue represents the predictions made and orange the actual close price, shown on a longer time scale than the prediction for better scale. All forecast predictions are 30 days into the future.
Using this methodology, and displaying the actual close price against the predicted close price in every instance:
Image 1 - sliding window set up for this implementation using one and two features and a range of numbers for viewing ease
CNN:
(images 2 & 3 description in album)
Image 4 - Sliding window approach of forecasting every feature in the data, with the prediction for close price plotted against the actual close price. The timesteps start at the first row of the cutoff data.
When the first prediction is made I append the prediction to the end of this row and remove the first timestep, repeating for every future timestep I wish to predict.
I really don't even know what to say about this prediction, it's really bad...
LSTM:
(images 5 & 6 description in album)
Image 7 - Sliding window prediction: https://i.imgur.com/Ywf6xvr.png
This prediction seems to be capturing the trend somewhat accurately, I guess... but the starting point is nowhere near the last known data point, which is confusing.
--- Second Implementation ---
Implementation 2 - Graphs for both CNN/LSTM: Images 1-7
https://imgur.com/a/3CAk1xc
For this attempt, I made the target prediction many timesteps into the future. With this implementation, the model takes in the current timestep (x) of features and attempts to predict the closing price at y+1, y+2, y+3, etc. There is only one prediction here: a sequence of timesteps into the future.
The same graphing and network conventions as in implementation 1 apply here too.
Image 1 - Set up of input and target data, using a range and only one or two features for viewing ease.
CNN:
(images 2 & 3 description in album)
Image 4 - Plotting all 30 predictions made from the first row of data features after the cutoff... this is horrible. Why is the start again nowhere near the last known data point? I don't understand how it can predict y+1 being so far away from the closing price of x, when in every instance of its training y+1 was almost certainly extremely close to x.
LSTM:
(images 5 & 6 description in album)
Image 7 - All 30 predictions into the future made from the first row of cutoff data: again, all over the place, and the predictions start nowhere near the last actual data point; not sure what else to add.
It's starting to appear that either my CNN implementation is poorly done or the LSTM is just a better solution here. Regardless, the predictions and actual forecasting are still terrible, so I'll hold my judgment on the network architecture until I get something that looks remotely like an actual forecast prediction.
--- Third Implementation ---
Implementation 3 - Graphs for both CNN/LSTM: Images 1-7
https://imgur.com/a/clcKFF8
This was the final idea I had for forecasting the future, and it's essentially a combination of the first two. For this implementation, I take x-n (x, x-1, x-2, x-3, etc.), which is similar to the first implementation, and set the target data to y+1, y+2, y+3, which is similar to the second implementation. My goal for predicting with this was the same strategy as the second implementation, where I would predict 30 days into the future, but instead of doing so from one timestep of features, I'd do so from many timesteps into the past. I had hoped this implementation would give the prediction enough supporting data to accurately forecast the future.
Image 1 - Input data ("x") and target data ("y") set-up. I use a range of numbers again. In this example, the input data has 2 features and includes the present timestep (x) plus 4 timesteps shifted backward (x-1, x-2, x-3, x-4), and the target data has 5 timesteps into the future (y+1, y+2, y+3, y+4, y+5).
CNN:
(images 2 & 3 description in album)
Image 4 - 30 predictions into the future using 101 timesteps of x
This is probably the worst result yet, and that's despite the prediction having far more timesteps of past data to work with.
LSTM:
(images 5 & 6 description in album)
Image 7 - 30 predictions on input data row of 101 timesteps.
This actually has some movement to it, I guess, but it's all over the place, doesn't start near the last actual data point, and is clearly not accurate at all.
--- Closing Thoughts ---
I've also tried removing the target variable (close price) from the input data, but it doesn't seem to change much, and I would think the past n days of closing data should be available to a model anyway.
Originally I MinMaxScaled all of my data in my pre-processing step and did not inverse_transform any of the data; the results were basically just as bad as the examples above. For the examples above I have min-max scaled the prediction, validation & test datasets separately to be within the range of 0.2-0.8. For the actual forecasting predictions, I've inverse_transformed the data before plotting it against the actual closing price, which was never transformed.
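(For reference, the usual convention with MinMaxScaler is to fit it on the training split only, reuse that same fitted scaler for the validation/test splits, and call inverse_transform on the predictions with the same scaler. A minimal sketch, with placeholder array names and a separate scaler for the close price so the one-column predictions can be inverted directly:

from sklearn.preprocessing import MinMaxScaler

feature_scaler = MinMaxScaler(feature_range=(0.2, 0.8))
close_scaler = MinMaxScaler(feature_range=(0.2, 0.8))

X_train_s = feature_scaler.fit_transform(X_train)          # fit on the training split only
X_test_s = feature_scaler.transform(X_test)                # reuse, do not re-fit
y_train_s = close_scaler.fit_transform(y_train.reshape(-1, 1))

# after predicting, map the (n, 1) scaled predictions back to price units
predictions = close_scaler.inverse_transform(predictions_scaled))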
If I am doing something fundamentally wrong in the above examples I would love to know as I'm just starting out and this is my first real programming/machine learning project.
A few other things relating to this that I've come across / tried:
I've experimented briefly with using a stateful model where I reset_states() after every prediction, with some moderate success.
I've read that sequence-to-sequence models could be useful for forecasting time series data, but despite reading into it quite a bit I'm really not sure what that set-up is designed to do with time series, and so I'm not sure how to implement or test it (a rough sketch of the idea appears after this list).
I tried a bidirectional LSTM because one random Stack Overflow post suggested it for time series forecasting; the results were very mediocre, however, and from what I understand of how it works it doesn't seem to make much sense in this situation. I've only tried it with the first implementation above, though; let me know if it's something to look into more.
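For reference, a sequence-to-sequence set-up for multi-step forecasting typically uses an encoder that summarizes the input window and a decoder that emits one value per future step. A rough sketch (lookback, horizon, n_features and the layer sizes are placeholders):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(lookback, n_features)),
    tf.keras.layers.LSTM(64),                                  # encoder: summarize the window
    tf.keras.layers.RepeatVector(horizon),                     # repeat the summary for each future step
    tf.keras.layers.LSTM(64, return_sequences=True),           # decoder: one hidden state per step
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)), # one output value per step
])
model.compile(optimizer="adam", loss="mse")
# targets for this model have shape (n_samples, horizon, 1)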
Any tips/criticism at all that you could provide would be greatly appreciated, I'm really not sure how to progress from here. Thanks in advance!
I have been through that. For me, the sliding window approach with an LSTM/NN worked like magic on small time series, but on a bigger time series, with data coming in on an hourly basis for a few years, it failed miserably.
Later on I ditched LSTMs and GBTs and started implementing algorithms from statsmodels.tsa, most of the time ARIMA or SARIMA; I'd suggest you read about them too. They are very easy to implement: there's no need to worry about sliding windows or shifting data a few timestamps back, it takes care of all that. Just train, tune the parameters, and predict for the next timestamps.
Sometimes I also faced issues such as my time series having missing timestamps and data, in which case I had to impute those values; or the frequency I trained on (hourly, weekly, monthly) was different from the frequency I wanted to predict on, so I had to bring the data into the right form as well. I ran into the different-frequency issue while visualising on a plot too.
import statsmodels.api
# SARIMAX with a 24-step seasonal period (e.g. hourly data with a daily cycle)
model = statsmodels.api.tsa.SARIMAX(train_df, order=(1, 0, 1), seasonal_order=(1, 1, 0, 24))
results = model.fit()
Other than the data pre-processing part (imputing missing data, training at the right frequency, and some logic for parameter tuning), you will need just the two lines above that build and fit the model. Your data frame should have its index in a date format and the time series data in its columns.
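To actually produce the future values from the fitted results object, something like this (the step count is just an example):

# forecast the next 30 timestamps beyond the end of train_df
future = results.forecast(steps=30)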