I built a Bayesian regression model with the PyMC3 package and I'm trying to generate predictions from new data. I used the data container pm.Data() to train the model on the training data, then passed the new data to pm.set_data() before calling pm.sample_posterior_predictive(). The predictions are what I would expect from the training data, not from the new data.
Here's my model:
df_train = df.drop(['Unnamed: 0', 'DATE_AT'], axis=1)
with Model() as model:
    response_mean = []
    x_ = pm.Data('features', df_train) # a data container, can be changed
    t = np.transpose(x_.get_value())
    # intercept
    y = Normal('y', mu=0, sigma=6000)
    response_mean.append(y)
    # channels that can have DECAY and SATURATION effects
    for channel_name in delay_channels:
        i = df_train.columns.get_loc(channel_name)
        xx = t[i].astype(float)
        print(f'Adding Delayed Channels: {channel_name}')
        c = coef.loc[coef['features']==channel_name, 'coef'].values[0]
        s = abs(c*0.015)
        if c <= 0:
            channel_b = HalfNormal(f'beta_{channel_name}', sd=s)
        else:
            channel_b = Normal(f'beta_{channel_name}', mu=c, sigma=s)
        alpha = Beta(f'alpha_{channel_name}', alpha=3, beta=3)
        channel_mu = Gamma(f'mu_{channel_name}', alpha=3, beta=1)
        response_mean.append(logistic_function(
            geometric_adstock_tt(xx, alpha), channel_mu) * channel_b)
    # channels that have SATURATION effects only
    for channel_name in non_lin_channels:
        i = df_train.columns.get_loc(channel_name)
        xx = t[i].astype(float)
        print(f'Adding Non-Linear Logistic Channel: {channel_name}')
        c = coef.loc[coef['features']==channel_name, 'coef'].values[0]
        s = abs(c*0.015)
        if c <= 0:
            channel_b = HalfNormal(f'beta_{channel_name}', sd=s)
        else:
            channel_b = Normal(f'beta_{channel_name}', mu=c, sigma=s)
        # logistic reach curve
        channel_mu = Gamma(f'mu_{channel_name}', alpha=3, beta=1)
        response_mean.append(logistic_function(xx, channel_mu) * channel_b)
    # continuous external features
    for channel_name in control_vars:
        i = df_train.columns.get_loc(channel_name)
        xx = t[i].astype(float)
        print(f'Adding control: {channel_name}')
        c = coef.loc[coef['features']==channel_name, 'coef'].values[0]
        s = abs(c*0.015)
        if c <= 0:
            control_beta = HalfNormal(f'beta_{channel_name}', sd=s)
        else:
            control_beta = Normal(f'beta_{channel_name}', mu=c, sigma=s)
        channel_contrib = control_beta * xx
        response_mean.append(channel_contrib)
    # categorical control variables
    for var_name in index_vars:
        i = df_train.columns.get_loc(var_name)
        shape = len(np.unique(t[i]))
        x = t[i].astype('int')
        print(f'Adding Index Variable: {var_name}')
        ind_beta = Normal(f'beta_{var_name}', sd=6000, shape=shape)
        channel_contrib = ind_beta[x]
        response_mean.append(channel_contrib)
    # noise
    sigma = Exponential('sigma', 10)
    # define likelihood
    likelihood = Normal(outcome, mu=sum(response_mean), sd=sigma, observed=df[outcome].values)
    trace = pm.sample(tune=3000, cores=4, init='advi')
Here are the betas from the model. Notice that ADWORDS_SEARCH is one of the most important features:
Betas
When I zeroed out the ADWORDS_SEARCH feature, I got practically identical predictions, which cannot be right:
with model:
    y_pred = sample_posterior_predictive(trace)
mod_channel = 'ADWORDS_SEARCH'
df_mod = df_train.copy(deep=True)
df_mod.iloc[12:-12, df_mod.columns.get_loc(mod_channel)] = 0
with model:
    pm.set_data({'features':df_mod})
    y_pred_mod = pm.sample_posterior_predictive(trace)
Predictions Plot
By zeroing out ADWORDS_SEARCH, I would expect the predictions to be significantly lower than the original ones, since ADWORDS_SEARCH is one of the most important features according to the betas.
I started questioning the model, but it seems to perform well:
MAPE = 6.3%
r2 = 0.7
I also tried passing the original training data set to pm.set_data() and got very similar results as well.
This is the difference between the predictions from the training data and from the new data:
y1-y2
This is the difference between the predictions from the training data and from the same training data passed through pm.set_data():
y1-y3
Does anyone know what I'm doing wrong?
I currently have an RNN model for time series prediction. It uses the 3 input features "value", "temperature" and "hour of the day" from the last 96 time steps to predict the next 96 time steps of the feature "value".
Here you can see a schema of it:
and here you have the current code:
#Import modules
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
from tensorflow import keras
# Define the parameters of the RNN and the training
epochs = 1
batch_size = 50
steps_backwards = 96
steps_forward = 96
split_fraction_trainingData = 0.70
split_fraction_validatinData = 0.90
randomSeedNumber = 50
#Read dataset
df = pd.read_csv('C:/Users/Desktop/TestData.csv', sep=';', header=0, low_memory=False, infer_datetime_format=True, parse_dates={'datetime':[0]}, index_col=['datetime'])
# standardize data
data = df.values
indexWithYLabelsInData = 0
data_X = data[:, 0:3]
data_Y = data[:, indexWithYLabelsInData].reshape(-1, 1)
scaler_standardized_X = StandardScaler()
data_X = scaler_standardized_X.fit_transform(data_X)
data_X = pd.DataFrame(data_X)
scaler_standardized_Y = StandardScaler()
data_Y = scaler_standardized_Y.fit_transform(data_Y)
data_Y = pd.DataFrame(data_Y)
# Prepare the input data for the RNN
series_reshaped_X = np.array([data_X[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
series_reshaped_Y = np.array([data_Y[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
timeslot_x_train_end = int(len(series_reshaped_X)* split_fraction_trainingData)
timeslot_x_valid_end = int(len(series_reshaped_X)* split_fraction_validatinData)
X_train = series_reshaped_X[:timeslot_x_train_end, :steps_backwards]
X_valid = series_reshaped_X[timeslot_x_train_end:timeslot_x_valid_end, :steps_backwards]
X_test = series_reshaped_X[timeslot_x_valid_end:, :steps_backwards]
Y_train = series_reshaped_Y[:timeslot_x_train_end, steps_backwards:]
Y_valid = series_reshaped_Y[timeslot_x_train_end:timeslot_x_valid_end, steps_backwards:]
Y_test = series_reshaped_Y[timeslot_x_valid_end:, steps_backwards:]
# Build the model and train it
np.random.seed(randomSeedNumber)
tf.random.set_seed(randomSeedNumber)
model = keras.models.Sequential([
keras.layers.SimpleRNN(10, return_sequences=True, input_shape=[None, 3]),
keras.layers.SimpleRNN(10, return_sequences=True),
keras.layers.TimeDistributed(keras.layers.Dense(1))
])
model.compile(loss="mean_squared_error", optimizer="adam", metrics=['mean_absolute_percentage_error'])
history = model.fit(X_train, Y_train, epochs=epochs, batch_size=batch_size, validation_data=(X_valid, Y_valid))
#Predict the test data
Y_pred = model.predict(X_test)
# Inverse the scaling (traInv: transformation inversed)
data_X_traInv = scaler_standardized_X.inverse_transform(data_X)
data_Y_traInv = scaler_standardized_Y.inverse_transform(data_Y)
series_reshaped_X_notTransformed = np.array([data_X_traInv[i:i + (steps_backwards+steps_forward)].copy() for i in range(len(data) - (steps_backwards+steps_forward))])
X_test_notTranformed = series_reshaped_X_notTransformed[timeslot_x_valid_end:, :steps_backwards]
Y_pred_traInv = scaler_standardized_Y.inverse_transform (Y_pred)
Y_test_traInv = scaler_standardized_Y.inverse_transform (Y_test)
# Calculate errors for every time slot of the multiple predictions
abs_diff = np.abs(Y_pred_traInv - Y_test_traInv)
abs_diff_perPredictedSequence = np.zeros((len (Y_test_traInv)))
average_LoadValue_testData_perPredictedSequence = np.zeros((len (Y_test_traInv)))
abs_diff_perPredictedTimeslot_ForEachSequence = np.zeros((len (Y_test_traInv)))
absoluteError_Load_Ratio_allPredictedSequence = np.zeros((len (Y_test_traInv)))
absoluteError_Load_Ratio_allPredictedTimeslots = np.zeros((len (Y_test_traInv)))
mse_perPredictedSequence = np.zeros((len (Y_test_traInv)))
rmse_perPredictedSequence = np.zeros((len(Y_test_traInv)))
for i in range (0, len(Y_test_traInv)):
    for j in range (0, len(Y_test_traInv [0])):
        abs_diff_perPredictedSequence [i] = abs_diff_perPredictedSequence [i] + abs_diff [i][j]
    mse_perPredictedSequence [i] = mean_squared_error(Y_pred_traInv[i] , Y_test_traInv [i] )
    rmse_perPredictedSequence [i] = np.sqrt(mse_perPredictedSequence [i])
    abs_diff_perPredictedTimeslot_ForEachSequence [i] = abs_diff_perPredictedSequence [i] / len(Y_test_traInv [0])
    average_LoadValue_testData_perPredictedSequence [i] = np.mean (Y_test_traInv [i])
    absoluteError_Load_Ratio_allPredictedSequence [i] = abs_diff_perPredictedSequence [i] / average_LoadValue_testData_perPredictedSequence [i]
    absoluteError_Load_Ratio_allPredictedTimeslots [i] = abs_diff_perPredictedTimeslot_ForEachSequence [i] / average_LoadValue_testData_perPredictedSequence [i]
rmse_average_allPredictictedSequences = np.mean (rmse_perPredictedSequence)
absoluteAverageError_Load_Ratio_allPredictedSequence = np.mean (absoluteError_Load_Ratio_allPredictedSequence)
absoluteAverageError_Load_Ratio_allPredictedTimeslots = np.mean (absoluteError_Load_Ratio_allPredictedTimeslots)
absoluteAverageError_allPredictedSequences = np.mean (abs_diff_perPredictedSequence)
absoluteAverageError_allPredictedTimeslots = np.mean (abs_diff_perPredictedTimeslot_ForEachSequence)
Here is some test data: Download Test Data
Now I would like to include not only past values of the features in the prediction but also future values of the features "temperature" and "hour of the day". The future values of "temperature" can, for example, be taken from an external weather forecasting service, and for "hour of the day" the future values are known in advance (in the test data I have included a "forecast" of the temperature that is not a real forecast; I just randomly changed the values).
This way, I would expect that, for several applications and data sets, the forecast could be improved.
In a schema it would look like this:
Can anyone tell me how I can do that in Keras with an RNN (or LSTM)? One way could be to include the future values as independent input features. But I would like the model to know that the future values of a feature are connected to its past values.
Reminder: Does anybody have an idea how to do this? I'd highly appreciate every comment.
The standard approach is to use an encoder-decoder architecture (see 1 and 2 for instance):
The encoder takes as input the past values of the features and of the target and returns an output representation.
The decoder takes as input the encoder output and the future values of the features and returns the predicted values of the target.
You can use any architecture for the encoder and for the decoder, and you can also consider different approaches for passing the encoder output to the decoder (e.g. adding or concatenating it to the decoder input features, adding or concatenating it to the output of some intermediate decoder layer, or adding it to the final decoder output). The code below is just an example.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.layers import Input, Dense, LSTM, TimeDistributed, Concatenate, Add
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
# define the inputs
target = ['value']
features = ['temperatures', 'hour of the day']
sequence_length = 96
# import the data
df = pd.read_csv('TestData.csv', sep=';', header=0, low_memory=False, infer_datetime_format=True, parse_dates={'datetime': [0]}, index_col=['datetime'])
# scale the data
target_scaler = StandardScaler().fit(df[target])
features_scaler = StandardScaler().fit(df[features])
df[target] = target_scaler.transform(df[target])
df[features] = features_scaler.transform(df[features])
# extract the input and output sequences
X_encoder = [] # past features and target values
X_decoder = [] # future features values
y = [] # future target values
for i in range(sequence_length, df.shape[0] - sequence_length):
    X_encoder.append(df[features + target].iloc[i - sequence_length: i])
    X_decoder.append(df[features].iloc[i: i + sequence_length])
    y.append(df[target].iloc[i: i + sequence_length])
X_encoder = np.array(X_encoder)
X_decoder = np.array(X_decoder)
y = np.array(y)
# define the encoder and decoder
def encoder(encoder_features):
    y = LSTM(units=100, return_sequences=True)(encoder_features)
    y = TimeDistributed(Dense(units=1))(y)
    return y
def decoder(decoder_features, encoder_outputs):
    x = Concatenate(axis=-1)([decoder_features, encoder_outputs])
    # x = Add()([decoder_features, encoder_outputs])
    y = TimeDistributed(Dense(units=100, activation='relu'))(x)
    y = TimeDistributed(Dense(units=1))(y)
    return y
# build the model
encoder_features = Input(shape=X_encoder.shape[1:])
decoder_features = Input(shape=X_decoder.shape[1:])
encoder_outputs = encoder(encoder_features)
decoder_outputs = decoder(decoder_features, encoder_outputs)
model = Model([encoder_features, decoder_features], decoder_outputs)
# train the model
model.compile(optimizer=Adam(learning_rate=0.001), loss='mse')
model.fit([X_encoder, X_decoder], y, epochs=100, batch_size=128)
# extract the last predicted sequence
y_true = target_scaler.inverse_transform(y[-1, :])
y_pred = target_scaler.inverse_transform(model.predict([X_encoder, X_decoder])[-1, :])
# plot the last predicted sequence
plt.plot(y_true.flatten(), label='actual')
plt.plot(y_pred.flatten(), label='predicted')
plt.legend() # display the labels set above
plt.show()
In the example above the model takes two inputs, X_encoder and X_decoder, so in your case when generating the forecasts you can use the past observed temperatures in X_encoder and the future temperature forecasts in X_decoder.
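The way the two inputs line up in time can be checked on toy data. The following is a minimal NumPy-only sketch of the windowing step used above; the helper name make_windows and the numbers are made up for illustration, and it assumes the feature and target arrays are already aligned row by row.

```python
import numpy as np

def make_windows(features, target, sequence_length):
    """Split aligned series into encoder inputs, decoder inputs and targets.

    features: array of shape (n_steps, n_features), known for past and future
    target:   array of shape (n_steps, 1), known for the past only
    """
    X_encoder, X_decoder, y = [], [], []
    past = np.hstack([features, target])  # encoder sees features and target
    for i in range(sequence_length, len(target) - sequence_length):
        X_encoder.append(past[i - sequence_length:i])      # past window
        X_decoder.append(features[i:i + sequence_length])  # future features
        y.append(target[i:i + sequence_length])            # future target
    return np.array(X_encoder), np.array(X_decoder), np.array(y)

# toy data: 10 time steps, 2 features, 1 target (made-up numbers)
features = np.arange(20, dtype=float).reshape(10, 2)
target = np.arange(100, 110, dtype=float).reshape(10, 1)
X_enc, X_dec, y = make_windows(features, target, sequence_length=3)
print(X_enc.shape, X_dec.shape, y.shape)
```

At forecast time the same split applies: the last observed window goes into the encoder input and the externally supplied feature forecast goes into the decoder input.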
This is PyTorch code for time series prediction with a known external/exogenous regressor over the forecast period. Hope it helps! Have a marvellous day!
The input is a 3D tensor and the output is a 1D array (MISO: Multiple Inputs, Single Output).
def CNN_Attention_Bidirectional_LSTM_Encoder_Decoder_predictions(model,data ,regressors, extrapolations_leght):
    n_input = extrapolations_leght
    pred_list = []
    batch = data[-n_input:]
    model = model.train()
    pred_list.append(torch.cat(( model(batch)[-1], torch.FloatTensor(regressors.iloc[1,[1]]).to(device).unsqueeze(0)),1))
    batch = torch.cat((batch[n_input-1].unsqueeze(0), pred_list[-1].unsqueeze(0)),1)
    batch = batch[:, 1:, :]
    for i in range(n_input-1):
        model = model.eval()
        pred_list.append(torch.cat((model(batch).squeeze(0), torch.FloatTensor(regressors.iloc[i+1,[1]]).to(device).unsqueeze(0)),1))
        batch = torch.cat((batch, pred_list[-1].unsqueeze(0)),1)
        batch = batch[:, 1:, :]
    model = model.train()
    return np.array([pred_list[j].cpu().detach().numpy() for j in range(n_input)])[:,:, 0]
I have a dataframe containing numerical daily data and a target variable ("Score") in the last column that I am trying to predict. The code below seems to work but I would like to visualise the results of the model fit while the model is calibrating against the actual data in the training set.
All variables are time series, so they are ordered in time. The plot I managed to produce shows the actual time series of the target variable (in the training period) correctly, but the predicted values do not come out as expected.
When I plot them, the fitted values don't respect the time ordering of the actual data, which seems to be caused by the shuffling at an earlier stage.
How can I recover the time ordering of the fitted data at each iteration so that I can compare against the actual target variable while the model calibrates?
#allData is a dataframe containing the target variable in last column
y = 'Score' #name of the target variable to predict
target = allData[y].shift(1).dropna() #shift by 1 days as I want to predict the future score
X_ = allData.drop([y], axis=1) #all features
df = pd.concat([X_, target], join='outer', axis=1).dropna() #put them all back in a dataframe
df_train = df['2015':'2018']
df_test = df[prediction[0] : prediction[1]]
#scale variables
scaler = MinMaxScaler(feature_range=(-1, 1))
train= scaler.fit_transform(df_train.values)
test = scaler.transform(df_test.values)
x_train = train[:, :-1]
y_train = train[:, -1]
x_test = test[:, :-1]
num_features = x_train.shape[1]
x = tf.placeholder(dtype=tf.float32, shape=[None, num_features])
y_ = tf.placeholder(dtype=tf.float32, shape=[None])
nl_1, nl_2, nl_3, nl_4 = 512, 256, 128, 64
wi = tf.contrib.layers.variance_scaling_initializer(mode='FAN_AVG', uniform=True, factor=1)
zi = tf.zeros_initializer()
# 4 Hidden layers
wt_hidden_1 = tf.Variable(wi([num_features, nl_1]))
bias_hidden_1 = tf.Variable(zi([nl_1]))
wt_hidden_2 = tf.Variable(wi([nl_1, nl_2]))
bias_hidden_2 = tf.Variable(zi([nl_2]))
wt_hidden_3 = tf.Variable(wi([nl_2, nl_3]))
bias_hidden_3 = tf.Variable(zi([nl_3]))
wt_hidden_4 = tf.Variable(wi([nl_3, nl_4]))
bias_hidden_4 = tf.Variable(zi([nl_4]))
# Output layer
wt_out = tf.Variable(wi([nl_4, 1]))
bias_out = tf.Variable(zi([1]))
hidden_1 = tf.nn.relu(tf.add(tf.matmul(x, wt_hidden_1), bias_hidden_1))
hidden_2 = tf.nn.relu(tf.add(tf.matmul(hidden_1, wt_hidden_2), bias_hidden_2))
hidden_3 = tf.nn.relu(tf.add(tf.matmul(hidden_2, wt_hidden_3), bias_hidden_3))
hidden_4 = tf.nn.relu(tf.add(tf.matmul(hidden_3, wt_hidden_4), bias_hidden_4))
out = tf.transpose(tf.add(tf.matmul(hidden_4, wt_out), bias_out))
mse = tf.reduce_mean(tf.squared_difference(out, y_))
optimizer = tf.train.AdamOptimizer().minimize(mse)
session = tf.InteractiveSession()
session.run(tf.global_variables_initializer())
BATCH_SIZE = 100
EPOCHS = 100
for epoch in range(EPOCHS):
    # Shuffle the training data
    shuffle_data = permutation(arange(len(y_train)))
    x_train = x_train[shuffle_data]
    y_train = y_train[shuffle_data]
    # Mini-batch training
    for i in range(len(y_train)//BATCH_SIZE):
        start = i*BATCH_SIZE
        batch_x = x_train[start:start+BATCH_SIZE]
        batch_y = y_train[start:start+BATCH_SIZE]
        session.run(optimizer, feed_dict={x: batch_x, y_: batch_y})
        # Show plot of fitted model against actual data
        if np.mod(i, 5) == 0:
            pred = session.run(out, feed_dict={x: x_train}) #x_train is scaled
            dd = train.copy()
            dd[:, -1] = pred[0]
            pred = scaler.inverse_transform(dd) #need to rescale in order to compare with actual data
            fig = plt.figure()
            ax1 = fig.add_subplot(111)
            line1, = ax1.plot(df_train[y].values) #the actual data in the training period
            line2, = ax1.plot(pred[:, -1][::-1]) #the fitted data in the training period don't seem to be ordered in time, like the original data
            plt.title('Epoch ' + str(epoch) + ', Batch ' + str(i))
            plt.show()
            plt.pause(0.01)
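One possible way to keep the time ordering, given that the loop above overwrites x_train and y_train with their shuffled versions: shuffle an array of row indices and leave the training arrays untouched. This is only a NumPy sketch on made-up toy arrays, with the optimizer step left as a comment; it is an assumption about the cause, not a tested fix for the model above.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy stand-ins for the scaled training arrays, kept in time order
x_train = np.arange(10, dtype=float).reshape(10, 1)
y_train = np.arange(10, dtype=float)

BATCH_SIZE = 4
for epoch in range(2):
    # shuffle an array of row indices, not the arrays themselves
    order = rng.permutation(len(y_train))
    for start in range(0, len(order) - BATCH_SIZE + 1, BATCH_SIZE):
        idx = order[start:start + BATCH_SIZE]
        batch_x, batch_y = x_train[idx], y_train[idx]  # shuffled mini-batch
        # ... run the optimizer on batch_x, batch_y here ...

# x_train is untouched, so predictions on it line up with the dates
print(x_train[:, 0])
```

With this layout, a prediction on x_train inside the loop comes back in the original time order and can be plotted directly against the actual target, without reversing or re-sorting.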
I have a single-input, multi-output neural network model whose last layers are
out1 = Dense(168, activation = 'softmax')(dense)
out2 = Dense(11, activation = 'softmax')(dense)
out3 = Dense(7, activation = 'softmax')(dense)
model = Model(inputs=inputs, outputs=[out1,out2,out3])
The Y labels for each image are as follows
train
>>
image_id class_1 class_2 class_3
0 Train_0 15 9 5
1 Train_1 159 0 0
...
...
...
453651 Train_453651 0 15 34
453652 Train_453652 18 0 7
EDIT:
train.iloc[:,1:4].nunique()
>>
class_1 168
class_2 11
class_3 7
dtype: int64
Given these different numbers of classes, should I use categorical_crossentropy or sparse_categorical_crossentropy? And how should I pass the Y labels to flow in the code below?
imgs_arr = df.iloc[:,1:].values.reshape(df.shape[0],137,236,1)
# 32332 columns representing pixels of 137*236 and single channel images.
# converting it to (samples,w,h,c) format
Y = train.iloc[:,1:].values #need help from here
image_data_gen = ImageDataGenerator(validation_split=0.25)
train_gen = image_data_gen.flow(x=imgs_arr, y=Y, batch_size=32,subset='training')
valid_gen = image_data_gen.flow(x=imgs_arr,y=Y,subset='validation')
Is this the right way to pass Y, or should I use Y=[y1,y2,y3], where
y1=train.iloc[:,1].values
y2=train.iloc[:,2].values
y3=train.iloc[:,3].values
Ouch....
Judging by the message you get from flow, you will need a single output, so the separation has to happen inside your model. (Keras failed to follow its own standards there.)
This means something like:
Y = train.iloc[:,1:].values #shape = (50210, 3)
With a single output like:
out = Dense(168+11+7, activation='linear')(dense)
And a loss function that handles the separation:
def custom_loss(y_true, y_pred):
    true1 = y_true[:,0:1]
    true2 = y_true[:,1:2]
    true3 = y_true[:,2:3]
    out1 = y_pred[:,0:168]
    out2 = y_pred[:,168:168+11]
    out3 = y_pred[:,168+11:]
    out1 = K.softmax(out1, axis=-1)
    out2 = K.softmax(out2, axis=-1)
    out3 = K.softmax(out3, axis=-1)
    loss1 = K.sparse_categorical_crossentropy(true1, out1, from_logits=False, axis=-1)
    loss2 = K.sparse_categorical_crossentropy(true2, out2, from_logits=False, axis=-1)
    loss3 = K.sparse_categorical_crossentropy(true3, out3, from_logits=False, axis=-1)
    return loss1+loss2+loss3
Compile the model with loss=custom_loss.
Then flow should stop complaining when you call it.
Just make sure X and Y are in exactly the same order: imgs_arr[i] must correspond to Y[i].
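To see what the combined loss computes, the slicing and summing can be reproduced outside Keras. This is a NumPy-only sketch, not the Keras implementation: the helper names are made up, and the class counts 168, 11 and 7 are taken from the question.

```python
import numpy as np

SIZES = (168, 11, 7)  # number of classes per head, from the question

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def combined_sparse_ce(y_true, y_pred):
    """y_true: (batch, 3) integer labels; y_pred: (batch, 186) raw scores."""
    total = np.zeros(len(y_true))
    start = 0
    for head, size in enumerate(SIZES):
        probs = softmax(y_pred[:, start:start + size])  # one head's slice
        labels = y_true[:, head].astype(int)
        # sparse cross-entropy: -log of the probability of the true class
        total += -np.log(probs[np.arange(len(labels)), labels])
        start += size
    return total

y_true = np.array([[15, 9, 5], [159, 0, 0]])
y_pred = np.zeros((2, sum(SIZES)))  # all-zero scores give uniform softmax
print(combined_sparse_ce(y_true, y_pred))
```

With all-zero scores every head predicts a uniform distribution, so each sample's loss is log(168) + log(11) + log(7), which is a quick sanity check that the slices cover the right columns.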
Another workaround is:
Make an array of tuples, then pass it to the ImageDataGenerator flow method.
Write a generator that wraps the iterator from the previous step and converts each array of tuples back into a list of arrays.
Here are the functions that implement the steps above:
def make_array_of_tuple(tuple_of_arrays):
    array_0 = tuple_of_arrays[0]
    # np.object was removed in NumPy 1.24; the builtin object is equivalent
    array_of_tuple = np.empty(array_0.shape[0], dtype=object)
    for i, tuple_of_array_elements in enumerate(zip(*tuple_of_arrays)):
        array_of_tuple[i] = tuple_of_array_elements
    return array_of_tuple
def convert_to_list_of_arrays(array_of_tuple):
    array_length = array_of_tuple.shape[0]
    tuple_length = len(array_of_tuple[0])
    array_list = [
        np.empty(array_length, dtype=np.uint8) for i in range(tuple_length) ]
    for i, array_element_tuple in enumerate(array_of_tuple):
        for array, tuple_element in zip(array_list, array_element_tuple):
            array[i] = tuple_element
    return array_list
def tuple_of_arrays_flow(original_flow):
    while True:
        (X, array_of_tuple) = next(original_flow)
        list_of_arrays = convert_to_list_of_arrays(array_of_tuple)
        yield X, list_of_arrays
To call the ImageDataGenerator flow() method and get the flow used for the model:
y_train = make_array_of_tuple((y_train_1, y_train_2, y_train_3))
orig_image_flow = train_image_generator.flow(X_train, y=y_train)
train_image_flow = tuple_of_arrays_flow(orig_image_flow)
y_train has the same length as X_train, so it should be accepted. train_image_flow returns a list of arrays, which should be accepted by the Keras multi-output model.
ADDED (2019/01/26)
Another idea, simpler than the one above:
Pass an array of indices (0, 1, 2, ...) to ImageDataGenerator.flow().
In the iterator, use the indices returned by the original flow to select the matching elements from each output array.
Here is the implementation:
def make_multi_output_flow(image_gen, X, y_list, batch_size):
    y_item_0 = y_list[0]
    y_indices = np.arange(y_item_0.shape[0])
    orig_flow = image_gen.flow(X, y=y_indices, batch_size=batch_size)
    while True:
        (X, y_next_i) = next(orig_flow)
        y_next = [ y_item[y_next_i] for y_item in y_list ]
        yield X, y_next
Here is an example of calling the function above:
y_train = [y_train_1, y_train_2, y_train_3]
multi_output_flow = make_multi_output_flow(
image_data_generator, X_train, y_train, batch_size)
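The index trick can be checked without Keras. In the sketch below, toy_flow is a hypothetical stand-in for ImageDataGenerator.flow() that only shuffles and batches; it is not the Keras API, and the data is made up so that each sample's labels can be verified against its value.

```python
import numpy as np

def toy_flow(X, y, batch_size, seed=0):
    """Hypothetical stand-in for ImageDataGenerator.flow(): shuffles X and a
    single y array together and yields (X_batch, y_batch) forever."""
    rng = np.random.default_rng(seed)
    while True:
        order = rng.permutation(len(X))
        for start in range(0, len(X), batch_size):
            idx = order[start:start + batch_size]
            yield X[idx], y[idx]

def multi_output_batches(X, y_list, batch_size):
    # feed row indices through the flow in place of real labels
    indices = np.arange(len(X))
    for X_batch, idx_batch in toy_flow(X, indices, batch_size):
        # look up every output's labels with the indices that came back
        yield X_batch, [y[idx_batch] for y in y_list]

X = np.arange(6).reshape(6, 1) * 10          # sample k holds the value 10*k
y_list = [np.arange(6), np.arange(6) + 100]  # labels k and 100+k
X_batch, (y1, y2) = next(multi_output_batches(X, y_list, batch_size=4))
print(X_batch[:, 0], y1, y2)
```

Because the indices travel through the same shuffle as the samples, every output's labels stay matched to the right sample regardless of the order the flow produces.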
I've stored the intercept, AR and MA coefficients of an ARIMA model from the statsmodels package:
x = df_sku
x_train = x['Weekly_Volume_Sales']
x_train_log = np.log(x_train)
x_train_log[x_train_log == -np.inf] = 0
x_train_mat = x_train_log.as_matrix()
model = ARIMA(x_train_mat, order=(1,1,1))
model_fit = model.fit(disp=0)
res = model_fit.predict(start=1, end=137, exog=None, dynamic=False)
print(res)
params = model_fit.params
But I can't find anything in the statsmodels documentation that lets me refit the model parameters onto a set of new data and predict N steps.
Has anyone been able to refit the model and predict out-of-time samples?
I'm trying to accomplish something similar to R:
# Refit the old model with testData
new_model <- Arima(as.ts(testData.zoo), model = old_model)
Here is some code you can use:
def ARIMAForecasting(data, best_pdq, start_params, step):
    model = ARIMA(data, order=best_pdq)
    model_fit = model.fit(start_params = start_params)
    prediction = model_fit.forecast(steps=step)[0]
    #This returns only last step
    return prediction[-1], model_fit.params
#Get the starting parameters on train data
best_pdq = (3,1,3) #It is fixed, but you can search for the best parameters
model = ARIMA(train_data, best_pdq)
model_fit = model.fit()
start_params = model_fit.params
data = list(train_data) #a list, so new observations can be appended
step = 1 #forecast horizon (assumed: one step ahead)
predictions = list()
for t in range(len(test_data)):
    #refit on all data seen so far, starting from the train parameters
    prediction, _ = ARIMAForecasting(data, best_pdq, start_params, step)
    predictions.append(prediction)
    #add the observed test value before the next refit
    data.append(test_data[t])
#Afterwards you can compare test_data with predictions
You can check the details here:
https://www.statsmodels.org/dev/generated/statsmodels.tsa.arima_model.ARIMA.fit.html#statsmodels.tsa.arima_model.ARIMA.fit
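If the aim is only to reuse stored coefficients on new data without re-estimating them, the forecast recursion can also be written by hand. The sketch below covers the simpler ARIMA(1,1,0) case (no MA term, so no residuals have to be reconstructed) and does not use statsmodels. It assumes the differenced series follows d_t - mu = phi * (d_{t-1} - mu) + noise; the function name and the numbers are made up for illustration.

```python
import numpy as np

def forecast_arima110(series, phi, mu, steps):
    """Forecast `steps` values from fixed ARIMA(1,1,0) coefficients.

    phi: AR(1) coefficient of the differenced series
    mu:  mean of the differenced series (the drift)
    """
    series = np.asarray(series, dtype=float)
    last_level = series[-1]
    last_diff = series[-1] - series[-2]
    forecasts = []
    for _ in range(steps):
        # expected next difference given the previous one
        last_diff = mu + phi * (last_diff - mu)
        # undo the differencing by accumulating onto the last level
        last_level = last_level + last_diff
        forecasts.append(last_level)
    return np.array(forecasts)

new_data = [1.0, 2.0, 4.0]  # made-up new observations
print(forecast_arima110(new_data, phi=0.5, mu=0.0, steps=3))
```

Adding an MA term, as in the ARIMA(1,1,1) model from the question, would also require the in-sample residuals on the new data, which is why a library routine is usually the safer route there.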
Great question. I found this example: https://alkaline-ml.com/pmdarima/develop/auto_examples/arima/example_add_new_samples.html
Briefly:
import pmdarima as pmd
...
### split data as train/test:
train, test = ...
### fit initial model on `train` data:
arima = pmd.auto_arima(train)
...
### update initial fit with `test` data:
arima.update(test)
...
### create forecast using updated fit for N steps:
new_preds = arima.predict(n_periods=10)