I'm very new to Keras and machine learning in general, and am training a model like so:
history = model.fit_generator(flight_generator(train_files_train, 4), steps_per_epoch=500, epochs=50)
where flight_generator is a function that prepares and formats the training data, then yields it back to the model to fit. This works great, so now I want to add some validation, but after much looking online I still don't know how to implement it.
My best guess would be something like:
history = model.fit_generator(flight_generator(train_files_train, 4), steps_per_epoch=500, epochs=50, validation_data=flight_generator(train_files_cv, 4))
But when I run the code it just freezes in the first epoch. What am I missing?
EDIT:
Code for flight_generator:
def flight_generator(files, batch_size):
    # create_dataset and look_back are defined elsewhere in the script
    while True:
        batch_inputs = numpy.random.choice(a=files, size=batch_size)
        batch_input_X = []
        batch_input_Y = []
        c = 0
        for batch_input in batch_inputs:
            # reshape into X=t and Y=t+1
            trainX, trainY = create_dataset(batch_input, look_back)
            # reshape input to be [samples, time steps, features]
            trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
            if c == 0:
                batch_input_X = trainX
                batch_input_Y = trainY
            else:
                batch_input_X = numpy.concatenate((batch_input_X, trainX), axis=0)
                batch_input_Y = numpy.concatenate((batch_input_Y, trainY), axis=0)
            c += 1
        # Yield a tuple of (inputs, targets) to feed the network
        batch_x = numpy.array(batch_input_X)
        batch_y = numpy.array(batch_input_Y)
        yield (batch_x, batch_y)
Your validation_data should be in the format of a tuple, so try changing it to:
history = model.fit_generator(flight_generator(train_files_train, 4), steps_per_epoch=500, epochs=50, validation_data=(flight_generator(train_files_cv, 4)))
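Alternatively, if you keep validation_data as a generator, fit_generator also needs validation_steps so it knows how many validation batches to draw per epoch. A minimal sketch, reusing the question's generators (the value 50 is just a placeholder):
gen_train = flight_generator(train_files_train, 4)
gen_val = flight_generator(train_files_cv, 4)
history = model.fit_generator(
    gen_train,
    steps_per_epoch=500,
    epochs=50,
    validation_data=gen_val,
    validation_steps=50  # placeholder: number of validation batches per epoch
)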
I guess you should be using model.fit(...) instead.
Do not use a generator unless you actually need one; in most of the code I have seen, model.fit() does the magic.
Please refer to the Keras documentation for fit():
https://keras.io/api/models/sequential/
Also, make sure you specify the optimizer and the metrics when compiling the model.
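For what it's worth, a minimal sketch of that suggestion, assuming the data has been assembled into NumPy arrays ahead of time (the names x_train/y_train and the loss/metrics are placeholders, not taken from the question):
# Hypothetical sketch: x_train/y_train stand for the question's data assembled
# into arrays instead of being produced by a generator.
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
history = model.fit(
    x_train, y_train,
    batch_size=32,
    epochs=50,
    validation_split=0.2  # hold out 20% of the arrays for validation
)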
Related
I am trying to feed stock data to Conv2D but ran into a dimension problem. I have no idea how to solve it and need help. Below are the detailed steps that I have implemented.
I have attached data and code in the following link:
https://drive.google.com/drive/folders/1snsQ-96AeRn521oc0aQVlTTd9nHbtyjO?usp=sharing
The code itself should run; it will download the data automatically. I've taken out some features to simplify the run, so the attached code will have 5 features.
To give you a quick glance of the problem I had:
1. Got stock data and generated some features; it looks like this:
2. Add time step to it by using:
def reshape_data(X, y, period=28):
    n_past = period  # number of days to look back in the past and compile into a time series
    trainX = []
    trainY = np.array(y.iloc[n_past:])
    trainY = trainY[..., np.newaxis]
    for i in range(n_past, len(X)):
        trainX.append(X[i - n_past:i, 0:X.shape[1]])
    trainX = np.array(trainX)
    return trainX, trainY
Note: the data can be found here:
https://drive.google.com/drive/folders/1snsQ-96AeRn521oc0aQVlTTd9nHbtyjO?usp=sharing
I have applied PCA to it, but simply converting the data to NumPy and applying reshape_data() should work:
trainX, trainY = reshape_data(X_train_pca, y_train, period=30)
3. Shapes:
trainX: (5768, 30, 30)  # 5768 rows, 30 time steps, 30 features
trainY: (5768, 1)
4. Add one axis to trainX:
trainX = trainX[...,np.newaxis]
trainX is now (5768, 30, 30, 1)
5. Build model
6. fit and run
model.compile(optimizer=Adam(learning_rate=0.01) , metrics="mse", loss='binary_crossentropy')
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss',factor=0.5,patience=10,verbose=0,mode='auto',min_delta=0.0002,cooldown=0,min_lr=0.0001)
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=80, mode="min", restore_best_weights = True)
history = model.fit(trainX, trainY, epochs=300,
                    batch_size=512, shuffle=False, verbose=1,
                    # validation_data=(testX, testY),
                    validation_split=0.2,
                    callbacks=[early_stop, reduce_lr])
7. ERROR
I thought that since I have converted the stock data into shape (30, 30, 1) it should look like an image dataset, which would enable TensorFlow to work, but somehow it doesn't.
Add two layers after your convolution layer:
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
Also, do not mix tensorflow.keras and keras; just use tensorflow.keras for everything.
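For reference, a minimal end-to-end sketch along those lines: the Conv2D/MaxPooling sizes here are made up, only the Flatten and final Dense(1, sigmoid) come from the suggestion above, and the input shape matches the question's (30, 30, 1) samples.
import tensorflow as tf

# Minimal sketch of a model matching trainX of shape (5768, 30, 30, 1) and a
# binary_crossentropy loss; the convolution filter/kernel sizes are hypothetical.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(30, 30, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),                      # collapse the feature maps to a vector
    tf.keras.layers.Dense(1, activation='sigmoid')  # single binary output
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss='binary_crossentropy', metrics=['accuracy'])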
I am trying to replace my FFNN with an LSTM layer. As input I get 360 lidar data points and 4 additional values for distance etc. The algorithm shall learn to navigate a robot through its environment. With the FFNN it works absolutely fine, and for the LSTM I started like this:
# collected data for RL
scan_range = [] #filled with .append, length=360
state = scan_range + [heading, current_distance, obstacle_min_range, obstacle_angle]
return np.asarray(state)
Based on that data, there will be some analysis for the next state, if the goal is achieved etc. The data will be stored in memory:
agent.appendMemory(state, action, reward, next_state, done)
which does:
self.memory.append((state, action, reward, next_state, done))
The action and reward are plain numbers, and next_state is again an array.
Next, I build up the neural network with the LSTM Layer
model = Sequential()
model.add(SimpleRNN(64, input_shape=(1,364)))
model.add(Dense(self.action_size, kernel_initializer='lecun_uniform'))
model.add(Activation('linear'))
model.compile(loss='mse', optimizer=RMSprop(lr=self.learning_rate, rho=0.9, epsilon=1e-06))
model.summary()
Everything is then trained using a minibatch, as follows for the FFNN:
def trainModel(self, target=False):
    mini_batch = random.sample(self.memory, self.batch_size)
    X_batch = np.empty((0, self.state_size), dtype=np.float64)
    Y_batch = np.empty((0, self.action_size), dtype=np.float64)

    for i in range(self.batch_size):
        states = mini_batch[i][0]
        actions = mini_batch[i][1]
        rewards = mini_batch[i][2]
        next_states = mini_batch[i][3]
        dones = mini_batch[i][4]

        q_value = self.model.predict(states.reshape((1, len(states))))
        self.q_value = q_value

        if target:
            next_target = self.target_model.predict(next_states.reshape((1, len(next_states))))
        else:
            next_target = self.model.predict(next_states.reshape((1, len(next_states))))

        next_q_value = self.getQvalue(rewards, next_target, dones)

        X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)
        Y_sample = q_value.copy()
        Y_sample[0][actions] = next_q_value
        Y_batch = np.append(Y_batch, np.array([Y_sample[0]]), axis=0)

        if dones:
            X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
            Y_batch = np.append(Y_batch, np.array([[rewards] * self.action_size]), axis=0)

    print(X_batch.shape)
    print(Y_batch.shape)

    self.model.fit(X_batch, Y_batch, batch_size=self.batch_size, epochs=1, verbose=0)
When I don't change the code, I of course get the dimension error: expected simple_rnn_1_input to have 3 dimensions, but got array with shape (1, 364), because the input is still two-dimensional and the LSTM needs three dimensions. I then tried to add the third dimension manually just to see if everything works:
mini_batch = random.sample(self.memory, self.batch_size)
X_batch = np.empty((0, self.state_size), dtype=np.float64)
Y_batch = np.empty((0, self.action_size), dtype=np.float64)
Z_batch = np.empty((0, 1), dtype=np.float64)

for i in range(self.batch_size):
    states = mini_batch[i][0]
    actions = mini_batch[i][1]
    rewards = mini_batch[i][2]
    next_states = mini_batch[i][3]
    dones = mini_batch[i][4]

    q_value = self.model.predict(states.reshape((1, len(states))))
    self.q_value = q_value

    if target:
        next_target = self.target_model.predict(next_states.reshape((1, 1, len(next_states))))
    else:
        next_target = self.model.predict(next_states.reshape((1, 1, len(next_states))))

    next_q_value = self.getQvalue(rewards, next_target, dones)

    X_batch = np.append(X_batch, np.array([states.copy()]), axis=0)
    Y_sample = q_value.copy()
    Y_sample[0][actions] = next_q_value
    Y_batch = np.append(Y_batch, np.array([Y_sample[0]]), axis=0)
    Z_batch = np.append(Z_batch, np.array([[1]]), axis=0)

    if dones:
        X_batch = np.append(X_batch, np.array([next_states.copy()]), axis=0)
        Y_batch = np.append(Y_batch, np.array([[rewards] * self.action_size]), axis=0)
        Z_batch = np.append(Z_batch, np.array([[1]]), axis=0)

self.model.fit(X_batch, Y_batch, Z_batch, batch_size=self.batch_size, epochs=1, verbose=0)
When I do this, .fit() gives the following error: TypeError: fit() got multiple values for keyword argument 'batch_size'
My question now is whether .fit() is suited for the LSTM setup in this case. In the documentation, only x and y are given; a third array seems useless here, but the LSTM still needs three-dimensional input.
My other question is: if I want to use the LSTM properly and not with dummy values, do I have to use more than the current state?
Could I, for example, just append the last 10 states together so that states.shape = (10, 1, 364)? Is that a good timestep range, or should it be longer?
Kind regards!
I believe your basic issue is that the 3rd dimension needs to be added to X_batch, not passed as another argument to model.fit.
In particular, Keras models don't usually specify the "batch"/"sample" dimension in the model layers; it is automatically inferred from the shape of the X_batch input data. In your case, you have a SimpleRNN with input_shape=(1,364) as the first layer. Keras interprets this to mean that the input data X_batch should have a shape like this:
(num_samples, 1, 364).
Also, if you want to create a sequence of timesteps, you would provide X_batch with the following shape:
(num_samples, num_timesteps, 364) or something similar.
This page has some good discussion: https://keras.io/getting-started/sequential-model-guide/ For example, search for "Stacked LSTM for sequence classification" to help illustrate (although be careful with return_sequences=True; for a single LSTM, you probably want return_sequences=False).
I hope this helps.
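To make that concrete, here is a minimal sketch of building X_batch with the extra timestep axis; state_size=364 and a single timestep are assumed, and the random vector just stands in for a state pulled from memory:
import numpy as np

state_size = 364
X_batch = np.empty((0, 1, state_size), dtype=np.float64)

# inside the minibatch loop: add the timestep axis when appending each state
states = np.random.rand(state_size)  # stand-in for mini_batch[i][0]
X_batch = np.append(X_batch, states.reshape((1, 1, state_size)), axis=0)

# predictions then use the same 3-D layout:
# q_value = self.model.predict(states.reshape((1, 1, state_size)))

print(X_batch.shape)  # grows to (num_samples, 1, 364) over the loop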
Keras fit() and fit_generator() give different results. I implemented both methods keeping all the other parameters the same. I have attached my data generator and model below. The model is taken from this site: https://machinelearningmastery.com/
In the data generator, I load the data from the hard drive. Each X_train file contains a matrix of size (3, 1), so if the batch size is 2, the size of X_batch will be (2, 3, 1).
def generator(list_xtrain, list_ytrain, batch_size):
    samples_per_epoch = len(list_xtrain)
    number_of_batches = samples_per_epoch / batch_size
    counter = 0
    X_batch = np.empty((batch_size, 3, 1))
    y_batch = np.empty((batch_size))
    while 1:
        temp_listx = list_xtrain[batch_size*counter:batch_size*(counter+1)]
        temp_listy = list_ytrain[batch_size*counter:batch_size*(counter+1)]
        for i, ID in enumerate(temp_listx):
            X_batch[i,] = np.load('F:/Air_passenger_data_gen/' + ID)
        for j, ID in enumerate(temp_listy):
            # Store class
            y_batch[j] = np.load('F:/Air_passenger_data_gen/' + ID)
        counter += 1
        yield X_batch, y_batch
        # restart counter to yield data in the next epoch as well
        if counter >= number_of_batches:
            counter = 0
# using fit_generator()
batch_size = 2
model.fit_generator(generator=generator(list_xtrain, list_ytrain, batch_size),
                    epochs=100,
                    steps_per_epoch=len(list_xtrain)/batch_size,
                    verbose=2,
                    use_multiprocessing=False,
                    workers=4)
#using fit()
model.fit(trainX, trainY, epochs=100, batch_size=2)
I expect the output to be the same as that from fit(), but fit_generator() gives a crazy value of loss=41781.00, whereas with fit() the loss is 0.0020.
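One way to narrow down a discrepancy like this is to pull a few batches from the generator and compare them against the arrays passed to fit(). A minimal debugging sketch, assuming the same list_xtrain/list_ytrain and trainX/trainY from the question, and that the files are stored in the same order as the arrays:
import numpy as np

# Hypothetical debugging sketch: draw the first few batches and check that they
# match the corresponding slices of the arrays used by model.fit().
gen = generator(list_xtrain, list_ytrain, batch_size=2)
for b in range(3):
    X_batch, y_batch = next(gen)
    print(b, X_batch.shape, y_batch.shape)
    print(np.allclose(X_batch, trainX[2 * b:2 * (b + 1)]),
          np.allclose(y_batch, np.ravel(trainY[2 * b:2 * (b + 1)])))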
I have transformed an image database into two TFRecords, one for training and the other for validation. I want to train a simple model with keras using these two files for data input but I obtain an error I can't understand related to the shape of the data.
Here is the code (all-capital variables are defined elsewhere):
def _parse_function(proto):
    f = {
        "x": tf.FixedLenSequenceFeature([IMG_SIZE[0] * IMG_SIZE[1]], tf.float32, default_value=0., allow_missing=True),
        "label": tf.FixedLenSequenceFeature([1], tf.int64, default_value=0, allow_missing=True)
    }
    parsed_features = tf.parse_single_example(proto, f)
    x = tf.reshape(parsed_features['x'] / 255, (IMG_SIZE[0], IMG_SIZE[1], 1))
    y = tf.cast(parsed_features['label'], tf.float32)
    return x, y

def load_dataset(input_path, batch_size, shuffle_buffer):
    dataset = tf.data.TFRecordDataset(input_path)
    dataset = dataset.shuffle(shuffle_buffer).repeat()  # shuffle and repeat
    dataset = dataset.map(_parse_function, num_parallel_calls=16)
    dataset = dataset.batch(batch_size).prefetch(1)  # batch and prefetch
    return dataset.make_one_shot_iterator()
train_iterator = load_dataset(TRAIN_TFRECORDS, BATCH_SIZE, SHUFFLE_BUFFER)
val_iterator = load_dataset(VALIDATION_TFRECORDS, BATCH_SIZE, SHUFFLE_BUFFER)

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(IMG_SIZE[0], IMG_SIZE[1], 1)))
model.add(tf.keras.layers.Dense(1, 'sigmoid'))

model.compile(
    optimizer=tf.train.AdamOptimizer(),
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.fit(
    train_iterator,
    epochs=N_EPOCHS,
    steps_per_epoch=N_TRAIN // BATCH_SIZE,
    validation_data=val_iterator,
    validation_steps=N_VALIDATION // BATCH_SIZE
)
And here is the error I obtain:
tensorflow.python.framework.errors_impl.InvalidArgumentError: data[0].shape = [3] does not start with indices[0].shape = [2]
[[Node: training/TFOptimizer/gradients/loss/dense_loss/Mean_grad/DynamicStitch = DynamicStitch[N=2, T=DT_INT32, _class=["loc:#training/TFOptimizer/gradients/loss/dense_loss/Mean_grad/floordiv"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](training/TFOptimizer/gradients/loss/dense_loss/Mean_grad/range, training/TFOptimizer/gradients/loss/dense_loss/Mean_3_grad/Maximum, training/TFOptimizer/gradients/loss/dense_loss/Mean_grad/Shape/_35, training/TFOptimizer/gradients/loss/dense_loss/Mean_3_grad/Maximum/_41)]]
(I know that the model defined here is not a good model for image analysis, I just took the simplest possible architecture that reproduces the error)
Change:
"label": tf.FixedLenSequenceFeature([1]...
into:
"label": tf.FixedLenSequenceFeature([]...
This is unfortunately not explained in the documentation on the website, but some explanation can be found in the docstring of FixedLenSequenceFeature on github. Basically, if your data consists of a single dimension (+ a batch dimension), you don't need to specify it.
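Applied to the question's _parse_function, the corrected feature dict would look like this (a sketch keeping the question's IMG_SIZE and TF1-style API; only the "label" shape changes):
f = {
    "x": tf.FixedLenSequenceFeature([IMG_SIZE[0] * IMG_SIZE[1]], tf.float32,
                                    default_value=0., allow_missing=True),
    "label": tf.FixedLenSequenceFeature([], tf.int64,
                                        default_value=0, allow_missing=True)
}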
You have forgotten this line from the example:
parsed_features = tf.parse_single_example(proto, f)
Add it to _parse_function.
Also, you can return just the dataset object; Keras supports iterators as well as instances of tf.data.Dataset. It also looks a bit odd to shuffle and repeat first and only then parse the tf.Examples. Here is example code that works for me:
def dataset(filenames, batch_size, img_height, img_width, is_training=False):
    decoder = TfExampleDecoder()

    def preprocess(image, boxes, classes):
        image = preprocess_image(image, resize_height=img_height, resize_width=img_width)
        return image, groundtruth

    ds = tf.data.TFRecordDataset(filenames)
    ds = ds.map(decoder.decode, num_parallel_calls=8)
    if is_training:
        ds = ds.shuffle(1000 + 3 * batch_size)
    ds = ds.apply(tf.contrib.data.map_and_batch(map_func=preprocess, batch_size=batch_size, num_parallel_calls=8))
    ds = ds.repeat()
    ds = ds.prefetch(buffer_size=batch_size)
    return ds
train_dataset = dataset(args.train_data, args.batch_size,
                        args.img_height, args.img_width,
                        is_training=True)

model.fit(train_dataset,
          steps_per_epoch=args.steps_per_epoch,
          epochs=args.max_epochs,
          callbacks=callbacks,
          initial_epoch=0)
This seems like an issue with your data or preprocessing pipeline rather than with Keras. Try inspecting what you are getting out of the dataset with debugging code like:
ds = dataset(args.data, args.img_height, args.img_width, is_training=True)
image_t, classes_t = ds.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    while True:
        image, classes = sess.run([image_t, classes_t])
        # Do something with the data: display, log, etc.
I would like to train an LSTM or GRU network in TensorFlow/Keras to continuously recognize whether a user is walking or not based on input from motion sensors (accelerometer and gyroscope). I have 50 input sequences with lengths varying from 581 to 5629 time steps and 6 features and 50 corresponding output sequences of boolean values. My problem is that I don't know how to feed the training data to the fit() method.
I know approximately what I need to do: I'd like to train with 5 batches of 10 sequences each, and for each batch I have to pad all but the longest sequence so all 10 sequences have the same length, and apply masking. I just don't know how to build the data structures. I know that I can make one big 3D tensor of size (50,5629,6) and that works, but it's painfully slow, so I'd really like to make the sequence length of each batch as small as possible.
Here's the problem in code:
import tensorflow as tf
import numpy as np
# Load data from file
x_list, y_list = loadSequences("train.csv")
# x_list is now a list of arrays (n,6) of float64, where n is the timesteps
# and 6 is the number of features, sorted by increasing sequence lengths.
# y_list is a list of arrays (n,1) of Boolean.
x_train = # WHAT DO I WRITE HERE?
y_train = # AND HERE?
model = tf.keras.models.Sequential([
    tf.keras.layers.Masking(),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(2, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=10, epochs=100)
You can do something like this: use a generator function (take a look at the Keras documentation for the fit_generator method).
def data_generater(batch_size):
    print("reading data")
    training_file = open('data_location', 'r')
    # assuming data is in json format, so feel free to change accordingly
    training_set = json.loads(training_file.read())
    training_file.close()
    batch_i = 0   # Counter inside the current batch vector
    batch_x = []  # The current batch's x data
    batch_y = []  # The current batch's y data
    while True:
        for obj in training_set:
            batch_x.append(obj['sequence'])  # one input sequence per object; the key name 'sequence' is hypothetical
            if obj['val'] == True:
                batch_y.append([1])
            elif obj['val'] == False:
                batch_y.append([0])
            batch_i += 1
            if batch_i == batch_size:
                # Ready to yield the batch
                # pad input to max length in the batch
                batch_x = pad_txt_data(batch_x)
                yield batch_x, np.array(batch_y)
                batch_x = []
                batch_y = []
                batch_i = 0

def pad_txt_data(arr):
    # expecting arr to be in the shape of (10, m, 6)
    padded_arr = []
    preferred_len = len(max(arr, key=len))
    # Pad every sequence with zero rows up to the longest length in the batch (arr)
    for seq in arr:
        seq = np.array(seq)
        pad_rows = preferred_len - len(seq)
        padded_arr.append(np.pad(seq, ((0, pad_rows), (0, 0)), mode='constant'))
    return np.array(padded_arr)
And in the model:
model = keras.Sequential()
model.add(keras.layers.Masking(mask_value=0., input_shape=(None, 6)))
model.add(keras.layers.LSTM(32))
model.add(keras.layers.Dense(1, activation="sigmoid"))
model.compile(optimizer="Adam", loss='binary_crossentropy', metrics=['accuracy'])
model.fit_generator(data_generater(10), steps_per_epoch=5, epochs=10)
batch_size, steps_per_epoch, and epochs can of course be different from these values.
Generally
steps_per_epoch = (number of sequences/batch_size)
Note: from reading your description, your task appears to be a binary classification problem, not a sequence-to-sequence problem. A good example of sequence to sequence is language translation; just google around and you will find what I mean.
And if you really want to see a difference in training time, I suggest using a GPU if available, and CuDNNLSTM.
In case it helps someone, here's how I ended up implementing a solution:
import tensorflow as tf
import numpy as np

# Load data from file
x_list, y_list = loadSequences("train.csv")

# x_list is now a list of arrays (m,n) of float64, where m is the timesteps
# and n is the number of features.
# y_list is a list of arrays (m,1) of Boolean.
assert len(x_list) == len(y_list)
num_sequences = len(x_list)
num_features = len(x_list[0][0])
batch_size = 10
batches_per_epoch = 5
assert batch_size * batches_per_epoch == num_sequences

def train_generator():
    # Sort by length so the number of timesteps in each batch is minimized
    x_list.sort(key=len)
    y_list.sort(key=len)
    # Generate batches
    while True:
        for b in range(batches_per_epoch):
            longest_index = (b + 1) * batch_size - 1
            timesteps = len(x_list[longest_index])
            x_train = np.zeros((batch_size, timesteps, num_features))
            y_train = np.zeros((batch_size, timesteps, 1))
            for i in range(batch_size):
                li = b * batch_size + i
                x_train[i, 0:len(x_list[li]), :] = x_list[li]
                y_train[i, 0:len(y_list[li]), 0] = y_list[li]
            yield x_train, y_train

model = tf.keras.models.Sequential([
    tf.keras.layers.Masking(mask_value=0., input_shape=(None, num_features)),
    tf.keras.layers.LSTM(32, return_sequences=True),
    tf.keras.layers.Dense(2, activation=tf.nn.softmax)
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit_generator(train_generator(), steps_per_epoch=batches_per_epoch, epochs=100)