I am trying to take the features extracted by two fine-tuned VGG16 networks (one per stream) and, for each sequence of 9 data pairs, concatenate their NumPy arrays and feed the resulting sequence of 9 concatenated outputs to a bidirectional LSTM in Keras.
The problem is that I run into an error when trying to build the LSTM part. The following is the generator I wrote to read both the RGB and optical flow streams, extract features, and concatenate each pair:
def generate_generator_multiple(generator, dir1, dir2, batch_rgb, batch_flow, img_height, img_width):
    print("Processing inside generate multiple")
    genX1 = generator.flow_from_directory(dir1,
                                          target_size=(img_height, img_width),
                                          class_mode='categorical',
                                          batch_size=batch_rgb,
                                          shuffle=False)
    genX2 = generator.flow_from_directory(dir2,
                                          target_size=(img_height, img_width),
                                          class_mode='categorical',
                                          batch_size=batch_flow,
                                          shuffle=False)
    while True:
        imgs, labels = next(genX1)
        X1i = RGB_model.predict(imgs, verbose=0)
        imgs2, labels2 = next(genX2)
        X2i = FLOW_model.predict(imgs2, verbose=0)
        Xi = []
        for i in range(9):
            Xi.append(np.concatenate([X1i[i+1], X2i[i]]))
        Xi = np.asarray(Xi)
        if np.array_equal(labels[1:], labels2) == False:
            print("ERROR !! problem of labels matching: RGB and FLOW have different labels")
        yield Xi, labels2[2]
I am expecting the generator to yield a sequence of 9 arrays, so the shape of Xi when I force the loop to run twice is: (9, 14, 7, 512)
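That shape also matches a quick check of the concatenation (the (7, 7, 512) per-frame VGG16 feature-map size is my assumption here, not something printed from the pipeline):

import numpy as np

# Shape check only: concatenating two (7, 7, 512) feature maps along the
# default axis 0 gives (14, 7, 512); stacking 9 of those gives (9, 14, 7, 512).
rgb_feat = np.zeros((7, 7, 512))
flow_feat = np.zeros((7, 7, 512))
pair = np.concatenate([rgb_feat, flow_feat])   # (14, 7, 512)
seq = np.asarray([pair] * 9)                   # (9, 14, 7, 512)
print(pair.shape, seq.shape)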
When I use while True (as in the code above) and call the generator to check what it returns, after 3 iterations I get the error:
ValueError: too many values to unpack (expected 2)
Now, assuming that there is no problem with the generator, I try to feed the data returned by the generator to the bidirectional LSTM like the following:
n_frames = 9
seq = 100
Bi_LSTM = Sequential()
Bi_LSTM.add(Bidirectional(LSTM(seq, return_sequences=True, dropout=0.25, recurrent_dropout=0.1),input_shape=(n_frames,14,7,512)))
Bi_LSTM.add(GlobalMaxPool1D())
Bi_LSTM.add(TimeDistributed(Dense(100, activation="relu")))
Bi_LSTM.add(layers.Dropout(0.25))
Bi_LSTM.add(Dense(4, activation="relu"))
model.compile(Adam(lr=.00001), loss='categorical_crossentropy', metrics=['accuracy'])
But I keep getting the following error (the full error log is a bit long):
InvalidArgumentError: Shape must be rank 4 but is rank 2 for 'bidirectional_2/Tile_1' (op: 'Tile') with input shapes: [?,7,512,1], [2].
It seems to be caused by this line:
Bi_LSTM.add(Bidirectional(LSTM(seq, return_sequences=True, dropout=0.25, recurrent_dropout=0.1),input_shape=(n_frames,14,7,512)))
I am not sure anymore if the problem is the way I try to build the LSTM, the way I return the data from the generator, or the way I define the input of LSTM.
Thanks a lot for any help you can provide.
It seems like this error specifically is caused by the following line:
input_shape=(n_frames,14,7,512)
I was confused about the input for the LSTM. Instead of explicitly giving the full shape of the input, we just need to specify the number of dimensions of the input; in my case this is 3, since the input is a 3D NumPy array. I still have other problems with my code, but for this specific error, the solution is changing that part to:
input_shape=(n_frames,3)
Edit:
When predicting, we need to take the mean of the predictions, since the LSTM expects a 1D input.
Another issue in my code was the shape of Xi. It needs to be reshaped before yielding it so that it matches the input expected by LSTM.
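As an illustration of the kind of reshape meant here (a minimal sketch; flattening each frame's feature map into a vector is one option, not necessarily the only one):

import numpy as np

# Keras recurrent layers expect (batch, timesteps, features), so each
# (14, 7, 512) frame feature map is flattened into one vector and a batch
# axis is added before yielding. Xi here is just a placeholder array.
Xi = np.zeros((9, 14, 7, 512))
Xi = Xi.reshape(Xi.shape[0], -1)   # (9, 50176)
Xi = np.expand_dims(Xi, axis=0)    # (1, 9, 50176)
print(Xi.shape)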
Related
I am new to Stack Overflow as a question asker; I usually just browse for answers here and have never had to ask a question until now. I am building a deep learning network using the tf.data.Dataset API, and the network doesn't seem to be ingesting the dataset correctly.
To set the stage: I am working with a text dataset. I have already broken the text up into tokens, created a dictionary of unique words, and created an embedding matrix to convert the tokens into vectors, and I plan to use tf.data.Dataset to get an easy internal pipeline and to batch large datasets for training.
The 'vect_doc' variable is an array with shape of (35054, 300).
vect_dataset = tf.data.Dataset.from_tensor_slices(vect_doc)
From here I shuffle the dataset so I can break it up into train, test, and validation sets.
vect_data_shuffle = vect_dataset.shuffle(len(proc_doc), reshuffle_each_iteration = False)
train_dataset = vect_data_shuffle.take(train_size)
test_dataset = vect_data_shuffle.skip(train_size)
val_dataset = test_dataset.skip(val_size)
test_dataset = test_dataset.take(test_size)
Then I batch the dataset to create samples of 2*sequence_length + 1 vectors each; I will demonstrate with just the training dataset for simplicity's sake.
train_batch_ds = train_dataset.batch(2*self.sequence_length + 1, drop_remainder=True)
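As a quick sanity check on what that produces (sequence_length = 30 here is an assumption based on the (60, 30, 300) shape reported below), a toy dataset shows the window size:

import tensorflow as tf

# Toy check: batching a (N, 300) dataset into windows of 2*30 + 1 = 61 rows.
sequence_length = 30
toy = tf.data.Dataset.from_tensor_slices(tf.zeros([200, 300]))
windows = toy.batch(2 * sequence_length + 1, drop_remainder=True)
for w in windows.take(1):
    print(w.shape)  # (61, 300)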
Once the dataset has been broken up into batches, I run the following process:
def vect_split_dataset(self, sample):
    dataset_Xy = tf.data.Dataset.from_tensors((sample[:self.sequence_length],
                                               sample[self.sequence_length]))
    for i in range(1, (len(sample) - 1) // 2):
        X_seq_batch = sample[i: i + self.sequence_length]
        y_nxwrd_batch = sample[i + self.sequence_length]
        Xy_samp = tf.data.Dataset.from_tensors((X_seq_batch, y_nxwrd_batch))
        Xy_dataset = dataset_Xy.concatenate(Xy_samp)
    return Xy_dataset
Xy_dataset = train_batch_ds.flat_map(train_set.vect_split_dataset)
Xy_dataset = Xy_dataset.repeat(len(proc_doc)).shuffle(len(proc_doc)).batch(param_dict['batch_size'], drop_remainder=True)
The above Xy_dataset yields elements of shape ((60, 30, 300), (60, 300)). Now that I have created the dataset to pass to my DNN model, this is where the problems start. This is the code I am using to build the model:
LSTM = tf.keras.layers.LSTM(units=self.rnn_units,
                            kernel_initializer=self.initializer,
                            activation=self.activation,
                            recurrent_activation=self.activation_out,
                            return_sequences=True)

for i in range(self.num_layers):
    # Different layers should have different setups as indicated below
    if i == 0:  # Initial layer, also referred to as the input layer
        self.model.add(tf.keras.layers.Embedding(input_dim=self.input_dim,
                                                 input_shape=(self.sequence_length, self.spacy_len),
                                                 output_dim=self.spacy_len,
                                                 input_length=self.batch_size))
    elif i + 1 == self.num_layers:  # Output layer
        self.model.add(tf.keras.layers.Dropout(self.drop_rate))
        self.model.add(tf.keras.layers.Dense(units=self.num_units_out,
                                             kernel_initializer=self.initializer,
                                             activation=self.activation_out))
        self.model.add(tf.keras.layers.Activation(self.activation_out))
    else:  # Hidden layers: basically anything that isn't an input or output layer
        self.model.add(tf.keras.layers.Bidirectional(LSTM))
Basically, the error I keep getting is: 'ValueError: Input 0 of layer bidirectional is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 30, 300, 300]'
I am not sure whether this comes from how I am handling the Embedding layer or something else. When I comment out the Embedding layer and replace it with the Bidirectional layer directly, I get an incompatibility error between the two shapes (60, 300) and (60, 30, 300).
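For what it's worth, a tiny isolated check (with a made-up vocabulary size) of what an Embedding layer does to a batch that is already embedded hints at where the extra dimension comes from:

import tensorflow as tf

# Embedding maps every scalar index to a vector, so a batch of already-embedded
# (60, 30, 300) vectors comes out as (60, 30, 300, 300) -- the 4D shape in the
# error above. input_dim=5000 is only a placeholder vocabulary size.
emb = tf.keras.layers.Embedding(input_dim=5000, output_dim=300)
already_vectors = tf.zeros([60, 30, 300])
print(emb(already_vectors).shape)  # (60, 30, 300, 300)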
My goal is to iterate over the entire dataset in batches of some defined size (60 in this example) for each epoch. When calling model.fit I set the steps per epoch to the length of the entire document minus the sequence length, divided by the batch size: steps_per_epoch = (len(processed_doc) - self.sequence_length) // self.batch_size.
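With placeholder numbers taken from the shapes above (35054 vectors, sequence_length 30, batch_size 60), that works out to:

# Placeholder arithmetic only; the real values come from the processed document.
steps_per_epoch = (35054 - 30) // 60   # = 583 steps per epoch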
I appreciate any comments or guidance that can be provided on fixing this issue.
I'm trying to assign one of two classes (positive/negative) to audio using a CNN with Keras. My model should accept inputs of varied length (in frames), where each frame contains 41 features, but I am struggling with the input size. Bear in mind that I haven't acquired the full dataset yet, so I just mocked up some meaningless data to check whether the network works at all.
According to the documentation (https://keras.io/layers/convolutional/) and my best understanding, Conv1D can handle varied lengths if the first element of the input_shape tuple is None. The shape of the variable containing the input data, X_train.shape, is (4, 497, 41).
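To check my understanding of that behaviour, here is a tiny self-contained sketch with made-up shapes (using tf.keras): every batch fed in is still a 3D (batch, timesteps, features) array, but different batches can have different lengths.

import numpy as np
from tensorflow import keras

# Made-up shapes: input_shape=(None, 41) leaves the time dimension open.
model = keras.models.Sequential([
    keras.layers.Conv1D(100, 5, activation='relu', input_shape=(None, 41)),
    keras.layers.GlobalMaxPooling1D(),
    keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
model.train_on_batch(np.zeros((4, 497, 41)), np.array([1, 0, 1, 0]))  # one length
model.train_on_batch(np.zeros((2, 123, 41)), np.array([0, 1]))        # another length

My actual code is the following: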
data = pd.read_csv('output_file.csv', sep=';')
featureCount = data.values.shape[1]

# mocks because full data is not available yet
Y_train = np.asarray([1, 0, 1, 0])
X_train = np.asarray(
    [np.array(data.values, copy=True), np.array(data.values, copy=True),
     np.array(data.values, copy=True), np.array(data.values, copy=True)])

# variable length with 41 features
model = keras.models.Sequential()
model.add(keras.layers.Conv1D(100, 5, activation='relu', input_shape=(None, featureCount)))
model.add(keras.layers.GlobalMaxPooling1D())
model.add(keras.layers.Dense(10, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
model.fit(X_train, Y_train, epochs=10, verbose=False,
          validation_data=(np.array(data.values, copy=True), [1]))
This code produces the error:
ValueError: Error when checking input: expected conv1d_input to have 3 dimensions, but got array with shape (497, 41)
So it appears that the first dimension was cut off, as it contains the training samples (that part seems correct to me). What bothers me is the required dimensionality: why is it 3?
After searching for an answer I stumbled onto Dimension of shape in conv1D and followed it by adding a last dimension (using X_train = np.expand_dims(X_train, axis=3)) that contains only a single digit, but I ended up with another, similar error:
ValueError: Error when checking input: expected conv1d_input to have 3 dimensions, but got array with shape (4, 497, 41, 1)
Now it seems that the first dimension, which was previously treated as the sample "list", has become part of the actual data.
I also tried fiddling with the input_shape parameter and using a Reshape layer, but to no avail; I just ended up fighting with the sizes.
What should I do to satisfy required shape? How to prepare data for processing?
My input data consists of 10 samples, each of which has 200 time steps, while each time step is described by a vector of 30 dimensions.
In addition, each time step has a 3-dimensional one-hot encoded vector which describes the action taken at that particular time step. With that being said, I am trying to build a model which gets fed all previous actions and then predicts which action would be the best to take next.
I tried to get this working with tflearn and tensorflow but with limited success so far.
Simple sample code:
import numpy as np
import operator
import tflearn
from tflearn import regression
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.embedding_ops import embedding
from tflearn.layers.recurrent import bidirectional_rnn, BasicLSTMCell
from tflearn.data_utils import to_categorical, pad_sequences

SAMPLES = 10
TIME_STEPS = 200
DATA_DIMENSIONS = 30
LABEL_CLASSES = 3

x = []
y = []

# Generate fake data.
for i in range(SAMPLES):
    sequences = []
    outputs = []
    for i in range(TIME_STEPS):
        d = []
        for i in range(DATA_DIMENSIONS):
            d.append(1)
        sequences.append(d)
        outputs.append([0, 0, 1])
    x.append(sequences)
    y.append(outputs)

print("X1:", len(x), ", X2:", len(x[0]), ", X3:", len(x[0][0]))
print("Y1:", len(y), ", Y2:", len(y[0]), ", Y3:", len(y[0][0]))

# Define model
net = tflearn.input_data([None, TIME_STEPS, DATA_DIMENSIONS], name='input')
net = tflearn.lstm(net, 128, dropout=0.8, return_seq=True)
net = tflearn.fully_connected(net, LABEL_CLASSES, activation='softmax')
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy', name='targets')
model = tflearn.DNN(net)

# Fit model.
model.fit({'input': x}, {'targets': y},
          n_epoch=1,
          snapshot_step=1000,
          show_metric=True, run_id='test', batch_size=32)
Error
ValueError: Cannot feed value of shape (10, 200, 3) for Tensor
'targets/Y:0', which has shape '(?, 3)'
As far as I understand, the input_data should be correct. However, the output data is apparently wrong; at least, TensorFlow throws an error. That is probably because my model expects one label per sample rather than one label per time step.
Can I even achieve my goal with an LSTM, and if so, how do I have to set up my model?
Thanks,
Robert
As the error suggests, there is a shape mismatch between the expected size of your targets tensor and that of the data you actually provide for it. Let us break it down.
From what I understand, you have a labeled action for every timestep of your sequences. This means that the labels you provide should have shape (10, 200, 3). This seems to be the case from the error message. Good.
So we now know the error comes from what the network generates.
=================
Input data -> (10, 200, 30)
LSTM -> (10, 128) (because return_seq=False)
FullyConnected -> (10, 3).
=================
So that explains the second part of the error message: your network indeed produces an output with shape (10, 3), which mismatches that of your data.
I think you missed the return_seq argument of the LSTM. As is usually the case with RNN implementations, there is a parameter telling whether you want the layer to return outputs for the whole sequence or only for the last timestep. Here, by default, it is the second option, which is why you don't get an output with the expected shape. Use return_seq=True.
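For comparison, a minimal sketch in plain Keras (not tflearn, just to make the shape reasoning concrete): returning an output for every timestep lets the final layer produce (batch, 200, 3) predictions that match (10, 200, 3) labels.

from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, TimeDistributed, Dense

# return_sequences=True keeps one output per timestep; TimeDistributed(Dense)
# then maps each of the 200 timestep outputs to the 3 action classes.
model = Sequential([
    LSTM(128, return_sequences=True, input_shape=(200, 30)),
    TimeDistributed(Dense(3, activation='softmax')),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()  # final output shape: (None, 200, 3)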
I've got some short code that attempts to fit a function. But I am worried about how to feed the data to a tflearn rnn.
The X input is a [45, 1, 8] array (45 samples, 4 timesteps, and 8 features). Therefore, the Y input should be a [45, 1, 8] array as well, since the goal is to minimize the difference element-wise.
However, it throws the following error when this is attempted:
Cannot feed value of shape (45, 1, 8) for Tensor 'TargetsData/Y:0', which has shape '(?, 8)'
I can't seem to figure out my error. Any help would be appreciated.
Note: someone seems to have solved a similar problem, but I can't understand the answer
tensorflow/tflearn input shape
Full Code
def mod(rnn_output, state):
    Tau = tfl.variable(name='GRN', shape=[8],
                       initializer='uniform_scaling',
                       regularizer='L2')
    Timestep = tf.constant(6.0, shape=[8])
    one = tf.div(Timestep, Tau)
    two = rnn_output
    three = tf.mul(tf.sub(tf.ones(shape=[num_genes]), one), state)
    four = tf.mul(one, two)
    five = tf.add(four, three)
    return five

net = tfl.input_data(shape=[None, 4, 8])
out, state = tfl.layers.recurrent.simple_rnn(net, 8, return_state=True,
                                             name='RNN')
net = tfl.layers.core.custom_layer(out, mod, state=state)
net = tfl.layers.estimator.regression(net)

# Define model
model = tfl.DNN(net)

# Start training (apply gradient descent algorithm)
model.fit(train_x, train_y, n_epoch=10, batch_size=45, show_metric=True)
I have been trying to perform regression using tflearn and my own dataset.
Using tflearn I have been trying to implement a convolutional network based on an example that uses the MNIST dataset. Instead of using the MNIST dataset I have tried replacing the training and test data with my own. My data is read in from a CSV file and has a different shape from the MNIST data. I have 225 features, which represent a 15*15 grid, and a target value. In the example I replaced lines 24-30 with the following (and included import numpy as np):
# read in train and test CSVs where there are 225 features (15*15) and a target
csvTrain = np.genfromtxt('train.csv', delimiter=",")
X = np.array(csvTrain[:, :225]) #225, 15
Y = csvTrain[:,225]
csvTest = np.genfromtxt('test.csv', delimiter=",")
testX = np.array(csvTest[:, :225])
testY = csvTest[:,225]
# reshape features for each instance into 15*15; targets are just a single number
X = X.reshape([-1,15,15,1])
testX = testX.reshape([-1,15,15,1])
## Building convolutional network
network = input_data(shape=[None, 15, 15, 1], name='input')
I get the following error:
ValueError: Cannot feed value of shape (64,) for Tensor u'target/Y:0',
which has shape '(?, 10)'
I have tried various combinations and have seen a similar question on Stack Overflow, but have not had success. The example on this page does not work for me and throws a similar error, and I do not understand the answer provided there or those provided for similar questions.
How do I use my own data?
Short answer
In line 41 of the MNIST example, you also have to change the output size from 10 to 1, i.e. change network = fully_connected(network, 10, activation='softmax') to network = fully_connected(network, 1, activation='linear'). Note that you can remove the final softmax.
Looking at your code, it seems you have a target value Y, which means using the L2 loss with mean_square (you will find all the available losses here):
regression(network, optimizer='adam', learning_rate=0.01,
loss='mean_square', name='target')
Also, reshape Y and Y_test to have shape (batch_size, 1).
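Putting the three changes together (the conv stack network and the Y/testY arrays are assumed from the example above and not repeated here, so this is a fragment rather than a runnable script):

from tflearn.layers.core import fully_connected
from tflearn.layers.estimator import regression

# Last layer: one linear output instead of 10 softmax classes.
network = fully_connected(network, 1, activation='linear')
network = regression(network, optimizer='adam', learning_rate=0.01,
                     loss='mean_square', name='target')

# Targets reshaped to (batch_size, 1) to match the single-output layer.
Y = Y.reshape([-1, 1])
testY = testY.reshape([-1, 1])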
Long answer: How to analyse the error and find the bug
Here is how to analyse the error:
The error is Cannot feed value ... for Tensor 'target/Y', which means it comes from the feed_dict argument Y.
Again, according to the error, you try to feed a Y value of shape (64,) whereas the network expects a shape of (?, 10).
It expects a shape (batch_size, 10), because originally it's a network for MNIST (10 classes).
We now want to change the Y value expected by the network.
In the code, we see that the last layer, fully_connected(network, 10, activation='softmax'), returns an output of size 10.
We change that to an output of size 1 without softmax: fully_connected(network, 1, activation='linear').
In the end, it was not a bug, but a wrong model architecture.