I am trying to build an LSTM model and I cannot get the input_shape parameter to work properly.
My data is set up so that every row is a timestep and each column is an input feature (input_dim).
The shape is always wrong: it is either missing the timestep dimension or has an extra dimension:
ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 5592, 9), found shape=(None, 9)
Or
ValueError: Input 0 of layer "lstm" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 14, 5592, 9)
Here is the relevant snippet of code:
tdf = pd.read_csv(train_csv)
tdf2 = pd.read_csv(train_csv2)
df = pd.read_csv(test_csv)
# Split the data into training and testing sets
train_x = []
train_y = []
for i in range(len(tdf_list)):
    train_x.append(tdf_list[i])
    train_y.append(tdf_list[i]["Close"].shift(-1).dropna())
    train_x[i] = tdf_list[i].drop(index=tdf_list[i].index[-1]).drop(columns=["Close"])
test_x = df
test_y = df["Close"].shift(-1).dropna()
test_x = df.drop(index=df.index[-1]).drop(columns=["Close"])
test_x = test_x.values.reshape(-1, test_x.shape[1], 1)
print(train_x[0].shape)
def run_model(g):
    # Define the LSTM model
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.LSTM(neurons, return_sequences=True, input_shape=(train_x[g].shape[0], train_x[g].shape[1])))
    model.add(tf.keras.layers.BatchNormalization())
    model.add(tf.keras.layers.Dropout(rate=0.2))
    model.add(tf.keras.layers.LSTM(neurons, return_sequences=False))
    model.add(tf.keras.layers.Dense(32))
    model.add(tf.keras.layers.Activation('tanh'))
    model.add(tf.keras.layers.Dense(32))
    model.add(tf.keras.layers.Dense(1))
I have tried manually entering input shape integers, and even when hardcoded I cannot get it to work properly. I have also tried every permutation of reshaping train_x[0] to get it to fit properly. The only way I can get the code to execute is if I set
input_shape=(train_x[g].shape[1], 1)
But then it is using the data columns as timesteps...
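For what it's worth, an LSTM layer takes batches shaped (samples, timesteps, features), while input_shape describes a single sample, i.e. (timesteps, features). One common way to get there from a 2-D table is to cut it into fixed-length windows. The following is only a sketch in NumPy: the helper name make_windows, the window length of 14, and the random data are assumptions for illustration, not taken from the question.

```python
import numpy as np

def make_windows(table, window):
    """Stack overlapping windows of `window` consecutive rows.

    table: 2-D array shaped (rows, features)
    returns: 3-D array shaped (rows - window + 1, window, features)
    """
    n_windows = table.shape[0] - window + 1
    return np.stack([table[i:i + window] for i in range(n_windows)])

# Hypothetical stand-in for one CSV: 5592 rows (timesteps), 9 feature columns.
table = np.random.rand(5592, 9)
windows = make_windows(table, window=14)

print(windows.shape)      # (5579, 14, 9) -> (samples, timesteps, features)
print(windows.shape[1:])  # (14, 9) -> what input_shape would be for this data
```

With data arranged like this, each window is one training sample, and the number of rows in the file no longer appears in input_shape.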
This is the code in question:
batch_size = 1
epochs = 1
begin_from_row = 3
rows_to_train = 1000
data = data.loc[begin_from_row:rows_to_train, :]
data['Close_next'] = data['Close'].shift(-1)
data = data.dropna()
output_data = data['Close_next']
input_data = data.drop(columns=['Close_next'])
input_size = 9
output_size = 1
hidden_size_1 = 9
input_layer = tf.keras.Input(batch_shape=(batch_size, input_size))
input_layer_expanded = tf.expand_dims(input_layer, axis=-1)
hidden_1 = tf.keras.layers.LSTM(hidden_size_1, stateful=True)(input_layer_expanded)
output_layer = tf.keras.layers.Dense(1, activation='relu')(hidden_1)
model = tf.keras.Model(inputs=input_layer, outputs=output_layer)
model.compile(loss='mean_squared_error', optimizer='adam', run_eagerly=True)
model.fit(input_data, output_data, epochs=epochs)
model.save("model_1.h5")
It returns the following error:
Input 0 of layer "lstm" is incompatible with the layer: expected shape=(1, None, 1), found shape=(32, 9, 1)
I can't quite see where the number 32 comes from, since it doesn't appear anywhere in my code.
The code works when I specify batch_size=32, but only for one batch.
The shape (32, 9, 1) is the shape of the batch that was actually fed to the layer, whereas (1, None, 1) is the shape the layer expects, which follows from the batch_shape you defined in the Input layer (batch_size = 1).
32 is the default batch_size of the fit method. The default is being used because you did not pass the batch_size argument and you are not feeding the data as a tf.data.Dataset:
batch_size: Integer or None. Number of samples per gradient update. If
unspecified, batch_size will default to 32. Do not specify the
batch_size if your data is in the form of datasets, generators, or
keras.utils.Sequence instances (since they generate batches).
From: Tensorflow fit function
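To make the default concrete, here is a rough NumPy sketch of how an in-memory array gets cut into batches. It is not Keras's actual implementation, and iter_batches and the array sizes are made up for illustration. For a model built with batch_shape=(1, 9), passing batch_size=1 to fit is the setting that matches.

```python
import numpy as np

def iter_batches(x, batch_size=32):
    """Yield consecutive slices of x along the first axis."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size]

# Hypothetical input: 100 rows of 9 features, expanded to (rows, 9, 1).
x = np.zeros((100, 9))[..., np.newaxis]

default_shapes = [b.shape for b in iter_batches(x)]
single_shapes = [b.shape for b in iter_batches(x, batch_size=1)]

print(default_shapes[0])  # (32, 9, 1): the shape reported in the error
print(single_shapes[0])   # (1, 9, 1): matches a model built with batch size 1
```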
I'm trying to train a CNN model for a speech emotion recognition task using spectrograms as input. I've reshaped the spectrograms to have the shape (num_frequency_bins, num_time_frames, 1) which I thought would be sufficient, but upon trying to fit the model to the dataset, which is stored in a Tensorflow dataset, I got the following error:
Input 0 of layer "sequential_12" is incompatible with the layer: expected shape=(None, 257, 1001, 1), found shape=(257, 1001, 1)
I tried reshaping the spectrograms to have the shape (1, num_frequency_bins, num_time_frames, 1), but that produced an error when creating the Sequential model:
ValueError: Exception encountered when calling layer "resizing_14" (type Resizing).
'images' must have either 3 or 4 dimensions.
Call arguments received:
• inputs=tf.Tensor(shape=(None, 1, 257, 1001, 1), dtype=float32)
So I passed in the shape as (num_frequency_bins, num_time_frames, 1) when creating the model, and then fitted the model to the training data with the 4-dimensional data, but that raised this error:
InvalidArgumentError: slice index 0 of dimension 0 out of bounds. [Op:StridedSlice] name: strided_slice/
So I'm kind of at a loss now. I genuinely have no idea what to do and how I can go about fixing this. I've read around but haven't come across anything useful. Would really appreciate any help.
Here's some of the code for context.
dataset = [[specgram_files[i], labels[i]] for i in range(len(specgram_files))]
specgram_files_and_labels_dataset = tf.data.Dataset.from_tensor_slices((specgram_files, labels))
def read_npy_file(data):
    # 'data' stores the file name of the numpy binary file storing the features of a particular sound file
    # item() returns numpy array of size 1 as a suitable python scalar.
    # data.item() then returns the bytes string stored in the numpy array.
    # decode() is then called on the bytes string to decode it from a bytes string to a regular string
    # so that it can be passed as a parameter in np.load()
    data = np.load(data.item().decode())
    # Shape of data is now (1, rows, columns)
    # Needs to be reshaped to (rows, columns, 1):
    data = np.reshape(data, (data.shape[0], data.shape[1], 1))
    return data.astype(np.float32)
specgram_dataset = specgram_files_and_labels_dataset.map(
    lambda file, label: tuple([tf.numpy_function(read_npy_file, [file], [tf.float32]), label]),
    num_parallel_calls=tf.data.AUTOTUNE)
num_files = len(train_df)
num_train = int(0.8 * num_files)
num_val = int(0.1 * num_files)
num_test = int(0.1 * num_files)
specgram_dataset.shuffle(buffer_size=1000)
specgram_train_ds = specgram_dataset.take(num_train)
specgram_test_ds = specgram_dataset.skip(num_train)
specgram_val_ds = specgram_test_ds.take(num_val)
specgram_test_ds = specgram_test_ds.skip(num_val)
batch_size = 32
specgram_train_ds.batch(batch_size)
specgram_val_ds.batch(batch_size)
specgram_train_ds = specgram_train_ds.cache().prefetch(tf.data.AUTOTUNE)
specgram_val_ds = specgram_val_ds.cache().prefetch(tf.data.AUTOTUNE)
for specgram, label in specgram_train_ds.take(1):
    input_shape = specgram.shape
num_emotions = len(train_df["emotion"].unique())
model = models.Sequential([
    layers.Input(shape=input_shape),
    # downsampling the input.
    layers.Resizing(32, 128),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="softmax"),
    layers.Dense(num_emotions)
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(0.01),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"]
)
EPOCHS = 10
model.fit(
    specgram_train_ds,
    validation_data=specgram_val_ds,
    epochs=EPOCHS,
    callbacks=tf.keras.callbacks.EarlyStopping(verbose=1, patience=2)
)
Assuming you know your input_shape, I would recommend first hard-coding it into your model:
model = models.Sequential([
    layers.Input(shape=(257, 1001, 1)),
    # downsampling the input.
    layers.Resizing(32, 128),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="softmax"),
    layers.Dense(num_emotions)
])
Also, tf.data.Dataset.batch does not modify the dataset in place; it returns a new Dataset, so you need to assign the result back to a variable:
batch_size = 32
specgram_train_ds = specgram_train_ds.batch(batch_size)
specgram_val_ds = specgram_val_ds.batch(batch_size)
Afterwards, make sure that specgram_train_ds really does have the correct shape:
specgrams, _ = next(iter(specgram_train_ds.take(1)))
assert specgrams.shape == (32, 257, 1001, 1)
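The "expected shape=(None, 257, 1001, 1), found shape=(257, 1001, 1)" message is what an unbatched dataset produces: the model wants a leading batch axis, and batching is what adds it. Here is a NumPy-only sketch of the idea. The sample count of 4 is arbitrary, and np.stack merely stands in for what Dataset.batch does to each group of elements.

```python
import numpy as np

# Four hypothetical spectrograms, each shaped (freq_bins, time_frames, channels).
samples = [np.zeros((257, 1001, 1), dtype=np.float32) for _ in range(4)]

# Batching stacks the samples along a new leading axis.
batch = np.stack(samples)

print(samples[0].shape)  # (257, 1001, 1): one unbatched element
print(batch.shape)       # (4, 257, 1001, 1): what the model expects to receive
```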
I'm attempting to follow this time series classification with transformers with Keras tutorial. This is the relevant part of my code:
x_trainScaledNPArray = np.array(x_trainScaled)
x_testScaledNPArray = np.array(x_testScaled)
y_trainNPArray = np.array(y_train)
y_testNPArray = np.array(y_test)
print(x_trainScaledNPArray.shape)
print(x_testScaledNPArray.shape)
print(y_trainNPArray.shape)
print(y_testNPArray.shape)
n_classes = len(np.unique(y_train))
input_shapeIndex0 = x_trainScaledNPArray.shape[0:]
print(input_shapeIndex0)
model = build_model(n_classes,input_shapeIndex0,head_size=256,num_heads=4,ff_dim=4,num_transformer_blocks=4,mlp_units=[128],mlp_dropout=0.4,dropout=0.25)
model.compile(loss="sparse_categorical_crossentropy",optimizer=keras.optimizers.Adam(learning_rate=1e-4),metrics=["sparse_categorical_accuracy"])
callbacks = [keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)]
model.fit(x_trainScaledNPArray,y_trainNPArray,validation_split=0.2,epochs=200,batch_size=64,callbacks=callbacks)
This is the output I get:
(18287, 2048)
(347, 2048)
(18287,)
(347,)
(18287, 2048)
[...]
ValueError: Input 0 is incompatible with layer model: expected shape=(None, 18287, 2048), found shape=(None, 2048)
I already tried to solve this by following the hints given here and here but without any success. Any help would be highly appreciated.
I erroneously omitted the following part of the tutorial:
x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], 1))
x_test = x_test.reshape((x_test.shape[0], x_test.shape[1], 1))
Thus, the input didn't have the correct size. Adding the lines mentioned above solved the issue.
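A quick shape check of that fix, using NumPy with a small zero-filled stand-in for the real data (16 samples instead of 18287, purely to keep it light). As far as I recall, the tutorial also passes x_train.shape[1:] as the model's input shape, so the sample count is not part of it; shape[0:] returns the full shape including that count.

```python
import numpy as np

# Small stand-in: 16 samples of 2048 values (the question has 18287 samples).
x_train = np.zeros((16, 2048), dtype=np.float32)

# Add a trailing channel axis: (samples, timesteps) -> (samples, timesteps, 1).
x_train = x_train.reshape((x_train.shape[0], x_train.shape[1], 1))

input_shape = x_train.shape[1:]  # per-sample shape, batch axis excluded
full_shape = x_train.shape[0:]   # whole shape, sample count included

print(x_train.shape)  # (16, 2048, 1)
print(input_shape)    # (2048, 1)
print(full_shape)     # (16, 2048, 1)
```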
I'm using TF 2.5 and I'm closely following the tutorial at https://www.tensorflow.org/text/tutorials/text_generation just to gain some confidence.
But I don't understand why after building the model I get an inconsistency on the dimension; it expects a 3-dimensional layer and it receives only two.
ValueError: Input 0 of layer gru is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (64, 100)
Meanwhile this problem is not found in the tutorial, even using their own dataset, and this leaves me puzzled.
I then tried various combinations, even generating a dummy dimension in the dataset, but I don't get the desired result.
I somewhat understand that the problem should lie in the structure of the dataset, since:
<PrefetchDataset shapes: ((64, 100), (64, 100)), types: (tf.int64, tf.int64)>
I see that I am missing a dimension. But it's not clear to me how I can "add" a dimension to it or, vice versa, stop the model from expecting 3 dimensions (though in that case I wonder what the point of the whole model would be).
I have read many similar cases (both with GRU and LSTM) and tried various options, but given my unfamiliarity, I feel I am at a standstill.
I would be grateful if someone could give me a tip.
Here is my code (that replicates the one on TensorFlow page linked above):
(this is also my very first post on SO!)
BATCH_SIZE = 64
BUFFER_SIZE = 10000
dataset = (dataset
    .shuffle(BUFFER_SIZE)
    .batch(BATCH_SIZE, drop_remainder=True)
    .prefetch(tf.data.experimental.AUTOTUNE))
vocab_size = len(vocab)
embedding_dim = 256
rnn_units = 1024
class MyModel(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        super().__init__(self)
        self.embedding = tf.keras.layers.GRU(rnn_units, return_sequences=True, return_state=True)
        self.dense = tf.keras.layers.Dense(vocab_size)
    def call(self, inputs, states=None, return_state=False, training=False):
        x = inputs
        x = self.embedding(x, training=training)
        if states is None:
            states = self.gru.get_initial_state(x)
        x = self.dense(x, training=training)
        if return_state:
            return x, states
        else:
            return x
model = MyModel(vocab_size=len(ids_from_chars.get_vocabulary()), embedding_dim=embedding_dim, rnn_units=rnn_units)
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
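One thing that stands out in the code above: the layer stored as self.embedding is a GRU, whereas the tutorial's model, as far as I can tell, starts with tf.keras.layers.Embedding(vocab_size, embedding_dim) and keeps the GRU in a separate self.gru attribute (which call() still refers to). A GRU needs (batch, timesteps, features) input, and it is the embedding lookup that turns the (64, 100) integer IDs into that 3-D tensor. Here is a NumPy sketch of the lookup, with a made-up vocabulary size of 66 and random weights:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embedding_dim = 66, 256  # vocab_size is a hypothetical value
table = rng.normal(size=(vocab_size, embedding_dim))  # one row per token ID

# A batch of token IDs shaped like the dataset elements: (batch, sequence).
ids = rng.integers(0, vocab_size, size=(64, 100))

# Embedding lookup is row indexing: every ID is replaced by its vector.
embedded = table[ids]

print(ids.shape)       # (64, 100): ndim=2, what the GRU was given
print(embedded.shape)  # (64, 100, 256): ndim=3, what a GRU expects
```

So rather than adding a dummy dimension to the dataset, it is probably worth checking the model definition against the tutorial first.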
In my case I am using a set of sequential features and also non sequential features to train the model. Following is the architecture of my model
Sequential features -> LSTM -> Dense(1) --->>
\
\
-- Dense -> Dense -> Dense(1) ->output
/
Non-sequential features---/
I am using a data generator to generate batches for the sequential data. The batch size varies from batch to batch. Within one batch I keep the non-sequential feature fixed. Following is my data generator.
def training_data_generator(raw_data):
    while True:
        for index, row in raw_data.iterrows():
            x_train, y_train = list(), list()
            feature1 = row['xxx']
            x_current_batch = []
            y_current_batch = []
            for j in range(yyy):
                x_current_batch.append(row['zz1'])
                y_current_batch.append(row['zz2'])
            x_train.append(x_current_batch)
            y_train.append(y_current_batch)
            x_train = array(x_train)
            y_train = array(y_train)
            yield [x_train, np.reshape(feature1,1)], y_train
Note: x_train y_train sizes are varying.
Following is my model implementation.
seq_input = Input(shape=(None, 3))
lstm_layer = LSTM(50)(seq_input)
dense_layer1 = Dense(1)(lstm_layer)
non_seq_input = Input(shape=(1,))
hybrid_model = concatenate([dense_layer1, non_seq_input])
hidden1 = Dense(10, activation = 'relu')(hybrid_model)
hidden2 = Dense(10, activation='relu')(hidden1)
final_output = Dense(1, activation='sigmoid')(hidden2)
model = Model(inputs = [seq_input, non_seq_input], outputs = final_output)
model.compile(loss='mse',optimizer='adam')
model.fit_generator(training_data_generator(flatten), steps_per_epoch= 5017,
epochs = const.NUMBER_OF_EPOCHS, verbose=1)
I am getting an error at the output dense layer:
ValueError: Error when checking target:
expected dense_4 to have shape (1,) but got array with shape (4,)
I think the last layer is receiving the whole output of the generator at once rather than one value at a time.
What is the reason for this issue? I would appreciate your insights.
The error is about the target, not the input: your generator yields a target array with 4 values per sample, while your output layer is declared as Dense(1) and so produces a single value per sample. Because the two shapes do not match, it crashes.
What you can do is change your output Dense layer to 4 units and then manually convert the result to one value.
Hopefully this answers your question.
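The shapes involved can be checked without Keras. In the sketch below the values are placeholders, and the sequence length of 4 is only inferred from the error message. The point is that the array yielded as the target has one row of 4 values, while a Dense(1) output produces one value per sample. Reducing the target to its last element is purely an assumption for illustration; the right reduction depends on what the model is meant to predict.

```python
import numpy as np

seq_len = 4  # assumed from "got array with shape (4,)"

# What the generator builds for one row: a list holding one list of targets.
y_current_batch = [0.0] * seq_len
y_train = np.array([y_current_batch])
print(y_train.shape)   # (1, 4): one sample whose target has 4 values

# A Dense(1) output layer yields predictions shaped (batch, 1).
expected_target_shape = (y_train.shape[0], 1)

# Assumption for illustration: keep only the last value as the target.
y_single = y_train[:, -1:]
print(y_single.shape)  # (1, 1): compatible with a Dense(1) output
```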