LSTM - Making predictions on partial sequence - python

This question is a continuation of a previous question I've asked.
I've trained an LSTM model to predict a binary class (1 or 0) for batches of 100 samples with 3 features each, i.e. the shape of the data is (m, 100, 3), where m is the number of batches.
Data:
[
[[1,2,3],[1,2,3]... 100 samples],
[[1,2,3],[1,2,3]... 100 samples],
... available batches in the training data
]
Target:
[
[1]
[0]
...
]
Model code:
def build_model(num_samples, num_features, is_training):
    model = Sequential()
    opt = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0001)

    batch_size = None if is_training else 1
    stateful = False if is_training else True

    first_lstm = LSTM(32, batch_input_shape=(batch_size, num_samples, num_features),
                      return_sequences=True, activation='tanh', stateful=stateful)
    model.add(first_lstm)
    model.add(LeakyReLU())
    model.add(Dropout(0.2))

    model.add(LSTM(16, return_sequences=True, activation='tanh', stateful=stateful))
    model.add(Dropout(0.2))
    model.add(LeakyReLU())

    model.add(LSTM(8, return_sequences=False, activation='tanh', stateful=stateful))
    model.add(LeakyReLU())
    model.add(Dense(1, activation='sigmoid'))

    if is_training:
        model.compile(loss='binary_crossentropy', optimizer=opt,
                      metrics=['accuracy', keras_metrics.precision(), keras_metrics.recall(), f1])
    return model
For the training stage, the model is NOT stateful. When predicting I'm using a stateful model, iterating over the data and outputting a probability for each sample:
for index, row in data.iterrows():
    if index % 100 == 0:
        predicting_model.reset_states()
    vals = np.array([[row[['a', 'b', 'c']].values]])
    prob = predicting_model.predict_on_batch(vals)
When looking at the probability at the end of a batch, it is exactly the value I get when predicting on the entire batch at once (rather than one sample at a time). However, I expected the probability to keep moving in the right direction as new samples arrive. What actually happens is that the probability can spike toward the wrong class on an arbitrary sample (see below).
Two examples of the predicted probability over a 100-sample batch follow, one with label = 1 and one with label = 0 (plots omitted).
Is there a way to achieve what I want (avoid extreme spikes while predicting probability), or is that a given fact?
Any explanation or advice would be appreciated.
Update
Thanks to @today's advice, I've tried training the network with a hidden-state output for each input time step, using return_sequences=True on the last LSTM layer.
So now the labels look like this (shape (100, 100)):
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
...]
the model summary:
Layer (type) Output Shape Param #
=================================================================
lstm_1 (LSTM) (None, 100, 32) 4608
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 100, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 100, 32) 0
_________________________________________________________________
lstm_2 (LSTM) (None, 100, 16) 3136
_________________________________________________________________
dropout_2 (Dropout) (None, 100, 16) 0
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 100, 16) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 100, 8) 800
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 100, 8) 0
_________________________________________________________________
dense_1 (Dense) (None, 100, 1) 9
=================================================================
Total params: 8,553
Trainable params: 8,553
Non-trainable params: 0
_________________________________________________________________
However, I get an exception:
ValueError: Error when checking target: expected dense_1 to have 3 dimensions, but got array with shape (75, 100)
What do I need to fix?

Note: This is just an idea and it might be wrong. Try it if you would like and I would appreciate any feedback.
Is there a way to achieve what I want (avoid extreme spikes while predicting probability), or is that a given fact?
You can do this experiment: set the return_sequences argument of the last LSTM layer to True and replicate the label of each sample as many times as the length of that sample. For example, if a sample has a length of 100 and its label is 0, then create a new label for this sample consisting of 100 zeros (you can probably do this easily with a numpy function like np.repeat). Then retrain your new model and test it on new samples afterwards. I am not sure of this, but I would expect more monotonically increasing/decreasing probability graphs this time.
Update: The error you mentioned is caused by the fact that the labels should be a 3D array (look at the output shape of last layer in the model summary). Use np.expand_dims to add another axis of size one to the end. The correct way of repeating the labels would look like this, assuming y_train has a shape of (num_samples,):
rep_y_train = np.repeat(y_train, num_reps).reshape(-1, num_reps, 1)
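For example, with the shapes from the question (75 training sequences, each of length 100), this produces a target of shape (75, 100, 1), matching the model's (None, 100, 1) output:
import numpy as np

num_reps = 100
y_train = np.random.randint(0, 2, size=(75,))                        # placeholder labels, one per sequence
rep_y_train = np.repeat(y_train, num_reps).reshape(-1, num_reps, 1)
print(rep_y_train.shape)                                             # (75, 100, 1)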
The experiment on IMDB dataset:
Actually, I tried the experiment suggested above on the IMDB dataset using a simple model with one LSTM layer. One time I used only one label per sample (as in the original approach of @Shlomi), and the other time I replicated the labels to have one label per timestep of a sample (as I suggested above). Here is the code if you would like to try it yourself:
from keras.layers import *
from keras.models import Sequential, Model
from keras.datasets import imdb
from keras.preprocessing.sequence import pad_sequences
import numpy as np
vocab_size = 10000
max_len = 200
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size)
X_train = pad_sequences(x_train, maxlen=max_len)
def create_model(return_seq=False, stateful=False):
    batch_size = 1 if stateful else None
    model = Sequential()
    model.add(Embedding(vocab_size, 128, batch_input_shape=(batch_size, None)))
    model.add(CuDNNLSTM(64, return_sequences=return_seq, stateful=stateful))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
    return model
# train model with one label per sample
train_model = create_model()
train_model.fit(X_train, y_train, epochs=10, batch_size=128, validation_split=0.3)
# replicate the labels
y_train_rep = np.repeat(y_train, max_len).reshape(-1, max_len, 1)
# train model with one label per timestep
rep_train_model = create_model(True)
rep_train_model.fit(X_train, y_train_rep, epochs=10, batch_size=128, validation_split=0.3)
Then we can create the stateful replicas of the training models and run them on some test data to compare their results:
# replica of `train_model` with the same weights
test_model = create_model(False, True)
test_model.set_weights(train_model.get_weights())
test_model.reset_states()
# replica of `rep_train_model` with the same weights
rep_test_model = create_model(True, True)
rep_test_model.set_weights(rep_train_model.get_weights())
rep_test_model.reset_states()
def stateful_predict(model, samples):
    preds = []
    for s in samples:
        model.reset_states()
        ps = []
        for ts in s:
            p = model.predict(np.array([[ts]]))
            ps.append(p[0, 0])
        preds.append(list(ps))
    return preds
X_test = pad_sequences(x_test, maxlen=max_len)
Actually, the first sample of X_test has a 0 label (i.e. it belongs to the negative class) and the second sample of X_test has a 1 label (i.e. it belongs to the positive class). So let's first see what the stateful prediction of test_model (i.e. the one that was trained using one label per sample) looks like for these two samples:
import matplotlib.pyplot as plt
preds = stateful_predict(test_model, X_test[0:2])
plt.plot(preds[0])
plt.plot(preds[1])
plt.legend(['Class 0', 'Class 1'])
The result:
The correct label (i.e. probability) is reached at the end (i.e. timestep 200), but the curve is very spiky and fluctuates in between. Now let's compare it with the stateful predictions of rep_test_model (i.e. the one that was trained using one label per timestep):
preds = stateful_predict(rep_test_model, X_test[0:2])
plt.plot(preds[0])
plt.plot(preds[1])
plt.legend(['Class 0', 'Class 1'])
The result:
Again, the correct label is predicted at the end, but this time with a much smoother and more monotonic trend, as expected.
Note that this was just an example for demonstration, so I used a very simple model with just one LSTM layer and did not attempt to tune it at all. With better tuning of the model (e.g. adjusting the number of layers, the number of units in each layer, the activation functions, the optimizer type and parameters, etc.), you might get far better results.

Related

Stuck in error loop between Data cardinality is ambiguous and shapes are incompatible with my 3d cnn model

I'm attempting to train my model using the "train_on_batch" function, as the data is too large to be fed in all at once. The shape of my training data is as follows: X.shape = (388, 108, 36, 36, 36), Y.shape = (388, 108). To make the data clear: there are 388 x and 388 y training files. Each training file contains 108 3D arrays of shape (36, 36, 36), and every 3D array has a corresponding binary label. I'm trying to iterate through these 388 pairs of files one by one and pass them to train_on_batch. Below is the CNN model:
model = Sequential()
model.add(Conv3D(filters=16, kernel_size=(3,3,3), padding='valid', input_shape=(108, 36, 36, 36)))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(2,2,2)))
model.add(Conv3D(32, kernel_size=(3,3,3)))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(2,2,2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(32))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
This was my first for loop for trying to input the data:
for i in range(len(X_train)):
    model.train_on_batch(X_train[i], Y_train[i], sample_weight=None)
Which resulted in the following error:
ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 108, 36, 36, 36), found shape=(108, 36, 36, 36)
To combat this I reshaped my data, which resulted in my input being accepted. I made sure the y data had a matching shape, but then I reached the error loop that I cannot figure out myself and wanted to ask about. Here is the reshape, which results in ValueError: Shapes (1, 108) and (1, 2) are incompatible:
for i in range(len(X_train)):
    new_X_train = X_train[i].reshape(1, 108, 36, 36, 36)
    new_Y_train = Y_train[i].reshape(1, 108)
When I apply .astype('float32').reshape((-1, 1)) to the Y, I instead get ValueError: Data cardinality is ambiguous. This makes sense to me, because then the x and y data no longer have matching shapes.
The output should be 0 or 1, as these are CT scan slices, so the model is classifying each array as either "nodule" or "non-nodule". For reference, here is what Y_train[0] looks like:
[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
I've been trying to wrap my head around this for a while. There are many questions that could help me solve either error on its own, but my issue is that when I solve "Data cardinality is ambiguous", I get sent to "shapes are incompatible", and vice versa. I might be missing something; I tried what several threads have done for these individual problems, but I can't seem to figure it out. Is it just the data format my training files are in?
As it turns out, I was misinterpreting a comment I had read while following a guide on how to set up this model. By writing (108, 36, 36, 36) instead of (36, 36, 36, 1), I was telling the model the incorrect input shape. Once that was fixed, it worked.
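As a hedged sketch of that fix (my reading of it, reusing X_train and Y_train from the question): with input_shape=(36, 36, 36, 1), each of the 108 slices in a file becomes one sample of the batch, and the labels can be one-hot encoded to match the final Dense(2) layer:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv3D, Activation, MaxPool3D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical

model = Sequential()
model.add(Conv3D(filters=16, kernel_size=(3, 3, 3), padding='valid', input_shape=(36, 36, 36, 1)))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(2, 2, 2)))
model.add(Conv3D(32, kernel_size=(3, 3, 3)))
model.add(Activation('relu'))
model.add(MaxPool3D(pool_size=(2, 2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# each training file is treated as one batch of 108 single-channel volumes
for i in range(len(X_train)):
    x_batch = X_train[i].reshape(108, 36, 36, 36, 1).astype('float32')
    y_batch = to_categorical(Y_train[i], num_classes=2)   # shape (108, 2), matches Dense(2)
    model.train_on_batch(x_batch, y_batch)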

ValueError: Shapes (None, 2, 28) and (None, 2) are incompatible // How can i transform 2 onehotvectors to one

I'm working on a classification problem. The data I use is from the Aras dataset. One line of the data looks like the following:
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12 17
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12 17
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 12 17
The first 19 columns represent sensor data (binary). The last two columns represent the activities of two persons who lived in the household where the data was collected.
I have divided the dataset into several pieces, because it's not small at all: 30 days with one data point every second.
What I want to do with my model: I want to train it so it can predict what persons A and B are doing at any given moment.
So here is my code (X data: columns 1-19; Y data: columns 20-21):
import keras
from keras import losses
from keras import regularizers
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np
from keras.utils import to_categorical
import matplotlib.pyplot as plt
from tensorflow.keras import optimizers
batch_size = 512
no_epochs = 5
verbosity = 1
x_train=np.loadtxt('x_train.txt')
x_val=np.loadtxt('x_val.txt')
x_test=np.loadtxt('x_test.txt')
y_train=np.loadtxt('y_train.txt')
y_val=np.loadtxt('y_val.txt')
y_test=np.loadtxt('y_test.txt')
y_train_onehot=keras.utils.to_categorical(y_train)
y_val_onehot=keras.utils.to_categorical(y_val)
y_test_onehot=keras.utils.to_categorical(y_test)
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=[19,]))
model.add(Dense(128, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(
                  learning_rate=0.000001, beta_1=0.9, beta_2=0.999, epsilon=1e-07,
                  amsgrad=False, name='Adam'),
              metrics=['accuracy'])
model.summary()
history=model.fit(x_train, y_train_onehot, batch_size, epochs=no_epochs,verbose=verbosity, shuffle=True,validation_data=(x_val, y_val_onehot))
Error: ValueError: Shapes (None, 2, 28) and (None, 2) are incompatible
When I do not convert the labels to one-hot format it works, but the result is not useful (I guess). The problem is that I get this ValueError, and I know it has something to do with the fact that each label row contains two one-hot vectors, but I have no idea how to solve the issue.
I tried to merge both one-hot vectors into one, but then every line has 729 columns (27*27, one per label combination); the label data gets too big and Python cannot work through the script.
Windows 10
Keras 2.4.3
Tensorflow 2.3.1
Python 3.7.9
I'm new to this whole topic, so don't be mad at me if my question is stupid.
Your model requires two outputs, one per person, which is not possible with the Sequential API. Create a new model with the Functional API instead.
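A minimal sketch of what that could look like (assuming, from the (None, 2, 28) one-hot shape, that each person's activity label takes one of 28 values; adjust num_classes to your data):
import tensorflow as tf
from tensorflow.keras import layers, Model

num_classes = 28  # inferred from the (None, 2, 28) one-hot shape; an assumption

inputs = layers.Input(shape=(19,))
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dense(128, activation='relu')(x)
out_a = layers.Dense(num_classes, activation='softmax', name='person_a')(x)
out_b = layers.Dense(num_classes, activation='softmax', name='person_b')(x)

model = Model(inputs, [out_a, out_b])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# y_train_onehot has shape (n, 2, 28): one one-hot vector per person, so split it
model.fit(x_train,
          [y_train_onehot[:, 0, :], y_train_onehot[:, 1, :]],
          batch_size=512, epochs=5,
          validation_data=(x_val, [y_val_onehot[:, 0, :], y_val_onehot[:, 1, :]]))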

How to format/build the data for making timeseries prediction using Python and Pandas?

I have gone through the blog content below about time-series prediction.
It is an example of a vanilla LSTM for univariate time-series forecasting, and it makes a single prediction.
# univariate lstm example
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
# split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)
# define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# choose a number of time steps
n_steps = 3
# split into samples
X, y = split_sequence(raw_seq, n_steps)
# reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
# define model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, n_features)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(X, y, epochs=200, verbose=0)
# demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)
We can see that the model predicts the next value in the sequence.
[[102.09213]]
But I have data about articles read by various userIDs within the year 2019. For now, let's consider a single user (or forget about users altogether): we will consider simple data, i.e. which ArticleIDs were read on any given dates, like below.
My data:
from datetime import date, timedelta
import pandas as pd
import numpy as np
sdate = date(2019,1,1) # start date
edate = date(2019,1,7) # end date
required_dates = pd.date_range(sdate,edate-timedelta(days=1),freq='d')
# initialize list of lists
data = [['2019-01-01', 1001], ['2019-01-03', 1121] ,['2019-01-02', 1500],
['2019-01-02', 1400],['2019-01-04', 1501],['2019-01-01', 1200],
['2019-01-04', 1201],['2019-01-04', 1551],['2019-01-05', 1400]]
# Create the pandas DataFrame
df1 = pd.DataFrame(data, columns = ['OnlyDate', 'ArticleID'])
df1.sort_values(by='OnlyDate',inplace=True)
df = pd.get_dummies(df1.set_index('OnlyDate')['ArticleID']).max(level=0)
df
1001 1121 1200 1201 1400 1500 1501 1551
2019-01-01 1 0 1 0 0 0 0 0
2019-01-02 0 0 0 0 1 1 0 0
2019-01-03 0 1 0 0 0 0 0 0
2019-01-04 0 0 0 1 0 0 1 1
2019-01-05 0 0 0 0 1 0 0 0
2019-01-06 0 0 0 0 0 0 0 0
I am not able to transform the data above so that it fits the LSTM model above.
My main objective is to get predictions of a user's interest (1) for any supplied dates, as I have a nearly fixed set of about 2200 ArticleIDs and about 500 userIDs.
What can I try next?
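One possible way to frame the table above for the LSTM example (a speculative sketch, not a definitive recipe: it mirrors split_sequence, extended to 8 indicator features per timestep, and reuses df from the snippet above):
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

def split_multivariate(values, n_steps):
    X, y = [], []
    for i in range(len(values) - n_steps):
        X.append(values[i:i + n_steps])   # n_steps consecutive days
        y.append(values[i + n_steps])     # the following day
    return np.array(X), np.array(y)

values = df.values.astype('float32')          # shape (6, 8): days x articles
n_steps = 3
X, y = split_multivariate(values, n_steps)    # X: (3, 3, 8), y: (3, 8)

model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(n_steps, values.shape[1])))
model.add(Dense(values.shape[1], activation='sigmoid'))   # one read-probability per article
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X, y, epochs=200, verbose=0)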

Custom training with my own images using tf.data

I'm new to TensorFlow and I have trouble feeding my custom data to a Keras model.
I've followed this guide: Load images, to convert my .jpg files to tf.data.
Now I have my data converted to (image_batch, label_batch). The image_batch is an EagerTensor with shape (32, 224, 224, 3) and the label_batch is an EagerTensor with shape (32, 2).
Then I found this guide: Custom training: walkthrough, but the data in that guide is converted to an EagerTensor with shape (32, 4).
I got a warning when executing this code:
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(3,)),  # input shape required
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Dense(3)
])
predictions = model(image_batch)
WARNING:tensorflow:Model was constructed with shape (None, 3) for input Tensor("dense_input:0", shape=(None, 3), dtype=float32), but it was called on an input with incompatible shape (32, 224, 224, 3).
How should I adjust my model or what should I do with my data?
EDIT:
The model now works, but with one additional problem.
When I run the following code:
print("Prediction: {}".format(tf.argmax(predictions, axis=1)))
print(" Labels: {}".format(labels_batch))
it prints:
Prediction: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
Labels: [[ True False]
[False True]
[ True False]
[False True]
[ True False]...(omitted)]
But I expected it prints something like:
Prediction: [0 1 0 1 1 1 0 1 0 1 1 0 0 0 0 0 1 1 0 1 0 0 1 0 0 0 0 1 0 0 1 0]
Labels: [2 0 2 0 0 0 1 0 2 0 0 1 1 2 2 2 1 0 1 0 1 2 0 1 1 1 1 0 2 2 0 2]
with Labels as a one-dimensional array of integers.
Is it normal that the predictions are all 1? What should I do?
Your input is 32 images of shape (224, 224, 3), not (3,). Your input shape needs to be (224, 224, 3).
I also note that your output shape looks like it is going to be (224, 224, 3) as well; this won't match your labels. You need to flatten the data at some point or do something similar.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation=tf.nn.relu, input_shape=(224, 224, 3)),  # input shape required
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2)
])
The input shape to a Dense layer should have dimensions (None, n), where None is the batch size. In your case, if you'd like to use a Dense layer, you should first use a Flatten layer, which unrolls your images to the shape (32, 224 * 224 * 3). The code should be:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.relu),
    tf.keras.layers.Dense(3)
])
For more details please see https://www.tensorflow.org/api_docs/python/tf/keras/layers/Flatten

Keras ValueError: Error when checking target: expected dense_15 to have 3 dimensions, but got array with shape (301390, 8)

I have 8 classes that I want to predict from input text. Here is my code for preprocessing the data:
num_max = 1000
tok = Tokenizer(num_words=num_max)
tok.fit_on_texts(x_train)
mat_texts = tok.texts_to_matrix(x_train,mode='count')
num_max = 1000
tok = Tokenizer(num_words=num_max)
tok.fit_on_texts(x_train)
max_len = 100
cnn_texts_seq = tok.texts_to_sequences(x_train)
print(cnn_texts_seq[0])
[12, 4, 303]
# padding the sequences
cnn_texts_mat = sequence.pad_sequences(cnn_texts_seq,maxlen=max_len)
print(cnn_texts_mat[0])
print(cnn_texts_mat.shape)
[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 12 4 303]
(301390, 100)
Below is the structure of my model which contains an embedding layer:
max_features = 20000
max_features = cnn_texts_mat.shape[1]
maxlen = 100
embedding_size = 128
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.2))
model.add(Dense(5000, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(600, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(units=y_train.shape[1], activation='softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy',
              optimizer=sgd)
Below is the model summary:
model.summary()
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_5 (Embedding) (None, 100, 128) 12800
_________________________________________________________________
dropout_13 (Dropout) (None, 100, 128) 0
_________________________________________________________________
dense_13 (Dense) (None, 100, 5000) 645000
_________________________________________________________________
dropout_14 (Dropout) (None, 100, 5000) 0
_________________________________________________________________
dense_14 (Dense) (None, 100, 600) 3000600
_________________________________________________________________
dropout_15 (Dropout) (None, 100, 600) 0
_________________________________________________________________
dense_15 (Dense) (None, 100, 8) 4808
=================================================================
Total params: 3,663,208
Trainable params: 3,663,208
Non-trainable params: 0
After this, I am getting below error when I try to run the model:
model.fit(x=cnn_texts_mat, y=y_train, epochs=2, batch_size=100)
ValueError Traceback (most recent call last)
<ipython-input-41-4b9da9914e7e> in <module>
----> 1 model.fit(x=cnn_texts_mat, y=y_train, epochs=2, batch_size=100)
~/.local/lib/python3.5/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
950 sample_weight=sample_weight,
951 class_weight=class_weight,
--> 952 batch_size=batch_size)
953 # Prepare validation data.
954 do_validation = False
~/.local/lib/python3.5/site-packages/keras/engine/training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, check_array_lengths, batch_size)
787 feed_output_shapes,
788 check_batch_axis=False, # Don't enforce the batch size.
--> 789 exception_prefix='target')
790
791 # Generate sample-wise weight values given the `sample_weight` and
~/.local/lib/python3.5/site-packages/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
126 ': expected ' + names[i] + ' to have ' +
127 str(len(shape)) + ' dimensions, but got array '
--> 128 'with shape ' + str(data_shape))
129 if not check_batch_axis:
130 data_shape = data_shape[1:]
ValueError: Error when checking target: expected dense_15 to have 3 dimensions, but got array with shape (301390, 8)
Look at the output shape of the last layer in the model summary: it is (None, 100, 8). This is not what you are looking for. The label for each sample has a shape of (8,), not (100, 8). Why did this happen? Because a Dense layer is applied on the last axis of its input, and since the output of the Embedding layer is 3D, the outputs of all the following Dense layers are 3D as well.
How to resolve this? One approach is to use a Flatten layer somewhere in your model (possibly right after the embedding layer). This way you would have a 2D output of shape (None, 8), which is what you want and is consistent with the shape of the labels.
However, note that you may end up with a very big model (i.e. too many parameters) that is highly prone to overfitting. Either reduce the number of units in the Dense layers, or alternatively use Conv1D and MaxPooling1D layers (or even RNN layers) to process the embeddings and reduce the dimensionality of the resulting tensors; using them may well increase the accuracy of the model too.
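As a rough sketch of the Conv1D + pooling route (illustrative, untuned layer sizes; GlobalMaxPooling1D is used here to collapse the timestep axis, and categorical_crossentropy since the targets are one-hot multi-class):
from keras.models import Sequential
from keras.layers import Embedding, Dropout, Dense, Conv1D, GlobalMaxPooling1D

model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.2))
model.add(Conv1D(128, 5, activation='relu'))
model.add(GlobalMaxPooling1D())                 # collapses the timestep axis: output is now 2D
model.add(Dense(600, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(y_train.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='sgd')
model.summary()   # last layer: (None, 8), consistent with targets of shape (301390, 8)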
