How to feed an LSTM model in Keras (Python)?

I have read about LSTMs and I know that the algorithm takes the values of the previous words into account when predicting the next word.
Now I am trying to apply my first LSTM algorithm.
I have this code.
model = Sequential()
model.add(LSTM(units=6, input_shape = (X_train_count.shape[0], X_train_count.shape[1]), return_sequences = True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=ytrain.shape[1], return_sequences=True, name='output'))
model.compile(loss='cosine_proximity', optimizer='sgd', metrics = ['accuracy'])
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['acc'])
model.summary()
cp=ModelCheckpoint('model_cnn.hdf5',monitor='val_acc',verbose=1,save_best_only=True)
history = model.fit(X_train_count, ytrain,
                    epochs=20,
                    verbose=False,
                    validation_data=(X_test_count, yval),
                    batch_size=10,
                    callbacks=[cp])
1- I cannot see how the LSTM would know the word sequence when my dataset is built with TF-IDF?
2- I am getting this error:
ValueError: Input 0 of layer sequential_8 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 18644]

The issue seems to be in the shape of X_train_count you are passing; the LSTM input shape is always tricky.
If your X_train_count is not 3D, then reshape it using the line below.
X_train_count = X_train_count.reshape(X_train_count.shape[0], X_train_count.shape[1], 1)
In the LSTM layer, the input_shape should be (timesteps, data_dim).
Below is an example to illustrate this.
from sklearn.feature_extraction.text import TfidfVectorizer
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
X = ["first example","one more","good morning"]
Y = ["first example","one more","good morning"]
vectorizer = TfidfVectorizer().fit(X)
tfidf_vector_X = vectorizer.transform(X).toarray()
tfidf_vector_Y = vectorizer.transform(Y).toarray()
tfidf_vector_X = tfidf_vector_X[:, :, None]
tfidf_vector_Y = tfidf_vector_Y[:, :, None]
X_train, X_test, y_train, y_test = train_test_split(tfidf_vector_X, tfidf_vector_Y, test_size = 0.2, random_state = 1)
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM
model = Sequential()
model.add(LSTM(units=6, input_shape = X_train.shape[1:], return_sequences = True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=6, return_sequences=True))
model.add(LSTM(units=1, return_sequences=True, name='output'))
model.compile(loss='cosine_proximity', optimizer='sgd', metrics = ['accuracy'])
Model Summary:
Model: "sequential_3"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_9 (LSTM) (None, 6, 6) 192
_________________________________________________________________
lstm_10 (LSTM) (None, 6, 6) 312
_________________________________________________________________
lstm_11 (LSTM) (None, 6, 6) 312
_________________________________________________________________
output (LSTM) (None, 6, 1) 32
=================================================================
Total params: 848
Trainable params: 848
Non-trainable params: 0
_________________________________________________________________
None
Here X_train is of shape (2, 6, 1)
To add to the solution, I would suggest going with dense vectors instead of the sparse vectors produced by the Tf-Idf representation, by using pre-trained models such as the Google News vectors or GloVe as the weights of an Embedding layer; this generally works better in terms of both performance and results.
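As a minimal, hypothetical sketch of that idea (assuming a fitted Keras Tokenizer called tokenizer and a local copy of the pre-trained GloVe file glove.6B.100d.txt, neither of which appears in the code above):
import numpy as np
from tensorflow.keras.layers import Embedding
embedding_dim = 100
embeddings_index = {}
# Read the pre-trained GloVe vectors into a word -> vector lookup
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        word, *coefs = line.split()
        embeddings_index[word] = np.asarray(coefs, dtype="float32")
vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in tokenizer.word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:  # words missing from GloVe stay as all-zero rows
        embedding_matrix[i] = vector
# Embedding layer initialised with the pre-trained weights and kept frozen
embedding_layer = Embedding(vocab_size, embedding_dim,
                            weights=[embedding_matrix],
                            trainable=False)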

Related

Use Tf-Idf within a Keras Model

I've read my train, test and validation sentences into train_sentences, test_sentences and val_sentences.
Then I applied a Tf-IDF vectorizer on these.
vectorizer = TfidfVectorizer(max_features=300)
vectorizer = vectorizer.fit(train_sentences)
X_train = vectorizer.transform(train_sentences)
X_val = vectorizer.transform(val_sentences)
X_test = vectorizer.transform(test_sentences)
And my model looks like this
model = Sequential()
model.add(Input(????))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(8, activation='sigmoid'))
model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Normally we pass an embedding matrix to the Embedding layer in the case of word2vec.
How should I use Tf-IDF in a Keras model? Please provide me with an example.
Thanks.
I cannot imagine a good reason for combining TF/IDF values with embedding vectors, but here is a possible solution: use the functional API, multiple Inputs and the concatenate function.
To concatenate layer outputs, their shapes must be aligned (except for the axis that is being concatenated). One method is to average the embeddings and then concatenate them with the vector of TF/IDF values.
Setting up, and some sample data
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_20newsgroups
import numpy as np
import keras
from keras.models import Model
from keras.layers import Dense, Activation, concatenate, Embedding, Input
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
# some sample training data
bunch = fetch_20newsgroups()
all_sentences = []
for document in bunch.data:
    sentences = document.split("\n")
    all_sentences.extend(sentences)
all_sentences = all_sentences[:1000]
X_train, X_test = train_test_split(all_sentences, test_size=0.1)
len(X_train), len(X_test)
vectorizer = TfidfVectorizer(max_features=300)
vectorizer = vectorizer.fit(X_train)
df_train = vectorizer.transform(X_train)
tokenizer = Tokenizer()
tokenizer.fit_on_texts(X_train)
maxlen = 50
sequences_train = tokenizer.texts_to_sequences(X_train)
sequences_train = pad_sequences(sequences_train, maxlen=maxlen)
Model definition
vocab_size = len(tokenizer.word_index) + 1
embedding_size = 300
input_tfidf = Input(shape=(300,))
input_text = Input(shape=(maxlen,))
embedding = Embedding(vocab_size, embedding_size, input_length=maxlen)(input_text)
# this averaging method taken from:
# https://stackoverflow.com/a/54217709/1987598
mean_embedding = keras.layers.Lambda(lambda x: keras.backend.mean(x, axis=1))(embedding)
concatenated = concatenate([input_tfidf, mean_embedding])
dense1 = Dense(256, activation='relu')(concatenated)
dense2 = Dense(32, activation='relu')(dense1)
dense3 = Dense(8, activation='sigmoid')(dense2)
model = Model(inputs=[input_tfidf, input_text], outputs=dense3)
model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Model Summary Output
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_11 (InputLayer) (None, 50) 0
__________________________________________________________________________________________________
embedding_5 (Embedding) (None, 50, 300) 633900 input_11[0][0]
__________________________________________________________________________________________________
input_10 (InputLayer) (None, 300) 0
__________________________________________________________________________________________________
lambda_1 (Lambda) (None, 300) 0 embedding_5[0][0]
__________________________________________________________________________________________________
concatenate_4 (Concatenate) (None, 600) 0 input_10[0][0]
lambda_1[0][0]
__________________________________________________________________________________________________
dense_5 (Dense) (None, 256) 153856 concatenate_4[0][0]
__________________________________________________________________________________________________
dense_6 (Dense) (None, 32) 8224 dense_5[0][0]
__________________________________________________________________________________________________
dense_7 (Dense) (None, 8) 264 dense_6[0][0]
==================================================================================================
Total params: 796,244
Trainable params: 796,244
Non-trainable params: 0
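Training then needs both inputs passed as a list, in the same order as the inputs argument. As a sketch (y_train below is an assumed (num_samples, 8) target array; it is not defined in the code above):
# df_train is the sparse TF/IDF matrix returned by the vectorizer,
# so convert it to a dense array before feeding it to Keras.
history = model.fit([df_train.toarray(), sequences_train], y_train,
                    epochs=5, batch_size=32, validation_split=0.1)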

How to define input shapes in a Sequential Keras model

Please help me define appropriate Dense input shapes in Keras models. Maybe I have to reshape my data first. I have a data set with the dimensions shown below:
Data shapes are X_train: (2858, 2037) y_train: (2858, 1) X_test: (715, 2037) y_test: (715, 1)
Number of features (input shape) is 2037
I want to define a Sequential Keras model like this:
batch_size = 128
num_classes = 2
epochs = 20
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(X_input_shape,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.summary()
model.compile(loss='binary_crossentropy',
              optimizer=RMSprop(),
              from_logits=True,
              metrics=['accuracy'])
Model summary:
Layer (type) Output Shape Param #
=================================================================
dense_20 (Dense) (None, 512) 1043456
_________________________________________________________________
dropout_12 (Dropout) (None, 512) 0
_________________________________________________________________
dense_21 (Dense) (None, 512) 262656
=================================================================
Total params: 1,306,112
Trainable params: 1,306,112
Non-trainable params: 0
And when I try to fit it...
history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test))
I got an error:
ValueError: Error when checking target: expected dense_21 to have shape (512,) but got array with shape (1,)
Modify
model.add(Dense(512, activation='relu'))
to
model.add(Dense(1, activation='relu'))
The output shape needs to be of size 1, the same as y_train.shape[1].
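A minimal corrected sketch putting this together (note: the sigmoid on the final unit is an assumption on my part, since it is the usual pairing with binary_crossentropy; the shape fix itself is just the Dense(1) output):
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(2037,)))  # 2037 input features
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))  # one output unit, matching y_train.shape[1]
model.compile(loss='binary_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])
model.summary()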

How to fit list of numpy array into LSTM Neural Network?

I am new to Keras so I really appreciate any help here. For my project, I am trying to train a neural network on multiple time series. I got it to work by running a for loop that fits each time series to the model. The code looks like this:
for i in range(len(train)):
    history = model.fit(train_X[i], train_Y[i], epochs=20, batch_size=6, verbose=0, shuffle=True)
If I am not wrong, I am doing online training here. Now I'm trying to do batch training to see if the result is better. I tried to fit a list consisting of all time series (each converted into a numpy array), but I get this error:
Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 56 arrays:
Here is the info about the dataset and the model:
model = Sequential()
model.add(LSTM(1, input_shape=(1,16),return_sequences=True))
model.add(Flatten())
model.add(Dense(1, activation='tanh'))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
model.summary()
Layer (type) Output Shape Param #
=================================================================
lstm_2 (LSTM) (None, 1, 1) 72
_________________________________________________________________
flatten_2 (Flatten) (None, 1) 0
_________________________________________________________________
dense_2 (Dense) (None, 1) 2
=================================================================
Total params: 74
Trainable params: 74
Non-trainable params: 0
print(len(train_X), train_X[0].shape, len(train_Y), train_Y[0].shape)
56 (1, 23, 16) 56 (1, 23, 1)
Here is the block of code that gives me the error:
pyplot.figure(figsize=(16, 25))
history = model.fit(train_X, train_Y, epochs=1, verbose=1, shuffle=False, batch_size = len(train_X))
The input shape of an LSTM should be (batch_size, timesteps, features). But we don't need to mention batch_size in input_shape; if you want to, you can use batch_input_shape instead.
model = Sequential()
model.add(LSTM(1, input_shape=(23,16),return_sequences=True))
# model.add(Flatten())
model.add(Dense(1, activation='tanh'))
model.compile(loss='mae', optimizer='adam', metrics=['accuracy'])
model.summary()
X = np.random.random((56,1, 23, 16))
y = np.random.random((56,1, 23, 1))
X=np.squeeze(X,axis =1) #as input shape should be (`batch_size`, `timesteps`, `features`)
y = np.squeeze(y,axis =1)
model.fit(X,y,epochs=1, verbose=1, shuffle=False, batch_size = len(X))
I am not sure if it serves your purpose.
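If the goal is a single batch update over all 56 series, one option (assuming each element of train_X has shape (1, 23, 16) and each element of train_Y has shape (1, 23, 1), as printed above) is to stack the lists into single arrays first:
import numpy as np
# Stack the 56 per-series arrays into (56, 23, 16) and (56, 23, 1)
train_X_all = np.concatenate(train_X, axis=0)
train_Y_all = np.concatenate(train_Y, axis=0)
history = model.fit(train_X_all, train_Y_all, epochs=1, verbose=1,
                    shuffle=False, batch_size=len(train_X_all))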

Keras LSTM multiclass classification

I have this code that works for binary classification. I have tested it on the Keras imdb dataset.
model = Sequential()
model.add(Embedding(5000, 32, input_length=500))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=3, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
I need the above code to be converted for multi-class classification where there are 7 categories in total. From what I understand after reading a few articles, to convert the above code I have to change
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Obviously changing just the above two lines doesn't work. What else do I have to change to make the code work for multiclass classification? Also, I think I have to change the classes to one-hot encoding, but I don't know how to do that in Keras.
Yes, you need one-hot targets. You can use to_categorical to encode your target, or take a shorter way:
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
here is the full code:
from keras.models import Sequential
from keras.layers import *
model = Sequential()
model.add(Embedding(5000, 32, input_length=500))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(7, activation='softmax'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
Summary
Using TensorFlow backend.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 500, 32) 160000
_________________________________________________________________
lstm_1 (LSTM) (None, 100) 53200
_________________________________________________________________
dense_1 (Dense) (None, 7) 707
=================================================================
Total params: 213,907
Trainable params: 213,907
Non-trainable params: 0
_________________________________________________________________
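If you prefer categorical_crossentropy instead of the sparse variant, a short sketch of one-hot encoding the targets (assuming y_train and y_test hold integer class ids 0-6 for your 7 categories) would be:
from keras.utils import to_categorical
y_train_onehot = to_categorical(y_train, num_classes=7)  # shape (n_samples, 7)
y_test_onehot = to_categorical(y_test, num_classes=7)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train_onehot, epochs=3, batch_size=64)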

Error when running ANN with recurrent layer

I have created the ANN below with 2 fully connected layers and one recurrent layer. However, when running it I get the error: Exception: Input 0 is incompatible with layer lstm_11: expected ndim=3, found ndim=2. Why is this happening?
from keras.models import Sequential
from keras.layers import Dense
from sklearn.cross_validation import train_test_split
import numpy
from sklearn.preprocessing import StandardScaler
from keras.layers import LSTM
seed = 7
numpy.random.seed(seed)
dataset = numpy.loadtxt("sorted_output.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:15]
scaler = StandardScaler(copy=True, with_mean=True, with_std=True ) #data normalization
X = scaler.fit_transform(X) #data normalization
Y = dataset[:,15]
# split into 67% for train and 33% for test
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
# create model
model = Sequential()
model.add(Dense(12, input_dim=15, init='uniform', activation='relu'))
model.add(LSTM(10, return_sequences=True))
model.add(Dense(15, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X_train, y_train, validation_data=(X_test,y_test), nb_epoch=150, batch_size=10)
Based on the above, all of your other layers are Dense layers, yet in the LSTM you are returning the full sequence (return_sequences=True). Since the surrounding layers are all Dense, you should set return_sequences=False and it should work.
The reason for this error is that LSTM expects its input to have 3 dimensions (batch, sequence length and input dimension), but the Dense layer before it outputs a shape with only 2 dimensions (batch and output dimension).
You can see the output shape of the Dense layer by executing the lines of code below:
>>> model = Sequential()
>>> model.add(Dense(12, input_dim=15, init='uniform', activation='relu'))
>>> model.summary()
____________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
====================================================================================================
dense_4 (Dense) (None, 12) 192 dense_input_2[0][0]
====================================================================================================
Total params: 192
Trainable params: 192
Non-trainable params: 0
____________________________________________________________________________________________________
However, you don't explain your intention with the model, so I cannot give you further guidance on this issue. What is your input data? Do you expect the input to be a sequence?
If your input is a sequence, then I suggest you remove the first Dense layer. But if your input is not a sequence, then I suggest you remove the LSTM layer.
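For example, if the 15 columns really are 15 time steps of a single feature (an assumption about your data, not something stated in the question), a sketch that keeps the LSTM could look like this:
from keras.models import Sequential
from keras.layers import Dense, LSTM
# Reshape (samples, 15) -> (samples, 15, 1) so the LSTM receives 3D input
X_train_seq = X_train.reshape(X_train.shape[0], 15, 1)
X_test_seq = X_test.reshape(X_test.shape[0], 15, 1)
model = Sequential()
model.add(LSTM(10, input_shape=(15, 1)))  # return_sequences defaults to False
model.add(Dense(15, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train_seq, y_train, validation_data=(X_test_seq, y_test), epochs=150, batch_size=10)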
