Using LSTM layer without Embedding

Using LSTM layer without Embedding - python

I have been trying to train a model for tf.keras.datasets.imdb using LSTM in tensorflow.
After some processing i have input -> x_train shape: (25000, 100).
Since i cannot feed it in a LSTM layer directly, i used a lambda function to change dimension of input:
model = Sequential([
Lambda(lambda x: tf.expand_dims(x,axis=-1)),
LSTM(62, dropout=0.2, recurrent_dropout=0.2, return_sequences=True),
LSTM(32),
Dense(1, activation='sigmoid')
])
But i am getting a horrible accuracy (around 55%).
But upon using embedding layer the accuracy increases to 90% using same hyper parameters:
model = Sequential([
Embedding(5000, 32),
LSTM(62, dropout=0.2, recurrent_dropout=0.2, return_sequences=True),
LSTM(32),
Dense(1, activation='sigmoid')
])
Am i doing something wrong in the first case?

Related

How to train only the last convolutional layer?

Could you help me with the code such that along with the dense layers also the last convolutional layer of Efficientnet is trained as well ?
features_url ="https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_b3/feature_vector/2"
img_shape = (299,299,3)
features_layer = hub.KerasLayer(features_url,
input_shape=img_shape)
# below commented code keeps all the cnn layers frozen, thus it does not work for me at the moment
#features_layer.trainable = False
model = tf.keras.Sequential([
features_layer,
tf.keras.layers.Dense(256, activation = 'relu'),
tf.keras.layers.Dense(64, activation = 'relu'),
tf.keras.layers.Dense(4, activation = 'softmax')
])
In addition how can I save in a variable the name of the last convolutional layer ?

CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model

I have a CNN-LSTM that looks as follows;
SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8
def DNN_Model(X_train):
model = Sequential()
model.add(TimeDistributed(
Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu', input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')
return model
I'm using this CNN-LSTM for a multivariate time series forecasting problem. the CNN-LSTM input data comes in the 4D format: [samples, subsequences, timesteps, features]. For some reason, I need TimeDistributed Layers; or I get errors like ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]. I think this has to do with the fact that Conv1D is officially not meant for time series, so to preserve time-series data shape we need to use a wrapper layer like TimeDistributed. I don't really mind using TimeDistributed layers - They're wrappers and if they make my model work I am happy. However, when I try to visualize my model with
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
The resulting visualization only shows the Sequential():
I suspect this has to do with the TimeDistributed layers and the model not being built yet. I cannot call model.summary() either - it throws ValueError: This model has not yet been built. Build the model first by calling build()or callingfit()with some data, or specify aninput_shape argument in the first layer(s) for automatic build Which is strange because I have specified the input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.
I would like a working model together with a working tf.keras.utils.plot_model function. Any explanation as to why I need TimeDistributed and why it makes the plot_model function behave weirdly would be greatly awesome.

An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper, and not the Conv1D layer:
def DNN_Model(X_train):
model = Sequential()
model.add(TimeDistributed(
Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'), input_shape=(n_subsequences, n_steps, X_train.shape[3])))
model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')
return model

Add your input layer at the beginning. Try this
def DNN_Model(X_train):
model = Sequential()
model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
model.add(TimeDistributed(
Conv1D(filters=n_filters, kernel_size=n_kernel,
activation='relu')))
model.add(TimeDistributed(Conv1D(filters=n_filters,
kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
....
Now, you can plot and get a summary properly.
DNN_Model(3).summary() # OK
tf.keras.utils.plot_model(DNN_Model(3)) # OK

Layering softmax classifier into RNN autoencoder

The paper I'm implementing is using an RNN with autoencoder to classify anomalous network data(binary classification). They first train the model unsupervised, and then they describe this process:
Next, fine-tuning training (supervised) is conducted to train the last layer of
the network using labeled samples. Implementing the fine-tuning using
supervised training criterion can further optimize the whole network. We use softmax regression layer with two channels at the top
layer
Currently, I've implemented the autoencoder:
class AnomalyDetector(Model):
def __init__(self):
super(AnomalyDetector, self).__init__()
self.encoder = tf.keras.Sequential([
layers.Dense(64, activation="relu"),
layers.Dense(32, activation="relu"),
layers.Dense(16, activation="relu"),
layers.Dense(8, activation="relu")])
self.decoder = tf.keras.Sequential([
layers.Dense(16, activation="relu"),
layers.Dense(32, activation="relu"),
layers.Dense(64, activation="relu"),
layers.Dense(79, activation='relu')
])
How do you implement the softmax regression layer in TensorFlow?
I'm having trouble understanding the process, am I supposed to add another layer to the autoencoder? Am I supposed to add another function to the class?

Just in case anyone in the future visits this -
You can create a softmax layer by changing the activation. I chose a sigmoid activation in my case since sigmoid is equivalent to a two-element softmax. As per the documentation.
class AnomalyDetector(Model):
def __init__(self):
super(AnomalyDetector, self).__init__()
self.pretrained = False
self.finished_training = False
self.encoder = tf.keras.Sequential([
layers.SimpleRNN(64, activation="relu", return_sequences=True),
layers.SimpleRNN(32, activation="relu", return_sequences=True),
layers.SimpleRNN(16, activation="relu", return_sequences=True),
layers.SimpleRNN(8, activation="relu", return_sequences=True)])
self.decoder = tf.keras.Sequential([
layers.SimpleRNN(16, activation="relu", return_sequences=True),
layers.SimpleRNN(32, activation="relu", return_sequences=True),
layers.SimpleRNN(64, activation="relu", return_sequences=True),
layers.SimpleRNN(79, activation="relu"), return_sequences=True])
layers.SimpleRNN(1, activation="sigmoid")

CNN-LSTM Timeseries input for TimeDistributed layer

I created a CNN-LSTM for survival prediction of web sessions, my training data looks as follows:
print(x_train.shape)
(288, 3, 393)
with (samples, timesteps, features) and my model:
model = Sequential()
model.add(TimeDistributed(Conv1D(128, 5, activation='relu'),
input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(TimeDistributed(MaxPooling1D()))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(64, stateful=True, return_sequences=True))
model.add(LSTM(16, stateful=True))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer=Adam(lr=0.001), loss='binary_crossentropy', metrics=['accuracy'])
However, the TimeDistributed Layer requires a minimum of 3 dimensions, how should I transform the data to get it work?
Thanks a lot!

your data are in 3d format and this is all you need to feed a conv1d or an LSTM. if your target is 2D remember to set return_sequences=False in your last LSTM cell.
using a flatten before an LSTM is a mistake because you are destroying the 3D dimensionality
pay attention also on the pooling operation in order to not have a negative time dimension to reduce (I use 'same' padding in the convolution above in order to avoid this)
below is an example in a binary classification task
n_sample, time_step, n_features = 288, 3, 393
X = np.random.uniform(0,1, (n_sample, time_step, n_features))
y = np.random.randint(0,2, n_sample)
model = Sequential()
model.add(Conv1D(128, 5, padding='same', activation='relu',
input_shape=(time_step, n_features)))
model.add(MaxPooling1D())
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(16, return_sequences=False))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X,y, epochs=3)

TensorFlow: How do I use make a convolutional layer for tabular (1-D) features?

Using TensorFlow in Python, I am making a neural network that has a 1 dimensional array as input. I would like to add a convolutional layer to the network, but can't seem to get it to work.
My training data looks something like this:
n_samples = 20
length_feature = 10
features = np.random.random((n_samples, length_feature))
labels = np.array([1 if sum(e)>5 else 0 for e in features])
If I make a neural network like this one
model = keras.Sequential([
keras.layers.Dense(10, activation='relu', input_shape=(length_feature, )),
keras.layers.Dense(2, activation='softmax')
])
model.compile(optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy'])
history = model.fit(features, labels, batch_size=5, validation_split = 0.2, epochs=10)
and this works just fine. But if I add a convolutional layer like this
model = keras.Sequential([
keras.layers.Dense(10, activation='relu', input_shape=(length_feature, )),
keras.layers.Conv1D(kernel_size = 3, filters = 2),
keras.layers.Dense(2, activation='softmax')
])
then I get the error
ValueError: Input 0 of layer conv1d_4 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 10]
How can I add a convolutional layer to my neural network?

Conv1D expects a 3D output( batch_size, width, channels). But the dense layers produces a 2D output. Simply change your model to the following,
model = keras.Sequential([
keras.layers.Dense(10, activation='relu', input_shape=(length_feature, )),
keras.layers.Lambda(lambda x: K.expand_dims(x, axis=-1))
keras.layers.Conv1D(kernel_size = 3, filters = 2),
keras.layers.Dense(2, activation='softmax')
])
Where K is either keras.backend or tf.keras.backend depending on which one you used to get layers.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Using LSTM layer without Embedding - python

Related

How to train only the last convolutional layer?

CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model

Layering softmax classifier into RNN autoencoder

CNN-LSTM Timeseries input for TimeDistributed layer

TensorFlow: How do I use make a convolutional layer for tabular (1-D) features?

Categories

Resources