I am predicting network intrusion attacks on the CICIDS2017 dataset using a trained LSTM model with the following architecture:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 30) 13080
dropout (Dropout) (None, 30) 0
softmax (Dense) (None, 2) 62
=================================================================
Total params: 13,142
Trainable params: 13,142
Non-trainable params: 0
_________________________________________________________________
I applied MinMaxScaler() for normalization and SMOTE to fix the class imbalance in the dataset.
Now, when I predict on a single sample, I always get label "1", which corresponds to the Attack category.
I have already tried checking and verifying the features, but it didn't help.
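For reference, a common cause of this symptom is a preprocessing mismatch at inference time. Here is a minimal sketch (the names scaler, model, timesteps and n_features are assumptions, not from the question): the single sample must be transformed with the same MinMaxScaler that was fit on the training data, and reshaped into a batch of one before calling predict():

import numpy as np

def predict_single(model, scaler, sample, timesteps, n_features):
    # sample: 1-D array of raw (unscaled) feature values
    x = scaler.transform(sample.reshape(1, -1))   # reuse the training-time scaling
    x = x.reshape(1, timesteps, n_features)       # batch of one for the LSTM
    prob = model.predict(x)                       # shape (1, 2) for the softmax head
    return int(np.argmax(prob, axis=-1)[0])       # 0 = Benign, 1 = Attack

Note also that SMOTE should only ever be applied to the training split; re-fitting the scaler on the single sample, or resampling at inference time, is a frequent source of exactly this always-one-class behaviour.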
I started building my own LSTM neural networks a few weeks ago, and there is a problem I can't solve.
I built a single-hidden-layer LSTM of this shape:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm (LSTM) (None, 36) 14976
_________________________________________________________________
dense (Dense) (None, 1) 37
=================================================================
Total params: 15,013
Trainable params: 15,013
Non-trainable params: 0
_________________________________________________________________
I try to predict an event (0 or 1) at time t using 34 explanatory variables observed at t and 35 explanatory variables observed at t-1 (including the target at t-1). The problem is that at time t I am not supposed to know the value of the target at t-1. I would like to use the prediction made at t-1 as an input feature for my prediction at t, but I don't know how to translate this into Python.
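One way to translate this into Python is a step-by-step loop that feeds each prediction back in as a feature for the next step. A minimal sketch, assuming model is the trained network above with a single timestep per sample, X_test holds the 69 explanatory variables per step, and target_idx is the (hypothetical) column index of the t-1 target:

prev_pred = y_train[-1]            # seed with the last known target value
preds = []
for t in range(len(X_test)):
    x_t = X_test[t].copy()         # 34 variables at t plus 35 observed at t-1
    x_t[target_idx] = prev_pred    # replace the unknown t-1 target with our last prediction
    x_t = x_t.reshape(1, 1, -1)    # (batch=1, timesteps=1, features) for the LSTM
    prev_pred = float(model.predict(x_t)[0, 0] > 0.5)   # sigmoid output -> 0/1
    preds.append(prev_pred)

The trade-off is that predict() is called once per step instead of once per dataset, which is slower but is the standard way to do autoregressive prediction when the true target is unavailable.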
I'm working on a project and I need to make my CNN output like the output of the "Flatten" layer.
No classification, just a vector of the input photo's features, and I'm kind of lost... I know everything about CNN structure, but how can I start doing this in Python?
Another alternative is the following.
Imagine you have a tf.keras model (here I use a small one for the sake of simplicity).
>>> model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 128) 100480
_________________________________________________________________
dense_2 (Dense) (None, 64) 8256
_________________________________________________________________
dense_3 (Dense) (None, 32) 2080
_________________________________________________________________
dense_4 (Dense) (None, 1) 33
=================================================================
Total params: 110,849
Trainable params: 110,849
Non-trainable params: 0
_________________________________________________________________
Let's say you want a 32-long feature vector, corresponding to the layer dense_3.
Now you can create another object:
outputs = model.get_layer('dense_3').output
child_model = tf.keras.Model(inputs=model.inputs, outputs=outputs)
and your child model does what you want. You need to call
features = child_model.predict(image)
Important note: if you print model.layers and child_model.layers, you will not be surprised to see that they share the same layer objects at the same memory addresses. This means that training one will also update the layer weights of the other.
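Putting it all together, here is a self-contained sketch of the pattern. The layer sizes match the toy summary above; the 784-dimensional input and the dummy data are assumptions:

import numpy as np
import tensorflow as tf

# Rebuild the toy model so the snippet runs on its own.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,), name='dense_1'),
    tf.keras.layers.Dense(64, activation='relu', name='dense_2'),
    tf.keras.layers.Dense(32, activation='relu', name='dense_3'),
    tf.keras.layers.Dense(1, activation='sigmoid', name='dense_4'),
])

# Child model that stops at dense_3, so it outputs 32-long feature vectors.
outputs = model.get_layer('dense_3').output
child_model = tf.keras.Model(inputs=model.inputs, outputs=outputs)

x = np.random.rand(5, 784).astype('float32')   # five dummy inputs
features = child_model.predict(x)
print(features.shape)                          # (5, 32)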
Are you using Keras? If you are, you can take a look here for examples (https://keras.io/api/applications/#extract-features-with-vgg16).
Basically, you do
features = model.predict(x)
np.save(outfile, features)  # outfile is your desired output filename
You can load the file back using
features = np.load(outfile)
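For reference, the linked Keras docs example looks roughly like this. With include_top=False, predict() returns convolutional feature maps instead of class scores; 'elephant.jpg' is just a placeholder image path:

import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image

model = VGG16(weights='imagenet', include_top=False)   # drop the classifier head

img = image.load_img('elephant.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = preprocess_input(np.expand_dims(x, axis=0))        # add batch dim + VGG preprocessing

features = model.predict(x)                            # shape (1, 7, 7, 512)
np.save('features.npy', features)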
I have a simple multi-layer perceptron for an MNIST classification problem.
model = tf.keras.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(10, activation='softmax')
])
When printing the summary, I receive the following output:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_8 (Flatten) (None, 784) 0
_________________________________________________________________
dense_16 (Dense) (None, 128) 100480
_________________________________________________________________
dense_17 (Dense) (None, 10) 1290
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
How do I interpret the output shape printed in the summary? Why is there a None term in the output shape tuple? Why is it not just (784) in the first layer?
The "None" value refers to the number of input samples (the batch size). To allow you to train on different sized training sets, this value is None. If it were a number, let's say 50 for example, that means you can only train on exactly 50 samples which is usually not very useful (but does occasionally have applications).
I want to fine-tune EfficientNet using tf.keras (TensorFlow 2.3), but I cannot change the trainable status of the layers properly. My model looks like this:
data_augmentation_layers = tf.keras.Sequential([
    keras.layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    keras.layers.experimental.preprocessing.RandomRotation(0.8)])

efficientnet = EfficientNetB3(weights="imagenet", include_top=False,
                              input_shape=(*img_size, 3))

# Setting to not trainable as described in the standard Keras FAQ
efficientnet.trainable = False

inputs = keras.layers.Input(shape=(*img_size, 3))
augmented = data_augmentation_layers(inputs)  # was `augmentation_layers`, a NameError
base = efficientnet(augmented, training=False)
pooling = keras.layers.GlobalAveragePooling2D()(base)
outputs = keras.layers.Dense(5, activation="softmax")(pooling)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(loss="categorical_crossentropy", optimizer=keras_opt,
              metrics=["categorical_accuracy"])
This is done so that the random weights of my custom top do not immediately destroy the pretrained weights.
Model: "functional_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_2 (InputLayer) [(None, 512, 512, 3)] 0
_________________________________________________________________
sequential (Sequential) (None, 512, 512, 3) 0
_________________________________________________________________
efficientnetb3 (Functional) (None, 16, 16, 1536) 10783535
_________________________________________________________________
global_average_pooling2d (Gl (None, 1536) 0
_________________________________________________________________
dense (Dense) (None, 5) 7685
=================================================================
Total params: 10,791,220
Trainable params: 7,685
Non-trainable params: 10,783,535
Everything seems to work until this point. I train my model for 2 epochs, and then I want to start fine-tuning the EfficientNet base. Thus I call:
for l in model.get_layer("efficientnetb3").layers:
    if not isinstance(l, keras.layers.BatchNormalization):
        l.trainable = True

model.compile(loss="categorical_crossentropy", optimizer=keras_opt,
              metrics=["categorical_accuracy"])
I recompiled and printed the summary again, only to see that the number of non-trainable weights remained the same. Also, fitting does not bring better results than keeping the base frozen.
dense (Dense) (None, 5) 7685
=================================================================
Total params: 10,791,220
Trainable params: 7,685
Non-trainable params: 10,783,535
PS: I also tried efficientnet.trainable = True, but this also had no effect.
Could it be that it has something to do with the fact that I'm using a Sequential and a functional model at the same time?
For me the problem was using the Sequential API for part of the model. When I switched that part to the functional API, model.summary() displayed all the sublayers, and it was possible to set some of them as trainable and others not.
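Another commonly suggested fix, since trainable is recursive in tf.keras: unfreeze the nested base model first, then selectively re-freeze its BatchNormalization layers, and recompile so the change takes effect. A sketch, assuming model and the keras import from the question are in scope (the Adam learning rate is an assumption standing in for the original keras_opt):

base = model.get_layer("efficientnetb3")
base.trainable = True                      # recursive: unfreezes all sublayers first
for l in base.layers:
    if isinstance(l, keras.layers.BatchNormalization):
        l.trainable = False                # then selectively re-freeze BatchNorm

model.compile(loss="categorical_crossentropy",
              optimizer=keras.optimizers.Adam(1e-5),   # a low LR is typical for fine-tuning
              metrics=["categorical_accuracy"])
model.summary()                            # non-trainable params should now drop

Setting sublayers to trainable while the enclosing model is still frozen has no effect, because a layer's weights only count as trainable when the layer and all of its parent containers are trainable.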
I am working on a text classification problem. My model looks like this:
Model: "sequential_6"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_6 (Embedding) (None, 100, 50) 676050
_________________________________________________________________
lstm_6 (LSTM) (None, 16) 4288
_________________________________________________________________
dropout_1 (Dropout) (None, 16) 0
_________________________________________________________________
dense_6 (Dense) (None, 3) 51
=================================================================
Total params: 680,389
Trainable params: 680,389
Non-trainable params: 0
_________________________________________________________________
The dataset contains around 5,300 sentences. I am using validation_split=0.33.
The model behaves in an abnormal way: the validation loss keeps increasing while the validation accuracy stays roughly constant. I am attaching the graph.
Please guide me on how to solve this issue.
The model code looks like this:
model = Sequential()
model.add(Embedding(
    num_words,
    EMBEDDING_DIM,
    input_length=MAX_SEQUENCE_LENGTH
))
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(0.5))
model.add(GlobalMaxPool1D())
model.add(Dense(len(possible_labels), activation="softmax"))
I am also attaching the accuracy graph.
Increase dropout.
Train for fewer epochs.
Try Conv1D instead of LSTM to see if the overfitting goes away.
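A minimal sketch of the Conv1D variant suggested in the last point (num_words, EMBEDDING_DIM, MAX_SEQUENCE_LENGTH, and possible_labels are assumed to be defined as in the question; the compile settings are assumptions as well):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, Dropout,
                                     GlobalMaxPool1D, Dense)

model = Sequential([
    Embedding(num_words, EMBEDDING_DIM, input_length=MAX_SEQUENCE_LENGTH),
    Conv1D(32, kernel_size=5, activation="relu"),   # replaces the LSTM(32)
    Dropout(0.5),                                   # consider raising this, e.g. 0.6
    GlobalMaxPool1D(),
    Dense(len(possible_labels), activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])

If the validation loss stops diverging with the Conv1D version, the LSTM was likely overfitting the fairly small (~5,300-sentence) dataset rather than anything being wrong with the training setup.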