Keras Sequential model input: How significant are the dimensions?

I am trying to build a multioutput classifier on 3D data structured like [sampleID, timestamp, deviceID, sensorID] with one-hot labels like [sampleID, deviceID] to determine which device "wins".
In a nutshell, it is a massive collection of timeseries readings from five sensors taken at regular intervals from each of four different devices. The objective is to determine which of the devices is most likely to be in a particular state at the end of each sampleID. The labels are a one-hot representation of the devices.
In a case like this, where a human would find meaning in the structure of the dataset, does the training process derive similar benefit? Can I simplify my dataset by reducing it to [sampleID, deviceID, timestamp X sensor] or even [sampleID, deviceID X timestamp X sensor] and still get similar accuracy?
In other words, would simplifying the following dataset:
[10000, 1000, 4, 5]
down to
[10000, 4, 5000]
or
[10000, 1000, 20]
or even
[10000, 20000]
significantly diminish the model's ability to classify output?
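For concreteness, each of these candidate layouts is just a reshape (plus, for the per-device case, a transpose) of the original array. A quick numpy sketch with random stand-in data:
import numpy as np

X = np.random.rand(10000, 1000, 4, 5)  # [sampleID, timestamp, deviceID, sensorID]

X_a = X.transpose(0, 2, 1, 3).reshape(10000, 4, 5000)  # [sampleID, deviceID, timestamp X sensor]
X_b = X.reshape(10000, 1000, 20)                        # [sampleID, timestamp, deviceID X sensorID]
X_c = X.reshape(10000, 20000)                           # [sampleID, everything flattened]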

IIUC, you are asking whether using 1000 timesteps for 20 objects (device X sensor) is better than using 1000 timesteps for 4 devices x 5 sensors.
There is no way of knowing for certain which would better model your problem, but we can quickly build some tests to see which model captures the complexity of the problem better.
Case 1: 1000 time steps, 20 objects -> Sequential LSTM based model
If you consider the 20 sensors individually, you can simply use an LSTM-based model and let the model handle the non-linear relationships between them. Since you have a 2D input per sample, simply reshape your data and build a model with the following structure. Feel free to add more layers, activations, etc.
from tensorflow.keras import layers, Model, utils
#Temporal model
inp = layers.Input((1000,20))
x = layers.LSTM(30, return_sequences=True)(inp)
x = layers.LSTM(30)(x)
out = layers.Dense(4, activation='softmax')(x)
model = Model(inp, out)
model.summary()
Model: "model_4"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_6 (InputLayer) [(None, 1000, 20)] 0
_________________________________________________________________
lstm_4 (LSTM) (None, 1000, 30) 6120
_________________________________________________________________
lstm_5 (LSTM) (None, 30) 7320
_________________________________________________________________
dense_20 (Dense) (None, 4) 124
=================================================================
Total params: 13,564
Trainable params: 13,564
Non-trainable params: 0
_________________________________________________________________
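To compare the two cases on equal footing, you can run a quick smoke test on random data in the stated shapes; a minimal sketch (all values below are placeholders):
import numpy as np

X = np.random.rand(32, 1000, 20).astype("float32")  # 32 samples, 1000 timesteps, 20 streams
y = np.eye(4)[np.random.randint(0, 4, size=32)]     # one-hot "winning device" labels

model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=1, batch_size=8)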
Case 2: 1000 time steps, 4x5 objects -> Conv-LSTM based model
Since you have a 3D input per sample, you want to treat the 4x5 as your spatial axes and the 1000 as your channels/feature maps/temporal features. Since the data is channels-first, do specify data_format="channels_first" in the Conv2D as well as the MaxPooling2D layers.
Then, once you have convolved over the spatial axes, you can start working on the feature maps with an LSTM. Sample code below, feel free to modify and build on top of this.
from tensorflow.keras import layers, Model, utils
#Conv-LSTM model
inp = layers.Input((1000,4,5))
x = layers.Conv2D(30,2, data_format="channels_first")(inp)
x = layers.MaxPooling2D(2, data_format="channels_first")(x)
x = layers.Reshape((-1,2))(x)
x = layers.LSTM(20)(x)
out = layers.Dense(4, activation='softmax')(x)
model = Model(inp, out)
model.summary()
Model: "model_21"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_25 (InputLayer) [(None, 1000, 4, 5)] 0
_________________________________________________________________
conv2d_19 (Conv2D) (None, 30, 3, 4) 120030
_________________________________________________________________
max_pooling2d_14 (MaxPooling (None, 30, 1, 2) 0
_________________________________________________________________
reshape_10 (Reshape) (None, 30, 2) 0
_________________________________________________________________
lstm_19 (LSTM) (None, 20) 1840
_________________________________________________________________
dense_30 (Dense) (None, 4) 84
=================================================================
Total params: 121,954
Trainable params: 121,954
Non-trainable params: 0
_________________________________________________________________
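The same kind of sanity check works here; a minimal sketch on random stand-in data, just to confirm the output shape:
import numpy as np

X = np.random.rand(8, 1000, 4, 5).astype("float32")  # (samples, channels, devices, sensors)
print(model(X).shape)  # expected: (8, 4), a softmax over the 4 devices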

Related

How can i make my CNN output a vector of features

I'm working on a project and I need to make my CNN output like the output of the "Flatten" layer: no classification, just a vector of input photo features. I'm kind of lost... I know everything about CNN structure, but how can I start doing this with Python?
Another alternative is the following.
Imagine you have a tf Keras model (here I take a small one for the sake of simplicity).
>>> model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 128) 100480
_________________________________________________________________
dense_2 (Dense) (None, 64) 8256
_________________________________________________________________
dense_3 (Dense) (None, 32) 2080
_________________________________________________________________
dense_4 (Dense) (None, 1) 33
=================================================================
Total params: 110,849
Trainable params: 110,849
Non-trainable params: 0
_________________________________________________________________
Let's say you want a 32-dimensional feature vector, corresponding to the layer dense_3.
Now you can create another model object:
import tensorflow as tf

outputs = model.get_layer('dense_3').output
child_model = tf.keras.Model(inputs=model.inputs, outputs=outputs)
and your child model does what you want. You need to call
features = child_model.predict(image)
Important note: if you print model.layers and child_model.layers, you will not be surprised to see they share the same layers at the same memory addresses. This means training one will update the layer weights of the other.
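A quick way to see this sharing in action (a minimal check, assuming the models above):
# both handles point to the same layer object, so the weights are shared
print(model.get_layer('dense_3') is child_model.get_layer('dense_3'))  # True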
Are you using Keras? If you are, you can take a look here for examples (https://keras.io/api/applications/#extract-features-with-vgg16).
Basically, you do
features = model.predict(x)
np.save(outfile, features) # outfile is your desired output filename
you can load back the file using
features = np.load(outfile)

Cannot resolve error in keras sequential model

I am working on a gesture recognition problem, for which I have a train set. The train set consists of multiple folders, and each folder contains a series of 30 images from which the model is trained. I also have a CSV file that contains the class label of each folder. The class labels are: "Left Swipe", "Right Swipe", "Stop", "Thumbs Down" and "Thumbs Up". Those labels are present in one np.array variable train_class. I have created a CNN model and am feeding it into a Sequential model.
The code is available at the GitHub location below:
https://github.com/subhrajyoti-ghosh/ML-and-Deep-Learning/blob/main/Gesture_Recognition.ipynb
But when I try to fit the model, I receive an error. Can you please help me understand the error and how to solve it?
You are trying to use a TimeDistributed layer on a 2D input (batch_size, 256), which will not work, because the layer needs at least a 3D tensor. You should try using tf.keras.layers.RepeatVector:
import tensorflow as tf
resnet = tf.keras.applications.ResNet50(include_top=False,weights='imagenet',input_shape=(224,224,3))
cnn = tf.keras.Sequential([resnet])
cnn.add(tf.keras.layers.Conv2D(64,(2,2),strides=(1,1)))
cnn.add(tf.keras.layers.Conv2D(16,(3,3),strides=(1,1)))
cnn.add(tf.keras.layers.Flatten())
inputs = tf.keras.layers.Input(shape=(224,224,3))
x = cnn(inputs)
x = tf.keras.layers.RepeatVector(n=30)(x)
x = tf.keras.layers.GRU(16,return_sequences=True)(x)
x = tf.keras.layers.GRU(8)(x)
outputs = tf.keras.layers.Dense(5,activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
dummy_x = tf.random.normal((1, 224,224,3))
print(model.summary())
print(model(dummy_x))
Model: "model_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_14 (InputLayer) [(None, 224, 224, 3)] 0
sequential_6 (Sequential) (None, 256) 24121296
repeat_vector_2 (RepeatVector) (None, 30, 256) 0
gru_5 (GRU) (None, 30, 16) 13152
gru_6 (GRU) (None, 8) 624
dense_7 (Dense) (None, 5) 45
=================================================================
Total params: 24,135,117
Trainable params: 24,081,997
Non-trainable params: 53,120
_________________________________________________________________
None

Access to last convolutional layer in transfer learning

I'm trying to get some heatmaps from a computer vision model that's already working to classify images, but I'm running into some difficulties.
This is the model summary:
model.summary()
Model: "model_4"
Layer (type) Output Shape Param #
=================================================================
input_9 (InputLayer) [(None, 512, 512, 1)] 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 512, 512, 3) 30
_________________________________________________________________
densenet121 (Functional) (None, 1024) 7037504
_________________________________________________________________
dense_4 (Dense) (None, 100) 102500
_________________________________________________________________
dropout_4 (Dropout) (None, 100) 0
_________________________________________________________________
predictions (Dense) (None, 2) 202
=================================================================
Total params: 7,140,236
Trainable params: 7,056,588
Non-trainable params: 83,648
As part of the standard process to create a heatmap, I know I have to access the last convolutional layer in the model, which in this case is a layer inside DenseNet121, but I cannot find a way to access all the layers belonging to densenet121.
Right now, I've been using the conv2d_4 layer to run some tests, but I feel that is not the right way, because that layer comes before all the transfer-learning work from DenseNet.
Also, I looked for "Functional" layers in the official Keras documentation but couldn't find them, so I guess it's not a layer; it's the whole DenseNet model embedded there. But I cannot find a way to access it.
By the way, here I share the model construction because it may help to answer this:
from tensorflow.keras.applications.densenet import DenseNet121
from tensorflow.keras.layers import Input, Conv2D, Dense, Dropout
from tensorflow.keras.models import Model
num_classes = 2
input_tensor = Input(shape=(IMG_SIZE,IMG_SIZE,1))
x = Conv2D(3,(3,3), padding='same')(input_tensor)
x = DenseNet121(include_top=False, classes=2, pooling="avg", weights="imagenet")(x)
x = Dense(100)(x)
x = Dropout(0.45)(x)
predictions = Dense(num_classes, activation='softmax', name="predictions")(x)
model = Model(inputs=input_tensor, outputs=predictions)
I found you can use
.get_layer()
twice to access layers inside the functional DenseNet model embedded in the "main" model.
In this case I can use model.get_layer('densenet121').summary() to check all the layers inside the embedded model, and then use them with this code: model.get_layer('densenet121').get_layer('xxxxx')
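For example (a hedged sketch; the inner layer name below is just a typical DenseNet121 name, so use whichever name the nested summary() actually prints):
import tensorflow as tf

densenet = model.get_layer('densenet121')
densenet.summary()  # lists every layer inside the embedded model

# 'conv5_block16_concat' is one example of a late conv-stage layer name
last_conv = densenet.get_layer('conv5_block16_concat')
feature_model = tf.keras.Model(densenet.inputs, last_conv.output)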

Training an image classifier with over 300k classes

Is it possible to train an image classifier network with an enormous number of classes (say 300k), with each class having a minimum of 10 images split between train/test/validation (i.e. >3 million 250x250x3 images)?
I have tried to train the dataset using the ResNet50 model and decreasing the batch size to as low as 1, but I still run into OOM issues (2080 Ti). I have found that the OOM is caused by having too many parameters, so I resorted to training the network on an extremely basic 10-layer model with a batch size of 1. It runs, but the speed/accuracy is unsurprisingly abysmal.
Is there any way I can divide the training set into smaller sections of classes, such that:
1st .h5 = classes 1 ~ 20,000
2nd .h5 = classes 20,001 ~ 40,000
3rd .h5 = classes 40,001 ~ 60,000, etc.
and later merge them into a single .h5 file that can be loaded to recognize all 300k different classes?
EDIT PER ASHISH'S SUGGESTION:
I have (I think) successfully merged 2 models into one, but the merged model has roughly doubled in the number of layers...
Source code:
from tensorflow.keras.models import load_model, Model
from tensorflow.keras.layers import Dense, concatenate

model1 = load_model('001.h5')
model2 = load_model('002.h5')
for layer in model1.layers:
    layer._name = layer._name + "_1"  # avoid duplicate layer names, which would otherwise throw an error
    layer.trainable = False
for layer in model2.layers:
    layer._name = layer._name + "_2"
    layer.trainable = False
x1 = model1.layers[-1].output
classes = x1.shape[1]
x1 = Dense(classes, activation='relu', name='out1')(x1)
x2 = model2.layers[-1].output
x2 = Dense(x2.shape[1], activation='relu', name='out2')(x2)
classes += x2.shape[1]
x = concatenate([x1, x2])
output_layer = Dense(classes, activation='softmax', name='combined_layer')(x)
new_model = Model(inputs=[model1.inputs, model2.inputs], outputs=output_layer)
new_model.summary()
new_model.save('new_model.h5', overwrite=True)
And the resulting model looks like this:
Model: "model"
_________________________________________________________________________
Layer (type) Output Shape Param # Connected to
=========================================================================
input_1_1 (InputLayer) [(None, 224, 224, 3) 0
_________________________________________________________________________
input_1_2 (InputLayer) [(None, 224, 224, 3) 0
_________________________________________________________________________
conv1_pad_1 (ZeroPadding2D) (None, 230, 230, 3) 0 input_1_1[0][0]
_________________________________________________________________________
conv1_pad_2 (ZeroPadding2D) (None, 230, 230, 3) 0 input_1_2[0][0]
_________________________________________________________________________
conv1_conv_1 (Conv2D) (None, 112, 112, 64) 9472 conv1_pad_1[0][0]
_________________________________________________________________________
conv1_conv_2 (Conv2D) (None, 112, 112, 64) 9472 conv1_pad_2[0][0]
...
...
conv5_block3_out_1 (Activation) (None, 7, 7, 2048) 0 conv5_block3_add_1[0][0]
_________________________________________________________________________
conv5_block3_out_2 (Activation) (None, 7, 7, 2048) 0 conv5_block3_add_2[0][0]
_________________________________________________________________________
avg_pool_1 (GlobalAveragePoolin (None, 2048) 0 conv5_block3_out_1[0][0]
_________________________________________________________________________
avg_pool_2 (GlobalAveragePoolin (None, 2048) 0 conv5_block3_out_2[0][0]
_________________________________________________________________________
probs_1 (Dense) (None, 953) 1952697 avg_pool_1[0][0]
_________________________________________________________________________
probs_2 (Dense) (None, 3891) 7972659 avg_pool_2[0][0]
_________________________________________________________________________
out1 (Dense) (None, 953) 909162 probs_1[0][0]
_________________________________________________________________________
out2 (Dense) (None, 3891) 15143772 probs_2[0][0]
_________________________________________________________________________
concatenate (Concatenate) (None, 4844) 0 out1[0][0]
out2[0][0]
_________________________________________________________________________
combined_layer (Dense) (None, 4844) 23469180 concatenate[0][0]
=========================================================================
Total params: 96,622,894
Trainable params: 39,522,114
Non-trainable params: 57,100,780
As you can see, all the layers have been doubled due to Model(inputs=[model1.inputs, model2.inputs]). That will cause problems later when I want to use this model to predict images. Is there any way I can do this without doubling all the previous layers and just add the trailing dense layers? At this rate I'll be overloaded with the number of parameters even faster than before...
Technically it's possible. Since you have 3 classifiers (1.h5, 2.h5, 3.h5), you can load these models with their weights and then use the functional API in TensorFlow (https://www.tensorflow.org/guide/keras/functional), where the concatenate() API will combine the outputs of the 3 classifiers into a single vector; then use a few dense layers with an activation function to make the final prediction.
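Following up on the question's edit: a hedged sketch of the same two-model merge with a single shared input, so that only one InputLayer appears (the two backbones still stay separate, since they hold different trained weights):
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model

# model1/model2 as loaded in the question's edit; both are
# assumed to accept 224x224x3 images
inp = Input(shape=(224, 224, 3))
out1 = model1(inp)                  # (None, 953)
out2 = model2(inp)                  # (None, 3891)
merged = concatenate([out1, out2])  # (None, 4844)
output_layer = Dense(4844, activation='softmax', name='combined_layer')(merged)
new_model = Model(inp, output_layer)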

Multi-Step Forecast LSTM model

I am trying to implement a multi-step forecasting LSTM model in Keras. The shapes of the data are:
X : (5831, 48, 1)
y : (5831, 1, 12)
The model that I am trying to use is:
power_in = Input(shape=(X.shape[1], X.shape[2]))
power_lstm = LSTM(50, recurrent_dropout=0.4128, dropout=0.412563,
                  kernel_initializer=power_lstm_init,
                  return_sequences=True)(power_in)
main_out = TimeDistributed(Dense(12, kernel_initializer=power_lstm_init))(power_lstm)
While trying to train the model like this:
hist = forecaster.fit([X], y, epochs=325, batch_size=16, validation_data=([X_valid], y_valid), verbose=1, shuffle=False)
I am getting the following error:
ValueError: Error when checking target: expected time_distributed_16 to have shape (48, 12) but got array with shape (1, 12)
How to fix this?
According to your comment:
[The] data I have is like t-48, t-47, t-46, ..., t-1 as the past data and
t+1, t+2, ..., t+12 as the values that I want to forecast
you may not need to use a TimeDistributed layer at all:
First, just remove the return_sequences=True argument of the LSTM layer. After doing that, the LSTM layer will encode the input timeseries of the past into a vector of shape (50,). Now you can feed it directly to a Dense layer with 12 units:
# make sure the labels are of shape (num_samples, 12)
y = np.reshape(y, (-1, 12))

power_in = Input(shape=X.shape[1:])
power_lstm = LSTM(50, recurrent_dropout=0.4128, dropout=0.412563,
                  kernel_initializer=power_lstm_init)(power_in)
main_out = Dense(12, kernel_initializer=power_lstm_init)(power_lstm)
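For completeness, a minimal sketch of wiring this first variant into a trainable model (the optimizer and loss here are placeholder assumptions, not from the original answer):
from tensorflow.keras.models import Model

model = Model(power_in, main_out)
model.compile(optimizer='adam', loss='mse')  # placeholder choices for a regression forecast
model.fit(X, y, epochs=10, batch_size=16)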
Alternatively, if you would like to use a TimeDistributed layer, and considering that the output is a sequence itself, we can explicitly enforce this temporal dependency in our model by using another LSTM layer before the Dense layer (with the addition of a RepeatVector layer after the first LSTM layer to make its output a timeseries of length 12, i.e. the same as the output timeseries length):
# make sure the labels are of shape (num_samples, 12, 1)
y = np.reshape(y, (-1, 12, 1))

power_in = Input(shape=(48, 1))
power_lstm = LSTM(50, recurrent_dropout=0.4128, dropout=0.412563,
                  kernel_initializer=power_lstm_init)(power_in)
rep = RepeatVector(12)(power_lstm)
out_lstm = LSTM(32, return_sequences=True)(rep)
main_out = TimeDistributed(Dense(1))(out_lstm)
model = Model(power_in, main_out)
model.summary()
Model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_3 (InputLayer) (None, 48, 1) 0
_________________________________________________________________
lstm_3 (LSTM) (None, 50) 10400
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 12, 50) 0
_________________________________________________________________
lstm_4 (LSTM) (None, 12, 32) 10624
_________________________________________________________________
time_distributed_1 (TimeDist (None, 12, 1) 33
=================================================================
Total params: 21,057
Trainable params: 21,057
Non-trainable params: 0
_________________________________________________________________
Of course, in both models you may need to tune the hyper-parameters (e.g. number of LSTM layers, the dimension of LSTM layers, etc.) to be able to accurately compare them and achieve good results.
Side note: actually, in your scenario, you don't need to use a TimeDistributed layer at all, because (currently) the Dense layer is applied on the last axis. Therefore, TimeDistributed(Dense(...)) and Dense(...) are equivalent.
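A small sketch illustrating that equivalence (output shapes and parameter counts match):
from tensorflow.keras import layers, Model

inp = layers.Input((12, 50))
plain = layers.Dense(1)(inp)                            # -> (None, 12, 1), 51 params
wrapped = layers.TimeDistributed(layers.Dense(1))(inp)  # -> (None, 12, 1), 51 params
Model(inp, [plain, wrapped]).summary()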
