I want to use multiple images as input to the network, and I want to add Conv2D layers, something like this:
from tensorflow.keras.layers import *
from tensorflow.keras.models import Sequential
model = Sequential([
    Input(shape=(1, 128, 128, 1)),
    Conv2D(32, 3),
    Flatten(),
])
But this code raises the error: Input 0 of layer conv2d_40 is incompatible with the layer: expected ndim=4, found ndim=5. Full shape received: [None, 1, 128, 128, 1]
But the code below is working fine:
model = Sequential([
    Input(shape=(1, 512, 512, 1)),
    Dense(32),
    Flatten(),
])
I know I can add multiple Input layers, but I want to know whether there is a way to do it with a single input like this. That is, I want to use input data of shape [NUMBER_OF_IMAGES, WIDTH, HEIGHT, N_CHANNELS], where NUMBER_OF_IMAGES is not the total number of images in the dataset but the number of images in the current input.
Conv2D expects 4D input; you can't change that. I'm not exactly sure what you're trying to accomplish, but you could use Conv3D instead:
from tensorflow.keras.layers import *
from tensorflow.keras.models import Sequential
import tensorflow as tf
model = Sequential([
    Input(shape=(None, 128, 128, 1)),
    Conv3D(32, kernel_size=(1, 3, 3)),
    Flatten()
])
multiple_images = tf.random.uniform((10, 10, 128, 128, 1), dtype=tf.float32)
model(multiple_images)
<tf.Tensor: shape=(10, 5080320), dtype=float32, numpy=
array([[-0.26742983, -0.09689523, -0.12120364, ..., -0.02987139,
0.05515741, 0.12026916],
[-0.18898709, 0.12448274, -0.17439063, ..., 0.23424357,
-0.06001307, -0.13852882],
[-0.14464797, 0.26356792, -0.34748033, ..., 0.07819699,
-0.11639086, 0.10701762],
...,
[-0.1536693 , 0.13642962, -0.18564 , ..., 0.07165999,
-0.0173855 , -0.04348694],
[-0.32320747, 0.09207243, -0.22274591, ..., 0.11940736,
-0.02635285, -0.1140241 ],
[-0.21126074, -0.00094431, -0.10933039, ..., 0.06002581,
-0.09649743, 0.09335127]], dtype=float32)>
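If what you want is the same 2D convolution applied to every image independently, an equivalent option is to wrap Conv2D in TimeDistributed (the Conv3D above with kernel_size=(1, 3, 3) computes the same thing, since its kernel never spans more than one image); a minimal sketch:

from tensorflow.keras.layers import Input, Conv2D, TimeDistributed, Flatten
from tensorflow.keras.models import Sequential

model = Sequential([
    Input(shape=(None, 128, 128, 1)),  # (NUMBER_OF_IMAGES, WIDTH, HEIGHT, N_CHANNELS)
    TimeDistributed(Conv2D(32, 3)),    # identical Conv2D weights applied to each image
    Flatten(),
])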
I am new to TensorFlow and am trying to figure out how to build a simple text classification model. Taking a basic model from this tutorial, I am trying to adapt it to my own custom dataset.
I have tensors with shape=(32, 2, 500) grouped into training and validation datasets with shape=(None, 2, 500).
import tensorflow as tf
from tensorflow.keras import layers

def get_model(max_features=20000, embedding_dim=128):
    # An integer input for vocab indices.
    inputs = tf.keras.Input(shape=(None,), dtype="int64")
    # Next, we add a layer to map those vocab indices into a space of
    # dimensionality 'embedding_dim'.
    x = layers.Embedding(max_features, embedding_dim)(inputs)
    x = layers.Dropout(0.5)(x)
    # Conv1D + global max pooling
    x = layers.Conv1D(128, 7, padding="valid", activation="relu", strides=3)(x)
    x = layers.Conv1D(128, 7, padding="valid", activation="relu", strides=3)(x)
    x = layers.GlobalMaxPooling1D()(x)
    # We add a vanilla hidden layer:
    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.5)(x)
    # We project onto a single unit output layer, and squash it with a sigmoid:
    predictions = layers.Dense(1, activation="sigmoid", name="predictions")(x)
    model = tf.keras.Model(inputs, predictions)
    # Compile the model with binary crossentropy loss and an adam optimizer.
    model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model
I get the following warning:
WARNING:tensorflow:Model was constructed with shape (None, None) for input KerasTensor(type_spec=TensorSpec(shape=(None, None), dtype=tf.int64, name='input_16'), name='input_16', description="created by layer 'input_16'"), but it was called on an input with incompatible shape (None, 2, 500).
And the following error message:
Input 0 of layer "global_max_pooling1d_6" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 2, 53, 128)
Call arguments received by layer "model_7" (type Functional):
• inputs=tf.Tensor(shape=(None, 2, 500), dtype=int64)
• training=True
• mask=None
What do I need to change to get rid of this error and get the model working?
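For what it's worth, a minimal sketch of one way to make the shapes line up, assuming the two 500-token rows of each sample should be read as a single 1000-token sequence (your data may instead mean two separate texts, in which case this is the wrong fix; raw_train_ds and raw_val_ds are placeholder names for your batched datasets):

import tensorflow as tf

# Merge the (2, 500) rows of every sample into one (1000,) token sequence,
# so the model is called with the (None, None) shape it was built for.
flatten_rows = lambda x, y: (tf.reshape(x, (-1, 1000)), y)
raw_train_ds = raw_train_ds.map(flatten_rows)
raw_val_ds = raw_val_ds.map(flatten_rows)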
My error:
Input 0 of layer sequential_43 is incompatible with the layer: expected min_ndim=5, found ndim=4. Full shape received: (None, 32, 32, 100000)
The shapes of my input:
samples.shape gives (32,32,32,100000)
labels.shape gives (100000,)
The code I'm now trying to run is the following:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.optimizers import Adam

model = keras.models.Sequential()
layers = tf.keras.layers
model.add(layers.Conv3D(filters=5, kernel_size=(4,4,4), strides=2, activation='relu', input_shape=(8,32,32,32,1)))
model.add(layers.Conv3D(filters=5, kernel_size=(4,4,4), strides=1, activation='relu'))
model.add(layers.Conv3D(filters=5, kernel_size=(4,4,4), strides=1, activation='relu'))
model.add(layers.Conv3D(filters=5, kernel_size=(4,4,4), strides=1, activation='relu'))
model.add(layers.Conv3D(filters=5, kernel_size=(4,4,4), strides=2, activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(1, activation='relu'))
model.compile(optimizer=Adam(learning_rate=0.0001),loss='mape',metrics=['accuracy'])
model.fit(x=samples,y=labels,validation_split=0.1,epochs=1,shuffle=True,verbose=2)
Everywhere I look the syntax is (batchsize, dim1, dim2, dim3, dim4). I set the batch size to 8, the data as a 32x32x32 cube, and the colour to 1 dimension. Even if I remove the batch size from the input_shape and pass it to model.fit as batch_size=8, it gives the same error. Does anyone know why?
As stated in your question, the order of the dimensions is (batchsize, dim1, dim2, dim3, dim4), so you need to reshape your samples array to match that order. You can transpose the array to get the number of samples as the first dimension, and expand it to set the channel dimension (or colour, to reuse your term) to 1.
>>> samples.shape
TensorShape([32, 32, 32, 100000])
>>> samples = tf.expand_dims(tf.transpose(samples,[3,0,1,2]), axis=-1)
>>> samples.shape
TensorShape([100000, 32, 32, 32, 1])
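Note also that the input_shape you give the first layer must exclude the batch dimension; Keras adds that automatically, and the batch size itself is set via batch_size in model.fit. A sketch of the corrected first layer of your model, matching the transposed samples above:

model.add(layers.Conv3D(filters=5, kernel_size=(4, 4, 4), strides=2,
                        activation='relu', input_shape=(32, 32, 32, 1)))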
I'm working on a callback to dynamically "split" the number of neurons in a network at the end of each epoch. However, I'm having some trouble figuring out how to update the layer size. Here is a simplified example:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Model / data parameters
num_classes = 10
num_neurons = 32
input_shape = (28, 28, 1)
model = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Flatten(),
        layers.Dense(num_neurons, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ]
)
model.summary()
modellayers = model.layers
c = 0
for l in modellayers:
    if c == 1:
        l.output_shape = [None, 64]
    if c == 2:
        l.input_shape = [None, 64]
    print(str(c), l.input)
    print(str(c), l.output)
    c += 1
This gives back the following error:
AttributeError: Can't set the attribute "output_shape", likely because it conflicts with an existing read-only #property of the object. Please choose a different name.
If I print the shapes I get back:
0 Tensor("input_1:0", shape=(None, 28, 28, 1), dtype=float32)
0 Tensor("flatten/Reshape:0", shape=(None, 784), dtype=float32)
1 Tensor("flatten/Reshape:0", shape=(None, 784), dtype=float32)
1 Tensor("dense/Relu:0", shape=(None, 32), dtype=float32)
2 Tensor("dense/Relu:0", shape=(None, 32), dtype=float32)
2 Tensor("dense_1/Softmax:0", shape=(None, 10), dtype=float32)
I see they are Tensors, so I've also tried set_shape, but that didn't work either. Basically, I want to know how to update the number of neurons in a layer object.
P.S. I'm not having difficulty in splitting weights and biases and transferring those to a new working model, as can be seen in the example here: https://jjohnson-777.medium.com/machine-learning-granularity-by-splitting-neurons-fd2f02e07817. But now I would like to develop a callback function to do this on the fly between epochs.
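Keras layer shapes are read-only once the layer is built, so you can't grow a Dense layer in place; input_shape and output_shape are derived properties, which is what the AttributeError is telling you. A minimal sketch of the usual workaround, assuming you drive training epoch by epoch yourself rather than through a callback (x_train, y_train, and num_epochs are hypothetical placeholders), is to rebuild the model between fit calls and transfer the split weights:

def build_model(num_neurons, input_shape=(28, 28, 1), num_classes=10):
    return keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Flatten(),
        layers.Dense(num_neurons, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_model(num_neurons)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
for epoch in range(num_epochs):
    model.fit(x_train, y_train, epochs=1)
    num_neurons *= 2                      # "split" every neuron in two
    new_model = build_model(num_neurons)
    # ... copy the split weights/biases into new_model here ...
    model = new_model
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")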
According to Keras documentation image_dataset_from_directory() returns:
A tf.data.Dataset object.
- If label_mode is None, it yields float32 tensors of shape (batch_size, image_size[0], image_size[1], num_channels), encoding images (see below for rules regarding num_channels).
- Otherwise, it yields a tuple (images, labels), where images has shape (batch_size, image_size[0], image_size[1], num_channels), and labels follows the format described below.
Rules regarding labels format:
- if label_mode is int, the labels are an int32 tensor of shape (batch_size,).
- if label_mode is binary, the labels are a float32 tensor of 1s and 0s of shape (batch_size, 1).
- if label_mode is categorical, the labels are a float32 tensor of shape (batch_size, num_classes), representing a one-hot encoding of the class index.
Whereas when I use it:
train_dataset = image_dataset_from_directory(
    directory=TRAIN_DIR,
    labels="inferred",
    label_mode="categorical",
    class_names=["0", "10", "5"],
    image_size=SIZE,
    seed=SEED,
    subset=None,
    interpolation="bilinear",
    follow_links=False,
)
I get (None, 224, 224, 3) for the images and (None, 3) for the labels, even though I set label_mode to "categorical". The batch size is not added into the shape even when I explicitly set batch_size to 32 (it defaults to 32, but I tried it to see if it makes a difference). I have been having issues training my model because of this, as the batch size needs to be included for a TimeDistributed layer.
#train_dataset.element_spec
(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None),
TensorSpec(shape=(None, 3), dtype=tf.float32, name=None))
Edit:
I'm trying to figure out why I get the following error when training a model using transfer learning from MobileNetV2 with LSTM for video classification and figured the batch_size not being present in the dataset was the issue.
ValueError: Input 0 of layer sequential_16 is incompatible with the layer: expected ndim=5, found ndim=4. Full shape received: [None, 224, 224, 3]
Code for the models:
MobileNetV2 function:
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalMaxPool2D

def build_mobilenet(shape=INPUT_SHAPE, nbout=CLASSES):
    # INPUT_SHAPE = (224, 224, 3)
    # CLASSES = 3
    model = MobileNetV2(
        include_top=False,
        input_shape=shape,
        weights='imagenet')
    model.trainable = True
    output = GlobalMaxPool2D()
    return Sequential([model, output])
LSTM function:
from tensorflow.keras.layers import TimeDistributed, LSTM, Dense, Dropout

def action_model(shape=INSHAPE, nbout=3):
    # INSHAPE = (5, 224, 224, 3)
    convnet = build_mobilenet(shape[1:])
    model = Sequential()
    model.add(TimeDistributed(convnet, input_shape=shape))
    model.add(LSTM(64))
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(nbout, activation='softmax'))
    return model
This is not an issue with batch size, but with your input data format.
Code:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import *

def build_mobilenet(shape=(224, 224, 3), nbout=3):
    model = tf.keras.applications.MobileNetV2(
        include_top=False,
        input_shape=shape,
        weights='imagenet')
    model.trainable = True
    output = tf.keras.layers.GlobalMaxPool2D()
    return tf.keras.Sequential([model, output])

def action_model(shape=(5, 224, 224, 3), nbout=3):
    convnet = build_mobilenet()
    model = tf.keras.Sequential()
    model.add(TimeDistributed(convnet, input_shape=shape))
    model.add(LSTM(64))
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(.5))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(nbout, activation='softmax'))
    return model
model = action_model()
tf.keras.utils.plot_model(model, 'my_first_model.png', show_shapes=True)
This gives the following model plot (image not reproduced here). As you can see, the model expects a 5D tensor as input, but what you are providing is a 4D tensor.
The model works with a 5D tensor:
Code:
x = tf.constant(np.random.randint(50, size=(32, 5, 224, 224, 3)), dtype=tf.float32)
model(x)
Output:
<tf.Tensor: shape=(32, 3), dtype=float32, numpy=
array([[0.30153075, 0.3630225 , 0.33544672],
[0.3018494 , 0.36799458, 0.33015603],
[0.2965148 , 0.36714798, 0.3363372 ],
[0.30032247, 0.36478844, 0.33488905],
[0.30106384, 0.36145815, 0.33747798],
[0.29292756, 0.3652076 , 0.34186485],
[0.29766476, 0.35945407, 0.34288123],
[0.29290855, 0.36984667, 0.33724475],
[0.30804047, 0.35799438, 0.33396518],
[0.30497718, 0.35853127, 0.33649153],
[0.29357925, 0.36751047, 0.33891028],
[0.29514724, 0.36558747, 0.33926526],
[0.29731706, 0.3684161 , 0.33426687],
[0.30811843, 0.3656716 , 0.32621 ],
[0.29937437, 0.36403805, 0.33658758],
[0.2967953 , 0.36977535, 0.3334294 ],
[0.30307695, 0.36372742, 0.33319563],
[0.30148408, 0.36562964, 0.33288625],
[0.29590267, 0.36651734, 0.33758003],
[0.29640752, 0.36192682, 0.3416656 ],
[0.30003947, 0.36704347, 0.332917 ],
[0.29541495, 0.3681183 , 0.33646676],
[0.29900452, 0.36397702, 0.33701843],
[0.3028345 , 0.36404026, 0.33312523],
[0.30092967, 0.36406764, 0.33500263],
[0.29969287, 0.36108258, 0.33922455],
[0.29743004, 0.36917207, 0.3333979 ],
[0.29056188, 0.3742272 , 0.33521092],
[0.30297956, 0.36698693, 0.3300335 ],
[0.29843566, 0.3594078 , 0.3421565 ],
[0.29280537, 0.36777246, 0.33942217],
[0.29983717, 0.3691762 , 0.33098662]], dtype=float32)>
The image_dataset_from_directory function you are using is not capable of generating 5D tensors. You have to build the 5D tensors from your data yourself, for example with a custom generator or a tf.data pipeline like the one sketched below.
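A minimal tf.data sketch, assuming your frames are stored on disk in temporal order and every 5 consecutive frames form one clip (batch_size=None, which yields unbatched frames, requires a reasonably recent TensorFlow version):

frames = image_dataset_from_directory(
    directory=TRAIN_DIR,
    labels="inferred",
    label_mode="categorical",
    image_size=(224, 224),
    shuffle=False,       # keep frames in directory (temporal) order
    batch_size=None,     # yield individual frames instead of batches
)
clips = frames.batch(5, drop_remainder=True)   # (5, 224, 224, 3) per element
clips = clips.map(lambda x, y: (x, y[0]))      # one label per clip
clips = clips.batch(32)                        # (32, 5, 224, 224, 3)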
I'm trying to implement a special type of neural network with the Keras functional API (the architecture diagram is not reproduced here).
But I'm having a problem with the concatenate layer:
ValueError: A "Concatenate" layer requires inputs with matching shapes
except for the concat axis. Got inputs shapes: [(None, 160, 160, 384),
(None, 160, 160, 48)]
Notice: from my research I assume that this question is not a duplicate. I've seen this question and this post (translated with Google), but their suggestions don't seem to work (instead, they make the problem even slightly worse).
Here's the code of the neural network before concat layer:
from keras.layers import Input, Dense, Conv2D, ZeroPadding2D, MaxPooling2D, BatchNormalization, concatenate
from keras.activations import relu
from keras.initializers import RandomUniform, Constant, TruncatedNormal
# Network 1, Layer 1
screenshot = Input(shape=(1280, 1280, 0), dtype='float32', name='screenshot')
# padded1 = ZeroPadding2D(padding=5, data_format=None)(screenshot)
conv1 = Conv2D(filters=96, kernel_size=11, strides=(4, 4), activation=relu, padding='same')(screenshot)
# conv1 = Conv2D(filters=96, kernel_size=11, strides=(4, 4), activation=relu, padding='same')(padded1)
pooling1 = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(conv1)
normalized1 = BatchNormalization()(pooling1) # https://stats.stackexchange.com/questions/145768/importance-of-local-response-normalization-in-cnn
# Network 1, Layer 2
# padded2 = ZeroPadding2D(padding=2, data_format=None)(normalized1)
conv2 = Conv2D(filters=256, kernel_size=5, activation=relu, padding='same')(normalized1)
# conv2 = Conv2D(filters=256, kernel_size=5, activation=relu, padding='same')(padded2)
normalized2 = BatchNormalization()(conv2)
# padded3 = ZeroPadding2D(padding=1, data_format=None)(normalized2)
conv3 = Conv2D(filters=384, kernel_size=3, activation=relu, padding='same',
               kernel_initializer=TruncatedNormal(stddev=0.01),
               bias_initializer=Constant(value=0.1))(normalized2)
# conv3 = Conv2D(filters=384, kernel_size=3, activation=relu, padding='same',
#                kernel_initializer=RandomUniform(stddev=0.1),
#                bias_initializer=Constant(value=0.1))(padded3)
# Network 2, Layer 1
textmaps = Input(shape=(160, 160, 128), dtype='float32', name='textmaps')
txt_conv1 = Conv2D(filters=48, kernel_size=1, activation=relu, padding='same',
                   kernel_initializer=TruncatedNormal(stddev=0.01),
                   bias_initializer=Constant(value=0.1))(textmaps)
# (Network 1 + Network 2), Layer 1
merged = concatenate([conv3, txt_conv1], axis=1)
This is how the interpreter evaluates the variables conv3 and txt_conv1:
>>> conv3
<tf.Tensor 'conv2d_3/Relu:0' shape=(?, 160, 160, 384) dtype=float32>
>>> txt_conv1
<tf.Tensor 'conv2d_4/Relu:0' shape=(?, 160, 160, 48) dtype=float32>
This is how the interpreter evaluates txt_conv1 and conv3 variables after setting image_data_format to channels_first:
>>> conv3
<tf.Tensor 'conv2d_3/Relu:0' shape=(?, 384, 160, 0) dtype=float32>
>>> txt_conv1
<tf.Tensor 'conv2d_4/Relu:0' shape=(?, 48, 160, 128) dtype=float32>
Both of the layers have shapes which are not actually described in the architecture.
Is there any way to solve this problem? Maybe I didn't write the appropriate code (I'm new to Keras).
P.S. I know that the code above is not organized; I'm just testing.
Thank you!
You should change the axis to -1 in the concatenate layer, since the shapes of the two tensors that you want to concatenate differ only in their last dimension. The resulting tensor will then be of shape (?, 160, 160, 384 + 48).
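Concretely, the only line that needs to change in the snippet above is:

merged = concatenate([conv3, txt_conv1], axis=-1)  # concatenate along the channel axis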