Cannot set output layer to classification - python

I am trying to implement a CNN whose purpose is to perform classification, but for some reason I am not able to set my output dimension to 1.
Here is some example code:
import keras
from keras.layers.merge import Concatenate
from keras.models import Model
from keras.layers import Input, Dense
from keras.layers import Dropout
from keras.layers.core import Dense, Activation, Lambda, Reshape,Flatten
from keras.layers import Conv2D, MaxPooling2D, Reshape, ZeroPadding2D
import numpy as np
train_data_1 = np.random.randint(100,size=(100,3,6,3))
train_data_2 = np.random.randint(100,size=(100,3,6,3))
test_data_1 = np.random.randint(100,size=(10,3,6,3))
test_data_2 = np.random.randint(100,size=(10,3,6,3))
labels_train_data =np.random.randint(145,size=100)
labels_test_data =np.random.randint(145,size=10)
input_img_1 = Input(shape=(3, 6, 3))
input_img_2 = Input(shape=(3, 6, 3))
conv2d_1_1 = Conv2D(filters = 32, kernel_size = (3,3) , padding = "same" , activation = 'relu' , name = "conv2d_1_1" )(input_img_1)
conv2d_2_1 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_1_1)
conv2d_3_1 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_2_1)
conv2d_4_1 = Conv2D(filters = 32, kernel_size = (1,1) , padding = "same" , activation = 'relu' )(conv2d_3_1)
conv2d_4_1_flatten = Flatten()(conv2d_4_1)
conv2d_1_2 = Conv2D(filters = 32, kernel_size = (3,3) , padding = "same" , activation = 'relu' , name = "conv2d_1_2")(input_img_2)
conv2d_2_2 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_1_2)
conv2d_3_2 = Conv2D(filters = 64, kernel_size = (3,3) , padding = "same" , activation = 'relu' )(conv2d_2_2)
conv2d_4_2 = Conv2D(filters = 32, kernel_size = (1,1) , padding = "same" , activation = 'relu' )(conv2d_3_2)
conv2d_4_2_flatten = Flatten()(conv2d_4_2)
merge = keras.layers.concatenate([conv2d_4_1_flatten, conv2d_4_2_flatten])
dense1 = Dense(100, activation = 'relu')(merge)
dense2 = Dense(50,activation = 'relu')(dense1)
dense3 = Dense(1 ,activation = 'softmax')(dense2)
model = Model(inputs = [input_img_1, input_img_2] , outputs = dense3)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
print model.summary()
labels_train = keras.utils.to_categorical(labels_train_data, num_classes=145)
labels_test = keras.utils.to_categorical(labels_test_data, num_classes=145)
hist_current = model.fit(x = [train_data_1, train_data_2],
y = labels_train,
shuffle=False,
validation_data=([test_data_1 ,test_data_2], labels_test),
validation_split=0.1,
epochs=150000,
batch_size = 15,
verbose=1)
And the error message is:
Traceback (most recent call last):
File "test_model.py", line 57, in <module>
verbose=1)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1405, in fit
batch_size=batch_size)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1299, in _standardize_user_data
exception_prefix='model target')
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 133, in _standardize_input_data
str(array.shape))
ValueError: Error when checking model target: expected dense_3 to have shape (None, 1) but got array with shape (100, 145)

There are several inconsistencies in your model:
dense3 = Dense(1 ,activation = 'softmax')(dense2): you cannot use a softmax on a single neuron. The softmax normalizes the output of the layer so that it sums to 1, so if you normalize one value alone it will always output 1. However, this is not why you get the error.
How many classes do you have? Your network outputs one value (the last layer is Dense(1)), so I would expect that you want to predict 2 classes (output 1 or 0). But your targets are categorical with 145 possibilities: your labels_train array is 100 one-hot vectors of length 145, so I assume that you want to classify the 100 samples into 145 different categories. This is why Keras is complaining: your network outputs (100, 1) and your targets (labels) are (100, 145). What do you really want to do?
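The point about softmax on a single neuron can be checked with a few lines of NumPy. This is only a sketch: the softmax helper and the example logits are made up for illustration and are not taken from the question's model.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# One output neuron: every row holds a single logit, so the softmax is always 1.
single_logits = np.array([[-3.2], [0.0], [7.5]])
single_probs = softmax(single_logits)

# 145 output neurons: every row becomes a probability distribution over 145 classes.
rng = np.random.default_rng(0)
multi_logits = rng.normal(size=(100, 145))
multi_probs = softmax(multi_logits)

print(single_probs.ravel())         # all ones, whatever the logits are
print(multi_probs.shape)            # (100, 145), the shape of the one-hot targets
print(multi_probs.sum(axis=1)[:3])  # each row sums to 1
```

Whatever the single logit is, the output is 1, so a Dense(1) softmax layer cannot carry any information about the class.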
Edit:
Following the comment: since you want to predict whether the image belongs to one of 145 classes, you have to output 145 values. So you have to change the top layers of your network so that your last layer is Dense(145, activation='softmax'). I propose that you replace
dense1 = Dense(100, activation = 'relu')(merge)
dense2 = Dense(50,activation = 'relu')(dense1)
dense3 = Dense(1 ,activation = 'softmax')(dense2)
with
dense1 = Dense(200, activation = 'relu')(merge)
dense2 = Dense(150, activation = 'relu')(dense1)
dense3 = Dense(145, activation = 'softmax')(dense2)
If you really want to have 3 dense layers. Otherwise you can just remove the middle one. This depends on your use case, so the architecture of the hidden layers is up to you. The only hard requirement is that your last layer is Dense(145, activation='softmax').
Does that make sense?
Edit 2:
On top of that, you shouldn't one-hot encode your targets (labels) when you use sparse_categorical_crossentropy: that loss expects integer class indices and handles the encoding under the hood.
So either you use keras.utils.to_categorical on your targets with loss="categorical_crossentropy",
or you don't transform the targets with keras.utils.to_categorical and use loss="sparse_categorical_crossentropy".
With these changes it runs on my machine.
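The two label formats describe the same loss and differ only in the shape of the targets. The NumPy sketch below illustrates this; it does not use Keras, and the random logits are a stand-in for the output of a Dense(145) softmax layer.

```python
import numpy as np

num_classes = 145
rng = np.random.default_rng(0)

# Integer class indices, like np.random.randint(145, size=100) in the question.
int_labels = rng.integers(num_classes, size=100)

# One-hot rows, the format keras.utils.to_categorical produces.
one_hot_labels = np.eye(num_classes)[int_labels]

# Stand-in for the softmax output of the network (assumption: random logits).
logits = rng.normal(size=(100, num_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Categorical cross-entropy: targets have shape (100, 145).
categorical_ce = -(one_hot_labels * np.log(probs)).sum(axis=1)

# Sparse categorical cross-entropy: targets have shape (100,).
sparse_ce = -np.log(probs[np.arange(len(int_labels)), int_labels])

print(one_hot_labels.shape, int_labels.shape)
print(np.allclose(categorical_ce, sparse_ce))
```

Mixing the two, as in the question (one-hot targets with the sparse loss), is what leads to a target shape the model does not expect.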

Related

Parallel Convolutions using Keras

What I am trying to do is create a text classification model that combines CNNs and word embeddings. The basic idea is that we have an Embedding layer at the start of the network and then 2 parallel convolutional layers for finding 2-grams and 3-grams.
Each of these convolution layers takes the output of the embedding layer as input.
Afterwards, the outputs of the two CNN layers are concatenated, flattened and fed to a Dense layer.
My input is tokenized, numerical sentences of length 27 (shape = (None, 27)), and I have 1244 of these sentences.
I've managed to create a sequential model with a single CNN layer but struggle with the above.
My code so far:
input_shape = Embedding(voc, 100,weights=[embedding_matrix], input_length=X.shape[1])
tower_1 = Conv1D(filters=100, kernel_size=2, activation='relu')(input_shape)
tower_1 = MaxPooling1D(pool_size=2)(tower_1)
tower_2 = Conv1D(filters=100, kernel_size=3, activation='relu')(input_shape)
tower_2 = MaxPooling1D(pool_size=2)(tower_2)
merged = keras.layers.concatenate([tower_1, tower_2,], axis=1)
merged = Flatten()(merged)
out = Dense(3, activation='softmax')(merged)
model = Model(input_shape, out)
This produces this error:
TypeError: Inputs to a layer should be tensors. Got: <keras.layers.embeddings.Embedding object at 0x7fadca016dd0>
I have also tried replacing
input_shape = Embedding(voc, 100,weights=[embedding_matrix], input_length=X.shape[1])
with:
input_tensor = Input(shape=(1244,27))
input_shape = Embedding(voc, 100,weights=[embedding_matrix], input_length=X.shape[1])(input_tensor)
which gives me this error:
ValueError: Input 0 of layer "max_pooling1d_23" is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (None, 1244, 26, 100)
You should define your Input layer without the number of samples, using just the sentence length, and pass its output tensor through the Embedding layer:
import tensorflow as tf
inputs = tf.keras.layers.Input((27,))
embedded = tf.keras.layers.Embedding(50, 100, input_length=27)(inputs)
tower_1 = tf.keras.layers.Conv1D(filters=100, kernel_size=2, activation='relu')(embedded)
tower_1 = tf.keras.layers.MaxPooling1D(pool_size=2)(tower_1)
tower_2 = tf.keras.layers.Conv1D(filters=100, kernel_size=3, activation='relu')(embedded)
tower_2 = tf.keras.layers.MaxPooling1D(pool_size=2)(tower_2)
merged = tf.keras.layers.concatenate([tower_1, tower_2,], axis=1)
merged = tf.keras.layers.Flatten()(merged)
out = tf.keras.layers.Dense(3, activation='softmax')(merged)
model = tf.keras.Model(inputs, out)
print(model.summary())
Usage:
samples = 5
random_input = tf.random.uniform((samples, 27), maxval=50, dtype=tf.int32)
print(model(random_input))
tf.Tensor(
[[0.31525075 0.33163014 0.3531191 ]
[0.3266019 0.3295619 0.34383622]
[0.32351935 0.32669052 0.34979013]
[0.32954428 0.33178467 0.33867106]
[0.32966062 0.3283257 0.34201372]], shape=(5, 3), dtype=float32)
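To see why the two towers can be concatenated on axis=1, it helps to trace the sequence lengths by hand. The helper functions below are hypothetical, written only for this illustration, and assume padding='valid' and a stride of 1 for the convolutions, as in the code above.

```python
def conv1d_valid_length(length, kernel_size, stride=1):
    """Output length of a Conv1D layer with padding='valid'."""
    return (length - kernel_size) // stride + 1

def max_pool1d_length(length, pool_size):
    """Output length of MaxPooling1D with padding='valid' and stride == pool_size."""
    return (length - pool_size) // pool_size + 1

sentence_length = 27
filters = 100

# Tower 1: kernel_size=2 gives 26 steps, pooling halves that to 13.
tower_1_steps = max_pool1d_length(conv1d_valid_length(sentence_length, 2), 2)
# Tower 2: kernel_size=3 gives 25 steps, pooling reduces that to 12.
tower_2_steps = max_pool1d_length(conv1d_valid_length(sentence_length, 3), 2)

# Concatenating on axis=1 stacks the time steps; both towers share 100 channels.
merged_steps = tower_1_steps + tower_2_steps
flattened_size = merged_steps * filters

print(tower_1_steps, tower_2_steps, merged_steps, flattened_size)  # 13 12 25 2500
```

The towers end up with different lengths (13 and 12), so concatenating on the channel axis would fail, while axis=1 works because both have 100 channels.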

How to input a mix feature into a LSTM model?

Assume I have two features, x1 and x2. Here, x1 is a vector of word indices and x2 is a vector of numerical values. Both x1 and x2 have length 50, and there are 6000 rows of each. I combine the two into one array:
X = np.array([np.row_stack((x1[i], x2[i])) for i in range(x1.shape[0])])
My initial LSTM model is
X_input = Input(shape = (50, 2), name = "X_seq")
X_hidden1 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_input)
X_hidden2 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_hidden1)
X_hidden3 = LSTM(units = 128, dropout = 0.25)(X_hidden2)
X_dense = Dense(units = 128, activation = 'relu')(X_hidden3)
X_dense_dropout = Dropout(0.25)(X_dense)
concat = tf.keras.layers.concatenate(inputs = [X_dense_dropout])
output = Dense(units = num_category, activation = 'softmax', name = "output")(concat)
model = tf.keras.Model(inputs = [X_input], outputs = [output])
model.compile(optimizer = 'adam', loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
However, I know I need to have an embedding layer to take care of X[0,:] right below the Input layer. Thus, I modified my above code to
X_input = Input(shape = (50, 2), name = "X_seq")
x1_embedding = Embedding(input_dim = max_pages, output_dim = embedding_dim, input_length = max_length)(X_input[0,:])
X_concat = tf.keras.layers.concatenate(inputs = [x1_embedding, X_input[1,:]])
X_hidden1 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_concat)
X_hidden2 = LSTM(units = 256, dropout = 0.25, return_sequences = True)(X_hidden1)
X_hidden3 = LSTM(units = 128, dropout = 0.25)(X_hidden2)
X_dense = Dense(units = 128, activation = 'relu')(X_hidden3)
X_dense_dropout = Dropout(0.25)(X_dense)
concat = tf.keras.layers.concatenate(inputs = [X_dense_dropout])
output = Dense(units = num_category, activation = 'softmax', name = "output")(concat)
model = tf.keras.Model(inputs = [X_input], outputs = [output])
model.compile(optimizer = 'adam', loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])
Python shows this error:
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 2, 15), (None, 2)]
Any suggestions? Many thanks.
The problem is that the inputs to the concatenate layer have a different number of dimensions, so they cannot be concatenated. To overcome this, reshape the second input with tf.keras.layers.Reshape as shown here; the rest stays the same.
reshaped_input = tf.keras.layers.Reshape((-1,1))(X_input[:, 1])
X_concat = tf.keras.layers.concatenate(inputs = [x1_embedding, reshaped_input])
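The shape logic behind combining an embedded index sequence with a numerical sequence can be sketched in plain NumPy. This is not the code from the answer. It assumes the two features are stacked on the last axis, so that X has shape (batch, 50, 2) and matches Input(shape=(50, 2)). The vocabulary size is hypothetical, and the embedding size of 15 is read off the error message.

```python
import numpy as np

batch, seq_len = 4, 50
vocab_size, embedding_dim = 1000, 15  # hypothetical vocabulary size

rng = np.random.default_rng(0)
x1 = rng.integers(vocab_size, size=(batch, seq_len))  # word indices
x2 = rng.normal(size=(batch, seq_len))                # numerical values

# Stack on the last axis so every time step carries (index, value).
X = np.stack([x1, x2], axis=-1)                       # (batch, 50, 2)

# Stand-in for the weights of an Embedding layer.
embedding_table = rng.normal(size=(vocab_size, embedding_dim))

indices = X[:, :, 0].astype(int)                      # (batch, 50)
embedded = embedding_table[indices]                   # (batch, 50, 15)
numeric = X[:, :, 1][..., np.newaxis]                 # (batch, 50, 1)

# Both tensors are now 3-D and agree on every axis except the last one.
lstm_input = np.concatenate([embedded, numeric], axis=-1)
print(lstm_input.shape)                               # (4, 50, 16)
```

The key point is that the numerical feature needs a trailing axis of size 1 so that both tensors are 3-D before concatenation, which is the role the Reshape layer plays in the Keras version.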

Build (pre-trained) CNN+LSTM network with keras functional API

I want to build an LSTM on top of a pre-trained CNN (VGG) to classify a video sequence. The LSTM will be fed with the features extracted by the last FC layer of VGG.
The architecture is something like:
I wrote the code:
def build_LSTM_CNN_net():
    from keras.applications.vgg16 import VGG16
    from keras.models import Model
    from keras.layers import Dense, Input, Flatten
    from keras.layers.pooling import GlobalAveragePooling2D, GlobalAveragePooling1D
    from keras.layers.recurrent import LSTM
    from keras.layers.wrappers import TimeDistributed
    from keras.optimizers import Nadam
    from keras.applications.vgg16 import VGG16
    num_classes = 5
    frames = Input(shape=(5, 224, 224, 3))
    base_in = Input(shape=(224,224,3))
    base_model = VGG16(weights='imagenet',
                       include_top=False,
                       input_shape=(224,224,3))
    x = Flatten()(base_model.output)
    x = Dense(128, activation='relu')(x)
    x = TimeDistributed(Flatten())(x)
    x = LSTM(units = 256, return_sequences=False, dropout=0.2)(x)
    x = Dense(self.nb_classes, activation='softmax')(x)
lstm_cnn = build_LSTM_CNN_net()
keras.utils.plot_model(lstm_cnn, "lstm_cnn.png", show_shapes=True)
But got the error:
ValueError: `TimeDistributed` Layer should be passed an `input_shape ` with at least 3 dimensions, received: [None, 128]
Why is this happening, how can I fix it?
Here is the correct way to build a model to classify video sequences. Note that I wrap a model instance in TimeDistributed. This model was previously built to extract features from each frame individually. In the second part, we deal with the frame sequences.
frames, channels, rows, columns = 5,3,224,224
video = Input(shape=(frames,
rows,
columns,
channels))
cnn_base = VGG16(input_shape=(rows,
columns,
channels),
weights="imagenet",
include_top=False)
cnn_base.trainable = False
cnn_out = GlobalAveragePooling2D()(cnn_base.output)
cnn = Model(cnn_base.input, cnn_out)
encoded_frames = TimeDistributed(cnn)(video)
encoded_sequence = LSTM(256)(encoded_frames)
hidden_layer = Dense(1024, activation="relu")(encoded_sequence)
outputs = Dense(10, activation="softmax")(hidden_layer)
model = Model(video, outputs)
model.summary()
If you want to use the VGG 1x4096 embedding representation, you can simply do:
frames, channels, rows, columns = 5,3,224,224
video = Input(shape=(frames,
rows,
columns,
channels))
cnn_base = VGG16(input_shape=(rows,
columns,
channels),
weights="imagenet",
include_top=True) #<=== include_top=True
cnn_base.trainable = False
cnn = Model(cnn_base.input, cnn_base.layers[-3].output) # -3 is the 4096 layer
encoded_frames = TimeDistributed(cnn)(video)
encoded_sequence = LSTM(256)(encoded_frames)
hidden_layer = Dense(1024, activation="relu")(encoded_sequence)
outputs = Dense(10, activation="softmax")(hidden_layer)
model = Model(video, outputs)
model.summary()
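What TimeDistributed does with the wrapped model can be pictured in NumPy: fold the time axis into the batch axis, run the per-frame function, then unfold again. The sketch below illustrates that idea and is not Keras code. Both encode_frame_batch and time_distributed are hypothetical helpers, and global average pooling stands in for the CNN.

```python
import numpy as np

def encode_frame_batch(frames):
    """Stand-in for the CNN: global average pooling over height and width."""
    return frames.mean(axis=(1, 2))  # (n, rows, cols, channels) -> (n, channels)

def time_distributed(fn, video_batch):
    """Apply fn to every frame independently, keeping the (batch, time) layout."""
    batch, time = video_batch.shape[:2]
    merged = video_batch.reshape((batch * time,) + video_batch.shape[2:])
    encoded = fn(merged)
    return encoded.reshape((batch, time) + encoded.shape[1:])

# 2 videos of 5 frames each, 224x224 RGB, as in the Input shape above.
video = np.random.default_rng(0).random((2, 5, 224, 224, 3))
encoded_frames = time_distributed(encode_frame_batch, video)

print(encoded_frames.shape)  # (2, 5, 3): one feature vector per frame
```

The result keeps a time axis, which is what the LSTM needs. In the question's code the tensor passed to TimeDistributed was already 2-D, (None, 128), so there was no time axis left to iterate over.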

How to convert a tensorflow model to a pytorch model?

I'm new to PyTorch. Here's the architecture of a TensorFlow model that I'd like to convert into a PyTorch model.
I have written most of the code but am confused about a few places.
1) In TensorFlow, Conv2D takes the number of filters as an argument. In PyTorch, however, the layer takes the number of input channels and output channels. How do I find the equivalent numbers of input and output channels, given the number of filters?
2) In TensorFlow, the Dense layer takes the number of nodes as a parameter. In PyTorch, however, the equivalent layer takes two sizes (the size of the input and the size of the output). How do I determine them from the number of nodes?
Here's the TensorFlow code.
from keras.utils import to_categorical
from keras.models import Sequential, load_model
from keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=X_train.shape[1:]))
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(43, activation='softmax'))
Here's my code:
import torch
import torch.nn as nn
import torch.nn.functional as F
# The network should inherit from nn.Module
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Define 2D convolution layers
        # 3: input channels, 32: output channels, 5: kernel size, 1: stride
        self.conv1 = nn.Conv2d(3, 32, 5, 1)  # The size of input channel is 3 because all images are coloured
        self.conv2 = nn.Conv2d(32, 64, 5, 1)
        self.conv3 = nn.Conv2d(64, 128, 3, 1)
        self.conv4 = nn.Conv2d(128, 256, 3, 1)
        # It will 'filter' out some of the input by the probability (assign zero)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        # Fully connected layer: input size, output size
        self.fc1 = nn.Linear(36864, 128)
        self.fc2 = nn.Linear(128, 10)
    # forward() links all layers together
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = self.conv3(x)
        x = F.relu(x)
        x = self.conv4(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
Thanks in advance!
1) In PyTorch, a convolution layer takes the number of input channels and output channels as arguments. In your first layer, the input channels are the number of color channels in your image. After that, it is always the same as the output channels of the previous layer (output channels are what the filters parameter specifies in TensorFlow).
2) PyTorch is slightly annoying in that, when flattening your conv outputs, you have to calculate the shape yourself. You can either use an equation to calculate it (Out = (W - F + 2P)/S + 1, where W is the input size, F the kernel size, P the padding and S the stride), or write a shape-calculating function that passes a dummy image through the conv part of the network and reads off the resulting shape. That value is the input size argument of the first fully connected layer; the output size argument is just the number of nodes you want in that fully connected layer.
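As a worked example of the formula, the sketch below traces the spatial size through the layers of the Keras model in the question. The 30x30 input size is an assumption, because X_train.shape is not given in the question, and conv_out and pool_out are hypothetical helper names.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Out = (W - F + 2P) / S + 1, rounded down."""
    return (size - kernel + 2 * padding) // stride + 1

def pool_out(size, pool):
    """Output size of max pooling with stride equal to the pool size."""
    return size // pool

size = 30                 # assumed input height and width (not stated in the question)
size = conv_out(size, 5)  # Conv2D(32, 5x5) -> 26
size = conv_out(size, 5)  # Conv2D(32, 5x5) -> 22
size = pool_out(size, 2)  # MaxPool2D(2x2)  -> 11
size = conv_out(size, 3)  # Conv2D(64, 3x3) -> 9
size = conv_out(size, 3)  # Conv2D(64, 3x3) -> 7
size = pool_out(size, 2)  # MaxPool2D(2x2)  -> 3

last_conv_channels = 64   # filters of the last Conv2D layer
flat_features = last_conv_channels * size * size

print(size, flat_features)  # 3 576
```

Under that assumption, the PyTorch layers matching the Keras model would use channel pairs (3, 32), (32, 32), (32, 64) and (64, 64) for the convolutions, followed by nn.Linear(576, 256) and nn.Linear(256, 43). With a different image size only the 576 changes.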

How to replace dense layer with convolutional one?

I want to replace the Dense_out layer with a convolutional one. Can anybody tell me how to do it?
Code:
model = Sequential()
conv_1 = Conv2D(filters = 32,kernel_size=(3,3),activation='relu')
model.add(conv_1)
conv_2 = Conv2D(filters=64,kernel_size=(3,3),activation='relu')
model.add(conv_2)
pool = MaxPool2D(pool_size = (2,2),strides = (2,2), padding = 'same')
model.add(pool)
drop = Dropout(0.5)
model.add(drop)
model.add(Flatten())
Dense_1 = Dense(128,activation = 'relu')
model.add(Dense_1)
Dense_out = Dense(57,activation = 'softmax')
model.add(Dense_out)
model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(train_image,train_label,epochs=10,verbose = 1,validation_data=(test_image,test_label))
print(model.summary())
When I try this code:
model = Sequential()
conv_01 = Conv2D(filters = 32,kernel_size=(3,3),activation='relu')
model.add(conv_01)
conv_02 = Conv2D(filters=64,kernel_size=(3,3),activation='relu')
model.add(conv_02)
pool = MaxPool2D(pool_size = (2,2),strides = (2,2), padding = 'same')
model.add(pool)
conv_11 = Conv2D(filters=64,kernel_size=(3,3),activation='relu')
model.add(conv_11)
pool_2 = MaxPool2D(pool_size=(2,2),strides=(2,2),padding='same')
model.add(pool_2)
drop = Dropout(0.3)
model.add(drop)
model.add(Flatten())
Dense_1 = Dense(128,activation = 'relu')
model.add(Dense_1)
Dense_2 = Dense(64,activation = 'relu')
model.add(Dense_2)
conv_out = Conv2D(filters= 64,kernel_size=(3,3),activation='relu')
model.add(conv_out)
model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])
model.fit(train_image,train_label,epochs=10,verbose = 1,validation_data=(test_image,test_label))
I get the following error:
ValueError: Input 0 of layer conv2d_3 is incompatible with the layer:
expected ndim=4, found ndim=2. Full shape received: [None, 64]
I am new at this, so an explanation would greatly help.
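Conv2D expects a 4-D input (batch, height, width, channels), but the output of your last Dense layer is 2-D (batch, 64), so you need to reshape it before you can apply a convolution. Here target_shape is a (height, width, channels) tuple whose product equals 64.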
You can use:
out = keras.layers.Reshape(target_shape)
model.add(out)
and then do the convolution:
conv_out = Conv2D(filters=3,kernel_size=(3,3),activation='softmax')
model.add(conv_out)
with filters being the number of channels you want in your output layer (3 for RGB).
More information about the layers and their parameters is available in the Keras documentation.
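The reshape step can be sketched with NumPy to see which shapes are involved. The target_shape of (8, 8, 1) is only an assumed example. Any (height, width, channels) tuple whose product is 64 would work with the Dense(64) output.

```python
import numpy as np

batch = 4
# Stand-in for the output of Dense(64): a 2-D array of shape (batch, 64).
dense_output = np.random.default_rng(0).random((batch, 64))

target_shape = (8, 8, 1)  # assumed: 8 * 8 * 1 == 64
feature_map = dense_output.reshape((batch,) + target_shape)

# A 3x3 convolution with padding='valid' and stride 1 shrinks each side by 2.
kernel = 3
out_h = target_shape[0] - kernel + 1
out_w = target_shape[1] - kernel + 1

print(feature_map.ndim, feature_map.shape, (out_h, out_w))  # 4 (4, 8, 8, 1) (6, 6)
```

Note that the output of such a convolution is still a 4-D feature map of shape (batch, 6, 6, filters). If the labels are one-hot vectors of length 57, the output would still have to be brought to shape (batch, 57) before categorical_crossentropy can compare the two.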
