I work with Keras. I have two problems I want to solve (one classification and the other regression) with the same input but different outputs.
For classification all the data will be used, and the same for regression; the only difference is in the output layer.
I created a separate model for each one, as in the following example for classification:
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(300, activation='relu', input_dim=377))
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(56, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(1))
It works well, and the same model with a change in the last layer works well for the regression problem.
My question is how to integrate the two tasks into a multi-task learning neural network that takes one input and outputs both tasks.
I searched a lot but didn't find the solution I want.
Note: I work with data in CSV file format.
Any help will be appreciated.
This is an example with the Keras functional API:
from keras.layers import Input, Dense, Dropout
from keras.models import Model

inp = Input((377,))
x = Dense(300, activation='relu')(inp)
x = Dense(256, activation='relu')(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.1)(x)
x = Dense(56, activation='relu')(x)
x = Dense(16, activation='relu')(x)
x = Dropout(0.1)(x)
out_reg = Dense(1, name='reg')(x)
out_class = Dense(1, activation='sigmoid', name='class')(x)  # I assume a binary classification problem

model = Model(inp, [out_reg, out_class])
model.compile('adam', loss={'reg': 'mse', 'class': 'binary_crossentropy'},
              loss_weights={'reg': 0.5, 'class': 0.5})
I used the structure you reported for classification and regression; the only difference is in the output: two Dense layers, one for regression and one for classification (I assume a binary classifier).
I also applied a different loss to each task. You can balance them differently through the loss_weights.
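Since your data is in a CSV file, here is a rough sketch of training this two-output model with pandas; the file name and target column names below are just assumptions, not taken from your description:
import pandas as pd

df = pd.read_csv('data.csv')         # hypothetical file name
X = df.iloc[:, :377].values          # assumes the first 377 columns are the features
y_reg = df['target_reg'].values      # hypothetical regression target column
y_class = df['target_class'].values  # hypothetical 0/1 classification target column

# targets are passed per output, keyed by the names given to the output layers
model.fit(X, {'reg': y_reg, 'class': y_class}, epochs=50, batch_size=32, validation_split=0.2)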
I had 5 LSTM layers and 2 MLPs which must be concatenated into another MLP that produces the final output. Here is the code I wrote using the functional API approach, which works fine:
from keras.layers import Input, LSTM, Dense, Dropout, Concatenate
from keras.models import Model

lstm_input = Input(shape=(X_dynamic_LSTM.shape[1], X_dynamic_LSTM.shape[2]))
x = LSTM(70, activation='tanh', return_sequences=True)(lstm_input)
x = Dropout(0.3)(x)
x = Dense(1, activation='tanh')(x)

mlp_input = Input(shape=(X_static_MLP.shape[1],))
mlp = Dense(30, activation='relu')(mlp_input)
mlp = Dense(10, activation='relu')(mlp)

merge = Concatenate()([x, mlp])
hidden1 = Dense(5, activation='relu')(merge)
mlp_out = Dense(1, activation='relu')(hidden1)

model = Model(inputs=[lstm_input, mlp_input], outputs=mlp_out)
model.compile(loss='mae', optimizer='Adam')
history = model.fit([X_dynamic_LSTM, X_static_MLP], y_train, batch_size=20,
                    epochs=10, validation_split=0.2)
If I want to convert this format to one similar to the following:
x = Sequential()
x.add(LSTM(70, return_sequences=True))
x.add(Dropout(0.3))
x.add(Dense(1, activation='tanh'))
Can anyone help me with how I should define the MLP, the Concatenate, and the part regarding "model = Model(inputs=[lstm_input, mlp_input], outputs=mlp_out)"?
My main problem is that I want to add an Embedding layer to the LSTM. When I add the following code to the non-API approach, the model works perfectly:
x.add(Embedding(X_dynamic_LSTM.shape[0], 1, mask_zero=True))
But when instead I used
lstm_input = Embedding(X_dynamic_LSTM.shape[0], 1, mask_zero=True)
it gave me the error TypeError: Inputs to a layer should be tensors, so I had to stick with the non-API approach.
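(For reference, a minimal sketch of how the Embedding layer is usually wired in the functional API: Input stays the entry point and the Embedding layer is called on that tensor rather than used as the input itself. This keeps the same arguments and assumes X_dynamic_LSTM then holds integer indices of shape (samples, timesteps):)
from keras.layers import Input, Embedding, LSTM, Dropout, Dense

lstm_input = Input(shape=(X_dynamic_LSTM.shape[1],))                    # Input carries the integer indices
x = Embedding(X_dynamic_LSTM.shape[0], 1, mask_zero=True)(lstm_input)   # Embedding is applied to the tensor
x = LSTM(70, activation='tanh', return_sequences=True)(x)
x = Dropout(0.3)(x)
x = Dense(1, activation='tanh')(x)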
From all I know, a pretrained CNN should do much better than a plain CNN. I have a dataset of 855 images. I applied a CNN and got 94% accuracy. Then I applied pretrained models (VGG16, ResNet50, Inception_V3, MobileNet), also with fine-tuning, but the highest I got was 60%, and two of them do very badly on classification. Can a CNN really do better than a pretrained model, or is my implementation wrong? I've converted my images to 100 by 100 dimensions and followed the Keras applications approach. So what is the issue?
Naive CNN approach:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def cnn_model():
    size = (100, 100, 1)
    num_cnn_layers = 2
    NUM_FILTERS = 32
    KERNEL = (3, 3)
    MAX_NEURONS = 120

    model = Sequential()
    for i in range(1, num_cnn_layers + 1):
        if i == 1:
            model.add(Conv2D(NUM_FILTERS * i, KERNEL, input_shape=size,
                             activation='relu', padding='same'))
        else:
            model.add(Conv2D(NUM_FILTERS * i, KERNEL, activation='relu',
                             padding='same'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(int(MAX_NEURONS), activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(int(MAX_NEURONS / 2), activation='relu'))
    model.add(Dropout(0.4))
    model.add(Dense(3, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    return model
VGG16 approach:
import keras
from keras.models import Sequential

def vgg():
    vgg_model = keras.applications.vgg16.VGG16(weights='imagenet',
                                               include_top=False,
                                               input_shape=(100, 100, 3))
    model = Sequential()
    for layer in vgg_model.layers:
        model.add(layer)
    # Freeze the layers
    for layer in model.layers:
        layer.trainable = False
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(3, activation='softmax'))
    model.compile(optimizer=keras.optimizers.Adam(lr=1e-5),
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
What you're referring to as a CNN in both cases is the same thing: a type of neural network model. It's just that the pre-trained model has been trained on some other data instead of the dataset you're working on and trying to classify.
What is usually used here is called transfer learning. Instead of freezing all the layers, try leaving the last few layers open so they can be retrained with your own data, so that the pretrained model can adjust its weights and biases to match your needs as well. It could be the case that the dataset you're trying to classify is foreign to the pretrained models.
Here's an example from my own work. There are additional pieces of code, but you can make it work with your own code; the logic remains the same.
from keras import layers
from keras.models import Model
from keras.optimizers import RMSprop

# You extract the layer which you want to manipulate, usually one of the last few.
last_layer = pre_trained_model.get_layer(name_of_layer)
last_output = last_layer.output
# Flatten the output layer to 1 dimension
x = layers.Flatten()(last_output)
# Add a fully connected layer with 1,024 hidden units and ReLU activation
x = layers.Dense(1024, activation='relu')(x)
# Add a dropout rate of 0.2
x = layers.Dropout(0.2)(x)
# Add a final sigmoid layer for classification
x = layers.Dense(1, activation='sigmoid')(x)
# Here we combine the newly added layers and the pre-trained model
model = Model(pre_trained_model.input, x)
model.compile(optimizer=RMSprop(lr=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
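In this snippet, pre_trained_model and name_of_layer are placeholders from my own code; with the VGG16 base from your question they could be filled in, for example, like this (the choice of 'block5_pool' is just an illustration, any intermediate layer name works):
import keras

pre_trained_model = keras.applications.vgg16.VGG16(weights='imagenet',
                                                   include_top=False,
                                                   input_shape=(100, 100, 3))
name_of_layer = 'block5_pool'  # the last pooling layer of VGG16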
Adding to what #Ilknur Mustafa mentioned: since your dataset may be foreign to the images used for pre-training, you can try to re-train the last few layers of the pre-trained model instead of adding whole new layers. The example code below doesn't add any additional trainable layer other than the output layer. This way you benefit from retraining the last few layers on top of the existing weights rather than training from scratch, which may help if you don't have a large dataset to train on.
import keras
from keras.applications.vgg16 import VGG16
from keras.layers import Dense
from keras.models import Model
from keras.optimizers import SGD

# load model without classifier layers
vgg = VGG16(include_top=False, input_shape=(100, 100, 3), weights='imagenet', pooling='avg')
# make only the last 2 conv layers trainable
for layer in vgg.layers[:-4]:
    layer.trainable = False
# add output layer
out_layer = Dense(3, activation='softmax')(vgg.layers[-1].output)
model_pre_vgg = Model(vgg.input, out_layer)
# compile model
opt = SGD(lr=1e-5)
model_pre_vgg.compile(optimizer=opt, loss=keras.losses.categorical_crossentropy, metrics=['accuracy'])
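A rough usage sketch, assuming X_train is an array of 100x100x3 images and y_train holds integer labels 0-2 (these variable names are mine, not from the question):
from keras.utils import to_categorical

y_train_onehot = to_categorical(y_train, num_classes=3)  # match the 3-way softmax output
history = model_pre_vgg.fit(X_train, y_train_onehot,
                            epochs=20, batch_size=16, validation_split=0.2)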
I want to specify two loss functions: one for the object class, which is cross-entropy, and the other for the bounding box, which is mean squared error. How do I specify in model.compile the corresponding loss function for each output?
model = Sequential()
model.add(Dense(128, activation='relu'))
out_last_dense = model.add(Dense(128, activation='relu'))
object_type = model.add(Dense(1, activation='softmax'))(out_last_dense)
object_coordinates = model.add(Dense(4, activation='softmax'))(out_last_dense)
# here is the problem: I want to specify a loss function for the object type and one for the coordinates
model.compile(loss= keras.losses.categorical_crossentropy,
optimizer= 'sgd', metrics=['accuracy'])
First of all, you can't use the Sequential API here since your model has two output layers (i.e. what you have written is all wrong and would raise an error). Instead you must use the Keras functional API:
inp = Input(shape=...)
x = Dense(128, activation='relu')(inp)
x = Dense(128, activation='relu')(x)
object_type = Dense(1, activation='sigmoid', name='type')(x)
object_coordinates = Dense(4, activation='linear', name='coord')(x)
model = Model(inputs=inp, outputs=[object_type, object_coordinates])
Now you can specify a loss function (as well as a metric) for each output layer, based on the names given above, using a dictionary:
model.compile(loss={'type': 'binary_crossentropy', 'coord': 'mse'},
              optimizer='sgd', metrics={'type': 'accuracy', 'coord': 'mae'})
Further, note that you were using softmax as the activation function; I have changed it to sigmoid and linear above. That's because: 1) using softmax on a layer with one unit does not make sense (if there are more than 2 classes you should use softmax on a layer with that many units), and 2) the other layer predicts coordinates, so using softmax is not suitable at all (unless the problem formulation lets you do so).
I have X as text, with two different labels (columns) to train on.
--input.csv--
content, category, rate
text test, 1, 3
new test, 2, 2
Here my input X will be content. I have converted it to a sequence matrix. I need both category and rate to be trained along with content. I couldn't figure out how to pass these into the layers.
def RNN():
    num_categories = 2
    num_rates = 3
    inputs = Input(name='inputs', shape=[max_len])
    layer = Embedding(max_words, 150, input_length=max_len)(inputs)
    layer = LSTM(100)(layer)
    shared_layer = Dense(256, activation='relu', name='FC1')(layer)
    shared_layer = Dropout(0.5)(shared_layer)
    cat_out = Dense(num_categories, activation='softmax', name='cat_out')(shared_layer)
    rate_out = Dense(num_rates, activation='softmax', name='rate_out')(shared_layer)
    model = Model(inputs=inputs, outputs=[cat_out, rate_out])
    return model

model = RNN()
model.summary()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(sequences_matrix, [Y_train, Z_train])
Y_train contains only the category. I want to add rate to the training. Does anyone know how?
I want two results: one for the category and another for the rate.
Currently it returns only the label, not the rate. I don't know how to add a layer for the rate column.
You can achieve this with the functional API; just let the network have two outputs from the shared feature layer:
shared_layer = Dense(256, activation='relu', name='FC1')(layer)
shared_layer = Dropout(0.5)(shared_layer)
cat_out = Dense(num_categories, activation='softmax', name='cat_out')(shared_layer)
rate_out = Dense(num_rates, activation='softmax', name='rate_out')(shared_layer)
model = Model(inputs=inputs,outputs=[cat_out, rate_out])
return model
You will now train with two targets, y_train_cat and y_train_rate, giving them as a list to model.fit(X_train, [y_train_cat, y_train_rate]), and the model will make two distinct predictions.
Have a look at the functional API documentation for how to handle multi-input / multi-output models.
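For example, a rough sketch of building the two one-hot targets from the CSV and fitting; this assumes the category and rate columns are integer-coded starting at 1, as in the sample rows:
import pandas as pd
from keras.utils import to_categorical

df = pd.read_csv('input.csv', skipinitialspace=True)
y_train_cat = to_categorical(df['category'] - 1, num_classes=2)   # num_categories
y_train_rate = to_categorical(df['rate'] - 1, num_classes=3)      # num_rates

model.fit(sequences_matrix, [y_train_cat, y_train_rate], epochs=10, batch_size=32)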
I would like to combine two neural networks which output class probabilities.
One says that there is a cat in the image.
The second says that the cat has a collar.
How do I use the softmax activation function on the outputs of the neural network?
Please see the picture to understand the main idea:
You can use the functional API to create a multi-output network. Essentially every output will be a separate prediction. Something along the lines of:
in = Input(shape=(w,h,c)) # image input
latent = Conv...(...)(in) # some convolutional layers to extract features
# How share the underlying features to predict
animal = Dense(2, activation='softmax')(latent)
collar = Dense(2, activation='softmax')(latent)
model = Model(in, [animal, coller])
model.compile(loss='categorical_crossentropy', optimiser='adam')
You can have as many separate outputs as you like. If you have only binary features you can also have a single vector output, Dense(2, activation='sigmoid'), where the first entry predicts cat or not and the second whether it has a collar. That would be a multi-class, multi-label setup.
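For reference, a minimal sketch of that single-output, multi-label alternative (inp and latent are the tensors from the snippet above):
# one 2-unit sigmoid head: entry 0 = probability of cat, entry 1 = probability of collar
attributes = Dense(2, activation='sigmoid')(latent)
model = Model(inp, attributes)
# binary_crossentropy scores each entry independently, which is what multi-label needs
model.compile(loss='binary_crossentropy', optimizer='adam')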
Just create two separate dense layers (with softmax activation) at the end of your model, e.g.:
from keras.layers import Input, Dense, Conv2D, Flatten
from keras.models import Model

# Input example:
inputs = Input(shape=(64, 64, 3))

# Example of model:
x = Conv2D(16, (3, 3), padding='same')(inputs)
x = Flatten()(x)  # flatten the feature maps before the dense layers
x = Dense(512, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# ... (replace with your actual layers)

# Then add two separate layers taking the previous output and generating the two estimations:
cat_predictions = Dense(2, activation='softmax')(x)
collar_predictions = Dense(2, activation='softmax')(x)

model = Model(inputs=inputs, outputs=[cat_predictions, collar_predictions])
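You would then compile with one loss per output and fit with the two label arrays, for example (the label names here are illustrative):
model.compile(optimizer='adam',
              loss=['categorical_crossentropy', 'categorical_crossentropy'])
model.fit(images, [y_cat, y_collar], epochs=10, batch_size=32)   # one label array per head
cat_probs, collar_probs = model.predict(images)                  # predict() returns one array per output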