How to apply a sigmoid function to each output in Keras? - python

This is part of my code:
model = Sequential()
model.add(Dense(3, input_shape=(4,), activation='softmax'))
model.compile(Adam(lr=0.1),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
With this code, softmax is applied across all the outputs at once, so each output is a probability relative to the others. However, I am working on a non-exclusive classifier, which means I want the outputs to have independent probabilities.
What I want to do is apply a sigmoid function to each output so that the outputs have independent probabilities.

There is no need to create 3 separate outputs as suggested by the accepted answer.
The same result can be achieved with a single line:
model.add(Dense(3, input_shape=(4,), activation='sigmoid'))
You can just use the 'sigmoid' activation for the last layer:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

model = Sequential()
model.add(Dense(3, input_shape=(4,), activation='sigmoid'))
# binary_crossentropy treats each output as an independent (multi-label)
# probability, which is what matches the per-class sigmoid here
model.compile(Adam(learning_rate=0.1),
              loss='binary_crossentropy',
              metrics=['accuracy'])
pred = model.predict(np.random.rand(5, 4))
print(pred)
Output of independent probabilities:
[[0.58463055 0.53531045 0.51800555]
[0.56402034 0.51676977 0.506389 ]
[0.665879 0.58982867 0.5555959 ]
[0.66690147 0.57951677 0.5439698 ]
[0.56204814 0.54893976 0.5488999 ]]
As you can see, the class probabilities are independent of each other: the sigmoid is applied to every class separately.
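As a quick sanity check (a minimal sketch reusing the pred array from above), you can confirm the rows no longer sum to 1 the way softmax rows would:
print(pred.sum(axis=1))  # row sums are not constrained to 1, unlike softmax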

You can try using the Functional API to create a model with n outputs, where each output is activated with a sigmoid.
You can do it like this (note that in is a reserved word in Python, so the input tensor is named inp here):
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inp = Input(shape=(4,))
dense_1 = Dense(units=4, activation='relu')(inp)
out_1 = Dense(units=1, activation='sigmoid')(dense_1)
out_2 = Dense(units=1, activation='sigmoid')(dense_1)
out_3 = Dense(units=1, activation='sigmoid')(dense_1)
model = Model(inputs=[inp], outputs=[out_1, out_2, out_3])
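A hedged sketch of how this multi-output model could be compiled and trained (the random data is purely illustrative): a single 'binary_crossentropy' string applies that loss to every output, and the targets are passed as a list with one array per output head.
import numpy as np

model.compile(optimizer='adam', loss='binary_crossentropy')
X = np.random.rand(8, 4)                                 # illustrative inputs
y = [np.random.randint(0, 2, (8, 1)) for _ in range(3)]  # one target array per output
model.fit(X, y, epochs=2, verbose=0)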

Related

Prediction Interval for Neural Net in Python

I'm currently using Keras to create a neural net in Python. I have a basic model and the code looks like this:
from keras.layers import Dense
from keras.models import Sequential
model = Sequential()
model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
model.add(Dense(500, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation="relu"))
model.compile(loss='mean_squared_error', optimizer='adam')
It works well and gives me good predictions for my use case. However, I would like to be able to use a Variational Gaussian Process layer to give me an estimate of the prediction interval as well. I'm new to this type of layer and am struggling a bit to implement it. The TensorFlow documentation on it can be found here:
https://www.tensorflow.org/probability/api_docs/python/tfp/layers/VariationalGaussianProcess
However, I'm not seeing that same layer in the Keras library. For further reference, I'm trying to do something similar to what was done in this article:
https://blog.tensorflow.org/2019/03/regression-with-probabilistic-layers-in.html
There seems to be a bit more complexity when you have 23 inputs versus one, and that's the part I'm not understanding. I'm also open to other methods of achieving the target objective. Any examples of how to do this or insights on other approaches would be greatly appreciated!
tensorflow_probability is a separate library, but it is designed to work with Keras and TensorFlow. You can add its custom layers to your code and turn the network into a probabilistic model. If your goal is just to get a prediction interval, it is simpler to use the DistributionLambda layer. Your code would then be as follows:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from sklearn.datasets import make_regression
import tensorflow_probability as tfp
import tensorflow as tf
tfd = tfp.distributions

# Sample data
X, y = make_regression(n_samples=100, n_features=23, noise=4.0, bias=15)

# Loss function: negative log-likelihood
negloglik = lambda y, p_y: -p_y.log_prob(y)

# Model: the final Dense layer outputs two values per sample, which
# parameterize the mean and scale of a Normal distribution
model = Sequential()
model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
model.add(Dense(500, kernel_initializer='normal', activation='relu'))
model.add(Dense(2))
model.add(tfp.layers.DistributionLambda(
    lambda t: tfd.Normal(loc=t[..., :1],
                         scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))))
model.compile(loss=negloglik, optimizer='adam')
model.fit(X, y, epochs=250, verbose=0)
After training your model, you can get your prediction distribution with the following lines:
yhat = model(X) # make predictions
means = yhat.mean() # prediction means
stds = yhat.stddev() # prediction standard deviation
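From the means and standard deviations, a prediction interval follows directly. A minimal sketch, assuming the usual normal-quantile approximation (1.96 for a 95% interval):
lower = means - 1.96 * stds  # lower bound of a 95% prediction interval
upper = means + 1.96 * stds  # upper bound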

LOSS not changing in very simple Keras binary classifier

I'm trying to get a very (over)simplified Keras binary classifier neural network running, without success. The LOSS just stays constant. I've played around with optimizers (SGD, Adam, RMSprop), learning rates, weight initializations, batch size, and input data normalization so far.
Nothing changes at all. Am I doing something fundamentally wrong? Here is the code:
import numpy as np
from tensorflow import keras
from keras import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

data = np.array(
    [
        [100, 35, 35, 12, 0],
        [101, 46, 35, 21, 0],
        [130, 56, 46, 3412, 1],
        [131, 58, 48, 3542, 1]
    ]
)
x = data[:, 1:-1]
y_target = data[:, -1]
x = x / np.linalg.norm(x)

model = Sequential()
model.add(Dense(3, input_shape=(3,), activation='softmax', kernel_initializer='lecun_normal',
                bias_initializer='lecun_normal'))
model.add(Dense(1, activation='softmax', kernel_initializer='lecun_normal',
                bias_initializer='lecun_normal'))
model.compile(optimizer=SGD(learning_rate=0.1),
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x, y_target, batch_size=2, epochs=10, verbose=1)
The softmax definition is:
exp(a_i) / sum_j(exp(a_j))
so when you use it with a single neuron you get:
exp(a) / exp(a) = 1
That is why your classifier doesn't work with a single output neuron.
You can use sigmoid instead in this special case:
exp(a) / (exp(a) + 1)
Furthermore, the sigmoid function is meant for two-class classifiers; softmax is an extension of sigmoid to multiclass classifiers.
For the first layer you should use relu or sigmoid instead of softmax.
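To see the softmax collapse numerically, here is a small plain-numpy sketch (nothing Keras-specific): a softmax over a single logit always returns 1, while two logits give a real distribution.
import numpy as np

def softmax(a):
    e = np.exp(a - np.max(a))  # shift for numerical stability
    return e / e.sum()

print(softmax(np.array([3.7])))        # [1.] -- a single output neuron is stuck at 1
print(softmax(np.array([3.7, -1.2])))  # [0.9926 0.0074] -- a proper distribution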
This is the working solution based on the feedback I got:
import numpy as np
from tensorflow import keras
from keras import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

data = np.array(
    [
        [100, 35, 35, 12, 0],
        [101, 46, 35, 21, 0],
        [130, 56, 46, 3412, 1],
        [131, 58, 48, 3542, 1]
    ]
)
x = data[:, 1:-1]
y_target = data[:, -1]
x = x / np.linalg.norm(x)

model = Sequential()
model.add(Dense(3, input_shape=(3,), activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer=SGD(learning_rate=0.1),
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(x, y_target, epochs=1000, verbose=1)

Convert Convnet.js neural network model to Keras Tensorflow

I have a neural network model that was created in convnet.js and that I have to define using Keras. Does anyone have an idea how I can do that?
neural = {
net : new convnetjs.Net(),
layer_defs : [
{type:'input', out_sx:4, out_sy:4, out_depth:1},
{type:'fc', num_neurons:25, activation:"regression"},
{type:'regression', num_neurons:5}
],
neuralDepth: 1
}
This is what I could do so far; I cannot be sure it's correct.
from keras import models, layers

#---Build Model-----
model = models.Sequential()
# Input layer
model.add(layers.Dense(4, activation="relu", input_shape=(4,)))
# Hidden layers
model.add(layers.Dense(25, activation="relu"))
model.add(layers.Dense(5, activation="relu"))
# Output layer
model.add(layers.Dense(1, activation="linear"))
model.summary()
# Compile model
model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mean_squared_error"])
From the Convnet.js doc: "your last layer must be a loss layer ('softmax' or 'svm' for classification, or 'regression' for regression)."
Also: "Create a regression layer which takes a list of targets (arbitrary numbers, not necessarily a single discrete class label as in softmax/svm) and backprops the L2 Loss."
It's a bit unclear, but I suspect the "regression" layer is just another layer of Dense (fully connected) neurons, and the word 'regression' simply refers to linear activation. So, no 'relu' this time?
Anyway, it would probably look something like this (functional, not Sequential):
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

LEARNING_RATE = 0.001  # pick a learning rate

# out_sx * out_sy * out_depth = 4 * 4 * 1 = 16 input values
my_input = Input(shape=(16,))
x = Dense(25, activation='relu')(my_input)
x = Dense(5)(x)  # linear activation, matching the regression layer's 5 neurons
my_model = Model(inputs=my_input, outputs=x)
my_model.compile(optimizer=Adam(LEARNING_RATE), loss='mse', metrics=['mse'])
After reading a bit of the docs, convnet.js seems like a nice project. It would be much better with somebody with neural-network knowledge on board.

Grid Search the number of hidden layers with keras

I am trying to optimize the hyperparameters of my NN using Keras and sklearn.
I am wrapping the model with KerasClassifier (it's a classification problem).
I am trying to optimize the number of hidden layers.
I can't figure out how to do it with Keras (specifically, how to set up the function create_model so that the number of hidden layers can be varied).
Could anyone please help me?
My code (just the important part):
## Import `Sequential` from `keras.models`
from keras.models import Sequential
# Import `Dense` from `keras.layers`
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV

def create_model(optimizer='adam', activation='sigmoid'):
    # Initialize the constructor
    model = Sequential()
    # Add an input layer
    model.add(Dense(5, activation=activation, input_shape=(5,)))
    # Add one hidden layer
    model.add(Dense(8, activation=activation))
    # Add an output layer
    model.add(Dense(1, activation=activation))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer,
                  metrics=['accuracy'])
    return model

my_classifier = KerasClassifier(build_fn=create_model, verbose=0)

# Create hyperparameter options
epochs = [5, 10]
batches = [5, 10, 100]
optimizers = ['rmsprop', 'adam']
activation1 = ['relu', 'sigmoid']
hyperparameters = dict(optimizer=optimizers, epochs=epochs,
                       batch_size=batches, activation=activation1)

# Create grid search
grid = RandomizedSearchCV(estimator=my_classifier,
                          param_distributions=hyperparameters)

# Fit grid search
grid_result = grid.fit(X_train, y_train)

# View hyperparameters of the best neural network
grid_result.best_params_
If you want to make the number of hidden layers a hyperparameter, you have to add it as a parameter to your KerasClassifier build_fn, like this:
def create_model(optimizer='adam', activation='sigmoid', hidden_layers=1):
    # Initialize the constructor
    model = Sequential()
    # Add an input layer
    model.add(Dense(5, activation=activation, input_shape=(5,)))
    # Add the hidden layers
    for i in range(hidden_layers):
        model.add(Dense(8, activation=activation))
    # Add an output layer
    model.add(Dense(1, activation=activation))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer,
                  metrics=['accuracy'])
    return model
Then you will be able to optimize the number of hidden layers by adding it to the dictionary that is passed to RandomizedSearchCV's param_distributions, as sketched below.
One more thing: you should probably use a different activation for the output layer than for the hidden layers, since different classes of activation functions are suitable for hidden layers than for the output layer of a binary classifier.
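A minimal sketch of that dictionary (the candidate values are illustrative):
hidden_layers = [1, 2, 3]
hyperparameters = dict(optimizer=optimizers, epochs=epochs,
                       batch_size=batches, activation=activation1,
                       hidden_layers=hidden_layers)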

How to merge keras sequential models with same input?

I am trying to create my first ensemble model in Keras. I have 3 input values and a single output value in my dataset.
from keras.optimizers import SGD,Adam
from keras.layers import Dense,Merge
from keras.models import Sequential
model1 = Sequential()
model1.add(Dense(3, input_dim=3, activation='relu'))
model1.add(Dense(2, activation='relu'))
model1.add(Dense(2, activation='tanh'))
model1.compile(loss='mse', optimizer='Adam', metrics=['accuracy'])
model2 = Sequential()
model2.add(Dense(3, input_dim=3, activation='linear'))
model2.add(Dense(4, activation='tanh'))
model2.add(Dense(3, activation='tanh'))
model2.compile(loss='mse', optimizer='SGD', metrics=['accuracy'])
model3 = Sequential()
model3.add(Merge([model1, model2], mode = 'concat'))
model3.add(Dense(1, activation='sigmoid'))
model3.compile(loss='binary_crossentropy', optimizer='Adam', metrics=['accuracy'])
model3.input_shape
The ensemble model (model3) compiles without any error, but when fitting the model I have to pass the same input twice: model3.fit([X, X], y). I think that is an unnecessary step; instead of passing the input twice, I want a common input node for my ensemble model. How can I do that?
The Keras functional API seems to be a better fit for your use case, as it allows more flexibility in the computation graph. For example:
import numpy as np
from keras.layers import Input, Dense, concatenate
from keras.models import Model

# a single input layer
inputs = Input(shape=(3,))

# model 1
x1 = Dense(3, activation='relu')(inputs)
x1 = Dense(2, activation='relu')(x1)
x1 = Dense(2, activation='tanh')(x1)

# model 2
x2 = Dense(3, activation='linear')(inputs)
x2 = Dense(4, activation='tanh')(x2)
x2 = Dense(3, activation='tanh')(x2)

# merge the two branches
x3 = concatenate([x1, x2])

# output layer
predictions = Dense(1, activation='sigmoid')(x3)

# generate a model from the layers above
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Always a good idea to verify the model looks as you expect it to
# model.summary()

data = np.array([[1, 2, 3], [1, 1, 3], [7, 8, 9], [5, 8, 10]])
labels = np.array([0, 0, 1, 1])

# The resulting model can be fit with a single input:
model.fit(data, labels, epochs=50)
Notes:
There might be slight differences in the API between Keras versions (pre- and post-version 2).
The question's code specifies a different optimizer and loss function for each of the sub-models. However, since fit() is called only once (on model3), only model3's settings apply to the whole model. To train the sub-models with different settings, they have to be fit() separately - see the comment by @Daniel.
EDIT: updated notes based on comments
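For completeness, a hedged sketch of what fitting the sub-models separately could look like before combining them (y1 and y2 are hypothetical target arrays shaped to match each sub-model's output; the settings are illustrative):
# Each sub-model keeps its own optimizer and loss when trained on its own
model1.compile(loss='mse', optimizer='adam')
model1.fit(X, y1, epochs=10)  # y1 must match model1's output shape

model2.compile(loss='mse', optimizer='sgd')
model2.fit(X, y2, epochs=10)  # y2 must match model2's output shape
# ...then combine the trained sub-models as in the answers below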
etov's answer is a great option.
But suppose you already have model1 and model2 ready and you don't want to change them; then you can create the third model like this:
from keras.layers import Input, Dense, Concatenate
from keras.models import Model

singleInput = Input((3,))
out1 = model1(singleInput)
out2 = model2(singleInput)
# ...
# outN = modelN(singleInput)
out = Concatenate()([out1, out2])  # [out1, out2, ..., outN]
out = Dense(1, activation='sigmoid')(out)
model3 = Model(singleInput, out)
And if you already have all the models ready and don't want to change them, you can have something like this (not tested):
singleInput = Input((3,))
output = model3([singleInput,singleInput])
singleModel = Model(singleInput,output)
Define a new input layer and use the model outputs directly (this works in the functional API):
assert model1.input_shape == model2.input_shape  # make sure the input shapes match
inp = tf.keras.layers.Input(shape=model1.input_shape[1:])
model = tf.keras.models.Model(inputs=[inp], outputs=[model1(inp), model2(inp)])
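A short usage sketch (assuming both sub-models take 3 features, as in the question; the random input is illustrative): predict returns one array per wrapped sub-model.
import numpy as np

x = np.random.rand(2, 3)           # two samples, three features
preds1, preds2 = model.predict(x)  # one output array per sub-model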
