How do you create an NN to solve for example: 2 * x ** 3 + 4 * x + 8 + random noise?
I can do this easily with LinearRegression from sklearn, but I'd like to be able to achieve this for a multivariate sample where I have no idea wether the function is log/exp/poly/etc.
Thank you!
Ashkan
The nonlinearity in Neural Network can be achieved by simply having a layer with a nonlinear activation function, e.g. (relu). For example:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(10, activation='relu', kernel_initializer='he_normal', input_shape=(n_features,)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(X_train, y_train, epochs=10, batch_size=16, verbose=0)
Just needed 3 more layers of these - thanks Tony!
Related
I'm currently using keras to create a neural net in python. I have a basic model and the code looks like this:
from keras.layers import Dense
from keras.models import Sequential
model = Sequential()
model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
model.add(Dense(500, kernel_initializer='normal', activation='relu'))
model.add(Dense(1, kernel_initializer='normal', activation="relu"))
model.compile(loss='mean_squared_error', optimizer='adam')
It works well and gives me good predictions for my use case. However, I would like to be able to use a Variational Gaussian Process layer to give me an estimate for the prediction interval as well. I'm new to this type of layer and am struggling a bit to implement it. The tensorflow documentation on it can be found here:
https://www.tensorflow.org/probability/api_docs/python/tfp/layers/VariationalGaussianProcess
However, I'm not seeing that same layer in the keras library. For further reference, I'm trying to do something similar to what was done in this article:
https://blog.tensorflow.org/2019/03/regression-with-probabilistic-layers-in.html
There seems to be a bit more complexity when you have 23 inputs vs one that I'm not understanding. I'm also open to other methods to achieving the target objective. Any examples on how to do this or insights on other approaches would be greatly appreciated!
tensorflow_probability is a separate library but suitable to use with Keras and TensorFlow. You can add those custom layers in your code and change it to a probabilistic model. If your goal is just to get a prediction interval it would be simpler to use the DistributionLambda layer. So your code would be as follows:
from keras.layers import Dense
from keras.models import Sequential
from sklearn.datasets import make_regression
import tensorflow_probability as tfp
import tensorflow as tf
tfd = tfp.distributions
# Sample data
X, y = make_regression(n_samples=100, n_features=23, noise=4.0, bias=15)
# loss function Negative log likelyhood
negloglik = lambda y, p_y: -p_y.log_prob(y)
# Model
model = Sequential()
model.add(Dense(23, input_dim=23, kernel_initializer='normal', activation='relu'))
model.add(Dense(500, kernel_initializer='normal', activation='relu'))
model.add(Dense(2))
model.add(tfp.layers.DistributionLambda(
lambda t: tfd.Normal(loc=t[..., :1],
scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))))
model.compile(loss=negloglik, optimizer='adam')
model.fit(X,y, epochs=250, verbose=None)
After training your model, you can get your prediction distribution with the following lines:
yhat = model(X) # make predictions
means = yhat.mean() # prediction means
stds = yhat.stddev() # prediction standard deviation
Newbie to machine learning here.
I'm currently working on a diagnostic machine learning framework using 3D-CNNs on fMRI imaging. My dataset consists of 636 images right now, and I'm trying to distinguish between control and affected (binary classification). However, when I tried to train my model, after every epoch, my accuracy remains at 48.13%, no matter what I do. Additionally, over the epoch, the accuracy decreases from 56% to 48.13%.
So far, I have tried:
changing my loss functions (poisson, categorical cross entropy, binary cross entropy, sparse categorical cross entropy, mean squared error, mean absolute error, hinge, hinge squared)
changing my optimizer (I've tried Adam and SGD)
changing the number of layers
using weight regularization
changing from ReLU to leaky ReLU (I thought perhaps that could help if this was a case of overfitting)
Nothing has worked so far.
Any tips? Here's my code:
#importing important packages
import tensorflow as tf
import os
import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv3D, MaxPooling3D, Dropout, BatchNormalization, LeakyReLU
import numpy as np
from keras.regularizers import l2
from sklearn.utils import compute_class_weight
from keras.optimizers import SGD
BATCH_SIZE = 64
input_shape=(64, 64, 40, 20)
# Create the model
model = Sequential()
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(BatchNormalization(center=True, scale=True))
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(Conv3D(64, kernel_size=(3,3,3), activation='relu', input_shape=input_shape, kernel_regularizer=l2(0.005), bias_regularizer=l2(0.005), data_format = 'channels_first', padding='same'))
model.add(MaxPooling3D(pool_size=(2, 2, 2)))
model.add(BatchNormalization(center=True, scale=True))
model.add(Flatten())
model.add(BatchNormalization(center=True, scale=True))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
model.add(Dropout(0.5))
model.add(Dense(128, activation='sigmoid', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
model.add(Dense(1, activation='softmax', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
# Compile the model
model.compile(optimizer = keras.optimizers.sgd(lr=0.000001), loss='poisson', metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])
# Model Testing
history = model.fit(X_train, y_train, batch_size=BATCH_SIZE, epochs=50, verbose=1, shuffle=True)
The main issue is that you are using softmax activation with 1 neuron. Change it to sigmoid with binary_crossentropy as a loss function.
At the same time, bear in mind that you are using Poisson loss function, which is suitable for regression problems not classification ones. Ensure that you detect the exact scenario that your are trying to solve.
Softmax with one neuron makes the model illogical and only use one of the sigmoid activation functions or Softmax in the last layer
I am trying to use this to classify the images into two categories. Also I applied model.fit() function but its showing error.
ValueError: A target array with shape (90, 1) was passed for an output of shape (None, 10) while using as loss binary_crossentropy. This loss expects targets to have the same shape as the output.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D, LSTM
import pickle
import numpy as np
X = np.array(pickle.load(open("X.pickle","rb")))
Y = np.array(pickle.load(open("Y.pickle","rb")))
#scaling our image data
X = X/255.0
model = Sequential()
model.add(Conv2D(64 ,(3,3), input_shape = (300,300,1)))
# model.add(MaxPooling2D(pool_size = (2,2)))
model.add(tf.keras.layers.Reshape((16, 16*512)))
model.add(LSTM(128, activation='relu', return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr=1e-3, decay=1e-5)
model.compile(loss='binary_crossentropy', optimizer=opt,
metrics=['accuracy'])
# model.summary()
model.fit(X, Y, batch_size=32, epochs = 2, validation_split=0.1)
If your problem is categorical, your issue is that you are using binary_crossentropy instead of categorical_crossentropy; ensure that you do have a categorical instead of a binary classification problem.
Also, please note that if your labels are in simple integer format like [1,2,3,4...] and not one-hot-encoded, your loss_function should be sparse_categorical_crossentropy, not categorical_crossentropy.
If you do have a binary classification problem, like said in the error of the above ensure that:
Loss is binary_crossentroy + Dense(1,activation='sigmoid')
Loss is categorical_crossentropy + Dense(2,activation='softmax')
I'm trying to get a very (over) simplified Keras binary classifier neural network running without success. The LOSS just stays constant. I've played around with Optimizers (SGD, Adam, RMSProp), Learningrates, Weight-Initializations, Batch Size and input data normalization so far.
Nothing changes at all. Am I doing something fundamentally wrong? Here is the code:
from tensorflow import keras
from keras import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
data = np.array(
[
[100,35,35,12,0],
[101,46,35,21,0],
[130,56,46,3412,1],
[131,58,48,3542,1]
]
)
x = data[:,1:-1]
y_target = data[:,-1]
x = x / np.linalg.norm(x)
model = Sequential()
model.add(Dense(3, input_shape=(3,), activation='softmax', kernel_initializer='lecun_normal',
bias_initializer='lecun_normal'))
model.add(Dense(1, activation='softmax', kernel_initializer='lecun_normal',
bias_initializer='lecun_normal'))
model.compile(optimizer=SGD(learning_rate=0.1),
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(x, y_target, batch_size=2, epochs=10,
verbose=1)
Softmax definition is:
exp(a) / sum(exp(a)
so when you use with a single neuron you will get:
exp(a) / exp(a) = 1
That is why your classifier doesn't work with a single neuron.
You can use sigmoid instead in this special case:
exp(a) / (exp(a) + 1)
Furthermore sigmoid function is for two class classifiers. Softmax is an extension of sigmoid for multiclass classifers.
For the first layer you should use relu or sigmoid function instead of softmax.
This is the working solution based on the feedback I got
from tensorflow import keras
from keras import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from keras.utils import to_categorical
data = np.array(
[
[100,35,35,12,0],
[101,46,35,21,0],
[130,56,46,3412,1],
[131,58,48,3542,1]
]
)
x = data[:,1:-1]
y_target = data[:,-1]
x = x / np.linalg.norm(x)
model = Sequential()
model.add(Dense(3, input_shape=(3,), activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer=SGD(learning_rate=0.1),
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(x, y_target, epochs=1000,
verbose=1)
This is part of my codes.
model = Sequential()
model.add(Dense(3, input_shape=(4,), activation='softmax'))
model.compile(Adam(lr=0.1),
loss='categorical_crossentropy',
metrics=['accuracy'])
with this code, it will apply softmax to all the outputs at once. So the output indicates probability among all. However, I am working on non-exclusive classifire, which means I want the outputs to have independent probability.
Sorry my English is bad...
But what I want to do is to apply sigmoid function to each outputs so that they will have independent probabilities.
There is no need to create 3 separate outputs like suggested by the accepted answer.
The same result can be achieved with just one line:
model.add(Dense(3, input_shape=(4,), activation='sigmoid'))
You can just use 'sigmoid' activation for the last layer:
from tensorflow.keras.layers import GRU
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation
import numpy as np
from tensorflow.keras.optimizers import Adam
model = Sequential()
model.add(Dense(3, input_shape=(4,), activation='sigmoid'))
model.compile(Adam(lr=0.1),
loss='categorical_crossentropy',
metrics=['accuracy'])
pred = model.predict(np.random.rand(5, 4))
print(pred)
Output of independent probabilities:
[[0.58463055 0.53531045 0.51800555]
[0.56402034 0.51676977 0.506389 ]
[0.665879 0.58982867 0.5555959 ]
[0.66690147 0.57951677 0.5439698 ]
[0.56204814 0.54893976 0.5488999 ]]
As you can see the classes probabilities are independent from each other. The sigmoid is applied to every class separately.
You can try using Functional API to create a model with n outputs where each output is activated with sigmoid.
You can do it like this
in = Input(shape=(4, ))
dense_1 = Dense(units=4, activation='relu')(in)
out_1 = Dense(units=1, activation='sigmoid')(dense_1)
out_2 = Dense(units=1, activation='sigmoid')(dense_1)
out_3 = Dense(units=1, activation='sigmoid')(dense_1)
model = Model(inputs=[in], outputs=[out_1, out_2, out_3])